Introduction
Collections in Java often contain duplicate elements, and the standard library offers several APIs for detecting them.
The Stream API introduced in Java 8 performs aggregate operations on a collection, and several of those operations can be combined to find duplicate elements.
In this tutorial, we will see how to find duplicate elements in a Stream using Java 8.
Create a custom class named Employee with the fields id, firstName, and lastName, and generate an all-arguments constructor along with the toString(), equals(), and hashCode() methods.
When generating equals() and hashCode(), base both on the id field only, so that two Employee objects with the same id are considered equal and produce the same hash code. Hash-based collections such as HashSet and HashMap rely on this contract to place equal objects in the same bucket.
We will use the Employee class in our examples to create custom objects in a collection and find the duplicate objects using a Stream.
```java
package com.Java2Code;

import java.util.Objects;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {

    }
}

class Employee {
    private int id;
    private String firstName;
    private String lastName;

    public Employee(int id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    @Override
    public String toString() {
        return "Employee{" +
                "firstName='" + firstName + '\'' +
                ", lastName='" + lastName + '\'' +
                '}';
    }

    // Two employees are equal when their ids match
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Employee employee = (Employee) o;
        return id == employee.id;
    }

    // Hash only the id, consistent with equals()
    @Override
    public int hashCode() {
        return Objects.hash(id);
    }
}
```
Using distinct()
Create a List of Strings and call the stream() method to return a Stream of its elements.
Call the distinct() method, which returns a stream of the elements without duplicates.
Collect the elements to a List by passing Collectors.toList() to the collect() method.
To view the duplicate elements, remove each element of the new List from the original List once. List.remove(Object) removes only the first matching occurrence, so the extra occurrences are left behind as the duplicates.
```java
package com.Java2Code;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {
        // List.of requires Java 9 or later; on Java 8 use Arrays.asList instead.
        // The ArrayList wrapper makes the list mutable so elements can be removed.
        List<String> cars = new ArrayList<>(
                List.of(
                        "Mercedes",
                        "Toyota",
                        "Nissan",
                        "Volkswagen",
                        "Ford",
                        "Maclaren",
                        "Mercedes",
                        "Nissan",
                        "Ford"
                )
        );

        List<String> distinctCars = cars.stream()
                .distinct()
                .collect(Collectors.toList());

        // Remove the first occurrence of every distinct value
        for (String distinctCar : distinctCars) {
            cars.remove(distinctCar);
        }

        // Only the extra occurrences remain
        cars.forEach(System.out::println);
    }
}
```
Output:
Mercedes
Nissan
Ford
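One limitation of the removal approach is that a value occurring three or more times is left in the list more than once. The following is a minimal sketch, assuming a hypothetical helper named duplicatesOf (it is not part of the example above), that applies distinct() a second time so each duplicated value is reported once. It uses Arrays.asList so that it also compiles on Java 8.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctDuplicates {

    // Hypothetical helper: returns every value that occurs more than once, listed once
    static <T> List<T> duplicatesOf(List<T> items) {
        // Work on a copy so the caller's list is left untouched
        List<T> leftovers = new ArrayList<>(items);

        // Remove the first occurrence of each distinct value
        items.stream()
                .distinct()
                .forEach(item -> leftovers.remove(item));

        // Whatever is left is an extra occurrence; collapse repeats
        return leftovers.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cars = Arrays.asList("Ford", "Nissan", "Ford", "Ford", "Nissan", "Toyota");
        System.out.println(duplicatesOf(cars)); // prints [Ford, Nissan]
    }
}
```

Without the second distinct() call, "Ford" would appear twice in the result because it occurs three times in the input.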
Using Collections.frequency()
Create a List of Employee objects and call the stream() method to return a Stream of them.
Call the filter() method, which accepts a Predicate: a function that takes a single input and returns a boolean.
Inside the predicate we call Collections.frequency(), passing the list and the element we want to check. It returns the number of times the element occurs in the list as an int, and we compare that count with 1 to keep only the elements that occur more than once.
The frequency() method uses the equals() method to compare objects, which is why Employee overrides equals(). The matching hashCode() is needed because the results are collected into a hash-based Set. Note that equal objects must have the same hash code, but two objects with the same hash code are not necessarily equal.
The frequency() method throws a NullPointerException if the provided collection is null.
Collect the duplicate objects to a Set by passing Collectors.toSet() to the collect() method. Because a Set holds no duplicates, each duplicated employee appears only once in the result.
```java
package com.Java2Code;

import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {
        // Uses the Employee class defined earlier.
        // List.of requires Java 9 or later; on Java 8 use Arrays.asList instead.
        List<Employee> employees = List.of(
                new Employee(1, "john", "doe"),
                new Employee(2, "peter", "parker"),
                new Employee(3, "mary", "public"),
                new Employee(4, "charles", "darwin"),
                new Employee(1, "john", "doe"),
                new Employee(3, "mary", "public")
        );

        // Keep the employees that occur more than once in the list
        Set<Employee> duplicateEmployees = employees.stream()
                .filter(employee -> Collections.frequency(employees, employee) > 1)
                .collect(Collectors.toSet());

        duplicateEmployees.forEach(System.out::println);
    }
}
```
Output:
Employee{firstName='john', lastName='doe'}
Employee{firstName='mary', lastName='public'}
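Collections.frequency() scans the whole list for every element it is asked about, so this approach does quadratic work and is best suited to small lists. The sketch below prints the int counts that frequency() returns and, as an assumption of this example rather than something the tutorial does, collects into a LinkedHashSet so the duplicates keep their encounter order. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class FrequencyDuplicates {

    // Hypothetical helper: duplicated values in the order they first appear
    static <T> Set<T> duplicatesOf(List<T> items) {
        return items.stream()
                // frequency() returns an int count; the comparison makes it a boolean
                .filter(item -> Collections.frequency(items, item) > 1)
                // LinkedHashSet drops repeats and remembers insertion order
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 1, 3);

        System.out.println(Collections.frequency(ids, 1)); // prints 2
        System.out.println(Collections.frequency(ids, 2)); // prints 1
        System.out.println(duplicatesOf(ids));             // prints [1, 3]
    }
}
```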
Using Collectors.toSet()
Create a List of String elements and a new HashSet that we will use to keep track of the elements seen so far.
A Set does not hold duplicate values: calling add() with an element that is already present leaves the Set unchanged and returns false.
When an element is added successfully, add() returns true, showing that it was not yet present in the Set. Note that a HashSet accepts a null element, whereas some other implementations, such as a TreeSet using natural ordering, throw a NullPointerException when you try to add null.
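The return value of add() is what the duplicate check relies on, so it may help to see it in isolation. This is a small sketch with a hypothetical helper named addResults that records what add() returns for each value. It uses a HashSet, which accepts a null element like any other value (other Set implementations may reject it).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetAddDemo {

    // Hypothetical helper: adds each value in turn and records what add() returned
    static List<Boolean> addResults(Set<String> seen, String... values) {
        List<Boolean> results = new ArrayList<>();
        for (String value : values) {
            // true means the value was new, false means it was already present
            results.add(seen.add(value));
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();
        List<Boolean> results = addResults(seen, "english", "french", "english", null, null);

        System.out.println(results);     // prints [true, true, false, true, false]
        System.out.println(seen.size()); // prints 3
    }
}
```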
Call the stream() method on the list we just created to return a Stream of the elements.
Add a filter() call that keeps only the elements that could not be added to the Set because they were already seen, and collect them to a second Set using Collectors.toSet().
```java
package com.Java2Code;

import java.util.*;
import java.util.stream.Collectors;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {
        // List.of requires Java 9 or later; on Java 8 use Arrays.asList instead
        List<String> languages = List.of(
                "english",
                "chinese",
                "french",
                "spanish",
                "hindi",
                "english",
                "french"
        );

        // Tracks every language seen so far
        Set<String> uniqueLanguages = new HashSet<>();

        // add() returns false for a language that was already seen
        Set<String> duplicateLanguages = languages.stream()
                .filter(language -> !uniqueLanguages.add(language))
                .collect(Collectors.toSet());

        duplicateLanguages.forEach(System.out::println);
    }
}
```
Output:
english
french
Using Collectors.toMap()
The Collectors.toMap() method returns a Map, and we can find the duplicate elements by counting the occurrences of each element and storing the counts as values in the Map.
To achieve this, we pass a keyMapper, a valueMapper, and a mergeFunction to the method.
The keyMapper is a Function that produces the keys. We use Function.identity(), which returns its input argument, so each element becomes its own key.
The valueMapper is a Function that produces the values. We use computer -> 1, which maps every element to an initial count of 1.
Integer::sum is a BinaryOperator that serves as the merge function: whenever a key is encountered again, it adds the new 1 to the count already stored for that key.
The result of the merge function is stored as the value in the Map. Any key whose value is greater than 1 is a duplicate, so the Map identifies the duplicate elements in a Stream, although it does not remove them.
```java
package com.Java2Code;

import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {
        // List.of requires Java 9 or later; on Java 8 use Arrays.asList instead
        List<String> computers = List.of(
                "Dell",
                "HP",
                "IBM",
                "Apple",
                "HP",
                "Apple"
        );

        // Map each computer to 1 and add the counts together for repeated keys
        Map<String, Integer> computerOccurrences = computers.stream()
                .collect(Collectors
                        .toMap(Function.identity(), computer -> 1, Integer::sum));

        System.out.println(computerOccurrences);
    }
}
```
Output (the entry order of the returned Map is not guaranteed):
{Dell=1, Apple=2, IBM=1, HP=2}
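The Map of counts still has to be filtered to obtain the duplicates themselves. One possible way to do that, sketched here with a hypothetical helper named duplicatesOf, is to stream the entries and keep the keys whose count is greater than 1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapDuplicates {

    // Hypothetical helper: counts occurrences with toMap(), then keeps keys seen more than once
    static <T> Set<T> duplicatesOf(List<T> items) {
        Map<T, Integer> counts = items.stream()
                .collect(Collectors.toMap(Function.identity(), item -> 1, Integer::sum));

        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<String> computers = Arrays.asList("Dell", "HP", "IBM", "Apple", "HP", "Apple");

        // The resulting set contains HP and Apple; set iteration order is not guaranteed
        System.out.println(duplicatesOf(computers));
    }
}
```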
Using Collectors.groupingBy()
The Collectors.groupingBy() method groups elements based on a classification function and returns the result in a Map.
The Map keys are the results of applying the classification function to the input elements. By default each value is a List of the elements that map to that key, but a downstream collector can reduce each group to a different value.
To count occurrences, pass Function.identity() as the classification function and Collectors.counting() as the downstream collector to groupingBy(). The resulting Map associates each element with the number of times it occurs, as a Long.
Call entrySet() to return a Set view of the Map, call stream() on it, and use the filter() method to keep only the entries whose value is greater than one.
Add a map() call and pass Map.Entry::getKey, which yields a Stream of the duplicated elements, and collect the result to a Set using Collectors.toSet().
```java
package com.Java2Code;

import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FindDuplicateElementsUsingStream {
    public static void main(String[] args) {
        // Uses the Employee class defined earlier.
        // List.of requires Java 9 or later; on Java 8 use Arrays.asList instead.
        List<Employee> employees = List.of(
                new Employee(1, "john", "doe"),
                new Employee(2, "peter", "parker"),
                new Employee(3, "mary", "public"),
                new Employee(4, "charles", "darwin"),
                new Employee(1, "john", "doe"),
                new Employee(3, "mary", "public")
        );

        Set<Employee> duplicateEmployees = employees.stream()
                // Count how many times each employee occurs
                .collect(Collectors
                        .groupingBy(Function.identity(), Collectors.counting()))
                .entrySet()
                .stream()
                // Keep only the employees counted more than once
                .filter(employee -> employee.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());

        duplicateEmployees.forEach(System.out::println);
    }
}
```
Output:
Employee{firstName='john', lastName='doe'}
Employee{firstName='mary', lastName='public'}
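The same pipeline can be wrapped in a reusable generic method. The sketch below is one way to do it, with two assumptions that go beyond the example above: it uses the three-argument groupingBy() overload with LinkedHashMap::new so the counts keep encounter order, and it collects the duplicates to a List. The names GroupingByDuplicates and duplicatesOf are hypothetical. Note that groupingBy() rejects elements that map to a null key, so this helper is not suitable for collections containing null.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingByDuplicates {

    // Hypothetical helper: duplicated elements in the order they first appear
    static <T> List<T> duplicatesOf(Collection<T> items) {
        return items.stream()
                // LinkedHashMap keeps the keys in encounter order
                .collect(Collectors.groupingBy(
                        Function.identity(), LinkedHashMap::new, Collectors.counting()))
                .entrySet()
                .stream()
                // counting() produces Long values
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1, 3);
        System.out.println(duplicatesOf(numbers)); // prints [3, 1]
    }
}
```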
Conclusion
In this tutorial, you have learned how to find duplicate elements in a Stream in Java 8 using the distinct(), Collections.frequency(), Collectors.toSet(), Collectors.toMap(), and Collectors.groupingBy() methods.