Java 8 – Find duplicate elements in Stream

Introduction

When working with a collection of elements in Java, it is very common to end up with duplicate elements, and Java provides several APIs that we can use to find them.

The Java 8 Stream API provides aggregate operations on collections, and several of those operations can be combined to find duplicate elements.

In this tutorial, we will see how to find duplicate elements in Stream in Java 8.

Create a custom class named Employee with the fields id, firstName, and lastName, and generate an all-arguments constructor along with the toString(), equals(), and hashCode() methods.

When generating equals() and hashCode(), base both on the id field. Two Employee objects with the same id are then considered equal and return the same hash code, which is what hash-based collections such as HashSet and HashMap rely on to detect duplicates.

We will use the Employee class in our examples to create custom objects in a collection and find the duplicate objects using a Stream.
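The class described above might look like the following minimal sketch. The field types and the toString() format (which prints only the names, to match the output shown later) are assumptions, since the tutorial does not spell them out.

```java
import java.util.Objects;

// Illustrative Employee class; field types and toString() format are assumptions.
class Employee {
    private final int id;
    private final String firstName;
    private final String lastName;

    Employee(int id, String firstName, String lastName) {
        this.id = id;
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Two employees are equal when they share the same id.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Employee)) return false;
        return id == ((Employee) o).id;
    }

    // Must agree with equals(): equal objects return the same hash code.
    @Override
    public int hashCode() {
        return Objects.hash(id);
    }

    @Override
    public String toString() {
        return "Employee{firstName='" + firstName + "', lastName='" + lastName + "'}";
    }
}
```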

Using distinct()

Create a List of Strings and call the stream() method to return a Stream of our elements.

Call the distinct() method that returns a stream of the elements without duplicates.

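Collect the elements to a List by passing Collectors.toList() to the collect() method.

To view the duplicate elements, remove one occurrence of each distinct element from a copy of the original List by calling remove(Object) for each element of the new List. Whatever is left are the duplicates. Note that removeAll() would not work here, because it removes every occurrence and leaves the List empty.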
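The steps above can be sketched as follows. The class name, method name, and sample car names are assumptions chosen for illustration, not the original listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class DistinctDuplicates {

    // Returns every occurrence beyond the first, in encounter order.
    static <T> List<T> findDuplicates(List<T> items) {
        List<T> distinct = items.stream()
                .distinct()
                .collect(Collectors.toList());

        // Work on a copy so the caller's list is left untouched.
        List<T> remaining = new ArrayList<>(items);
        for (T item : distinct) {
            remaining.remove(item); // remove(Object) drops only the first match
        }
        return remaining;
    }

    public static void main(String[] args) {
        // Sample data is an assumption chosen for illustration.
        List<String> cars = Arrays.asList(
                "Toyota", "Mercedes", "Nissan", "Ford", "Mercedes", "Nissan", "Ford");
        findDuplicates(cars).forEach(System.out::println);
    }
}
```

With this approach an element that occurs three times shows up twice in the result, so collect the result into a Set if each duplicate should be reported only once.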

Output:

Mercedes

Nissan

Ford

Using Collections.frequency()

Create a List of Employee objects and call the stream() method to return a Stream of them.

Call the filter() method, which accepts a Predicate: a function that takes a single element and returns a boolean.

Inside the predicate, call Collections.frequency() with the list and the element we want to check. It returns an int, the number of times the element occurs in the list, so we compare the result with 1 to check whether the element occurs more than once.

The frequency() method compares elements using equals(), which is why Employee overrides it. hashCode() is overridden alongside it because equal objects must return the same hash code, and hash-based collections such as HashSet rely on that contract. The reverse does not hold: two objects with the same hash code are not necessarily equal.

The frequency() method throws a NullPointerException if the provided collection is null.

Collect the duplicate objects to a Set using Collectors.toSet() in the collect() method.
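A minimal sketch of these steps is shown below. The class name, the compact nested Employee stand-in, and the sample data are assumptions made so that the example is self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class FrequencyDuplicates {

    // Keeps each element that occurs more than once; the Set collapses repeats.
    static <T> Set<T> findDuplicates(List<T> items) {
        return items.stream()
                .filter(item -> Collections.frequency(items, item) > 1)
                .collect(Collectors.toSet());
    }

    // Compact stand-in for the tutorial's Employee; equality is based on id only.
    static final class Employee {
        final int id;
        final String firstName;
        final String lastName;

        Employee(int id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Employee && ((Employee) o).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }

        @Override
        public String toString() {
            return "Employee{firstName='" + firstName + "', lastName='" + lastName + "'}";
        }
    }

    public static void main(String[] args) {
        // Sample data is an assumption; ids 1 and 2 each appear twice.
        List<Employee> employees = Arrays.asList(
                new Employee(1, "john", "doe"),
                new Employee(2, "mary", "public"),
                new Employee(3, "jane", "roe"),
                new Employee(1, "john", "doe"),
                new Employee(2, "mary", "public"));
        findDuplicates(employees).forEach(System.out::println);
    }
}
```

Because frequency() scans the whole list for every element, this approach takes quadratic time, which is fine for small lists but slow for large ones. The iteration order of the resulting Set is not guaranteed.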

Output:

Employee{firstName='john', lastName='doe'}
Employee{firstName='mary', lastName='public'}

Using Collectors.toSet()

Create a List of String elements and a new HashSet that we will use to keep track of the elements we have already seen.

A Set does not hold duplicate values: adding an element that is already present leaves the Set unchanged and add() returns false.

When an element is added successfully, add() returns true, showing that it was not already present in the Set. Note that HashSet accepts a null element, whereas Set implementations that do not permit nulls (TreeSet with natural ordering, for example) throw a NullPointerException from add().

Call the stream() method on the list we just created to return a Stream of the elements.

Add a filter() call that keeps only the elements for which add() returns false, that is, the elements that were already in the HashSet, and collect them to a Set using the Collectors.toSet() method.
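These steps can be sketched as follows. The class name and the sample language names are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class SeenSetDuplicates {

    // An element is a duplicate when add() reports it was already in the set.
    static <T> Set<T> findDuplicates(List<T> items) {
        Set<T> seen = new HashSet<>();
        return items.stream()
                .filter(item -> !seen.add(item))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Sample data is an assumption chosen for illustration.
        List<String> languages = Arrays.asList(
                "english", "french", "swahili", "english", "french");
        findDuplicates(languages).forEach(System.out::println);
    }
}
```

The predicate modifies the seen set as a side effect, so this sketch is meant for sequential streams only. A plain HashSet is not thread-safe and should not be used this way with a parallel stream.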

Output:

english

french

Using Collectors.toMap()

The Collectors.toMap() method returns a Collector that accumulates elements into a Map. We can find the duplicate elements by using each element as a key and storing the number of times it occurs as the value.

To achieve this, we need to pass a keyMapper, valueMapper and a mergingFunction to the method.

The keyMapper is a Function that produces the keys. We use Function.identity(), which returns its input unchanged, so each element becomes its own key.

The valueMapper is a Function that produces the values. We use computer -> 1, which maps every occurrence of an element to 1.

Integer::sum is a BinaryOperator that serves as our merging function. Whenever a key is encountered that is already in the Map, the merging function adds the new 1 to the value stored for that key.

The merged totals are stored as the values in the Map, so any key with a value greater than 1 is a duplicate. This identifies the duplicate elements in a Stream, though it does not remove them.
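A short sketch of this collector is shown below. The class name, method name, and sample computer brands are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapCounts {

    // Maps each element to the number of times it occurs in the list.
    static <T> Map<T, Integer> countOccurrences(List<T> items) {
        return items.stream()
                .collect(Collectors.toMap(
                        Function.identity(), // keyMapper: the element itself
                        item -> 1,           // valueMapper: each occurrence counts as 1
                        Integer::sum));      // mergeFunction: add counts for repeated keys
    }

    public static void main(String[] args) {
        // Sample data is an assumption chosen for illustration.
        List<String> computers = Arrays.asList(
                "Dell", "Apple", "IBM", "HP", "Apple", "HP");
        System.out.println(countOccurrences(computers));
    }
}
```

Without the merge function, toMap() throws an IllegalStateException as soon as it meets a duplicate key, so the three-argument overload is the one needed here. The iteration order of the returned Map is not guaranteed.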

Output:

{Dell=1, Apple=2, IBM=1, HP=2}

Using Collectors.groupingBy()

The Collectors.groupingBy() method groups elements based on a classification function and returns the result in a Map.

The Map keys are the result of applying the classification function to the input elements. By default the values are Lists of the input elements that map to each key, but a downstream Collector can reduce each group to a single value instead.

To count the occurrences of each element, pass Function.identity() as the classification function and Collectors.counting() as the downstream Collector to groupingBy(). The result is a Map from each element to a Long count.

Call entrySet() to return a Set view of the Map, call stream() on it, and use the filter() method to keep only the entries whose count is greater than one.

Add a map() call and pass Map.Entry::getKey, which turns the Stream of entries into a Stream of the duplicate elements, and collect the result to a Set using the Collectors.toSet() method.
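The pipeline described above might be sketched like this. The class and method names are assumptions, and the method is generic so that it works for Employee objects as well as for any other type with consistent equals() and hashCode() methods.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class GroupingByDuplicates {

    // Counts each element, then keeps the keys that were counted more than once.
    static <T> Set<T> findDuplicates(List<T> items) {
        Map<T, Long> counts = items.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),     // classifier: group equal elements
                        Collectors.counting())); // downstream: size of each group

        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() > 1)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }
}
```

Unlike the Collections.frequency() approach, this makes a single pass over the list plus one pass over the Map. One caveat is that groupingBy() rejects null keys, so a list containing null elements would cause a NullPointerException here.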

Output:

Employee{firstName='john', lastName='doe'}
Employee{firstName='mary', lastName='public'}

Conclusion

In this tutorial, you have learned how to find duplicate elements in a Java 8 Stream using the distinct(), Collections.frequency(), Collectors.toSet(), Collectors.toMap(), and Collectors.groupingBy() methods.
