Reputation: 97
I have a scenario where I have a list like the one below:
List<String> a1 = new ArrayList<String>();
a1.add("1070045028000");
a1.add("1070045028001");
a1.add("1070045052000");
a1.add("1070045086000");
a1.add("1070045052001");
a1.add("1070045089000");
I tried the code below to find duplicate elements, but it compares whole strings instead of only the first 10 digits.
Set<String> unique = new HashSet<>();
for (String s : a1) {
    if (!unique.add(s)) {
        System.out.println(s);
    }
}
Is there a way to identify all duplicates based on the first 10 digits of each number, then pick the lowest string from each group of duplicates and add it to another list?
Note: a given 10-digit prefix will never occur more than twice.
Upvotes: 1
Views: 528
Reputation: 5754
Here's another way to do it – construct a Set and store just the 10-digit prefix:
Set<String> set = new HashSet<>();
for (String number : a1) {
    String prefix = number.substring(0, 10);
    if (set.contains(prefix)) {
        System.out.println("found duplicate prefix [" + prefix + "], skipping " + number);
    } else {
        set.add(prefix);
    }
}
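The loop above only reports that a prefix was seen before. As a minimal sketch of taking it one step further, the version below remembers the first string seen for each prefix and, on the second occurrence, keeps whichever of the two sorts lower. The class name PrefixPairs and the helper lowerOfEachPair are hypothetical, and the code relies on the question's statement that a prefix occurs at most twice and that all strings have the same length (so string comparison matches numeric comparison).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrefixPairs {
    // Hypothetical helper: returns the lower string of every pair sharing a 10-character prefix.
    // Assumes each prefix occurs at most twice, as stated in the question.
    static List<String> lowerOfEachPair(List<String> numbers) {
        Map<String, String> firstSeen = new HashMap<>();
        List<String> result = new ArrayList<>();
        for (String number : numbers) {
            String prefix = number.substring(0, 10);
            // putIfAbsent returns the earlier value if the prefix was already present, else null.
            String earlier = firstSeen.putIfAbsent(prefix, number);
            if (earlier != null) {
                // Second occurrence: keep whichever string sorts lower.
                result.add(earlier.compareTo(number) <= 0 ? earlier : number);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a1 = Arrays.asList("1070045028000", "1070045028001", "1070045052000",
                "1070045086000", "1070045052001", "1070045089000");
        System.out.println(lowerOfEachPair(a1)); // [1070045028000, 1070045052000]
    }
}
```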
Upvotes: 0
Reputation: 49606
You can group the strings by their first 10 characters, using the classifier (String s) -> s.substring(0, 10):
Map<String, List<String>> map = list.stream()
    .collect(Collectors.groupingBy(s -> s.substring(0, 10)));
map.values() would give you a Collection<List<String>>, where each List<String> is a list of duplicates.
{
1070045028=[1070045028000, 1070045028001],
1070045089=[1070045089000],
1070045086=[1070045086000],
1070045052=[1070045052000, 1070045052001]
}
If it's a single-element list, no duplicates were found, and you can filter these entries out.
{
1070045028=[1070045028000, 1070045028001],
1070045052=[1070045052000, 1070045052001]
}
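A minimal sketch of that filtering step, assuming the same sample data as the question. The class name DuplicateGroups and the helper duplicateGroups are hypothetical. The grouped map is re-streamed rather than modified in place, since groupingBy makes no promise that the map it returns is mutable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateGroups {
    // Hypothetical helper: groups by the first 10 characters and keeps only groups with 2+ members.
    static Map<String, List<String>> duplicateGroups(List<String> numbers) {
        return numbers.stream()
                .collect(Collectors.groupingBy(s -> s.substring(0, 10)))
                .entrySet().stream()
                .filter(e -> e.getValue().size() > 1)   // drop single-element groups
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<String> a1 = Arrays.asList("1070045028000", "1070045028001", "1070045052000",
                "1070045086000", "1070045052001", "1070045089000");
        // Entry order is not guaranteed, because the result is a hash-based map.
        System.out.println(duplicateGroups(a1));
    }
}
```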
Then the problem boils down to reducing a list of values to a single value.
[1070045028000, 1070045028001] -> 1070045028000
We know that the first 10 symbols are the same, so we may ignore them while comparing.
[1070045028000, 1070045028001] -> [000, 001]
They are still raw String values, so we may convert them to numbers.
[000, 001] -> [0, 1]
A natural Comparator<Integer> will give 0 as the minimum.
0
0 -> 000 -> 1070045028000
Repeat this for all the lists in map.values() and you are done.
The code would be
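List<String> result = map
    .values()
    .stream()
    // the lambda parameter is named "group" so it cannot clash with the source variable "list"
    .filter(group -> group.size() > 1)
    .map(group -> group.stream()
        .min(Comparator.comparingInt((String s) -> Integer.parseInt(s.substring(10))))
        .get())
    .collect(Collectors.toList());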
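For anyone who wants to run the steps above end to end, here is a self-contained sketch with the imports filled in. The class name LowestOfDuplicates and the helper lowestOfDuplicates are hypothetical, and the final sorted() call is an addition of this sketch so that the printed order does not depend on the hash map's iteration order.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class LowestOfDuplicates {
    // Hypothetical helper: for every 10-character prefix that occurs more than once,
    // returns the string whose numeric suffix is smallest.
    static List<String> lowestOfDuplicates(List<String> numbers) {
        return numbers.stream()
                .collect(Collectors.groupingBy(s -> s.substring(0, 10)))
                .values().stream()
                .filter(group -> group.size() > 1)
                .map(group -> group.stream()
                        // compare only the part after the shared prefix, as a number
                        .min(Comparator.comparingInt((String s) -> Integer.parseInt(s.substring(10))))
                        .get())
                .sorted() // added here only to make the output order deterministic
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> a1 = Arrays.asList("1070045028000", "1070045028001", "1070045052000",
                "1070045086000", "1070045052001", "1070045089000");
        System.out.println(lowestOfDuplicates(a1)); // [1070045028000, 1070045052000]
    }
}
```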
Upvotes: 2
Reputation: 298183
A straightforward loop solution would be
List<String> a1 = Arrays.asList("1070045028000", "1070045028001",
    "1070045052000", "1070045086000", "1070045052001", "1070045089000");
Set<String> unique = new HashSet<>();
Map<String,String> map = new HashMap<>();
for(String s: a1) {
    String firstTen = s.substring(0, 10);
    if(!unique.add(firstTen)) map.put(firstTen, s);
}
for(String s1: a1) {
    String firstTen = s1.substring(0, 10);
    map.computeIfPresent(firstTen, (k, s2) -> s1.compareTo(s2) < 0? s1: s2);
}
List<String> minDup = new ArrayList<>(map.values());
First, we add all duplicates to a Map, then we iterate over the list again and select the minimum for all values present in the map.
Alternatively, we may add all elements to a map, collecting them into lists, then select the minimum from each list that has more than one element:
List<String> minDup = new ArrayList<>();
Map<String,List<String>> map = new HashMap<>();
for(String s: a1) {
    map.computeIfAbsent(s.substring(0, 10), x -> new ArrayList<>()).add(s);
}
for(List<String> list: map.values()) {
    if(list.size() > 1) minDup.add(Collections.min(list));
}
This logic is directly expressible with the Stream API:
List<String> minDup = a1.stream()
    .collect(Collectors.groupingBy(s -> s.substring(0, 10)))
    .values().stream()
    .filter(list -> list.size() > 1)
    .map(Collections::min)
    .collect(Collectors.toList());
Since you said that there will be only 2 duplicates per key, the overhead of collecting a List before selecting the minimum is negligible.
The solutions above assume that you only want to keep values having duplicates. Otherwise, you can use
List<String> minDup = a1.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(s -> s.substring(0, 10), Function.identity(),
            BinaryOperator.minBy(Comparator.<String>naturalOrder())),
        m -> new ArrayList<>(m.values())));
which is equivalent to
Map<String,String> map = new HashMap<>();
for(String s: a1) {
    map.merge(s.substring(0, 10), s, BinaryOperator.minBy(Comparator.naturalOrder()));
}
List<String> minDup = new ArrayList<>(map.values());
Common to those solutions is that you don't have to identify duplicates first: when you want to keep unique values too, the task reduces to selecting the minimum whenever you encounter a duplicate.
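A runnable sketch of this "keep unique values too" variant, for comparison with the duplicates-only versions. The class name MinPerPrefix and the helper minPerPrefix are hypothetical, and a TreeMap is swapped in for the HashMap purely so that the result comes out ordered by prefix. The comparison is the natural String order, which matches numeric order only because all strings have the same length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

public class MinPerPrefix {
    // Hypothetical helper: returns the lowest string for every 10-character prefix,
    // whether or not that prefix has a duplicate.
    static List<String> minPerPrefix(List<String> numbers) {
        // TreeMap (an assumption of this sketch) keeps the keys sorted for stable output.
        Map<String, String> minByPrefix = new TreeMap<>();
        BinaryOperator<String> lower = BinaryOperator.minBy(Comparator.<String>naturalOrder());
        for (String s : numbers) {
            // merge stores s if the prefix is new, otherwise keeps the lower of old and new.
            minByPrefix.merge(s.substring(0, 10), s, lower);
        }
        return new ArrayList<>(minByPrefix.values());
    }

    public static void main(String[] args) {
        List<String> a1 = Arrays.asList("1070045028000", "1070045028001", "1070045052000",
                "1070045086000", "1070045052001", "1070045089000");
        // [1070045028000, 1070045052000, 1070045086000, 1070045089000]
        System.out.println(minPerPrefix(a1));
    }
}
```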
Upvotes: 1
Reputation: 11934
While I hate doing your homework for you, this was fun. :/
public static void main(String[] args) {
    List<String> al = new ArrayList<>();
    al.add("1070045028000");
    al.add("1070045028001");
    al.add("1070045052000");
    al.add("1070045086000");
    al.add("1070045052001");
    al.add("1070045089000");
    List<String> ret = new ArrayList<>();
    for (String a : al) {
        boolean handled = false;
        for (int i = 0; i < ret.size(); i++) {
            String ri = ret.get(i);
            if (ri.substring(0, 10).equals(a.substring(0, 10))) {
                long iri = Long.parseLong(ri);
                long ia = Long.parseLong(a);
                if (ia < iri) {
                    //a is smaller, so replace it in the list
                    ret.set(i, a);
                }
                //it was a duplicate, we are done with it
                handled = true;
                break;
            }
        }
        if (!handled) {
            //wasn't a duplicate, just add it
            ret.add(a);
        }
    }
    System.out.println(ret);
}
prints
[1070045028000, 1070045052000, 1070045086000, 1070045089000]
Upvotes: 0