Reputation: 113
How can we remove duplicate elements from a list of Strings while ignoring the case of each word? For example, consider the code snippet below:
String str = "Kobe Is is The the best player In in Basketball basketball game .";
List<String> list = Arrays.asList(str.split("\\s"));
list.stream().distinct().forEach(s -> System.out.print(s+" "));
This still gives the same output, shown below, which is expected because distinct() compares strings case-sensitively:
Kobe Is is The the best player In in Basketball basketball game .
I need the result as follows
Kobe Is The best player In Basketball game .
Upvotes: 10
Views: 8511
Reputation: 476
The provided TreeSet solution is elegant, but a TreeSet also keeps its elements sorted, which is more work than this problem needs. The code below shows a more efficient approach using a HashMap, which gives precedence to the string that has more upper-case letters. Note that a plain HashMap does not preserve the original word order.
class SetWithIgnoreCase {
private HashMap<String, String> underlyingMap = new HashMap<>();
public void put(String str) {
String lowerCaseStr = str.toLowerCase();
underlyingMap.compute(lowerCaseStr, (k, v) -> (v == null) ? str : (compare(v, str) > 0 ? v : str));
}
private int compare(String str1, String str2) {
// count the upper-case letters of each string separately, so differing lengths are safe
long upperCaseCnt1 = str1.chars().filter(Character::isUpperCase).count();
long upperCaseCnt2 = str2.chars().filter(Character::isUpperCase).count();
return Long.compare(upperCaseCnt1, upperCaseCnt2);
}
}
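As a rough usage sketch of the same idea (prefer the spelling with more upper-case letters), the following keeps the words in first-seen order by swapping in a LinkedHashMap. The class name PreferUpperCase and its methods are made up for illustration, and keeping the earlier word on a tie is an assumption rather than the behaviour of the class above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the answer above.
class PreferUpperCase {
    // Number of upper-case characters in s.
    static long upperCaseCount(String s) {
        return s.chars().filter(Character::isUpperCase).count();
    }

    // One entry per lower-cased key; the spelling with more capitals wins,
    // and the position of the first occurrence is kept.
    static List<String> dedup(List<String> words) {
        Map<String, String> byLowerCase = new LinkedHashMap<>();
        for (String word : words) {
            byLowerCase.merge(word.toLowerCase(Locale.ROOT), word,
                    (kept, candidate) ->
                            upperCaseCount(candidate) > upperCaseCount(kept) ? candidate : kept);
        }
        return new ArrayList<>(byLowerCase.values());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("is", "Is", "the", "THE", "The", "game");
        System.out.println(String.join(" ", dedup(words)));
    }
}
```

With the sample in main, "Is" replaces "is" and "THE" replaces "the", while "The" is ignored because it has fewer capitals than "THE".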
Upvotes: 0
Reputation: 4496
Here's a one-line solution. It makes use of the jOOλ library and its Seq.distinct(Function<T,U>) method:
List<String> distinctWords = Seq.seq(list).distinct(String::toLowerCase).toList();
Result (when printed like in the question):
Kobe Is The best player In Basketball game .
Upvotes: 0
Reputation: 298429
Taking your question literally, to “remove duplicate strings irrespective of case from a list”, you may use
// just for constructing a sample list
String str = "Kobe Is is The the best player In in Basketball basketball game .";
List<String> list = new ArrayList<>(Arrays.asList(str.split("\\s")));
// the actual operation
TreeSet<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
list.removeIf(s -> !seen.add(s));
// just for debugging
System.out.println(String.join(" ", list));
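If the original list must stay untouched (or is fixed-size, as with a bare Arrays.asList), the same TreeSet trick can probably be used as a stream filter instead. This is only a sketch: the class and method names are invented here, and the stateful predicate assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
class DistinctIgnoreCase {
    // Returns a new list with the first occurrence of each word, ignoring case.
    static List<String> distinctIgnoreCase(List<String> words) {
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        // seen.add returns false for a word already present under another case;
        // this stateful filter is only safe on a sequential stream
        return words.stream().filter(seen::add).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String str = "Kobe Is is The the best player In in Basketball basketball game .";
        List<String> list = Arrays.asList(str.split("\\s"));
        System.out.println(String.join(" ", distinctIgnoreCase(list)));
    }
}
```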
Upvotes: 13
Reputation: 47
The problem is that the repeated strings do not occur in exactly the same case. The first word is Basketball and the other one is basketball, so the two are not equal: the first occurrence has a capital B. What you can do is convert both strings to lower case (or both to UPPER CASE) before comparing them, or compare them while ignoring case.
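A minimal sketch of that idea, assuming you want to keep the first spelling encountered: remember the lower-cased form of every word already seen and skip any word whose lower-cased form is already known. The class and method names are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only.
class CaseInsensitiveDedup {
    // Keeps the first occurrence of each word, comparing by lower-cased form.
    static List<String> dedup(List<String> words) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String word : words) {
            // Set.add returns false when the lower-cased key is already present
            if (seen.add(word.toLowerCase(Locale.ROOT))) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "Kobe Is is The the best player In in Basketball basketball game .";
        System.out.println(String.join(" ", dedup(Arrays.asList(str.split("\\s")))));
    }
}
```

For a direct comparison of two strings, "Basketball".equalsIgnoreCase("basketball") returns true.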
Upvotes: 0
Reputation: 56453
Here's a fun solution to get the expected result with the use of streams.
String result = Pattern.compile("\\s")
.splitAsStream(str)
.collect(Collectors.collectingAndThen(Collectors.toMap(String::toLowerCase,
Function.identity(),
(l, r) -> l,
LinkedHashMap::new),
m -> String.join(" ", m.values())));
prints:
Kobe Is The best player In Basketball game .
Upvotes: 3
Reputation: 97272
In case you only need to get rid of consecutive duplicates, you can use a regular expression. The regex below checks for duplicated words, ignoring case.
String input = "Kobe Is is The the best player In in Basketball basketball game .";
String output = input.replaceAll("(?i)\\b(\\w+)\\s+\\1\\b", "$1");
System.out.println(output);
Which outputs:
Kobe Is The best player In Basketball game .
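To make the "consecutive only" caveat concrete, here is a small sketch that wraps the same regex in a helper (the class and method names are made up) and runs it on one input with adjacent duplicates and one where the duplicate is separated by another word.

```java
// Hypothetical helper for illustration only.
class ConsecutiveOnly {
    // Collapses a word that is immediately followed by the same word, ignoring case.
    static String collapse(String input) {
        return input.replaceAll("(?i)\\b(\\w+)\\s+\\1\\b", "$1");
    }

    public static void main(String[] args) {
        // adjacent duplicates are collapsed
        System.out.println(collapse("Is is The the"));
        // "the" and "The" are separated by "best", so nothing is removed
        System.out.println(collapse("the best The player"));
    }
}
```

The first call yields "Is The"; the second returns its input unchanged, so this approach does not cover duplicates scattered through the sentence.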
Upvotes: 3
Reputation: 6742
This keeps the capitalized word and removes its lower-case duplicate (only adjacent words are compared):
String str = "Kobe Is is The the best player In in Basketball basketball game .";
List<String> list = Arrays.asList(str.split("\\s"));
for(int i = 1; i<list.size(); i++)
{
if(list.get(i).equalsIgnoreCase(list.get(i-1)))
{
// is lower case
if(list.get(i).toLowerCase().equals(list.get(i)))
{
list.set(i,"");
}
else
{
list.set(i-1, "");
}
}
}
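// skip the entries that were blanked out above, so no extra space is printed
list.stream().filter(s -> !s.isEmpty()).distinct().forEach(s -> System.out.print(s+" "));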
Upvotes: 0
Reputation: 2805
If you don't mind losing all the capital letters in the printed output, you can do it this way:
list.stream()
.map(String::toLowerCase)
.distinct()
.forEach(s -> System.out.print(s + " "));
Output:
kobe is the best player in basketball game .
Upvotes: 1