user301016

Reputation: 2237

How to find strings that differ only by case and keep just one of them in Java

I have a text file (T1.txt) that contains a few strings. Two of them are identical except for their case. I need to ignore one of those two and return the rest.

e.g. ABCD, XYZ, pqrs, aBCd.

I am using a Set to return the strings, but how can I ignore the duplicate and return only one of the two strings (either ABCD or aBCd)?

public static Set<String> findDuplicates(File inputFile)
{
    FileInputStream fis = null;
    BufferedInputStream bis = null;
    DataInputStream dis = null;
    Set<String> set = new HashSet<String>();
    ArrayList<String> inpArrayList = new ArrayList<String>();

    try {
        fis = new FileInputStream(inputFile);
        bis = new BufferedInputStream(fis);
        dis = new DataInputStream(bis);

        while (dis.available() != 0)
        {
            inpArrayList.add(dis.readLine());
        }

        for (int i = 0; i < inpArrayList.size(); i++)
        {
            if (!set.contains(inpArrayList.get(i)))
                set.add(inpArrayList.get(i));
        }
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    System.out.println(" set" + set);
    return set;
}

The returned set should contain XYZ, pqrs, and either aBCd or ABCD, but not both.

Thanks Ramm

Upvotes: 0

Views: 276

Answers (9)

perp

Reputation: 3963

You could use a TreeSet and the String.CASE_INSENSITIVE_ORDER comparator, which I find more elegant than the suggested HashMap solutions:

Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
set.add("abc");
set.add("AbC");
set.add("aBc");
set.add("DEF");
System.out.println(set); // => "[abc, DEF]"

Note that iterating over this set gives you the elements in case-insensitive lexicographical order. If you want to preserve the insertion order as well, I'd maintain a List on the side like this:

Set<String> set = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
List<String> inOrder = new ArrayList<String>();
// when adding stuff inside your loop:
if (set.add(someString)) { // returns true if it was added to the set
    inOrder.add(someString);
}
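
Applied to input that arrives line by line, as in the question, the same idea could look roughly like the sketch below. The class name UniqueLines and the method uniqueIgnoringCase are made up for illustration, and the method takes a Reader (an assumption, so it can be fed from a file or from a string) rather than the File used in the question.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, not part of the original question or answer.
public class UniqueLines {

    // Returns the lines of the source in their original order, keeping only
    // the first of any lines that are equal when case is ignored.
    public static List<String> uniqueIgnoringCase(Reader source) {
        Set<String> seen = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        List<String> inOrder = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        reader.lines().forEach(line -> {
            // add() returns false when an equal-ignoring-case line is already present
            if (seen.add(line)) {
                inOrder.add(line);
            }
        });
        return inOrder;
    }

    public static void main(String[] args) {
        Reader sample = new StringReader("ABCD\nXYZ\npqrs\naBCd\n");
        System.out.println(uniqueIgnoringCase(sample)); // prints [ABCD, XYZ, pqrs]
    }
}
```

For a real file you could pass new FileReader(inputFile) instead of the StringReader and close it when done.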

Upvotes: 2

Michael Rutherfurd

Reputation: 14045

If the case of the output is not important you could use a custom FilterInputStream to do the conversion.

    bis = new BufferedInputStream(fis);
    fltis = new LowerCaseInputStream(bis);
    dis = new DataInputStream(fltis);

An example of LowerCaseInputStream comes from here.
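
In case the linked example is unavailable, here is a minimal sketch of what such a filter might look like. It is my own guess rather than the linked code, it only handles ASCII letters (an assumption that holds for single-byte input like the sample in the question), and the helper lowerCaseAll exists purely to demonstrate it.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch of a lower-casing filter; assumes ASCII input only.
public class LowerCaseInputStream extends FilterInputStream {

    public LowerCaseInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        return c == -1 ? c : lower(c);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        // n is -1 at end of stream, in which case the loop body never runs
        for (int i = off; i < off + n; i++) {
            b[i] = (byte) lower(b[i]);
        }
        return n;
    }

    // Maps 'A'..'Z' to 'a'..'z' and leaves every other byte value untouched.
    private static int lower(int c) {
        return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
    }

    // Demonstration helper (hypothetical): pushes a string through the filter.
    public static String lowerCaseAll(String text) {
        InputStream in = new LowerCaseInputStream(
                new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseAll("ABCD, XYZ, pqrs, aBCd")); // prints abcd, xyz, pqrs, abcd
    }
}
```

Because everything is lower-cased before it reaches the set, the original spelling of each string is lost with this approach.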

Upvotes: 0

Shash316

Reputation: 2218

How about using a HashMap, with the key generated by your own function? That function would return the string in lowercase, so strings that differ only by case end up under the same key.

Shash

Upvotes: 0

Lior Ohana

Reputation: 3527

Just as said above, I did something similar earlier this week. You can do something like (just adjust it to your code):

HashMap<String, String> set = new HashMap<String, String>();

while (tokenizer.hasMoreTokens())
{
    String element = tokenizer.nextToken();
    String lowerCaseElement = element.toLowerCase();
    // check the lower-case key, since that is what the map is keyed by
    if (!set.containsKey(lowerCaseElement))
    {
       set.put(lowerCaseElement, element);
    }
}

At the end, the values of the map 'set' will contain what you need.

Upvotes: 0

ngesh

Reputation: 13501

inpArrayList.add(dis.readLine().toLowerCase());

Using this line in place of the existing inpArrayList.add(...) call should work...

Upvotes: 1

Derek Li

Reputation: 3111

Create a hash map that uses currentString.toLowerCase() as the key and the original string as the value. That way, two strings that differ only in case will have the same key. Because you store the original string as the value, printing the values gives you one of the original spellings rather than an all-lowercase string.
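
A rough sketch of that idea follows. The class and method names are invented for illustration, and I have swapped in a LinkedHashMap for a plain HashMap so that the output keeps the input order, which is a choice of mine and not something the approach requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical example class for the lower-case-key approach.
public class CaseInsensitiveDedup {

    // Keeps the first spelling seen for each case-insensitive key.
    public static List<String> dedupe(List<String> input) {
        Map<String, String> byLowerCase = new LinkedHashMap<>();
        for (String s : input) {
            // Locale.ROOT avoids locale-specific surprises when lower-casing
            byLowerCase.putIfAbsent(s.toLowerCase(Locale.ROOT), s);
        }
        return new ArrayList<>(byLowerCase.values());
    }

    public static void main(String[] args) {
        System.out.println(dedupe(Arrays.asList("ABCD", "XYZ", "pqrs", "aBCd")));
        // prints [ABCD, XYZ, pqrs]
    }
}
```

Replacing putIfAbsent with put would keep the last spelling seen instead of the first.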

Upvotes: 2

Tristan

Reputation: 9111

Just store your strings in uppercase in your set before storing them in your ArrayList result.

If you can't add a string to the set (because it already exists), don't store it in the ArrayList.

Upvotes: 0

Kilian Foth

Reputation: 14336

Convert every string to lowercase before inserting it into the set, and then the set will take care of the uniqueness for you.

(If you also need to preserve the case of the input (returning abcd for AbCd is not acceptable), then you need a second set that stores lower-case variants and use checks on the second set to decide whether or not to add strings to the result set. Same principle, but one more step to program.)
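
The two-set variant described in the parenthesis could be sketched like this. The names are illustrative only, and the use of LinkedHashSet for the result (to keep input order) is my own assumption.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical example class for the two-set approach.
public class TwoSetDedup {

    public static Set<String> uniqueIgnoringCase(List<String> input) {
        Set<String> seenLowerCase = new HashSet<>();  // lower-case variants, used only for the check
        Set<String> result = new LinkedHashSet<>();   // original spellings, in input order
        for (String s : input) {
            // add() returns true only the first time this lower-case form is seen
            if (seenLowerCase.add(s.toLowerCase(Locale.ROOT))) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(uniqueIgnoringCase(Arrays.asList("ABCD", "XYZ", "pqrs", "aBCd")));
        // prints [ABCD, XYZ, pqrs]
    }
}
```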

Upvotes: 0

Tom

Reputation: 44801

You can use the old trick of calling .toLowerCase() on each string before putting it in the set.

And if you want to keep the original case, change to a HashMap from the lowercase string to the original string, then iterate over the values.

Upvotes: 0
