lhahn

Reputation: 1241

Collections.sort() isn't sorting in the right order

I have this code in Java:

List<String> unSorted = new ArrayList<String>();
List<String> beforeHash = new ArrayList<String>();
String[] unSortedAux, beforeHashAux; 
String line = null;

BufferedReader reader = new BufferedReader(new FileReader("C:\\CPD\\temp0.txt"));
while ((line = reader.readLine()) != null) {
    unSorted.add(line);
    beforeHash.add(line.split("#")[0]);
}
reader.close();

Collections.sort(beforeHash);
beforeHashAux = beforeHash.toArray(new String[beforeHash.size()]);
unSortedAux = unSorted.toArray(new String[unSorted.size()]);

System.out.println(Arrays.toString(beforeHashAux));
System.out.println(Arrays.toString(unSortedAux));

It reads a file named temp0.txt, which contains:

Carlos Magno#261
Mateus Carl#12
Analise Soares#151
Giancarlo Tobias#150

My goal is to sort the names, ignoring everything from the "#" onward. I use beforeHash.add(line.split("#")[0]); to extract the name from each line. The file is read correctly, but the names are sorted in the wrong order. The corresponding outputs are:

[Analise Soares, Giancarlo Tobias, Mateus Carl, Carlos Magno]
[Carlos Magno#261, Mateus Carl#12, Analise Soares#151, Giancarlo Tobias#150]

The first result is the "sorted" one. Note that "Carlos Magno" comes after "Mateus Carl". I cannot find the problem in my code.

Upvotes: 0

Views: 294

Answers (1)

Jon Skeet

Reputation: 1500055

The problem is that "Carlos Magno" starts with a Unicode byte-order mark.

If you copy and paste your sample text ([Analise ... Carlos Magno]) into the Unicode Explorer you'll see that just before the "C" of Carlos Magno, you've got U+FEFF.
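To see why the mark changes the order, here is a minimal sketch that reproduces the result without any file. The class name BomSortDemo and the helper sortedCopy are made up for illustration; the names are the ones from the question, with U+FEFF prepended to the first one as it would be on the first line of the file. String's natural ordering compares char values, and U+FEFF is far greater than any ASCII letter, so that entry sorts last.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BomSortDemo {
    // Hypothetical helper: sorts a copy of the names using String's natural ordering.
    static List<String> sortedCopy(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // The first entry carries the BOM, as the first line of the file would.
        List<String> names = Arrays.asList(
                "\uFEFFCarlos Magno", "Mateus Carl", "Analise Soares", "Giancarlo Tobias");

        List<String> sorted = sortedCopy(names);
        System.out.println(sorted);

        // The last element starts with U+FEFF rather than 'C'.
        System.out.printf("U+%04X%n", (int) sorted.get(3).charAt(0)); // prints U+FEFF
    }
}
```

The mark is invisible when printed in most consoles, which is why the output looks like a plain "Carlos Magno" in the wrong position.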

Basically, you'll need to strip that when reading the file. The easiest way to do this is just to use:

line = line.replace("\ufeff", "");

... or check first:

if (line.startsWith("\ufeff")) {
    line = line.substring(1);
}

Note that you should really specify the encoding you want to use when opening the file, because FileReader as used in the question falls back to the default charset - use a FileInputStream wrapped in an InputStreamReader instead.
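A minimal sketch of that approach, assuming the file is UTF-8 (the question does not say which encoding it uses). The class name BomAwareReader and the method readNames are made up for illustration, and main writes a temporary file with the question's sample data so the example is self-contained. Java's UTF-8 decoder does not remove the BOM on its own, so the check on the first line is still needed.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BomAwareReader {
    // Hypothetical helper: returns the part of each line before '#',
    // decoding the file as UTF-8 (an assumption) and dropping a leading BOM.
    static List<String> readNames(String path) throws IOException {
        List<String> names = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(path), StandardCharsets.UTF_8))) {
            String line;
            boolean firstLine = true;
            while ((line = reader.readLine()) != null) {
                // A BOM can only appear at the very start of the file.
                if (firstLine && line.startsWith("\uFEFF")) {
                    line = line.substring(1);
                }
                firstLine = false;
                names.add(line.split("#")[0]);
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Write the sample data, BOM included, to a temporary file.
        Path tmp = Files.createTempFile("names", ".txt");
        String content = "\uFEFFCarlos Magno#261\nMateus Carl#12\n"
                + "Analise Soares#151\nGiancarlo Tobias#150\n";
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));

        List<String> names = readNames(tmp.toString());
        Collections.sort(names);
        // prints [Analise Soares, Carlos Magno, Giancarlo Tobias, Mateus Carl]
        System.out.println(names);

        Files.delete(tmp);
    }
}
```

The try-with-resources block also closes the reader if readLine throws, which the explicit reader.close() in the question would not do.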

Upvotes: 8
