Reputation: 33
I got a task to sort a text file with some requirements:
What I have done so far: read the file and copy everything into a List<List<String>>, where every element of the outer list is one line of the file, split into a List<String>. Here is the code:
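import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReadDataFile {
    public static List<List<String>> readData(String fileName) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader(fileName + ".txt"));
        List<List<String>> data = new ArrayList<List<String>>();
        String line;
        while (true) {
            line = br.readLine();
            if (line == null)
                break;
            // each line becomes a list of its tab-separated fields
            List<String> lines = Arrays.asList(line.split("\t"));
            data.add(lines);
            System.out.println(lines);
        }
        br.close();
        return data;
    }
}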
and this method writes the data to another file:
public void writeToFile(String fileName) throws IOException {
    FileWriter writer = new FileWriter(fileName);
    List<List<String>> data = ReadDataFile.readData("input");
    Collections.sort(data, new Comparator<List<String>>() {
        @Override
        public int compare(List<String> o1, List<String> o2) {
            // compares only the first column, as plain strings
            return o1.get(0).compareTo(o2.get(0));
        }
    });
    for (List<String> lines : data) {
        for (int i = 0; i < lines.size(); i++) {
            writer.write(lines.get(i));
            if (i < lines.size() - 1) {
                writer.write("\t");
            }
        }
        writer.write("\n");
    }
    writer.close();
}
The problem is that this comparator:
public int compare(List<String> o1, List<String> o2) {
    // compares only the first column, as plain strings
    return o1.get(0).compareTo(o2.get(0));
}
doesn't sort the way I need.
Here is an example of the input file:
-2.2 2 3 4 329 2
2.2 12345q 69 -afg
2.2 12345q 69 -asdf
-22 1234234 asdfasf asdgas
-22 11 abc
-22 -3 4
-1.1
qqqq 1.1
and the expected output is:
-22 -3 4
-22 11 abc
-22 1234234 asdfasf asdgas
-2.2 2 3 4 329 2
-1.1
2.2 12345q 69 -afg
2.2 12345q 69 -asdf
qqqq 1.1
but what I get is:
-1.1
-2.2 2 3 4 329 2
-22 -3 4
-22 11 abc
-22 1234234 asdfasf asdgas
2.2 12345q 69 -afg
2.2 12345q 69 -asdf
qqqq 1.1
The question is: how do I write a proper sort? Thanks for the answers.
Upvotes: 3
Views: 641
Reputation: 159086
It seems you want string values that are valid numbers to be sorted using number comparison. Since your example contains non-integer values, you can do the number comparisons using either double or BigDecimal. The comparator below uses BigDecimal, so numbers of any size can be compared without loss of precision, but it doesn't support the special values "Infinity", "-Infinity", and "NaN", or the HexFloatingPointLiteral format, all of which Double.parseDouble() supports.
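To make the difference between the two parsers concrete, here is a minimal sketch. The class name NumericParseDemo and the helper parseOrNull are made up for illustration and are not part of the comparator in this answer.

```java
import java.math.BigDecimal;

// Hypothetical demo class; the name and the helper are illustrative only.
final class NumericParseDemo {

    /** Returns the parsed value, or null when the token is not a plain decimal number. */
    static BigDecimal parseOrNull(String token) {
        try {
            return new BigDecimal(token);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // String comparison goes character by character, so "-1.1" sorts before "-2.2".
        System.out.println("-1.1".compareTo("-2.2") < 0);                            // true

        // Numeric comparison gives the order the question expects: -22 before -2.2.
        System.out.println(parseOrNull("-22").compareTo(parseOrNull("-2.2")) < 0);   // true

        // Tokens that are not numbers come back as null.
        System.out.println(parseOrNull("12345q"));                                   // null

        // BigDecimal rejects inputs that Double.parseDouble accepts.
        System.out.println(parseOrNull("NaN"));                                      // null
        System.out.println(Double.parseDouble("NaN"));                               // NaN
        System.out.println(Double.parseDouble("0x1p3"));                             // 8.0
    }
}
```

If your data can contain "NaN" or "Infinity" and you want those treated as numbers, double is probably the better fit. Otherwise BigDecimal avoids rounding surprises with very long digit strings.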
When a number is compared to a string, the number should sort before the string.
For comparing string vs. string, you can sort lexicographically, case-insensitively, or using a Collator for locale-sensitive comparisons. The comparator below uses a Collator for the default locale.
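As a rough illustration of how those three string orderings differ, consider the sketch below. The class name StringOrderDemo and the sample words are invented for the demo, and Locale.ENGLISH is pinned only so the result does not depend on the machine's default locale.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; the name and the sample words are illustrative only.
final class StringOrderDemo {

    /** Returns a sorted copy of the words, leaving the input untouched. */
    static List<String> sorted(List<String> words, Comparator<? super String> order) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "Cherry", "apple");

        // Lexicographic by UTF-16 code unit: uppercase ASCII letters sort before lowercase.
        System.out.println(sorted(words, Comparator.<String>naturalOrder()));    // [Cherry, apple, banana]

        // Case-insensitive, still locale-independent.
        System.out.println(sorted(words, String.CASE_INSENSITIVE_ORDER));        // [apple, banana, Cherry]

        // Locale-sensitive; the locale is fixed here only to keep the demo reproducible.
        System.out.println(sorted(words, Collator.getInstance(Locale.ENGLISH))); // [apple, banana, Cherry]
    }
}
```

For plain ASCII words the case-insensitive order and the Collator usually agree. The Collator starts to matter once accented or other non-ASCII letters show up.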
The comparison looks at the first value of each list; if those are equal it compares the second values, and so forth. If one list is shorter and the lists are equal up to that point, the shorter list sorts first.
import java.math.BigDecimal;
import java.text.Collator;
import java.util.Comparator;
import java.util.List;

public final class NumberStringComparator implements Comparator<List<String>> {
    private final Collator collator = Collator.getInstance();

    @Override
    public int compare(List<String> r1, List<String> r2) {
        for (int i = 0; ; i++) {
            // a list that runs out first sorts first; both running out means equal
            if (i == r1.size())
                return (i == r2.size() ? 0 : -1);
            if (i == r2.size())
                return 1;
            String v1 = r1.get(i), v2 = r2.get(i);
            // n1/n2 stay null when the value is not a valid number
            BigDecimal n1 = null, n2 = null;
            try { n1 = new BigDecimal(v1); } catch (NumberFormatException ignored) { /* not a number */ }
            try { n2 = new BigDecimal(v2); } catch (NumberFormatException ignored) { /* not a number */ }
            // number vs number: numeric; number vs string: number first; string vs string: collator
            int cmp = (n1 == null ? (n2 == null ? this.collator.compare(v1, v2) : 1)
                                  : (n2 == null ? -1 : n1.compareTo(n2)));
            if (cmp != 0)
                return cmp;
        }
    }
}
Test
String input = "-2.2\t2\t3\t4\t329\t2\n" +
               "2.2\t12345q\t69\t-afg\n" +
               "2.2\t12345q\t69\t-asdf\n" +
               "-22\t1234234\tasdfasf\tasdgas\n" +
               "-22\t11\tabc\n" +
               "-22\t-3\t4\n" +
               "-1.1\n" +
               "qqqq\t1.1";
List<List<String>> data = new ArrayList<>();
try (BufferedReader in = new BufferedReader(new StringReader(input))) {
    for (String line; (line = in.readLine()) != null; )
        data.add(Arrays.asList(line.split("\t")));
}
data.sort(new NumberStringComparator());
data.forEach(System.out::println);
Output
[-22, -3, 4]
[-22, 11, abc]
[-22, 1234234, asdfasf, asdgas]
[-2.2, 2, 3, 4, 329, 2]
[-1.1]
[2.2, 12345q, 69, -afg]
[2.2, 12345q, 69, -asdf]
[qqqq, 1.1]
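To apply this to the file-to-file workflow from the question, one possible sketch is shown below. The class SortTabFile, its method names, and the paths input.txt and output.txt are assumptions made for illustration. The row comparator is passed in as a parameter, so you would hand it new NumberStringComparator(); the main method uses a first-column string comparison only as a stand-in so the sketch compiles on its own.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; names and file paths are illustrative only.
final class SortTabFile {

    /** Splits each line on tabs, sorts the rows with the given order, and joins them back. */
    static List<String> sortLines(List<String> lines, Comparator<List<String>> rowOrder) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(Arrays.asList(line.split("\t")));
        }
        rows.sort(rowOrder);
        List<String> result = new ArrayList<>();
        for (List<String> row : rows) {
            result.add(String.join("\t", row));
        }
        return result;
    }

    /** Reads the input file, sorts its rows, and writes them to the output file. */
    static void sortFile(Path in, Path out, Comparator<List<String>> rowOrder) throws IOException {
        List<String> sorted = sortLines(Files.readAllLines(in, StandardCharsets.UTF_8), rowOrder);
        // Files.write ends each line with the platform line separator.
        Files.write(out, sorted, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in order (first column as plain strings); swap in the comparator you actually want.
        Comparator<List<String>> firstColumn = Comparator.comparing(row -> row.get(0));
        sortFile(Paths.get("input.txt"), Paths.get("output.txt"), firstColumn);
    }
}
```

Note that String.split drops trailing empty fields, so a line ending in tabs will not round-trip exactly with this sketch.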
Upvotes: 2