Reputation: 3519
I am trying to get the same sorting order on Android, Linux and OSX. I am comparing the output of the sort command on Linux and OSX with some custom code on Android that operates on a similar file set.
On Linux / OSX I use this command:
find {folder_name} -type f | sort
and in Java / Android I am using the following, but the sorting orders do not align:
private Enumeration<InputStream> getSortedStreams(HashMap<String,InputStream> collection) {
    Vector<InputStream> fileStreams = new Vector<>();
    // Copy the keys so they can be sorted independently of the map.
    List<String> keys = new ArrayList<>(collection.keySet());
    Collator collator = Collator.getInstance(Locale.US);//<<???
    Collections.sort(keys,collator);
    for (String key: keys) {
        Log.d(TAG, "getSortedStreams: " + key);
        fileStreams.add(collection.get(key));
    }
    return fileStreams.elements();
}
Android output:
1000/abc_d.txt
1000/abc-d.txt
OSX output:
1000/abc-d.txt
1000/abc_d.txt
I assume the difference comes from the locales used to sort the file list. From what I gather, OSX and Linux are both POSIX compliant, although Linux is not certified. Android is not POSIX compliant either, but my guess is that this does not matter for sorting.
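To illustrate why the two orders can differ: in the C locale, sort compares raw byte values, and '-' (0x2D) is smaller than '_' (0x5F), whereas a locale-aware collator may weight punctuation differently. The sketch below is only an illustration, not part of the original code; the class and method names are made up. It sorts with String's natural ordering (a plain UTF-16 code unit comparison, no Collator), which for ASCII file names should give the same order as the OSX output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the original code.
final class CodeUnitOrderDemo {
    // Returns a sorted copy using String's natural ordering, which compares
    // UTF-16 code units and ignores locale collation rules entirely.
    static List<String> sortByCodeUnit(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // '-' is U+002D (45) and '_' is U+005F (95).
        System.out.println((int) '-' + " " + (int) '_');
        // Prints [1000/abc-d.txt, 1000/abc_d.txt]
        System.out.println(sortByCodeUnit(
                Arrays.asList("1000/abc_d.txt", "1000/abc-d.txt")));
    }
}
```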
Below are the details of my attempts to make sense of this and get consistent behavior across the platforms.
It seems that I can get Linux and Android to align, but OSX ignores the environment variables I set.
I need specific help setting the locales so that I get consistent results across the platforms.
I have not run tests on iOS yet; if required, I can submit them.
More details:
On Fedora Core.
Test case: create two files with the following names in a directory named sort_test
sort_test/abc_d.txt
sort_test/abc-d.txt
On Fedora Linux Core 17 - 3.9.10-100.fc17.x86_64
locale -a for en_US is:
locale -a | grep en_US
en_US
en_US.iso88591
en_US.iso885915
en_US.utf8
USING C
find sort_test/ -type f | env -i LC_COLLATE=C sort
sort_test/abc-d.txt
sort_test/abc_d.txt
USING en_US.utf8
find sort_test/ -type f | env -i LC_COLLATE=en_US.utf8 sort
sort_test/abc_d.txt
sort_test/abc-d.txt
On OSX something seems to be messed up; setting the locale has no effect:
locale -a gives a list of locales, and the en_US locales are:
en_US
en_US.ISO8859-1
en_US.ISO8859-15
en_US.US-ASCII
en_US.UTF-8
USING C
find sort_test -type f | env -i LC_COLLATE=C sort
sort_test/abc-d.txt
sort_test/abc_d.txt
USING en_US.UTF-8
find sort_test -type f | env -i LC_COLLATE=en_US.UTF-8 sort
sort_test/abc-d.txt
sort_test/abc_d.txt
On Android, when I set the collator to use a POSIX locale:
Locale locale = new Locale("en", "US", "POSIX"); // <<< the fix
Collator collator = Collator.getInstance(locale);
Collections.sort(keys,collator);
for (String key: keys) {
Log.d(TAG, "getSortedStreams: " + key);
fileStreams.add(collection.get(key));
}
/1000/abc-d.txt
/1000/abc_d.txt
On Android I set the locale to US:
//Locale locale = new Locale("en", "US", "POSIX");
Collator collator = Collator.getInstance(Locale.US);
Collections.sort(keys,collator);
for (String key: keys) {
Log.d(TAG, "getSortedStreams: " + key);
fileStreams.add(collection.get(key));
}
/1000/abc_d.txt
/1000/abc-d.txt
Linux locale variables (output of the locale command):
LANG=en_US.UTF-8
LC_CTYPE=UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
OSX locale variables (output of the locale command):
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
Upvotes: 2
Views: 387
Reputation: 3519
The solution that currently seems to work for me is to align all the operating systems with OSX, that is, to use the C / POSIX collation order everywhere.
Linux:
find sort_test -type f | env -i LC_COLLATE=C sort
OSX:
find sort_test -type f | env -i LC_COLLATE=C sort
Android:
Locale locale = new Locale("en", "US", "POSIX"); // <<< the fix
Collator collator = Collator.getInstance(locale);
Collections.sort(keys,collator);
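If the "POSIX" locale variant turns out not to be honored on some device or Java runtime (I have only checked the cases above), a possible alternative is to skip Collator and compare the UTF-8 bytes directly, which is my understanding of what sort does under LC_COLLATE=C. The following is a minimal sketch under that assumption; CLocaleSort, UTF8_BYTE_ORDER and sortLikeC are hypothetical names, and the lambda assumes Java 8 language support. Unlike plain String.compareTo, the byte comparison should also match C-locale order for characters outside the Basic Multilingual Plane.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the original code.
final class CLocaleSort {
    // Orders strings by their UTF-8 encoding, byte by byte, treating each
    // byte as unsigned (assumed to mirror strcmp-style C-locale comparison).
    static final Comparator<String> UTF8_BYTE_ORDER = (a, b) -> {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal: the shorter string sorts first.
        return x.length - y.length;
    };

    // Returns a sorted copy; the input list is left untouched.
    static List<String> sortLikeC(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted, UTF8_BYTE_ORDER);
        return sorted;
    }
}
```

In getSortedStreams this would replace the collator call, for example Collections.sort(keys, CLocaleSort.UTF8_BYTE_ORDER).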
Upvotes: 2