john

Reputation: 1125

Arrays.sort misbehavior for a character array

String s5 = "peek";
int i[] = {12, 25, 7, 3, 45};
Arrays.sort(i);

for(int x : i) {
  System.out.print(x + ",");
}

Arrays.sort(s5.toCharArray());

System.out.println(s5); // expected here eekp, but showing peek

for(char c : s5.toCharArray()) {
  System.out.print(c + ",");  //expected here e,e,k,p , but showing p,e,e,k
}

Output:

3,7,12,25,45,peek
p,e,e,k,

For the line System.out.println(s5) I expected "eekp", but it shows "peek".

For the line System.out.print(c + ",") I expected "e,e,k,p", but it shows "p,e,e,k".

Arrays.sort seems to work for an int array but not for a character array, so it seems I am doing something wrong. Could you please tell me what?

Upvotes: 1

Views: 964

Answers (3)

vvs

Reputation: 1076

Change Arrays.sort(s5.toCharArray()); to char[] s6 = s5.toCharArray(); Arrays.sort(s6);

Then print the values of s6.

toCharArray() gives you a new array copied from the source string; it does not modify the source. When you print s5, you are printing the original string, not the array that contains the sorted chars.
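
A minimal, self-contained sketch of that fix, in case it helps. The class name SortCharsDemo and the helper sortChars are made up for illustration and are not part of the question's code.

```java
import java.util.Arrays;

class SortCharsDemo {
    // Hypothetical helper: returns a new String holding the chars of s in ascending order.
    static String sortChars(String s) {
        char[] chars = s.toCharArray(); // keep a reference to the copy
        Arrays.sort(chars);             // sorts the copy in place
        return new String(chars);       // build a new String from the sorted copy
    }

    public static void main(String[] args) {
        String s5 = "peek";
        String sorted = sortChars(s5);
        System.out.println(sorted); // eekp
        System.out.println(s5);     // peek (the original String is unchanged)
    }
}
```

The original string stays as it was; the sorted text only exists in the array, or in a new String built from it.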

Upvotes: 3

Basil Bourque

Reputation: 338326

The Answer by Horse is correct. Your code discards a newly created array immediately after sorting it. You passed a new anonymous array to Arrays.sort, which sorted that array, but because you never stored the array in a variable, nothing refers to it afterward and the sorted result is lost.

In addition, I can show code using Unicode code points rather than char.

Unicode code point numbers

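The char type in Java is essentially legacy. As a 16-bit value, a single char is unable to represent even half of the characters defined in Unicode; characters outside the Basic Multilingual Plane, such as most emoji, take two char values (a surrogate pair). Instead, you should be working with the code point numbers, one number assigned to each character in the Unicode standard.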

You can get a stream (IntStream) of those integer code point numbers by calling String#codePoints.
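
As a small illustration of the difference between counting char values and counting code points, here is a sketch. The class name CodePointCountDemo and the helper fromCodePoint are hypothetical names chosen for this example.

```java
class CodePointCountDemo {
    // Hypothetical helper: builds a String containing the single character for a code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String wave = fromCodePoint(128075); // U+1F44B, the waving hand emoji
        System.out.println(wave.length());                         // 2 (two char values, a surrogate pair)
        System.out.println(wave.codePointCount(0, wave.length())); // 1 (one Unicode character)
        System.out.println(wave.codePoints().count());             // 1
    }
}
```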

Here is some example code. As input we use a string that contains a couple of emoji characters. Using the char type, we incorrectly see four ? where we expect two characters. Then we use code points, and we correctly see a pair of numbers (128075, 128506) for the pair of emoji characters. Lastly, we use Java streams to sort the code point numbers and collect the sorted numbers to build a new String object.

String input = "👋🗺 Hello World.";
System.out.println( "input = " + input );

// Wrong way, using obsolete `char` type.
char[] chars = input.toCharArray();
System.out.println( "chars before sort: " + Arrays.toString( chars ) );  // Notice that we get *four* `?` where we expect *two* letters (each of two emoji).
Arrays.sort( chars );
System.out.println( "chars after sort: " + Arrays.toString( chars ) );

// Examining code points.
System.out.println( "Print the code point number for each letter in the string. Notice we get *two* larger numbers, one for each emoji." );
input.codePoints().forEach( System.out :: println );

// Sorting by code point numbers.
String sorted =
        input
                .codePoints()
                .sorted()
                .collect( StringBuilder :: new , StringBuilder :: appendCodePoint , StringBuilder :: append )
                .toString();
System.out.println( "sorted = " + sorted );

When run.

input = 👋🗺 Hello World.
chars before sort: [?, ?, ?, ?,  , H, e, l, l, o,  , W, o, r, l, d, .]
chars after sort: [ ,  , ., H, W, d, e, l, l, l, o, o, r, ?, ?, ?, ?]
Print the code point number for each letter in the string. Notice we get *two* larger numbers, one for each emoji.
128075
128506
32
72
101
108
108
111
32
87
111
114
108
100
46
sorted =   .HWdellloor👋🗺

The first two characters in sorted are actually SPACE characters, originally found between the words in input.

Instead of the streams, we could use arrays.

int[] codePoints = input.codePoints().toArray();
System.out.println( "codePoints before sorting = " + Arrays.toString( codePoints ) );
Arrays.sort( codePoints );
System.out.println( "codePoints after sorting = " + Arrays.toString( codePoints ) );

When run. Notice that the number 32 appears twice, once for each of the two SPACE characters.

codePoints before sorting = [128075, 128506, 32, 72, 101, 108, 108, 111, 32, 87, 111, 114, 108, 100, 46]
codePoints after sorting = [32, 32, 46, 72, 87, 100, 101, 108, 108, 108, 111, 111, 114, 128075, 128506]

For more info, read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Upvotes: 2

Thiyanesh

Reputation: 2360

Strings are immutable (mostly)

Arrays.sort(s5.toCharArray());

The above sorts a copy of the character array and does not modify the String's own internal array.

The internal code of toCharArray in JDK 1.8:

  char result[] = new char[value.length];
  System.arraycopy(value, 0, result, 0, value.length);
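
To see the effect of that copy, here is a small sketch. The class name ToCharArrayCopyDemo and the helper returnsFreshArray are hypothetical names used only for this illustration.

```java
class ToCharArrayCopyDemo {
    // Hypothetical helper: true if two toCharArray() calls return distinct array objects.
    static boolean returnsFreshArray(String s) {
        return s.toCharArray() != s.toCharArray();
    }

    public static void main(String[] args) {
        String s5 = "peek";
        char[] copy = s5.toCharArray();
        copy[0] = 'X';                             // changes only the copy
        System.out.println(returnsFreshArray(s5)); // true
        System.out.println(s5);                    // peek
        System.out.println(new String(copy));      // Xeek
    }
}
```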

Store the copy in a variable (a reference to the new array object), then sort it

char[] copy = s5.toCharArray();
Arrays.sort(copy);

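Do not try unless for fun (this relies on JDK 8 internals, where String keeps its characters in a private char[] field named value; from JDK 9 on that field is a byte[], so the cast below fails)

  // Requires: import java.lang.reflect.Field; import java.util.Arrays;
  // getDeclaredField and Field.get throw checked exceptions, so the enclosing method must handle or declare them.
  Field field = String.class.getDeclaredField("value");
  field.setAccessible(true);  // bypass the private modifier
  String s5 = "peek";
  Arrays.sort((char[]) field.get(s5));  // sorts the String's own backing array in place
  System.out.println(s5);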

Upvotes: 4
