name_masked

Reputation: 9794

Why does char[] perform better than String? (Java)

In the last section of File IO Tuning, titled "Further Tuning", the author suggests using char[] to avoid generating a String object for each of the n lines in a file. I need to understand how

char[] arr = new char[]{'a','u','t','h','o','r'};

differs from

String s = "author";

in terms of memory consumption or any other performance factor. Isn't a String object internally stored as a character array? I feel silly since I never thought of this before. :-)

Upvotes: 7

Views: 7037

Answers (5)

Parth Pithadia

Reputation: 306

Here are a few reasons why a character array can be a better choice than a String in Java.

Take storing a password, for example.

1) Strings are immutable in Java, so a password stored as a String stays in memory as plain text until the garbage collector reclaims it. Because Strings can also be kept in the String pool for reuse, there is a fair chance that it will remain in memory for a long time, which poses a security threat.

Anyone who has access to a memory dump can find the password in clear text, which is another reason you should always use an encrypted password rather than plain text.

Because Strings are immutable, their contents cannot be changed; any change produces a new String. With a char[] you can still overwrite every element with a blank or zero. Storing the password in a character array therefore mitigates the risk of the password being stolen.

2) Java itself recommends using the getPassword() method of JPasswordField, which returns a char[], and has deprecated the getText() method, which returns the password in clear text, citing security reasons. It's good to follow the advice of the Java team and adhere to the standard rather than go against it.

3) With a String there is always a risk of printing the plain text to a log file or the console. If you print an array, its contents are not printed; you get the array's type and hash code instead. This is not a strong reason, but it still makes sense.

For this simple program

String strPassword="Unknown";
char[] charPassword= new char[]{'U','n','k','n','o','w','n'};
System.out.println("String password: " + strPassword);
System.out.println("Character password: " + charPassword);

Output:

String password: Unknown
Character password: [C@110b053

That's all on why a character array is a better choice than a String for storing passwords in Java. Using char[] is not enough on its own, though; you also need to erase its contents to be more secure.
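As a rough illustration of the erasing step, here is a minimal sketch. The class name PasswordWipeSketch and the check() method are made up for this example; check() is only a stand-in for whatever really consumes the password. Arrays.fill is the standard library call doing the overwrite.

```java
import java.util.Arrays;

final class PasswordWipeSketch {

    // Hypothetical stand-in for whatever consumes the password (hashing, authentication, ...).
    static boolean check(char[] password) {
        return password.length > 0;
    }

    // Overwrites every element so the clear text no longer sits in this array.
    static void wipe(char[] password) {
        Arrays.fill(password, '\0');
    }

    public static void main(String[] args) {
        // In a Swing application this array would come from JPasswordField.getPassword().
        char[] password = {'U', 'n', 'k', 'n', 'o', 'w', 'n'};
        try {
            System.out.println("accepted: " + check(password));
        } finally {
            wipe(password); // runs even if check() throws
        }
        System.out.println("all zero after wipe: "
                + Arrays.equals(password, new char[password.length]));
    }
}
```

Putting the wipe in a finally block keeps the clear text in memory only for as long as it is actually needed. This only clears this one array; any copies made elsewhere are not affected.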

Hope this will help.

Upvotes: 2

seh

Reputation: 15259

In Oracle's JDK (before Java 7 update 6, which removed the offset and count fields) a String has four instance-level fields:

  • A character array
  • An integral offset
  • An integral character count
  • An integral hash value

That means that each String introduces an extra object reference (the String itself), and three integers in addition to the character array itself. (The offset and character count are there to allow sharing of the character array among String instances produced through the String#substring() methods, a design choice that some other Java library implementers have eschewed.) Beyond the extra storage cost, there's also one more level of access indirection, not to mention the bounds checking with which the String guards its character array.

If you can get away with allocating and consuming just the basic character array, there's space to be saved there. It's certainly not idiomatic to do so in Java though; judicious comments would be warranted to justify the choice, preferably with mention of evidence from having profiled the difference.

Upvotes: 9

irreputable

Reputation: 45433

The author didn't get the reason right. The real overhead in in.readLine() is the copying of a char[] buffer when making a String out of it. That additional copying is the most damning cost when dealing with large data.
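To see that copy from the outside, here is a small sketch (the class and method names are invented for the example). It relies only on the documented behaviour of the String(char[], int, int) constructor, which copies the characters, so later changes to the array do not affect the String.

```java
final class StringCopySketch {

    // Builds a String from the first `length` chars of `buffer`; the constructor copies them.
    static String snapshot(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }

    public static void main(String[] args) {
        char[] buffer = {'a', 'u', 't', 'h', 'o', 'r'};
        String line = snapshot(buffer, buffer.length);
        buffer[0] = 'X'; // reuse the buffer, as a reader would for the next line
        System.out.println(line);               // still "author": the String has its own copy
        System.out.println(new String(buffer)); // "Xuthor"
    }
}
```

The copy is what makes the String safe to keep while the buffer is reused, and it is also the extra work done for every line read.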

It is possible to optimize this within JDK so that the additional copying is not needed.

Upvotes: 2

Peter Brooks

Reputation: 1200

My answer focuses on other Stack Overflow questions along similar lines; others have already posted more direct answers.

There have been other questions similar to this one, and the advice generally is to use StringBuilder.

If you're concerned with string concatenation, have a look at the performance comparison described here between three different implementations. Another Stack Overflow post gives some additional pointers and examples that you could try yourself to see the performance.
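For reference, the two approaches usually being compared look roughly like this. This is only a sketch with made-up names, not the code from the linked posts, and it makes no timing claims; you would need to measure it yourself.

```java
final class ConcatSketch {

    // Repeated += creates a new String, copying all previous characters, on every pass.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into one growable buffer and builds the String once at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "u", "t", "h", "o", "r"};
        // Both approaches produce the same text; they differ in how many objects they create.
        System.out.println(joinWithPlus(parts).equals(joinWithBuilder(parts)));
    }
}
```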

Upvotes: 1

Jon Skeet

Reputation: 1499800

In the example you've referred to, it's because there's only a single character array being allocated for the whole loop. It's repeatedly reading into that same array, and processing it in place.

Compare that with using readLine, which needs to create a new String instance on each iteration. Each String instance will contain a few int fields and a reference to a char[] containing the actual data, so it would need two new instances per iteration.

I'd usually expect the differences to be insignificant (with a decent GC throwing away unused "young" objects very efficiently) compared with the IO involved in reading the data - assuming it's from disk - but I believe that's the point the author was trying to make.
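To make the contrast concrete, here is a sketch of the two styles. It is not the code from the linked article: the class name, the line-counting task and the 8192-char buffer size are all assumptions chosen for illustration, and a StringReader stands in for a file so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class LineCountSketch {

    // One char[] is allocated up front and refilled on every read; no String per line.
    static int countWithCharArray(Reader in) throws IOException {
        char[] buffer = new char[8192];
        int lines = 0;
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buffer[i] == '\n') {
                    lines++;
                }
            }
        }
        return lines;
    }

    // readLine() returns a fresh String for each line.
    static int countWithReadLine(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int lines = 0;
        while (reader.readLine() != null) {
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String text = "first\nsecond\nthird\n";
        System.out.println(countWithCharArray(new StringReader(text)));
        System.out.println(countWithReadLine(new StringReader(text)));
    }
}
```

The two methods agree when every line ends with '\n'. If the last line has no trailing newline, the char[] version, which counts newline characters, reports one line fewer than the readLine version.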

Upvotes: 6
