Reputation: 4433
Let's say I have a String containing non-ASCII characters in Java, and I want to use String.format()
to pad it so that the formatted String has a minimum width measured by the string's byte length rather than its character count.
String s = "æøå";
String.format(l, "%" + 10 + "s" , s);
This results in a string with 7 leading spaces.
What I want is only 4 leading spaces, since the original string is 6 bytes in size (as UTF-8).
This seems to be a common requirement, so I would like to ask whether there is an existing class that can achieve this, or whether I should implement the Formattable interface myself.
Upvotes: 3
Views: 1515
Reputation: 68962
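// Requires: import java.nio.charset.StandardCharsets;
String s = "æøå";
int size = s.getBytes(StandardCharsets.UTF_8).length; // 6 bytes in UTF-8
// %Ns pads by char count, so the width must be 10 minus bytes plus chars (here 7).
int width = Math.max(1, 10 - size + s.length());
String.format("%" + width + "s", s);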
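The byte-length arithmetic can also be wrapped in the Formattable interface that the question mentions, so that a plain %10s pads by bytes. This is only a minimal sketch: the class name Utf8Width is made up for illustration, UTF-8 is assumed as the encoding, and the precision argument is ignored.

```java
import java.nio.charset.StandardCharsets;
import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;

// Hypothetical wrapper: makes %Ns pad to N bytes of UTF-8 instead of N chars.
final class Utf8Width implements Formattable {
    private final String value;

    Utf8Width(String value) {
        this.value = value;
    }

    @Override
    public void formatTo(Formatter formatter, int flags, int width, int precision) {
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        // width is -1 when the specifier has no width, so the loop adds nothing.
        StringBuilder spaces = new StringBuilder();
        for (int i = byteLength; i < width; i++) {
            spaces.append(' ');
        }
        // Honor the '-' flag: pad on the right instead of the left.
        boolean leftJustify = (flags & FormattableFlags.LEFT_JUSTIFY) != 0;
        formatter.format("%s", leftJustify ? value + spaces : spaces + value);
    }
}
```

With this, String.format("%10s", new Utf8Width(s)) should give 4 leading spaces for the 6-byte string from the question.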
Upvotes: 2
Reputation: 1500495
A string doesn't have a number of bytes - it has a number of characters. The number of bytes it takes to represent a string depends on the encoding you use. I don't know of anything built-in to do what you want in terms of the padding (I don't think it is that common a requirement). You can ask a CharsetEncoder for the maximum and average number of bytes per character, but I don't see any way of getting the number of bytes for a particular string without basically doing the encoding:
Charset cs = Charset.forName("UTF-8");
ByteBuffer buffer = cs.encode("foobar");
int lengthInBytes = buffer.remaining();
If you're going to encode the string anyway, you might want to just perform the encoding, work out how much padding is required, then write the encoded padding out, then write the already-encoded text. It really depends on what you're doing with the data.
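As a rough sketch of that approach, assuming UTF-8 and space padding on the left (the class and method names BytePadding and padLeftToBytes are made up for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class BytePadding {
    // Hypothetical helper: encodes text as UTF-8 and left-pads the result
    // with spaces so that it is at least minBytes bytes long.
    static byte[] padLeftToBytes(String text, int minBytes) {
        // Encode once; the byte length is only known after encoding.
        ByteBuffer encoded = StandardCharsets.UTF_8.encode(text);
        int lengthInBytes = encoded.remaining();

        ByteArrayOutputStream out =
                new ByteArrayOutputStream(Math.max(minBytes, lengthInBytes));
        // Write the padding first; a space is a single byte in UTF-8.
        for (int i = lengthInBytes; i < minBytes; i++) {
            out.write(' ');
        }
        // Then write the already-encoded text without encoding it again.
        byte[] body = new byte[lengthInBytes];
        encoded.get(body);
        out.write(body, 0, body.length);
        return out.toByteArray();
    }
}
```

For the string in the question, padLeftToBytes(s, 10) should yield 10 bytes: 4 spaces followed by the 6 encoded bytes. Whether a byte array, a stream, or something else is the right target depends on where the data is going.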
Upvotes: 5