Reputation: 1277
Can anyone suggest how to check whether a String contains full-width characters in Java? Full-width characters are special characters.
Full width characters in String:
ａｂｃ＠ｇｍａｉｌ.ｃｏｍ
Half width characters in String:
abc@gmail.com
Upvotes: 7
Views: 10966
Reputation: 31
The ASCII characters we care about run from \u0021 to \u007E (the space at \u0020 and DEL at \u007F have no wide counterpart in this range). The wide Unicode characters are conveniently in the same order, running from \uFF01 to \uFF5E. The two ranges are offset from each other by 0xFF00 and -0x20, two simple hexadecimal numbers, so we can use either plain addition and subtraction or bitwise operators. For this example I'll use bitwise operators to help visualize the process.
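As a quick check of that arithmetic, here is a minimal sketch (the class and method names are made up for illustration) suggesting that the bitwise form and a single addition of 0xFEE0, which is 0xFF00 - 0x20, agree for every character from \u0021 to \u007E:

```java
final class WidthOffsetDemo {
    // Hypothetical helper: the bitwise form described above.
    static char widenBitwise(char c) {
        return (char) ((c - 0x20) | 0xFF00);
    }

    // Hypothetical helper: the same mapping written as one addition.
    static char widenAdd(char c) {
        return (char) (c + 0xFEE0); // 0xFEE0 == 0xFF00 - 0x20
    }

    public static void main(String[] args) {
        // Compare both forms over the convertible ASCII range.
        for (char c = 0x21; c <= 0x7E; c++) {
            if (widenBitwise(c) != widenAdd(c)) {
                throw new AssertionError("mismatch at " + (int) c);
            }
        }
        System.out.println((int) widenAdd('a')); // 65345, i.e. U+FF41
    }
}
```

The two forms only coincide because c - 0x20 always fits in the low byte for this range, so the OR never collides with a set bit.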
Armed with this knowledge we can now convert them back and forth quite easily by using these two methods.
This converts each relevant ASCII character to its wide Unicode version.
public static char toWideCharacter(char c) {
    // Check that the character is one of the convertible ones.
    if (c > 0x20 && c < 0x7F) {
        // Remove the 0x20 offset and set the high byte to 0xFF.
        return (char) ((c - 0x20) | 0xFF00);
    }
    return c;
}
And this converts wide Unicode characters back to their regular ASCII versions.
public static char fromWideCharacter(char c) {
    // This condition alone is enough to check whether c is a wide character.
    if (c > 0xFF00 && c < 0xFF5F) {
        // Add the 0x20 offset and mask off everything but the low byte.
        return (char) ((c + 0x20) & 0xFF);
    }
    return c;
}
Extra credit:
Here is an example of generating a String of characters and converting them back and forth to check our fragile sanity.
StringBuilder regular = new StringBuilder();
StringBuilder widened = new StringBuilder();
StringBuilder converted = new StringBuilder();
for (int i = 0; i < 0x60; i++) {
    char c = (char) (0x20 + i);
    regular.append(c);
    widened.append(toWideCharacter(c));
    // Converting to wide and back will ensure parity.
    converted.append(fromWideCharacter(toWideCharacter(c)));
}
System.out.println("Regular: " + regular);
System.out.println("Widened: " + widened);
System.out.println("Converted: " + converted);
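If you would rather not hand-roll the conversion, the JDK's java.text.Normalizer can fold full-width forms to their plain counterparts via NFKC normalization. This is a swapped-in technique, not the method above, and the class name here is only illustrative. Note that NFKC also rewrites other compatibility characters (ligatures, half-width katakana and so on), so it does more than just the width conversion:

```java
import java.text.Normalizer;

final class NfkcWidthDemo {
    // Folds full-width ASCII variants (and other compatibility characters)
    // to their canonical forms using NFKC normalization.
    static String toHalfWidth(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // Full-width "abc@gmail" + half-width '.' + full-width "com", as escapes.
        String wide = "\uFF41\uFF42\uFF43\uFF20\uFF47\uFF4D\uFF41\uFF49\uFF4C.\uFF43\uFF4F\uFF4D";
        System.out.println(toHalfWidth(wide));
    }
}
```

A string contains something NFKC would change exactly when `!s.equals(toHalfWidth(s))`, which can serve as a rough detector, with the caveat above about non-width changes.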
Upvotes: 0
Reputation: 11
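In UTF-8, half-width (ASCII) characters take 1 byte and full-width characters take more than 1 byte (2, 3, 4, ... bytes), so compare the String's length with its byte length:
import java.nio.charset.StandardCharsets;

String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";
// Name the charset explicitly; the no-argument getBytes() depends on the platform default.
if (str.length() != str.getBytes(StandardCharsets.UTF_8).length) {
    // contains full-width (more precisely, non-ASCII) characters
} else {
    // all half-width (ASCII)
}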
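One thing to keep in mind with the byte-length comparison is that it reacts to any non-ASCII character, not only full-width ones. A small sketch to illustrate, assuming UTF-8 and using a made-up class name:

```java
import java.nio.charset.StandardCharsets;

final class ByteLengthDemo {
    // For well-formed strings: true when at least one char is outside ASCII,
    // because such chars need more than one byte in UTF-8.
    static boolean hasNonAscii(String s) {
        return s.length() != s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(hasNonAscii("abc"));                // plain ASCII
        System.out.println(hasNonAscii("\uFF41\uFF42\uFF43")); // full-width abc
        System.out.println(hasNonAscii("caf\u00E9"));          // accented e, not full width
    }
}
```

The last call also reports true, so this check is only suitable when every non-ASCII character should be treated the same way.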
Upvotes: 1
Reputation: 54801
I'm not sure if you are looking for any or all, so here are functions for both:
public static boolean isAllFullWidth(String str) {
    for (char c : str.toCharArray())
        if ((c & 0xff00) != 0xff00)
            return false;
    return true;
}
public static boolean areAnyFullWidth(String str) {
    for (char c : str.toCharArray())
        if ((c & 0xff00) == 0xff00)
            return true;
    return false;
}
As for your half-width '.' and possibly '_', maybe strip them out first with a replace:
String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";
if (isAllFullWidth(str.replaceAll("[._]", ""))) {
    // then, apart from . and _, they are all full width
}
Alternatively if you want to use a regex to test, then this is the actual character range for full width:
[\uFF01-\uFF5E]
So the method then looks like:
public static boolean isAllFullWidth(String str) {
    return str.matches("[\\uff01-\\uff5E]*");
}
You can add your other characters to the character class, so there is no need to strip them:
public static boolean isValidFullWidthEmail(String str) {
    return str.matches("[\\uff01-\\uff5E._]*");
}
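Worth noting: the mask test (c & 0xff00) == 0xff00 accepts everything from U+FF00 to U+FFFF, which also covers the half-width katakana in the same Unicode block. A rough sketch of the difference, with illustrative names, assuming you only care about the full-width ASCII variants U+FF01 to U+FF5E:

```java
final class FullWidthRangeDemo {
    // The mask test from above: true for any char in U+FF00..U+FFFF.
    static boolean maskTest(char c) {
        return (c & 0xFF00) == 0xFF00;
    }

    // Narrower test: only the full-width ASCII variants U+FF01..U+FF5E.
    static boolean isFullWidthAscii(char c) {
        return c >= 0xFF01 && c <= 0xFF5E;
    }

    public static void main(String[] args) {
        char halfKana = '\uFF71'; // HALFWIDTH KATAKANA LETTER A
        System.out.println(maskTest(halfKana));         // true
        System.out.println(isFullWidthAscii(halfKana)); // false
    }
}
```

The regex range [\uFF01-\uFF5E] behaves like the narrower test, so the two versions of isAllFullWidth above are not strictly equivalent.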
Upvotes: 5
Reputation: 3
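Use a regular expression here. \W matches non-word characters, that is, anything outside [a-zA-Z_0-9]. Full-width letters fall into that class, but so do half-width punctuation characters such as @ and ., so this only works if the input contains no other punctuation.
str contains a non-word character if the following statement returns true (matches() tests the whole string, hence the surrounding .*):
boolean flag = str.matches(".*\\W.*");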
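To see how \W compares with an explicit full-width range, here is a small sketch using java.util.regex.Pattern and find(). The class and method names are invented for the example, and the range U+FF01 to U+FF5E is an assumption about which characters count as full width:

```java
import java.util.regex.Pattern;

final class RegexWidthDemo {
    // Explicit range for the full-width ASCII variants.
    private static final Pattern FULL_WIDTH = Pattern.compile("[\\uFF01-\\uFF5E]");
    // Default \W: anything outside [a-zA-Z_0-9].
    private static final Pattern NON_WORD = Pattern.compile("\\W");

    static boolean containsFullWidth(String s) {
        return FULL_WIDTH.matcher(s).find();
    }

    static boolean containsNonWord(String s) {
        return NON_WORD.matcher(s).find();
    }

    public static void main(String[] args) {
        String half = "abc@gmail.com";
        System.out.println(containsNonWord(half));   // true, because of '@' and '.'
        System.out.println(containsFullWidth(half)); // false
    }
}
```

find() looks for a match anywhere in the input, which avoids having to wrap the pattern in .* as matches() requires.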
Upvotes: 0
Reputation: 4951
You can try something like this:
public static final String FULL_WIDTH_CHARS = "ＡａＢｂＣｃＤｄＥｅＦｆＧｇＨｈＩｉＪｊ"
        + "ＫｋＬｌＭｍＮｎＯｏＰｐＱｑＲｒＳｓＴｔＵｕＶｖＷｗＸｘＹｙＺｚ";
public static boolean containsFullWidthChars(String str) {
    for (int i = 0; i < FULL_WIDTH_CHARS.length(); i++) {
        if (str.contains(String.valueOf(FULL_WIDTH_CHARS.charAt(i)))) {
            return true;
        }
    }
    return false;
}
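Since the full-width Latin letters sit in two contiguous ranges (U+FF21 to U+FF3A for upper case, U+FF41 to U+FF5A for lower case), the lookup string could probably be swapped for a range check. A sketch of that variant, with made-up names; like the version above it only covers letters, not digits or punctuation:

```java
final class FullWidthLetterDemo {
    // True for full-width Latin letters only.
    static boolean isFullWidthLetter(char c) {
        return (c >= 0xFF21 && c <= 0xFF3A)   // upper case, U+FF21..U+FF3A
            || (c >= 0xFF41 && c <= 0xFF5A);  // lower case, U+FF41..U+FF5A
    }

    // Single pass over the input instead of one contains() call per letter.
    static boolean containsFullWidthLetter(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (isFullWidthLetter(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsFullWidthLetter("abc"));
        System.out.println(containsFullWidthLetter("ab\uFF43"));
    }
}
```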
Upvotes: 2
Reputation: 7720
You can compare the Unicode code points. The code points for the half-width letters a-z are 97-122, so you can easily differentiate between the two:
String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";
System.out.println((int) str.charAt(0));
For the input
ａｂｃ＠ｇｍａｉｌ.ｃｏｍ
the output is
65345
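To compare a whole string rather than just the first character, something like the following sketch could list every code point (requires Java 8 or later for String.codePoints(); the class and method names are just for illustration):

```java
final class CodePointDemo {
    // Returns the decimal code points of s separated by single spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(cp).append(' '));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));                // 97 98 99
        System.out.println(describe("\uFF41\uFF42\uFF43")); // 65345 65346 65347
    }
}
```

Each full-width letter is exactly 65248 (0xFEE0) above its half-width counterpart.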
Upvotes: 2