Kumar

Reputation: 1277

How to check if a String has full width characters in Java

Can anyone suggest how to check whether a String contains full width characters in Java? Full width characters are special characters.

Full width characters in String:

ａｂｃ＠ｇｍａｉｌ.ｃｏｍ

Half width characters in String:

abc@gmail.com

Upvotes: 7

Views: 10966

Answers (6)

GenuineSounds

Reputation: 31

The ASCII characters we care about start at \u0021 and continue to \u007E. The wide Unicode characters are conveniently in the same order, starting at \uFF01 and ending at \uFF5E. The two ranges are offset from each other by 0xFF00 and -0x20 (0xFEE0 in total), two simple hexadecimal numbers, so we can either use simple addition and subtraction or use bitwise operators. For this example I'll use bitwise operators to help visualize the process.

Armed with this knowledge we can now convert them back and forth quite easily by using these two methods.

This converts each relevant ASCII character to its wide Unicode version.

public static char toWideCharacter(char c) {
    // Check if they are the characters that are convertible.
    if (c > 0x20 && c < 0x7F)
        // Remove the offset and make sure we set the high bits to FF00
        return (char) ((c - 0x20) | 0xFF00);
    else return c;
}

And this converts wide Unicode characters back to their regular ASCII equivalents.

public static char fromWideCharacter(char c) {
    // We can use this statement alone to check if they are wide characters.
    if (c > 0xFF00 && c < 0xFF5F)
        // Add the offset and mask the bits we want to keep.
        return (char) (c + 0x20 & 0xFF);
    else return c;
}
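The range test in fromWideCharacter can also answer the original question directly. Here is a minimal sketch, assuming a hypothetical helper named containsWide (the class and method names are made up for illustration) that reports whether any character falls in the wide range:

```java
final class WideCheck {
    // Hypothetical helper: true if any char lies in the wide range U+FF01..U+FF5E.
    static boolean containsWide(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Same exclusive bounds as fromWideCharacter.
            if (c > 0xFF00 && c < 0xFF5F) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Wide "abc" followed by regular characters.
        System.out.println(containsWide("\uFF41\uFF42\uFF43@gmail.com")); // true
        System.out.println(containsWide("abc@gmail.com"));               // false
    }
}
```

Note that this only covers the wide counterparts of printable ASCII; other wide forms, such as the ideographic space, fall outside this range.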

Extra credit:

Here is an example of generating a String of characters and converting them back and forth to check our fragile sanity.

StringBuilder regular   = new StringBuilder();
StringBuilder widened   = new StringBuilder();
StringBuilder converted = new StringBuilder();

for (int i = 0; i < 0x60; i++) {
    char c = (char) (0x20 + i);
    regular.append(c);
    widened.append(toWideCharacter(c));
    // Converting to wide and back will ensure parity.
    converted.append(fromWideCharacter(toWideCharacter(c)));
}

System.out.println("Regular:   " + regular);
System.out.println("Widened:   " + widened);
System.out.println("Converted: " + converted);

Upvotes: 0

hieunt

Reputation: 11

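half-width: 1 byte

full-width: more than 1 byte (2, 3, 4, ... bytes, depending on the encoding)

-> compare the length of the String with its byte length; if they differ, the String contains at least one multi-byte character:

String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";
// getBytes() uses the platform default charset, so the result depends on it.
if (str.length() != str.getBytes().length) {
    // contains full-width (multi-byte) characters
} else {
    // only half-width (single-byte) characters
}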
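The byte-length comparison depends on the charset in use, and it flags any multi-byte character, not only full-width ones. Here is a rough sketch that pins the encoding to UTF-8 and shows both effects; the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;

final class ByteLengthCheck {
    // True when the UTF-8 encoding is longer than the char count,
    // which happens as soon as the string holds a non-ASCII character.
    static boolean hasMultiByteChars(String s) {
        return s.length() != s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(hasMultiByteChars("\uFF41bc")); // true: wide 'a' takes 3 bytes
        System.out.println(hasMultiByteChars("abc"));      // false: pure ASCII
        System.out.println(hasMultiByteChars("caf\u00E9")); // true: accented e is not full width
    }
}
```

The last call shows why this check is best treated as a "contains non-ASCII" test rather than a strict full-width test.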

Upvotes: 1

weston

Reputation: 54801

I'm not sure if you are looking for any or all, so here are functions for both:

public static boolean isAllFullWidth(String str) {
    for (char c : str.toCharArray())
      if ((c & 0xff00) != 0xff00)
        return false;
    return true;
}

public static boolean areAnyFullWidth(String str) {
    for (char c : str.toCharArray())
      if ((c & 0xff00) == 0xff00)
        return true;
    return false;
}

As for your half width '.' and possibly '_', you could strip them out first with a replace:

String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";

if (isAllFullWidth(str.replaceAll("[._]", ""))) {
    // then apart from . and _, they are all full width
}

Regex

Alternatively if you want to use a regex to test, then this is the actual character range for full width:

[\uFF01-\uFF5E]

So the method then looks like:

public static boolean isAllFullWidth(String str) {
    return str.matches("[\\uff01-\\uff5E]*");
}

You can add your other characters to it and so not need to strip them:

public static boolean isValidFullWidthEmail(String str) {
    return str.matches("[\\uff01-\\uff5E._]*");
}
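The same character range can also drive the "any" check. This is a minimal sketch, assuming a precompiled pattern and Matcher.find() instead of matches(); the class name FullWidthRegex is made up for illustration:

```java
import java.util.regex.Pattern;

final class FullWidthRegex {
    // Compiled once; matches a single character in U+FF01..U+FF5E.
    private static final Pattern FULL_WIDTH = Pattern.compile("[\\uFF01-\\uFF5E]");

    // find() succeeds if the pattern occurs anywhere in the input,
    // whereas matches() would require the whole input to match.
    static boolean areAnyFullWidth(String str) {
        return FULL_WIDTH.matcher(str).find();
    }

    public static void main(String[] args) {
        System.out.println(areAnyFullWidth("abc\uFF20gmail.com")); // true: wide '@'
        System.out.println(areAnyFullWidth("abc@gmail.com"));      // false
    }
}
```

Compiling the pattern once avoids recompiling the regex on every call, which String.matches does internally.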

Upvotes: 5

Garima Gangwar

Reputation: 3

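Use a regular expression here. \W matches non-word characters, that is, anything outside [a-zA-Z_0-9], so it matches full width characters but also half width punctuation such as '@' and '.'.

Because matches() tests the whole String, the following statement returns true only when str consists of a single such character:

boolean flag = str.matches("\\W");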
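To see how \W behaves on the strings from the question, here is a small sketch (class and method names are hypothetical) that contrasts matches(), which tests the whole input, with find(), which searches inside it:

```java
import java.util.regex.Pattern;

final class NonWordDemo {
    private static final Pattern NON_WORD = Pattern.compile("\\W");

    // True only if the whole string is exactly one non-word character.
    static boolean isSingleNonWord(String s) {
        return s.matches("\\W");
    }

    // True if a non-word character occurs anywhere in the string.
    static boolean containsNonWord(String s) {
        return NON_WORD.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isSingleNonWord("\uFF41"));         // true: wide 'a' is a non-word char
        System.out.println(isSingleNonWord("abc@gmail.com"));  // false: more than one character
        System.out.println(containsNonWord("abc@gmail.com"));  // true: '@' and '.' are non-word chars
    }
}
```

Since the half width address also contains non-word characters, \W on its own cannot tell the two strings apart.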

Upvotes: 0

ByteHamster

Reputation: 4951

You can try something like this:

public static final String FULL_WIDTH_CHARS = "ＡａＢｂＣｃＤｄＥｅＦｆＧｇＨｈＩｉＪｊ"
                      + "ＫｋＬｌＭｍＮｎＯｏＰｐＱｑＲｒＳｓＴｔＵｕＶｖＷｗＸｘＹｙＺｚ";

public static boolean containsFullWidthChars(String str) {
    for(int i = 0; i < FULL_WIDTH_CHARS.length(); i++) {
        if(str.contains(String.valueOf(FULL_WIDTH_CHARS.charAt(i)))) {
            return true;
        }
    }
    return false;
}

Upvotes: 2

Neeraj Jain

Reputation: 7720

You can compare the Unicode code points. The code points for the half width letters a-z are 97-122, so you can easily differentiate between the two:

String str = "ａｂｃ＠ｇｍａｉｌ.ｃｏｍ";
System.out.println((int) str.charAt(0));

For the input

ａｂｃ＠ｇｍａｉｌ.ｃｏｍ

Output

65345
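The comparison can be turned into a pair of range checks. This is a minimal sketch, assuming hypothetical helper names and covering lowercase letters only; the full width range 65345-65370 is simply 97-122 shifted by 65248 (0xFEE0):

```java
final class CodePointDemo {
    // Half width a-z occupy code points 97..122.
    static boolean isHalfWidthLowercase(char c) {
        return c >= 97 && c <= 122;
    }

    // Full width a-z occupy code points 65345..65370 (U+FF41..U+FF5A).
    static boolean isFullWidthLowercase(char c) {
        return c >= 65345 && c <= 65370;
    }

    public static void main(String[] args) {
        char half = 'a';
        char full = '\uFF41'; // full width 'a'
        System.out.println((int) half);  // 97
        System.out.println((int) full);  // 65345
        System.out.println(full - half); // 65248
    }
}
```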

Upvotes: 2
