Thomas Walker

Reputation: 145

Check if a Java string contains a Unicode character

I'm trying to check whether a string contains a specific Unicode code point from the Segoe MDL2 Assets font.

An example of a Unicode value that I want to check for is

\uF14B

Here's where I'm grabbing my values from

https://learn.microsoft.com/en-us/windows/uwp/design/style/segoe-ui-symbol-font

How exactly can I check a string to see if it contains one of these values?

I have tried

        if (buttons[i].getText().contains("\uF14B")) {

            buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15 )); 
        }

While this does work, I think that it's pretty inefficient to have to copy and paste each and every value that I plan to use into an if statement.

Is there an easier way to do this?

Edit:

I ended up placing a ~ after each special character in my array, and parsed it like this. Are there any issues in doing this?

/** Creating the names of the buttons. */
String [] buttonNames = {

        "Lsh", "Rsh", "Or", "Xor", "Not","And",
        "\uE752~", "Mod", "CE", "C", "\uF149~", "\uE94A~",
        "A", "B", "\uF14D~", "\uF14E~", "\uE94F~", "\uE947~",
        "C", "D", "\uF14A~", "\uF14B~", "\uF14C~", "\uE949~",
        "E", "F", "\uF14A~", "\uF14B~", "\uF14C~", "\uE948~",
        "(", ")", "\uE94D~", "0", ".", "\uE94E~" 
        };

/** more code here */

if (buttons[i].getText().contains("~")) {

    buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15));
    buttons[i].setText(buttons[i].getText().substring(0, buttons[i].getText().lastIndexOf('~')));
}

Upvotes: 3

Views: 9168

Answers (2)

Andreas

Reputation: 159086

The best / easiest way to scan text to find certain characters is to use a regular expression character class.

A character class is written as [xxx], where xxx can be a set of single characters, e.g. a or \uF14B, and/or ranges, e.g. a-z or \uE700-\uE71F.

So, you can write a regex like this:

[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]

and so on (that was about 10% of the code points list on the linked page).

The above can also be done using exclusion, i.e.

[\uE700-\uE756&&[^\uE72F\uE732\uE733\uE736]]

where the [^xxx] means "not any of these characters".
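As a quick sanity check, here is a minimal sketch that compares the two character classes above over the surrounding code points. The class name CharClassEquivalence and the helper agreeOnRange are made up for illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Pattern;

class CharClassEquivalence {
    // The explicit list of ranges and single characters from the answer.
    static final Pattern LISTED =
            Pattern.compile("[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]");

    // The same set written as one range intersected (&&) with a negated class.
    static final Pattern EXCLUDED =
            Pattern.compile("[\uE700-\uE756&&[^\uE72F\uE732\uE733\uE736]]");

    // Hypothetical helper: true if both patterns give the same answer for
    // every single character between from and to (inclusive).
    static boolean agreeOnRange(int from, int to) {
        for (int cp = from; cp <= to; cp++) {
            String s = String.valueOf((char) cp);
            if (LISTED.matcher(s).matches() != EXCLUDED.matcher(s).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnRange(0xE6F0, 0xE760)); // true
    }
}
```

Which form reads better probably depends on how many gaps the range has: a few holes in a long range favor the exclusion form.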

You then compile it and use it to check strings:

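String regex = "[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]";
Pattern p = Pattern.compile(regex); // java.util.regex.Pattern; compile once and reuse

if (p.matcher(buttons[i].getText()).find()) {
    // The text contains at least one of the listed code points.
    buttons[i].setFont(new Font("Segoe MDL2 Assets", Font.PLAIN, 15));
}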
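For a version that runs on its own, here is a small sketch that wraps the check in a helper method. The names SymbolFontCheck and containsSymbol are assumptions for illustration, and the character class is still only the partial list shown above, so it would need extending to cover every code point you actually use.

```java
import java.util.regex.Pattern;

class SymbolFontCheck {
    // Partial character class copied from the answer; compiled once and reused.
    private static final Pattern SYMBOLS =
            Pattern.compile("[\uE700-\uE72E\uE730\uE731\uE734\uE735\uE737-\uE756]");

    // Hypothetical helper: true if the text contains any listed code point.
    static boolean containsSymbol(String text) {
        return SYMBOLS.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsSymbol("Mod"));    // false: plain ASCII
        System.out.println(containsSymbol("\uE752")); // true: inside E737-E756
        System.out.println(containsSymbol("\uF14B")); // false: not in this partial list
    }
}
```

Note that find() is used rather than matches(), so a label that mixes ordinary text with one symbol is still detected.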

Upvotes: 3

Joop Eggen

Reputation: 109547

You can invert the font selection logic:

The Font class has goodies like canDisplay and canDisplayUpTo. Javadoc:

public int canDisplayUpTo(String str)

Indicates whether or not this Font can display a specified String. For strings with Unicode encoding, it is important to know if a particular font can display the string. This method returns an offset into the String str which is the first character this Font cannot display without using the missing glyph code. If the Font can display all characters, -1 is returned.
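A minimal sketch of that inverted logic, assuming the default font is tried first and the symbol font is only used when the default reports a character it cannot display. The class FontPicker and the method pickFont are hypothetical names; canDisplayUpTo is the real java.awt.Font method quoted above. The result depends on which fonts are installed, and logical fonts such as Dialog may fall back to other fonts, so treat this as a starting point rather than a guaranteed answer.

```java
import java.awt.Font;

class FontPicker {
    // Hypothetical helper: returns the first of the two fonts that reports it
    // can display every character of the text (canDisplayUpTo returns -1),
    // and falls back to the default font if neither can.
    static Font pickFont(String text, Font defaultFont, Font symbolFont) {
        if (defaultFont.canDisplayUpTo(text) == -1) {
            return defaultFont;
        }
        if (symbolFont.canDisplayUpTo(text) == -1) {
            return symbolFont;
        }
        return defaultFont;
    }

    public static void main(String[] args) {
        Font defaultFont = new Font(Font.DIALOG, Font.PLAIN, 15);
        // If this font is not installed, Java substitutes a default font.
        Font symbolFont = new Font("Segoe MDL2 Assets", Font.PLAIN, 15);

        Font chosen = pickFont("Mod", defaultFont, symbolFont);
        System.out.println(chosen.getFamily());
    }
}
```

With this approach there is no list of code points to maintain and no marker character to strip from the button text.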

Upvotes: 3
