Safran Ali

Reputation: 4497

How to check whether user input is in a language other than English?

I am using the Facebook API in my app for user authentication, and then I save the user data to the database. I reuse the Facebook username for my app if it exists; otherwise I create a username from the user's name. The problem is that some users don't have their display name in English. How can I check for such input on the server side?

My app is written in ASP.NET.

Upvotes: 1

Views: 693

Answers (3)

Cosmin

Reputation: 2385

You can use a regular expression to check whether the input consists only of the letters a-z or A-Z:

using System.Text.RegularExpressions;

Regex rgx = new Regex("^[a-zA-Z]+$");

if (rgx.IsMatch(inputData))
{
    // input data uses only the English alphabet; take appropriate action...
}
else
{
    // input data contains other characters; take appropriate action...
}

Upvotes: 5

cwallenpoole

Reputation: 82088

Your problem isn't that the usernames are in a foreign language, but that you are storing data in a database without an appropriate character encoding (the only time I've ever seen those ??? placeholders is when the character encoding was at least one level too low for the data).

At a minimum, you should be using UTF-8, but you probably want to use UTF-16 (or even UTF-32 if you're being really conservative). I also recommend this mandatory reading.
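As a rough sketch of where those ??? placeholders can come from, assume the text passes through a narrower encoding somewhere between the app and the database. In .NET, Encoding.ASCII replaces every character it cannot represent with '?', while Encoding.UTF8 round-trips the same text unchanged. The class name EncodingDemo, the RoundTrip helper, and the sample name are made up for illustration; they are not part of the asker's code.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Encodes the text with the given encoding and decodes it back,
    // imitating a trip through a storage layer that uses that encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        string name = "Сафран"; // hypothetical non-Latin display name

        // ASCII cannot represent Cyrillic, so each letter becomes '?'.
        Console.WriteLine(RoundTrip(name, Encoding.ASCII));        // ??????

        // UTF-8 covers all of Unicode, so the text survives intact.
        Console.WriteLine(RoundTrip(name, Encoding.UTF8) == name); // True
    }
}
```

If the stored names show the same pattern, the place to look is the column type and connection encoding rather than the input itself.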


Determining whether a username is in English is impossible. There are too many variants of proper nouns for any metric to be reliable, and then there are transplanted names and the like. You can try to detect non-ASCII characters (I believe /[^ -~]/ should match all of them: space is the lowest "typeable" character in ASCII and ~ is the highest), but then you are compensating for the Unicode problem instead of letting the computer handle it gracefully.
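For reference, the /[^ -~]/ check could look roughly like this in C#. This is only a sketch: the class and method names are hypothetical, and the pattern flags anything outside the printable ASCII range (including tabs and other control characters), which says nothing about the language of the name.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiCheck
{
    // Matches any single character outside the range space (0x20) to ~ (0x7E).
    private static readonly Regex OutsidePrintableAscii = new Regex(@"[^ -~]");

    // Returns true if the input contains at least one such character.
    public static bool ContainsNonPrintableAscii(string input)
    {
        return OutsidePrintableAscii.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsNonPrintableAscii("Safran Ali")); // False
        Console.WriteLine(ContainsNonPrintableAscii("José"));       // True
    }
}
```

Note that "José" is flagged even though it is written in the Latin alphabet, which illustrates why this test cannot decide whether a name is "English".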

Upvotes: 0

Juicy Scripter

Reputation: 25938

It may be overkill for this task, but the correct way to detect the input language is to use something like the Extended Linguistic Services APIs, or a service like Free Language Detection API.

In your case I suggest saving user names in an appropriate encoding (such as UTF-8 or UTF-16, which should be fine for user names on Facebook).

Upvotes: 1
