Alec Smart

Reputation: 95880

JavaScript: find out whether text contains English letters only

I am trying to match some text only if it contains English letters and numbers, using JavaScript/jQuery.

What is the most efficient way to do this? There could be thousands of words, so it should be as fast as possible, and I don't want to use a regex.

 var names = ['test', 'हिन', 'لعربية'];

 for (var i = 0; i < names.length; i++) {
    // ENGLISHMATCHCODEHERE is a placeholder for the check I am asking about
    if (names[i] == ENGLISHMATCHCODEHERE) {
        // do something here
    }
 }

Thank you for your time.

Upvotes: 13

Views: 42091

Answers (5)

vsync

Reputation: 130065

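Iterate over each character in the string and check whether its character code falls outside the ranges 65 to 90 (uppercase A to Z) and 97 to 122 (lowercase a to z), which cover the basic Latin alphabet. Codes 91 to 96 sit between the two ranges and are punctuation, so they have to be rejected as well.

If you want to allow punctuation characters, add their character codes to the check.

function isLatinString(s) {
  var i, charCode;
  for (i = s.length; i--;) {
    charCode = s.charCodeAt(i);
    // reject anything outside A-Z (65-90) and a-z (97-122)
    if (charCode < 65 || charCode > 122 || (charCode > 90 && charCode < 97))
      return false;
  }
  return true;
}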

// tests
[
  "abxSDSzfgr", 
  "aAzZ123dsfsdfעחלעלחי", 
  "abc!", 
  "$abc", 
  "123abc",
  " abc"
]
.forEach(s => console.log(   isLatinString(s), s   ))
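
To illustrate the point about adding more allowed characters, here is a minimal sketch that extends the character-code check to accept digits too, since the question asks for letters and numbers. The function name `isAlphanumericAscii` is made up for this example and is not part of the answer above.

```javascript
// Hypothetical helper: true only if every UTF-16 code unit is 0-9, A-Z or a-z.
function isAlphanumericAscii(s) {
  for (var i = 0; i < s.length; i++) {
    var code = s.charCodeAt(i);
    var isDigit = code >= 48 && code <= 57;   // "0".."9"
    var isUpper = code >= 65 && code <= 90;   // "A".."Z"
    var isLower = code >= 97 && code <= 122;  // "a".."z"
    if (!isDigit && !isUpper && !isLower) return false;
  }
  return true; // note: an empty string also passes
}

console.log(isAlphanumericAscii("abc123"), "abc123");
console.log(isAlphanumericAscii("abc!"), "abc!");
```

Whether an empty string should count as a match depends on your data; add a length check up front if it should not.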

Another way, using an explicit whitelist string to allow specific characters:

function isLatinString(s) {
  var i, whitelist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  for (i = 0; i < s.length; i++) // check each character in the argument string
    // if the whitelist doesn't include the character, reject the string
    if (!whitelist.includes(s[i].toUpperCase()))
      return false;
  return true;
}

// tests
[
  "abCD", 
  "aAאב", 
  "abc!", 
  "$abc", 
  "1abc",
  " abc"
]
.forEach(s => console.log(   isLatinString(s), s   ))

Upvotes: 5

T.J. Crowder

Reputation: 1074038

If you're dead set against using regexes, you could do something like this:

// Whatever valid characters you want here
var ENGLISH = {};
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789".split("").forEach(function(ch) {
    ENGLISH[ch] = true;
});

function stringIsEnglish(str) {
    var index;

    for (index = str.length - 1; index >= 0; --index) {
        if (!ENGLISH[str.substring(index, index + 1)]) {
            return false;
        }
    }
    return true;
}

Live Example:

console.log("valid", stringIsEnglish("valid"));
console.log("invalid", stringIsEnglish("invalid!"));

...but a regex (/^[a-z0-9]*$/i.test(str)) would almost certainly be faster. It is in this synthetic benchmark, but those are often unreliable.

Upvotes: 5

mhd196

Reputation: 31

You should also consider words that contain special characters. For example, "it's" has an apostrophe, but isn't it English?
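
As a rough sketch of that idea without a regex, a whitelist can simply include the extra punctuation you consider part of English words. The names `ALLOWED` and `isEnglishWord`, and the choice of apostrophe and hyphen, are assumptions made for this example.

```javascript
// Hypothetical whitelist: ASCII letters, digits, plus apostrophe and hyphen.
var ALLOWED = new Set(
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'-"
);

// True only if every character of the word is in the whitelist.
function isEnglishWord(word) {
  for (var ch of word) {
    if (!ALLOWED.has(ch)) return false;
  }
  return true;
}

console.log(isEnglishWord("it's"), "it's");
console.log(isEnglishWord("हिन"), "हिन");
```

Note that only the plain ASCII apostrophe is listed here; text with typographic quotes (’) would still be rejected unless you add that character as well.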

Upvotes: 2

raveren

Reputation: 18523

I'm afraid a regex is the fastest way to do this. To my knowledge, this should be the fastest approach:

var names = ['test', 'हिन', 'لعربية'];

//algorithm follows
var r = /^[a-zA-Z0-9]+$/,
    i = names.length;

while (i--) { // post-decrement so that index 0 is visited too
    if (r.test(names[i])) {
        // do something here
    }
}

Upvotes: 1

Pointy

Reputation: 413702

A regular expression for this might be:

var english = /^[A-Za-z0-9]*$/;

Now, I don't know whether you'll want to include spaces and stuff like that; the regular expression could be expanded. You'd use it like this:

if (english.test(names[i])) // ...
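
As one possible expansion, here is a small sketch that also allows plain spaces and applies the pattern to a whole list. The sample array and the names `englishWithSpaces` and `matches` are assumptions made for illustration.

```javascript
// Assumed sample data, based on the question plus one multi-word entry.
var names = ["test", "हिन", "لعربية", "hello world 42"];

// Letters, digits and the plain space character; "*" also accepts "".
var englishWithSpaces = /^[A-Za-z0-9 ]*$/;

// Keep only the entries that consist entirely of allowed characters.
var matches = names.filter(function (name) {
  return englishWithSpaces.test(name);
});

console.log(matches); // the two ASCII-only entries
```

Swap `*` for `+` if empty strings should not count as matches.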

Also see this: Regular expression to match non-English characters?

Edit: my brain filtered out the "I don't want to use a regex" part because it failed the "isSilly()" test. You could always check the character code of each letter in the word, but that's going to be slower (maybe much slower) than letting the regex matcher do the work. The built-in regular expression engine is really fast.

When you're worried about performance, always do some simple tests first before making assumptions about the technology (unless you've got intimate knowledge of the technology already).
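
A simple test of that kind could look like the sketch below. Everything in it (the generated word list, the helper names `byRegex`, `byCharCode` and `time`) is made up for illustration, and single-pass wall-clock timings like this are only a rough indication; results vary by engine and input.

```javascript
// Build sample input: a mix of ASCII and non-ASCII words (made-up data).
var words = [];
for (var n = 0; n < 100000; n++) {
  words.push(n % 3 === 0 ? "हिन" + n : "word" + n);
}

var re = /^[A-Za-z0-9]*$/;

function byRegex(s) { return re.test(s); }

function byCharCode(s) {
  for (var i = 0; i < s.length; i++) {
    var c = s.charCodeAt(i);
    // allow 0-9, A-Z and a-z only
    if (!((c >= 48 && c <= 57) || (c >= 65 && c <= 90) || (c >= 97 && c <= 122))) {
      return false;
    }
  }
  return true;
}

// Count matches and report elapsed wall-clock time for one pass.
function time(label, fn) {
  var start = Date.now();
  var count = 0;
  for (var i = 0; i < words.length; i++) {
    if (fn(words[i])) count++;
  }
  console.log(label + ": " + count + " matches in " + (Date.now() - start) + " ms");
  return count;
}

var regexCount = time("regex", byRegex);
var loopCount = time("charCode loop", byCharCode);
```

Both counts should agree; if they do not, the two checks are not testing the same thing and the timing comparison is meaningless.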

Upvotes: 35
