lorife

Reputation: 387

How to find if a word has standard ASCII characters?

I'm using Delphi 10.3.2 Rio and I need to determine whether a string contains only ASCII characters.

If it contains anything other than ASCII characters, I also need to determine whether it is Cyrillic.

I was thinking about Unicode ranges.

This is Latin:

https://jrgraphix.net/r/Unicode/0020-007F

and this is Cyrillic:

https://jrgraphix.net/r/Unicode/0400-04FF

But I do not know how to check for Unicode ranges, and I don't know whether that is a good way to achieve what I need.

Can anybody help? Thank you.

Upvotes: 3

Views: 771

Answers (2)

Rob Lambden

Reputation: 2293

TPerlRegEx is your friend!

If you haven't used regular expressions before, don't panic.

I just found this 'SkillSprint' on using them. I haven't watched it, but it's probably helpful to you.

There are also lots of tools online to help you test your RegEx syntax to see if it works. This link goes to one I have used myself (there are lots available).

// Requires System.RegularExpressionsCore in the uses clause.
function IsJustAscii(const Input: string): Boolean;
var
  pRegEx: TPerlRegEx;
begin
  pRegEx := TPerlRegEx.Create;
  try
    // \A and \z anchor to the very start and end of the subject, so every
    // character (zero or more of them) must be in the range $20..$7F.
    pRegEx.RegEx := '\A[\x20-\x7f]*\z';
    pRegEx.Subject := Input;
    Result := pRegEx.Match;
  finally
    pRegEx.Free;
  end;
end;

function ContainsCyrillic(const Input: string): Boolean;
var
  pRegEx: TPerlRegEx;
begin
  pRegEx := TPerlRegEx.Create;
  try
    pRegEx.RegEx := '[\x{0400}-\x{04ff}]+';    // one or more Cyrillic characters
    pRegEx.Subject := Input;
    Result := pRegEx.Match;
  finally
    pRegEx.Free;
  end;
end;

The first function checks that the whole string contains only ASCII characters in the range $20 to $7F (you may want to allow newlines, tabs, carriage returns, etc. as well).

The second function checks whether there are any Cyrillic characters in the string.
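
To cover both parts of the question, the two functions can be combined. This is only a sketch: `DescribeScript` and the labels it returns are made-up names for illustration, and it assumes `IsJustAscii` and `ContainsCyrillic` are declared earlier in the same unit.

```pascal
// Hypothetical helper built on IsJustAscii and ContainsCyrillic from this answer.
// The function name and the returned labels are illustrative only.
function DescribeScript(const Input: string): string;
begin
  if IsJustAscii(Input) then
    Result := 'ASCII only'
  else if ContainsCyrillic(Input) then
    Result := 'contains Cyrillic'
  else
    Result := 'non-ASCII, no Cyrillic';
end;
```

The ASCII test runs first, so the Cyrillic check is only reached for strings that already failed it.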

Upvotes: 1

David Heffernan

Reputation: 612884

Step through the characters one by one and inspect their ordinal value. For example:

var
  c: char;
  str: string;
....
str := ...;
for c in str do
  if InRange(Ord(c), $0020, $007f) then  // InRange is declared in System.Math
    // c is in the ASCII range $20..$7f
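
Fleshed out, the same loop could look roughly like the console program below. Treat it as a sketch rather than a definitive implementation: `IsPrintableAscii` and `HasCyrillic` are hypothetical names, the ASCII range is the `$0020..$007F` one from the snippet above (so tabs and line breaks count as non-ASCII), and the Cyrillic range is the `$0400..$04FF` block mentioned in the question.

```pascal
program ScriptCheck;

{$APPTYPE CONSOLE}

uses
  System.Math; // provides InRange

// Hypothetical helper: True when every character lies in $0020..$007F.
// An empty string also returns True because no character fails the test.
function IsPrintableAscii(const S: string): Boolean;
var
  C: Char;
begin
  Result := True;
  for C in S do
    if not InRange(Ord(C), $0020, $007F) then
      Exit(False);
end;

// Hypothetical helper: True when at least one character lies in the
// Cyrillic block $0400..$04FF.
function HasCyrillic(const S: string): Boolean;
var
  C: Char;
begin
  Result := False;
  for C in S do
    if InRange(Ord(C), $0400, $04FF) then
      Exit(True);
end;

begin
  Writeln(IsPrintableAscii('Delphi'));
  // #$043F#$0440 spells two Cyrillic letters without depending on the
  // encoding of the source file.
  Writeln(HasCyrillic('ab' + #$043F#$0440));
end.
```

A Delphi `string` is UTF-16, so `for C in S` walks UTF-16 code units. Both ranges sit entirely in the Basic Multilingual Plane, so surrogate pairs never fall inside either of them.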

Upvotes: 4
