copenndthagen

Reputation: 50732

UTF-8 character set

I have a form field that should allow up to 120 characters and accept the full Unicode character set (encoded as UTF-8), including special, numeric, and alphabetic characters, to support i18n. It should ignore leading and trailing spaces.

As I have mostly worked with the limited ASCII set, I am not sure what UTF-8 would include.

Could you please explain the basic differences between ASCII and UTF-8, and which characters should be allowed given the above requirement?

Thank you.

Upvotes: 0

Views: 600

Answers (2)

Pavel Nikolov

Reputation: 9541

ASCII defines only 128 characters, whereas Unicode (as of version 6.0) defines more than 109,000 characters covering 93 scripts.

http://en.wikipedia.org/wiki/ASCII - a full description of ASCII

http://en.wikipedia.org/wiki/Unicode - the Wikipedia article on Unicode

http://unicode.org/charts/ - the Unicode code charts
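
To make the size difference concrete, here is a minimal Python sketch (Python is my choice here, the question does not name a language). The helper name `is_ascii` is hypothetical. It simply checks whether every character falls in the 128 code points (0 to 127) that ASCII defines; anything above that is Unicode beyond ASCII.

```python
def is_ascii(text: str) -> bool:
    """Return True if every character is one of the 128 ASCII code points."""
    # ord() gives the Unicode code point; ASCII occupies 0..127.
    return all(ord(ch) < 128 for ch in text)


if __name__ == "__main__":
    # Sample strings are illustrative: plain ASCII, Latin with an accent, Japanese.
    for sample in ["hello", "h\u00e9llo", "\u65e5\u672c\u8a9e"]:
        code_points = [f"U+{ord(ch):04X}" for ch in sample]
        print(code_points, "ASCII only:", is_ascii(sample))
```

On Python 3.7 and later the built-in `str.isascii()` does the same check.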

Upvotes: 1

ziesemer

Reputation: 28687

Simply put, UTF-8 is a superset of US-ASCII: every ASCII character is encoded in UTF-8 as the same single byte, so any pure-ASCII text is already valid UTF-8. UTF-8 is one encoding of Unicode, and it can represent every character Unicode currently defines.
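
A rough Python sketch of both points, under a few assumptions of mine: the helper name `clean_field` and the `ValueError` are hypothetical, and "120 characters" is read as 120 Unicode code points (not bytes, and not user-perceived characters, so "e" followed by a combining accent counts as 2). The first part shows that ASCII characters keep their single-byte encoding in UTF-8 while other characters take 2 to 4 bytes; the second trims the field and enforces the limit from the question.

```python
MAX_CHARS = 120  # limit stated in the question


def utf8_length(ch: str) -> int:
    """Number of bytes UTF-8 uses to encode the given text."""
    return len(ch.encode("utf-8"))


def clean_field(raw: str) -> str:
    """Trim surrounding whitespace and enforce the 120-character limit.

    Hypothetical helper: the name and error handling are illustrative only.
    """
    # str.strip() with no argument removes Unicode whitespace too,
    # e.g. the ideographic space U+3000, not just the ASCII space.
    value = raw.strip()
    # len() on a str counts code points, not encoded bytes.
    if len(value) > MAX_CHARS:
        raise ValueError(f"field is limited to {MAX_CHARS} characters")
    return value


if __name__ == "__main__":
    # "A" (ASCII), e-acute, euro sign, and an emoji outside the BMP.
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
        print(f"U+{ord(ch):04X}", utf8_length(ch), ch.encode("utf-8").hex())

    # Pure-ASCII text yields identical bytes in both encodings.
    print("plain text".encode("ascii") == "plain text".encode("utf-8"))
```

If the backing store limits the field in bytes rather than characters, the check would need `len(value.encode("utf-8"))` instead, since a 120-character value can occupy up to 480 bytes in UTF-8.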

Upvotes: 0
