Reputation: 31
What is the difference between Unicode and ASCII in terms of memory? How much memory does text in Unicode and in ASCII take up? I know that for Unicode it depends on the encoding (UTF-8, UTF-16, etc.), but I need a deeper understanding!
Upvotes: 2
Views: 3913
Reputation: 1153
In short, ASCII uses 7-bit code points (i.e. 7 bits uniquely identify every character), whereas Unicode is defined using 21-bit code points (0 hex to 10FFFF hex: 17 planes of 65,536 (2^16) code points each, giving 1,114,112 code points, and the smallest power of two that covers that range is 2^21). How much memory that uses depends on how the text is encoded in memory, which is not necessarily the same as the serialisation encoding used to externalise the data in files (typically one of the UTF encodings for Unicode).
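The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch; the variable names are made up for illustration.

```python
# Unicode code space: 17 planes of 2**16 code points each.
PLANES = 17
PLANE_SIZE = 2 ** 16  # 65,536 code points per plane

total_code_points = PLANES * PLANE_SIZE
max_code_point = total_code_points - 1  # code points start at 0

# Number of bits needed to hold the largest code point in each code space.
unicode_bits = max_code_point.bit_length()
ascii_bits = (127).bit_length()  # ASCII runs from 0 to 127

print(total_code_points)         # 1114112
print(hex(max_code_point))       # 0x10ffff
print(unicode_bits, ascii_bits)  # 21 7
```

Note that 21 bits is the width of a code point as an abstract number, not the storage cost of a character; that is decided by the encoding.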
In practice ASCII is stored as one character per byte in RAM. Pure ASCII is very rare, particularly outside the USA; it is more common to see ISO 8859-1, an 8-bit encoding that is fully backward compatible with ASCII but uses the extra bit for additional characters, e.g. the £ and ¡ characters needed in some European countries.
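To illustrate the compatibility point, here is a small Python sketch using the standard `ascii` and `latin-1` codecs (Python's name for ISO 8859-1). The sample string and variable names are arbitrary.

```python
text = "cafe"

# Plain ASCII text produces exactly the same bytes in ISO 8859-1.
ascii_bytes = text.encode("ascii")
latin1_bytes = text.encode("latin-1")
same_bytes = ascii_bytes == latin1_bytes

# Characters above 127 still fit in a single byte in ISO 8859-1.
pound = "\u00a3".encode("latin-1")     # pound sign, byte 0xA3
inverted = "\u00a1".encode("latin-1")  # inverted exclamation mark, byte 0xA1

# The same character cannot be represented in 7-bit ASCII at all.
try:
    "\u00a3".encode("ascii")
    pound_fits_ascii = True
except UnicodeEncodeError:
    pound_fits_ascii = False

print(same_bytes, len(ascii_bytes))  # True 4
print(pound, inverted)               # b'\xa3' b'\xa1'
print(pound_fits_ascii)              # False
```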
Unicode is more complex, and representations vary considerably:
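As a rough illustration of how much the size can vary, the sketch below counts the bytes each of the three common UTF encodings needs for a few sample characters. The samples and the `encoded_sizes` helper are my own choices, and the little-endian codec variants are used so that no byte order mark is added to the count. This measures serialised size only; what a given language runtime uses internally in RAM is implementation specific and is not measured here.

```python
samples = {
    "letter A": "A",                # U+0041
    "pound sign": "\u00a3",         # U+00A3
    "euro sign": "\u20ac",          # U+20AC
    "grinning face": "\U0001F600",  # U+1F600, outside the first plane
}
encodings = ["utf-8", "utf-16-le", "utf-32-le"]


def encoded_sizes(ch):
    """Return the number of bytes ch occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in encodings}


for name, ch in samples.items():
    print(name, encoded_sizes(ch))
# letter A      -> utf-8: 1, utf-16-le: 2, utf-32-le: 4
# pound sign    -> utf-8: 2, utf-16-le: 2, utf-32-le: 4
# euro sign     -> utf-8: 3, utf-16-le: 2, utf-32-le: 4
# grinning face -> utf-8: 4, utf-16-le: 4, utf-32-le: 4
```

So UTF-8 costs 1 to 4 bytes per code point, UTF-16 costs 2 or 4, and UTF-32 always costs 4.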
Joel's article is golden reading for this topic.
Upvotes: 2