knowledge

Reputation: 1015

ASCII - (encoded) character set or character encoding

Is ASCII an (encoded) character set or an encoding? Some sources say it's a (7-bit) encoding, others say it's a character set.

What's correct?

Upvotes: 1

Views: 1216

Answers (3)

Tom Blodget

Reputation: 20782

You could say that ASCII is a character set that has two encodings: a 7-bit one called ASCII and an 8-bit one called ASCII.

The 7-bit one was sometimes paired with a parity-bit scheme when text was sent over unreliable transports. Today, error detection and correction are handled on a separate layer, so only the 8-bit encoding is used.
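For illustration, both points can be sketched in Python: ASCII code points fit in 7 bits, which leaves the 8th bit of each stored byte free for a parity bit. The even-parity function below is a hypothetical sketch; real parity schemes varied by transport.

```python
# ASCII code points fit in 7 bits, so a stored byte always has its high bit
# clear. The spare 8th bit could carry a parity bit; this sketch uses even
# parity (an assumption for illustration; real schemes varied).

def with_even_parity(byte: int) -> int:
    """Set bit 7 if needed so the total number of 1-bits is even."""
    ones = bin(byte).count("1")
    return byte | 0x80 if ones % 2 else byte

data = "AC".encode("ascii")            # b'\x41\x43', both below 0x80
assert all(b < 0x80 for b in data)     # high bit clear: the 7-bit property

protected = bytes(with_even_parity(b) for b in data)
# 'A' (0x41) already has an even bit count; 'C' (0x43) gains the parity bit.
assert protected == bytes([0x41, 0xC3])
```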

Terms change over time as concepts evolve and convolve. "Character" is currently a very ambiguous term. People often mean grapheme when they say character. Or they mean a particular data type in a specific language.

"ASCII" is a genericized brand and leads to a lot of confusion. The ASCII that I've described above is only used in very specialized contexts.

Upvotes: 0

Harry

Reputation: 1263

It looks like your question cannot currently be answered correctly, as "character set" is not properly defined.

See https://en.wikipedia.org/wiki/Category:Character_sets: the category of character sets includes articles on specific character encodings (see the article for a precise definition, and for why the term "character set" should not be used).

Edit: in my opinion, ASCII can only be seen as an encoding, or better, a code page. See for example Microsoft's listing of code pages: 20127 us-ascii, 65001 utf-8.
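As a small check on those two code page names, Python's codec registry accepts "us-ascii" as an alias for its ASCII codec and "utf-8" directly, and the two produce identical bytes for the 128 ASCII characters (a sketch; the Windows code page numbers themselves are not used as codec names here):

```python
# "us-ascii" is a registered alias of Python's ASCII codec; for characters
# in ASCII's range, UTF-8 (Windows code page 65001) yields identical bytes.
text = "code page"
ascii_bytes = text.encode("us-ascii")
utf8_bytes = text.encode("utf-8")
assert ascii_bytes == utf8_bytes == b"code page"
```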

Upvotes: -1

Jon Hanna

Reputation: 113242

It's an encoding that supports only a certain set of characters.

Once upon a time, when computers or operating systems would often support only a single encoding, it was sensible to refer to the set of characters it supported as a character set, for obvious enough reasons.

From 1963 on, ASCII was a commonly supported character set, and many other character sets were either variations on it or 8-bit extensions of it.

But as well as defining a set of characters, it also assigned numerical values, so it was a coded character set.

And since it assigns a number to each character, it also provides a way to store those characters as sequences of bytes (as long as the byte size is 7 bits or more); hence it also defined an encoding.
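Both roles, assigning numbers to characters and storing those numbers as bytes, can be seen directly in Python (a brief sketch):

```python
# As a coded character set, ASCII assigns 'A' the number 65 ...
assert ord("A") == 65 and chr(65) == "A"

# ... and as an encoding, it stores that same number as the byte value,
# one byte per character.
assert "A".encode("ascii") == bytes([65]) == b"A"
assert list("Hi".encode("ascii")) == [72, 105]
```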

So ASCII was used both to refer to the set of characters it supported, and the encoding rules by which those characters would be stored digitally.

These days most computers use the Universal Character Set. While there are encodings (UTF-8 and UTF-16 being the most prevalent) that can encode the entire UCS, there remain some uses for legacy encodings like ASCII that can only encode a small subset of it.
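A quick Python sketch of that difference: UTF-8 handles any UCS character, while the ASCII codec rejects anything outside its 128-character repertoire.

```python
# 'é' (U+00E9) is in the UCS but not in ASCII.
assert "héllo".encode("utf-8") == b"h\xc3\xa9llo"

try:
    "héllo".encode("ascii")
    raise AssertionError("expected a UnicodeEncodeError")
except UnicodeEncodeError:
    pass  # ASCII can only encode U+0000 through U+007F
```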

So, ASCII can refer both to an encoding and to the set of characters it supports. In remaining modern use (especially where an escape mechanism allows other characters to be represented indirectly, such as character entity references), it is mostly referred to as an encoding. Conversely, though, "character set" (or the abbreviation "charset") is sometimes still used to refer to encodings. So in common parlance the two are synonyms, as unfortunate (and technically inaccurate) as that may be.

Upvotes: 3
