user3148326

Reputation: 121

ASN1 UTF-8 string Decoding

I am writing an ASN.1 parser in C (using the Ericsson ASN1 specification document). I want to decode the UTF-8 string type, but I can't find information about it online, and the document I'm using does not describe the UTF-8 string type in detail. Can anybody provide some code, or explain how to decode it?

I am new to ASN.1.

Upvotes: 5

Views: 6123

Answers (2)

Norman Gray

Reputation: 12514

If you're trying to parse ASN.1, then an excellent introductory resource is Kaliski's ‘Layman’s Guide’ (available at various places on the web, in HTML and PDF). However, that document doesn't mention the UTF8String type.

The extra information you need to know is that UTF8String has universal tag 12 (decimal, or 0c in hex), and that its contents are simply the bytes representing the string in the UTF-8 encoding, preceded by the usual tag and length bytes.

Thus the string ‘Helló’ would be encoded as

0c 06 48 65 6c 6c c3 b3

(I'm presuming, by the way, that ‘Ericsson ASN1 specification document’ discusses the standard ASN.1, and not some variant.)
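The tag/length/contents layout described above could be decoded in C roughly as follows. This is only a minimal sketch, assuming BER/DER with the primitive form of UTF8String (identifier byte 0c) and a definite length; the names `utf8_slice` and `decode_utf8string` are made up for illustration and don't come from any library. It does not check that the contents are well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: points into the input buffer, copies nothing. */
typedef struct {
    const uint8_t *bytes; /* UTF-8 content bytes (not NUL-terminated) */
    size_t len;           /* number of content bytes */
} utf8_slice;

/*
 * Decode one primitive, definite-length UTF8String starting at buf.
 * Returns the total number of bytes consumed (tag + length + contents),
 * or 0 if the input is not a UTF8String or is truncated.
 */
size_t decode_utf8string(const uint8_t *buf, size_t buflen, utf8_slice *out)
{
    assert(buf != NULL && out != NULL);

    /* Need at least a tag byte and one length byte; tag must be 0x0c. */
    if (buflen < 2 || buf[0] != 0x0c)
        return 0;

    size_t pos = 1;
    size_t len = buf[pos++];

    if (len & 0x80) {
        /* Long form: low 7 bits give the number of length bytes that follow. */
        size_t n = len & 0x7f;
        if (n == 0 || n > sizeof(size_t) || n > buflen - pos)
            return 0; /* indefinite, too large for size_t, or truncated */
        len = 0;
        while (n--)
            len = (len << 8) | buf[pos++];
    }

    /* The contents must fit in what remains of the buffer. */
    if (len > buflen - pos)
        return 0;

    out->bytes = buf + pos;
    out->len = len;
    return pos + len;
}
```

Given the eight bytes of the ‘Helló’ example, this should return 8 and leave `out->bytes` pointing at the six content bytes 48 65 6c 6c c3 b3. Since those bytes are already UTF-8, they can be copied out and used directly wherever UTF-8 text is expected.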

Upvotes: 9

hdante

Reputation: 8030

A full description of UTF-8, enough to write an encoder and a decoder, is summarized in the table on the Wikipedia page:

http://en.wikipedia.org/wiki/UTF-8#Description
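If you need the individual code points rather than just the raw bytes, the byte layout in that table could be turned into a decoder along these lines. This is a rough sketch, not a reference implementation; the function name `utf8_decode_one` is made up for illustration. It rejects overlong forms, surrogates, and values above U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode a single code point from s[0..n).
 * Returns the number of bytes consumed (1 to 4), or 0 if the input is
 * empty, truncated, or not well-formed UTF-8.
 */
size_t utf8_decode_one(const uint8_t *s, size_t n, uint32_t *cp)
{
    assert(s != NULL && cp != NULL);
    if (n == 0)
        return 0;

    uint8_t b = s[0];
    size_t need;      /* number of continuation bytes expected */
    uint32_t c, min;  /* accumulated value; smallest value allowed at this length */

    if (b < 0x80) {                   /* 0xxxxxxx: plain ASCII */
        *cp = b;
        return 1;
    } else if ((b & 0xe0) == 0xc0) {  /* 110xxxxx */
        need = 1; c = b & 0x1f; min = 0x80;
    } else if ((b & 0xf0) == 0xe0) {  /* 1110xxxx */
        need = 2; c = b & 0x0f; min = 0x800;
    } else if ((b & 0xf8) == 0xf0) {  /* 11110xxx */
        need = 3; c = b & 0x07; min = 0x10000;
    } else {
        return 0;                     /* stray continuation or invalid lead byte */
    }

    if (n < need + 1)
        return 0;                     /* truncated sequence */

    for (size_t i = 1; i <= need; i++) {
        if ((s[i] & 0xc0) != 0x80)    /* each continuation must be 10xxxxxx */
            return 0;
        c = (c << 6) | (s[i] & 0x3f);
    }

    /* Reject overlong encodings, surrogates, and out-of-range values. */
    if (c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
        return 0;

    *cp = c;
    return need + 1;
}
```

Called in a loop over the content bytes of a UTF8String, advancing by the return value each time, this would yield the code points one by one; a return of 0 means the string is malformed.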

Upvotes: -3
