Reputation: 121
I am writing an ASN.1 parser in C (working from the Ericsson ASN1 specification document). I want to decode the UTF-8 string type, but I can't find information about it online, and the document I'm using does not describe the UTF-8 string type in detail. Can anybody provide some code, or explain how to decode it?
I am new to ASN.1.
Upvotes: 5
Views: 6123
Reputation: 12514
If you're trying to parse ASN.1, then an excellent introductory resource is Kaliski's ‘Layman’s Guide’ (available at various places on the web, in HTML and PDF). However, that document doesn't mention the UTF8String type.
The extra information you need to know is that UTF8String has tag 12 (decimal, or 0c in hex), and that its contents, which follow the usual tag and length octets, are simply the bytes representing the string in the UTF-8 encoding.
Thus the string ‘Helló’ would be encoded as
0c 06 48 65 6c 6c c3 b3
(that is: tag 0c, length 06, then six content bytes, of which the last two, c3 b3, encode ‘ó’.)
(I'm presuming, by the way, that the ‘Ericsson ASN1 specification document’ discusses standard ASN.1, and not some variant.)
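Since you asked for code, here is a minimal sketch in C of reading one such value. The function name asn1_read_utf8string and its signature are my own invention, not part of any library. It assumes the primitive, definite-length form (which is what DER produces); BER also permits a constructed form (tag byte 2c) and indefinite lengths, which this sketch simply rejects. It does not check that the content bytes are valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from any library): reads one primitive,
 * definite-length UTF8String TLV from buf.
 * On success, returns the total number of bytes consumed and points *str
 * at the UTF-8 content (*str_len bytes, NOT NUL-terminated).
 * Returns 0 on any error. */
size_t asn1_read_utf8string(const uint8_t *buf, size_t buf_len,
                            const uint8_t **str, size_t *str_len)
{
    assert(buf != NULL && str != NULL && str_len != NULL);

    /* Need at least a tag and one length octet; tag must be 0x0c. */
    if (buf_len < 2 || buf[0] != 0x0c)
        return 0;

    size_t pos = 1;
    size_t len = buf[pos++];

    if (len & 0x80) {
        /* Long form: the low 7 bits give the number of length octets. */
        size_t n = len & 0x7f;
        if (n == 0 || n > sizeof(size_t) || n > buf_len - pos)
            return 0;   /* indefinite, too large, or truncated */
        len = 0;
        while (n--)
            len = (len << 8) | buf[pos++];
    }

    if (len > buf_len - pos)
        return 0;       /* content would run past the end of the buffer */

    *str = buf + pos;
    *str_len = len;
    return pos + len;
}
```

Given the eight bytes of the ‘Helló’ example, this should return 8 and leave *str pointing at the six content bytes starting with 48.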
Upvotes: 9
Reputation: 8030
A full description of UTF-8, enough to write an encoder and a decoder, is summarized in the table on the Wikipedia page:
http://en.wikipedia.org/wiki/UTF-8#Description
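As a rough illustration of what that table implies for a decoder, here is a sketch in C that decodes a single code point. The name utf8_decode and its signature are hypothetical, chosen only for this example. It assumes the modern four-byte limit (code points up to U+10FFFF) and rejects overlong forms and surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes one code point from s (n bytes available).
 * Returns the number of bytes consumed (1 to 4) and stores the code point
 * in *cp, or returns 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const uint8_t *s, size_t n, uint32_t *cp)
{
    assert(s != NULL && cp != NULL);
    if (n == 0)
        return 0;

    uint8_t b = s[0];
    size_t len;
    uint32_t c, min;   /* min = smallest code point allowed for this length */

    if (b < 0x80)                { len = 1; c = b;        min = 0; }
    else if ((b & 0xe0) == 0xc0) { len = 2; c = b & 0x1f; min = 0x80; }
    else if ((b & 0xf0) == 0xe0) { len = 3; c = b & 0x0f; min = 0x800; }
    else if ((b & 0xf8) == 0xf0) { len = 4; c = b & 0x07; min = 0x10000; }
    else return 0;     /* stray continuation byte or invalid lead byte */

    if (len > n)
        return 0;      /* truncated sequence */

    /* Each continuation byte must look like 10xxxxxx and adds 6 bits. */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xc0) != 0x80)
            return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3f);
    }

    /* Reject overlong encodings, values past U+10FFFF, and surrogates. */
    if (c < min || c > 0x10ffff || (c >= 0xd800 && c <= 0xdfff))
        return 0;

    *cp = c;
    return len;
}
```

Calling this in a loop over the content bytes of a UTF8String, advancing by the returned length each time, would yield the code points one by one; for the bytes c3 b3 it should give U+00F3.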
Upvotes: -3