Kasper Lethan

Reputation: 137

Converting an ISO-8859-1 string to a GSM string in C

I read some text (known to be in ISO-8859-1) from a TCP socket using the read function, then I do some basic substring replacing. Finally I would like to convert the string into the GSM equivalent.

Preferably (but not necessarily) I would do something like this:

size_t i;
for (i = 0; i < size; i++) {
  switch (string[i]) {
    case 65:
      /* Convert this character */
      break;
    case 163:
      /* Convert this character (the pound symbol £) */
      break;
  }
}

I prefer the switch for readability, but have considered if-else statements as well.

This works for the normal ASCII characters, but the upper half of ISO-8859-1 (byte values above 127) is causing me all kinds of problems. Apparently those characters are treated as multiple characters. Any help on how to proceed with the conversion would be much appreciated.

Upvotes: 4

Views: 712

Answers (1)

Joni

Reputation: 111399

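In your case char seems to be signed, so a byte such as 0xA3 (decimal 163) is typically read back as -93 and never matches a case label of 163. You could use char literals as the case labels, which sidesteps the sign of char values above 127 (where ASCII ends):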

/* ascii: */
case '\000': /* U+0000 - nul */
...    
/* extended ascii: */
case '\200': /* U+0080 - non-printable control character */
...
case '\243': /* U+00A3 - sterling pound */
...
case '\377': /* U+00ff - lower case y with dieresis */
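An alternative to char literals is to convert each byte to unsigned char before the switch, so that plain decimal labels such as 163 match again. The following is only a minimal sketch: the helper name latin1_byte_to_gsm is made up for illustration, it handles just two characters, and the fallback to '?' (0x3F) is an assumption about how unmapped bytes should be treated.

```c
#include <stddef.h>

/* Hypothetical helper: maps a single ISO-8859-1 byte to its code in the
   GSM 03.38 default alphabet. Only two characters are handled here;
   everything else falls back to '?' (0x3F), which is an assumption. */
static unsigned char latin1_byte_to_gsm(unsigned char c)
{
    switch (c) {
    case 65:            /* 'A' has the same code in both alphabets */
        return 0x41;
    case 163:           /* pound sign, 0xA3 in ISO-8859-1 */
        return 0x01;
    default:
        return 0x3F;    /* '?' */
    }
}
```

In the loop from the question this would be called as latin1_byte_to_gsm((unsigned char)string[i]). The cast is what matters: without it, a signed char holding 0xA3 is promoted to a negative int.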

The conversion is probably more efficient to implement as a look-up in a 256-entry array indexed by the byte value, though.
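A look-up table along those lines might be sketched as follows. The names latin1_to_gsm, init_latin1_to_gsm and convert_latin1_to_gsm are hypothetical, the table is deliberately partial (it covers the characters that share a code with ASCII plus a handful of accented ones), and using '?' for anything unmapped is an assumption. Characters that need the GSM escape sequence, such as [ ] { } and the euro sign, are not handled.

```c
#include <stddef.h>
#include <string.h>

#define GSM_FALLBACK 0x3F   /* '?', used for bytes with no mapping (assumption) */

/* Indexed by ISO-8859-1 byte value, yields a GSM 03.38 default alphabet code. */
static unsigned char latin1_to_gsm[256];

/* Must be called once before convert_latin1_to_gsm. */
static void init_latin1_to_gsm(void)
{
    int c;

    memset(latin1_to_gsm, GSM_FALLBACK, sizeof latin1_to_gsm);

    /* Ranges where both alphabets use the same code. */
    for (c = 0x20; c <= 0x3F; c++) latin1_to_gsm[c] = (unsigned char)c; /* space..'?' */
    for (c = 0x41; c <= 0x5A; c++) latin1_to_gsm[c] = (unsigned char)c; /* A..Z */
    for (c = 0x61; c <= 0x7A; c++) latin1_to_gsm[c] = (unsigned char)c; /* a..z */
    latin1_to_gsm[0x0A] = 0x0A; /* line feed */
    latin1_to_gsm[0x0D] = 0x0D; /* carriage return */

    /* Characters whose codes differ (partial list). */
    latin1_to_gsm[0x40] = 0x00; /* @ */
    latin1_to_gsm[0xA3] = 0x01; /* pound sign */
    latin1_to_gsm[0x24] = 0x02; /* $ (overrides the range above) */
    latin1_to_gsm[0xA5] = 0x03; /* yen sign */
    latin1_to_gsm[0xE8] = 0x04; /* e with grave */
    latin1_to_gsm[0xE9] = 0x05; /* e with acute */
    latin1_to_gsm[0x5F] = 0x11; /* underscore */
    latin1_to_gsm[0xC6] = 0x1C; /* AE ligature */
    latin1_to_gsm[0xA4] = 0x24; /* currency sign */
    latin1_to_gsm[0xA1] = 0x40; /* inverted exclamation mark */
    latin1_to_gsm[0xE0] = 0x7F; /* a with grave */
}

/* Converts size bytes; in and out may not overlap. */
static void convert_latin1_to_gsm(const char *in, unsigned char *out, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++)
        out[i] = latin1_to_gsm[(unsigned char)in[i]]; /* cast avoids a negative index */
}
```

The cast to unsigned char before indexing serves the same purpose as the char literals above: it keeps bytes above 127 from turning into negative values.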

If the "extended ASCII" part of your input is represented as multiple characters, it's likely that your input is actually encoded in UTF-8 or something similar.
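For instance, UTF-8 encodes the pound sign as the two bytes 0xC2 0xA3, whereas ISO-8859-1 uses the single byte 0xA3. A rough way to check a buffer for this is sketched below. The function name looks_like_utf8 is hypothetical, and the test is only a heuristic: it looks for a 0xC2 or 0xC3 lead byte followed by a continuation byte, which is how UTF-8 encodes U+0080 to U+00FF, but the same byte pair is also valid (if unlikely) ISO-8859-1 text.

```c
#include <stddef.h>

/* Heuristic sketch: returns 1 if the buffer contains a byte pair that
   UTF-8 would use for a character in the range U+0080..U+00FF. */
static int looks_like_utf8(const char *buf, size_t size)
{
    size_t i;

    for (i = 0; i + 1 < size; i++) {
        unsigned char lead = (unsigned char)buf[i];
        unsigned char next = (unsigned char)buf[i + 1];

        if ((lead == 0xC2 || lead == 0xC3) && next >= 0x80 && next <= 0xBF)
            return 1;
    }
    return 0;
}
```

If this returns 1 for data read from the socket, the input would need to be decoded from UTF-8 first, or the sender's encoding should be verified.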

Upvotes: 4
