Reputation: 137
I read some text (known to be in ISO-8859-1) from a TCP socket using the read function, then do some basic substring replacement. Finally, I would like to convert the string into its GSM equivalent.
Preferably (but not necessarily) I would do something like this:
size_t i;
for (i = 0; i < size; i++) {
    switch (string[i]) {
    case 65:
        /* Convert this character */
        break;
    case 163:
        /* Convert this character (the pound symbol £) */
        break;
    }
}
I prefer the switch for readability, but have considered if-else statements as well.
This works for the normal ASCII characters, but the upper half of ISO-8859-1 (values above 127) is causing me all kinds of problems. Apparently those characters are treated as multiple characters. Any help on how to proceed with the conversion would be much appreciated.
Upvotes: 4
Views: 712
Reputation: 111399
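In your case, char seems to be signed, so bytes above 127 show up as negative values and never match a case label such as 163. You could use character literals with octal escapes instead; they have the same value as the corresponding char, which sidesteps the sign issue for values beyond the ASCII range: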
/* ascii: */
case '\000': /* U+0000 - nul */
...
/* extended ascii: */
case '\200': /* U+0080 - non-printable control character */
...
case '\243': /* U+00A3 - pound sign */
...
case '\377': /* U+00FF - lower case y with diaeresis */
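If you would rather keep the numeric case labels, another option is to cast each byte to unsigned char before switching on it. Here is a minimal sketch of that idea; count_pounds is a hypothetical helper made up for illustration, not part of your code:

```c
#include <stddef.h>

/* Hypothetical helper: counts pound signs (byte 0xA3 in ISO-8859-1).
   The cast maps every byte into 0..255, whether or not plain char
   is signed on the platform, so numeric case labels above 127 match. */
size_t count_pounds(const char *s, size_t size)
{
    size_t i, n = 0;
    for (i = 0; i < size; i++) {
        switch ((unsigned char)s[i]) {
        case 163:               /* U+00A3, pound sign */
            n++;
            break;
        default:
            break;
        }
    }
    return n;
}
```

The same cast works in front of your own switch; only the expression being switched on changes.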
The conversion is probably more efficient to implement as a look-up in an array, though.
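A rough sketch of the table approach might look like this. The names gsm_table, init_gsm_table and latin1_to_gsm are made up for illustration, the '?' fallback for unmapped bytes is an assumption, and only a handful of entries are filled in; check the full mapping against the GSM 03.38 default alphabet before relying on it:

```c
#include <stddef.h>

/* Hypothetical table: index is an ISO-8859-1 byte, value is a GSM code. */
static unsigned char gsm_table[256];

/* Fills the table. Only a few entries are set here for illustration. */
void init_gsm_table(void)
{
    int c;
    for (c = 0; c < 256; c++)
        gsm_table[c] = 0x3F;                /* assumed fallback: '?' */
    for (c = 0x30; c <= 0x39; c++)          /* digits map to themselves */
        gsm_table[c] = (unsigned char)c;
    for (c = 0x41; c <= 0x5A; c++)          /* A-Z map to themselves */
        gsm_table[c] = (unsigned char)c;
    for (c = 0x61; c <= 0x7A; c++)          /* a-z map to themselves */
        gsm_table[c] = (unsigned char)c;
    gsm_table[0x20] = 0x20;                 /* space */
    gsm_table[0x40] = 0x00;                 /* @ */
    gsm_table[0xA3] = 0x01;                 /* pound sign */
    gsm_table[0x24] = 0x02;                 /* $ */
    gsm_table[0xA5] = 0x03;                 /* yen sign */
    gsm_table[0x5F] = 0x11;                 /* _ */
}

/* Converts the buffer in place. The cast to unsigned char keeps the
   index in 0..255 even when plain char is signed. */
void latin1_to_gsm(char *s, size_t size)
{
    size_t i;
    for (i = 0; i < size; i++)
        s[i] = (char)gsm_table[(unsigned char)s[i]];
}
```

Call init_gsm_table() once at start-up, then latin1_to_gsm(buffer, size) on each buffer read from the socket. Note that the result can contain zero bytes (GSM encodes @ as 0x00), so keep passing the length around instead of treating the result as a NUL-terminated string.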
If the "extended ASCII" part of your input is represented as multiple characters, it's likely that your input is actually encoded in UTF-8 or something similar.
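One way to check this is to look at the raw bytes. In UTF-8 the characters U+0080 to U+00FF are encoded as two bytes: a lead byte 0xC2 or 0xC3 followed by a byte in the range 0x80 to 0xBF (the pound sign becomes 0xC2 0xA3). The sketch below is a heuristic only, and looks_like_utf8 is a hypothetical name; genuine ISO-8859-1 text could in principle contain the same byte pairs:

```c
#include <stddef.h>

/* Hypothetical heuristic: returns 1 if the buffer contains at least one
   two-byte UTF-8 sequence for U+0080..U+00FF (lead byte 0xC2 or 0xC3,
   then a continuation byte 0x80..0xBF) and no other bytes above 127.
   Returns 0 otherwise, including for pure ASCII input. */
int looks_like_utf8(const char *s, size_t size)
{
    size_t i = 0;
    int found = 0;
    while (i < size) {
        unsigned char b = (unsigned char)s[i];
        if (b < 0x80) {
            i++;                            /* plain ASCII byte */
        } else if ((b == 0xC2 || b == 0xC3) && i + 1 < size &&
                   ((unsigned char)s[i + 1] & 0xC0) == 0x80) {
            found = 1;                      /* two-byte sequence */
            i += 2;
        } else {
            return 0;                       /* stray high byte */
        }
    }
    return found;
}
```

If this returns 1 for the data coming off the socket, the input is probably UTF-8 and needs decoding before any per-byte conversion.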
Upvotes: 4