Reputation: 299
I am fetching emails from a mail server, converting each message to UTF-8, and saving it in a DB. To convert the charset I am using mb_convert_encoding, but it fails to convert gb2312 and ks_c_5601-1987. On googling I found that I can use CP936 instead of gb2312, and CP949 instead of ks_c_5601-1987.
Going by the above approach, I would have to maintain a separate list of charset mappings in my code. Is there a way to normalize encoding names to the names PHP supports internally, eliminating the need to maintain a local map?
Upvotes: 7
Views: 3265
Reputation: 17336
According to the list of supported character encodings, only a small number of encodings are listed explicitly by code page. Given how few such cases there are, a list of mappings may not be too inappropriate, even though it is not the built-in normalisation you asked for.
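A small mapping of this kind could be sketched as follows. This is only an illustration, not a built-in PHP facility: the constant CHARSET_ALIASES and the functions normalize_charset and to_utf8 are hypothetical names, and the map holds only the two substitutions mentioned in the question (gb2312 to CP936, ks_c_5601-1987 to CP949). Whether a given PHP build already accepts either name may vary by version.

```php
<?php
// Hypothetical alias map: lowercase charset names as they appear in mail
// headers => names that mb_convert_encoding is assumed to accept.
// Only the two pairs from the question are included.
const CHARSET_ALIASES = [
    'gb2312'         => 'CP936',
    'ks_c_5601-1987' => 'CP949',
];

// Return the mapped name if the charset is a known alias, otherwise
// return the name unchanged. The lookup is case-insensitive.
function normalize_charset(string $name): string
{
    $key = strtolower(trim($name));
    return CHARSET_ALIASES[$key] ?? $name;
}

// Convert a message body to UTF-8, normalising the source charset first.
function to_utf8(string $body, string $charset): string
{
    return mb_convert_encoding($body, 'UTF-8', normalize_charset($charset));
}
```

With this in place, a call such as to_utf8($body, 'GB2312') would pass CP936 to mb_convert_encoding, while names not in the map are handed through untouched.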
The relevant ones appear to be the following (the lowercase name on the right is the name you'll need to convert from):
The following are also listed by code-page on the PHP documentation but appear to have suitable synonyms already:
Upvotes: 2