Reputation:
How does a browser know which character encoding an HTML page uses? I can declare one, for example UTF-8, in the HTML file itself, but how does the browser know to use UTF-8 before it has read that declaration?
What if I declare UTF-8 in the document but save the actual text file in a different encoding? What would go wrong? Thanks
Upvotes: 0
Views: 69
Reputation: 707
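For UTF-8 to work, everything has to be encoded in UTF-8: the bytes of the file must actually match the encoding the page declares. The web browser will not change the file's encoding; it simply decodes the bytes using the declared encoding, so text saved in a different encoding comes out garbled. If you import a text file that is encoded differently, then as a programmer you'll have to either require that the uploaded file is encoded in UTF-8, or convert the file's encoding to UTF-8 yourself.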
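To make the mismatch concrete, here is a minimal Python sketch of what happens when the declared and actual encodings differ, and of the conversion step. The sample text, the choice of Windows-1252 as the "other" encoding, and the helper name `to_utf8` are assumptions for illustration only. A browser's decoder is not Python's, but replacing invalid bytes with U+FFFD is roughly how a mis-declared page ends up displayed.

```python
# Hypothetical helper: re-encode bytes from a *known* source encoding to UTF-8.
# Only the built-in str.encode / bytes.decode methods are used.
def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    return raw.decode(source_encoding).encode("utf-8")


text = "café"

# Case 1: the file was saved as Windows-1252, but the page declares UTF-8.
saved = text.encode("cp1252")  # the "é" becomes the single byte 0xE9
# 0xE9 on its own is not valid UTF-8, so a lenient decoder substitutes
# the replacement character U+FFFD.
as_browser_sees = saved.decode("utf-8", errors="replace")
print(as_browser_sees)

# Case 2: the file was saved as UTF-8, but is decoded as Windows-1252.
# Each byte of the two-byte "é" is shown as a separate character.
mojibake = text.encode("utf-8").decode("cp1252")
print(mojibake)

# The fix: convert the bytes to UTF-8 before serving them as UTF-8.
fixed = to_utf8(saved, "cp1252")
print(fixed.decode("utf-8"))
```

Note that the conversion only works if you know (or can reliably guess) the source encoding. The bytes alone do not say which encoding produced them.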
Here are a few examples of questions where mixed encodings caused garbled text:
UTF-8 text is garbled when form is posted as multipart/form-data
how/unable to convert garbled/strange text to utf-8 android (java)?
Garbled UTF-8 characters in PHP
Lastly, while searching for examples I came across this excellent article: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
Upvotes: 1