Tags: mysql, encoding, utf-8, character-encoding, phpmyadmin

Reputation: 610

File encoding (UTF-8 not working properly)

On my webpage there is a form with multiple inputs. However, the characters inside the input fields render differently from the characters in the input labels. I tried setting the file encoding to UTF-8 and to UTF-8 + BOM (I'm using EditPlus).

Using UTF-8:

[screenshot of the form]

Using UTF-8 + BOM:

[screenshot of the form]

The input field values come from a MySQL database whose collation is utf8_unicode_ci (set through phpMyAdmin), so I don't know whether that is the source of the problem. Any ideas?

Upvotes: 0

Answers (2)

Correia JPV

Reputation: 610

Solved it: I just changed the file encoding to "Western European (Windows) 1252" (using EditPlus) and now every character is shown correctly.

Upvotes: 0

deceze

Reputation: 522042

This means both pieces of data are not in the same encoding. If the file is interpreted as Latin-1 (or a similar encoding), you get the first result in which the data in the input field is valid (meaning it's Latin-1 encoded) but the label is wrong (meaning it's not Latin-1 encoded). When the file is interpreted as UTF-8, the label is correct (meaning it's UTF-8 encoded) but the data in the input field is wrong (meaning it's not UTF-8 encoded). If data shows up as the � UNICODE REPLACEMENT CHARACTER, it's a sure sign the document is being interpreted as a Unicode encoding (e.g. UTF-8), but the byte sequence is invalid.
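To make the mismatch concrete, here is a minimal Python sketch of the two situations described above. The sample word is a made-up stand-in for whatever text the page contains; the label is assumed to be stored as UTF-8 bytes and the database value as Latin-1 bytes.

```python
# Hypothetical sample text; any string with non-ASCII characters behaves the same way.
text = "ação"

label_bytes = text.encode("utf-8")    # label hardcoded in a file saved as UTF-8
data_bytes = text.encode("latin-1")   # value returned over a Latin-1 connection

# Page interpreted as Latin-1: the database value is fine, the label is garbled.
data_as_latin1 = data_bytes.decode("latin-1")    # 'ação'
label_as_latin1 = label_bytes.decode("latin-1")  # 'aÃ§Ã£o'

# Page interpreted as UTF-8: the label is fine, but the Latin-1 bytes are
# invalid UTF-8, so each bad byte becomes U+FFFD (the replacement character).
label_as_utf8 = label_bytes.decode("utf-8")                  # 'ação'
data_as_utf8 = data_bytes.decode("utf-8", errors="replace")  # contains '\ufffd'

print(repr(label_as_latin1), repr(data_as_utf8))
```

No single interpretation of the page can display both strings correctly, which is why the two sources have to be brought to the same encoding.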

I'll guess that the label is hardcoded in the file but the data in the input field comes from a database. In this case you need to set the connection encoding for the database to return UTF-8.
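As a rough illustration of setting the connection encoding, the sketch below issues MySQL's `SET NAMES` statement through a DB-API style cursor. The helper name `set_connection_charset` and the default `utf8mb4` are assumptions made for this example; the question does not say which language or driver the page uses, and most drivers also offer their own charset option at connect time, which is usually the better choice.

```python
import re


def set_connection_charset(cursor, charset="utf8mb4"):
    """Ask MySQL to send and receive text on this connection in `charset`.

    `cursor` is assumed to be any DB-API style cursor with an execute() method.
    """
    # The name is interpolated into the statement, so accept only plain
    # identifier characters and reject everything else.
    if not re.fullmatch(r"[A-Za-z0-9_]+", charset):
        raise ValueError(f"unexpected charset name: {charset!r}")
    # SET NAMES sets the client, connection, and result character sets together.
    cursor.execute(f"SET NAMES {charset}")
```

The call would run once, right after connecting and before the first query that reads the form data.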

As to why the file is interpreted as Latin-1 without a BOM and as UTF-8 with a BOM: the browser recognizes the BOM as signifying UTF-8; without it, the browser defaults to Latin-1. You need to set the correct HTTP header to tell the browser what encoding the file is in, and get rid of the BOM.
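The two steps in that advice can be sketched in Python as follows. Both helper names are hypothetical, and how the header is actually sent depends on the server or framework in use, which the question does not specify.

```python
import codecs
from typing import Tuple


def strip_utf8_bom(data: bytes) -> bytes:
    """Return `data` without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def content_type_header(charset: str = "utf-8") -> Tuple[str, str]:
    """Build the HTTP header that declares the page encoding to the browser."""
    return ("Content-Type", f"text/html; charset={charset}")


page = strip_utf8_bom(b"\xef\xbb\xbf<html></html>")
name, value = content_type_header()
print(page, f"{name}: {value}")
```

With the header in place the browser no longer has to guess, so the BOM is not needed as a hint.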

Upvotes: 1
