Reputation: 1589
I have an application which has a always worked with no issues. Fast forward to today: all formatting is broken. Basically I am inserting a plain text emails to mysql db, something that has worked for more than 5 years because nothing has changed. In my php code the plain text looked like this:
hello [name],
How are you?
This is a test.
Thank you.
Ceo
Today I looked at the same php code containing the email, so this is just sitting there, like a file. Then I look at existing plain text of the email which has always been in the database and they both look like this:
hello [name],\r\n\r\n�How are you?\r\n\r\n�This is a test.\r\n\r\n�Thank you.\r\n\r\n�
Ceo
Now before I pull all my hair out, do you all know what happened in mysql db, on the browser, the server? (Oh and due to this, I am unable to get emails too.)
The glories of Monday.
Upvotes: 1
Views: 824
Reputation: 1589
ok so i tried using this mysql_real_escape_string
now my email looks like this:
hello [name],\\n\\nHow are you?\\n\\nThis is a test.\\n\\nThank you.\\n\\nCeo
its adding extra slash to it.
my html/php code looks like this:
hello [name],\n\n
How are you?\n\n
This is a test.\n\n
Thank you.\n\n
Ceo
Upvotes: 0
Reputation: 4221
The database's (or table or column) encoding or collation has somehow been changed. If you want to verify, check that column's encoding, and compare it with the encoding of other columns without the problem.
Fortunately, it's easy to change the encoding to the proper format (within cPanel or PHPMyAdmin) without having to update the actual data.
I believe that latin1_swedish_ci is the default collation that causes no problems and utf-8 should be the encoding.
Hope this helps.
Upvotes: 0
Reputation: 3908
"�" has the following characters from latin-1
(iso-8859-1
):
303 195 C3 Ã LATIN CAPITAL LETTER A WITH TILDE
257 175 AF ¯ MACRON
302 194 C2 Â LATIN CAPITAL LETTER A WITH CIRCUMFLEX
277 191 BF ¿ INVERTED QUESTION MARK
275 189 BD ½ VULGAR FRACTION ONE HALF
The byte sequence is, then C3 AF C2 BF C2 BD
. This "smells" like UTF-8. Decoding (per https://en.wikipedia.org/wiki/UTF-8), we turn these into bit-patterns:
That first one (110xxxxx) indicates it's the first byte in a two-byte character, and stripping the marker bits from 11000011 10101111
yields ...00011 ..101111
or 00000000 00000000 00000000 11101111
== U+000000EF
.
Similarly, the next two make ...00010 ..111111
or U+000000BF
.
Then ...00010 ..111101
or U+000000BD
.
U+00EF
U+00BF
U+00BD
(per https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF) are "�", which is clearly not right.
However, this answer — https://stackoverflow.com/a/6544206/1105015 — seems to provide some insight. EF BF BD
is the UTF-8 representation of the "replacement character" U+FFFD
. So it looks like something way up the line got a character that confused your system, it was stored as the replacement character, and then eventually re-rendered as latin-1.
What i'd suggest looking closely at at this point is actually the encoding you use when inserting into the db. Maybe the only thing that changed is the MySQL client used for that?
Upvotes: 3