Reputation: 614
I am trying to sanitise database input and found a problem with the Ⓡ character.
Ⓡ is converted to its numeric HTML entity, &#9415;, even with html_entity_decode() around the variable.
This is a problem because the field is only meant to allow 4 characters in the database.
® actually works, though, and is treated as a single character.
I have the same problem with Ⓒ vs ©.
As far as I know they are just HTML entities, so they should be decoded. However, they aren't even encoded by htmlspecialchars(); it just echoes out the code &#9415;.
Does PHP have any built-in functions to solve this? Thanks
Edit just to say what I am trying to do:
I have text fields to input and add to a database which displays in a table below. When I enter any other character like < > &, it enters straight into the database as one character.
I am trying to make Ⓡ and Ⓒ always go in as one character as well (instead of 6).
I am only encoding on output in the table so certain characters don't break the website.
Upvotes: 4
Views: 2513
Reputation: 522567
The entity most likely fails to decode because the target character set given to html_entity_decode is still the default, ISO-8859-1 (the default before PHP 5.4). ISO-8859-1 cannot encode "Ⓡ" (CIRCLED LATIN CAPITAL LETTER R), but it can encode "®" (REGISTERED SIGN).
So, first, to decode it correctly:
html_entity_decode('&#9415;', ENT_COMPAT, 'UTF-8')
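To check the result against the 4-character limit from the question, something like the following rough sketch may help. It assumes the input arrives as the numeric entity &#9415; (the decimal code point of Ⓡ), that the mbstring extension is available, and that the database column is itself UTF-8. The helper name decodedLength is hypothetical, not a PHP built-in.

```php
<?php
// Hypothetical helper: decode entities to UTF-8, then count characters (not bytes).
function decodedLength(string $input): int
{
    // Passing 'UTF-8' explicitly lets entities outside ISO-8859-1 be decoded.
    $decoded = html_entity_decode($input, ENT_COMPAT, 'UTF-8');

    // mb_strlen counts code points; strlen would count bytes instead.
    return mb_strlen($decoded, 'UTF-8');
}

echo decodedLength('&#9415;'), "\n"; // 1: the entity becomes a single character
echo decodedLength('&reg;'), "\n";   // 1
echo strlen('&#9415;'), "\n";        // 7: the undecoded entity is seven bytes long
```

Counting with mb_strlen rather than strlen matters here because the decoded Ⓡ occupies three bytes in UTF-8 while still being one character.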
But secondly, "Ⓡ" and "®" are not the same character, and you probably don't want "Ⓡ".
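The difference between the two characters can be seen by printing their code points. This is a small sketch that assumes PHP 7.2 or later (for mb_ord) with the mbstring extension; the variable names are illustrative only.

```php
<?php
// Decode both entities to UTF-8 strings.
$circled    = html_entity_decode('&#9415;', ENT_COMPAT, 'UTF-8'); // Ⓡ
$registered = html_entity_decode('&reg;', ENT_COMPAT, 'UTF-8');   // ®

// mb_ord returns the Unicode code point of the first character.
printf("U+%04X\n", mb_ord($circled, 'UTF-8'));    // U+24C7
printf("U+%04X\n", mb_ord($registered, 'UTF-8')); // U+00AE

// The two strings are different characters, so they never compare equal.
var_dump($circled === $registered); // bool(false)
```

If the registered trademark symbol is what the field should hold, then storing "®" (U+00AE) directly avoids the problem, since that character exists in ISO-8859-1 as well.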
Upvotes: 2