Reputation: 1569
I have created a word dictionary that uses PHP and XML. The user enters a query into an input box on a webpage form, and that value is compared against the words in the XML file using PHP. Any tags whose nodeValue matches the search term are returned in an HTML table.
The search works fine overall, with just one major problem. I have an option the user can check to search for exact matches. When this box is checked, the PHP script does a simple if ($searchterm == $xmlTagNodeValue) comparison. It returns correctly for every string, including those with non-alphabetic characters such as hyphens and underscores, with a single exception: strings that contain apostrophes.
In other words, "can't" entered into the input box is somehow not equal to "can't" in the XML file.
I'm at a total loss. I'm absolutely certain that it is the same character in both strings. I even tried hard-coding the value of the input box by copying and pasting the value from the XML file, with both files open in the same text editor. But the comparison always returns false.
The only thing I can imagine is that it is some kind of encoding issue, and that the characters may look identical but have different values. However, the XML file is saved as UTF-8 (with no BOM, in case that's relevant), and the web page is being viewed in UTF-8, so I'm not sure what else I can do.
Upvotes: 0
Views: 1498
Reputation: 11218
It is probably not an encoding problem. More likely, the two "apostrophes" are actually two different Unicode characters. Take a look at U+0027 (APOSTROPHE). The "See Also" section there lists six other Unicode characters that look similar, so the two strings may contain similar-looking but different characters. To confirm or refute this theory, try converting each character of both strings to its numeric code point and comparing the results.
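A minimal sketch of that check, assuming PHP 7.4 or later with the mbstring extension enabled. The helper name codePoints and both sample strings are hypothetical. In particular, the U+2019 in the second string is only one guess at a look-alike, not something confirmed from the question's XML file.

```php
<?php
// Hypothetical helper: list every character of a UTF-8 string as U+XXXX.
function codePoints(string $s): array
{
    $points = [];
    foreach (mb_str_split($s, 1, 'UTF-8') as $ch) {
        $points[] = sprintf('U+%04X', mb_ord($ch, 'UTF-8'));
    }
    return $points;
}

// Sample values (assumptions, standing in for the real form input and XML node).
$fromForm = "can't";          // typed with the ASCII apostrophe, U+0027
$fromXml  = "can\u{2019}t";   // assumed look-alike: right single quotation mark

echo implode(' ', codePoints($fromForm)), PHP_EOL;
// U+0063 U+0061 U+006E U+0027 U+0074
echo implode(' ', codePoints($fromXml)), PHP_EOL;
// U+0063 U+0061 U+006E U+2019 U+0074

var_dump($fromForm === $fromXml); // bool(false)
```

Running the same helper on the real $searchterm and $xmlTagNodeValue should show whether the fourth code point differs between the two strings. If the lists are identical, the cause lies elsewhere.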
Upvotes: 1