user1257255

Reputation: 1171

PHP - preg_match() - matching substitution character black diamond with question mark

I have a problem with the substitution character (the black diamond with a question mark, �) in text I'm reading with SplFileObject. The character is already present in my text file, so converting the file to another encoding won't help. I decided to search for it with preg_match(), but PHP can't find any occurrence of it; PHP probably sees it as a different character than �. I don't want to simply remove the character from the text, which is why I want to search for it with preg_match(). Is there any way to match this character in PHP?

I tried the regex /.�./i, but without success.

Upvotes: 0

Views: 1146

Answers (2)

user1257255

Reputation: 1171

PHP with SplFileObject seems to read the file a little differently: instead of U+FFFD it detects U+0093 and U+0094. If you have the same problem I had, I suggest using hexdump to find out how the unrecognized character is encoded in the file. After that, use the snippet recommended by @stribizhev in the comments to get the hex code as PHP sees it. Once you have worked out the correct hex code of the unrecognized character (the conversion tool suggested by @stribizhev in the comments gives the correct value), you can use it in the preg_...() functions. Here's the solution to my problem:

$text = preg_replace('/[\x93\x94]/', "'", $text);
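
The steps described above (inspect the bytes, then match them) can be sketched in PHP alone. This is only an illustration: the sample string is hypothetical and stands in for a line read from the file, and bin2hex() is used here in place of the external hexdump tool.

```php
<?php
// Hypothetical sample: a string containing the raw bytes 0x93 and 0x94,
// standing in for a line read from the file.
$line = "He said \x93hello\x94";

// bin2hex() shows exactly which bytes PHP received.
echo bin2hex($line), PHP_EOL;

// Without the /u modifier the pattern is matched byte by byte,
// so \x93 and \x94 each refer to a single byte.
$found = preg_match_all('/[\x93\x94]/', $line);
echo "Found $found matching byte(s)", PHP_EOL;

// preg_replace() returns the new string; the input is not modified.
$fixed = preg_replace('/[\x93\x94]/', "'", $line);
echo $fixed, PHP_EOL;
```

Comparing the bin2hex() output with what hexdump reports for the file should show whether PHP sees the same bytes that are on disk.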

Upvotes: 0

Dhinju Divakaran

Reputation: 903

Try this code. The code point of the � character is U+FFFD (hexadecimal FFFD).

$line = "�";
if (preg_match("/\x{FFFD}/u", $line, $match))
  print "Match found!";
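
A caveat that may explain the difference between the two answers: the /u modifier needs the subject to be valid UTF-8. The sketch below uses hypothetical sample strings. It writes U+FFFD as its three UTF-8 bytes so that it does not depend on the encoding of the source file, and it shows what happens when the text contains a lone 0x93 byte instead.

```php
<?php
// U+FFFD spelled out as its UTF-8 byte sequence EF BF BD.
$line = "bad \xEF\xBF\xBD char";

// With /u, \x{FFFD} matches the code point U+FFFD.
$withU = preg_match('/\x{FFFD}/u', $line);

// Without /u, the same character can be matched as three raw bytes.
$withoutU = preg_match('/\xEF\xBF\xBD/', $line);

// A lone 0x93 byte is not valid UTF-8, so a /u pattern cannot be applied:
// preg_match() returns false and preg_last_error() reports the reason.
$invalid = "bad \x93 char";
$result = preg_match('/\x{FFFD}/u', $invalid);
$badUtf8 = (preg_last_error() === PREG_BAD_UTF8_ERROR);

var_dump($withU, $withoutU, $result, $badUtf8);
```

If preg_match() returns false rather than 0, the text is probably not UTF-8, and a byte-level pattern without /u is the safer choice.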

Upvotes: 2
