Ron Davis

Reputation: 346

Replace character not supported in encoding with space

I am receiving text in the UTF-8 character set and want to convert it to ASCII in PHP, replacing any characters that ASCII does not support with a space. The code I currently use is:

  $input_encoding = mb_detect_encoding($toClean);
  mb_substitute_character("long");
  $encoded = mb_convert_encoding($toClean, "ASCII", "auto");

The output now contains sequences like "testU+2013ng", and I want this U+2013 to be replaced with a space. I tried the regular expression below:

$encoded = preg_replace("~U\+[\d\w]{4}~", " ", $encoded);

Now the output looks like "Road ' +CB9 +CA4 +CAEU+". How do I remove all the unsupported characters using preg or something similar?

Upvotes: 1

Views: 295

Answers (1)

Jason

Reputation: 10852

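The regex is close, but it assumes exactly four characters after U+. The leftover fragments in your output ("+CB9", "+CA4") suggest the code points are not always padded to four digits, so a fixed {4} can swallow the "U" of the next marker. Note also that \d alone would not match the hex letters A-F. Matching one or more uppercase hex digits is probably closer to what you want:

U\+[0-9A-F]+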
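
Rather than post-processing the U+XXXX markers with a regex, it may be simpler to avoid producing them in the first place, since mb_substitute_character() also accepts an integer code point to use as the substitute. The following is a minimal sketch, assuming the input is valid UTF-8 and the mbstring extension is loaded; the helper name toAsciiWithSpaces is hypothetical and not part of the question's code.

```php
<?php
// Hypothetical helper; assumes $text is valid UTF-8 and mbstring is available.
function toAsciiWithSpaces(string $text): string
{
    // Remember the current substitute setting so it can be restored afterwards.
    $previous = mb_substitute_character();

    // Substitute a space (code point 0x20) for every character ASCII cannot
    // represent, instead of the "long" U+XXXX notation used in the question.
    mb_substitute_character(0x20);
    $ascii = mb_convert_encoding($text, "ASCII", "UTF-8");

    // Put the previous setting back so other code is not affected.
    mb_substitute_character($previous);

    return $ascii;
}

echo toAsciiWithSpaces("test\u{2013}ng"), "\n"; // the en dash (U+2013) becomes a space
```

If mbstring is not an option, preg_replace('/[^\x00-\x7F]/u', ' ', $text) should give a similar result for valid UTF-8 input, replacing each non-ASCII code point with one space.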

Upvotes: 1
