Reputation: 1136
Hi, I still can't work this out. I'm using preg_replace and have searched without finding a solution. I need to remove unknown (non-printable) characters from a string while preserving the new lines.
$summary = "ASDASDASDASDSASD
[BS][BS][BS] hello
this is a new line
[BS][BS][BS]
this is another new line";
// [BS] stands for an unknown character, shown this way in Notepad++ (you may have seen it there before).
// See screenshot, taken from Notepad++
// The output in the browser is a series of whitespaces.
// I can't paste the unknown symbol here.
echo preg_replace('/[\x00-\x1F\x80-\xFF]/','', $summary);
// Output: ASDASDASDASDSASD hello this is a new line this is another new line
//Expected Output:
//ASDASDASDASDSASD
// hello
// this is a new line
//this is another new line
I'd appreciate any help I can get.
Upvotes: 1
Views: 2650
Reputation: 272436
Going by http://www.asciitable.com/, I think the regex should be something like this:
/[\x00-\x08\x0B-\x0C\x0E-\x1F\x7F-\xFF]/
The character class (in effect a blacklist) leaves out the ASCII tab (0x09), new line (0x0A) and carriage return (0x0D) characters, which you probably want to keep.
PS: BS is how Notepad++ represents the backspace character (ASCII 0x08).
Upvotes: 4
Reputation: 52
Try this:
echo preg_replace('/[\x00-\x09\x0B-\x0C\x0E-\x1F\x80-\xFF]/','', $summary);
The reason is that \x0D and \x0A (CR and LF) both fall inside the range \x00-\x1F, so your original pattern removes the line breaks too. You need to exclude them, which means splitting that range into several smaller ranges.
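One detail worth checking with a small test: the first range in the pattern above ends at \x09, so tabs are removed along with the other control characters. The sketch below compares it with a variant whose first range ends at \x08. The sample string and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample: a tab, a backspace byte, and a CRLF line break.
$sample = "a\tb\x08c\r\nd";

// Pattern from this answer: \x00-\x09 includes tab (0x09), so tabs are removed.
$stripTabs = preg_replace('/[\x00-\x09\x0B-\x0C\x0E-\x1F\x80-\xFF]/', '', $sample);

// Variant whose first range stops at \x08, which leaves tabs in place.
$keepTabs = preg_replace('/[\x00-\x08\x0B-\x0C\x0E-\x1F\x80-\xFF]/', '', $sample);

// In both cases CR and LF survive and the backspace byte is gone.
var_dump($stripTabs === "abc\r\nd");
var_dump($keepTabs === "a\tbc\r\nd");
```

Which variant fits depends on whether tabs count as unwanted characters in your data.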
Upvotes: 1