peppy

Reputation: 197

Replace multiple blank lines between paragraphs with a single blank line AND remove excess white space

I have a website where users are required to write a simple, short profile description. Some users are writing ugly profiles with a bunch of empty spaces, and excess new lines. It's a lot of work to edit these manually afterwards. Here is an example of this:

$string = "
first sentence has  w    aaa    yy too many   excess spaces



and too many lines
between



paragraphs.




";

This is what I want: (maximum of one blank line between paragraphs, only one space between words)

$string = "first sentence has w aaa yy too many excess spaces

and too many lines
between

paragraphs.";

I need a combination of functions in PHP that would correct this. I searched and found one that replaces more than one new line with just one, but I want a maximum of one blank line between paragraphs, no more than that. I also found one that removes excess white space, but it removes the new lines as well, so it won't work:

$string = preg_replace('/(\r\n|\r|\n)+/', "\n", trim($string));
$string = preg_replace('/\s+/', ' ', $string);

I know there are different types of new lines /\r\n|\r|\n/ depending on device. I wouldn't have a clue how to handle all of it at once. I need help setting up a combination of functions that would work for my case.

Upvotes: 1

Views: 41

Answers (2)

Tim Biegeleisen

Reputation: 522506

You could try:

$string = preg_replace('/[^\S\n\r]{2,}/', ' ', $string);
$string = preg_replace('/(\r?\n)(?:\r?\n)+/', '$1$1', $string);

The first replacement collapses every run of two or more whitespace characters other than CR and LF into a single space. The second collapses every run of two or more line breaks (LF or CRLF) into exactly two, which leaves at most one blank line between paragraphs.
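
To see the two replacements working together on the sample from the question, here is a minimal sketch. The function name collapse_whitespace is made up for illustration, and the final trim() is an assumption added on top of the two lines above, since neither pattern removes the line breaks at the very start and end of the string.

```php
<?php
// Hypothetical helper name; the two patterns are the ones from this answer.
function collapse_whitespace(string $string): string
{
    // Two or more whitespace characters other than CR/LF become one space.
    $string = preg_replace('/[^\S\n\r]{2,}/', ' ', $string);
    // A run of two or more line breaks becomes exactly two (one blank line).
    $string = preg_replace('/(\r?\n)(?:\r?\n)+/', '$1$1', $string);
    // Assumption: trim() is added to drop leading and trailing line breaks.
    return trim($string);
}

$input = "\nfirst sentence has  w    aaa    yy too many   excess spaces\n\n\n\n"
       . "and too many lines\nbetween\n\n\n\nparagraphs.\n\n\n\n\n";

echo collapse_whitespace($input);
```

Single line breaks, such as the one between "and too many lines" and "between", are left alone because the second pattern needs at least two consecutive breaks to match.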

Upvotes: 1

Pablo

Reputation: 6058

You were close to a working solution. Here are the few adjustments you need.

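In the first statement, you are replacing every run of line breaks with a single line break, but based on the expected output you actually want to collapse runs of two or more line breaks into exactly two (one blank line) and leave single line breaks alone. So the quantifier should be {2,} and the replacement should be \n\n. The atomic group (?>...) keeps a lone \r\n from being counted as two separate breaks:

$string = preg_replace('/(?>\r\n|\r|\n){2,}/', "\n\n", trim($string));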

In the second statement, you are using the regex shorthand \s, which also happens to match line breaks. So that statement alone replaces all line breaks as well as spaces. Instead, you can try the pattern / +/, which matches only space characters:

$string = preg_replace('/ +/', ' ', $string);

Just note that there are more complex ways to solve this problem that handle more edge cases with special characters and so on. This should be a good start and is easy to understand.
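
As one possible take on the "more edge cases" idea, here is a sketch that first normalizes every line-break style to \n, then also handles tabs and lines that contain only whitespace. The function name normalize_profile_text and the choice to convert all line breaks to \n are assumptions for illustration, not something the question requires.

```php
<?php
// Hypothetical helper; a sketch, not a definitive implementation.
function normalize_profile_text(string $text): string
{
    // 1. Convert CRLF and bare CR to LF so later steps see one style only.
    $text = preg_replace('/\r\n?/', "\n", $text);
    // 2. Collapse each run of spaces and tabs into a single space.
    $text = preg_replace('/[ \t]+/', ' ', $text);
    // 3. Remove the single space left at the start or end of any line,
    //    which also empties lines that held only whitespace.
    $text = preg_replace('/^ | $/m', '', $text);
    // 4. Three or more line feeds (two or more blank lines) become two.
    $text = preg_replace('/\n{3,}/', "\n\n", $text);
    // 5. Drop leading and trailing line breaks.
    return trim($text);
}

$input = "\nfirst sentence has  w    aaa    yy too many   excess spaces\n\n\n\n"
       . "and too many lines\nbetween\n\n\n\nparagraphs.\n\n\n\n\n";

echo normalize_profile_text($input);
```

Normalizing the line endings up front means the later patterns only ever have to reason about \n, which sidesteps the question of matching \r\n, \r, and \n all at once.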

Upvotes: 1
