Chazy Chaz

Reputation: 1851

trim lines and shrink whitespaces using regex for multi line string

I'm writing a PHP function to trim all unnecessary whitespace from a multi-line string.

The regex that isn't working is the one that removes spaces at the end of each line:

// Always trim at the end. Warning: this seems to be the costlier
// operation, perhaps because looking ahead is harder?
$patterns[] = ['/ +$/m', ''];

Given the following string from a textarea:

 first  line... abc   //<-- blank space here
 second  is  here... def   //<-- blank space here
 //<-- blank space here
 fourth  line... hi  there   //<-- blank space here

 sith  is  here....   //<-- blank space here

There are blank spaces at the beginning and end of each line plus more than one between the words.

After I run the function:

$functions->trimWhitespace($description, ['blankLines' => false]);

This is what I get:

first line... abc //<-- blank space here
second is here... def //<-- blank space here
//<-- no blank space here
fourth line... hi there //<-- blank space here

sith is here....//<-- no blank space here

Why is it only removing the trailing space from the last line?

Upvotes: 1

Views: 1448

Answers (5)

Wiktor Stribiżew

Reputation: 626932

You may redefine where $ matches using the (*ANYCRLF) verb.

See the following PHP demo:

$s = " ddd    \r\n  bbb     ";
$n = preg_replace('~(*ANYCRLF)\h+$~m', '', $s); // if the string can contain Unicode chars,
echo $n;                                        // also add "u" modifier ('~(*ANYCRLF)\h+$~um')

Details:

  • (*ANYCRLF) - sets the newline convention so that any of CR, LF or CRLF counts as a line break (the same as accepting (*CR), (*LF) or (*CRLF))
  • \h+ - 1+ horizontal whitespace chars
  • $ - end of line (now, before CR or LF)
  • ~m - multiline mode on ($ matches at the end of a line).

If you want to allow $ to match at any Unicode line breaks, replace (*ANYCRLF) with (*ANY).
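To make the cause concrete, here is a minimal sketch of the failure and the fix. It assumes the textarea content arrives with "\r\n" line endings and that PCRE's default newline convention in the PHP build is LF (the usual case, but still an assumption); the sample string and variable names are made up for illustration.

```php
<?php
// Hypothetical sample: two lines separated by CRLF, each with trailing spaces.
$s = "first line   \r\nsecond line   ";

// Default convention (assumed LF): in multiline mode $ matches only before "\n"
// or at the very end, so the spaces sitting in front of "\r" are never matched.
$naive = preg_replace('/ +$/m', '', $s);

// With (*ANYCRLF), $ also matches before "\r\n", so both lines get trimmed.
$fixed = preg_replace('~(*ANYCRLF)\h+$~m', '', $s);

// json_encode makes the \r and \n characters visible in the output.
echo json_encode($naive), PHP_EOL;
echo json_encode($fixed), PHP_EOL;
```

Under those assumptions only the last line is trimmed by the first call, which matches the behaviour described in the question.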

See Newline conventions in the PCRE reference:

(*CR)        carriage return
(*LF)        linefeed
(*CRLF)      carriage return, followed by linefeed
(*ANYCRLF)   any of the three above
(*ANY)       all Unicode newline sequences

Now, if you need to

  • Trim the lines from both start and end
  • Shrink whitespaces inside the lines into just a single space

use

$s = " Ł    ę  d    \r\n  Я      ёb     ";
$n = preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $s);
echo $n;

See the PHP demo.
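As a rough sketch, the combined pattern could be wrapped in a small helper and run against input shaped like the one in the question. The function name tidyLines and the sample string are hypothetical, and the "\r\n" line endings are an assumption about how the textarea was submitted.

```php
<?php
// Hypothetical helper wrapping the combined pattern shown above.
function tidyLines(string $text): string
{
    // $1 is empty for the two trimming branches and holds a single
    // horizontal whitespace character for the (\h){2,} branch.
    $result = preg_replace('~(*ANYCRLF)^\h+|\h+$|(\h){2,}~um', '$1', $text);

    // preg_replace returns null on error (e.g. invalid UTF-8 with the u
    // modifier); fall back to the untouched input in that case.
    return $result ?? $text;
}

$input = " first  line... abc   \r\n second  is  here... def   \r\n \r\n fourth  line... hi  there   ";
$clean = tidyLines($input);

// json_encode makes the line endings visible in the output.
echo json_encode($clean), PHP_EOL;
```

The line that contains only a space becomes an empty line rather than disappearing, because the pattern never touches the line breaks themselves.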

Upvotes: 3

Nabeel Khan

Reputation: 3993

You need to use /gm instead of just /m.

The code should become (note: this version won't work in PHP; the updated one below will):

$patterns[] = ['/ +$/mg', ''];

Working example here: https://regex101.com/r/z3pDre/1

Update:

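The g modifier doesn't work like this in PHP: there is no g flag, and using it triggers an "Unknown modifier 'g'" warning. To find every match you replace preg_match with preg_match_all, while preg_replace already replaces all matches by default.

Use the regex without g, like this: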

$patterns[] = ['/ +$/m', ''];
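A short sketch of how "global" behaviour works in PHP without a g flag. The sample string is made up and uses plain "\n" line endings so that the pattern from the question matches on every line.

```php
<?php
// Hypothetical sample with LF line endings and trailing spaces on each line.
$s = "a  \nb  \nc  ";

// preg_replace replaces every match by default ($limit defaults to -1).
$all = preg_replace('/ +$/m', '', $s);

// Passing a limit of 1 gives "first match only" behaviour instead.
$first = preg_replace('/ +$/m', '', $s, 1);

// For matching, preg_match stops at the first hit,
// while preg_match_all returns the number of all matches.
$one  = preg_match('/ +$/m', $s);
$many = preg_match_all('/ +$/m', $s);

echo json_encode([$all, $first, $one, $many]), PHP_EOL;
```

So the missing g flag is not what keeps the earlier lines from being trimmed; with "\n" line endings the original pattern already cleans every line.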

Upvotes: 1

Oscar Zarrus

Reputation: 790

 // Trim leading and trailing spaces on every line, keeping the text in between
 preg_replace('/^ *(.*?) *$/m', '$1', $content)

Live Demo

Upvotes: 0

Jan

Reputation: 43169

Use a two-step approach:

<?php

$text = " first  line... abc   
 second  is  here... def   
  <-- blank space here
 fourth  line... hi  there   

 sith  is  here....   ";

// get rid of spaces at the beginning and end of line
$regex = '~^\ +|\ +$~m';
$text = preg_replace($regex, '', $text);

// collapse two or more consecutive spaces into a single space
$regex = '~\ {2,}~';
$text = preg_replace($regex, ' ', $text);
echo $text;

?>

See a demo on ideone.com.
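If the input comes from a textarea it may contain "\r\n" line endings, in which case the trailing-space step would only affect the last line. One possible variation, sketched below, normalises the line endings first; that extra step is an addition and not part of the approach above, and the sample string is hypothetical.

```php
<?php
// Hypothetical input with Windows line endings, as a browser form might submit.
$text = " first  line   \r\n second  line   ";

// Extra step (an assumption about the input): normalise CRLF and lone CR to LF.
$text = str_replace(["\r\n", "\r"], "\n", $text);

// Step 1: trim spaces at the start and end of every line.
$text = preg_replace('~^ +| +$~m', '', $text);

// Step 2: collapse runs of two or more spaces into a single space.
$text = preg_replace('~ {2,}~', ' ', $text);

// json_encode makes the remaining line break visible in the output.
echo json_encode($text), PHP_EOL;
```

Note that this changes the line endings of the result to "\n", which may or may not be acceptable depending on where the text is stored or displayed.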

Upvotes: 1

TheRealMrCrowley

Reputation: 976

preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] )

so you want preg_replace('/[\s]+$/m', '', $string)
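One thing to keep in mind with \s is that it also matches line breaks, so the pattern can swallow blank lines along with the trailing spaces. The sketch below compares it with \h (horizontal whitespace only) on a made-up string; which behaviour is preferable depends on whether blank lines should survive.

```php
<?php
// Hypothetical sample: trailing spaces, then a blank line, then more text.
$s = "a  \n\nb  ";

// \s also matches "\n", so the match on the first line extends over the
// line break and the blank line collapses.
$withS = preg_replace('/\s+$/m', '', $s);

// \h matches only horizontal whitespace, so the line breaks are preserved.
$withH = preg_replace('/\h+$/m', '', $s);

echo json_encode([$withS, $withH]), PHP_EOL;
```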

Upvotes: 0
