Steven

Reputation: 18024

PHP Regex match first newline after x characters for a trimming function

I'm writing a trimming function that takes a string and finds the first newline \n character after the 500th character and returns a string up to the newline. Basically, if there are \n at indices of 200, 400, and 600, I want the function to return the first 600 characters of the string (not including the \n).

I tried:

$output = preg_replace('/([^%]{500}[^\n]+?)[^%]*/','$1',$output);

I used the percent sign because I couldn't find a character class that simply matches "everything". The dot didn't work because it excludes newlines. Unfortunately, my function fails miserably. Any help or guidance would be appreciated.

Upvotes: 0

Views: 1055

Answers (3)

Zenon

Reputation: 1456

Use

'/(.{500,}?)(?=\n)/s'

as the pattern.

The /s at the end makes the dot match newlines, and {500,} means "match 500 or more characters", with the question mark making it match as few as possible. The (?=\n) is a positive lookahead: the matched string has to be followed by a \n, but the lookahead doesn't consume anything. So it checks that the 500+ character string is followed by a newline without including the newline in the match (or the replace, for that matter).

Though the lookahead thingy is a little fancy in this case, I guess

'/(.{500,}?)\n/s'

would do just as well. I just like lookaheads :)
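As a rough sketch of how the lookahead pattern might be applied: this assumes preg_match rather than preg_replace, since replacing the match with $1 would put the same text back and leave the string untrimmed. The function name, the $min parameter, the added ^ anchor and the fallback for strings without a late newline are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns $text up to (not including) the first "\n"
// found at or after offset $min. Name, parameter and fallback are assumptions.
function trimAtNewlineRegex(string $text, int $min = 500): string
{
    // Anchored with ^ so the match always starts at the beginning of the string.
    // The lazy quantifier stops at the first newline at or after offset $min.
    $pattern = '/^(.{' . $min . ',}?)(?=\n)/s';

    if (preg_match($pattern, $text, $m) === 1) {
        return $m[1];
    }

    // No newline at or after $min: return the string unchanged (an assumption).
    return $text;
}

// Small limit so the behaviour is easy to follow by eye.
$trimmed = trimAtNewlineRegex("ab\ncd\nefgh\nij", 4);
```

Here the newline at index 2 is skipped because it comes before offset 4, and the cut happens at the newline at index 5.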

Upvotes: 1

DisgruntledGoat

Reputation: 72560

Personally I would avoid regex and use simple string functions:

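// $str is the original string (assumed to be at least 500 characters long)
$nl = strpos( $str, "\n", 500 ); // first \n at or after offset 500, or false
if ( $nl === false ) $nl = strlen( $str ); // no newline found: keep the whole string
$sub = substr( $str, 0, $nl );
$final = str_replace( "\n", ' ', $sub ); // turn the remaining newlines into spaces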

You might need to check for \r\n as well - i.e. normalize first using str_replace( "\r\n", "\n", $str ).
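One way the string-function approach could be wrapped up, with the \r\n normalization folded in, is sketched below. The function name, the $min parameter and the choice to return the whole string when no suitable newline exists are assumptions, not part of the answer above. The sketch also leaves earlier newlines in place rather than replacing them with spaces.

```php
<?php
// Hypothetical wrapper around the strpos/substr approach; the name, the
// $min parameter and the fallback behaviour are assumptions for illustration.
function trimAtNewline(string $str, int $min = 500): string
{
    // Normalize Windows line endings so only "\n" has to be searched for.
    $str = str_replace("\r\n", "\n", $str);

    // strpos() rejects an offset beyond the end of the string, so check first.
    if (strlen($str) <= $min) {
        return $str;
    }

    // strpos() returns false when there is no newline at or after $min.
    $nl = strpos($str, "\n", $min);

    return $nl === false ? $str : substr($str, 0, $nl);
}
```

The explicit `=== false` comparison matters because substr( $str, 0, false ) would silently return an empty string.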

Upvotes: 3

Greg

Reputation: 321796

You can add the s (DOTALL) modifier to make . match newlines, then make the second part greedy so that it runs up to the next newline. I've also anchored it to the start and made it match everything if the string is 500 characters or shorter:

// [^\n]* rather than [^\n]+ so that a newline sitting exactly at offset 500 still matches
preg_match('/^.{500}[^\n]*|^.{0,500}$/s', $output, $matches);
$output = $matches[0];
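To see how the two branches of the alternation behave, here is a small sketch with the count reduced from 500 to 4. The sample inputs are made up, and the tail is written as [^\n]* (an assumption made here) so that a newline sitting exactly at the cut-off still produces a match.

```php
<?php
// Same shape as the pattern above, with 500 reduced to 4 for readability.
$pattern = '/^.{4}[^\n]*|^.{0,4}$/s';

$results = [];
foreach (["ab\ncd\nefgh", "abc", "abcd\nxyz"] as $input) {
    // preg_match() returns 1 on success; keep the input unchanged otherwise.
    $results[] = preg_match($pattern, $input, $m) === 1 ? $m[0] : $input;
}

// First input: the first branch skips the early newline and stops at the later one.
// Second input: too short for the first branch, so the second branch takes it whole.
// Third input: the newline is exactly at offset 4, so only "abcd" is kept.
```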

Upvotes: 1
