Kyle Cureau

Reputation: 19366

Regex grabbing extra character

I'm using PHP preg_replace with the following regex:

/(?<=#EXTINF:([0-9])+,).+?(?=#EXT)/gsm

operating on the following string:

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10,
Grab_this_string
#EXTINF:5,
Grab_this_string_too
#EXT-X-ENDLIST

This replaces:

, Grab_this_string 
Grab_this_string_too

I'm trying to match it without the leading comma: essentially, everything between each #EXTINF:xx, line and the next #EXT tag:

Grab_this_string 
Grab_this_string_too

Upvotes: 4

Views: 251

Answers (1)

Wiseguy

Reputation: 20883

Since you're in multiline mode, you could match on line endings to delineate each line.

This matches two lines and replaces them with the first line only (effectively removing the second line). Notice I've removed "dotall" mode (s).

$regex = '/(^#EXTINF:\d+,$)(\s+)^.+$(?=\s+^#EXT)/m';

echo preg_replace($regex, '$1', $str);

Output:

#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10,
#EXTINF:5,
#EXT-X-ENDLIST
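
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the playlist is held in a variable named $str and uses "\n" line endings; both are assumptions for illustration, since the question does not say how the string is built.

```php
<?php
// Sample playlist from the question. "\n" line endings are an assumption.
$str = implode("\n", [
    '#EXTM3U',
    '#EXT-X-TARGETDURATION:10',
    '#EXTINF:10,',
    'Grab_this_string',
    '#EXTINF:5,',
    'Grab_this_string_too',
    '#EXT-X-ENDLIST',
]);

// Group 1 captures the #EXTINF line. The whitespace and the line after it
// are also matched, so replacing the match with '$1' drops that line.
$regex = '/(^#EXTINF:\d+,$)(\s+)^.+$(?=\s+^#EXT)/m';

$result = preg_replace($regex, '$1', $str);
echo $result, "\n";
```

Because the lookahead does not consume the newline before the next #EXT line, the remaining lines stay separated as before.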

Update:

A lookbehind will not work here, because #EXTINF:\d+, can match text of varying length, and most regex engines (including PCRE, which PHP uses) do not support variable-length lookbehinds.
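
A quick way to check this on your own PHP build is to try compiling such a pattern. The pattern below is a hypothetical probe, not the one from the question, and the exact warning text depends on the PCRE version; the sketch only relies on preg_match() returning false when a pattern fails to compile.

```php
<?php
// Hypothetical probe: a lookbehind that contains the unbounded \d+.
$pattern = '/(?<=#EXTINF:\d+,)\s+.+/';

// '@' suppresses the compilation warning so only the return value is shown.
// preg_match() returns 1 or 0 for a valid pattern and false on failure.
$ok = @preg_match($pattern, "#EXTINF:10,\nGrab_this_string");

var_dump($ok === false);
```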

If you want to match only the line to be removed, rather than matching two lines and restoring the first with a subpattern as I did above, you can use the \K escape sequence to simulate a lookbehind that is not subject to the fixed-length restriction. \K resets the start of the match, so anything matched before \K is excluded from the final match. (See the last paragraph here.)

$regex = '/^#EXTINF:\d+,\s+\K^.+?(?=#EXT)/sm';

echo preg_replace($regex, '', $str);
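
As a rough sketch of how the \K pattern behaves, the snippet below both removes the matched lines and collects them with preg_match_all. The variable names ($str, $stripped, $titles) and the "\n" line endings are assumptions for illustration. Since dotall mode lets .+? run up to the next #EXT, each match includes its trailing newline, which trim() removes here.

```php
<?php
// Sample playlist from the question. "\n" line endings are an assumption.
$str = implode("\n", [
    '#EXTM3U',
    '#EXT-X-TARGETDURATION:10',
    '#EXTINF:10,',
    'Grab_this_string',
    '#EXTINF:5,',
    'Grab_this_string_too',
    '#EXT-X-ENDLIST',
]);

// Everything before \K must match but is left out of the reported match.
$regex = '/^#EXTINF:\d+,\s+\K^.+?(?=#EXT)/sm';

// Removing the matches leaves only the #EXT lines.
$stripped = preg_replace($regex, '', $str);

// The same pattern can also collect the matched text. Each match ends with
// the newline before the next #EXT line, so trim() is applied.
preg_match_all($regex, $str, $m);
$titles = array_map('trim', $m[0]);

echo $stripped, "\n";
print_r($titles);
```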

Upvotes: 2
