Reputation: 1853
I need to capture a certain pattern multiple times while also remembering what's before, after and in between. For example:
some text "to be captured" some more text "to be captured" some more text
The only predictable things are the tokens that delimit the text to be captured. The captured text itself is different every time. In the end I need to place CSS spans around those captured parts, like so:
some text <span class="a">"to be captured"</span> some more text <span class="a">"to be captured"</span> some more text
I tried
if (preg_match("/(.*?)(\".*?\")(.*)/", $line, $m))
    $res .= $m[1] . '<span class="a">' . $m[2] . '</span>' . $m[3];
It works for a line with only one capture. Using preg_match_all doesn't fix this; I will probably have to change the regex itself as well, but I don't know how.
Upvotes: 2
Views: 181
Reputation: 75232
The main reason your code doesn't work is that the third group, (.*), gobbles up everything after the first quoted section, including all the remaining quotes. If the . matched newlines, it would eat all the quotes in the rest of the document, not just the rest of the line.
@Cheery's solution addresses that problem by making the third group non-greedy: (.*?). That will work, but only because the third group never captures anything. Instead of consuming everything it can, it starts out by consuming nothing. That's acceptable, and there's nothing after that in the regex to force it to consume more, so it stops there.
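To make the difference concrete, here is a small sketch that runs both variants of the three-group regex against the sample line from the question and looks at what the third group ends up holding. The variable names $greedy and $lazy are only for illustration.

```php
<?php
// Sample line taken from the question.
$line = 'some text "to be captured" some more text "to be captured" some more text';

// Greedy third group: (.*) swallows the rest of the line,
// including the second quoted section.
preg_match('/(.*?)(".*?")(.*)/', $line, $greedy);

// Lazy third group: (.*?) is satisfied with an empty match
// and nothing after it forces it to grow.
preg_match('/(.*?)(".*?")(.*?)/', $line, $lazy);

var_dump($greedy[3]); // string(47) " some more text "to be captured" some more text"
var_dump($lazy[3]);   // string(0) ""
```

Either way, a single preg_match only ever finds the first quoted section, which is why matching just the quoted part and letting the replace function iterate is simpler.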
The correct way to solve this problem is by matching only the part you want to highlight. Use the capturing group to put it back in with the tags surrounding it, and leave the rest of the text alone:
$line = preg_replace('/("[^"]*")/', '<span class="a">$1</span>', $line);
In fact, you don't even need to use a capturing group. Since the match now consists only of the quoted section, you can use $0 to reinsert it:
$line = preg_replace('/"[^"]*"/', '<span class="a">$0</span>', $line);
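As a quick check, this sketch runs both forms on the sample line from the question and also uses preg_replace's optional fifth argument to count the replacements. The names $withGroup, $highlighted and $count are just for the example.

```php
<?php
$line = 'some text "to be captured" some more text "to be captured" some more text';

// Variant with a capturing group, reinserted as $1.
$withGroup = preg_replace('/("[^"]*")/', '<span class="a">$1</span>', $line);

// Variant without a group: $0 is the whole match.
// -1 means "no limit"; $count receives the number of replacements made.
$highlighted = preg_replace('/"[^"]*"/', '<span class="a">$0</span>', $line, -1, $count);

echo $highlighted, "\n";
// some text <span class="a">"to be captured"</span> some more text <span class="a">"to be captured"</span> some more text
echo $count, "\n"; // 2
```

Both variants produce the same string; the text before, between and after the matches is simply left where it is.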
EDIT: @Cheery edited his answer and my comments about it no longer apply. I think this answer still adds some value though, so I'll go ahead and leave it up.
Upvotes: 1
Reputation: 4843
I don't know PHP, but looking solely at the regex, you need to search for this: ([^"]*)(".*?")
and replace it with this: $1<span class="a">$2</span>
Applied to this input:
some text "to be captured" some more text "to be captured" some more text
some text "to be captured" some more text "to be captured"
it gives this:
some text <span class="a">"to be captured"</span> some more text <span class="a">"to be captured"</span> some more text
some text <span class="a">"to be captured"</span> some more text <span class="a">"to be captured"</span>
EDIT: This PHP code seems to work:
$line = 'some text "to be captured" some more text "to be captured" some more text';
// The replacement is passed as-is; wrapping it in htmlspecialchars() would escape the <span> tags.
$line2 = preg_replace('/([^"]*)(".*?")/', '$1<span class="a">$2</span>', $line);
echo $line2;
Upvotes: 1
Reputation: 16214
Did you try preg_replace?
$line = preg_replace("/(\".*?\")/",
'<span class="a">$1</span>',
$line
);
P.S. I'm still not sure what the OP's problem is without examples. If you have a set of delimiters, then the regexp could be
$str = 'some text "to be captured" some more text #to be *captured#
some more text* but I would capture that*';
echo preg_replace('/(("|#|\*).*?\\2)/s',
'<span class="a">$1</span>',
$str);
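The \2 backreference is what ties the closing delimiter to the opening one. Here is a shorter sketch of the same pattern on a made-up input string (the string and the $out name are assumptions for illustration), showing that a different delimiter inside a section does not end it:

```php
<?php
// Group 2 captures the opening delimiter; \2 requires the same
// character to close the section. Group 1 is the whole section.
$pattern = '/(("|#|\*).*?\\2)/s';

// Hypothetical input: the * inside the #...# section is not a closer.
$str = 'one "two" three #four *five# six';

$out = preg_replace($pattern, '<span class="a">$1</span>', $str);

echo $out, "\n";
// one <span class="a">"two"</span> three <span class="a">#four *five#</span> six
```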
Upvotes: 3
Reputation: 145482
When you basically want to capture everything, but have your specific part separated, then you might be able to use preg_split:
// The third argument is the limit (-1 = no limit); the flags go in the fourth.
$matches_and_in_between = preg_split('/"(.*?)"/', $src, -1,
PREG_SPLIT_DELIM_CAPTURE);
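The trick is the PREG_SPLIT_DELIM_CAPTURE flag, which keeps the captured group in the result. You will have to loop over the result array: every second entry is what the capturing group matched (without the quotes, since they sit outside the group), and the rest are the in-between parts.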
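The loop could look roughly like this. It is a minimal sketch on the sample line from the question; $parts and $res are names picked for the example. Note that preg_split takes the limit as its third argument, so -1 is passed there and the flag goes fourth.

```php
<?php
$src = 'some text "to be captured" some more text "to be captured" some more text';

// Split on the quoted sections and keep the captured text in the result.
$parts = preg_split('/"(.*?)"/', $src, -1, PREG_SPLIT_DELIM_CAPTURE);

$res = '';
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) {
        // Odd index: text captured between the quotes. The quotes were
        // part of the delimiter but outside the group, so add them back.
        $res .= '<span class="a">"' . $part . '"</span>';
    } else {
        // Even index: the text before, between or after the matches.
        $res .= $part;
    }
}

echo $res, "\n";
// some text <span class="a">"to be captured"</span> some more text <span class="a">"to be captured"</span> some more text
```

This is more code than a single preg_replace, but it is handy when the in-between parts need their own processing.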
Upvotes: 0