Andrew

Reputation: 38639

How can I replace only the captured elements of a regex?

I'm trying to extract only certain elements of a string using regular expressions and I want to end up with only the captured groups.

For example, I'd like to run something like (is|a) on a string like "This is a test" and be able to return only "is is a". The only way I can partially do it now is if I find the entire beginning and end of the string but don't capture it:

.*?(is|a).*? replaced with $1

However, when I do this, only the characters preceding the final found/captured group are eliminated--everything after the last found group remains.

is is a test.

How can I isolate and replace only the captured strings (so that I end up with "is is a"), in both PHP and Perl?

Thanks!

Edit: I see now that it's better to use m// rather than s///, but how can I apply that to PHP's preg_match? In my real regex I have several captured groups, resulting in $1, $2, $3 etc -- preg_match only deals with one captured group, right?

Upvotes: 1

Views: 892

Answers (3)

Sinan Ünür

Reputation: 118138

If all you want are the matches, then there is no need for the s/// operator. You should use m// instead. You might want to expand on your explanation a little if the example below does not meet your needs:

#!/usr/bin/perl

use strict;
use warnings;

my $text = 'This is a test';

my @matches = ( $text =~ /(is|a)/g );

print "@matches\n";
__END__

C:\Temp> t.pl
is is a

EDIT: For PHP, you should use preg_match_all and specify an array to hold the match results as shown in the documentation.
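
For the several-groups case mentioned in the question's edit, here is a minimal Perl sketch. The sample string and the two-group pattern are made up for illustration and are not from the question. Each pass of the while loop sees one match, with $1 and $2 set for that match. In PHP, preg_match_all with the PREG_SET_ORDER flag should give a comparable per-match grouping.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Hypothetical input and pattern, chosen only to show two capture groups.
my $text = 'width=10 height=20';

my @pairs;

# In scalar context, /g advances through the string one match at a time,
# so $1 and $2 always refer to the current match.
while ( $text =~ /(\w+)=(\d+)/g ) {
    push @pairs, [ $1, $2 ];
}

# Prints: width -> 10, height -> 20
print join( ', ', map { "$_->[0] -> $_->[1]" } @pairs ), "\n";
```

In list context the same pattern with /g returns one flat list of all captures (width, 10, height, 20), which is fine when you don't need to know which match each capture came from.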

Upvotes: 6

Michael Carman

Reputation: 30831

You can't replace only captures. s/// always replaces everything included in the match. You need to either capture the additional items and include them in the replacement or use assertions to require things that aren't included in the match.
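
A small sketch of both options, assuming for illustration that the goal is to change "is" to "was" only when it is followed by " a" (that goal is invented here, not taken from the question):

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = 'This is a test';

# Option 1: capture the extra context and put it back in the replacement.
# The match covers "is a", so " a" has to be restored via $2.
( my $with_captures = $text ) =~ s/(\bis)(\s+a\b)/was$2/;

# Option 2: require the context with a lookahead assertion.
# The match covers only "is", so nothing else needs restoring.
( my $with_lookahead = $text ) =~ s/\bis(?=\s+a\b)/was/;

print "$with_captures\n";     # This was a test
print "$with_lookahead\n";    # This was a test
```

The \b keeps the pattern from matching the "is" inside "This".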

That said, I don't think that's what you're really asking. Is Sinan's answer what you're after?

Upvotes: 1

Jherico

Reputation: 29240

Put everything into capture groups, then use only the ones you want in the replacement:

(.*?)(is|a)(.*?)
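
As a rough check of how this pattern behaves, here is a sketch that assumes the replacement is $2 with the /g flag (the answer does not say which replacement or flags to use). Because the trailing (.*?) is lazy and nothing follows it, it matches the empty string, so text after the last match is left in place, much like the result described in the question.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $text = 'This is a test';

# Keep only the second group. The leading (.*?) swallows the text before
# each match; the trailing (.*?) matches nothing, so " test" survives.
( my $result = $text ) =~ s/(.*?)(is|a)(.*?)/$2/g;

print "$result\n";    # isisa test
```

Collecting the matches with m//g, as in the accepted approach above, avoids both the leftover tail and the lost spacing.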

Upvotes: 0
