Alessandro Astarita

Reputation: 13

PHP regular expression to extract quoted text in tag body

I'm trying to write a regular expression in PHP. From this code I want to match 'bar'.

<data info="foo">
  "bar"|tr
</data>

I tried these two regexes, without success. They match 'foo"> "bar'.

$regex = '/"(.*?)"\|tr/s';
$regex = '/"[^"]+(.*?)"\|tr/s';

Can anyone help me?

Upvotes: 1

Views: 212

Answers (4)

Tomalak

Reputation: 338416

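Use a negated character class, [^"]*, so the match cannot start at an earlier quote and run across it. Escaping the backslash is good practice in PHP strings, although inside single quotes '\|' and '\\|' produce the same pattern: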

$regex = '/"([^"]*)"\\|tr/s';

I added a capturing group to get the contents of the quotes, which you seem to be interested in.
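As a minimal sketch of how the pattern above could be applied, the snippet below runs it against the sample from the question with preg_match. The variable names $input and $matches are illustrative only.

```php
<?php
// Sample input copied from the question.
$input = <<<'XML'
<data info="foo">
  "bar"|tr
</data>
XML;

// [^"]* cannot cross a quote, so a match that starts at the quote
// before foo can never extend as far as bar.
$regex = '/"([^"]*)"\\|tr/s';

if (preg_match($regex, $input, $matches)) {
    // $matches[0] is the whole match, $matches[1] the first capturing group.
    echo $matches[1], PHP_EOL;
}
```

Group 1 holds the text between the quotes; the quotes and the |tr suffix appear only in $matches[0].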

Since you seem to be applying the regex to XML, I just want to warn you that XML and regular expressions don't play well together. Regex is only advisable in conjunction with a DOM parser.
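One way to combine the two, sketched here under the assumption that PHP's bundled DOM extension is available: let DOMDocument parse the markup, then run the regex only on each element's text. The variable names are illustrative, and this is not necessarily what the answer had in mind.

```php
<?php
// Sample input copied from the question.
$xml = <<<'XML'
<data info="foo">
  "bar"|tr
</data>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$values = [];
foreach ($doc->getElementsByTagName('data') as $node) {
    // textContent contains only the element body, so the attribute
    // value info="foo" never reaches the regex.
    if (preg_match('/"([^"]*)"\|tr/', $node->textContent, $m)) {
        $values[] = $m[1];
    }
}

print_r($values);
```

Because the parser has already separated attributes from text, the regex no longer has to guard against quotes inside the tag itself.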

Upvotes: 3

instanceof me

Reputation: 39208

$regex = '/"(.+?)"(?=\|tr)/';

This will match "bar" (including the quotes), and the bar string (without quotes) is available in group 1 ($matches[1] with preg_match). It uses a look-ahead, so |tr itself is not part of the match.
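A short sketch of the look-ahead version against the sample from the question; $input and $matches are illustrative names. Note that this pattern has no /s modifier, so . does not match a newline. On this input that is what stops a match that begins at the quote before foo from reaching bar.

```php
<?php
// Sample input copied from the question.
$input = <<<'XML'
<data info="foo">
  "bar"|tr
</data>
XML;

// (?=\|tr) asserts that |tr follows but does not consume it.
$regex = '/"(.+?)"(?=\|tr)/';

preg_match($regex, $input, $matches);

echo $matches[0], PHP_EOL; // whole match, quotes included
echo $matches[1], PHP_EOL; // group 1, quotes excluded
```

If the quoted value and the attribute could appear on the same line, a negated class such as [^"]+ in place of .+? would be the safer choice.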

Upvotes: 0

Julio Greff

Reputation: 135

Try this:

$regex = '/"([^">]+)"\|tr/s';

If you want to match just letters, digits, and underscores, you can do:

$regex = '/"(\w+)"\|tr/s';
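To illustrate the difference between the two patterns, here is a small sketch. The helper firstQuoted and the sample strings are hypothetical and exist only for this comparison. In PCRE, \w already covers letters, digits, and the underscore.

```php
<?php
// Hypothetical helper: returns the first captured group, or null when nothing matches.
function firstQuoted(string $regex, string $input): ?string
{
    return preg_match($regex, $input, $m) === 1 ? $m[1] : null;
}

$loose  = '/"([^">]+)"\|tr/s'; // anything except a quote or >
$strict = '/"(\w+)"\|tr/s';    // letters, digits, and underscores only

$sample = "<data info=\"foo\">\n  \"bar_42\"|tr\n</data>";
$spaced = "<data info=\"foo\">\n  \"hello world\"|tr\n</data>";

var_dump(firstQuoted($loose, $sample));   // both patterns accept bar_42
var_dump(firstQuoted($strict, $sample));
var_dump(firstQuoted($loose, $spaced));   // only the loose pattern accepts a space
var_dump(firstQuoted($strict, $spaced));
```

The stricter class is the better fit when the quoted values are identifiers; the looser one is needed as soon as they may contain spaces or punctuation.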

Upvotes: 0

slf

Reputation: 22787

\"\w+\"

should match any run of word characters enclosed in double quotes

Upvotes: 0
