genesi5

Reputation: 453

Perl, replace multiple matches in string

I'm parsing XML and ran into a problem. The XML has objects containing a script, which looks roughly like this:

return [
['measurement' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_1.png')),
'kpi' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_2.png'))]]

I need to replace every filename while keeping the file extension, at every place the regex pattern matches, because a string can also look like this:

['measurement' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_1.png'))('tab_2_1.png'))('tab_2_1.png')),

and I still need to replace every image name before .png.

I used this regexp: .*\(\'(.*)\.png\'\), but it catches only the last match on a line (before \n), not every match in the whole string.

Can you help me correct this regexp?

Upvotes: 1

Views: 1583

Answers (1)

zdim

Reputation: 66964

The problem is that .* is greedy: it matches as much as it can. So .*x matches everything up to the very last x in the string, even if that stretch itself contains other x's. You need the non-greedy form, together with /g to replace all matches:

s/\('(.*?)\.png/('$replacement.png/g;

where the ? makes .* match only up to the first .png. The \(' is needed to anchor the pattern at the start of the filename. This correctly replaces the filenames in the shown examples.
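
To make the difference concrete, here is a minimal sketch that runs both patterns against the second sample line from the question. The value new_name for $replacement is only a placeholder chosen for this example.

```perl
use strict;
use warnings;

# Sample line from the question, with the filename repeated three times.
my $line = q{['measurement' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_1.png'))('tab_2_1.png'))('tab_2_1.png')),};

# Placeholder value, used only for illustration.
my $replacement = 'new_name';

# Greedy pattern from the question: the leading .* consumes everything up to
# the last ('...png'), so even with /g only one filename is captured.
my @greedy = $line =~ /.*\('(.*)\.png'\)/g;

# Non-greedy pattern without the leading .*: every filename is replaced.
(my $fixed = $line) =~ s/\('(.*?)\.png/('$replacement.png/g;

# Count how many replaced filenames ended up in the result.
my $count = () = $fixed =~ /\('new_name\.png'\)/g;

print "greedy captures: ", scalar(@greedy), "\n";
print "replaced:        $count\n";
print "$fixed\n";
```

On this input the greedy pattern captures a single filename, while the non-greedy substitution rewrites all three.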

Another way to do this is \('([^.]*)\.png, where [^.] is a negated character class that matches any character other than a literal dot. With the * quantifier it matches everything up to the first dot, which here is the one in .png.
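
As a rough sketch of the negated-class variant, the following applies it to the whole multi-line script from the question and uses the captured name to pick a new one. The %new_name mapping and the names in it are hypothetical, invented only for this example.

```perl
use strict;
use warnings;

# The script fragment from the question, held as one multi-line string.
my $script = <<'END';
return [
['measurement' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_1.png')),
'kpi' : org.apache.commons.io.FileUtils.readFileToByteArray(new File('tab_2_2.png'))]]
END

# Hypothetical old-name => new-name mapping, for illustration only.
my %new_name = (
    tab_2_1 => 'measurement_plot',
    tab_2_2 => 'kpi_plot',
);

# [^.]* captures the name up to the first dot. With /e the replacement is
# evaluated as code: look up the captured name and keep it if unmapped.
# /g applies this to every match in the string, across all lines.
(my $renamed = $script) =~ s{\('([^.]*)\.png}{"('" . ($new_name{$1} // $1) . ".png"}ge;

print $renamed;
```

One caveat with this variant: a name that itself contains a dot, such as tab.2.png, is not matched by [^.]*\.png at all, so the non-greedy form is the safer choice if such names can occur.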


The question doesn't say how exactly you are "parsing an XML", but I dearly hope that it is with a library like XML::LibXML or XML::Twig. Please do not attempt that with regex. The tool is just not adequate for the job, and you'll find that out the hard way. A lot has been written about this over the years; search Stack Overflow.

Upvotes: 2
