Reputation: 2389
Edit 2: Regex-match solutions only, please. Thank you!
Edit: I'm looking for a regex solution, if one exists. I have other blocks with the same data that are not XML, and I can't use Perl; I added the Perl tag because I'm more familiar with regexes in Perl. Thanks in advance!
I have a list like this:
<Param name="Application #" value="1">
<Param name="app_id" value="32767" />
<Param name="app_name" value="App01" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="1" />
</Param>
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>
<Param name="Application #" value="3">
<Param name="app_id" value="32" />
<Param name="app_name" value="App03" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="2" />
</Param>
How can I get the block for one app if I only know, say, the value of app_name? For example, for App02 I want to get:
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_version" value="1.0.0" />
<Param name="app_priority" value="5" />
</Param>
Is it possible to get it if the other "name=" lines are not known (but there is always a name="app_name" line and a Param name="Application #" line)?
Can it be done in a single regex match? (It doesn't have to be, but it feels like there's probably a way.)
Upvotes: 0
Views: 323
Reputation: 6476
I would suggest using an XML parser, but if you cannot do so, the following quick and dirty code should do:
my ($rez) = $data =~/\<Param\s+name\s*=\s*"Application\s#"\s+value\s*=\s*"2"\>((?:.|\n)*?)^\<\/Param\>/m;
print $rez;
(assuming $data contains your XML as a single, possibly multi-line, string)
Upvotes: 1
Reputation: 336308
I would prefer a parser solution, too. If you absolutely have to use a regex and understand all the disadvantages of this approach, then the following regex should work:
<Param name="Application #"[^>]*>\s+<Param[^>]*>\s+<Param name="app_name" value="App02" />\s+(?:<Param[^>]*>\s+){2}</Param>
This relies heavily on the structure present in your example. A re-ordering of tags, introduction of additional tags or (shudder) nesting of tags will break the regex.
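A somewhat less order-dependent variant can be sketched with a negative lookahead that stops the match from running across a closing </Param>. This is only a sketch under assumptions that are not in the question: `$data` holds the whole list as one string, `$name` is a hypothetical variable for the app_name being looked up, and the inner tags are always self-closing. The sample data is shortened from the question.

```perl
use strict;
use warnings;

# Assumption: the whole list is available as one string, and the inner
# <Param ... /> tags are always self-closing (no deeper nesting).
my $data = <<'END';
<Param name="Application #" value="1">
<Param name="app_id" value="32767" />
<Param name="app_name" value="App01" />
</Param>
<Param name="Application #" value="2">
<Param name="app_id" value="3221" />
<Param name="app_name" value="App02" />
<Param name="app_priority" value="5" />
</Param>
END

my $name = 'App02';    # hypothetical variable: the app_name to look up

# (?:(?!</Param>).)* consumes text only while it does not run into a
# closing </Param>, so the match cannot spill into a neighbouring block.
my ($block) = $data =~ m{
    ( <Param \s+ name="Application\ \#" [^>]* >
      (?: (?! </Param> ) . )*?
      <Param \s+ name="app_name" \s+ value="\Q$name\E" \s* />
      (?: (?! </Param> ) . )*
      </Param> )
}xs;

print defined $block ? "$block\n" : "no block for $name\n";
```

Because the lookahead only cares about where the block ends, reordered or additional self-closing inner tags should be tolerated. Nested tags with their own closing </Param> would still break it, so the caveats above continue to apply.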
Upvotes: 1
Reputation: 118148
This seems to be a sad case of bogus XML. A misguided attempt to create enterprisey software at best. The developers could have used a sane configuration file format such as:
[App03]
app_id = 32767
app_version = 1.0.0
...
but they decided to drive everyone insane with meaningless BSXML.
I would say, if this file is less than 10 MB in size, just go ahead and use XML::Simple. If the file indeed consists of nothing but repeated blocks of exactly what you posted, you can use the following solution:
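#!/usr/bin/perl
use strict; use warnings;
my %apps;
{
    # Read one application block at a time: each record ends at "</Param>\n".
    local $/ = "</Param>\n";
    # <DATA> assumes the blocks from the question follow a __DATA__ marker
    # at the end of this script.
    while ( my $block = <DATA> ) {
        last unless $block =~ /\S/;
        # Collect every name/value pair in the block into a hash.
        my %appinfo = ($block =~ /name="([^"]+?)"\s+value="([^"]+?)"/g);
        # Index the block's data by its app_name.
        $apps{ $appinfo{app_name} } = \%appinfo;
    }
}
use Data::Dumper;
print Dumper $apps{App03};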
Edit: If you cannot use Perl and you won't tell us what you can use, there is not much I can do but point out that
/name="([^"]+?)"\s+value="([^"]+?)"/g
will give you all name-value pairs.
Upvotes: 3
Reputation: 27323
Since your content seems to be XML, why not use a real parser for the task?
use XML::XPath;
use XML::XPath::XMLParser;
my $xp = XML::XPath->new(filename => 'test.xhtml');
my $nodeset = $xp->find('/Param[@name=\'Application #\']'); # find all applications
foreach my $node ($nodeset->get_nodelist) {
    print "FOUND\n\n",
        XML::XPath::XMLParser::as_string($node),
        "\n\n";
}
You can read a bit more about XPath here, and the full reference is at the W3C.
I advise you not to use a regex for this task, because it is going to be complicated and hard to maintain.
Note: it is also possible to use the DOM API; it just depends on which one you like best.
Upvotes: 4
Reputation: 192547
Seems like it would be more appropriate to use an XML reader library, but I don't know Perl well enough to suggest one.
Upvotes: 1