Keryn Drake

Reputation: 121

Perl XML::LibXML searching for attribute value and counting occurrences

This code works: it matches attribute values from one source against another. Both sources have exactly the same structure, so I haven't shown the XML. I just figured that, with how flash XML::LibXML is, there would be a much better way to do it.

#get from one data source
for my $movie($review_details1->findnodes('/result_set/results/review')){
    my $id = $movie->findvalue('@movie_id');

    #check if it exists in the other data source
    for my $new_movie($review_details2->findnodes('result_set/results/review')){
        my $new_id = $new_movie->findvalue('@movie_id');
        if ($id eq $new_id){
            print "ID $id matches NEW ID $new_id\n";
        }
    }
}

Cheers

Upvotes: 1

Views: 267

Answers (2)

vanHoesel

Reputation: 954

my %ids1;
my %ids2;

# count all the IDs in Details1
$ids1{$_->value}++ foreach $review_details1->findnodes('/result_set/results/review/@movie_id');

# count all the IDs in Details2
$ids2{$_->value}++ foreach $review_details2->findnodes('/result_set/results/review/@movie_id');

# collect all keys from IDs2 that also exist in IDs1
my @common_ids = grep { exists $ids1{$_} } keys %ids2;

The grep returns the list of IDs that appear in both sources; do with it whatever you like, whether that's printing it or storing it in an array.
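For anyone who wants to try this end to end, here is a minimal runnable sketch of the same hash-lookup idea. The sample XML and the `count_ids` helper are assumptions made up for illustration (the question never showed the real data); only the `/result_set/results/review` structure and the `movie_id` attribute come from the question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data; the real documents were not shown in the question.
my $xml1 = <<'XML';
<result_set><results>
  <review movie_id="10"/>
  <review movie_id="20"/>
  <review movie_id="20"/>
</results></result_set>
XML
my $xml2 = <<'XML';
<result_set><results>
  <review movie_id="20"/>
  <review movie_id="30"/>
</results></result_set>
XML

my $review_details1 = XML::LibXML->load_xml(string => $xml1);
my $review_details2 = XML::LibXML->load_xml(string => $xml2);

# Hypothetical helper: count how often each movie_id occurs in one document.
sub count_ids {
    my ($doc) = @_;
    my %count;
    $count{ $_->value }++
        for $doc->findnodes('/result_set/results/review/@movie_id');
    return \%count;
}

my $ids1 = count_ids($review_details1);
my $ids2 = count_ids($review_details2);

# IDs present in both documents, reported with their per-document counts.
my @common_ids = grep { exists $ids1->{$_} } sort keys %$ids2;
print "ID $_: $ids1->{$_} in source 1, $ids2->{$_} in source 2\n"
    for @common_ids;
```

Keeping one hash per source means the occurrence counts stay separate, so the same pass answers both "does it match?" and "how many times does it occur in each file?".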

Upvotes: 1

Adam Taylor

Reputation: 7793

You might be better off looping through each structure once instead of re-scanning the second XML document for every review in the first, but, y'know, TMTOWTDI. It probably doesn't matter if the XML files are small, but with large files the nested loops do work proportional to the product of the two review counts, so it might be worth doing.

e.g.

my %movie_ids;

# count the IDs from the first data source
for my $movie ($review_details1->findnodes('/result_set/results/review')) {
    my $id = $movie->findvalue('@movie_id');
    $movie_ids{$id}++;
}

# add the IDs from the second data source to the same hash
for my $new_movie ($review_details2->findnodes('result_set/results/review')) {
    my $new_id = $new_movie->findvalue('@movie_id');
    $movie_ids{$new_id}++;
}

Then you could look through %movie_ids: the value of each key would be either 1 (no match) or > 1 (match), assuming each movie_id appears at most once per file.
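If a movie_id can repeat within a single file, a plain counter can exceed 1 without there being a match in the other file. One possible variation on the single-hash idea, sketched here with made-up sample data and a hypothetical `%seen_in` hash, is to record which source each ID was seen in rather than how many times it was seen overall:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample data standing in for the two real sources.
my @sources = (
    XML::LibXML->load_xml(string =>
        '<result_set><results><review movie_id="1"/><review movie_id="1"/>'
      . '<review movie_id="2"/></results></result_set>'),
    XML::LibXML->load_xml(string =>
        '<result_set><results><review movie_id="2"/>'
      . '<review movie_id="3"/></results></result_set>'),
);

# $seen_in{$id}{$index} is set when $id occurs in source number $index.
my %seen_in;
for my $index (0 .. $#sources) {
    for my $review ($sources[$index]->findnodes('/result_set/results/review')) {
        $seen_in{ $review->findvalue('@movie_id') }{$index} = 1;
    }
}

# An ID matches only if it was seen in every source,
# no matter how often it repeats inside one of them.
my @matches = grep { scalar(keys %{ $seen_in{$_} }) == scalar(@sources) }
              sort keys %seen_in;
print "ID $_ appears in both sources\n" for @matches;
```

Here ID 1 occurs twice in the first source but is not reported, because it never shows up in the second one.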

You could also combine both files first and then do something similar, so that you only need to look through one XML document.

Upvotes: 2
