ravi.g teja

Reputation: 35

Parsing data using Perl

I am not able to parse the XML data properly and need some help.

**Code**
#!/usr/bin/perl
use strict;
use warnings;
open(FILEHANDLE, "data.xml")|| die "Can't open";
my @line;
my @affi;

my @lines;
my $ct =1 ;
print "Enter the start position:-";

my $start= <STDIN>;
print "Enter the end position:-";


my $end = <STDIN>;

print "Processing your data...\n";
my $i =0;
my $t =0;
while(<FILEHANDLE>)
{
    if($ct>$end)
    {
       close(FILEHANDLE);
       exit;
       
    }
    if($ct>=$start)
    {
       $lines[$t] = $_;
       $t++;
     }
     
     if($ct == $end)
     {
    my $i = 0;
    my $j = 0;
    my @last;
    my @first;
    my $l = @lines;
    my $s = 0;

while($j<$l)
{
    if ($lines[$j] =~m/@/)
    {
        $line[$i] = $lines[$j];
        my $u = $j-3;
        $first[$i]=$lines[$s]; 
        $s--;
        $last[$i] = $lines[$u];
        #$j = $j+3;
        #$last[$i]= $lines[$j];
        #$j++;
        #$first[$i] = $lines[$j];
        $i++;
    }
$j++;
}
my $k = 0;
foreach(@line)
{
  $line[$k] =~ s/<.*>(.* )(.*@.*)<.*>/$2/;
  $affi[$k] = $1;
  $line[$k] = $2;
    $line[$k] =~ s/\.$//;
    
    
    $k++;
  }
my $u = 0;
foreach(@first)
{
  $first[$u] =~s/<.*>(.*)<.*>/$1/;
  $first[$u]=$1;  
  $u++;
  }
my $m = 0;
foreach(@last)
{
  $last[$m] =~s/<.*>(.*)<.*>/$1/;
  $last[$m] = $1;    
  $m++;
  }
my $q=@line;
open(FILE,">Hayathi.txt")|| die "can't open";
my $p;

for($p =0; $p<$q; $p++)
{  
  print FILE "$line[$p]  $last[$p],$first[$p]   $affi[$p]\n";  
} 

close(FILE);
     }
     
  
  $ct++;
  }

This code should extract lastName, firstName, and affiliation from the data and save them to a text file.

I have tried the code above, but firstName is missing from the output. Please help me correct the code. Thank you in advance.

Upvotes: 1

Views: 127

Answers (1)

Polar Bear

Reputation: 6818

You can take the following code sample as a basis for your code.

Since no XML sample was provided as text, this help is limited to what can be inferred from the image of the data.

Documentation: XML::LibXML

use strict;
use warnings;
use feature 'say';

use XML::LibXML;

my $file = 'europepmc.xml';

my $dom = XML::LibXML->load_xml(location => $file);

foreach my $node ($dom->findnodes('//result')) {
    say 'NodeID:    ', $node->{id};
    say 'FirstName: ', $node->findvalue('./firstName');
    say 'LastName:  ', $node->findvalue('./lastName');
    say '';
}

exit 0;
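
To also save the result in a text file, as the question asks, the same loop can be extended roughly as follows. This is only a sketch under assumptions: the inline XML is a made-up sample, `<affiliation>` is a guessed element name (only `result`, `firstName`, and `lastName` appear in the code above), and `format_records` is a hypothetical helper. The output file name `Hayathi.txt` is taken from the question.

```perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical sample data. The <affiliation> element name is an assumption;
# adjust the XPath expressions to match the real file.
my $xml = <<'XML';
<results>
  <result id="1">
    <firstName>Jane</firstName>
    <lastName>Doe</lastName>
    <affiliation>Example University</affiliation>
  </result>
  <result id="2">
    <firstName>Richard</firstName>
    <lastName>Roe</lastName>
  </result>
</results>
XML

# load_xml also accepts location => $file, as in the sample above.
my $dom = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: build one "lastName,firstName   affiliation" line
# per <result> element.
sub format_records {
    my ($doc) = @_;
    my @rows;
    foreach my $node ($doc->findnodes('//result')) {
        my $first = $node->findvalue('./firstName');
        my $last  = $node->findvalue('./lastName');
        my $affi  = $node->findvalue('./affiliation');  # empty string if the element is absent
        push @rows, "$last,$first   $affi";
    }
    return @rows;
}

my @rows = format_records($dom);

# Three-argument open with a lexical filehandle; report the OS error on failure.
my $out_file = 'Hayathi.txt';
open(my $out, '>', $out_file) or die "Can't open $out_file: $!";
print {$out} "$_\n" for @rows;
close($out) or die "Can't close $out_file: $!";
```

Because each field is looked up by element name inside its own `result` node, a missing element only yields an empty string for that record and cannot shift the first and last names of other records, which is a risk when picking lines by fixed offsets.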

Upvotes: 1
