Pann Phyu Phway

Reputation: 55

How to check an XML file line by line with a Perl script

I would like to compare two files: the user's input file, which is a text file, and a config file, which is an XML file. If a value in the user's input file matches the config file, the script should report the match.

This is the user's input file:

L84A:FIP:70:155:15:18:
L83A:55FIP:70:155:15:

In the above file, L84A is the Design_ID, FIP is the Process_ID, and 70 through 18 are the register IDs.

This is the config file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Sigma>
        <Run>
                <DESIGN_ID>L83A</DESIGN_ID>
                <PROCESS_ID>55FIP</PROCESS_ID>
                <RegisterList>
                        <Register>70</Register>
                        <Register>155</Register>
                </RegisterList>
        </Run>
        <Run>
                <DESIGN_ID>L83A</DESIGN_ID>
                <PROCESS_ID>FRP</PROCESS_ID>
                <RegisterList>
                        <Register>141</Register>
                        <Register>149</Register>
                        <Register>151</Register>
                </RegisterList>
        </Run>
        <Run>
                <DESIGN_ID>L84A</DESIGN_ID>
                <PROCESS_ID>55FIP</PROCESS_ID>
                <RegisterList>
                        <Register>70</Register>
                        <Register>155</Register>
                </RegisterList>
        </Run>
</Sigma>

So in this case the output should be:

L84A: doesn't has FIP process ID in config file.
L83A:
  55FIP
   70 - existing register ID
   155 - existing register ID
   15 - no existing register ID.

My code doesn't check the respective process ID and register ID. It prints the following:

L84A
 FIP
  70 - existing register ID
  155 - existing register ID
  15 - existing register ID
  18 - no existing register ID
L83A
 55FIP
  70 - existing register ID
  155 - existing register ID
  15 - existing register ID

Below is my code:

use strict; 
use warnings;
use vars qw($file1 $file1cnt @output);
use XML::Simple;
use Data::Dumper;
# create object
my $xml = new XML::Simple;

# read XML file
my $data = $xml->XMLin("sigma_loader.xml");
my $file1 = "userinput.txt";

readFileinString($file1, \$file1cnt); 

while($file1cnt=~m/^((\w){4})\:([^\n]+)$/mig)
{  
  my $DID = $1;
  my $reqconfig = $3;
  while($reqconfig=~m/^((\w){5})\:([^\n]+)$/mig) #Each line from user request
  {  
    my $example1 = $1; #check for FPP/QBPP process
    my $example2 = $3; #display bin full lists.

   if(Dumper($data) =~ $DID)
   {
     print"$DID\n";
     if(Dumper($data) =~ $example1)
     {
       print"$example1\n";
       my @second_values = split /\:/, $example2;
       foreach my $sngletter(@second_values)
        { 

          if( Dumper($data) =~ $sngletter)
          {
            print"$sngletter - existing register ID\n";
          }
          else
          {
            print"$sngletter - no existing register ID\n";
          }
        }
     }
     else
     {
      print"$DID doesn't has $example1 process ID in config file\n";
     }
   }
   else
   {
    print"new Design ID deteced\n";
   } 
 }
 while($reqconfig=~m/^((\w){3})\:([^\n]+)$/mig) #Each line from user request
 {  
    my $example1 = $1; #check for FPP/QBPP process
    my $example2 = $3; #display bin full lists.

   if(Dumper($data) =~ $DID)
   {
     print"$DID\n";
     if(Dumper($data) =~ $example1)
     {
       print"$example1\n";
       my @second_values = split /\:/, $example2;
       foreach my $sngletter(@second_values)
        { 

          if( Dumper($data) =~ $sngletter)
          {
            print"$sngletter - existing register ID\n";
          }
          else
          {
            print"$sngletter - no existing register ID\n";
          }
        }
     }
     else
     {
      print"$DID doesn't has $example1 process ID in config file\n";
     }
   }
   else
   {
    print"new Design ID deteced\n";
   } 
 }
}

sub readFileinString
#------------------>
{
    my $File = shift;
    my $string = shift;
    use File::Basename;
    my $filenames = basename($File);
    open(FILE1, "<$File") or die "\nFailed Reading File: [$File]\n\tReason: $!";
    read(FILE1, $$string, -s $File, 0);
    close(FILE1);
} 

Upvotes: 1

Views: 152

Answers (1)

simbabque

Reputation: 54373

There are a couple of things in your code that do not really make sense, like using Data::Dumper and parsing the output with a regular expression. I'm not going to review your code as that is off-topic on Stack Overflow, but instead going to give you an alternate solution and walk you through it.

Please note that XML::Simple is not a great tool. Its use is discouraged because it is very bad at handling certain cases. But for your very simple XML structure it will work, so I have kept it.

use strict;
use warnings;
use XML::Simple;
use feature 'say';

# read XML file and reorganise it for easier use
my $data;
foreach my $run (@{XMLin(\*DATA)->{Run}}) {
    $data->{$run->{DESIGN_ID}}->{$run->{PROCESS_ID}} =
      {map { $_ => 1 } @{$run->{RegisterList}->{Register}}};
}

# read the text file - I've skipped the read
my @user_input = qw(
  L84A:FIP:70:155:15:18:
  L83A:55FIP:70:155:15:
);

foreach my $line (@user_input) {
    # not needed in my example, but you need it when you read from a file
    chomp $line;
    my ($design_id, $process_id, @register_ids) = split /:/, $line;

    # extra error checking just in case
    if (not exists $data->{$design_id}) {
        say "$design_id doesn't exist in data";
        next;
    }
    if (not exists $data->{$design_id}->{$process_id}) {
        say "$design_id: doesn't have $process_id";
        next;
    }

    say "$design_id:";
    say " $process_id";
    foreach my $register_id (@register_ids) {
        if (exists $data->{$design_id}->{$process_id}->{$register_id}) {
            say "  $register_id - existing register ID";
        }
        else {
            say "  $register_id - no existing register ID";
        }
    }
}


__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Sigma>
        <Run>
                <DESIGN_ID>L83A</DESIGN_ID>
                <PROCESS_ID>55FIP</PROCESS_ID>
                <RegisterList>
                        <Register>70</Register>
                        <Register>155</Register>
                </RegisterList>
        </Run>
        <Run>
                <DESIGN_ID>L83A</DESIGN_ID>
                <PROCESS_ID>FRP</PROCESS_ID>
                <RegisterList>
                        <Register>141</Register>
                        <Register>149</Register>
                        <Register>151</Register>
                </RegisterList>
        </Run>
        <Run>
                <DESIGN_ID>L84A</DESIGN_ID>
                <PROCESS_ID>55FIP</PROCESS_ID>
                <RegisterList>
                        <Register>70</Register>
                        <Register>155</Register>
                </RegisterList>
        </Run>
</Sigma>

I've made a few assumptions.

  1. You already know how to read the text file, so I've stuck the lines into an array. Your file reading code has some issues, though: you should be using three-argument open and lexical filehandles. Your call to open should look like this:

    open my $fh, '<', $filename or die "$!: error...";
    

    Alternatively, consider using Path::Tiny.

  2. I'm taking the XML from the __DATA__ section, which can be read like a filehandle.
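To fill in the read that was skipped, here is a minimal sketch of loading the user input with three-argument open and a lexical filehandle. The helper name read_user_input is hypothetical, and skipping blank lines is an assumption about what the input file may contain.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the non-blank lines of a file, newlines removed.
sub read_user_input {
    my ($filename) = @_;

    # three-argument open with a lexical filehandle
    open my $fh, '<', $filename
      or die "Cannot open '$filename': $!";

    my @lines;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;    # assumption: blank lines carry no data
        push @lines, $line;
    }
    close $fh;

    return @lines;
}

# Example call, using the file name from the question:
# my @user_input = read_user_input('userinput.txt');
```

The returned list can take the place of the hard-coded @user_input array.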

So let's look at my code.

When we read the XML structure, it looks like this straight out of XMLin.

\ {
    Run   [
        [0] {
            DESIGN_ID      "L83A",
            PROCESS_ID     "55FIP",
            RegisterList   {
                Register   [
                    [0] 70,
                    [1] 155
                ]
            }
        },
        [1] {
            DESIGN_ID      "L83A",
            PROCESS_ID     "FRP",
            RegisterList   {
                Register   [
                    [0] 141,
                    [1] 149,
                    [2] 151
                ]
            }
        },
        [2] {
            DESIGN_ID      "L84A",
            PROCESS_ID     "55FIP",
            RegisterList   {
                Register   [
                    [0] 70,
                    [1] 155
                ]
            }
        }
    ]
}

This is not very useful for what we plan to do, so we have to rearrange it. I want to use exists on hash references later, to make it easy to check whether there are matches for the IDs we are looking at. This is called a lookup hash. We can throw away the ->{Run} key, as XML::Simple combines all <Run> elements into an array reference, and the <Sigma> tag is skipped because it's the root element.

Every Design ID can have multiple Processes, so we organise these two hierarchically. Below them we put another lookup hash, where every register is a key and the value is simply 1. The value does not matter; only the existence of the key does.

This gives us a different data structure:

\ {
    L83A   {
        55FIP   {
            70    1,
            155   1
        },
        FRP     {
            141   1,
            149   1,
            151   1
        }
    },
    L84A   {
        55FIP   {
            70    1,
            155   1
        }
    }
}

That's much easier to understand and use later on.
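As a small illustration of how such a lookup hash is queried, the sketch below builds the same structure by hand and checks it level by level with exists. The helper has_register is hypothetical and not part of the solution above. Checking each level in turn also avoids creating empty intermediate hashes through autovivification.

```perl
use strict;
use warnings;

# Same shape as the reorganised structure, built by hand for this demo.
my $data = {};
$data->{L83A}{'55FIP'} = { map { $_ => 1 } (70, 155) };
$data->{L83A}{FRP}     = { map { $_ => 1 } (141, 149, 151) };
$data->{L84A}{'55FIP'} = { map { $_ => 1 } (70, 155) };

# Hypothetical helper: returns 1 only if all three levels exist, else 0.
# Each level is tested before the next one is touched, so a missing
# design or process ID is never autovivified into the structure.
sub has_register {
    my ($lookup, $design_id, $process_id, $register_id) = @_;
    return 0 unless exists $lookup->{$design_id};
    return 0 unless exists $lookup->{$design_id}{$process_id};
    return exists $lookup->{$design_id}{$process_id}{$register_id} ? 1 : 0;
}

print has_register($data, 'L83A', '55FIP', 70), "\n";    # prints 1
print has_register($data, 'L83A', '55FIP', 15), "\n";    # prints 0
print has_register($data, 'L84A', 'FIP',   70), "\n";    # prints 0
```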

Now we parse the user input and iterate over each line. The format seems clear: it's a bit like a CSV file, but with colons as separators, so we can split on :. This gives us the two IDs, and all following values are registers, so we stick them in an array.
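A short sketch of that split on one line of the sample input. The helper name parse_line is hypothetical. Note that split drops trailing empty fields by default, so the colon at the end of each line does not produce an empty register ID.

```perl
use strict;
use warnings;

# Hypothetical helper: splits one input line into design ID, process ID
# and an array reference holding the remaining register IDs.
sub parse_line {
    my ($line) = @_;
    my ($design_id, $process_id, @register_ids) = split /:/, $line;
    return ($design_id, $process_id, \@register_ids);
}

my ($design, $process, $registers) = parse_line('L84A:FIP:70:155:15:18:');
print "$design / $process / @$registers\n";
# prints: L84A / FIP / 70 155 15 18
```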

Your example doesn't have a case where the Design ID does not exist in the XML file, but given this is based on user input, we should check anyway. In the real world data is always dirty.

We can then check if the $process_id exists inside the $design_id in our data. If it does not, we tell the user and skip to the next line.

Then we have to iterate over all the Register IDs. Either the $register_id exists in our second lookup hash, or it doesn't.

This gives us the output you're expecting, apart from small differences in wording.

L84A: doesn't have FIP
L83A:
 55FIP
  70 - existing register ID
  155 - existing register ID
  15 - no existing register ID

This code is much shorter, easier to read and runs faster. I've used Data::Printer to show the data structures.
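One caveat about the XML::Simple loop: by default a <Run> or <Register> element that occurs only once comes back as a plain hash reference or scalar instead of an array reference, and the @{ ... } dereferences would then fail. A possible safeguard, sketched here with a made-up config (the design ID L85A is hypothetical), is the ForceArray option.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical config with a single <Run> holding a single <Register>.
my $xml = <<'XML';
<Sigma>
  <Run>
    <DESIGN_ID>L85A</DESIGN_ID>
    <PROCESS_ID>FIP</PROCESS_ID>
    <RegisterList>
      <Register>70</Register>
    </RegisterList>
  </Run>
</Sigma>
XML

# ForceArray keeps the named elements as array references even when
# only one of them is present.
my $tree = XMLin($xml, ForceArray => [ 'Run', 'Register' ]);

# Same reorganisation into a lookup hash as in the solution above.
my $data;
foreach my $run (@{ $tree->{Run} }) {
    $data->{ $run->{DESIGN_ID} }->{ $run->{PROCESS_ID} } =
      { map { $_ => 1 } @{ $run->{RegisterList}->{Register} } };
}

print join(',', sort keys %{ $data->{L85A}{FIP} }), "\n";
```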

Upvotes: 2
