Reputation: 31
I have an XML file like this:
<Nodes><Node>
<NodeName>Company</NodeName>
<File>employee_details.csv</File>
<data>employee_data.txt</data>
<Node>
<NodeName>dummy</NodeName>
<File>employee_details1.csv</File>
<data>employee_data1.txt</data>
</Node>
</Node>
</Nodes>
#Contents of employee_data.txt
Empname,Empcode,EmpSal:Currency,Empaddr
#Contents of employee_details.csv (like this huge data)
Alex,A001,1000:USD,Bangalore
Aparna,B001,1000:RUBEL,Bombay
#Contents of employee_data1.txt
phone,fax
#Contents of employee_details1.csv (like this huge data)
44568889,123345656
23232323,454545757
Output:
<Company>
<Empname>Alex</Empname>
<Empcode>A001</Empcode>
<EmpSal=USD>1000</EmpSal>
<Empaddr>Bangalore</Empaddr>
<phone>44568889</phone>
<fax>123345656</fax>
</Company>
<Company>
<Empname>Aparna</Empname>
<Empcode>B001</Empcode>
<EmpSal=RUBEL>1000</EmpSal>
<Empaddr>Bombay</Empaddr>
<phone>23232323</phone>
<fax>454545757</fax>
</Company>
I want to build an XML tree with a SAX parser, but I don't understand how to traverse all the nodes and generate the events.
How can I produce the above output in Perl?
Upvotes: 2
Views: 257
Reputation: 12984
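# The .pl file
use strict;
use warnings;
use XML::SAX::ParserFactory;
use sax_handler;

my $factory = XML::SAX::ParserFactory->new();
# pass any arguments your handler needs to new()
my $parser = $factory->parser( Handler => sax_handler->new() );
$parser->parse_uri( $ARGV[0] );    # XML file given on the command line

# sax_handler.pm
package sax_handler;
use strict;
use warnings;

sub new {
    my ($type) = @_;
    return bless {}, $type;    # nothing else to set up
}

# The following 2 methods are the important ones.
sub start_element {
    my ($self, $element) = @_;
    # read an attribute of a tag; m:text is the tag here
    if ( $element->{Name} eq "m:text" ) {
        $self->{name} = $element->{Attributes}->{'{}name'}->{'Value'};
    }
}

# m:reviewID stands for a tag in your XML
sub end_element {
    my ($self, $element) = @_;
    if ( $element->{Name} eq "m:reviewID" ) {
        # the element is complete: print or manipulate the collected values here
    }
}

1;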
Upvotes: 2
Reputation: 12984
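Well, a SAX parser is slightly different from other parsing techniques: you need to write your own handler (a Perl module). The module must contain the following things:
1. a constructor
2. a start_element subroutine
3. an end_element subroutine
You can manage events inside these subroutines, for example for a mail_id tag: if( $element->{Name} eq "mail_id"){ $user_mail_id=$self->get_text();}
Here get_text() stands for your own helper that returns the text collected for the current element; it is not a method that XML::SAX provides.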
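As a minimal sketch of that handler structure, the following collects the text of every <File> element from the XML in the question. The names TextCollector and collect_files are made up for this example, and it assumes XML::SAX is installed. Text arrives through the characters event, so the handler buffers it there and reads the buffer in end_element.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package TextCollector;    # hypothetical handler name
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $element) = @_;
    $self->{text} = '';    # start a fresh buffer for each element
}

sub characters {
    my ($self, $chars) = @_;
    # may be called more than once for a single run of text
    $self->{text} .= $chars->{Data};
}

sub end_element {
    my ($self, $element) = @_;
    if ( $element->{Name} eq 'File' ) {
        push @{ $self->{files} }, $self->{text};
    }
    $self->{text} = '';
}

package main;
use XML::SAX::ParserFactory;

# Hypothetical helper: parse an XML string and return the <File> values.
sub collect_files {
    my ($xml) = @_;
    my $handler = TextCollector->new;
    my $parser  = XML::SAX::ParserFactory->parser( Handler => $handler );
    $parser->parse_string($xml);
    return @{ $handler->{files} || [] };
}

my $xml = '<Nodes><Node><File>employee_details.csv</File>'
        . '<Node><File>employee_details1.csv</File></Node></Node></Nodes>';
print "$_\n" for collect_files($xml);
```

The same pattern works for the <data> elements: test for another name in end_element and store the buffered text under a different key.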
Upvotes: 1
Reputation: 16171
It looks to me like the CSV files can be huge, not the XML one, so there is really no need to use a SAX parser. The XML is used only to give you the location of 4 files. Two of those files (the .txt ones) are small and contain only a list of fields; the other two, the CSV files, can be big.
You should use Text::CSV_XS to parse those 2 huge files. You can then output the XML using plain print; just make sure you escape the text and pay attention to the encoding. (BTW, in your sample output <EmpSal=USD> is not well-formed XML: an attribute needs a name and a quoted value, for example <EmpSal Currency="USD">.)
Another option is XML::Writer, which will take care of escaping and quoting for you. I don't think generating SAX events and passing them to a SAX writer makes sense in this case: it would be more complex and probably slower than the other options.
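A rough sketch of the Text::CSV_XS plus XML::Writer approach, under a few assumptions: the helper names write_record and csv_to_xml are invented here, a <Companies> root element is added so the result is a single well-formed document, and a header such as EmpSal:Currency is read as "element EmpSal with an attribute Currency". It handles one CSV file; rows are read one at a time, so the whole file never has to fit in memory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;
use XML::Writer;

# Hypothetical helper: write one CSV row as a <Company> element.
sub write_record {
    my ($writer, $fields, $row) = @_;
    $writer->startTag('Company');
    for my $i ( 0 .. $#$fields ) {
        my ($name, $attr) = split /:/, $fields->[$i], 2;
        my $value = defined $row->[$i] ? $row->[$i] : '';
        if ( defined $attr ) {
            # assumption: "1000:USD" under "EmpSal:Currency" means
            # text 1000 with attribute Currency="USD"
            my ($text, $attr_value) = split /:/, $value, 2;
            $attr_value = '' unless defined $attr_value;
            $writer->dataElement( $name, $text, $attr => $attr_value );
        }
        else {
            $writer->dataElement( $name, $value );
        }
    }
    $writer->endTag('Company');
}

# Hypothetical helper: stream rows from $in_fh and write XML to $out
# (a filehandle or a reference to a string).
sub csv_to_xml {
    my ($fields, $in_fh, $out) = @_;
    my $csv    = Text::CSV_XS->new( { binary => 1 } );
    my $writer = XML::Writer->new(
        OUTPUT      => $out,
        DATA_MODE   => 1,
        DATA_INDENT => 2,
    );
    $writer->startTag('Companies');    # assumed root element
    while ( my $row = $csv->getline($in_fh) ) {
        write_record( $writer, $fields, $row );
    }
    $writer->endTag('Companies');
    $writer->end;
}

# Usage with the sample data from the question, read from a string.
my @fields = ( 'Empname', 'Empcode', 'EmpSal:Currency', 'Empaddr' );
my $data   = "Alex,A001,1000:USD,Bangalore\nAparna,B001,1000:RUBEL,Bombay\n";
open my $in, '<', \$data or die "cannot open in-memory file: $!";
my $xml = '';
csv_to_xml( \@fields, $in, \$xml );
print $xml;
```

To merge the second CSV file (phone, fax) into the same <Company> element, one option is to read one row from each file inside the same loop and pass the concatenated fields and values to write_record.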
Upvotes: 1