Reputation: 187
Need to process the file using a for loop
I have written the code below to convert a CSV file to XML. It writes a separate, hand-coded tag for each column.
The input file has columns 1 to 278, and the output file needs tags A1 to A278, so I would like to generate the tags in a loop.
Code:
file_in="Prepaid_plan_voucher.csv"
file_out="Prepaid_plan_voucher.xml"
echo '<?xml version="1.0"?>' > $file_out
#echo '<Customers>' >> $file_out
echo ' <TariffRecords>' >> $file_out
echo ' <Tariff>' >> $file_out
while IFS=$',' read -r -a arry
do
# echo ' <TariffRecords>' >> $file_out
# echo ' <Tariff>' >> $file_out
echo ' <A1>'${arry[0]}'</A1>' >> $file_out
echo ' <A2>'${arry[1]}'</A2>' >> $file_out
echo ' <A3>'${arry[2]}'</A3>' >> $file_out
# echo ' </TariffRecords>' >> $file_out
# echo ' </Tariff>' >> $file_out
done < $file_in
#echo '</Customers>' >> $file_out
echo ' </Tariff>' >> $file_out
echo ' </TariffRecords>' >> $file_out
Sample input file (this is only a sample; the actual input file will contain 278 columns). If the input file has two or three records, all of them need to be appended to one XML file.
name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,
Sample output file. The TariffRecords and Tariff tags will be hard-coded at the beginning and end of the XML file.
<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6></A6>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6></A6>
</Tariff>
</TariffRecords>
Upvotes: 0
Views: 4491
Reputation: 52848
Since it was mentioned in the comments, here's an option using XSLT 3.0.
I tested with Saxon-HE 9.8, run from a java command line. It should be easy to incorporate into a shell script to process multiple files.
CSV Input (added an additional row to show handling of another empty entry and a quoted entry that contains commas that aren't separators)
name,Tariff Summary,Record ID No.,Operator Name,Circle (Service Area),list
Prepaid Plan Voucher,test_All calls 2p/s,TT07PMPV0188,Ta Te,Gu,
Prepaid Plan Voucher,test_All calls 3p/s,TT07PMPV0189,Ta Te,HR,
Prepaid Plan Voucher,,TT07PMPV0190,Ta Te,DH,"some,comma,separated,list"
XSLT 3.0
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" expand-text="yes">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:param name="csv-uri"/>
<xsl:param name="csv-encoding" select="'UTF-8'"/>
<xsl:template name="init">
<TariffRecords>
<xsl:choose>
<xsl:when test="unparsed-text-available($csv-uri, $csv-encoding)">
<xsl:call-template name="csv2xml"/>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="error">
<xsl:text>Error reading "{$csv-uri}" (encoding "{$csv-encoding}").</xsl:text>
</xsl:variable>
<xsl:message><xsl:value-of select="$error"/></xsl:message>
</xsl:otherwise>
</xsl:choose>
</TariffRecords>
</xsl:template>
<xsl:template name="csv2xml">
<xsl:variable name="csv_content" select="unparsed-text($csv-uri, $csv-encoding)"/>
<xsl:analyze-string select="$csv_content" regex="\r?\n">
<xsl:non-matching-substring>
<xsl:if test="position() > 1"><!--ignore header-->
<Tariff>
<xsl:analyze-string select="concat(.,',')" regex='"([^"]*)",?|([^,]+),?'>
<!--group 1 is wrapped in quotes-->
<!--group 2 is not wrapped quotes-->
<xsl:matching-substring>
<xsl:element name="A{position()}">
<xsl:value-of select="(regex-group(1),regex-group(2))" separator=""/>
</xsl:element>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:element name="A{position()}"/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</Tariff>
</xsl:if>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
Command line (see here for more info on running Saxon from the command line)
java -cp "C:/apps/SaxonHE9-8-0-11J/saxon9he.jar" net.sf.saxon.Transform -it:init -xsl:"csv2xml.xsl" -o:"output.xml" csv-uri="input.csv"
Output
<?xml version="1.0" encoding="UTF-8"?>
<TariffRecords>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 2p/s</A2>
<A3>TT07PMPV0188</A3>
<A4>Ta Te</A4>
<A5>Gu</A5>
<A6/>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2>test_All calls 3p/s</A2>
<A3>TT07PMPV0189</A3>
<A4>Ta Te</A4>
<A5>HR</A5>
<A6/>
</Tariff>
<Tariff>
<A1>Prepaid Plan Voucher</A1>
<A2/>
<A3>TT07PMPV0190</A3>
<A4>Ta Te</A4>
<A5>DH</A5>
<A6>some,comma,separated,list</A6>
</Tariff>
</TariffRecords>
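The shell-script integration mentioned above might look like the following sketch. It only reuses the options from the command line shown in this answer (-it, -xsl, -o and the csv-uri parameter). The SAXON_JAR location and the function names build_saxon_cmd and convert_all are assumptions made for illustration, and it assumes the CSV files sit next to csv2xml.xsl and have names without spaces, since csv-uri is treated as a URI.

```shell
#!/usr/bin/env bash
# Sketch: run csv2xml.xsl once per CSV file.
# SAXON_JAR is an assumed location; point it at your own saxon9he.jar.
SAXON_JAR="${SAXON_JAR:-/path/to/saxon9he.jar}"

# Build the Saxon command for one CSV file into the global array SAXON_CMD.
# The output name is the input name with .csv replaced by .xml.
build_saxon_cmd() {
    local csv="$1"
    SAXON_CMD=(java -cp "$SAXON_JAR" net.sf.saxon.Transform
               -it:init -xsl:csv2xml.xsl
               "-o:${csv%.csv}.xml" "csv-uri=$csv")
}

# Convert every file given as an argument, reporting failures on stderr.
convert_all() {
    local csv
    for csv in "$@"; do
        build_saxon_cmd "$csv"
        "${SAXON_CMD[@]}" || echo "conversion failed: $csv" >&2
    done
}
```

After sourcing the file, calling convert_all *.csv from the directory that holds the stylesheet and the CSV files would write one .xml file per input.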
Upvotes: 2
Reputation: 2513
This is not the most elegant solution, but if I understand correctly you just want something simple. Changing your code as little as possible, I got:
NUM_OF_COLS=6
echo '<TariffRecords>' >> "$file_out"
while IFS=',' read -r -a arry
do
  tariff=" <Tariff>\n"
  for i in $(seq 1 $NUM_OF_COLS); do
    # Tags are numbered from 1, array indices from 0
    tariff="${tariff} <A$i>${arry[$((i-1))]}</A$i>\n"
  done
  tariff="${tariff} </Tariff>"
  echo -e "${tariff}" >> "$file_out"
done < <(tail -n +2 "$file_in")
echo '</TariffRecords>' >> "$file_out"
Things to note:
We skip the CSV header (the first line) with:
<(tail -n +2 "$file_in")
We generate a "foreach" loop over the range 1 to $NUM_OF_COLS, which gives the tag numbers (the array index is one less), with:
$(seq 1 $NUM_OF_COLS)
We append to the string with:
tariff="${tariff}......"
We use echo -e "..." so that the \n sequences become real newlines and the output is nicely formatted. The quotes matter: without them the shell would split the string and collapse the spacing. You could instead use another command-line utility such as xmllint for pretty-printing.
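As a self-contained variant of the loop above, here is a minimal sketch that takes the column count from the header row instead of a hard-coded number, so the same code should cope with 6 or 278 columns. The function name csv_to_xml is made up for this example, and it assumes a simple CSV: no quoted fields, no commas inside values, and a newline at the end of the last record.

```shell
#!/usr/bin/env bash
# Sketch: convert a simple CSV file to XML on standard output.
# Assumes no quoted fields and no commas inside values.
csv_to_xml() {
    local file_in="$1" ncols i
    local -a header arry

    # Read the header only to learn how many columns there are.
    IFS=',' read -r -a header < "$file_in"
    ncols=${#header[@]}

    echo '<?xml version="1.0"?>'
    echo '<TariffRecords>'
    # tail -n +2 starts at the second line, which skips the header.
    while IFS=',' read -r -a arry; do
        echo '  <Tariff>'
        for ((i = 1; i <= ncols; i++)); do
            # Tags are 1-based (A1..An); the array is 0-based.
            # A missing trailing field expands to an empty string.
            echo "    <A$i>${arry[i-1]}</A$i>"
        done
        echo '  </Tariff>'
    done < <(tail -n +2 "$file_in")
    echo '</TariffRecords>'
}
```

It could be called as csv_to_xml Prepaid_plan_voucher.csv > Prepaid_plan_voucher.xml. Values are copied as-is, so characters such as & or < in the data would still need escaping before the result is valid XML.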
EDIT: For multiple files
To process multiple files, replace the hardcoded:
file_in="Prepaid_plan_voucher.csv"
file_out="Prepaid_plan_voucher.xml"
by
file_in="$1" # Take the name as an argument from command line
file_out="${1%.csv}.xml" # Remove csv suffix and append xml
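and run the script from the command line once for every csv file, e.g. like this (globbing directly instead of parsing ls, and quoting "$f", keeps file names with spaces intact):
$ for f in *.csv; do ./ourscript.sh "$f"; done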
Upvotes: 2