Reputation: 121
Can anybody explain this small script to me?
echo -e "\"aa;bb\";cc ;\"dd ;ee\";
ff" | awk -v RS=\" -v ORS=\" 'NR%2==0{gsub(";",",")}
{print}'
In this script, fields are separated by semicolons (;), but if there are one or more semicolons inside a field, that field is surrounded by double quotes (""). It's a CSV file.
Therefore it is necessary to replace all semicolons inside these quoted fields for further parsing.
Upvotes: 1
Views: 499
Reputation: 36262
The echo prints two lines:
"aa;bb";cc ;"dd ;ee";
ff
awk splits the input into records at each double quote (RS=\"), and in the even-numbered records it replaces all semicolons with commas (gsub).
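To see the splitting on its own, here is a minimal sketch that prints every record together with its number. The function name show_records is made up for illustration, and printf is used instead of echo -e only because it behaves the same in every shell.

```shell
# Hypothetical helper: show how awk splits the sample input
# when the record separator is a double quote.
show_records() {
  printf '"aa;bb";cc ;"dd ;ee";\nff\n' |
    awk -v RS='"' '{ printf "%d: [%s]\n", NR, $0 }'
}

show_records
```

The quoted fields (aa;bb and dd ;ee) should show up as records 2 and 4, which is why the NR%2==0 test picks exactly the text that sits between a pair of quotes.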
So the first record is the content just before the first double quote. It's an empty record, but the important part is the condition NR%2==0. NR is 1, so the condition is false and gsub() is not executed. The record is printed followed by its ORS, so the output is just a double quote.
For the second record the content is aa;bb. NR%2==0 is true, so the semicolon is replaced with a comma.
For the third record the content is ;cc ;. NR%2==0 is false, so it is printed unchanged.
And so on until the end of the file.
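Putting the steps together, the whole pipeline can be wrapped in a small function to check the result. The name fix_quoted_fields is hypothetical, and printf stands in for echo -e; the awk program is the one from the question.

```shell
# Hypothetical wrapper around the pipeline from the question.
fix_quoted_fields() {
  printf '"aa;bb";cc ;"dd ;ee";\nff\n' |
    awk -v RS='"' -v ORS='"' 'NR%2==0 { gsub(";", ",") } { print }'
}

fix_quoted_fields
# Expected first two lines of output:
#   "aa,bb";cc ;"dd ,ee";
#   ff
```

One thing to watch for: the last record (;, a newline, ff and the trailing newline) is also printed followed by ORS, so the output should end with one extra double quote after ff. That may need to be stripped before further parsing.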
Upvotes: 2