Andry

Reputation: 121

awk: replace semicolons (;) inside quoted fields

Can anybody explain this small script to me?

echo -e "\"aa;bb\";cc ;\"dd ;ee\"; 
ff" | awk -v RS=\" -v ORS=\" 'NR%2==0{gsub(";",",")}
{print}'

The input is a CSV file whose fields are separated by semicolons (;). If a field contains one or more semicolons, that field is surrounded by double quotes.

Therefore the semicolons inside these quoted fields have to be replaced before further parsing.

Upvotes: 1

Views: 499

Answers (1)

Birei

Reputation: 36262

The echo prints two lines:

"aa;bb";cc ;"dd ;ee"; 
ff

awk then splits the input into records at each double quote (RS=\"), and in the even-numbered records it replaces all semicolons with commas (gsub).
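As a quick check of the overall effect, here is a minimal sketch that runs the same awk program on the same input. The function name fix_quoted is made up for this illustration, and a here-document replaces echo -e only to avoid differences in how shells handle escapes.

```shell
# fix_quoted is a hypothetical wrapper around the awk program from the question.
fix_quoted() {
  # RS='"'  : every double quote ends a record
  # ORS='"' : every printed record is followed by a double quote
  awk -v RS='"' -v ORS='"' 'NR % 2 == 0 { gsub(";", ",") } { print }'
}

# Same two input lines as in the question.
fix_quoted <<'EOF'
"aa;bb";cc ;"dd ;ee"; 
ff
EOF
```

Only the semicolons between quotes become commas. Note that ORS is written after every record, including the last one, so the output ends with one extra double quote after ff.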

The first record is whatever comes before the first double quote. Here it is empty, but it still counts for the condition NR%2==0: NR is 1, so the condition is false and gsub() is not executed. The empty record is printed followed by its ORS, so the output is just a double quote.

The second record is aa;bb. NR%2==0 is true, so the semicolon is replaced with a comma.

The third record is ;cc ;. NR%2==0 is false, so it is printed unchanged.

And so on until the end of the input.
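To see the record-by-record behaviour described above, the following sketch prints each record with its number before and after the substitution. The names input and trace are hypothetical helpers, and ORS is left at its default so that each record lands on its own line.

```shell
# Same sample input as in the question, stored in a variable for reuse.
input='"aa;bb";cc ;"dd ;ee"; 
ff'

# trace is a hypothetical helper: it shows every record awk sees.
trace() {
  printf '%s\n' "$input" |
    awk -v RS='"' '{
      before = $0
      if (NR % 2 == 0) gsub(";", ",")   # even record: text between quotes
      printf "NR=%d before=[%s] after=[%s]\n", NR, before, $0
    }'
}

trace
```

Records 2 and 4 (aa;bb and dd ;ee) are the ones that change. Record 5 is the remaining text after the last quote, and it contains the line break before ff, so its trace line spans several lines.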

Upvotes: 2
