branquito

Reputation: 4034

awk OFS, getting unexpected results

I can't see why I am getting an unexpected result here. Could someone shed some light on this?

These are the first 5 lines of the uk-500.csv input file:

"first_name","last_name","company_name","address","city","county","postal","phone1","phone2","email","web"
"Aleshia","Tomkiewicz","Alan D Rosenburg Cpa Pc","14 Taylor St","St. Stephens Ward","Kent","CT2 7PP","01835-703597","01944-369967","[email protected]","http://www.alandrosenburgcpapc.co.uk"
"Evan","Zigomalas","Cap Gemini America","5 Binney St","Abbey Ward","Buckinghamshire","HP11 2AX","01937-864715","01714-737668","[email protected]","http://www.capgeminiamerica.co.uk"
"France","Andrade","Elliott, John W Esq","8 Moor Place","East Southbourne and Tuckton W","Bournemouth","BH6 3BE","01347-368222","01935-821636","[email protected]","http://www.elliottjohnwesq.co.uk"
"Ulysses","Mcwalters","Mcmahan, Ben L","505 Exeter Rd","Hawerby cum Beesby","Lincolnshire","DN36 5RP","01912-771311","01302-601380","[email protected]","http://www.mcmahanbenl.co.uk"

When I run this command:

awk 'BEGIN { FS="\",?\"?"; OFS="=" } NR < 5 { print $3 }' uk-500.csv

I get:

last_name
Tomkiewicz
Zigomalas
Andrade

If I use:

awk 'BEGIN { FS="\",?\"?"; OFS="=" } NR < 5 { printf $3" " }' uk-500.csv

I get:

last_name Tomkiewicz Zigomalas Andrade

Why does awk ignore the OFS value in both cases? Shouldn't I get:

=last_name=Tomkiewicz=Zigomalas=Andrade=

ADDITION

While we are on the topic, it is worth mentioning that after changing FS and OFS one might expect a plain print or print $0 to output every record with its fields rejoined by OFS. This won't happen, because awk rebuilds $0 only when a field is modified, so this:

awk 'BEGIN { FS="\",?\"?"; OFS="=" } NR < 5 { print }' uk-500.csv

will yield this:

"first_name","last_name","company_name","address","city","county","postal","phone1","phone2","email","web"
"Aleshia","Tomkiewicz","Alan D Rosenburg Cpa Pc","14 Taylor St","St. Stephens Ward","Kent","CT2 7PP","01835-703597","01944-369967","[email protected]","http://www.alandrosenburgcpapc.co.uk"
"Evan","Zigomalas","Cap Gemini America","5 Binney St","Abbey Ward","Buckinghamshire","HP11 2AX","01937-864715","01714-737668","[email protected]","http://www.capgeminiamerica.co.uk"
"France","Andrade","Elliott, John W Esq","8 Moor Place","East Southbourne and Tuckton W","Bournemouth","BH6 3BE","01347-368222","01935-821636","[email protected]","http://www.elliottjohnwesq.co.uk"

The proper way of doing this would be:

awk 'BEGIN { FS="\",?\"?"; OFS="=" } NR < 5 { $1=$1; print }' uk-500.csv

Now the result is as expected:

=first_name=last_name=company_name=address=city=county=postal=phone1=phone2=email=web=
=Aleshia=Tomkiewicz=Alan D Rosenburg Cpa Pc=14 Taylor St=St. Stephens Ward=Kent=CT2 7PP=01835-703597=01944-369967=[email protected]=http://www.alandrosenburgcpapc.co.uk=
=Evan=Zigomalas=Cap Gemini America=5 Binney St=Abbey Ward=Buckinghamshire=HP11 2AX=01937-864715=01714-737668=[email protected]=http://www.capgeminiamerica.co.uk=
=France=Andrade=Elliott, John W Esq=8 Moor Place=East Southbourne and Tuckton W=Bournemouth=BH6 3BE=01347-368222=01935-821636=[email protected]=http://www.elliottjohnwesq.co.uk=

Upvotes: 0

Views: 170

Answers (1)

jaypal singh

Reputation: 77085

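OFS stands for Output Field Separator, and it defaults to a single space. awk inserts it only between the comma-separated arguments of print (for example, print $2, $3) and when it rebuilds $0 after a field assignment. print $3 outputs a single field, so there is nothing to separate, and printf never uses OFS at all.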
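
To see where OFS does get used, compare print with one argument against print with two comma-separated arguments. This is only a minimal sketch: the two-line sample and the helper names (sample, single_field, two_fields) are made up for illustration and stand in for the real uk-500.csv.

```shell
# Hypothetical stand-in for the first lines of uk-500.csv.
sample() {
  printf '%s\n' '"first_name","last_name"' '"Aleshia","Tomkiewicz"'
}

# One argument to print: there is nothing for OFS to separate.
single_field() {
  sample | awk 'BEGIN { FS="\",?\"?"; OFS="=" } { print $3 }'
}

# Two comma-separated arguments: print places OFS between them.
two_fields() {
  sample | awk 'BEGIN { FS="\",?\"?"; OFS="=" } { print $2, $3 }'
}

single_field
two_fields
```

The first call prints each last-name field on its own line with no "=" anywhere, while the second prints first_name=last_name and Aleshia=Tomkiewicz.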

What you are probably looking for is ORS, the Output Record Separator, which defaults to a newline and is appended after every print.

Setting the ORS will give you the following output.

$ awk 'BEGIN { FS="\",?\"?"; ORS="=" } NR < 5 { print $3 }' uk-500.csv
last_name=Tomkiewicz=Zigomalas=Andrade=

If you need a trailing newline, add an END block that prints one.
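
One way to do that, sketched on a made-up three-line sample (the sample data and the function names are hypothetical). Note that a plain print in the END block would append ORS again, so this sketch uses printf, which ignores ORS:

```shell
# Hypothetical stand-in for the first lines of uk-500.csv.
sample() {
  printf '%s\n' '"first_name","last_name"' '"Aleshia","Tomkiewicz"' '"Evan","Zigomalas"'
}

# ORS="=" joins the records; the END block finishes the line.
joined_with_newline() {
  sample | awk 'BEGIN { FS="\",?\"?"; ORS="=" }
                { print $3 }
                END { printf "\n" }'
}

joined_with_newline
```

This prints last_name=Tomkiewicz=Zigomalas= followed by a single newline.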

Upvotes: 1
