Awknewbie

Reputation: 269

AWK: Replace commas in quoted strings with underscores and remove the quotes

Data:

1.txt

-1,"AAA",aaa@ymail
-10,"B ,BB","b, bb@ymail"
-7,C,c@gmail

I want to: 1. Replace each comma inside a quoted string with an underscore. 2. Remove the quotes afterwards.

I am using the command below:

awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub("[,]", "_", $i) } 1' 1.txt > 2.txt

Output (note that the quotes around "AAA" in the first line are not removed, because that field contains no comma):

-1,"AAA",aaa@ymail
-10,"B ,BB",b_ bb@ymail
-7,C,c@gmail

So I additionally run:

awk -F'"' -v OFS='' '{ for (i=1; i<=NF; i+=1) } 1' 2.txt > 3.txt

-1,AAA,aaa@ymail
-10,"B ,BB",b_ bb@ymail
-7,C,c@gmail

Please suggest a better way of doing this.

Upvotes: 1

Views: 1833

Answers (2)

glenn jackman

Reputation: 247210

I'd select a language with a proper CSV parser. For example, Ruby:

ruby -r csv -ne '
  row = CSV.parse_line($_).collect {|f| f.gsub(/,/,"_")}
  puts CSV.generate_line(row)
' <<END
-1,"AAA",aaa@ymail
-10,"B ,BB","b, bb@ymail"
-7,C,c@gmail
END
-1,AAA,aaa@ymail
-10,B _BB,b_ bb@ymail
-7,C,c@gmail

Upvotes: 2

user000001

Reputation: 33387

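awk only rebuilds the record when a field is assigned, which is why lines where gsub changes nothing keep their quotes. If you add a $1=$1 assignment, every record is rebuilt by joining its fields with OFS (empty here), so the field separator (the quote) no longer appears in the output.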

$ awk -F'"' -v OFS='' '{ for (i=2; i<=NF; i+=2) gsub("[,]", "_", $i);$1=$1 } 1' file
-1,AAA,aaa@ymail
-10,B _BB,b_ bb@ymail
-7,C,c@gmail
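
To see the rebuild effect in isolation, here is a minimal sketch that runs the first sample line through awk twice, once without and once with the $1=$1 assignment. The variable names line, unchanged and rebuilt are only illustrative and not part of the original commands.

```shell
# First sample line from the question; it has quotes but no comma inside them.
line='-1,"AAA",aaa@ymail'

# No field is assigned, so awk prints $0 untouched and the quotes survive.
unchanged=$(printf '%s\n' "$line" | awk -F'"' -v OFS='' '1')

# $1 = $1 assigns a field, so awk rejoins the fields with the empty OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F'"' -v OFS='' '{ $1 = $1 } 1')

printf '%s\n%s\n' "$unchanged" "$rebuilt"
```

The first printed line should still contain the quotes and the second should not. Note that splitting on the quote character assumes the data never contains escaped quotes inside a field; for such input a real CSV parser is the safer choice.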

Upvotes: 1
