Reputation: 5735
I need some advice on a Bash one-liner that I'm trying to write with awk.
I have a variable holding comma-separated values, like so:
"abc,abd,abf,abz,abz"
Getting each field is easy with a simple awk loop:
echo "${var}" | awk -F"," '{for(i=1;i<=NF;i++){print $i}}'
The problem is that sometimes one of these comma-separated values is a quoted string with commas in the middle, e.g.:
"abc,"abd,abf,abz",abh,abr,alk"
In this case "abd,abf,abz" is a single value. I need to tell awk that whatever is between quotes must be treated as one value and not split, but I'm getting nowhere. Any advice?
Upvotes: 1
Views: 141
Reputation: 1737
Check out csvtool, a program for manipulating CSV files. It can be installed with apt-get (or whatever your package manager is) and used in your Bash scripts to work with CSV data.
Upvotes: 0
Reputation: 85785
Firstly, you don't need a loop at all for the first example:
$ awk '{print}' RS=',' <<< 'abc,abd,abf,abz,abz'
abc
abd
abf
abz
abz
For the second example you really want a proper CSV parser. Here is a Python solution:
#!/usr/bin/env python
from csv import reader, writer
from sys import stdin, stdout
writer(stdout, delimiter='\n').writerows(reader(stdin))
Demo:
$ cat file
abc,"abd,abf,abz",abh,abr,alk
$ csv_delimiter.py < file
abc
abd,abf,abz
abh
abr
alk
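If the value lives in a shell variable rather than a file, the same csv module can be applied to a single string. This is only a minimal sketch, assuming the input uses ordinary CSV double-quoting; the helper name split_fields is made up for illustration and is not part of the script above:

```python
import csv


def split_fields(line):
    """Return the fields of one CSV line, keeping quoted fields whole.

    split_fields is a hypothetical helper name, not part of the csv module.
    """
    # csv.reader accepts any iterable of strings, so a one-item list works.
    return next(csv.reader([line]), [])


for field in split_fields('abc,"abd,abf,abz",abh,abr,alk'):
    print(field)
# prints abc, abd,abf,abz, abh, abr and alk, one per line
```

Printing the fields directly also avoids depending on the writer accepting '\n' as a delimiter.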
Upvotes: 1
Reputation: 372
The best I could do with awk:
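$ echo 'abc,"xxx,yyy,zzz",abh,abr,alk' | awk -F'"' '{
    for (i = 1; i <= NF; i++) {
        if (i % 2 == 0) {
            # even fields were inside double quotes: print whole, re-quoted
            printf "\"%s\"\n", $i
        } else {
            # odd fields are plain text: split on commas
            n = split($i, array, ",")
            for (j = 1; j <= n; j++) {
                # skip the empty pieces left next to a quoted field
                if (array[j] != "") print array[j]
            }
        }
    }
}'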
abc
"xxx,yyy,zzz"
abh
abr
alk
This does give empty lines though :(, I'm still trying to find out why.
Update: Fixed + indented
Upvotes: 1
Reputation: 203512
If the first/last double quotes you show in your sample input are actually not present in your input then:
$ echo 'abc,"abd,abf,abz",abh,abr,alk' |
awk -F\" '{
    for (i=1;i<=NF;i++) {
        if (i%2) {
            gsub(/^,|,$/,"",$i)
            nf = split($i,a,/,/)
            for (j=1; j<=nf; j++) {
                print a[j]
            }
        }
        else {
            print $i
        }
    }
}'
abc
abd,abf,abz
abh
abr
alk
If they are present then:
$ echo '"abc,"abd,abf,abz",abh,abr,alk"' |
awk -F\" '{
    for (i=2;i<NF;i++) {
        if ( !(i%2) ) {
            gsub(/^,|,$/,"",$i)
            nf = split($i,a,/,/)
            for (j=1; j<=nf; j++) {
                print a[j]
            }
        }
        else {
            print $i
        }
    }
}'
abc
abd,abf,abz
abh
abr
alk
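For comparison, the same quote-parity idea can be sketched in Python: split on the double quote, treat every second piece as a quoted value, and split the remaining pieces on commas. The function name split_quoted and its outer_quotes flag are hypothetical, and the sketch assumes quotes are balanced and never escaped inside a value:

```python
import re


def split_quoted(line, outer_quotes=False):
    """Split on commas, keeping double-quoted runs whole (quote parity).

    split_quoted and outer_quotes are illustrative names only.
    """
    if outer_quotes and len(line) >= 2 and line[0] == line[-1] == '"':
        line = line[1:-1]  # drop the pair of quotes wrapping the whole input
    fields = []
    for index, piece in enumerate(line.split('"')):
        if index % 2:
            # odd pieces sat between a pair of quotes: keep them whole
            fields.append(piece)
        else:
            # even pieces are plain text: drop the comma that touched a
            # quote, then split whatever is left on commas
            piece = re.sub(r'^,|,$', '', piece)
            if piece:
                fields.extend(piece.split(','))
    return fields


print(split_quoted('abc,"abd,abf,abz",abh,abr,alk'))
print(split_quoted('"abc,"abd,abf,abz",abh,abr,alk"', outer_quotes=True))
# both calls print ['abc', 'abd,abf,abz', 'abh', 'abr', 'alk']
```

Like the awk versions, this breaks on escaped quotes or unbalanced input, which is where a real CSV parser is the safer choice.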
Upvotes: 1