Reputation: 775
I have the following csv file with 3 columns:
row1value1,row1value2,"row1
multi
line
value"
row2value1,row2value2,"row2
multi
line
value"
Is there a way to loop through its rows like (this does not work, it reads lines):
while read $ROW
do
#some code that uses $ROW variable
done < file.csv
Upvotes: 0
Views: 2103
Reputation: 21
Here is a pure bash solution. The multiline_csv.sh
script translates the multiline csv into standard csv by replacing the newline characters between quotes with some replacement string. So the usage is
./multiline_csv.sh CSVFILE SEP
I placed your example script in a file called ./multi.csv
. Running the command ./multiline_csv.sh ./multi.csv "\n"
yielded the following output
[ericthewry@eric-arch-pc stackoverflow]$ ./multiline_csv.sh ./multi.csv "\n"
r1c2,r1c2,"row1\nmulti\nline\nvalue"
r2c1,r2c2,"row2\nmultiline\nvalue"
This can be easily translated back to the original csv file using printf
:
[ericthewry@eric-arch-pc stackoverflow]$ printf "$(./multiline_csv.sh ./multi.csv "\n")\n"
r1c2,r1c2,"row1
multi
line
value"
r2c1,r2c2,"row2
multiline
value"
This might be an Arch-specific quirk of echo/sprintf (I'm not sure), but you could use some other separator string like ~~~++??//NEWLINE\\??++~~~
that you could sed
out if need be.
# multiline_csv.sh
open=0
line_is_open(){
quote="$2"
(printf "$1" | sed -e "s/\(.\)/\1\n/g") | (while read char; do
if [[ "$char" = '"' ]]; then
open=$((($open + 1) % 2))
fi
done && echo $open)
}
cat "$1" | while read ln ; do
flatline="${ln}"
open=$(line_is_open "${ln}" $open)
until [[ "$open" = "0" ]]; do
if read newln
then
flatline="${flatline}$2${newln}"
open=$(line_is_open "${newln}" $open)
else
break
fi
done
echo "${flatline}"
done
Once you've done this translation, you can proceed as you would normally via the while read $ROW do ... done
method.
Upvotes: 1
Reputation: 785621
Using gnu-awk you can do this using FPAT
:
awk -v RS='"\n' -v FPAT='"[^"]*"|[^,]*' '{
print "Record #", NR, " =======>"
for (i=1; i<=NF; i++) {
sub(/^"/, "", $i)
printf "Field # %d, value=[%s]\n", i, $i
}
}' file.csv
Record # 1 =======>
Field # 1, value=[row1value1]
Field # 2, value=[row1value2]
Field # 3, value=[row1
multi
line
value]
Record # 2 =======>
Field # 1, value=[row2value1]
Field # 2, value=[row2value2]
Field # 3, value=[row2
multi
line
value]
However, as I commented above a dedicated CSV parser using PHP, Perl or Python will be more robust for this job.
Upvotes: 1