js352

Reputation: 374

IFS not parsing CSV correctly

I am trying to parse a file so I can obtain the first column. The command I'm using is:

while IFS=',' read -r a; do echo "$a"; done < test.csv

However, it still outputs the whole CSV instead of the first column. An example of the CSV is as follows:

NOM,CODI,DATA,SEXE,GRUP_EDAT,RESIDENCIA,CASOS_CONFIRMAT,PCR,INGRESSOS_TOTAL,INGRESSOS_CRITIC,INGRESSATS_TOTAL,INGRESSATS_CRITIC,EXITUS
MOIANÈS,42,24/08/2020,Home,Majors de 74,No,0,2,0,0,0,0,0
ALT CAMP,01,30/07/2020,Dona,Entre 15 i 64,Si,0,0,0,0,0,0,0
ALT CAMP,01,30/07/2020,Dona,Entre 65 i 74,No,0,1,0,0,0,0,0
ALT CAMP,01,30/07/2020,Dona,Entre 65 i 74,Si,0,0,0,0,0,0,0

I've been looking elsewhere, and everyone seems to agree that this is the correct approach when parsing CSV with IFS. One thing I've noticed is that if I add a second variable to the read command, say b, it outputs the first column instead of everything.

while IFS=',' read -r a b; do echo "$a"; done < test.csv

I don't understand this behaviour, and it does not seem to work beyond printing the first column. For example, if I were to add c and echo $c, it wouldn't print the third column, and so on.

Can you please explain this behaviour and why this is happening?

Thank you

Upvotes: 2

Views: 723

Answers (3)

chepner

Reputation: 531808

For simple CSV files, you can split on every comma, but unless you know the number of columns in every row, you will want to read the input into an array.

For example, if you know there are going to be (at most) 10 columns, you can use

while IFS=, read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10; do

However, in bash it is simpler to read the entire split line into a single array:

while IFS=, read -ra f; do

The first field would be "${f[0]}", the second "${f[1]}", etc.
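
As a rough, complete version of the array loop above, the following sketch writes a small sample file and reads it back. It assumes bash (arrays and `read -a` are not available in plain sh); the temporary file and the `first_column` array are illustrative names only, and the rows are shortened from the question's data.

```shell
#!/usr/bin/env bash
# Create a small sample CSV in a temporary file (illustrative only).
csv=$(mktemp)
printf '%s\n' \
    'NOM,CODI,DATA,SEXE' \
    'ALT CAMP,01,30/07/2020,Dona' \
    'ALT CAMP,01,30/07/2020,Home' > "$csv"

first_column=()
while IFS=, read -ra f; do
    # f[0] is the first field; ${#f[@]} is the number of fields on this line.
    first_column+=("${f[0]}")
    echo "first=${f[0]} third=${f[2]} fields=${#f[@]}"
done < "$csv"

rm -f "$csv"
```

Because every field gets its own array element, no variable ends up holding "the rest of the line", whatever the number of columns.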

Upvotes: 2

stark

Reputation: 13189

read is working correctly. It splits on IFS and assigns each field to a variable, with the remainder of the line going to the last variable. If you only give one variable, the whole line goes to it.
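
A small sketch of that rule, using one row from the question. The variable names `one`, `two_a`, `two_b` and `rest` are only for illustration; here-documents are used so that it should also run in plain sh.

```shell
#!/bin/sh
# One sample row taken from the question's CSV.
line='ALT CAMP,01,30/07/2020,Dona,Entre 15 i 64,Si,0,0,0,0,0,0,0'

# One variable: there is no "last variable" to absorb a remainder,
# so a receives the whole line.
IFS=, read -r a <<EOF
$line
EOF
one=$a

# Two variables: a gets field 1, b gets everything after the first comma.
IFS=, read -r a b <<EOF
$line
EOF
two_a=$a
two_b=$b

# Three variables plus a throwaway: c is exactly field 3,
# and the remainder lands in rest instead.
IFS=, read -r a b c rest <<EOF
$line
EOF

printf 'one variable:    a=%s\n' "$one"
printf 'two variables:   a=%s | b=%s\n' "$two_a" "$two_b"
printf 'with throwaway:  c=%s | rest=%s\n' "$c" "$rest"
```

This is why adding b made the first column appear on its own, and why a third variable c would still carry the rest of the line unless a further variable follows it.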

Upvotes: 3

anubhava

Reputation: 785551

bash is not the right tool for parsing a CSV file, and you should consider awk for this. For example, to print the first 2 columns, use this simple awk command:

awk -F, '{print $1, $2}' file.csv
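
For the question's actual goal (only the first column), a variation might look like the sketch below. Skipping the header row with `NR > 1` is an assumption about what is wanted, and the temporary file stands in for test.csv.

```shell
#!/bin/sh
# Sample data shortened from the question, written to a temporary file.
csv=$(mktemp)
printf '%s\n' \
    'NOM,CODI,DATA,SEXE' \
    'ALT CAMP,01,30/07/2020,Dona' \
    'ALT CAMP,01,30/07/2020,Home' > "$csv"

# -F, sets the field separator to a comma; NR > 1 skips the header line.
names=$(awk -F, 'NR > 1 {print $1}' "$csv")
printf '%s\n' "$names"

rm -f "$csv"
```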

Just to highlight your issue: in your bash loop, it is better to read all comma-separated columns into an array:

while IFS=, read -ra arr; do
    # print first 2 columns
    echo "col1=${arr[0]}, col2=${arr[1]}"
done < file.csv

Upvotes: 2
