Berk U.

Reputation: 7188

Extracting data from a column in a CSV file with headers and input into an array in Bash

I am trying to write a simple Bash script that extracts the data from one column of a CSV file and stores it as an array. My question is very similar to that of a previous post, but I am having trouble getting the proposed solution from that post to work (possibly because my CSV file has headers).

Specifically, I have a CSV file, weights.csv, with two columns:

w_neg,w_pos
1.000,1.000
0.523,1.477
0.210,1.790
1.420,0.580

and I would like to create an array variable, w_pos, that will contain the entire second column of weights.csv.

w_pos=(1.000 1.477 1.790 0.580)

Based on the answer from this previous post, I tried to do this using the following line of code:

w_pos=( $(cut -d ',' -f2 weights.csv ) )

Unfortunately, this seems to store only the first value of the column. The first element prints as expected:

echo ${w_pos[0]}
1.000

but

echo ${w_pos[1]}

yields nothing.

I would appreciate any insight into what the problem might be. Ideally, I would like a solution that does not use packages other than what would be bundled with a barebones Unix installation (the script has to run on a cluster that doesn't have simple tools like "bc" :-/)

Upvotes: 1

Views: 1003

Answers (3)

Md Shihab Uddin

Reputation: 561

I like to use awk. Here is a way you can try:

w_neg=($(tail -n +2 weights.csv | awk -F ',' '{print $1;}'))
w_pos=($(tail -n +2 weights.csv | awk -F ',' '{print $2;}'))
echo ${w_neg[1]}
echo ${w_pos[1]}

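Note that Bash arrays are indexed from 0, so index 1 above refers to the second data row; the first data row is at index 0. The header is skipped by tail -n +2.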

Upvotes: 0

jaypal singh

Reputation: 77105

Here is a way using bash:

while IFS=, read -r col1 col2; do
    [[ $col2 =~ ^[0-9] ]] && w_pos+=( "$col2" )
done < weights.csv

declare -p w_pos

Output:

declare -a w_pos='([0]="1.000" [1]="1.477" [2]="1.790" [3]="0.580")'
  • We set the field delimiter to a comma by setting IFS for the read command.
  • We then read the two columns into two variables, col1 and col2.
  • We append the second column to the w_pos array only if it starts with a digit. [[ $col2 =~ ^[0-9] ]] does that test for us, and it is also what skips the header row.
  • declare -p prints the structure of the array.
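
A variant of the loop above can skip the header by position rather than by content. This is only a sketch: it assumes the first line is always a header, recreates the sample file from the question in a temporary file, and uses illustrative variable names (csv, _header, _col1, _rest) that are not part of the original answer. It also strips a trailing carriage return, in case the file was saved with Windows line endings.

```shell
#!/usr/bin/env bash
# Recreate the sample file from the question in a temporary location.
csv=$(mktemp)
printf '%s\n' 'w_neg,w_pos' '1.000,1.000' '0.523,1.477' '0.210,1.790' '1.420,0.580' > "$csv"

w_pos=()
{
    read -r _header                        # consume and discard the header line
    while IFS=, read -r _col1 col2 _rest; do
        col2=${col2%$'\r'}                 # drop a trailing CR, if there is one
        w_pos+=( "$col2" )                 # append the second field
    done
} < "$csv"

rm -f "$csv"
declare -p w_pos
```

Skipping by position keeps values that do not start with a digit (for example a negative weight such as -0.5), which the regex test would drop.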

Upvotes: 2

Vytenis Bivainis

Reputation: 2376

Here's a solution. tail -n +2 starts the output at line 2, so the header never reaches cut:

w_neg=($(tail -n +2 weights.csv | cut -d, -f1))
w_pos=($(tail -n +2 weights.csv | cut -d, -f2))
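
To check what ended up in the array, something like the following should work. It is a minimal sketch that recreates the sample file from the question in a temporary file (the csv variable is only for illustration) and prints every index alongside its value.

```shell
#!/usr/bin/env bash
# Recreate the sample file from the question in a temporary location.
csv=$(mktemp)
printf '%s\n' 'w_neg,w_pos' '1.000,1.000' '0.523,1.477' '0.210,1.790' '1.420,0.580' > "$csv"

# Skip the header with tail, then take the second comma-separated field.
w_pos=( $(tail -n +2 "$csv" | cut -d, -f2) )

echo "count: ${#w_pos[@]}"           # number of elements stored
for i in "${!w_pos[@]}"; do          # iterate over the array indices
    echo "w_pos[$i] = ${w_pos[$i]}"
done

rm -f "$csv"
```

With the sample data this should report four elements, at indices 0 through 3. The unquoted command substitution splits on whitespace, which is fine for plain numeric values but would break fields that contain spaces.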

Upvotes: 1
