Splitting one CSV into multiple files based on field value

Question

I have a CSV file that looks like this:

system,subject,value1,value2
example.org,thing 1,100,4
example.org,thing 2,90,0
example.com,thing 1,200,0
example.com,thing 5,10,10

Note: the header is not actually included in the file; it is shown here only to make the example easier to read.

And I want to split that up into two files:

example.org.csv with:

thing 1,100,4
thing 2,90,0

example.com.csv with:

thing 1,200,0
thing 5,10,10

My current solution works this way:

while read line; do
  SYSTEM=$(echo "$line" | cut -d, -f1)
  NOTTHESYSTEM=$(echo "$line" | cut -d, -f2-)
  echo "${NOTTHESYSTEM}" >> "${SYSTEM}.csv"
done <$INPUT

But this is very inefficient and doesn't perform well with bigger files.

In numbers: a 52050-line, 9 MB file needs about 250 seconds to split.

Any suggestions on how to improve the script above are welcome.

Cheers

anubhava · Accepted Answer

Using awk is simpler, and it handles the whole file in a single process instead of running cut twice for every line:

awk 'BEGIN{FS=OFS=","} {print $2, $3, $4 > ($1 ".csv")}' "$INPUT"

Verification:

cat example.org.csv
thing 1,100,4
thing 2,90,0

cat example.com.csv
thing 1,200,0
thing 5,10,10
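
If the rows do not always have exactly three columns after the system name, a variant along these lines might be easier to maintain. It is only a sketch: it assumes plain comma-separated data without quoted fields, and the file name input.csv and its sample rows are made up here for illustration. Instead of listing $2, $3, $4, it strips the first field from the line and prints whatever remains.

```shell
# Hypothetical sample input; the name input.csv is an assumption for this sketch.
INPUT=input.csv
printf '%s\n' \
  'example.org,thing 1,100,4' \
  'example.org,thing 2,90,0' \
  'example.com,thing 1,200,0' \
  'example.com,thing 5,10,10' > "$INPUT"

awk -F, '{
  out = $1 ".csv"          # remember the target file before changing the line
  sub(/^[^,]*,/, "")       # drop the first field and its comma, keep the rest as-is
  print > out              # awk truncates each file once, then keeps appending
}' "$INPUT"
```

Note that the awk versions overwrite existing output files on each run, while the original loop appends with >>. With a very large number of distinct systems, some awk implementations may hit a limit on simultaneously open files; in that case, switching to print >> out followed by close(out) is one possible workaround, at some cost in speed.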
