Reputation: 13867
I do have a CSV which looks like this*:
system,subject,value1,value2
example.org,thing 1,100,4
exmaple.org,thing 2,90,0
example.com,thing 1,200,0
example.com,thing 5,10,10
The header us actually not included, but shown here to make it easier to read the example.
And I want to split that up into two files:
example.org.csv with:
thing 1,100,4
thing 2,90,0
example.com.csv with:
thing 1,200,0
thing 5,10,10
My current solution works this way:
while read line; do
SYSTEM=$(echo "$line" | cut -d, -f1)
NOTTHESYSTEM=$(echo "$line" | cut -d, -f2-)
echo "${NOTTHESYSTEM}" >> "${SYSTEM}.csv"
done <$INPUT
But this is working very inefficient and doesn't perform well with bigger files.
In numbers this means that a 52050 line/ 9 MB file needs about 250 secounds to finish the split.
Any suggestions how to improve the script above are welcome.
Cheers
Upvotes: 3
Views: 1128
Reputation: 3086
One way to do it, assuming you have a reasonably small number of systems:
cut -d, -f1 file.csv | \
sort -u | \
while read -r system; do
fgrep -w "$system" file.csv | cut -d, -f2- >"$system".csv
done
Upvotes: 1
Reputation: 784938
Using awk
it will be simpler:
awk 'BEGIN{FS=OFS=","} {print $2, $3, $4 > $1 ".csv"}' "$INPUT"
Verification:
cat example.org.csv
thing 1,100,4
thing 2,90,0
cat example.com.csv
thing 1,200,0
thing 5,10,10
Upvotes: 4