Reputation: 139
I have an employees.csv file with around 500 rows and 11 columns, column fileds are limited by double quotes:
"1","Paula","Paula's Role","Paula's Job Description","Paula's Department","11/10/2008","8","14","10","24","0
"2","John","John's Role","John's Job Description","John's Department","11/10/2008","2","17","6","11","0"
"3","Mark","Mark's Role","Mark's Job Description","Mark's Department","11/10/2008","4","17","13","44","0"
:
:
(more records)
:
:
"499","Maria","Maria's Role","Maria's Job Description","Maria's Department","11/10/2008","8","15","2","9","0"
"500","Peter","Peter's Role","Peters's Job Description","Peters's Department","11/10/2008","8","17","16","22","0"
I'm trying to figure out how to split such file in multiple csv (one line = one file), according to the first field (unique employee ID number). The output of the command shall be 500 separate csv files, containing 1 row each one, and named as follows:
1.csv
2.csv
3.csv
:
:
:
499.csv
500.csv
I've been trying with combination of cat and awk, but there is some mistake in the code:
for i in $(cat unix | awk -F\, '{print $1}' /myfolder/employees.csv);
do
grep $i "/myfolder/employees.csv" > "/myfolder/splittedfiles/$i";
done
Many thanks.
Upvotes: 2
Views: 808
Reputation: 10149
You could use GNU awk like this:
awk 'BEGIN {FPAT="[^\"]+"} { print $0 > "/myfolder/splittedfiles/"$1".csv" }' yourfile
FPAT
defines field content by a regular expression, here it helps us to strip the quotes from $1
.
Upvotes: 2
Reputation: 17041
Edit (and this one is tested), this gawk
script does the job for me:
gawk -F'"' -- '{print $0 >> ("/myfolder/splittedfiles/" $2 ".csv")}' /myfolder/employees.csv
The -F'"'
splits fields at "
, so the employee number is in $2
. Then ("/myfolder/splittedfiles/" $2 ".csv")
builds the filename you want, and print $0 >> ...
prints the original line to that file.
Or if the fields are always in numerical order starting from 1, this should work (not tested)
split -l 1 /myfolder/employees.csv /myfolder/splittedfiles/EMPL
empno=1
for fname in /myfolder/splittedfiles/EMPL* ; do
mv "$f" "/myfolder/splittedfiles/${empno}.csv"
empno=$((empno+1))
done
split
makes each line (-l 1
) a separate file. The for
loops over those files in order. The mv
renames each file to be ${empno}.csv
, starting from empno=1
. Then $((empno+1))
increments empno
.
Upvotes: 0