Reputation: 1
I have 3 CSV files:
Base File(values initialised with 0)
steve tignor ash michael jose sam joshua
0 0 0 0 0 0 0
File 1:
tignor michael jose
888 9 -2
File 2:
ash joshua
77 66
Output I need:
steve tignor ash michael jose sam joshua
File1 0 888 0 9 -2 0 0
File2 0 0 77 0 0 0 66
I tried with sorting the files first with awk and then merge with paste but as I have 1000+ columns and having 30 files it just did not work.
Code:
awk -F"," 'NR==1{
split($0,a,FS);asort(a);
for(i=1;i<=NF;i++)b[$i]=i
} {
for(i=1;i<=NF;i++)printf("%s,",$(b[a[i]]));
print x
}' File1 > 1.csv
awk -F"," 'NR==1{
split($0,a,FS);asort(a);
for(i=1;i<=NF;i++)b[$i]=i
} {
for(i=1;i<=NF;i++)printf("%s,",$(b[a[i]]));
print x
}' File2 > 2.csv
paste -d"\n" 1.csv 2.csv > merge.csv
Need some assistance here. Thanks in advance.
Upvotes: 1
Views: 384
Reputation: 173
I assumed that you omitted the commas in the files. If you're using space separated files you could just change the separator used in the split function.
awk '
ARGIND==1 && FNR==1{
split($0, base, ",")
printf("file,%s\n",$0)
}
ARGIND > 1 && FNR==1{
split($0, names, ",")
printf("%s", ARGV[ARGIND])
}
ARGIND > 1 && FNR==2{
split($0, values, ",")
for(i in names)
line[names[i]] = values[i]
for(i in base){
if(base[i] in line)
printf(",%s", line[base[i]])
else
printf(",0")
}
delete line
print ""
}
' base.csv file1.csv file2.csv
Example:
file1.csv:
tignor,michael,jose
888,9,-2
file2.csv:
ash,joshua
77,66
and base.csv:
steve,tignor,ash,michael,jose,sam,joshua
0,0,0,0,0,0,0
the output is:
file,steve,tignor,ash,michael,jose,sam,joshua
file1.csv,0,888,0,9,-2,0,0
file2.csv,0,0,77,0,0,0,66
Basically, the script is running in 2 steps:
P.S. I made a new POSIX awk compatible version of the script:
awk --posix '
NR==FNR && FNR==1{
split($0, base, ",")
printf("file,%s\n",$0)
}
NR>FNR && FNR==1{
split($0, names, ",")
printf("%s", FILENAME)
}
NR>FNR && FNR==2{
split($0, values, ",")
for(i in names)
line[names[i]] = values[i]
for(i in base){
if(base[i] in line)
printf(",%s", line[base[i]])
else
printf(",0")
}
delete line
print ""
}
' base.csv file1.csv file2.csv
Upvotes: 1