Reputation: 1
I get the error below when I try to concatenate a large number of CSV files:
awk: cmd. line:1: (FILENAME=rawdata.2018-01-14.csv.bkp FNR=1069) fatal: cannot open pipe `date "+%F %T" -d "Jan 13 22:00:12 2018"1' (Too many open files)
awk: cmd. line:1: (FILENAME=rawdata.2018-01-15.csv.bkp FNR=1070) fatal: cannot open pipe `date "+%F %T" -d "Jan 13 22:00:12 2018"1' (Too many open files)
and so on, up to FNR=1074.
Of the 60 files, the first 44 are processed without problems; the remaining 16 fail with this error during processing.
Code :
for i in rawdata.*.csv;
do
echo $i;
awk '{if($0) printf("%s\t%s\n", FILENAME, $0); else print FILENAME;}' $i > $i.bk;
sed -e "1,2d" $i.bk > $i.bkp
awk -e '{tempdate="date \"+%F %T\" -d \""$6" "$7" "$8" "$9"\"" tempdate | getline tmpdate; print tmpdate "\t" "source-" $1 "\t" $2 "\t" $3 "\t" $4 "\t" $9 "\t" $10 "\t" $11 ; close(tempdate) }' $i.bkp | sed 's/.//5' > $i.bakp
done
cat rawdata.*.bakp > rawdatacombnew.csv
rm rawdata.*.bk
rm rawdata.*.bkp
rm rawdata.*.bakp
Any suggestions would be very helpful.
One thing I noticed is that the file size increases starting with the 45th file in my example. Is size an issue?
Thanks.
Upvotes: 0
Views: 719
Reputation: 203995
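You're missing a semicolon between tempdate="..." and tempdate | getline, so awk parses the assignment and the pipe as one expression. The string piped to getline isn't the command you meant to build (note the stray 1 after the closing quote in your error message), and by the time you call close(tempdate) the variable no longer holds the command that was opened, so the pipes are never closed and pile up until you hit the open-files limit.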
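To see what the missing semicolon does, here is a small sketch that writes out the grouping explicitly. It assumes awk groups the statement as "assign to the variable the result of (string concatenated with variable) | getline", which is how the gawk manual describes unparenthesized concatenation in front of | getline; other awks may group it differently. The function name show_parse and the echo command are made up for illustration.

```shell
# Hypothetical demo: explicit form of the accidental statement
#   cmd = "echo hello" cmd | getline out
show_parse() {
  awk 'BEGIN {
    # cmd starts out empty, so the command that runs is just "echo hello"
    cmd = (("echo hello" cmd) | getline out)
    print "out=" out   # the line read from the pipe
    print "cmd=" cmd   # the return value of getline, not the command string
  }'
}
```

Calling show_parse prints out=hello and cmd=1. In the question's script that means close(tempdate) is effectively close(1), which matches no open pipe, so each distinct date command stays open. That would also explain why only the larger files fail: they contain more lines than the process is allowed to hold open pipes.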
There's no benefit to cramming your scripts onto single lines. Just write them naturally and they'll be much easier to read and to spot issues in:
awk -v OFS='\t' '{
    tempdate = "date \"+%F %T\" -d \"" $6 " " $7 " " $8 " " $9 "\""
    if ( (tempdate | getline tmpdate) > 0 ) {
        print tmpdate, "source-" $1, $2, $3, $4, $9, $10, $11
    }
    close(tempdate)
}' "$i.bkp"
I tidied up a couple of other things while I was at it.
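If the timestamp fields always look like the ones in the error message (Jan 13 22:00:12 2018), one alternative is to skip the external date call and build the string inside awk, which avoids opening any pipes at all. This is only a sketch under that assumption: fmt_dates is a hypothetical name, and it assumes $6 is a three-letter English month abbreviation, $7 the day, $8 the time, and $9 a four-digit year.

```shell
# Hypothetical helper: prints "YYYY-MM-DD HH:MM:SS" plus the selected fields,
# tab-separated, without spawning date for every input line.
fmt_dates() {
  awk -v OFS='\t' '
    {
      # Position of the abbreviation in the lookup string gives the month:
      # Jan -> 1, Feb -> 4, ... so (position + 2) / 3 is the month number.
      mon = (index("JanFebMarAprMayJunJulAugSepOctNovDec", $6) + 2) / 3
      stamp = sprintf("%04d-%02d-%02d %s", $9, mon, $7, $8)
      print stamp, "source-" $1, $2, $3, $4, $9, $10, $11
    }' "$@"
}
```

It could be called as fmt_dates "$i.bkp" in place of the second awk command. Since no command is run per line, there is nothing to close and the open-files limit never comes into play, and it should also be considerably faster on large files. Unlike date -d, it does no validation, so a malformed month or day passes through silently.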
Upvotes: 1