Danny

Reputation: 65

Drop a column from a delimited file and save it to existing filename

Does anyone know how to drop one column from a delimited file that has hundreds of columns and save the result under the existing filename? I also have to do this for several files; is it possible to process them all in one go? I have been searching and experimenting, but with no luck so far. Thanks to anyone who can help.

awk -F, '{for(i=1;i<=NF;i++)if(i!=x)f=f?f FS $i:$i;print f;f=""}' x=2 file > file1

Is this the correct approach?

Upvotes: 1

Views: 50

Answers (1)

codeforester

Reputation: 42999

Your awk code looks good. However, cut may be your best friend if you want to speed things up:

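# let's say we want to take out column number 2: keep field 1 and fields 3 onward
for file in *; do
  [ -f "$file" ] || continue   # skip directories
  newfile="$file.changed"
  # replace the original only if cut succeeded
  cut -f1,3- -d, "$file" > "$newfile" && mv "$newfile" "$file"
done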
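
If the column number varies, the loop above could be wrapped in a small function. This is only a sketch under a few assumptions: drop_column is a made-up helper name, the delimiter is a comma, and no field contains a quoted comma (cut does not understand CSV quoting). It writes to a mktemp file first and then copies the result back over the original, so the original's permissions are kept.

```shell
# drop_column N FILE...
# Remove comma-separated column N from each FILE, in place.
# (drop_column is a hypothetical helper, not a standard command.)
drop_column() {
  col=$1
  shift
  # Build the cut field list that skips column N.
  if [ "$col" -eq 1 ]; then
    fields="2-"
  else
    fields="1-$((col - 1)),$((col + 1))-"
  fi
  for file in "$@"; do
    [ -f "$file" ] || continue        # skip directories and missing files
    tmp=$(mktemp) || return 1
    # Touch the original only if cut succeeded.
    if cut -d, -f"$fields" "$file" > "$tmp"; then
      cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
  done
}
```

For example, drop_column 2 *.csv would remove the second column from every .csv file in the current directory (the .csv extension is just an assumption about how your files are named).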

Here is an awk vs cut comparison for a file with 2.4 million identical lines like this:

1,2,3,4,5,6

time awk -F, '{for(i=1;i<=NF;i++)if(i!=x)f=f?f FS $i:$i;print f;f=""}' x=2 t >/dev/null

real    0m13.815s
user    0m13.116s
sys     0m0.217s

time cut -f1,3- -d, t >/dev/null

real    0m2.374s
user    0m2.093s
sys     0m0.054s
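
If you want to try the comparison yourself, a test file like the one described above can be generated roughly as follows. make_test_file is a hypothetical helper name, and your timings will of course depend on your machine and on which awk and cut implementations you have.

```shell
# make_test_file FILE [LINES]
# Write LINES identical lines of "1,2,3,4,5,6" to FILE (default: 2400000).
# (make_test_file is a hypothetical helper, not a standard command.)
make_test_file() {
  yes '1,2,3,4,5,6' | head -n "${2:-2400000}" > "$1"
}
```

After running make_test_file t, the two time commands shown above can be run against t as they are.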

The rule of thumb I use is this: awk is for tasks that can't be done with cut, sed, paste, etc., and only when the files involved are small. If performance is important or the logic is complex, opt for a general-purpose language like Perl, Python, or Ruby, which helps us write more readable code.

Upvotes: 2
