wwwilliam

Reputation: 9592

Can I make this bash script that writes to multiple files run any faster?

I have a script that reads lines from a file, takes the first column of each line, and appends the line to a file named after that value (i.e. it writes out many different files, each named $id.txt).

Is it possible for a script to do this any faster (on a single-node machine)? Note that I use read -r and id="$(echo "$line" | awk '{print $1}')" because my fields are tab-separated and some of them contain characters, such as backslashes, that I want to keep intact.

    while read -r line
    do
        # grab the first tab-separated column; quoting preserves the tabs
        id="$(echo "$line" | awk '{print $1}')"
        echo "$line" >> "$id.txt"
    done < "$1"

Some characteristics of my input:

abc ...
abc ...
def ...
def ...
def ...
def ...
ghi ...
ghi ...

Upvotes: 1

Views: 132

Answers (2)

Christopher Neylan

Reputation: 8272

I'm guessing that your slowness comes from running $(echo "$line" | awk '{print $1}') for every line, which forces the operating system to create new processes on every iteration, made worse by the fact that awk is an interpreter that has to start up each time. You should condense this into a single script using something like awk (by itself) or Perl.
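For instance, here is a rough sketch of that condensed approach (my illustration, not code from the answer), assuming the input file is passed as the script's first argument as in the question; the -F'\t' and the close() call are extra touches for tab-separated input and for inputs with many distinct ids:

    awk -F'\t' '{
        out = $1 ".txt"   # file named after the first tab-separated column
        print >> out      # $0 is written unmodified, so tabs and backslashes survive
        close(out)        # keep the number of open file descriptors bounded
    }' "$1"

Closing after every line trades a little speed for safety; since the sample input is grouped by id, one could instead close the previous file only when $1 changes.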

Upvotes: 1

Ignacio Vazquez-Abrams

Reputation: 798746

Too much work.

    awk '{ print >> ($1 ".txt") }' "$1"
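Why this is likely so much faster: a single awk process reads the file once, print writes $0 unmodified so tabs and backslashes pass through untouched, and each output file stays open for appending instead of being reopened on every line. The parentheses around $1 ".txt" are a portability nicety; some awk implementations reject an unparenthesized concatenation after >>.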

Upvotes: 6
