Orlo

Reputation: 828

Convert while read line to awk

While read line is very slow when you are working with large files. The general suggestion I find on Google is to use awk, but how can I convert the following while loop to awk?

        while read r; do
            html[$dId]+=$(echo -e "\n$r")
            stopList $(echo -e "$r" | tr -d ' ') all
        done <<< "$list"

What I've tried:

        awk '{ 
            html[$dId]+=$(echo -e "\n$0")
            stopList $(echo -e "$0" | tr -d ' ') all
        }' <<< "$list"

Upvotes: 0

Views: 252

Answers (1)

janos

Reputation: 124724

It's slow because it runs multiple processes per iteration:

        while read r; do
            html[$dId]+=$(echo -e "\n$r")
            stopList $(echo -e "$r" | tr -d ' ') all
        done <<< "$list"

Each iteration has two echos (each inside a command substitution, which forks a subshell), a tr, and the stopList function, and we don't even know what that does.
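To make the per-iteration cost concrete, here is a minimal sketch of the same loop with the command substitutions and tr replaced by bash built-ins. The key `doc1`, the sample `list`, and the `stopList` stub are made up for illustration, since the question doesn't show the real ones. It is not an exact drop-in: `read -r` and the quoting treat backslashes, surrounding whitespace and empty lines differently from the original `read r` plus `echo -e`.

```bash
#!/usr/bin/env bash
# Sketch only: the names and data below are hypothetical placeholders.
declare -A html                  # assuming html is an associative array
dId="doc1"                       # hypothetical key
list=$'foo bar\nbaz qux'         # hypothetical sample input

# Hypothetical stand-in for the real stopList, which the question does not show.
seen=()
stopList() { seen+=("$1"); }

while IFS= read -r r; do
    html[$dId]+=$'\n'$r          # append newline + line without $(echo ...)
    stopList "${r// /}" all      # remove spaces via parameter expansion, no tr
done <<< "$list"
```

This still calls stopList once per line, so how much it helps depends on what that function does internally.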

To convert this to awk you need to rethink a bit, something like this:

        html[$dId]=$(awk '{ printf("\n%s", $0) }' <<< "$list")

That is, instead of appending to html line by line, let awk generate the whole thing. A single awk process can do very powerful text processing, and that is much more efficient than spawning several echos, trs and the like for every line in the shell.

My example doesn't include stopList, because you didn't explain what it does. Whatever it does, you need to implement it within awk, so that it runs in the same awk process. Then your script will be much faster than the current line-by-line while loop.
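As a rough illustration of moving that logic into awk, the sketch below assumes, purely for the sake of example, that stopList records each space-stripped line once. The real function may do something quite different. The key `doc1`, the sample `list`, the temp file and the `stops` array are all hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch only: stopList's behaviour here is an assumption, not the real function.
declare -A html
dId="doc1"                          # hypothetical key
list=$'foo bar\nfoo bar\nbaz qux'   # hypothetical sample input
stopfile=$(mktemp)                  # hypothetical place to collect the stop list

html[$dId]=$(awk -v out="$stopfile" '
{
    printf("\n%s", $0)              # build the html value on stdout
    key = $0
    gsub(/ /, "", key)              # same effect as: tr -d " "
    if (!(key in seen)) {           # assumed behaviour: record each key once
        seen[key] = 1
        print key > out             # stop list goes to a side file
    }
}' <<< "$list")

mapfile -t stops < "$stopfile"      # read the collected keys back into bash
rm -f "$stopfile"
```

Everything happens in one awk process regardless of how many lines `list` has. The side file is just one way to get a second result out of awk; printing both results to stdout with a separator would work as well.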

Upvotes: 2
