chx

Reputation: 11750

How to iterate over grep -R output in busybox?

I would like to collect the comment lines (those starting with #) from a bunch of files into files that contain only the comments. I can do it this way:

find . -type f -exec sh -c 'grep "^#" "$1" > "/path/to/comment/storage/$1"' sh {} \;

Very simple. This, however, is incredibly wasteful, as it forks a grep for every file and creates a lot of empty files too. grep -R '^#' . would be much better, but what do I do with its output? Something like export IFS=:; grep -R '^#' . | while read -r file contents; do echo $file-$contents; done is a promising start, but it falls apart when the contents themselves also contain colons. It also leaves the challenge of properly creating and appending to each file.

As an additional restriction, this needs to run with busybox as the shell, awk, etc.; GNU utilities are not available.

Upvotes: 1

Views: 95

Answers (3)

Luuk

Reputation: 14891

Because you want the colon separator to apply only to the read, you should do:

grep -R '^#' * | while IFS=':' read -r file contents; do echo "$file-$contents"; done

More info: Setting an environment variable before a command in Bash is not working for the second command in a pipe

EDIT: When you want the output in a file do:

output="outputfile"; rm -f "$output"; grep -R '^#' . | while IFS=':' read -r file contents; do echo "$file-$contents" >>"$output"; done

The different parts of this statement do the following:

  • output="outputfile" assigns the name of the output file
  • rm -f "$output" removes this file if it already exists
  • grep -R '^#' . searches recursively for lines starting with #
  • while IFS=':' read -r file contents reads the output of the previous command with IFS=':', assigning the first column to the variable file and the rest of the line to the variable contents
  • echo "$file-$contents" >>"$output" appends these two variables, with a - between them, to the output file
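The same loop can be pointed at one output file per input file instead of a single combined file. What follows is only a minimal sketch under stated assumptions: the src and dest directories and the sample files are hypothetical and created just for the demonstration, the destination lies outside the tree being searched, and no file name contains a colon.

```shell
# Hypothetical locations and sample data, used only for this demonstration.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/sub"
printf '# top: level\ncode\n' > "$src/a.conf"
printf 'x=1\n# nested\n' > "$src/sub/b.conf"

cd "$src" || exit 1
grep -R '^#' . | while IFS=: read -r file contents; do
    # Create the mirrored directory on demand, then append the comment line.
    mkdir -p "$dest/$(dirname "$file")"
    printf '%s\n' "$contents" >> "$dest/$file"
done
```

Because read hands the whole remainder of the line to its last variable, colons inside the comment text survive. Only input files that contain a matching line get an output file.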

Upvotes: -1

jhnc

Reputation: 16642

As I write this, it appears that your input has no subdirectories, or that the output directory hierarchy has already been created. If that is the case, you can do:

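find . -type f -exec awk '
    # First line of each input file: switch the output target
    FNR==1 {
        if (NR>FNR) close(out)    # close the previous output file, if any
        out = "/path/to/comments/" FILENAME
    }
    # Write every comment line to the current target
    /^#/ { print >out }
' {} +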
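If the output hierarchy does not exist yet, it could be created in a first pass before awk runs. This is a sketch rather than a tested busybox recipe: src, dest and the sample files are hypothetical, the destination is passed to awk with -v instead of being hard-coded, and it assumes the find in use supports -exec ... {} +.

```shell
# Hypothetical locations and sample data, used only for this demonstration.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/sub/deeper"
printf '# one\nbody\n# two\n' > "$src/sub/deeper/c.conf"
printf 'no comments here\n' > "$src/plain.txt"

cd "$src" || exit 1
# Pass 1: mirror every source directory under the destination.
# Inside sh -c, $0 holds the destination and "$@" the directories found.
find . -type d -exec sh -c 'for d; do mkdir -p "$0/$d"; done' "$dest" {} +
# Pass 2: the same awk program, with the destination supplied via -v.
find . -type f -exec awk -v dest="$dest" '
    FNR==1 {
        if (NR>FNR) close(out)
        out = dest "/" FILENAME
    }
    /^#/ { print >out }
' {} +
```

awk opens an output file only when the first print to it runs, so files without comment lines leave no empty file behind.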

Upvotes: 2

chx

Reputation: 11750

I was unaware that awk can print to a file by itself, without shell redirects; this answer showed me, and the man page confirms it:

print expr-list >file
    Print expressions on file. Each expression is separated by the value of OFS. The output record is terminated with the value of ORS.

Although that man page is for GNU Awk, this seems to work with busybox awk as well.

Second, we already have an answer on how to print everything except the first field with awk.

So we can do

grep -R '^#' . | \
awk -v FS=: -v OFS=: '{ key = $1; $1 = ""; a[key] = a[key] "\n" substr($0,2) }
END { for (key in a) print substr(a[key], 2) > ("/path/to/comments/" key) }'

The END part needs another substr because every element of the a array starts with a newline.
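The field-stripping step can be checked in isolation. In this small sketch the sample line is made up to look like grep -R output, and the variable names line and stripped are only illustrative.

```shell
# Made-up sample in "file:content" form, with extra colons in the content.
line='./etc/app.conf:# listen: 0.0.0.0:8080'

# Emptying $1 makes awk rebuild $0 with OFS, which leaves a leading ":";
# substr($0, 2) drops that separator and keeps the rest untouched.
stripped=$(printf '%s\n' "$line" |
    awk -v FS=: -v OFS=: '{ $1 = ""; print substr($0, 2) }')

printf '%s\n' "$stripped"
```

Setting OFS to the same character as FS is what keeps the colons inside the comment intact when $0 is rebuilt.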

Finally, while awk could, of course, subsume grep, there does not seem to be a neat way for it to subsume grep -R.

Upvotes: 1
