Reputation: 11750
I would like to collect comments in a bunch of files which start with #
into files which only contain the comments. I can do it this way:
find . -type f -exec sh -c 'grep "^#" "$1" > "/path/to/comment/storage/$1"' sh {} \;
Very simple. This, however, is incredibly wasteful, as it forks a grep
for every file and creates a lot of empty files too. Running
grep -R '^#' .
would be much better, but what do I do with its output? Something like
export IFS=:; grep -R '^#' . | while read -r file contents; do echo $file-$contents; done
is a promising start, but it falls apart when the contents themselves also contain colons. It also leaves the challenge of properly creating and appending to each file.
As an additional restriction, this needs to run with busybox
as shell / awk / etc.; GNU utils are not available.
Upvotes: 1
Views: 95
Reputation: 14891
Because you want to apply the colon separator only on the read, you should do:
grep -R '^#' . | while IFS=':' read -r file contents; do echo "$file-$contents"; done
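To see why scoping IFS to the read helps, here is a minimal sketch using one made-up line of grep output (the file name and comment text are hypothetical). Because IFS is changed only for read, the rest of the shell still splits words normally, and read puts everything after the first colon into the last variable, so colons inside the comment survive as long as the expansion is quoted.

```sh
#!/bin/sh
# Hypothetical sample of what grep -R prints: file name, colon, matching line.
line='./conf/app.cfg:# listen on host:port, e.g. localhost:8080'

# IFS=: applies only to this one read; the last variable receives the rest
# of the line, including any further colons.
result=$(printf '%s\n' "$line" | while IFS=: read -r file contents; do
    printf '%s-%s\n' "$file" "$contents"
done)

printf '%s\n' "$result"
```

This still assumes that file names themselves contain no colons.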
EDIT: When you want the output in a file do:
output="outputfile"; rm -f $output; grep -R '^#' . | while IFS=':' read -r file contents; do echo "$file-$contents" >>$output; done
The different parts of this statement do the following:
output="outputfile"
assigns the name of the output file.
rm -f $output
removes this file if it already exists.
grep -R '^#' .
searches recursively for lines starting with #.
while IFS=':' read -r file contents
reads the output of the previous command with IFS=':' and assigns the first column to the variable file and the rest of the line to the variable contents.
echo "$file-$contents" >>$output
appends these two variables, with a - between them, to the output file.
Upvotes: -1
Reputation: 16642
As I write this, it appears your input has no subdirectories, or that the output directory hierarchy has already been created. If that is the case, you can do:
find . -type f -exec awk '
FNR==1 {
    # a new input file starts: close the previous output, pick the next one
    if (NR>FNR) close(out)
    out = "/path/to/comments/" FILENAME
}
/^#/ { print >out }
' {} +
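If the input does have subdirectories and the output hierarchy does not exist yet, one possible approach is to mirror the directory tree first and then run the same kind of awk program. The following is only a sketch: the temporary directories, sample files and the dest variable are made up for the demonstration, and it assumes mktemp is available and that directory names contain no newlines.

```sh
#!/bin/sh
# Sketch: mirror the directory tree first, then let awk write per-file output.
# The directories and sample files below are made up for the demonstration.
work=$(mktemp -d) || exit 1
trap 'rm -rf "$work"' EXIT
src=$work/src
dest=$work/comments
mkdir -p "$src/sub" "$dest"
printf '# top\ncode\n# more: with colon\n' > "$src/a.txt"
printf 'no comments here\n' > "$src/b.txt"
printf '# nested\n' > "$src/sub/c.txt"

cd "$src" || exit 1

# Recreate every source directory under the destination.
find . -type d | while IFS= read -r dir; do
    mkdir -p "$dest/$dir"
done

# One awk process handles many files; an output file is only opened when a
# comment line is actually printed, so b.txt produces nothing.
find . -type f -exec awk -v dest="$dest" '
FNR == 1 {
    if (out != "") close(out)
    out = dest "/" FILENAME
}
/^#/ { print > out }
' {} +
```

Since awk opens an output file only on the first print, files without comments leave nothing behind, which avoids the empty files mentioned in the question.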
Upvotes: 2
Reputation: 11750
I was unaware that awk could print to a file by itself, without shell redirects; this answer showed me, and the man page confirms it:
print expr-list >file
Print expressions on file. Each expression is separated by the value of OFS. The output record is terminated with the value of ORS.
Although this man page is for GNU Awk, the same seems to work with busybox awk as well.
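As a small illustration of that redirect, here is a sketch that routes input lines to files named after their first field. The input lines, the dir variable and the file names are invented for the example, and it assumes mktemp is available.

```sh
#!/bin/sh
# Sketch: awk opens the target file itself; no shell redirection is involved.
# The input lines and file names are invented for the demonstration.
work=$(mktemp -d) || exit 1
trap 'rm -rf "$work"' EXIT

printf 'red apple\ngreen pear\nred cherry\n' |
awk -v dir="$work" '{
    # Parenthesize the concatenated file name; without the parentheses,
    # awk implementations disagree on how "> a b" is parsed.
    print $2 > (dir "/" $1 ".txt")
}'

cat "$work/red.txt"
```

Within one awk run, the first print to a given name truncates the file and later prints to the same name append to it, so red.txt ends up with both of its lines.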
Second, we already have an answer on how to print everything except the first field with awk.
So we can do
grep -R '^#' . | \
awk -v FS=: -v OFS=: '{ key = $1; $1 = ""; a[key] = a[key] "\n" substr($0,2) }
END { for (key in a) print substr(a[key], 2) > ("/path/to/comments/" key) }'
The END part needs another substr because every element of the a array starts with an empty line.
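A possible variation, not what I ran above: since grep -R prints all matches of one file as a contiguous run, awk could also write each line out as it arrives instead of collecting everything in an array. This sketch splits at the first colon with index(), so no leading separator has to be trimmed. The sample files, the dest variable and the paths are made up, and it assumes file names contain no colons and that mktemp is available.

```sh
#!/bin/sh
# Sketch: split each grep -R line at its first colon and write it straight to
# the matching output file, instead of collecting everything in an array.
# Assumes file names contain no colons; sample files and paths are made up.
work=$(mktemp -d) || exit 1
trap 'rm -rf "$work"' EXIT
mkdir -p "$work/src" "$work/comments"
printf '# url: http://example.com\ncode\n# second\n' > "$work/src/a.conf"
printf 'nothing to see\n' > "$work/src/b.conf"

cd "$work/src" || exit 1
grep -R '^#' . | awk -v dest="$work/comments" '{
    sep = index($0, ":")              # position of the first colon
    key = substr($0, 1, sep - 1)      # file name as printed by grep
    if (key != prev) {                # grep lists each file as one run of lines
        if (prev != "") close(out)
        out = dest "/" key
        prev = key
    }
    print substr($0, sep + 1) > out   # the comment, later colons untouched
}'

cat "$work/comments/a.conf"
```

Closing the previous file whenever the key changes keeps only one output file open at a time, which may matter with many input files.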
Finally, while awk could, of course, subsume grep, there does not seem to be a way for it to neatly subsume grep -R.
Upvotes: 1