Reputation: 57
I am trying to create a text file that contains a listing of all log files that contain a certain string in the first line. More specifically, SAS log files.
Currently I have a simple script that will search the entire system for "*.log" files and output the entire list to a text file.
Is there a way to only output the log files that contain a certain string?
Here is the current command:
find `pwd` -name "*.log" > sas_log_list.txt
Every SAS log file contains the same string on the very first line.
This string is:
1 The SAS System
So basically I want to search a file system for log files containing the string above, and output those file names to a text file.
Thanks in advance, Jason
Upvotes: 0
Views: 1995
Reputation: 11228
I've attempted to make things a bit faster by reading only the first line of each file. This prints out the file names matching the pattern.
( IFS=$'\n' ; for f in $(find `pwd` -name "*log" -type f ) ; do
    head -n 1 "$f" | grep -q "The SAS System" && echo "$f"
done )
UPDATE 1: Edited to handle path names containing white space, using one of the techniques offered by Charles Duffy. I couldn't use the find -exec ... + expression, as {} can't appear more than once. Thanks ghostdog74 and Telemachus.
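Another whitespace-safe variant (a sketch, not part of the original answer) is to have find emit NUL-delimited names and read them back with read -d '', which sidesteps the IFS trick and also survives file names containing newlines:

find . -name "*.log" -type f -print0 |
while IFS= read -r -d '' f; do
    # -print0 / read -d '' keep file names with spaces or newlines intact
    head -n 1 "$f" | grep -q "The SAS System" && printf '%s\n' "$f"
done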
UPDATE 2: Add full pathname and last modified time
( IFS=$'\n' ; for f in $(find . -name "*log" -type f ) ; do
    head -n 1 "$f" | grep -q "The SAS System" && echo $(readlink -f "$f") $(stat -c %y "$f")
done )
Upvotes: 0
Reputation: 16338
The hardest part of this question is searching only within the first line. The most accurate one-liner (broken here for readability) I could come up with was:
find . -name '*.log' -type f -readable ! -size 0 \
    -exec sed -n '1{/The SAS System/q0};q1' {} \; \
    -print
Due to the obscure nature of sed syntax, some explanation is in order:
- 1{...} is evaluated for the first line only.
- The /regex/q0 command quits with exit code 0 (success) if the regex matched (consider /^regex$/ to match the entire line against that regex).
- q1 quits with exit code 1 (fail).
- find uses that sed command as a predicate and -prints the file name only if it was true.
However, there is a small snag. Apparently, if the file has size 0, sed exits 0 immediately without evaluating its expression. For that reason we need the ! -size 0 argument to find.
As suggested by @Brandon Horsley, -type f will produce fewer errors, and while we're at it, let's verify that the file is -readable as well.
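If in doubt, the predicate's exit codes can be checked by hand; a minimal sketch (GNU sed assumed, and the file names here are made up for illustration):

printf '1    The SAS System\n' > has_header.log
printf 'something else entirely\n' > no_header.log
: > empty.log                       # a zero-byte file
for f in has_header.log no_header.log empty.log; do
    sed -n '1{/The SAS System/q0};q1' "$f" && echo "match: $f" || echo "no match: $f"
done
# empty.log reports "match" -- the snag that motivates the ! -size 0 test above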
Upvotes: 3
Reputation: 342273
bash 4
shopt -s globstar    # make ** match directories recursively
shopt -s nullglob    # expand to nothing if there are no matches
for logfile in **/*.log
do
    read firstline < "$logfile"    # grab just the first line
    case "$firstline" in
        *"The SAS System"*) echo "$logfile" >> sas_log_list.txt ;;
    esac
done
Upvotes: 0
Reputation: 46965
find `pwd` -name "*.log" -exec grep "The SAS System" {} \;
or
find `pwd` -name "*.log" | grep -i "the sas system"
Upvotes: 0
Reputation: 1176
Unless I'm mistaken, you don't need the call to pwd. I think this will get you what you want. You can use the -l flag on grep to get the filenames rather than the matching lines.
find . -name "*.log" -exec grep -l "The SAS System" {} \; > sas_log_list.txt
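If the listing should contain absolute paths, as the original pwd-based command produced, one variant (a sketch, not part of the original answer) is to hand find an absolute starting directory:

find "$PWD" -name "*.log" -exec grep -l "The SAS System" {} \; > sas_log_list.txt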
Upvotes: 0