Reputation: 370
I am using a command to search for files, starting from a given directory in Unix, whose names end in *.sas and which contain the string DB2. I then want to search the resulting set of files for the strings DSN= or DATASRC= and print the lines containing them. This is the find command I am using:
find '/shrproj/' -type f -name '*.sas' -exec grep -il 'DB2' {} \; 2> /dev/null | xargs egrep -Ri 'DSN=|DATASRC='
This gives me the desired output:
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas: ,"DSN=%UPCASE(&the_database.)"
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas: ,"DSN=%UPCASE(&the_database.)"
But now I also want to print the properties of each file (as with the -ls option) after the above result, i.e. this is what I intend to achieve:
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas: ,"DSN=%UPCASE(&the_database.)"
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas: ,"DSN=%UPCASE(&the_database.)"
61522 19 -rwxrwsr-x 1 sas sas 18546 Jun 2 2010 /shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas
The file properties in the last line above are the same as those printed by the find command with the -ls option:
find /shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas -ls
So how do I achieve this for each and every file using the very first find command above?
Please let me know. Thanks.
Upvotes: 0
Views: 301
Reputation: 33685
For this specific task BroSlow's solution seems the best (albeit not necessarily the most readable). But what if in the future you need something a bit more advanced? This is where GNU Parallel can help you. Make a script or a bash function that you want to run for each file:
grepit() {
  FILE="$1"
  # List the file only if it mentions DB2 and also DSN= or DATASRC=
  grep -qi DB2 "$FILE" &&
  egrep -qi 'DSN=|DATASRC=' "$FILE" &&
  ls -l "$FILE"
}
export -f grepit
find '/shrproj/' -type f -name '*.sas' | parallel grepit
This will run 1 job per core. Depending on your disk system it may be faster to run more or fewer jobs in parallel (use -j to control that).
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU.
GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time.
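The same scheduling idea (start the next job as soon as a slot frees up) can be tried without GNU Parallel by using xargs -P, which is a GNU/BSD extension and not part of POSIX. This is only a rough stand-in, not GNU Parallel itself: it has none of its output grouping or remote execution. The temporary directory and sample files below are made up for the demo.

```shell
#!/bin/sh
# Hypothetical sample tree; point find at /shrproj/ for real use.
work=$(mktemp -d)
printf 'DB2\nDSN=x\n'     > "$work/one.sas"
printf 'DB2\n'            > "$work/two.sas"
printf 'DB2\nDATASRC=y\n' > "$work/three.sas"

# At most 2 jobs run at once (-P 2), one file per job (-n 1).
# xargs starts the next job as soon as a running one finishes.
hits=$(
  find "$work" -type f -name '*.sas' -print0 |
    xargs -0 -n 1 -P 2 sh -c '
      if grep -qi "DB2" "$1" && grep -Eqi "DSN=|DATASRC=" "$1"; then
        printf "%s\n" "$1"
      fi
    ' sh | sort
)
printf '%s\n' "$hits"
```

Jobs finish in no fixed order, hence the sort at the end.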
Installation
If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
Upvotes: 1
Reputation: 11593
Something like the below should work:
find /shrproj/ -type f -name '*.sas' \
-exec grep -iq 'DB2' {} \; \( \
-exec grep -iq 'DSN=' {} \; -o \
-exec grep -iq 'DATASRC=' {} \; \) \
-ls \
-exec egrep -i 'DSN=|DATASRC=' {} \;
This has some redundancy, since grep -q exits on the first match (so it cannot be combined with printing all matches), but it shouldn't be too slow unless you have a lot of large files where DSN= or DATASRC= is only found near the end.
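The chained form works because find treats each -exec ... \; as a test that is true only when the command exits with status 0, so later actions run only for files that passed. A minimal sketch of that behaviour, using a throwaway directory and two made-up files:

```shell
#!/bin/sh
# Hypothetical demo files: one mentions DB2, the other does not.
demo=$(mktemp -d)
printf 'uses DB2\n' > "$demo/hit.sas"
printf 'nothing\n'  > "$demo/miss.sas"

# grep -q prints nothing and only sets the exit status;
# -print is reached only for files where grep exited 0.
matched=$(find "$demo" -type f -name '*.sas' -exec grep -iq 'DB2' {} \; -print)
printf '%s\n' "$matched"
```

Only the path of hit.sas is printed, because the grep test fails for miss.sas and find skips the rest of the expression for it.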
Alternatively, without abusing grep -q too much:
find /shrproj/ -type f -name '*.sas' \
-exec grep -iq 'DB2' {} \; \
-exec bash -c 'out=$(egrep -i "DSN=|DATASRC=" "$1"); [[ -n $out ]] && echo "$out" && exit 0 || exit 1 ' bash {} \; \
-ls
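To get the layout asked for in the question (matching lines prefixed with the file name, followed by the -ls line for that file), one option is to move the per-file logic into a small sh -c script and let find pass it files in batches. This is a sketch rather than a tested drop-in: it assumes a find that supports -ls (GNU or BSD), and the temporary directory and sample files are invented for the demo.

```shell
#!/bin/sh
# Hypothetical demo tree; replace "$root" with /shrproj/ for real use.
root=$(mktemp -d)
printf '%s\n' 'libname x DB2;' 'connect "DSN=prod";' > "$root/a.sas"
printf '%s\n' 'libname x DB2;' 'no data source here' > "$root/b.sas"
printf '%s\n' 'connect "DSN=prod";'                  > "$root/c.sas"

report=$(
  find "$root" -type f -name '*.sas' -exec sh -c '
    for f do
      # Skip files that never mention DB2.
      grep -qi "DB2" "$f" || continue
      # The extra /dev/null operand makes grep prefix each line with the file name.
      if grep -Ei "DSN=|DATASRC=" "$f" /dev/null; then
        find "$f" -ls
      fi
    done
  ' sh {} +
)
printf '%s\n' "$report"
```

Here each file is read at most twice and every matching line is printed once, so the grep -q redundancy goes away.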
Upvotes: 0