pchegoor

Reputation: 370

Searching a file in Unix for a given string and then for another set of strings within that file

I am using a command to search for files, starting from a given directory in Unix, whose names end in .sas and which contain the string DB2. I then want to search the resulting set of files for the strings DSN= or DATASRC= and print the lines containing them. This is the find command I am using:

find '/shrproj/'  -type f -name '*.sas'  -exec  grep   -il 'DB2'  {} \;  2> /dev/null  |  xargs   egrep   -Ri  'DSN=|DATASRC='

This gives me the desired output:

/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas:                   ,"DSN=%UPCASE(&the_database.)"
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas:                   ,"DSN=%UPCASE(&the_database.)"

But now I also want to print the properties of each file (as find's -ls option does) after the above result, i.e. this is what I intend to achieve:

/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas:                   ,"DSN=%UPCASE(&the_database.)"
/shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas:                   ,"DSN=%UPCASE(&the_database.)"
61522   19 -rwxrwsr-x  1 sas       sas          18546 Jun  2  2010 /shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas

The file properties in the last line above are the same as those printed by the find command with the -ls option:

 find /shrproj/files/stp_code/aea_aat_stp/icrv3/bin/macro/cnct_2_eaw.sas -ls

So how do I achieve this for each and every file using the very first find command above?

Please let me know. Thanks.

Upvotes: 0

Views: 301

Answers (2)

Ole Tange

Reputation: 33685

For this specific task BroSlow's solution seems the best (albeit not necessarily the most readable). But what if you need something a bit more advanced in the future? This is where GNU Parallel can help you. Make a script or a bash function that you want to run for each file:

grepit() {
  FILE="$1"
  # List the file only if it mentions DB2 and has a DSN= or DATASRC= line
  grep -qi DB2 "$FILE" &&
    egrep -qi 'DSN=|DATASRC=' "$FILE" &&
    ls -l "$FILE"
}
export -f grepit

find '/shrproj/'  -type f -name '*.sas' | parallel grepit

This will run 1 job per core. Depending on your disk system it may be faster to run more or fewer jobs in parallel (use -j to control that).
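The function above lists each qualifying file but does not print the matching lines that the question asks for. A minimal sketch of a variant that prints the matching lines first and the listing after them follows; the name grepit_verbose is hypothetical, and ls -l stands in for find's -ls output.

```shell
# grepit_verbose is a hypothetical name chosen for this sketch.
# It takes one file path as its only argument.
grepit_verbose() {
  # Skip files that never mention DB2 (case-insensitive).
  grep -qi 'DB2' "$1" || return 0
  # Print every DSN= or DATASRC= line. Passing /dev/null as a second
  # file makes grep prefix each match with the file name.
  grep -i -E 'DSN=|DATASRC=' "$1" /dev/null || return 0
  # Reached only when at least one line matched: list the file.
  ls -l "$1"
}
```

In bash, after export -f grepit_verbose, it could be run as find '/shrproj/' -type f -name '*.sas' | parallel -j 4 grepit_verbose, where 4 is only an example job count.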

GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: simple scheduling]

GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]

Installation

If GNU Parallel is not packaged for your distribution, you can do a personal installation, which does not require root access. It can be done in 10 seconds by doing this:

(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

Upvotes: 1

Reinstate Monica Please

Reputation: 11593

Something like the following should work:

find /shrproj/  -type f -name '*.sas' \
-exec  grep -iq 'DB2' {} \;  \( \
  -exec grep -iq 'DSN=' {} \; -o \
  -exec grep -iq 'DATASRC=' {} \; \)  \
-ls \
-exec egrep -i 'DSN=|DATASRC=' {} \;

This has some redundancy: grep -q exits at the first match and prints nothing, so it can only serve as a test and cannot also print all matches, which means the final egrep has to read each matching file again. It shouldn't be too slow, though, unless you have a lot of large files in which DSN= or DATASRC= appears only near the end.

Alternatively, without abusing grep -q too much:

find /shrproj/  -type f -name '*.sas' \
-exec  grep -iq 'DB2'  {} \; \
-exec bash -c 'out=$(egrep -i "DSN=|DATASRC=" "$1"); [[ -n $out ]] && echo "$out" && exit 0 || exit 1 ' bash {} \; \
-ls
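Both commands above start at least one new process per file. As a rough sketch of another option, the per-file tests can be moved into a single shell that find invokes with many files at once via -exec ... {} +. The helper name report_db2_sources is hypothetical, the root directory is passed as an argument rather than hard-coded, and ls -l is used in place of find's -ls, so the listing format differs slightly.

```shell
# report_db2_sources is a hypothetical helper name for this sketch.
# $1 is the directory to search, e.g. /shrproj/
report_db2_sources() {
  find "$1" -type f -name '*.sas' -exec sh -c '
    for f in "$@"; do
      # Skip files that never mention DB2 (case-insensitive).
      grep -qi "DB2" "$f" || continue
      # Print matching lines; /dev/null as a second file makes grep
      # prefix each match with the file name.
      if grep -i -E "DSN=|DATASRC=" "$f" /dev/null; then
        # At least one line matched, so list the file after its matches.
        ls -l "$f"
      fi
    done
  ' sh {} +
}
```

Calling report_db2_sources /shrproj/ would then print, for each qualifying file, its matching lines followed by one listing line.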

Upvotes: 0
