CheeseConQueso

Reputation: 6041

Unix 'find' + 'grep' syntax vs. awk

I was using this line to find the phrase 'B206' within files in the current directory and all of its subdirectories.

find . -exec grep -s "B206" '{}' \; -print 

It crashes when it tries to read certain files, and it actually changes the title bar in PuTTY to a bunch of weird characters.

For example, it crashes every time it hits a jpg file in a subdirectory. The title bar changes, and this appears on the screen:

ÐF»*rkNQeË+Z׳kU£~MÞçÄZ½ªéúýØâÑn¡[U+Þ4ªÒ9/ê£<ú¯4}[IÓ­îÃ¥K»G%ݳ¢

Forcing me to Ctrl+C out back to the prompt and then exit out.

Any way to add code to this line that will exclude jpg files? Even better, a piece of code where I can add a list of extensions to exclude?


EDIT:
-not & -I do not work for me
I found this similar question also related somewhat to mine

Upvotes: 4

Views: 16599

Answers (12)

user99352

Reputation: 11

To use grep with find, my syntax is:

find . -name "*" -print | xargs grep B206

Put all the options for filtering files (binary or otherwise) in the find command; the results are then passed as arguments to the grep command.

Upvotes: 1

Don Wakefield

Reputation: 8852

Given the recent lovefest over ack, I'm surprised no one's mentioned it yet.

You can configure types by extension so that you 'grep' just the files you want. Or you can just use --nobinary, given the problem you've been facing.

Upvotes: 2

phi

Reputation: 1543

If your environment can't do any fancy grep, maybe your awk can do it:

find . | awk '!/((\.jpeg)|(\.jpg)|(\.png))$/ {print $0;}' | xargs grep "B206"

Upvotes: 4

fortran

Reputation: 76107

Just a note: you don't need to close the terminal. You can use the reset command to restore the terminal output mode.

You can also run a grep first to filter out the extensions you don't want:

find -print | grep -v '\(\.jpg\|\.bmp\)$' | xargs grep "B206"

Upvotes: 3

Erik

Reputation: 898

find . -type f -a -not -name \*.jpg -exec grep -li "string" "{}" \;

This example comes from Mac OS X 10.5. You will need to check the find man page for your environment, since there is some divergence between GNU find and other vendor implementations. Checking Solaris (just for fun, the target OS was never specified):

find . -type f -a ! -name \*.jpg -exec grep -li "string" "{}" \;

This construction finds all files whose names do not end in .jpg and execs grep for each of them.

Depending on your shell, you may need to escape the bang (!) in order for this to work as advertised.
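
For the "list of extensions to exclude" part of the question, the same construction can be extended by grouping several -name tests. This is only a sketch: the temporary directory and file names are made up for the demo, and it assumes your find accepts the standard "!" and parenthesis operators and that your grep has -l.

```shell
#!/bin/sh
# Hypothetical demo tree; the directory and file names are invented for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'part B206 in stock\n' > "$demo/notes.txt"
printf 'B206 inside a fake image\n' > "$demo/sub/photo.jpg"
printf 'B206 inside another fake image\n' > "$demo/sub/logo.png"

# List files containing B206, skipping names ending in .jpg, .jpeg or .png.
# The escaped parentheses group the -name tests; "!" negates the whole group.
matches=$(cd "$demo" && find . -type f \
    ! \( -name '*.jpg' -o -name '*.jpeg' -o -name '*.png' \) \
    -exec grep -l "B206" {} \;)

echo "$matches"
rm -rf "$demo"
```

To exclude another extension, add one more -o -name '*.ext' inside the parentheses.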

Upvotes: 2

Vladiat0r

Reputation: 595

I think the problem is when you grep a binary file, it outputs binary data. That binary data somehow gets interpreted in your shell.

I would suggest running the files through the "strings" command first, which makes sure the output is text-only, and then grepping the output of "strings".
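
A rough sketch of that idea, assuming the strings utility is installed and that file names contain no newlines. The demo directory and files are made up. Note that strings only reports printable runs of at least 4 characters by default, so a shorter search phrase could be missed.

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are invented for illustration.
demo=$(mktemp -d)
printf 'order B206 shipped\n' > "$demo/order.txt"
# A fake "binary": NUL and other control bytes around the text B206.
printf '\000\001\002B206\003\004\n' > "$demo/blob.bin"
printf 'nothing here\n' > "$demo/other.txt"

# For each file, grep the printable text that strings extracts and
# print only the file name, so raw bytes never reach the terminal.
hits=$(cd "$demo" && find . -type f | sort | while IFS= read -r f; do
    if strings "$f" | grep "B206" >/dev/null; then
        echo "$f"
    fi
done)

echo "$hits"
rm -rf "$demo"
```

This still finds text embedded in binary files, so it fixes the terminal garbage but does not skip the jpg files themselves.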

Upvotes: 1

gpojd

Reputation: 23085

grep -r --exclude=*.jpg B206 .

Sorry, from another comment:

Only GNU grep comes with -r (recursive), true UNIX grep doesn't. You must either install GNU grep or use it with find. – Terminus

Upvotes: 1

Adam Rosenfield

Reputation: 400522

There's no reason to use find: grep comes with a recursive option, -r. To just get a list of the filenames with matches (as opposed to a list of all of the matching lines in all the files), you can use the -l option. If you want to ignore all binary files outright, you can use --binary-files=without-match option. If you only want to ignore files with a certain extension, you can use the --exclude option, e.g. --exclude=*.{jpg,jpeg} to ignore all files ending in .jpg or .jpeg. Thus, you should be able to get what you want with this:

grep -r -l --binary-files=without-match "B206" .

Now, you mentioned in one of your comments that your version of grep doesn't have the -r or -l options. That's unfortunate, and I recommend getting a more recent version of grep (preferably of the GNU variety).

One further note: if you use find -exec, you should use a + to end the command instead of a semicolon, e.g.:

find . -exec grep options '{}' '+'

By using a +, find passes as many matching filenames as will fit on one command line to each instance of grep, so it usually forks only a single process. If you have so many matching files that they would exceed the system's argument-length limit, find splits them across a few grep invocations. Either way, this will be much, much faster. If you use a semicolon instead, find forks a new process for each matching file, which is really slow for a very large number of files.
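
To see the difference, here is a small sketch that counts how many times grep is started under each form. The wrapper script "counting-grep" and the demo files are hypothetical and exist only for this demo. It assumes your find supports the + terminator.

```shell
#!/bin/sh
# Hypothetical demo tree with five small files.
demo=$(mktemp -d)
mkdir "$demo/tree"
for i in 1 2 3 4 5; do
    printf 'line %s B206\n' "$i" > "$demo/tree/file$i.txt"
done

# Hypothetical wrapper: logs one line per invocation, then runs grep.
cat > "$demo/counting-grep" <<EOF
echo run >> "$demo/calls.log"
exec grep "\$@"
EOF

# Semicolon form: one grep process per file.
: > "$demo/calls.log"
find "$demo/tree" -type f -exec sh "$demo/counting-grep" -l B206 {} \; >/dev/null
per_file=$(wc -l < "$demo/calls.log" | tr -d ' ')

# Plus form: file names are batched onto one command line.
: > "$demo/calls.log"
find "$demo/tree" -type f -exec sh "$demo/counting-grep" -l B206 {} + >/dev/null
batched=$(wc -l < "$demo/calls.log" | tr -d ' ')

printf 'semicolon: %s invocations, plus: %s invocations\n' "$per_file" "$batched"
rm -rf "$demo"
```

With only five files the + form needs a single invocation; the gap grows with the number of files.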

Upvotes: 8

DVK

Reputation: 129489

If you have access to gfind, simply add "-not -name '*.jpg'" to the expression.

gfind . -not -name '*.jpg' -exec grep -s "B206" '{}' \; -print

Another option (not needed for this task but a useful trick) is, if you want to use really fancy regexps, to do

find some_easy_high_level_filter_expression -print | perl -ne 'print if /your_Perl_RegExp_of_choice/' > ./files_to_search_in

grep options `cat ./files_to_search_in`

This gives the sometimes-needed benefit of caching the file list in case you want to change the grep expression to fine-tune it, or just run more than one grep.

Upvotes: 1

Tim Stewart

Reputation: 5500

I tried Erik's command but I got an error about no -grep predicate. Perhaps my version of find is too old.

This worked for me:

find . -type f -a -not -name \*.jpg -exec grep "B206" {} \;

Upvotes: 2

ennuikiller

Reputation: 46975

grep -I -r "string" *

Upvotes: 0

Pesto

Reputation: 23901

You can use grep's -I switch:

Process a binary file as if it did not contain matching data;
this is equivalent to the --binary-files=without-match option.

In short, grep will simply assume the file doesn't match, which will keep binary data from being output.
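
A quick sketch of the effect, assuming GNU grep (other greps may not have -I, as the asker found). The demo files are made up, and the fake jpg is just a few control bytes around the text.

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are invented for illustration.
demo=$(mktemp -d)
printf 'B206 in plain text\n' > "$demo/plain.txt"
# The NUL byte makes GNU grep treat this file as binary.
printf '\000\001B206\002\n' > "$demo/image.jpg"

# With -I the binary file is treated as non-matching, so only the
# text file's line is reported.
with_I=$(cd "$demo" && grep -I "B206" plain.txt image.jpg)

echo "$with_I"
rm -rf "$demo"
```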

Upvotes: 3
