Reputation: 414
I'm using the following find command with a regex in the OS X terminal to find a whole load of files whose names are 8 digits followed by a .jpg, .gif, .png or .eps extension. It produces no results, even though I've told OS X/BSD find to use modern regex:
find -E ./ -iregex '\d{8}'
Using http://rubular.com/ (http://rubular.com/r/YMz3J8Qlgh) shows that the regex pattern produces the expected results, and OS X does produce results when I type:
find . -iname '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9].*'
But this seems a little long-winded.
Upvotes: 9
Views: 10358
Reputation: 3753
I am using this regex to find and delete iPhone dups:
find -E . -regex '.*/IMG_[0-9]{4}[ ]1\.JPG' -print -exec rm '{}' \;
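Since this deletes files, it may be worth running the match without the rm first. Below is a minimal sketch of that two-step approach in a throwaway directory. The `ere_find` helper and the sample file names are made up for illustration; the helper only picks `-E` (BSD/OS X find) or `-regextype posix-extended` (GNU find), assuming one of the two is installed.

```shell
# Hypothetical helper: choose the extended-regex switch for the installed find.
# BSD/OS X find takes -E before the path; GNU find takes -regextype after it.
if find -E . -maxdepth 0 >/dev/null 2>&1; then
  ere_find() { d=$1; shift; find -E "$d" "$@"; }
else
  ere_find() { d=$1; shift; find "$d" -regextype posix-extended "$@"; }
fi

# Throwaway sandbox with made-up file names, so nothing real is touched.
sandbox=$(mktemp -d)
touch "$sandbox/IMG_0001.JPG" "$sandbox/IMG_0001 1.JPG" "$sandbox/IMG_0002.JPG"

pattern='.*/IMG_[0-9]{4} 1\.JPG'

# Step 1: dry run, only print what the pattern matches.
to_delete=$(ere_find "$sandbox" -type f -regex "$pattern" -print)
printf 'would delete: %s\n' "$to_delete"

# Step 2: same expression, now with the delete action.
ere_find "$sandbox" -type f -regex "$pattern" -exec rm -- '{}' \;

# List what is left in the sandbox.
remaining=$(ls "$sandbox" | LC_ALL=C sort)
printf '%s\n' "$remaining"
```

Only the file with the " 1" suffix should be listed in step 1 and gone after step 2.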
Upvotes: 0
Reputation: 2506
This has been a very eye-opening thread. I'm bringing to the table a solution to my own problem and hopefully clarifying a thing or two for you and other users looking for robustness (like I was).
In my case my Mac had a bunch of duplicate photos. When a Mac makes duplicates, it appends a space and a number to the file name, before the extension.
IMG_0001.JPG might be accompanied by IMG_0001 2.JPG, IMG_0001 3.JPG and so on. In my case, this went on and on, making up about 2,600 useless files.
To get started, I navigated to the folder in question.
cd ~/Pictures/
Next, let's prove to ourselves that we can list all the files in the directory. You'll notice that the regex has to include the leading . that says "look in this directory". Also, you have to match the whole path, so the .+ is necessary to catch all the other characters.
find -E . -regex '\..+'
Appropriately, the results show the strings that you'll have to match, including the . I mentioned earlier, the slash /, and everything else.
./IMG_1788.JPG
./IMG_1789.JPG
./IMG_1790.JPG
./IMG_1791.JPG
So I can't write this to find duplicates, because the pattern doesn't include the "./":
find -E . -regex 'IMG_[0-9]{4} .+'
but I can write this to find duplicates because it does include the "./"
find -E . -regex '\./IMG_[0-9]{4} .+'
or the fancier version with .*/ as mentioned by @jackjr300, which does the same thing.
find -E . -regex '.*/IMG_[0-9]{4} .+'
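To see the whole-path matching for yourself, here is a small sketch that runs both patterns against a throwaway directory. The `ere_find` helper and the file names are assumptions made for illustration; the helper just selects `-E` on BSD/OS X find or `-regextype posix-extended` on GNU find.

```shell
# Hypothetical helper: choose the extended-regex switch for the installed find.
if find -E . -maxdepth 0 >/dev/null 2>&1; then
  ere_find() { d=$1; shift; find -E "$d" "$@"; }
else
  ere_find() { d=$1; shift; find "$d" -regextype posix-extended "$@"; }
fi

# Throwaway directory with one original and two made-up duplicates.
demo=$(mktemp -d)
touch "$demo/IMG_0001.JPG" "$demo/IMG_0001 2.JPG" "$demo/IMG_0001 3.JPG"
cd "$demo" || exit 1

# No "./" or ".*/" prefix: the regex cannot cover the whole path, so no match.
bare=$(ere_find . -regex 'IMG_[0-9]{4} .+' | LC_ALL=C sort)

# ".*/" absorbs the directory part, so the duplicates are found.
prefixed=$(ere_find . -regex '.*/IMG_[0-9]{4} .+' | LC_ALL=C sort)

printf 'without prefix: [%s]\n' "$bare"
printf 'with .*/ prefix:\n%s\n' "$prefixed"
```

The first result should be empty and the second should list only the two files with a space in their names.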
Lastly, the confusing part. \d isn't recognized by BSD find; [0-9] works just as well. Other users' answers cited the re_format manual, which lists character classes that replace things like \d with a funny square-colon syntax that looks like this: [:digit:]. I tried and tried, but on its own it never worked for me (a class name has to sit inside an outer bracket expression, as in [[:digit:]]). Just using [0-9] is simpler. In my case, I wasted a bunch of time thinking I should have used [:space:] instead of a space, but I found (as usual!) that I just needed to breathe and really read the regex. It turned out to be my mistake. :)
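For anyone who does want the named classes, here is a hedged sketch comparing the double-bracket form with plain ranges. The `ere_find` helper and the sample files are invented for this example; the helper assumes either BSD/OS X find (`-E`) or GNU find (`-regextype posix-extended`).

```shell
# Hypothetical helper: choose the extended-regex switch for the installed find.
if find -E . -maxdepth 0 >/dev/null 2>&1; then
  ere_find() { d=$1; shift; find -E "$d" "$@"; }
else
  ere_find() { d=$1; shift; find "$d" -regextype posix-extended "$@"; }
fi

# Throwaway directory with made-up file names.
demo=$(mktemp -d)
touch "$demo/IMG_0001.JPG" "$demo/IMG_0001 2.JPG" "$demo/notes.txt"

# Named classes: [:digit:] and [:space:] go inside an outer [ ... ].
with_classes=$(ere_find "$demo" -type f \
  -regex '.*/IMG_[[:digit:]]{4}[[:space:]][[:digit:]]+\.JPG')

# The same match written with an explicit range and a literal space.
with_ranges=$(ere_find "$demo" -type f \
  -regex '.*/IMG_[0-9]{4} [0-9]+\.JPG')

printf 'classes: %s\nranges:  %s\n' "$with_classes" "$with_ranges"
```

Both commands should report the same single duplicate file.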
Hope this helps someone!
Upvotes: 0
Reputation: 21
With all your answers, I was finally able to use OS X find (10.8.1) with regex. To give back, here are my findings. We use custom strings to identify clips; the pattern goes like this: "YYMMDDabc##abc*.ext": Year/Month/Day/3chars/2digits/3chars/whatever/ext
find -E /path/to/folder -type f -regex '^/.*/[0-9]{6}[A-Za-z]{3}[0-9]{2}[A-Za-z0-9]{3}\.*.*\.(ext)$'
The initial ^ anchors the pattern at the beginning of the path. [0-9]{6} matches a 6-digit string; \d doesn't work. \D doesn't work for letters; [A-Za-z] does. The $ at the end anchors the pattern at the end of the string.
After reading Apple's man pages for find and re_format, I was completely off track regarding escaping characters.
Upvotes: 2
Reputation: 7191
These commands work on OS X:
find -E . -iregex '.*/[0-9]{8}\.(jpg|png|eps|gif)'
This command matches 12345678.jpg, but not 123456789.jpg.
find -E . -iregex '.*/[0-9]{8,}\.(jpg|png|eps|gif)'
This command matches both 12345678.jpg and 123456789.jpg.
The leading .*/ matches the folder path or the subfolder path.
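Here is a rough way to check the difference between {8} and {8,} without touching real files. The `ere_find` helper and the file names are hypothetical; the helper only picks `-E` for BSD/OS X find or `-regextype posix-extended` for GNU find, assuming one of them is present.

```shell
# Hypothetical helper: choose the extended-regex switch for the installed find.
if find -E . -maxdepth 0 >/dev/null 2>&1; then
  ere_find() { d=$1; shift; find -E "$d" "$@"; }
else
  ere_find() { d=$1; shift; find "$d" -regextype posix-extended "$@"; }
fi

# Throwaway directory with made-up names: 8 digits, 9 digits, wrong extension.
demo=$(mktemp -d)
touch "$demo/12345678.jpg" "$demo/87654321.PNG" \
      "$demo/123456789.jpg" "$demo/12345678.txt"

# Exactly 8 digits; -iregex makes the extension match case-insensitive.
exact=$(ere_find "$demo" -type f \
  -iregex '.*/[0-9]{8}\.(jpg|png|eps|gif)' | LC_ALL=C sort)

# 8 or more digits.
at_least=$(ere_find "$demo" -type f \
  -iregex '.*/[0-9]{8,}\.(jpg|png|eps|gif)' | LC_ALL=C sort)

printf 'exactly 8:\n%s\n' "$exact"
printf '8 or more:\n%s\n' "$at_least"
```

The .txt file should appear in neither list, and the 9-digit name only in the second.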
Upvotes: 10
Reputation: 92569
man re_format explains the specifics of the modern regex that find will accept.
This works for me: -iregex '[0-9]{8}'
Upvotes: 1