Reputation: 2712
I have a directory (Linux/Unix) on an Apache server with a lot of subdirectories, each containing a lot of files, like this:
- Dir
  - 2010_01/
    - 142_78596_101_322.pdf
    - 12_10.pdf
    - ...
  - 2010_02/
    - ...
How can I find all files whose names look like *_*_*_*.pdf, where each * is always made up of digits?
I tried to solve it like this:
ls -1Rl 2010-01 | grep -i '\(\d)+[_](\d)+[_](\d)+[_](\d)+[.](pdf)$' | wc -l
But the regular expression \(\d)+[_](\d)+[_](\d)+[_](\d)+[.](pdf)$ doesn't work with grep.
Edit 1: Trying ls -l 2010-03 | grep -E '(\d+_){3}\d+\.pdf' | wc -l, for example, finds nothing, so that doesn't work either.
Upvotes: 5
Views: 2868
Reputation: 2712
Thanks to gbchaosmaster and the wolf, I found a way that works for me:
Inside a directory:
find . | grep -P "(\d+_){3}\d+\.pdf" | wc -l
At the root directory:
find 20*/ | grep -P "(\d+_){3}\d+\.pdf" | wc -l
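The pattern above is not anchored, so it also accepts names such as 1_2_3_4_5.pdf (the match can start at 2_3_4_5.pdf). A minimal sketch of an anchored variant follows, assuming a throwaway directory made up for illustration (demo_dir and its file names are hypothetical). It swaps grep -P and \d for grep -E and [0-9], which should behave the same for ASCII digits and does not need PCRE support:

```shell
# Build a small throwaway tree that mirrors the layout in the question.
# (demo_dir and the file names are made up for this sketch.)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/2010_01" "$demo_dir/2010_02"
touch "$demo_dir/2010_01/142_78596_101_322.pdf" \
      "$demo_dir/2010_01/12_10.pdf" \
      "$demo_dir/2010_01/1_2_3_4_5.pdf" \
      "$demo_dir/2010_02/7_8_9_10.pdf"

# The leading "/" pins the match to the start of the file name and "$" to
# its end, so only names with exactly four digit runs are accepted.
pattern='/([0-9]+_){3}[0-9]+\.pdf$'

# -type f skips directories; grep -c counts matching lines, replacing "| wc -l".
count=$(find "$demo_dir" -type f -name '*.pdf' | grep -cE "$pattern")
echo "$count"

rm -rf "$demo_dir"
```

Here only 142_78596_101_322.pdf and 7_8_9_10.pdf are counted; 12_10.pdf and 1_2_3_4_5.pdf are rejected by the anchors.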
Upvotes: 0
Reputation: 35512
First, you should be using egrep rather than grep, or call grep with -E for extended patterns.
So this works for me:
$ cat test2.txt
- Dir
- 2010_01/
- 142_78596_101_322.pdf
- 12_10.pdf
- ...
- 2010_02/
- ...
Now egrep that file:
$ cat test2.txt | egrep '((?:\d+_){3}(?:\d+)\.pdf$)'
- 142_78596_101_322.pdf
Since there are parentheses around the whole pattern, the entire file name is captured as a group (grep prints the whole matching line either way).
Note that the pattern does NOT work with grep in traditional mode:
$ cat test2.txt | grep '((?:\d+_){3}(?:\d+)\.pdf$)'
... no return
But it DOES work if you use the extended pattern switch (the same as calling egrep):
$ cat test2.txt | grep -E '((?:\d+_){3}(?:\d+)\.pdf$)'
- 142_78596_101_322.pdf
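One caveat: \d and (?:...) are Perl-style constructs that POSIX extended regular expressions do not define, so whether they work under -E depends on the grep implementation (which would explain why the asker's -E attempt found nothing). A minimal sketch of a more portable spelling, assuming the same sample listing written to a temporary file:

```shell
# Recreate the sample listing from above in a temporary file.
listing=$(mktemp)
printf '%s\n' '- Dir' '- 2010_01/' '- 142_78596_101_322.pdf' \
    '- 12_10.pdf' '- ...' '- 2010_02/' '- ...' > "$listing"

# [0-9] and plain (...) groups are part of POSIX ERE, so this pattern
# does not rely on \d or (?:...) being supported.
matches=$(grep -E '([0-9]+_){3}[0-9]+\.pdf$' "$listing")
echo "$matches"

rm -f "$listing"
```

This prints the single line containing 142_78596_101_322.pdf, the same result as shown above.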
Upvotes: 1
Reputation: 1704
Try using find.
The command that satisfies your specification __*_*.pdf, where * is always a digit:
find 2010_10/ -regex '__\d+_\d+\.pdf'
However, based on the regex that you tried, you seem to want a sequence of four numbers separated by underscores:
(\d+_){3}\d+\.pdf
Or do you want to match all names containing solely numbers/underscores?
[\d_]+\.pdf
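Two details matter when passing these patterns to find: -regex is matched against the whole path rather than just the file name, and the default regex flavor of GNU find does not understand \d. A minimal sketch, assuming GNU findutils (the -regextype option is GNU-specific) and a throwaway directory whose file names are made up for illustration:

```shell
# Throwaway test tree (hypothetical names, for illustration only).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/2010_10"
touch "$demo_dir/2010_10/142_78596_101_322.pdf" \
      "$demo_dir/2010_10/12_10.pdf" \
      "$demo_dir/2010_10/report_2010.pdf"

# -regex must match the entire path, hence the leading ".*/".
# -regextype posix-extended (GNU find only) enables {3} and + without
# backslashes; [0-9] stands in for \d.
four_numbers=$(find "$demo_dir/2010_10" -regextype posix-extended \
    -regex '.*/([0-9]+_){3}[0-9]+\.pdf' | wc -l)

# Looser variant: names consisting solely of digits and underscores.
digits_or_underscores=$(find "$demo_dir/2010_10" -regextype posix-extended \
    -regex '.*/[0-9_]+\.pdf' | wc -l)

echo "$four_numbers $digits_or_underscores"
rm -rf "$demo_dir"
```

The first count covers only 142_78596_101_322.pdf, while the looser pattern also accepts 12_10.pdf.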
Upvotes: 3