Reputation: 543
I have been trying to match lines based on how many 0s they contain.
My goal is to match all strings that contain between 3 and 5 0s.
So far I have:
egrep '[0]{3,5}' *.txt
Expected output:
20001 [valid]
200134 [invalid]
20103040 [valid]
203004038002 [invalid]
But this only outputs strings in which the zeroes are consecutive.
How can I modify the code so that it also matches zeroes that are not necessarily consecutive?
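To illustrate the problem, here is a small sketch that runs the same pattern (written as 0{3,5}, which is equivalent to [0]{3,5}) against the four sample values. The variable names are only for illustration, and the data comes from a here-string instead of *.txt files.

```shell
# Sample values from the expected output above, one per line.
input='20001
200134
20103040
203004038002'

# The pattern needs a run of at least three adjacent zeros, so 20103040
# (four zeros, none adjacent) is missed even though it should be valid.
consecutive=$(printf '%s\n' "$input" | grep -E '0{3,5}')

printf '%s\n' "$consecutive"
```

Only 20001 is printed, because it is the only value with three zeros in a row.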
Upvotes: 0
Views: 404
Reputation: 203324
An ERE to match integers containing 3-5 0s, if that's what you want, is ^([1-9]*0){3,5}[1-9]*$, e.g.:
$ grep -E '^([1-9]*0){3,5}[1-9]*$' file
20001
20103040
The difference between this and @Toto's answer is that this matches only integers, while @Toto's matches any characters with 0s in between, e.g.:
$ echo '0 foo 0 bar 0' | grep -E '^([1-9]*0){3,5}[1-9]*$'
$ echo '0 foo 0 bar 0' | grep -E '^([^0]*0){3,5}[^0]*$'
0 foo 0 bar 0
Upvotes: 0
Reputation: 1279
The regex you're looking for is:
^(?!(?:.*?0){6,})(?=(?:.*?0){3,})[0-9]+$
Input file:
cat file.txt
20001
200134
20103040
203004038002
Command:
To use the regex I use grep -P, because lookaround notation such as (?! is not supported in egrep:
grep -P '^(?!(?:.*?0){6,})(?=(?:.*?0){3,})[0-9]+$' file.txt
20001
20103040
Explanation: First, a negative lookahead rejects any string that contains six or more 0s anywhere, so at most five are allowed. Then a positive lookahead requires the string to contain at least three 0s. Finally, [0-9]+ requires the whole string to consist of digits.
The ^ is the start of the string, and the $ is the end of the string.
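If grep -P is not available, the same three conditions can be approximated with a chain of plain grep -E filters, one per condition. This is only a sketch of the same logic, not the pattern above; the variable names are made up for illustration.

```shell
# Sample data from the question, one value per line.
input='20001
200134
20103040
203004038002'

# Each stage mirrors one part of the lookahead pattern:
#   '^[0-9]+$'     keeps lines made of digits only
#   '(.*0){3}'     keeps lines with at least three zeros anywhere
#   -v '(.*0){6}'  drops lines with six or more zeros
result=$(printf '%s\n' "$input" |
  grep -E '^[0-9]+$' |
  grep -E '(.*0){3}' |
  grep -vE '(.*0){6}')

printf '%s\n' "$result"
```

On the sample data this prints 20001 and 20103040, the same lines as the grep -P command.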
Upvotes: 0
Reputation: 91385
Input file:
cat file.txt
10203
1020304
102030405
10203040506
1020304050607
Command:
egrep '^([^0]*0){3,5}[^0]*$' file.txt
1020304
102030405
10203040506
Explanation:
^ # beginning of line
( # start group
[^0]* # 0 or more non zero
0 # 1 zero
){3,5} # group must appear from 3 to 5 times
[^0]* # 0 or more non zero
$ # end of line
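As a cross-check, the regex should select the same lines as simply counting the zeros on each line. The sketch below does the count with awk, which is a different technique from the regex above; the variable names are assumptions for illustration.

```shell
# Same sample lines as in the input file above.
input='10203
1020304
102030405
10203040506
1020304050607'

# gsub returns the number of replacements it made, i.e. the number of zeros.
# Replacing "0" with "0" leaves the line itself unchanged.
by_count=$(printf '%s\n' "$input" | awk '{ n = gsub(/0/, "0") } n >= 3 && n <= 5')

# The regex from this answer, for comparison.
by_regex=$(printf '%s\n' "$input" | grep -E '^([^0]*0){3,5}[^0]*$')

printf '%s\n' "$by_count"
```

Both variables end up holding the three lines with 3, 4, and 5 zeros.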
Upvotes: 0
Reputation: 1397
I came up with this solution which would allow you to check for 3-5 0s possibly surrounded by anything that isn't a 0 or a space. Hope this helps :)
\b(?:[^0\s]*?0[^0\s]*?){3,5}\b
If you're checking ONLY strings of numbers with no breaks or other characters in between, you could swap the \bs for ^ and $, remove the \s, and make sure it's only numbers:
^(?:[1-9]*?0[1-9]*?){3,5}$
^ matches the start of the string, and $ matches the end of the string.
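The (?: ... ) groups and lazy *? quantifiers are PCRE syntax, so with grep they would need -P. As a sketch for plain grep -E, assuming only matching lines are wanted and not captures, the digits-only pattern can be written with ordinary groups and greedy quantifiers; laziness changes what each group consumes, not whether a line matches.

```shell
# Sample values from the question, one per line.
input='20001
200134
20103040
203004038002'

# ERE rewrite of the digits-only pattern: plain group, greedy quantifiers.
digits_only=$(printf '%s\n' "$input" | grep -E '^([1-9]*0[1-9]*){3,5}$')

printf '%s\n' "$digits_only"
```

On the question's sample data this prints 20001 and 20103040.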
Upvotes: 1