Reputation: 466
So today I was trying to put together some regex for a fail2ban filter. This is where I noticed that fail2ban has some issues with nested OR'ing in regex patterns.
Input string: 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a
Pattern: ^<HOST> -.*\"(c|b)|a
Here's an example:
$ fail2ban-regex "127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] \"a" '^<HOST> -.*\"(c|b)|a'
Running tests
=============
Use failregex line : ^<HOST> -.*\"(c|b)|a
Use single line : 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a
Results
=======
Failregex: 0 total
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [1] Day/MONTH/Year:Hour:Minute:Second
`-
Lines: 1 lines, 0 ignored, 0 matched, 1 missed
|- Missed line(s):
| 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a
`-
I've noticed that this does succeed and report a match if the trailing alternation is reordered to a|(c|b). However, I need both sides of the first OR to be checked: if the first condition matches (for example, the HTTP request type is not POST or GET), the rest of the pattern should be ignored; otherwise the part of the pattern after the first OR should be tried. Grouping doesn't seem to matter either, as only the first alternative of the outermost OR ever seems to match.
Here we get a match:
$ fail2ban-regex "127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] \"a" '^<HOST> -.*\"a|(c|b)'
Running tests
=============
Use failregex line : ^<HOST> -.*\"a|(c|b)
Use single line : 127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a
Results
=======
Failregex: 1 total
|- #) [# of hits] regular expression
| 1) [1] ^<HOST> -.*\"a|(c|b)
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [1] Day/MONTH/Year:Hour:Minute:Second
`-
Lines: 1 lines, 0 ignored, 1 matched, 0 missed
I suspect this may be a bug because sites like regex101.com and debuggex.com report matches for both of these regex patterns.
Upvotes: 0
Views: 345
Reputation: 75222
This regex:
^<HOST> -.*\"(c|b)|a
...is the same as this:
a|^<HOST> -.*\"(c|b)
The only difference is the order the alternatives are tried in. If the regex were all that mattered, this should match either way. However, a quick look at the fail2ban docs tells me every failregex must match the host name/IP associated with the request. You've got essentially two regexes there (^<HOST> -.*\"(c|b) and a), one of which doesn't contain <HOST>.
I'm not sure what you're trying to accomplish, but if you can't do it by putting the pipe inside parens ((a|b|c)), you probably need to use two separate regexes.
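To see how the top-level pipe splits the pattern, here is a minimal Python sketch. It assumes <HOST> can be approximated by a named group, (?P<host>\S+); the real fail2ban expansion is more elaborate, and the try_pattern helper is made up for illustration, not part of fail2ban.

```python
import re

# Simplified stand-in for fail2ban's <HOST> expansion (an assumption;
# the real expansion handles IPv4/IPv6/hostnames in more detail).
HOST = r"(?P<host>\S+)"

line = '127.0.0.1 - - [13/Aug/2016:07:01:45 -0400] "a'


def try_pattern(template):
    """Hypothetical helper: expand <HOST>, search the line, and return
    (matched text, captured host), or None if nothing matches."""
    pattern = template.replace("<HOST>", HOST)
    m = re.search(pattern, line)
    if m is None:
        return None
    return m.group(0), m.group("host")


# Top-level alternation: the regex is really "^<HOST> -.*\"(c|b)" OR "a".
top_level = try_pattern(r'^<HOST> -.*\"(c|b)|a')

# Pipe inside parens: every alternative still goes through <HOST>.
grouped = try_pattern(r'^<HOST> -.*\"(a|b|c)')

print(top_level)  # ('a', None)
print(grouped)    # whole line matched, host is '127.0.0.1'
```

With the top-level pipe, the only thing that matches is the lone a at the end of the line, and the host group is empty. That would be consistent with fail2ban not counting the line, though I haven't checked its source to confirm that is exactly what happens. With the pipe inside the parens, the whole line matches and the host is captured.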
Upvotes: 1