Reputation: 37
Goal: I want to match all paths which are NOT in the elasticmapreduce/j-abc123/node/i-abc123/applications directory.
The following is a set of possible paths:
elasticmapreduce/j-abc123/node/i-abc123/applications/hadoop-yarn/hadoop-yarn-proxyserver-ip.log.2020-05-07-00.gz
elasticmapreduce/j-abc123/node/i-abc123/applications/hadoop-yarn/hadoop-yarn-timelineserver-ip.out.gz
elasticmapreduce/j-abc123/node/i-abc123/applications/hadoop-yarn/hadoop-yarn-proxyserver-ip.log.gz
elasticmapreduce/j-abc123/node/i-abc123/applications/hive/user/hive/hive.log.2020-05-07.gz
elasticmapreduce/j-abc123/node/i-abc123/applications
elasticmapreduce/j-abc123/node/i-abc123/bootstrap-actions/master.log.2020-05-07-00.gz
elasticmapreduce/j-abc123/node/i-abc123/bootstrap-actions
elasticmapreduce/j-abc123/node/i-abc123/daemons/instance-state/instance-state.log-2020-05-08-13-30.gz
elasticmapreduce/j-abc123/node/i-abc123/daemons/setup-dns.log.gz
elasticmapreduce/j-abc123/node/i-abc123/provision-node/abc123/stderr.gz
elasticmapreduce/j-abc123/node/i-abc123/provision-node/apps-phase/0/abc123/stderr.gz
elasticmapreduce/j-abc123/node/i-abc123/provision-node/reports/0/abc123/ip.ec2.internal/201805270306.yaml.gz
elasticmapreduce/j-abc123/node/i-abc123/setup-devices/setup_var_log_dir.log.gz
The following regex matches all paths containing elasticmapreduce/j-abc123/node/i-abc123/applications:
^elasticmapreduce\/j-.*\/node\/i-.*\/(applications(\/.*)*)$
I want to match all paths which were NOT matched by the above regex pattern.
Why doesn't the following regex do this?
^elasticmapreduce\/j-.*\/node\/i-.*\/(?!(applications(\/.*)*))$
PS, I use https://regex101.com/ to test regex patterns.
Upvotes: 0
Views: 39
Reputation: 163207
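The pattern that you tried does not work as you intended. The greedy .* parts first match as much as possible, then backtrack to the last occurrence of a / and from there the rest of the pattern has to fulfill this part: (?!(applications(\/.*)*))$
The lookahead asserts that what is directly to the right is not applications, optionally followed by 0 or more repetitions of a / and any chars. A lookahead does not consume any characters, so the $ then has to assert the end of the string directly after that /.
The engine keeps backtracking to earlier / positions, but none of the example paths ends with a /, so it can not match any of the examples.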
I think it shows better when you omit the $ and see where the match ends:
https://regex101.com/r/aXV8vO/1
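The same behaviour can be checked outside regex101. Below is a minimal sketch in Python (my choice of language, the question does not name one) that runs the pattern from the question against three of the example paths, plus one hypothetical path ending in a / that I added only to show the single kind of input the pattern accepts:

```python
import re

# The pattern from the question, unchanged.
attempt = re.compile(
    r"^elasticmapreduce\/j-.*\/node\/i-.*\/(?!(applications(\/.*)*))$"
)

samples = [
    "elasticmapreduce/j-abc123/node/i-abc123/applications",
    "elasticmapreduce/j-abc123/node/i-abc123/bootstrap-actions",
    "elasticmapreduce/j-abc123/node/i-abc123/daemons/setup-dns.log.gz",
    # Hypothetical path, not from the question: it ends with a slash.
    "elasticmapreduce/j-abc123/node/i-abc123/",
]

# The lookahead is zero-width, so "$" must match right after a "/".
attempt_results = {path: bool(attempt.search(path)) for path in samples}

for path, matched in attempt_results.items():
    print(matched, path)
```

Only the path with the trailing slash should report a match, which is why none of the paths listed in the question are matched.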
As you are not matching a part that contains a forward slash after j- and i-, you could make use of a negated character class [^\/]+ instead, matching any char except a forward slash.
Then use the negative lookahead \/(?!applications\b) right after matching the forward slash.
^elasticmapreduce\/j-[^\/]+\/node\/i-[^\/]+\/(?!applications\b)[^\/]*(?:\/.*)?$
Note: If you don't want to cross newlines, you could use [^\/\r\n]+ instead.
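As a quick check, here is a small Python sketch (the language is an assumption on my part) that applies the suggested pattern to the example paths from the question and keeps only the ones that match. The variable names are just for illustration:

```python
import re

# The suggested pattern: negated character classes plus a negative lookahead.
suggested = re.compile(
    r"^elasticmapreduce\/j-[^\/]+\/node\/i-[^\/]+\/(?!applications\b)[^\/]*(?:\/.*)?$"
)

prefix = "elasticmapreduce/j-abc123/node/i-abc123/"

# The example paths from the question, relative to the common prefix.
paths = [prefix + tail for tail in [
    "applications/hadoop-yarn/hadoop-yarn-proxyserver-ip.log.2020-05-07-00.gz",
    "applications/hadoop-yarn/hadoop-yarn-timelineserver-ip.out.gz",
    "applications/hadoop-yarn/hadoop-yarn-proxyserver-ip.log.gz",
    "applications/hive/user/hive/hive.log.2020-05-07.gz",
    "applications",
    "bootstrap-actions/master.log.2020-05-07-00.gz",
    "bootstrap-actions",
    "daemons/instance-state/instance-state.log-2020-05-08-13-30.gz",
    "daemons/setup-dns.log.gz",
    "provision-node/abc123/stderr.gz",
    "provision-node/apps-phase/0/abc123/stderr.gz",
    "provision-node/reports/0/abc123/ip.ec2.internal/201805270306.yaml.gz",
    "setup-devices/setup_var_log_dir.log.gz",
]]

# Keep only the paths that are NOT under the applications directory.
kept = [p for p in paths if suggested.search(p)]

for p in kept:
    print(p)
```

The five applications paths are filtered out and the other eight are kept. Because of the \b, a sibling directory such as applications-old would also be excluded, while applications2 would still match.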
Upvotes: 1