Reputation: 11
10.177.116.76 - U031503@nttdata [11/Mar/2013:09:42:44 +0900] "GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 "https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET CLR 1.1.4322)"
The above is an access log line. It contains two action ids.
I want to extract only the first action id, the one before HTTP, using a regex pattern.
Right now I use this pattern:
([^/\"]*).action
It matches both action ids, wherever they appear in the line.
I have been stuck on this problem for two days. Could you please help me?
Upvotes: 1
Views: 118
Reputation: 2749
If you're sure it will always be followed by HTTP, you can use a lookahead:
([^/\"]*).action(?=\sHTTP)
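As a rough illustration, here is how the lookahead could be applied in Python (the thread does not say which language is in use, so that choice is an assumption). The user-agent string is shortened, the helper name `first_action_id` is made up for this sketch, and the dot before `action` is escaped so that it only matches a literal period.

```python
import re

# Sample access-log line from the question (user agent shortened for brevity).
LOG_LINE = (
    '10.177.116.76 - U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0)"'
)

# Capture the name before ".action", but only when whitespace and "HTTP" follow.
# The lookahead checks for " HTTP" without making it part of the match.
FIRST_ACTION = re.compile(r'([^/"]*)\.action(?=\sHTTP)')


def first_action_id(line):
    """Return the action id that precedes ' HTTP', or None if there is none."""
    match = FIRST_ACTION.search(line)
    return match.group(1) if match else None


print(first_action_id(LOG_LINE))  # ga004rp0002
```

The action id in the referer URL is followed by `?messageId=...` rather than by ` HTTP`, so the lookahead rejects it.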
Upvotes: 0
Reputation: 10367
Try this:
(?<=GET\s).*?([^/\"]*).action
Or use this:
([^/\"]*).action.*?([^/\"]*).action
and take group 1.
Explanation:
*? matches the previous element zero or more times, but as few times as possible.
(?<=subexpression) is a zero-width positive lookbehind assertion.
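A minimal Python sketch of the lookbehind variant, assuming Python is acceptable (the thread does not name a language). The helper name `action_after_get` is hypothetical, the user agent is shortened, and the dot before `action` is escaped here so it matches only a literal period.

```python
import re

# Sample access-log line from the question (user agent shortened for brevity).
LOG_LINE = (
    '10.177.116.76 - U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0)"'
)

# (?<=GET\s) anchors the match right after "GET ".
# .*? lazily skips the leading path segments ("/infovia/ga/").
# ([^/"]*) then captures the last segment before ".action".
AFTER_GET = re.compile(r'(?<=GET\s).*?([^/"]*)\.action')


def action_after_get(line):
    """Return the action id of the GET request path, or None if absent."""
    match = AFTER_GET.search(line)
    return match.group(1) if match else None


print(action_after_get(LOG_LINE))  # ga004rp0002
```

Because the search stops at the first successful match, the second action id in the referer is never reached.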
Upvotes: 1
Reputation: 424983
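This matches only at the first action, the one followed by a space and the protocol:
action \S+" (\d+)
Group 1 of the match holds the number that follows the closing quote, which is 302 in the sample line. To get the action id itself, put the capturing group in front of action instead.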
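For reference, a small Python sketch (language choice is an assumption) showing what group 1 of this pattern holds on the sample line. The second pattern, `ID_BEFORE_PROTOCOL`, is a variation introduced here for illustration and is not part of the answer: it moves the capturing group in front of `.action`. The user agent is shortened.

```python
import re

# Sample access-log line from the question (user agent shortened for brevity).
LOG_LINE = (
    '10.177.116.76 - U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0)"'
)

# Pattern as posted: group 1 is the run of digits after the closing quote.
AS_POSTED = re.compile(r'action \S+" (\d+)')

# Hypothetical variation: capture the name before ".action" instead.
ID_BEFORE_PROTOCOL = re.compile(r'([^/"]*)\.action \S+"')

posted = AS_POSTED.search(LOG_LINE)
variant = ID_BEFORE_PROTOCOL.search(LOG_LINE)

print(posted.group(1))   # 302
print(variant.group(1))  # ga004rp0002
```

Both patterns rely on `action` being followed by a space, which is true for the request line but not for the referer URL, where `?messageId=...` follows.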
Upvotes: 1
Reputation: 59699
If I understand your question correctly, there are two "action IDs" in the string and you want to capture both. Your current regex can match both, but depending on how you evaluate it, you may only be getting the first match. To extract both with a single match, repeat the regex and consume everything between the two parts you want to capture:
([^/\"]*).action.*?([^/\"]*).action
This is your regex, ([^/\"]*).action, repeated twice with .*? in the middle, which matches any characters, but as few as possible. The lazy quantifier matters: with a greedy .* the second group can come back empty, because [^/\"]* is allowed to match nothing. Both action IDs are then available in capturing groups one and two.
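A short Python sketch of the two-group approach (Python is an assumption, and `both_action_ids` is a made-up helper name). It uses a lazy `.*?` between the two parts; the greedy form is included only to show that, on the sample line, a greedy `.*` swallows the second id and leaves group 2 empty. The user agent is shortened and the dots before `action` are escaped.

```python
import re

# Sample access-log line from the question (user agent shortened for brevity).
LOG_LINE = (
    '10.177.116.76 - U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0)"'
)

# Lazy gap: stops as soon as a second "<name>.action" can be matched.
LAZY = re.compile(r'([^/"]*)\.action.*?([^/"]*)\.action')

# Greedy gap: runs as far as possible, so group 2 shrinks to an empty string.
GREEDY = re.compile(r'([^/"]*)\.action.*([^/"]*)\.action')


def both_action_ids(line):
    """Return (first_id, second_id), or None if the line lacks two actions."""
    match = LAZY.search(line)
    return match.groups() if match else None


print(both_action_ids(LOG_LINE))       # ('ga004rp0002', 'ga013rp0004')
print(GREEDY.search(LOG_LINE).groups())  # ('ga004rp0002', '')
```

If only the first id is needed, group 1 of either pattern is enough, but the lazy form is the one that returns both ids intact.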
Upvotes: 0