Reputation: 17879
I have an access log with many lines in the following format:
1.2.3.4:443 - - [11/Mar/2020:09:41:05 +0100] RESPONSE_CODE:[200] AGE: [-] CACHE_MISS: [-] CACHE-STATUS: [-] SIZE: [1288] RESPONSE_TIME: [2/2125012] (microseconds) WAS:[was.internal:9444] "PUT /kudosboards/node/a8740540-801a-43a6-822a-d58a2424fd3f HTTP/1.1" 200 REFERER: "https://ihs.internal/kudosboards/"
I just want to get the response time, so in this example `2/2125012`. My idea was to write a regex pattern that matches the content of the brackets in one group, and everything before/after it in other groups, so I could replace the entire line with just this value:
^(.*)RESPONSE_TIME: \[([^\]]+)(.*)$
Using regex101 with an example input string, it gave me `2/2125012` as the second group, as expected:
Group 2 2/2125012
To use this pattern with `sed`, I escaped the brackets like this:
$ sed 's#^\(.*\)RESPONSE_TIME: \[\([\^\]]+\)\(.*\)$#\2#g' testfile
1.2.3.4:443 - - [11/Mar/2020:09:41:05 +0100] RESPONSE_CODE:[200] AGE: [-] CACHE_MISS: [-] CACHE-STATUS: [-] SIZE: [1288] RESPONSE_TIME: [2/2125012] (microseconds) WAS:[was.internal:9444] "PUT /kudosboards/node/a8740540-801a-43a6-822a-d58a2424fd3f HTTP/1.1" 200 REFERER: "https://ihs.internal/kudosboards/"
Why is nothing replaced? I escaped `(` and `[`.
It seems that this has something to do with the square brackets:
$ sed 's#^\(.*\)RESPONSE_TIME: \[\(.*\)\] (micro\(.*\)$#\2#g' testfile
2/2125012
This worked, but that pattern is not very specific. I'd like to make it more specific by using e.g. `[0-9]+/[0-9]+` for the pattern inside the brackets instead of the `(.*)` wildcard pattern.
Upvotes: 1
Views: 64
Reputation: 203209
$ awk -F'[][]' '{print $14}' file
2/2125012
If that's not all you need then edit your question to provide more truly representative sample input/output including cases that the above doesn't work for.
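If the number of bracketed fields before `RESPONSE_TIME` can vary, a fixed field index may not hold. A minimal sketch of a position-independent variant, assuming a POSIX awk with `match()`, `RSTART` and `RLENGTH`; the variable names `line` and `rt` are illustrative only:

```shell
# Sample line from the question ("line" is a hypothetical variable name).
line='1.2.3.4:443 - - [11/Mar/2020:09:41:05 +0100] RESPONSE_CODE:[200] AGE: [-] CACHE_MISS: [-] CACHE-STATUS: [-] SIZE: [1288] RESPONSE_TIME: [2/2125012] (microseconds) WAS:[was.internal:9444] "PUT /kudosboards/node/a8740540-801a-43a6-822a-d58a2424fd3f HTTP/1.1" 200 REFERER: "https://ihs.internal/kudosboards/"'

# Locate the labelled field by name instead of by position.
# 'RESPONSE_TIME: [' is 16 characters long, so skip those and
# drop the closing ']' (16 + 1 = 17 characters in total).
rt=$(printf '%s\n' "$line" | awk 'match($0, /RESPONSE_TIME: \[[0-9]+\/[0-9]+]/) {
    print substr($0, RSTART + 16, RLENGTH - 17)
}')

printf '%s\n' "$rt"
```

Lines without a matching `RESPONSE_TIME: [digits/digits]` field print nothing.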
Upvotes: 1
Reputation: 626689
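Your pattern contains an issue related to the use of POSIX BRE/ERE. Inside a bracket expression a backslash is a literal character, so `[\^\]` is already a complete bracket expression that matches `\` or `^`; the `]+` that follows then matches a literal `]` and, in BRE, a literal `+` char. To match chars other than `]`, use `[^]]` (a `]` placed right after the `^` is literal) and quantify it with `*` (that matches 0 or more occurrences) instead of `+`, or `\+` in GNU `sed`, or `\{1,\}` in a generic POSIX BRE.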
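The quantifier options above can be tried against the sample line from the question, here with the more specific digits/digits shape the question asks for. This is only a sketch: the variable names `line`, `bre` and `ere` are illustrative, and the second command assumes a `sed` that accepts `-E` for extended regular expressions (GNU and BSD `sed` both do).

```shell
# Sample line from the question ("line" is a hypothetical variable name).
line='1.2.3.4:443 - - [11/Mar/2020:09:41:05 +0100] RESPONSE_CODE:[200] AGE: [-] CACHE_MISS: [-] CACHE-STATUS: [-] SIZE: [1288] RESPONSE_TIME: [2/2125012] (microseconds) WAS:[was.internal:9444] "PUT /kudosboards/node/a8740540-801a-43a6-822a-d58a2424fd3f HTTP/1.1" 200 REFERER: "https://ihs.internal/kudosboards/"'

# POSIX BRE: \{1,\} is the portable "one or more" quantifier,
# and groups are written as \( ... \).
bre=$(printf '%s\n' "$line" |
    sed -n 's#.*RESPONSE_TIME: \[\([0-9]\{1,\}/[0-9]\{1,\}\)].*#\1#p')

# ERE (assumes sed supports -E): + is a quantifier and groups
# are plain parentheses.
ere=$(printf '%s\n' "$line" |
    sed -E -n 's#.*RESPONSE_TIME: \[([0-9]+/[0-9]+)].*#\1#p')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands print `2/2125012` for the sample line. With `-n` and the `p` flag, lines that do not match produce no output.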
You may fix the `sed` command by using
sed -n 's#.*RESPONSE_TIME: \[\([^]]*\).*#\1#p' testfile
See the online `sed` demo.
Details
`-n` - suppresses the default line output
`.*RESPONSE_TIME: \[\([^]]*\).*` - matches any 0+ chars, `RESPONSE_TIME:`, space, `[`, then captures into Group 1 any zero or more chars other than `]`, and then matches the rest of the string
`\1` - replaces the match with the Group 1 value
`p` - prints the result of the substitution.
Upvotes: 1