Reputation: 143
10.1.2.194 (197.84.211.148) - - [08/Oct/2015:09:01:44 +0000] "GET /merlin-web-za/web/images/refinements/loader.gif HTTP/1.1" 200 4178 0 1868 "http://www.autotrader.co.za/makemodel/make/chevrolet/model/aveo/caryearrangeszar/2012/search?sort=PriceAsc&locationName=Cape%20Town&latitude=-33.92584&longitude=18.42322&county=Western%20Cape" "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36" "ajp://10.1.4.66:8009"
I need to turn that into:
08/Oct/2015:09:01:44 GET /merlin-web-za/web/images/refinements/loader
How can I do it using awk or egrep? I tried the commands below, but the first command prints the full lines that contain both of the following patterns:
awk ' /08/Oct/2015:09:[0-9]{2}:[0-9]{1,2}/ && /GET (/[a-z0-9-]{1,}){1,3}/'
and
cat file | egrep -o "08/Oct/2015:09:[0-9]{2}:[0-9]{1,}.* GET (/[a-z0-9-]{1,}){1,}"
which also matches the text between the two patterns, so as a result I see:
08/Oct/2015:09:01:44 +0000] "GET /merlin-web-za/web/images/refinements/loader
That is not exactly what I want to get.
Upvotes: 2
Views: 1123
Reputation: 626903
You may use
awk '{a=$5" "$7" "$8; gsub(/[]["]|\.[^.]*$/, "", a); print a}'
See the online demo
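As a quick check, the command can be run against the sample line from the question. This is only a sketch: the variable names `line` and `result` are introduced here for illustration, and the referrer and user-agent fields are shortened because they play no part in the output.

```shell
# Sample access-log line from the question (trailing fields shortened for the demo).
line='10.1.2.194 (197.84.211.148) - - [08/Oct/2015:09:01:44 +0000] "GET /merlin-web-za/web/images/refinements/loader.gif HTTP/1.1" 200 4178 0 1868 "http://www.autotrader.co.za/" "Mozilla/5.0" "ajp://10.1.4.66:8009"'

# Field 5 is the timestamp (with a leading [), field 7 is the method (with a
# leading "), and field 8 is the request path.
result=$(printf '%s\n' "$line" |
  awk '{a=$5" "$7" "$8; gsub(/[]["]|\.[^.]*$/, "", a); print a}')

printf '%s\n' "$result"
# prints: 08/Oct/2015:09:01:44 GET /merlin-web-za/web/images/refinements/loader
```

The field numbers only hold while the parenthesised part contains no whitespace, since any extra space there shifts every later field.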
Details
The default field separator - whitespace - is used to split the line into fields.
a=$5" "$7" "$8; - creates a variable by joining Fields 5, 7 and 8 with a space
gsub(/[]["]|\.[^.]*$/, "", a) - removes [, ] and " chars, as well as a . followed by any 0+ chars other than . at the end of the string
print a - prints the result.
However, the file you sent me contains comma+space separated IP addresses inside the first parentheses. You may use
sed -E -n 's/^[^][]*\[([^][[:space:]]+)[^][]*\][ \t]+"([[:alpha:]]+[ \t]+[^[:space:]]+).*/\1 \2/p' access_log > newfile
to get the results you want, namely time + GET/POST + URL.
Details
^ - matches start of string
[^][]* - any 0 or more chars other than [ and ]
\[ - a [ char
([^][[:space:]]+) - Group 1: 1+ chars other than ], [ and whitespace
[^][]* - any 0 or more chars other than [ and ]
\] - a ] char
[ \t]+ - 1+ horizontal whitespace chars
" - a " char
([[:alpha:]]+[ \t]+[^[:space:]]+) - Group 2: 1+ letters, 1+ horizontal whitespaces and then 1+ chars other than whitespace
.* - the rest of the string.
The result is the concatenation of Group 1 and 2 values.
Upvotes: 1