Reputation: 17308
I'm trying to write a script that filters log lines as follows:
Before:
123.125.66.126 - - [05/Apr/2010:09:18:12 -0300] "GET / HTTP/1.1" 302 290
66.249.71.167 - - [05/Apr/2010:09:18:13 -0300] "GET /robots.txt HTTP/1.1" 404 290
66.249.71.167 - - [05/Apr/2010:09:18:13 -0300] "GET /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf HTTP/1.1" 404 324
After:
[05/Apr/2010:09:18:12 -0300] / 302 290
[05/Apr/2010:09:18:13 -0300] /robots.txt 404 290
[05/Apr/2010:09:18:13 -0300] /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf 404 324
If someone could help, that would be great.
Thanks in advance!
Upvotes: 0
Views: 92
Reputation: 342949
If your file structure is always like that, you can just use fields; there's no need for a complex regex.
$ awk '{print $4,$5,$7,$9,$10}' file
[05/Apr/2010:09:18:12 -0300] / 302 290
[05/Apr/2010:09:18:13 -0300] /robots.txt 404 290
[05/Apr/2010:09:18:13 -0300] /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf 404 324
Upvotes: 1
Reputation: 288240
Supporting all HTTP methods:
sed 's#.*\(\[[^]]*\]\).*"[A-Z]* \(.*\) HTTP/[0-9.]*" \(.*\)#\1 \2 \3#'
Upvotes: 1
Reputation: 37477
sed is your friend here, with regexps.
sed 's/^.*\(\[.*\]\) "GET \(.*\) .*" \(.*\)$/\1 \2 \3/'
Upvotes: 1
Reputation: 2232
This looks like a perfect job for sed.
You can easily chain a pair of "s" (substitute) commands to strip the unwanted parts of each line.
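As a rough sketch of the two-substitution idea, assuming every line has the same layout as the samples in the question: the first "s" command drops everything before the opening bracket of the timestamp, and the second reduces the quoted request to just its path. The function name filter_log is made up for illustration.

```shell
# Hypothetical helper: reads log lines on stdin, writes the trimmed form.
filter_log() {
  # 1st expression: delete everything before the first "[" (IP and the two dashes).
  # 2nd expression: replace "METHOD path PROTOCOL" with the path alone.
  sed -e 's/^[^[]*//' \
      -e 's/"[A-Z]* \([^ ]*\) [^"]*"/\1/'
}

# Example run on one line taken from the question.
printf '%s\n' '123.125.66.126 - - [05/Apr/2010:09:18:12 -0300] "GET / HTTP/1.1" 302 290' | filter_log
```

For a whole file, something like `filter_log < access.log` should work, where access.log is a placeholder name. Lines that do not follow the sample layout may pass through only partly changed.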
Upvotes: 1