Alucard

Reputation: 17308

Question about how to make a filter using script

I'm trying to write a script that filters a log file as follows:

Before:

123.125.66.126 - - [05/Apr/2010:09:18:12 -0300] "GET / HTTP/1.1" 302 290
66.249.71.167 - - [05/Apr/2010:09:18:13 -0300] "GET /robots.txt HTTP/1.1" 404 290
66.249.71.167 - - [05/Apr/2010:09:18:13 -0300] "GET /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf HTTP/1.1" 404 324

After:

[05/Apr/2010:09:18:12 -0300] / 302 290
[05/Apr/2010:09:18:13 -0300] /robots.txt 404 290
[05/Apr/2010:09:18:13 -0300] /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf 404 324

If someone could help it would be great...

Thanks in advance!

Upvotes: 0

Views: 92

Answers (4)

ghostdog74

Reputation: 342949

If your file structure is always like that, you can just use fields; there is no need for a complex regex.

$ awk '{print $4,$5,$7,$9,$10}' file
[05/Apr/2010:09:18:12 -0300] / 302 290
[05/Apr/2010:09:18:13 -0300] /robots.txt 404 290
[05/Apr/2010:09:18:13 -0300] /~leonardo_campos/IFBA/Web_Design_Aula_17.pdf 404 324

Upvotes: 1

phihag

Reputation: 288240

Supporting all HTTP methods:

sed 's#.*\(\[[^]]*\]\).*"[A-Z]* \(.*\) HTTP/[0-9.]*" \(.*\)#\1 \2 \3#'
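To see the "all HTTP methods" part in action, here is a small sketch that feeds the command a POST request. The wrapper name `strip_request` and the sample log line (including its IP address and path) are made up for illustration; only the sed expression comes from the answer.

```shell
# strip_request is a hypothetical wrapper around the sed command above;
# it reads log lines on stdin and writes the filtered lines to stdout.
strip_request() {
  sed 's#.*\(\[[^]]*\]\).*"[A-Z]* \(.*\) HTTP/[0-9.]*" \(.*\)#\1 \2 \3#'
}

# Made-up POST line in the same format as the log in the question.
printf '%s\n' '10.0.0.1 - - [05/Apr/2010:09:18:14 -0300] "POST /form HTTP/1.1" 200 512' | strip_request
```

This should print `[05/Apr/2010:09:18:14 -0300] /form 200 512`, since `[A-Z]*` matches any upper-case method name rather than only GET.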

Upvotes: 1

Didier Trosset

Reputation: 37477

sed is your friend here, with regexps. The pattern has to skip everything before the opening bracket of the timestamp first:

sed 's/^[^[]*\(\[.*\]\) "GET \(.*\) .*" \(.*\)$/\1 \2 \3/'

Upvotes: 1

andcoz

Reputation: 2232

This looks like a perfect job for sed.

You can easily construct a pair of "s" substitution commands to remove the unwanted parts of each line.
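One way such a pair might look, as a rough sketch rather than the only option. It assumes every line has the format shown in the question, and `filter_log` is just a hypothetical helper name:

```shell
# filter_log is a hypothetical helper; it reads log lines on stdin.
# 1st s command: delete everything before the "[" that opens the timestamp.
# 2nd s command: replace the quoted request ("METHOD path PROTOCOL")
#                with the path alone.
filter_log() {
  sed -e 's/^[^[]*//' \
      -e 's/"[A-Z]* \([^ ]*\) [^"]*"/\1/'
}

# Sample line taken from the question.
printf '%s\n' '66.249.71.167 - - [05/Apr/2010:09:18:13 -0300] "GET /robots.txt HTTP/1.1" 404 290' | filter_log
```

For a real file you would run something like `filter_log < access.log`, where `access.log` stands in for whatever your log file is called.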

Upvotes: 1
