Reputation: 183
I'm new to python,learning the basics.
My Query : I have multiple pages accessed as a request from a log file like the below,
"GET /img/home/search-user-ico.jpg HTTP/1.1"
"GET /SpellCheck/am.tlx HTTP/1.1"
"GET /img/plan-comp-nav.jpg HTTP/1.1"
"GET /ie6.css HTTP/1.1"
"GET /img/portlet/portlet-content-bg.jpg HTTP/1.1"
"GET /SpellCheck/am100k2.clx HTTP/1.1"
"GET /SpellCheck/am.tlx HTTP/1.1"
My question is i want only the file part from the page,
For example,
Let us consider "GET /img/home/search-user-ico.jpg HTTP/1.1" ,"GET /ie6.css HTTP/1.1"
as a page then from the above i want to split search-user-ico.jpg HTTP
, ie6.css HTTP
.
so experts please help me in writing the python script for the above to split.
Upvotes: 1
Views: 3016
Reputation: 43
data = [
"GET /img/home/search-user-ico.jpg HTTP/1.1",
"GET /SpellCheck/am.tlx HTTP/1.1",
"GET /img/plan-comp-nav.jpg HTTP/1.1" ,
"GET /ie6.css HTTP/1.1",
"GET /img/portlet/portlet-content-bg.jpg HTTP/1.1",
"GET /SpellCheck/am100k2.clx HTTP/1.1" ,
"GET /SpellCheck/am.tlx HTTP/1.1"
]
for url in data:
print url.split(' ')[1].split('/')[-2]
Upvotes: 1
Reputation: 2527
If the format of your links is similar. Another solution would be:
request = "GET /img/home/search-user-ico.jpg HTTP/1.1"
parts = request.split("/")
parts[-2] //returns search-user-ico.jpg HTTP
Upvotes: 0
Reputation: 15722
data = [
"GET /img/home/search-user-ico.jpg HTTP/1.1",
"GET /SpellCheck/am.tlx HTTP/1.1",
"GET /img/plan-comp-nav.jpg HTTP/1.1" ,
"GET /ie6.css HTTP/1.1",
"GET /img/portlet/portlet-content-bg.jpg HTTP/1.1",
"GET /SpellCheck/am100k2.clx HTTP/1.1" ,
"GET /SpellCheck/am.tlx HTTP/1.1"
]
for url in data:
print url.split(' ')[1].split('/')[-1]
Upvotes: 0
Reputation: 5343
Assuming that you don't have spaces in the filenames and that you don't want "HTTP" at the end.
You can split the line by space.
parts = line.split(" ")
and then use the os
module to get the filename from the path.
filename = os.path.basename(parts[1])
For example.
>>> line = "GET /img/home/search-user-ico.jpg HTTP/1.1"
>>> parts = line.split(" ")
>>> parts[1]
'/img/home/search-user-ico.jpg'
>>> os.path.basename(parts[1])
'search-user-ico.jpg'
Upvotes: 3