Reputation: 8648
I have a list of URLS in format of http://WEBSITE.com/XXXXX/YYYYY
where X
and Y
are random characters.
How would I have Python only keep the results that have distinct case-insensitive XXXXX
values? It does not matter if it keeps the YYYYY
portion?
Upvotes: 0
Views: 78
Reputation: 1929
Use a set comprehension:
values = { url.split("/")[3] for url in url_list }
Upvotes: 1
Reputation: 77029
Well, you can easily shave off the last part of the path:
id = "/".join(url.split('/')[:-1]) # split, lose last item, rejoin
Then put your IDs on a set()
to keep them unique:
ids = set()
ids.add(id)
Upvotes: 2
Reputation: 3457
Look at rsplit()
and then use a Set
.
rsplit
is used to split a string by a delimiter, such as '/', and a set
holds unique elements.
https://docs.python.org/2/library/stdtypes.html - rsplit() https://docs.python.org/2/library/stdtypes.html#set - set
Upvotes: 2