Reputation: 116
I am trying to separate a list of URLs into separate lists depending on the name. I have these URLs:
['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
I am trying to separate them based on the name variable. This is my desired output:
['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130']
['www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130']
['www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
I found this answer, "Split a list of urls with similar pattern into dicts", but it doesn't produce the output I need and I can't work out how to adapt it. Any help would be appreciated.
Upvotes: 1
Views: 290
Reputation: 27567
Here is how you can use sorted() with a custom key:
import re
a = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130',
'www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130',
'www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
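# Sort by the name segment (second-to-last part of the path)
b = sorted(a, key=lambda c: c.split('/')[-2])
# Group size: number of URLs under 'name/' (assumes every name has the same number of URLs)
d = len(re.findall('name/', ''.join(b)))
# Slice the sorted list into chunks of that size
e = [b[x:x+d] for x in range(0, len(b), d)]
print(e)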
Output:
[['www.example.com/name/0900', 'www.example.com/name/1000', 'www.example.com/name/1130'],
['www.example.com/name1/0900', 'www.example.com/name1/1000', 'www.example.com/name1/1130'],
['www.example.com/name2/0900', 'www.example.com/name2/1000', 'www.example.com/name2/1130']]
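Note that the slicing step relies on every name having the same number of URLs, because `d` is counted from the `name/` group only. A minimal sketch of an alternative using `itertools.groupby`, which should tolerate groups of different sizes. The names `urls`, `name_of` and `groups` are chosen here for illustration, and the input is an uneven subset of the question's URLs:

```python
from itertools import groupby

# Uneven subset of the question's URLs, deliberately out of order
urls = [
    'www.example.com/name/0900', 'www.example.com/name1/0900',
    'www.example.com/name/1000', 'www.example.com/name2/0900',
    'www.example.com/name1/1000', 'www.example.com/name/1130',
]

def name_of(url):
    # Second-to-last path segment, e.g. 'name1' in 'www.example.com/name1/0900'
    return url.split('/')[-2]

# groupby only merges adjacent items, so sort by the same key first
groups = [list(g) for _, g in groupby(sorted(urls, key=name_of), key=name_of)]
print(groups)
```

Because `sorted` is stable, the URLs inside each group keep their original relative order.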
Upvotes: 0
Reputation: 703
from collections import defaultdict
data = [
'www.example.com/name/0900', 'www.example.com/name/1000',
'www.example.com/name/1130', 'www.example.com/name1/0900',
'www.example.com/name1/1000', 'www.example.com/name1/1130',
'www.example.com/name2/0900', 'www.example.com/name2/1000',
'www.example.com/name2/1130'
]
output = {
'name': [
'www.example.com/name/0900',
'www.example.com/name/1000',
'www.example.com/name/1130'
],
'name1': [
'www.example.com/name1/0900',
'www.example.com/name1/1000',
'www.example.com/name1/1130'
],
'name2': [
'www.example.com/name2/0900',
'www.example.com/name2/1000',
'www.example.com/name2/1130'
]
}
name_index = 1
result = defaultdict(list)
for url in data:
    name = url.split('/')[name_index]
    result[name].append(url)
assert output == result
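One caveat: `name_index = 1` assumes the URLs have no scheme, as in the question. If an entry started with `https://`, splitting on `/` would put an empty string at index 1 instead of the name. A sketch that strips any scheme before splitting. The helper `name_from_url` and the sample URLs with a scheme are hypothetical, added only for illustration:

```python
from collections import defaultdict

def name_from_url(url, name_index=1):
    # Drop a leading scheme such as 'https://' if present (hypothetical helper)
    without_scheme = url.split('://', 1)[-1]
    return without_scheme.split('/')[name_index]

# Illustrative input mixing URLs with and without a scheme
data = [
    'www.example.com/name/0900',
    'https://www.example.com/name/1000',
    'http://www.example.com/name1/0900',
]

result = defaultdict(list)
for url in data:
    result[name_from_url(url)].append(url)
print(dict(result))
```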
Upvotes: 0
Reputation: 350
To generalise the positions of the variable of interest and the number, sort each group by number, and return a list of lists:
# Specify positions of variable/number of interest
i_var = 1
i_num = 2
# Split by variable of interest (per Rakesh's excellent answer);
# `urls` is the list of URLs from the question
result = {}
for url in urls:
    result.setdefault(url.split("/")[i_var], []).append(url)
# Sort each group by number
out = []
for key, values in result.items():
    out.append(sorted(values, key=lambda x: x.split("/")[i_num]))
Upvotes: 0
Reputation: 1644
You could do this:
a = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
b = {}
for elem in a:
    name = elem.split("/")[1]
    try:
        b[name].append(elem)
    except KeyError:
        # First URL seen for this name: start a new list
        b[name] = [elem]
print(b)
This is a simple way to split the list without knowing in advance how many distinct names there are.
Upvotes: 1
Reputation: 82765
This is one approach, using str.split and storing the results in a dict.
Ex:
data = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
result = {}
for url in data:
    result.setdefault(url.split("/")[1], []).append(url)
print(result)
Output:
{'name': ['www.example.com/name/0900',
'www.example.com/name/1000',
'www.example.com/name/1130'],
'name1': ['www.example.com/name1/0900',
'www.example.com/name1/1000',
'www.example.com/name1/1130'],
'name2': ['www.example.com/name2/0900',
'www.example.com/name2/1000',
'www.example.com/name2/1130']}
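The question asks for separate lists rather than a dict. A minimal sketch of that last step, assuming Python 3.7+ (where dicts preserve insertion order), so the groups come out in order of first appearance. The variable `groups` is introduced here for illustration:

```python
data = ['www.example.com/name/0900', 'www.example.com/name/1000', 'www.example.com/name/1130',
        'www.example.com/name1/0900', 'www.example.com/name1/1000', 'www.example.com/name1/1130',
        'www.example.com/name2/0900', 'www.example.com/name2/1000', 'www.example.com/name2/1130']

result = {}
for url in data:
    result.setdefault(url.split("/")[1], []).append(url)

# Drop the keys and keep only the grouped lists
groups = list(result.values())
for group in groups:
    print(group)
```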
Upvotes: 3
Reputation: 604
You can iterate over the list and filter it with a simple condition for each name:
al = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
name = [name for name in al if 'name/' in name]
name1 = [name1 for name1 in al if 'name1/' in name1]
name2 = [name2 for name2 in al if 'name2/' in name2]
When you print them, you get:
>>> print(name)
['www.example.com/name/0900', 'www.example.com/name/1000', 'www.example.com/name/1130']
>>> print(name1)
['www.example.com/name1/0900', 'www.example.com/name1/1000', 'www.example.com/name1/1130']
>>> print(name2)
['www.example.com/name2/0900', 'www.example.com/name2/1000', 'www.example.com/name2/1130']
Upvotes: 1
Reputation: 1
You need to assign the list to a variable first, for example: example = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130']
Upvotes: -1