TheAmazingHAzza

Reputation: 116

Split list of URLs into separate lists

I am trying to separate a list of URLs into separate lists depending on the name. I have these URLs:

['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']

I am trying to separate them based on the name variable. This is my desired output:

['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130']

['www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130']

['www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']

I found this answer, "Split a list of urls with similar pattern into dicts", but it doesn't produce the output I need and I can't work out how to adapt it. Any help would be appreciated.

Upvotes: 1

Views: 290

Answers (7)

Red

Reputation: 27567

Here is how you can use sorted() with a custom key:

import re

a = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130',
     'www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130',
     'www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']

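# Sort by the name segment (second-to-last part of the URL)
b = sorted(a, key=lambda c: c.split('/')[-2])

# Count the URLs in the 'name/' group; this assumes every group has the same size
d = len(re.findall('name/', ''.join(b)))

# Slice the sorted list into consecutive chunks of d URLs
e = [b[x:x+d] for x in range(0, len(b), d)]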

print(e)

Output:

[['www.example.com/name/0900', 'www.example.com/name/1000', 'www.example.com/name/1130'],
 ['www.example.com/name1/0900', 'www.example.com/name1/1000', 'www.example.com/name1/1130'],
 ['www.example.com/name2/0900', 'www.example.com/name2/1000', 'www.example.com/name2/1130']]
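If the groups are not all the same size, the fixed-width slicing above would cut them in the wrong places. Here is a minimal sketch of an alternative using itertools.groupby, assuming the name is always the second-to-last segment. The helper name_of and the shortened urls list are made up for illustration:

```python
from itertools import groupby

# Hypothetical input with groups of different sizes (3, 2 and 1 URLs)
urls = [
    'www.example.com/name/0900', 'www.example.com/name/1000',
    'www.example.com/name/1130', 'www.example.com/name1/0900',
    'www.example.com/name1/1000', 'www.example.com/name2/0900',
]


def name_of(url):
    # Second-to-last segment of the URL, e.g. 'name1'
    return url.split('/')[-2]


# groupby only merges adjacent items, so sort by the same key first
groups = [list(g) for _, g in groupby(sorted(urls, key=name_of), key=name_of)]

for group in groups:
    print(group)
```

Because sorted() is stable, the URLs inside each group keep their original relative order.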

Upvotes: 0

Evgeniy_Burdin

Reputation: 703

from collections import defaultdict


data = [
    'www.example.com/name/0900', 'www.example.com/name/1000',
    'www.example.com/name/1130', 'www.example.com/name1/0900',
    'www.example.com/name1/1000', 'www.example.com/name1/1130',
    'www.example.com/name2/0900', 'www.example.com/name2/1000',
    'www.example.com/name2/1130'
]

output = {
    'name': [
        'www.example.com/name/0900',
        'www.example.com/name/1000',
        'www.example.com/name/1130'
    ],
    'name1': [
        'www.example.com/name1/0900',
        'www.example.com/name1/1000',
        'www.example.com/name1/1130'
    ],
    'name2': [
        'www.example.com/name2/0900',
        'www.example.com/name2/1000',
        'www.example.com/name2/1130'
    ]
}

name_index = 1

result = defaultdict(list)

for url in data:
    name = url.split('/')[name_index]
    result[name].append(url)

assert output == result
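One thing to watch with a fixed name_index of 1: if the URLs carry a scheme such as https://, splitting on '/' shifts every segment, and index 1 becomes an empty string. A rough sketch of the same defaultdict idea wrapped in a function that counts from the end instead, assuming the name is always the second-to-last segment (group_urls and the with_scheme list are hypothetical, not part of the question):

```python
from collections import defaultdict


def group_urls(urls, index=-2):
    """Group URLs by the '/'-separated segment at `index` (hypothetical helper)."""
    groups = defaultdict(list)
    for url in urls:
        groups[url.split('/')[index]].append(url)
    # Return a plain dict so missing keys raise KeyError instead of being created
    return dict(groups)


# Made-up input with a scheme prefix, to show why a negative index helps
with_scheme = [
    'https://www.example.com/name/0900',
    'https://www.example.com/name1/0900',
    'https://www.example.com/name/1000',
]

grouped = group_urls(with_scheme)
print(grouped)
```

The same call works unchanged on the scheme-less URLs from the question, since -2 still points at the name there.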

Upvotes: 0

hemmelig

Reputation: 350

To generalise the positions of the variable of interest and the number, sort each group by number, and return a list of lists:

# Specify positions of variable/number of interest
i_var = 1
i_num = 2

# Split by variable of interest (per Rakesh's excellent answer)
# `urls` is the list of URLs from the question
result = {}
for url in urls:
    result.setdefault(url.split("/")[i_var], []).append(url)

# Sort each group by number
out = []
for values in result.values():
    out.append(sorted(values, key=lambda x: x.split("/")[i_num]))

Upvotes: 0

nagyl

Reputation: 1644

You could do this:

a = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
b = {}

for elem in a:
    name = elem.split("/")[1]
    try:
        b[name].append(elem)
    except KeyError:
        # First URL seen for this name: start a new list
        b[name] = [elem]

print(b)

This is an easy way to separate the URLs without knowing in advance how many distinct names there are.

Upvotes: 1

Rakesh

Reputation: 82765

This is one approach, using str.split and storing the results in a dict.

Example:

data = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
result = {}
for url in data:
    result.setdefault(url.split("/")[1], []).append(url)
print(result)

Output:

{'name': ['www.example.com/name/0900',
          'www.example.com/name/1000',
          'www.example.com/name/1130'],
 'name1': ['www.example.com/name1/0900',
           'www.example.com/name1/1000',
           'www.example.com/name1/1130'],
 'name2': ['www.example.com/name2/0900',
           'www.example.com/name2/1000',
           'www.example.com/name2/1130']}
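If separate lists are wanted rather than a dict, as in the question, the dict's values can be pulled out directly. A short sketch, using a shortened and reordered data list that is made up for illustration:

```python
# Hypothetical, shortened input; the names are deliberately interleaved
data = ['www.example.com/name/0900', 'www.example.com/name1/0900',
        'www.example.com/name/1000', 'www.example.com/name2/0900']

result = {}
for url in data:
    result.setdefault(url.split("/")[1], []).append(url)

# Dicts keep insertion order (Python 3.7+), so groups come out in first-seen order
separate_lists = list(result.values())

for group in separate_lists:
    print(group)
```

This prints one list per name, which is close to the layout asked for in the question.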

Upvotes: 3

JenilDave

Reputation: 604

You can do this by iterating over the list and filtering with a simple condition:

al = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130','www.example.com/name1/0900','www.example.com/name1/1000','www.example.com/name1/1130','www.example.com/name2/0900','www.example.com/name2/1000','www.example.com/name2/1130']
name = [name for name in al if 'name/' in name]
name1 = [name1 for name1 in al if 'name1/' in name1]
name2 = [name2 for name2 in al if 'name2/' in name2]

So when you print it you will get:

>>> print(name)
['www.example.com/name/0900', 'www.example.com/name/1000', 'www.example.com/name/1130']
>>> print(name1)
['www.example.com/name1/0900', 'www.example.com/name1/1000', 'www.example.com/name1/1130']
>>> print(name2)
['www.example.com/name2/0900', 'www.example.com/name2/1000', 'www.example.com/name2/1130']
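The substring test works for this data, but 'name/' would also match a URL such as 'www.example.com/othername/0900', and each name has to be written out by hand. A possible sketch that discovers the names first and compares the segment exactly, assuming the name is always the segment at index 1 (the othername URL is a made-up addition to show the difference):

```python
al = ['www.example.com/name/0900', 'www.example.com/name/1000',
      'www.example.com/name1/0900', 'www.example.com/name2/0900',
      'www.example.com/othername/0900']  # hypothetical extra URL

# Distinct names in first-seen order; dict.fromkeys drops duplicates
names = list(dict.fromkeys(url.split('/')[1] for url in al))

# One list per name, using an exact comparison instead of a substring check
lists = [[url for url in al if url.split('/')[1] == n] for n in names]

for n, group in zip(names, lists):
    print(n, group)
```

This rescans the whole list once per name, so for large inputs one of the dict-based answers is likely the better fit.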

Upvotes: 1

Sanad Abdullah

Reputation: 1

You need to create a variable first, for example: example = ['www.example.com/name/0900','www.example.com/name/1000','www.example.com/name/1130']

Upvotes: -1
