Aubrey

Reputation: 507

Sorting a list and adding None values in the right positions

This is somewhat related to this question.

I have two lists of URLs. The first list is:

http://example.com/1/1.jpg
http://example.com/2/2.jpg
http://example.com/3/3.jpg
...
http://example.com/45000/45000.jpg

The second list is a subset of the first one: it's made of the real URLs, the ones that are not broken links.

http://example.com/12/12.jpg
http://example.com/23/23.jpg
http://example.com/34/34.jpg
...

I would like to know how to sort it so that I get something like this:

...
None
http://example.com/12/12.jpg
None
None
...
None
http://example.com/23/23.jpg
None
...

The point is to have a sorted list with the real URLs at the right positions in the final CSV file.

I've tried reading the first list and matching each item against the second list, but I can't get either the double loop or the pattern matching to work.

I read the lists from files using open(), which means I have to deal with line breaks (this seems to be an issue).

Upvotes: 2

Views: 85

Answers (4)

ravish.hacker

Reputation: 1183

Let us say your first list (the superset) is l1 and the second (the subset) is l2.

l3 = []
for li in l1:
    if li in l2:
        l3.append(li)
    else:
        l3.append(None)

This will do. I am really not an expert in Python, so there might be better ways, but this is what I would use.

Update

As per your comment, let us say you have two files: superset.txt (with all URLs) and subset.txt (with some URLs).

superset.txt

http://example.com/1/1.jpg
http://example.com/2/2.jpg
http://example.com/12/12.jpg
http://example.com/3/3.jpg
http://example.com/23/23.jpg
http://example.com/3/3.jpg
http://example.com/34/34.jpg
http://example.com/45000/45000.jpg

subset.txt

http://example.com/12/12.jpg
http://example.com/23/23.jpg
http://example.com/34/34.jpg

The script below reads them (from the same folder) and creates the required list.

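# Strip the line breaks while reading, so that a last line without a
# trailing newline still compares equal to the same URL in the other file.
with open("superset.txt", "r") as f1:
    l1 = [line.strip() for line in f1]
with open("subset.txt", "r") as f2:
    l2 = [line.strip() for line in f2]

l3 = []
for li in l1:
    if li in l2:
        l3.append(li)
    else:
        l3.append(None)

print(l3)     # or you can save this to a file.

Result

[None, None, 'http://example.com/12/12.jpg', None, 'http://example.com/23/23.jpg', None, 'http://example.com/34/34.jpg', None]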
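
Since the question mentions a final CSV file, here is a minimal sketch of saving such a list with Python's csv module. The function name write_aligned_csv, the sample data and the two-column layout (position, URL) are assumptions made for illustration, not something taken from the posts above; None is written as an empty cell.

```python
import csv
import io

def write_aligned_csv(aligned, out):
    """Write one row per position; a None entry becomes an empty cell."""
    writer = csv.writer(out)
    for position, url in enumerate(aligned, start=1):
        writer.writerow([position, url if url is not None else ""])

# Hypothetical sample data standing in for the aligned list l3.
aligned = [None, "http://example.com/2/2.jpg", None]

# An in-memory buffer keeps the sketch self-contained.
buffer = io.StringIO()
write_aligned_csv(aligned, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write an actual file in Python 3, you could pass open("aligned.csv", "w", newline="") instead of the buffer; the file name is only an example.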

Upvotes: 1

farhawa

Reputation: 10408

This should work:

list1 = ['http://example.com/1/1.jpg','http://example.com/2/2.jpg','http://example.com/3/3.jpg']
list2 = ['http://example.com/5/11.jpg','http://example.com/20/20.jpg','http://example.com/9/9.jpg','http://example.com/12/12.jpg']


# The first all-digit path segment of each URL is taken as its 1-based position.
def position(url):
    return [int(s) for s in url.split('/') if s.isdigit()][0]

# The final list must be long enough to hold the largest position.
length_of_list = max(position(url) for url in list1 + list2)

final_list = [None] * length_of_list

for url in list1 + list2:
    final_list[position(url) - 1] = url

for x in final_list:
    print(x)

>> 
http://example.com/1/1.jpg
http://example.com/2/2.jpg
http://example.com/3/3.jpg
None
http://example.com/5/11.jpg
None
None
None
http://example.com/9/9.jpg
None
None
http://example.com/12/12.jpg
None
None
None
None
None
None
None
http://example.com/20/20.jpg
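
If only the real URLs should appear and the total number of positions is known in advance, the same idea can be sketched with a dictionary keyed by position. This is a rough sketch under two assumptions: the first all-digit path segment of a URL is its position, and the total is supplied by the caller. The names position, align_by_position and total are made up for this example.

```python
def position(url):
    """Return the first all-digit path segment as an int (assumed to be the position)."""
    return next(int(part) for part in url.split("/") if part.isdigit())

def align_by_position(real_urls, total):
    """Place each real URL at its position; every other slot stays None."""
    by_position = {position(url): url for url in real_urls}
    return [by_position.get(n) for n in range(1, total + 1)]

# Hypothetical sample data.
real = ["http://example.com/2/2.jpg", "http://example.com/5/5.jpg"]
aligned = align_by_position(real, total=6)

for entry in aligned:
    print(entry)
```

With this approach the full list of candidate URLs is not needed at all, only the real ones and the highest position.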

Upvotes: 0

Bhargav Rao

Reputation: 52111

You can use a simple list comprehension with a conditional expression, like this:

>>> orig = ['http://example.com/1/1.jpg','http://example.com/2/2.jpg','http://example.com/3/3.jpg']
>>> real = ['http://example.com/1/1.jpg']
>>> [i if i in real else None for i in orig]
['http://example.com/1/1.jpg', None, None]

It would be better to store the real list in a set, since a membership test on a set takes constant time on average, whereas on a list it may scan every element. In that case, the code would be

>>> orig = ['http://example.com/1/1.jpg','http://example.com/2/2.jpg','http://example.com/3/3.jpg']
>>> real = ['http://example.com/1/1.jpg']
>>> real_set = set(real)
>>> [i if i in real_set else None for i in orig]
['http://example.com/1/1.jpg', None, None]

Thanks to mata and Cuadue for the second version using sets.
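
Because the question reads both lists from files, the set version can be combined with stripping line breaks. The following is only a sketch: the function name align and the sample contents are assumptions, and io.StringIO stands in for files opened with open().

```python
import io

def align(all_lines, real_lines):
    """Return one entry per line of all_lines: the URL if it is real, else None."""
    # Strip line breaks on both sides so the comparison is between bare URLs.
    real_set = {line.strip() for line in real_lines}
    return [url if url in real_set else None
            for url in (line.strip() for line in all_lines)]

# Made-up file contents; note the second "file" has no trailing line break.
all_file = io.StringIO(
    "http://example.com/1/1.jpg\n"
    "http://example.com/2/2.jpg\n"
    "http://example.com/3/3.jpg\n"
)
real_file = io.StringIO("http://example.com/2/2.jpg")

result = align(all_file, real_file)
print(result)
```

Any iterable of lines works here, so the two StringIO objects could be replaced by file objects without changing the function.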

Upvotes: 5

derricw

Reputation: 7036

output_list = [i if i in list2 else None for i in list1]

Upvotes: 2
