Mike

Reputation: 496

Item position in list

I have this dictionary:

db = {'www.baurom.ro':
          {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
          },
      'slbz2':
          {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
          }
     }

And a list:

lista=['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']

What I am doing now is this:

for x in lista:
    if x in db:
        db[x][0][lista.index(x)] += 1

In other words, I want to count how many times each site appears in the list and at which position. This runs, but for the given example it produces something like:

{0: [7, 0, 0, 0, 0, 0, 0, 0, 0, 0]

while I want it to be:

{0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

How can I achieve this? I could use a counter variable, initialise it with var=0, increment it with +=1 and use it as an artificial index, but is there a more "pythonic" way of doing it?

Upvotes: 4

Views: 177

Answers (3)

chinskiy

Reputation: 2715

If I understand your question correctly, you already have the db dictionary and are looking for the built-in enumerate function.

Your code would then look like this:

for index, element in enumerate(lista):
    if element in db:
        db[element][0][index] = 1 
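
For context, lista.index(x) always returns the position of the first occurrence of x, which is why the original loop adds every hit to slot 0. A minimal sketch comparing the two approaches, using a shortened made-up list (the names 'a.ro' and 'b.ro' are placeholders, not from the question):

```python
# Hypothetical shortened data, only for illustration
lista = ['a.ro', 'a.ro', 'b.ro', 'a.ro']

# list.index returns the FIRST matching position, whatever element we are on
first_positions = [lista.index(x) for x in lista]

# enumerate yields the real position of each element as we iterate
actual_positions = [i for i, x in enumerate(lista)]

print(first_positions)   # [0, 0, 2, 0]
print(actual_positions)  # [0, 1, 2, 3]
```

Since enumerate hands you the index directly, no separate counter variable is needed.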

Upvotes: 1

Eric Duminil

Reputation: 54233

If I understand your problem correctly, you could just iterate over lista (called urls below) and create the db entries as needed:

urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']
n = len(urls)
db = {}

for i, url in enumerate(urls):
    if url not in db:
        db[url] = {0: [0] * n} # NOTE: Use numpy for large arrays
    db[url][0][i] = 1

print(db)
# {'www.romanian-companies.eu': {0: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]}, 'www.risco.ro': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}, 'www.listafirme.ro': {0: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}}

It requires only one pass over the list. In contrast, calling lista.index(x) inside the loop rescans the list from the start on every iteration.

If you have a list of interesting urls, you could use this variant:

from collections import defaultdict

urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']

interesting_urls = {'www.baurom.ro', 'slbz2'}

n = len(urls)

def url_array():
    return {0: [0] * n, 1: [0] * n}

db = defaultdict(url_array)

for i, url in enumerate(urls):
    if url in interesting_urls:
        db[url][0][i] = 1

print(db)
# defaultdict(<function url_array at 0x7fe8a95b87d0>, {'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}) 
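
Note that 'slbz2' is missing from the output above: a defaultdict only calls its factory when a missing key is actually looked up with db[key]. A small sketch of that behaviour, with a made-up length n = 3 and a placeholder key 'a.ro':

```python
from collections import defaultdict

n = 3  # hypothetical list length, only for illustration

def url_array():
    # Factory called by defaultdict whenever a missing key is accessed
    return {0: [0] * n, 1: [0] * n}

db = defaultdict(url_array)
db['a.ro'][0][1] = 1  # the lookup db['a.ro'] creates the entry first

print('a.ro' in db)   # True
print('slbz2' in db)  # False: a membership test does not trigger the factory
print(db['a.ro'])     # {0: [0, 1, 0], 1: [0, 0, 0]}
```

If you need every interesting url to be present even when it never occurs, you could touch db[url] once for each of them before the loop.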

Upvotes: 0

Ma0

Reputation: 15204

You could do something like this:

for entry in db:
    db[entry][0] = [int(x == entry) for x in lista]
print(db)  # {'slbz2': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}

You essentially replace the list stored under key 0 of each entry with a list comprehension that compares the dictionary key to every element of lista. Each comparison yields a boolean, which int() converts to an integer (True -> 1, False -> 0).
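
To make the conversion step concrete, here is a tiny sketch with placeholder data ('a.ro' and 'b.ro' are made up for the example):

```python
# Hypothetical shortened data, only for illustration
lista = ['a.ro', 'b.ro', 'a.ro']

# Step 1: compare every element to the key we care about
flags = [x == 'a.ro' for x in lista]

# Step 2: turn each boolean into 0 or 1
ones = [int(flag) for flag in flags]

print(flags)  # [True, False, True]
print(ones)   # [1, 0, 1]
```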


If only a few of the dictionary keys actually appear in lista, you can do this instead:

for entry in set(x for x in lista if x in db):
    db[entry][0] = [int(x == entry) for x in lista]  # same body as before

This way, you loop over and edit only those keys in your dictionary that appear in lista. Also notice that you loop over a set built from the elements of lista, so duplicates are ignored (the 'www.baurom.ro' key is edited once, not as many times as it appears in lista).
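
Put together with the data from the question, the filtered variant could look like the sketch below. The name present is only introduced here for readability, and the lists are built with [0] * 10 and ['www.baurom.ro'] * 7 as shorthand for the literals in the question:

```python
db = {
    'www.baurom.ro': {0: [0] * 10, 1: [0] * 10},
    'slbz2': {0: [0] * 10, 1: [0] * 10},
}
lista = ['www.baurom.ro'] * 7 + [
    'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']

# Keys of db that occur in lista at least once, without duplicates
present = set(x for x in lista if x in db)

for entry in present:
    # 1 where the list element equals this key, 0 elsewhere
    db[entry][0] = [int(x == entry) for x in lista]

print(present)                 # {'www.baurom.ro'}
print(db['www.baurom.ro'][0])  # [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
print(db['slbz2'][0])          # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The 'slbz2' entry is never touched, so it keeps its original zeros.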

Upvotes: 0
