Reputation: 496
I have this dictionary:
db = {
    'www.baurom.ro': {
        0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    },
    'slbz2': {
        0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
        1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    },
}
And a list:
lista=['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']
What I am doing now is this:
for x in lista:
    if x in db:
        db[x][0][lista.index(x)] += 1
In other words, I want to count how many times each site appears in the list and at which position. This works, but in the given example it returns something like:
{0: [7, 0, 0, 0, 0, 0, 0, 0, 0, 0]
while I would want it to be:
{0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
How can I achieve this? I could use a variable, initialize it with var=0, increment it with +=1 and use it as an artificial index, but is there a more "pythonic" way of doing it?
Upvotes: 4
Views: 177
Reputation: 2715
If I understand your question correctly, you already have the db dictionary and you are looking for the built-in enumerate function.
Your code would then look like this:
for index, element in enumerate(lista):
    if element in db:
        db[element][0][index] = 1
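To see why lista.index(x) piles everything into slot 0 while enumerate spreads the counts out, here is a small self-contained sketch. The shortened lista and the names by_index and by_enumerate are made up for the illustration; they are not from the question.

```python
# Shortened sample data (an assumption for this sketch, not the asker's full list).
lista = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.risco.ro']

# list.index() always returns the position of the FIRST match,
# so every occurrence of a value is counted in the same slot.
by_index = [0] * len(lista)
for x in lista:
    if x == 'www.baurom.ro':
        by_index[lista.index(x)] += 1

# enumerate() yields the real position of each element as it is visited.
by_enumerate = [0] * len(lista)
for i, x in enumerate(lista):
    if x == 'www.baurom.ro':
        by_enumerate[i] += 1

print(by_index)      # [3, 0, 0, 0]
print(by_enumerate)  # [1, 1, 1, 0]
```

As a side effect, enumerate also avoids the repeated linear search that lista.index(x) performs on every iteration.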
Upvotes: 1
Reputation: 54233
If I understand your problem correctly, you could just iterate over lista and create db as needed:
urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']
n = len(urls)
db = {}
for i, url in enumerate(urls):
    if not db.get(url):
        db[url] = {0: [0] * n}  # NOTE: Use numpy for large arrays
    db[url][0][i] = 1
print(db)
# {'www.romanian-companies.eu': {0: [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]}, 'www.risco.ro': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]}, 'www.listafirme.ro': {0: [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}}
It requires only one pass over lista, so it should be really fast.
If you have a list of interesting urls, you could use this variant:
from collections import defaultdict

urls = ['www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.baurom.ro', 'www.listafirme.ro', 'www.romanian-companies.eu', 'www.risco.ro']
interesting_urls = {'www.baurom.ro', 'slbz2'}
n = len(urls)

def url_array():
    # Fresh pair of zero-filled lists for each newly seen URL
    return {0: [0] * n, 1: [0] * n}

db = defaultdict(url_array)
for i, url in enumerate(urls):
    if url in interesting_urls:
        db[url][0][i] = 1
print(db)
# defaultdict(<function url_array at 0x7fe8a95b87d0>, {'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}})
Upvotes: 0
Reputation: 15204
You could do something like this:
for entry in db:
    db[entry][0] = [int(x == entry) for x in lista]
print(db)  # {'slbz2': {0: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}, 'www.baurom.ro': {0: [1, 1, 1, 1, 1, 1, 1, 0, 0, 0], 1: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}
You essentially replace your dictionary values with a list comprehension that compares the dictionary entry to each lista entry. The boolean result of each comparison is then converted to an integer (True -> 1, False -> 0).
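As a quick check of the conversion described above, here is a minimal sketch with a made-up three-element list (sample and flags are names chosen only for this example):

```python
sample = ['a', 'b', 'a']

# Each comparison produces a bool; int() maps True to 1 and False to 0.
flags = [int(x == 'a') for x in sample]
print(flags)  # [1, 0, 1]
```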
If the items in lista are very limited in comparison to the dictionary keys, you can do this instead:
for entry in set(x for x in lista if x in db):
    db[entry][0] = [int(x == entry) for x in lista]  # rest stays the same
This way, you loop over and edit only those keys in your dictionary that appear in lista. Also notice that you loop over a set constructed from the elements of lista to ignore its duplicates (the 'www.baurom.ro' key is edited once, not as many times as it appears in lista).
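Putting this variant together, a runnable sketch might look like the following. The data is shortened to four positions for readability, and keys_to_update is a name introduced only for this example.

```python
# Shortened stand-ins for the asker's db and lista (assumed data for this sketch).
db = {
    'www.baurom.ro': {0: [0] * 4, 1: [0] * 4},
    'slbz2': {0: [0] * 4, 1: [0] * 4},
}
lista = ['www.baurom.ro', 'www.risco.ro', 'www.baurom.ro', 'www.listafirme.ro']

# Only keys that actually occur in lista are visited; the set removes duplicates.
keys_to_update = set(x for x in lista if x in db)
for entry in keys_to_update:
    db[entry][0] = [int(x == entry) for x in lista]

print(keys_to_update)          # {'www.baurom.ro'}
print(db['www.baurom.ro'][0])  # [1, 0, 1, 0]
print(db['slbz2'][0])          # [0, 0, 0, 0]
```

Keys such as 'slbz2' that never occur in lista are left untouched, which is fine here because their lists already contain only zeros.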
Upvotes: 0