grindelwaldus

Reputation: 35

TypeError: unhashable type: 'list' from attempt to create dict

I'm receiving the following error in my program:

Traceback (most recent call last):
  File "bookmarks.py", line 26, in <module>
    zipping = dict(zip(datelist, matchhref))
TypeError: unhashable type: 'list'

I want to make a dictionary from two lists (datelist and matchhref), but somehow when I use zip(), it returns a list instead of a tuple.

Here's my code:

import re

bm_raw = open('bookmarks.txt', 'r')

bm_line = bm_raw.read()

matchhref = re.findall('(<DT><A HREF=".*?</A>)', bm_line)
massive = list()
datelist = list()
a = 0

for i in matchhref:

    temp = matchhref[a]
    found = re.findall('(\d\d\d\d\d\d\d\d\d\d)', temp)
    datelist.append(found)
    a=a+1

print datelist
print matchhref
zipping = dict(zip(datelist, matchhref))

And here are the contents of bookmarks.txt:

 <DT><A HREF="some random data" ADD_DATE="1460617925" ICON="some random data">priomap</A>
 <DT><A HREF="some random data" ADD_DATE="1455024833" ICON="some random data">V.34</A>

Upvotes: 0

Views: 1425

Answers (2)

Padraic Cunningham

Reputation: 180391

As I commented, you can call re.search and then .group() to append the matched string rather than the list that findall returns, so the string can be used as the key. But BeautifulSoup will make your life a lot easier:

In [50]: from bs4 import BeautifulSoup, Tag

In [51]: soup = BeautifulSoup(h,"xml")

In [52]: print(dict((dt["ADD_DATE"], dt["HREF"],) for dt in soup.select("DT A[HREF]")))
{u'1455024833': u'some random data', u'1460617925': u'some random data'}

select("DT A[HREF]") finds all the anchor tags, i.e. A tags inside a DT tag, that have an HREF attribute.

The regex solution would be:

    found = re.search(r'(\d\d\d\d\d\d\d\d\d\d)', temp)
    datelist.append(found.group())
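
Put together, the regex fix might look like the sketch below. It is only an illustration: the sample text is copied from the question, the \d{10} pattern is a shorthand for the ten \d's, and the None check and the direct dict assignment (instead of a separate datelist) are my own additions so that an entry without a date cannot shift keys and values out of step.

```python
import re

# Two sample lines copied from the question's bookmarks.txt.
bm_text = (
    '<DT><A HREF="some random data" ADD_DATE="1460617925" ICON="some random data">priomap</A>\n'
    '<DT><A HREF="some random data" ADD_DATE="1455024833" ICON="some random data">V.34</A>\n'
)

matchhref = re.findall(r'(<DT><A HREF=".*?</A>)', bm_text)

zipping = {}
for entry in matchhref:
    # re.search returns a match object (or None), not a list like re.findall.
    found = re.search(r'\d{10}', entry)
    if found is not None:
        # found.group() is a plain string, which is hashable and can be a dict key.
        zipping[found.group()] = entry

print(sorted(zipping))
```

Each key is now a string such as '1460617925', so the TypeError from the question no longer occurs.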

But use an HTML parser like bs4 or something similar.
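
If installing bs4 is not an option, "something similar" could be the standard library's html.parser. This is a rough sketch, not the approach shown above: it assumes Python 3, and the BookmarkParser class name is made up for illustration. Note that HTMLParser lowercases tag and attribute names, so ADD_DATE arrives as add_date.

```python
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Hypothetical helper: collects ADD_DATE -> HREF for every <A> tag."""

    def __init__(self):
        super().__init__()
        self.bookmarks = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        attr_map = dict(attrs)  # attribute names are already lowercased
        if 'add_date' in attr_map and 'href' in attr_map:
            self.bookmarks[attr_map['add_date']] = attr_map['href']


# Sample lines copied from the question's bookmarks.txt.
sample = (
    '<DT><A HREF="some random data" ADD_DATE="1460617925" ICON="some random data">priomap</A>\n'
    '<DT><A HREF="some random data" ADD_DATE="1455024833" ICON="some random data">V.34</A>\n'
)

parser = BookmarkParser()
parser.feed(sample)
parser.close()
bookmarks = parser.bookmarks
print(bookmarks)
```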

Upvotes: 1

szym

Reputation: 5846

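zip returns a list of tuples, not a tuple (that is the Python 2 behaviour, which your print statements suggest you are using; in Python 3 it returns an iterator of tuples).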

Besides, a tuple is only hashable if each of its elements is hashable. So a tuple of lists will not be hashable either.
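
A small illustration of both points, using values taken from the question's sample data (the variable names are made up for the example):

```python
dates = ['1460617925', '1455024833']
names = ['priomap', 'V.34']

# list() is required on Python 3, where zip is lazy; it is harmless on Python 2.
pairs = list(zip(dates, names))
print(pairs)

# A tuple of strings hashes fine.
hash(('1460617925', 'priomap'))

# A tuple containing a list does not.
try:
    hash((['1460617925'], 'priomap'))
    tuple_of_list_hashable = True
except TypeError:
    tuple_of_list_hashable = False

print(tuple_of_list_hashable)
```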

That said, there's nothing wrong with dict(zip(keys, values)) if keys is a list of hashable elements. Your problem is that datelist contains lists (results of re.findall) which are not hashable and cannot be used as dict keys.
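
To see the difference in isolation, here is a minimal sketch. The entries list is a trimmed-down stand-in for your matchhref, and \d{10} is shorthand for your ten \d's:

```python
import re

# Stand-in for matchhref, shortened for readability.
entries = ['ADD_DATE="1460617925"', 'ADD_DATE="1455024833"']

# re.findall returns a list per entry, so the keys are lists.
list_keys = [re.findall(r'\d{10}', entry) for entry in entries]

try:
    dict(zip(list_keys, entries))
    failed = False
except TypeError:
    # Same failure as in the question: a list cannot be a dict key.
    failed = True

# Taking the first match out of each list gives plain strings, which are hashable.
string_keys = [found[0] for found in list_keys]
mapping = dict(zip(string_keys, entries))
print(mapping)
```

Indexing with found[0] assumes every entry contains a date; an entry without one would raise an IndexError.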

But really, read the advice given by others and don't use re to parse HTML. BeautifulSoup is my preferred tool.

Upvotes: 1
