sy1vi3

Reputation: 181

KeyError with bs4: return self.attrs[key]

I have some code that was working, but recently started giving me an error. The problematic section of code looks like this:

if new_hash != old_hash:
    print(new_hash)
    print(old_hash)
    # Finds content of the most recent post on the list
    content = BeautifulSoup(vf_html.find('description').findNext('description').find(text=lambda t: isinstance(t, CData)), 'html.parser')
    for img in content.select('img'):
        img.replace_with(img['alt'])
    content = content.text
    new_content_hash = hashlib.md5(str(content).encode('utf-8')).hexdigest()
    toSend = (content[:1000] + '') if len(content) > 75 else content
    # Finds author of the most recent post on the list
    author = vf_xml.find('creator').get_text(strip=True)
    author = author.split()[0]
    author = author[1:]

This was working fine, but a few hours ago it started throwing this error:

Traceback (most recent call last):
  File "C:\Users\Taran Mayer\Desktop\CodyBot\scrape.py", line 160, in <module>
    scrape()
  File "C:\Users\Taran Mayer\Desktop\CodyBot\scrape.py", line 83, in scrape
    img.replace_with(img['alt'])
  File "C:\Python38\lib\site-packages\bs4\element.py", line 1401, in __getitem__
    return self.attrs[key]
KeyError: 'alt'

I don't think I changed anything, and reverting to an earlier, working version of the code didn't help: the error persisted. Can anybody help me find what I'm doing wrong? If I comment out the lines

for img in content.select('img'):
    img.replace_with(img['alt'])

the program works, but doesn't do what I want it to.

Upvotes: 1

Views: 1981

Answers (1)

Andrej Kesely

Reputation: 195528

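It seems that some of the images you are passing to .replace_with() don't have an alt= attribute, so img['alt'] raises KeyError. That would also explain why reverting the code didn't help: the page content changed, not your script.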
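
To see the failure in isolation, here is a minimal sketch. The HTML string is made up for illustration and is not taken from the feed in the question; it only assumes that one image lacks alt=.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second <img> has no alt attribute.
html = '<p><img src="a.png" alt="smile"><img src="b.png"></p>'
soup = BeautifulSoup(html, "html.parser")

first, second = soup.select("img")

print(first["alt"])       # subscript access works when the attribute exists
print(second.get("alt"))  # .get() returns None instead of raising

# Subscript access on a missing attribute raises KeyError,
# which is the error shown in the traceback.
try:
    second["alt"]
except KeyError as exc:
    missing_key = exc.args[0]
    print("KeyError:", missing_key)
```

Subscript access on a tag is a plain dictionary lookup on its attrs, so a missing attribute fails the same way a missing dict key does.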

You can resolve it with:

for img in content.select('img'):
    img.replace_with(img.attrs.get('alt', ''))

This replaces every image, including those missing the alt=... attribute, which are replaced with an empty string.


Or:

for img in content.select('img[alt]'):
    img.replace_with(img['alt'])

This replaces only the images that have an alt=... attribute; images without one are left in the tree.
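
As a rough comparison of the two options, the sketch below runs both on the same made-up snippet (the HTML and the variable names soup_all, soup_alt, text_all, text_alt and remaining are illustrative assumptions, not part of the original code).

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first <img> carries an alt attribute.
html = '<p>Hi <img src="a.png" alt=":)"> there <img src="b.png">!</p>'

# Option 1: replace every image, falling back to an empty string.
soup_all = BeautifulSoup(html, "html.parser")
for img in soup_all.select("img"):
    img.replace_with(img.attrs.get("alt", ""))
text_all = soup_all.get_text()

# Option 2: replace only images that have alt; the rest stay in the tree.
soup_alt = BeautifulSoup(html, "html.parser")
for img in soup_alt.select("img[alt]"):
    img.replace_with(img["alt"])
text_alt = soup_alt.get_text()
remaining = len(soup_alt.select("img"))

print(repr(text_all))
print(repr(text_alt))
print(remaining)
```

Since an img tag contains no text of its own, both options should yield the same .text here; the difference only matters if you later inspect or serialize the tree, where the second option still contains the untouched images.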

Upvotes: 2
