InFlames82

Reputation: 513

How to use "select" with Beautiful Soup 4.7.1?

I used this code with Beautiful Soup 4.6. Since upgrading to version 4.7.1, it raises an error.

Can someone show me how to use "select" in the new version?

import json
from urllib.request import urlopen
from bs4 import BeautifulSoup

url = 'http://www.nordhessen-wetter.de'
u = urlopen(url)
soup = BeautifulSoup(u, 'html.parser')

lufttemperatur = soup.select('td:nth-of-type(10)')[0].text

This is the error message:

Traceback (most recent call last):
  File "main.py", line 9, in <module>
    lufttemperatur = soup.select('td:nth-of-type(10)')[0].text
IndexError: list index out of range

live version of this code on repl.it

Upvotes: 3

Views: 5120

Answers (2)

JoePythonKing

Reputation: 1210

lufttemperatur = soup.select('td:nth-of-type(10)')[0]

I think the select call returns an empty list, so indexing it with [0] raises the IndexError.

'td:nth-of-type(10)' means, I think, 'select every td that is the tenth td among the children of its parent'. The parent of a td is a tr, and there are only 4 td elements in each tr, so nothing matches.

Does soup.select('td')[0] give you what you want?
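
To check the point about nth-of-type counting within each parent, here is a minimal sketch on a small inline table. The HTML is a made-up stand-in, not the real page, and it assumes Beautiful Soup 4.7 or later, where select is handled by the Soup Sieve library.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: every row has only 4 cells.
html = """
<table>
  <tr><td>a1</td><td>a2</td><td>a3</td><td>a4</td></tr>
  <tr><td>b1</td><td>b2</td><td>b3</td><td>b4</td></tr>
  <tr><td>c1</td><td>c2</td><td>c3</td><td>c4</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# No row has a tenth <td>, so this matches nothing.
tenth_in_row = soup.select("td:nth-of-type(10)")

# The second <td> of every row matches, one hit per row.
second_in_row = [td.text for td in soup.select("td:nth-of-type(2)")]

# The tenth <td> in document order has to be picked by list index instead.
tenth_overall = soup.select("td")[9].text

print(len(tenth_in_row), second_in_row, tenth_overall)
```

If the old code really relied on "the tenth td on the page", indexing the full list as in the last line is the closest equivalent, although it breaks as soon as the page gains or loses a cell.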

Upvotes: 0

jpw

Reputation: 44901

Based on your variable name, I assume you want to extract the "Lufttemperatur in C" / "Aktuell" value.

If you look at your error, you can see that the list index [0] is out of range, which means the selector td:nth-of-type(10) matched nothing. This might be because of a change in how BeautifulSoup handles CSS selectors in version 4.7, or it might be due to a change in the page.

Anyhow, you can get the value you are looking for by changing the code a little bit. Instead of looking for the 10th TD, look for the TDs under the 4th TR and you'll get a list with the TDs for the Lufttemperatur row:

lufttemperatur = soup.select("tr:nth-of-type(4) > td") # array of TDs

or

lufttemperatur = soup.select("tr:nth-of-type(4) > td")[1] # Aktuell value for Lufttemp.
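
As a rough sketch of the row-based selector with a guard against the IndexError from the question: the table below is invented for illustration (the real page layout is an assumption), and cell_text is a hypothetical helper, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical table; the real page may be laid out differently.
html = """
<table>
  <tr><th>Messwert</th><th>Aktuell</th></tr>
  <tr><td>Luftdruck</td><td>1013</td></tr>
  <tr><td>Luftfeuchte</td><td>80</td></tr>
  <tr><td>Lufttemperatur in C</td><td>12.3</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")


def cell_text(soup, selector, index):
    """Return the stripped text of the index-th match, or None if there is none."""
    cells = soup.select(selector)
    if index >= len(cells):
        return None  # selector matched too few elements; avoid IndexError
    return cells[index].get_text(strip=True)


# Fourth <tr> among its siblings, then its direct <td> children.
lufttemperatur = cell_text(soup, "tr:nth-of-type(4) > td", 1)

# A row that does not exist yields None instead of raising.
missing = cell_text(soup, "tr:nth-of-type(9) > td", 1)

print(lufttemperatur, missing)
```

Checking the length before indexing means a later change to the page shows up as a None value rather than a traceback.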

Upvotes: 2
