pjdavis

Reputation: 355

Webscraping with BeautifulSoup in Python

from urllib.request import urlopen
import bs4

resp = urlopen('http://international.o2.co.uk/internationaltariffs/getintlcallcosts?countryId=IND').read()
crawler = bs4.BeautifulSoup(resp, 'html.parser')
div = crawler.find('div', {"id": "standardRates"})
div


The code above returns all the tags/elements shown in the screenshot. I want to get the "£2.00" value, but when I call .find('td') on the result as follows:

div = crawler.find('div', {"id": "standardRates"}).find('td')

it returns only the Landline cell and not the one below it, even though both are td tags. I have very little experience with web scraping. How can I target the cell containing £2.00?

Upvotes: 0

Views: 149

Answers (1)

Bill Bell

Reputation: 21663

You can use this approach to reach £2.00 fairly directly by way of its previous sibling, the Landline cell.

First find the desired table, then find the string Landline inside it. Get the parent of that string, which is the td element holding it, and then get that td's next sibling, which is the td containing £2.00.

>>> import requests
>>> get = requests.get('http://international.o2.co.uk/internationaltariffs/getintlcallcosts?countryId=IND')
>>> page = get.text
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(page,'lxml')
>>> Landline_td = soup.find('table', {'id': 'standardRatesTable'}).find_all(string='Landline')[0]
>>> Landline_td
'Landline'
>>> Landline_td.find_parent().find_next_sibling()
<td>£2.00</td>
>>> Landline_td.find_parent().find_next_sibling().text
'£2.00'
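
As a rough offline sketch of the same idea, the snippet below runs against a small hand-written HTML fragment instead of the live page. SAMPLE_HTML is a hypothetical stand-in and the real markup may well differ; only the standardRates and standardRatesTable ids and the Landline / £2.00 cells are taken from the posts above. It also shows why find('td') gave back only Landline: find() stops at the first match, while find_all() returns every match.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the relevant part of the page (assumed structure).
SAMPLE_HTML = """
<div id="standardRates">
  <table id="standardRatesTable">
    <tr><td>Landline</td><td>£2.00</td></tr>
  </table>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", {"id": "standardRatesTable"})

# find() returns only the first matching tag, here the Landline cell.
first_cell = table.find("td")

# find_all() returns every matching tag, so the rate is the second cell.
cells = table.find_all("td")

# Alternatively, locate the label cell by its text and step to the next td.
label = table.find("td", string="Landline")
rate = label.find_next_sibling("td").get_text(strip=True)
print(rate)
```

Indexing into find_all() is shorter, but it depends on the cell order staying the same; anchoring on the Landline label should hold up better if more rows or columns are added.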

Upvotes: 1
