Reputation: 1430
from bs4 import BeautifulSoup
import urllib2
url = "en.wikipedia.org/wiki/ISO_3166-1"
r = urllib2.urlopen("http://" + url)
soup = BeautifulSoup(r)
#tables = soup.findAll("table")
#i want to fetch data of india and store in a variable
t = soup.find("table")
for t1 in t.find_all('tr'):
    #for cell in t1.find_all('td'):
    cell = t1.find_all('td')
    shortname = cell[0].string
    alpha2 = cell[1].a.string
    #print cell.find_all(text=True)
    print shortname
#cells = t.find_all('td',text="India")
#rn = cells[0].get_text()
#print cells
#soup.find_all('a')
#title = soup.a
#title
The commented-out lines show the different things I tried before I got any data. The Wikipedia table lists each country's name together with its ISO codes. I want to fetch the codes for a country based on user input.
Upvotes: 0
Views: 232
Reputation: 4912
With HTMLParser you can extract any text you want from an HTML page. Here is one way to answer your question:
from HTMLParser import HTMLParser
import requests
import re

class MyHTMLParser(HTMLParser):
    data = []

    def handle_data(self, data):
        # Keep only text nodes that contain a letter, a hyphen or a colon
        if re.findall('[a-zA-Z-:]', data):
            self.data.append(data)

if __name__ == '__main__':
    url = 'http://en.wikipedia.org/wiki/ISO_3166-1'
    rsp = requests.get(url)
    p = MyHTMLParser()
    p.feed(rsp.text)
    # Keep only the table's text, from the first country to the last ISO 3166-2 link
    s = p.data[p.data.index('Afghanistan'):p.data.index('ISO 3166-2:ZW')+1]
    name = raw_input('please input country name: ')
    # The wanted code sits three text items after the country name
    print s[s.index(name)+3]
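The parser above depends on fixed offsets in a flat list of text, which breaks as soon as the page layout changes. As a rough sketch of an alternative, the parser can track rows and cells instead. This version is written for Python 3 (`html.parser`) and runs against a small embedded table rather than the live page. `SAMPLE_HTML`, `TableRowParser` and `codes_for` are names made up for this illustration, and the sample markup is a simplified stand-in for the real Wikipedia table, which has more columns and markup.

```python
from html.parser import HTMLParser

# Simplified stand-in for the Wikipedia table (assumption: the real page differs).
SAMPLE_HTML = """
<table class="wikitable">
<tr><th>English short name</th><th>Alpha-2 code</th><th>Alpha-3 code</th><th>Numeric code</th></tr>
<tr><td><a href="/wiki/Afghanistan">Afghanistan</a></td><td>AF</td><td>AFG</td><td>004</td></tr>
<tr><td><a href="/wiki/India">India</a></td><td>IN</td><td>IND</td><td>356</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # instance attribute, so parsers do not share state
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text inside a <td> is recorded; header text is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:   # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None


def codes_for(html, country):
    """Return (alpha2, alpha3, numeric) for the named country, or None."""
    parser = TableRowParser()
    parser.feed(html)
    wanted = country.strip().lower()
    for row in parser.rows:
        if len(row) >= 4 and row[0].lower() == wanted:
            return tuple(row[1:4])
    return None


print(codes_for(SAMPLE_HTML, "India"))
```

To try it on the live page, the HTML string passed to `codes_for` could come from `requests.get(url).text`, though the column positions would need checking against the current table.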
Upvotes: 0
Reputation: 281
This prompts the user for the country they want to look up and returns its three-digit numeric code. If the country you enter cannot be found, it returns None.
import requests
from bs4 import BeautifulSoup

session = requests.session()

def fetchCode(country):
    page = session.get('http://en.wikipedia.org/wiki/ISO_3166-1')
    # Use the first table with class "wikitable" on the page
    soup = BeautifulSoup(page.text).find('table', {'class': 'wikitable'})
    tablerows = soup.findAll('tr')
    for tr in tablerows:
        td = tr.findAll('td')
        # Header rows have no <td> cells, so skip them
        if td:
            # Case-insensitive match on the country name in the first column
            if td[0].text.lower() == country.lower():
                return td[3].text

print fetchCode(raw_input('Enter Country Name:'))
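The matching step in `fetchCode` can be tried out without a network call by separating it from the download. The following is only a sketch in Python 3, assuming the rows have already been extracted into lists of cell strings, for example with something like `[cell.text for cell in tr.findAll('td')]`. `find_code` and the sample rows are hypothetical, and the column order (name, alpha-2, alpha-3, numeric, ISO 3166-2 link) is an assumption about the table layout.

```python
def find_code(rows, country, column=3):
    """Return the cell at `column` for the row whose first cell matches `country`."""
    wanted = country.strip().lower()
    for cells in rows:
        # Skip header or malformed rows that are too short to index.
        if len(cells) <= column:
            continue
        # Strip whitespace so stray newlines in the cell text do not break the match.
        if cells[0].strip().lower() == wanted:
            return cells[column].strip()
    return None


# Hypothetical sample rows standing in for the scraped table.
rows = [
    [],  # header row: no <td> cells
    ["Afghanistan", "AF", "AFG", "004", "ISO 3166-2:AF"],
    ["India\n", "IN", "IND", "356", "ISO 3166-2:IN"],
]

print(find_code(rows, "india"))            # numeric code
print(find_code(rows, "India", column=1))  # alpha-2 code
print(find_code(rows, "Atlantis"))         # no match, so None
```

Passing the column index as a parameter lets the same function return the alpha-2 or alpha-3 code instead of the numeric one.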
Upvotes: 1