Reputation: 11
I want to extract the data from the table at 'https://statisticstimes.com/demographics/india/indian-states-population.php' and put it in a list or a dictionary.
I am a beginner in Python. This is as far as I have gotten with what I have learned so far:
import urllib.request, urllib.error, urllib.parse
from bs4 import BeautifulSoup
url = input("Enter url: ")
html = urllib.request.urlopen(url).read()
x = BeautifulSoup(html, 'html.parser')
tags = x('tr')
lst = list()
for tag in tags:
    lst.append(tag.findAll('td'))
print(lst)
Upvotes: 1
Views: 39
Reputation: 20052
You can use requests and pandas, with tabulate to pretty-print the result.
Here's how:
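import pandas as pd
import requests
from io import StringIO
from tabulate import tabulate
url = "https://statisticstimes.com/demographics/india/indian-states-population.php"
html = requests.get(url).text
# Wrap the HTML in StringIO: recent pandas versions deprecate passing a raw HTML string.
# read_html returns a list of DataFrames; [-1] picks the last table on the page.
df = pd.read_html(StringIO(html), flavor="bs4")[-1]
print(tabulate(df.head(10), showindex=False))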
Output:
---  ----------------  --------  --------  -------  -----  ----  --------------------  ---
NCT  Delhi             18710922  16787941  1922981  11.45  1.36  Malawi                 63
18   Haryana           28204692  25351462  2853230  11.25  2.06  Venezuela              51
14   Kerala            35699443  33406061  2293382   6.87  2.6   Morocco                41
20   Himachal Pradesh   7451955   6864602   587353   8.56  0.54  China, Hong Kong SAR  104
16   Punjab            30141373  27743338  2398035   8.64  2.2   Mozambique             48
12   Telangana         39362732  35004000  4358732  12.45  2.87  Iraq                   36
25   Goa                1586250   1458545   127705   8.76  0.12  Bahrain               153
19   Uttarakhand       11250858  10086292  1164566  11.55  0.82  Haiti                  84
UT3  Chandigarh         1158473   1055450   103023   9.76  0.08  Eswatini              159
9    Gujarat           63872399  60439692  3432707   5.68  4.66  France                 23
---  ----------------  --------  --------  -------  -----  ----  --------------------  ---
To dump the table to a .csv file, use:
df.to_csv("your_table.csv", index=False)
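Since the question asks for a list or a dictionary, here is a minimal sketch of converting a DataFrame into plain Python structures. To keep it runnable without a network call, it builds a small stand-in DataFrame from three rows of the output above. The column names "state" and "population" are my own assumptions for illustration, so check df.columns for the names the real table uses.

```python
import pandas as pd

# Stand-in for the scraped DataFrame: three rows copied from the output above.
# The column names "state" and "population" are assumed for illustration only.
df = pd.DataFrame(
    {
        "state": ["Delhi", "Haryana", "Kerala"],
        "population": [18710922, 28204692, 35699443],
    }
)

# List of rows, each row a plain Python list.
rows = df.values.tolist()

# List of dictionaries, one per row, keyed by column name.
records = df.to_dict(orient="records")

# Single dictionary mapping one column to another.
population_by_state = dict(zip(df["state"], df["population"]))

print(rows[0])
print(records[0])
print(population_by_state["Kerala"])
```

The same three calls should work on the df returned by read_html, once the assumed column names are swapped for the real ones.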
Upvotes: 1