Reputation: 154
I'm trying to scrape tabular content from this webpage and write it to a CSV file using pandas.read_html(). The page contains two tables that match the same selector, table.table--overflow[aria-label^='Financials'], and I want to grab both of them. My current implementation prints the content of both tables but writes only the last one to the CSV file.
import requests
import pandas as pd
from bs4 import BeautifulSoup

link = 'https://www.marketwatch.com/investing/stock/mbin/financials/balance-sheet'

def get_tabular_content(s,link):
    res = s.get(link)
    soup = BeautifulSoup(res.text,"lxml")
    for selector in soup.select("table.table--overflow[aria-label^='Financials']"):
        df = pd.read_html(str(selector))[0]
        df.to_csv('marketwatch.csv', header=True, index=False)
        print(df)

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    get_tabular_content(s,link)
How can I add content from multiple tables to a csv file using pandas.read_html()?
Upvotes: 0
Views: 92
Reputation: 23815
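Stop overwriting the output file: every pass through the loop calls to_csv() with the same file name, so only the last table survives. Use a unique name for each table instead. That gives you multiple output files, each one representing one HTML table from the page.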
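The one-file-per-table option could look like the sketch below. It assumes the tables have already been parsed into DataFrames. The helper name write_tables_separately, its stem and out_dir arguments, the numbered file-name pattern, and the demo values are all illustrative and not part of the question's code.

```python
from pathlib import Path

import pandas as pd


def write_tables_separately(frames, stem, out_dir="."):
    """Write each DataFrame to its own CSV: <stem>_1.csv, <stem>_2.csv, ..."""
    paths = []
    for number, df in enumerate(frames, start=1):
        # A distinct name per table, so no file is overwritten.
        path = Path(out_dir) / f"{stem}_{number}.csv"
        df.to_csv(path, header=True, index=False)
        paths.append(path)
    return paths


# Stand-ins for the two scraped tables (made-up values).
demo_frames = [
    pd.DataFrame({"Item": ["A"], "2019": ["1M"], "2020": ["2M"]}),
    pd.DataFrame({"Item": ["B"], "2019": ["3M"], "2020": ["4M"]}),
]
```

In the question's loop, the equivalent change would be to build the file name from a loop counter rather than passing the fixed 'marketwatch.csv' each time.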
If you want a single CSV that contains the data from all of the HTML tables, append each df to a list_of_df inside the loop and, once the loop is done, call frame = pd.concat(list_of_df, axis=0, ignore_index=True) and write frame to the file:
list_of_df = []
for selector in soup.select("table.table--overflow[aria-label^='Financials']"):
    df = pd.read_html(str(selector))[0]
    list_of_df.append(df)
frame = pd.concat(list_of_df, axis=0, ignore_index=True)
frame.to_csv('marketwatch.csv', header=True, index=False)
The output ('marketwatch.csv') - 75 records
Item Item,2016,2017,2018,2019,2020,5-year trend
Total Cash & Due from Banks Total Cash & Due from Banks,10.04M,18.91M,25.86M,13.91M,10.06M,
Cash & Due from Banks Growth Cash & Due from Banks Growth,-,88.37%,36.76%,-46.20%,-27.65%,
Investments - Total Investments - Total,1.69B,1.92B,1.67B,3.19B,3.95B,
...
Return On Average Total Equity Return On Average Total Equity,-,-,-,-,24.66%,
Accumulated Minority Interest Accumulated Minority Interest,-,-,-,-,-,
Total Equity Total Equity,206.29M,367.47M,421.24M,653.73M,810.62M,
Liabilities & Shareholders' Equity Liabilities & Shareholders' Equity,2.72B,3.39B,3.88B,6.37B,9.65B,
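For intuition about what axis=0 and ignore_index=True do in the pd.concat call, here is a toy sketch with two small made-up frames standing in for the two scraped tables. The column names and values are illustrative only.

```python
import pandas as pd

# Made-up stand-ins for the two tables; both share the same columns.
first = pd.DataFrame({"Item": ["Cash", "Investments"], "2020": ["1M", "2M"]})
second = pd.DataFrame({"Item": ["Deposits"], "2020": ["3M"]})

# axis=0 stacks the frames row-wise; ignore_index=True discards each
# frame's own row labels and renumbers the result from 0.
frame = pd.concat([first, second], axis=0, ignore_index=True)

print(frame.index.tolist())  # [0, 1, 2]
print(frame.shape)           # (3, 2)
```

Without ignore_index=True the row labels here would be 0, 1, 0. Since the CSV is written with index=False, the renumbering does not change the file itself; it only keeps the in-memory frame tidy.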
Upvotes: 1