Reputation: 3275
I have this code that saves downloaded yahoo finance into a dataframe:
import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas
resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser")
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
ticker = row.findAll('td')[0].text
tickers.append(ticker)
tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2019,1,1)
end = datetime.datetime(2019,7,17)
data = yf.download(tickers, start=start, end=end)
print(data)
print(type(data))
data.to_csv('stock_data.csv')
I get these results:
Adj Close ... Volume
A AAL AAP ... ZBH ZION ZTS
Date ...
2018-12-31 NaN NaN NaN ... NaN NaN NaN
2019-01-02 65.271561 32.081280 157.739120 ... 1152100.0 2234100.0 2665600.0
2019-01-03 62.866974 29.690985 162.663467 ... 1166100.0 2078400.0 2390900.0
2019-01-04 65.043022 31.646681 158.628098 ... 1580400.0 2370500.0 3383500.0
2019-01-07 66.424164 32.545509 160.955429 ... 900300.0 2459700.0 2360800.0
... ... ... ... ... ... ... ...
2019-07-10 73.212486 32.718483 156.368958 ... 632600.0 1404700.0 1763700.0
2019-07-11 71.585976 32.807880 156.229019 ... 885000.0 1500800.0 1588000.0
2019-07-12 71.496178 33.552837 158.518127 ... 644100.0 1565400.0 1473400.0
2019-07-15 70.398537 33.383980 158.857986 ... 1188100.0 1415200.0 1255200.0
2019-07-16 69.799820 33.989880 161.696884 ... 1099400.0 1508700.0 1214600.0
How can I save these into separate CSV files (one for "A", one for "AAL", one for "AAP", etc)?
Upvotes: 1
Views: 929
Reputation: 3275
Here is what I was looking for:
import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas
resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser")
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
ticker = row.findAll('td')[0].text
tickers.append(ticker)
tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2010,1,1)
end = datetime.datetime(2019,11,18)
i=1
for ticker in tickers:
print(i, "Ticker is : ", ticker)
i=i+1
data = yf.download(ticker, start=start, end=end)
data.to_csv(ticker+'.csv')
Upvotes: 0
Reputation: 2361
Since you have MultiIndex
as your index, you want first to get a list of the relevant column names. Then, you can filter on them to get the proper sub-dataframes.
import pandas as pd
df = pd.read_csv('stock_data.csv')
stock_names = df.columns.get_level_values(1)
for stock in stock_names.unique():
df.loc[:, stock_names == stock].to_csv(stock + '.csv')
If the levels in the index are named, you should use the name in get_level_values
, for better readability.
If you want to get rid of the pesky level in the divided dataframe, either drop the levels:
stock_df = df.loc[:, stock_names == stock].copy()
stock_df.columns = stock_df.columns.droplevel(1)
or reorder the levels, so you can access the dataframe using more "naive" access:
df.columns = df.columns.reorder_levels([1,0])
...
df[stock].to_csv(stock + '.csv')
Upvotes: 0
Reputation: 739
You just have to iterate for for every column, for example you should do something like this:
import pandas as pd
df = pd.read_csv('stock_data.csv')
for column in df.columns:
df[column].to_csv(column + '.csv')
Upvotes: 1