adrCoder
adrCoder

Reputation: 3275

dataframe save each column in separate CSV files

I have this code that saves downloaded yahoo finance into a dataframe:

import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas

resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser") 
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
    ticker = row.findAll('td')[0].text
    tickers.append(ticker)

tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2019,1,1)
end = datetime.datetime(2019,7,17)
data = yf.download(tickers, start=start, end=end)
print(data)

print(type(data))

data.to_csv('stock_data.csv')

I get these results:

            Adj Close                         ...     Volume
                    A        AAL         AAP  ...        ZBH       ZION        ZTS
Date                                          ...
2018-12-31        NaN        NaN         NaN  ...        NaN        NaN        NaN
2019-01-02  65.271561  32.081280  157.739120  ...  1152100.0  2234100.0  2665600.0
2019-01-03  62.866974  29.690985  162.663467  ...  1166100.0  2078400.0  2390900.0
2019-01-04  65.043022  31.646681  158.628098  ...  1580400.0  2370500.0  3383500.0
2019-01-07  66.424164  32.545509  160.955429  ...   900300.0  2459700.0  2360800.0
...               ...        ...         ...  ...        ...        ...        ...
2019-07-10  73.212486  32.718483  156.368958  ...   632600.0  1404700.0  1763700.0
2019-07-11  71.585976  32.807880  156.229019  ...   885000.0  1500800.0  1588000.0
2019-07-12  71.496178  33.552837  158.518127  ...   644100.0  1565400.0  1473400.0
2019-07-15  70.398537  33.383980  158.857986  ...  1188100.0  1415200.0  1255200.0
2019-07-16  69.799820  33.989880  161.696884  ...  1099400.0  1508700.0  1214600.0

How can I save these into separate CSV files (one for "A", one for "AAL", one for "AAP", etc)?

Upvotes: 1

Views: 929

Answers (3)

adrCoder
adrCoder

Reputation: 3275

Here is what I was looking for:

import bs4 as bs
import requests
import yfinance as yf
import datetime
import pandas

resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
soup = bs.BeautifulSoup(resp.text, "html.parser") 
table = soup.find('table', {'class': 'wikitable sortable'})
tickers = []
for row in table.findAll('tr')[1:]:
    ticker = row.findAll('td')[0].text
    tickers.append(ticker)

tickers = [s.replace('\n', '') for s in tickers]
start = datetime.datetime(2010,1,1)
end = datetime.datetime(2019,11,18)

i=1
for ticker in tickers:
  print(i, "Ticker is : ", ticker)
  i=i+1
  data = yf.download(ticker, start=start, end=end)
  data.to_csv(ticker+'.csv')

Upvotes: 0

tmrlvi
tmrlvi

Reputation: 2361

Since you have MultiIndex as your index, you want first to get a list of the relevant column names. Then, you can filter on them to get the proper sub-dataframes.

import pandas as pd
df = pd.read_csv('stock_data.csv')
stock_names = df.columns.get_level_values(1)
for stock in stock_names.unique():
    df.loc[:, stock_names == stock].to_csv(stock + '.csv')

If the levels in the index are named, you should use the name in get_level_values, for better readability.

If you want to get rid of the pesky level in the divided dataframe, either drop the levels:

stock_df = df.loc[:, stock_names == stock].copy()
stock_df.columns = stock_df.columns.droplevel(1)

or reorder the levels, so you can access the dataframe using more "naive" access:

df.columns = df.columns.reorder_levels([1,0])
...
df[stock].to_csv(stock + '.csv')

Upvotes: 0

Snedecor
Snedecor

Reputation: 739

You just have to iterate for for every column, for example you should do something like this:

import pandas as pd
df = pd.read_csv('stock_data.csv')
for column in df.columns:
    df[column].to_csv(column + '.csv')

Upvotes: 1

Related Questions