Reputation: 441
I am having an issue with getting the correct headers in my pandas dataframes. Below is my full code. I would like to strip all the headers and then add my own at the very end.
import pandas as pd
#Ingest Data
df = pd.read_csv('C:/source/tradingdata/Kraken/XBTEUR.csv', names=['epoc', 'price', 'volume'])
#Convert from epoc to datetime
df['datetime'] = pd.to_datetime(df['epoc'], unit='s')
#set index
df.set_index(df['datetime'], inplace=True, drop=False)
#resample OHLCV data
open_df = df.resample("1T").agg({'price': 'first'})
high_df = df.resample("1T").agg({'price': 'max'})
low_df = df.resample("1T").agg({'price': 'min'})
close_df = df.resample("1T").agg({'price': 'last'})
volume_df = df.resample("1T").agg({'volume': 'sum'})
#create new dataframe
new_df = pd.concat([open_df, high_df, low_df, close_df, volume_df], axis=1, keys=['open', 'high', 'low', 'close', 'volume'])
#only get bars where volume is > 0
new_df = new_df[new_df.volume > 0]
new_df.to_csv('C:/source/tradingdata/Kraken/XBTEUR_1Min.csv')
Dataframe after new_df = pd.concat & the data I am exporting
open high low close volume
price price price price volume
datetime
2021-03-31 23:55:00 49989.4 49998.7 49989.3 49989.3 0.033687
2021-03-31 23:56:00 49989.3 49989.3 49969.2 49969.3 0.013282
2021-03-31 23:57:00 49969.2 49969.2 49931.8 49969.2 1.197345
2021-03-31 23:58:00 49969.1 49969.2 49969.1 49969.2 0.122786
2021-03-31 23:59:00 50027.2 50049.9 50008.3 50008.3 1.271983
What I want the headers to look like.
datetime open high low close volume
2021-03-31 23:55:00 49989.4 49998.7 49989.3 49989.3 0.033687
2021-03-31 23:56:00 49989.3 49989.3 49969.2 49969.3 0.013282
2021-03-31 23:57:00 49969.2 49969.2 49931.8 49969.2 1.197345
2021-03-31 23:58:00 49969.1 49969.2 49969.1 49969.2 0.122786
2021-03-31 23:59:00 50027.2 50049.9 50008.3 50008.3 1.271983
Upvotes: 0
Views: 49
Reputation: 23217
As you have created hierarchical index using the passed keys at the key=
parameter of the call to pd.concat()
on top of the column index of previous aggregations, MultiIndex
columns were created.
To remove the unwanted level of MultiIndex
of columns, you can do a .droplevel()
, as follows:
new_df = new_df.droplevel(level=1, axis=1)
Upvotes: 2
Reputation: 13407
Your column Index
has become a MultiIndex
because of concatenation, you should be able to reassign your columns with the first level from that MultiIndex
new_df.columns = new_df.columns.levels[0]
Upvotes: 3