ziulfer
ziulfer

Reputation: 1369

Concatenate multiple .csv dataframe with multiindex

I am concatenating multiple dfs which look like these:

                 X                  Y
                 mean   std size   mean         std  size
In_X                    
(10.424, 10.43] 10.425  NaN  1      0.003786    NaN   1
(10.43, 10.435] 10.4    NaN  0      NaN         NaN   0

When I didn't have multiindex dfs, I was using:

extension='csv'
all_filenames = [i for i in glob.glob('*.{}'.format(extension))]
all_dfs = pd.concat([pd.read_csv(f) for f in all_filenames ])

But this introduces a row:

mean   std size   mean          std  size

Every time a new df is concatenated to all_dfs. How to have only the original multiindex header and avoid the introduction of the second-level header in the concatenated df?

Upvotes: 1

Views: 59

Answers (2)

Quang Hoang
Quang Hoang

Reputation: 150805

read_csv by defaults only take first row as header. You want to do specify two-row header with header:

all_dfs = pd.concat([pd.read_csv(f, header=[0,1] for f in all_filenames ])

Upvotes: 2

NYC Coder
NYC Coder

Reputation: 7604

Convert your multi-index to regular columns like this:

df.columns = df.columns.map('_'.join)

And then use pd.concat

Upvotes: 1

Related Questions