warrenfitzhenry
warrenfitzhenry

Reputation: 2299

Concatenating csv columns with common index into one df

I have various csv files that have the same columns, and need to concatenate the data into one dataframe/csv for an index of 1-24. The name of each csv file then becomes the name of it's relevant column. Example:

one.csv                 two.csv
Time  hour energy     Time  hour energy     
00:59   1    0        00:59   1    0  
01:59   2    3        01:59   2    5
02:59   3    2        02:59   3    8
...                   ...
23:59   24   6        23:59  24    3

Combined.csv
hour  one  two
1       0   0
2       3   5
3       2   8
...
24      6   3

The closest I have seen is this:

path = r'C:\files'
all_files = glob.glob(os.path.join(path, "*.csv"))   
df_from_each_file = (pd.read_csv(f) for f in all_files)
concatenated_df   = pd.concat(df_from_each_file, ignore_index=True)

But it's not quite right as it adds rows, not columns.

Upvotes: 0

Views: 138

Answers (1)

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210832

You should use pd.concat(..., axis=1) parameter in order to concatenate DFs horizontally:

import os
import glob
import pandas as pd

In [46]: files = glob.glob(r'D:\temp\.data\42011160\*.csv')

In [47]: pd.concat([pd.read_csv(f, usecols=['hour', 'energy'], index_col='hour')
    ...:              .rename(columns=lambda x: os.path.basename(os.path.splitext(f)[0]))
    ...:            for f in files],
    ...:           axis=1).reset_index()
    ...:
Out[47]:
   hour  1  2
0     1  0  0
1     2  3  5
2     3  2  8
3    24  6  3

where:

In [48]: files
Out[48]: ['D:\\temp\\.data\\42011160\\1.csv', 'D:\\temp\\.data\\42011160\\2.csv']

Upvotes: 1

Related Questions