Reputation: 6132
Today I've been working with five DataFrames that are almost the same, but for different courses. They are named df2b2015, df4b2015, df6b2015, df8b2015 and df2m2015.
Every one of those DataFrames has a similarly named column: prom_lect2b_rbd for df2b2015, prom_lect4b_rbd for df4b2015, and so on.
I want to append those DataFrames, but because each of those columns has a different name, they don't line up. I'm trying to rename every one of those columns to prom_lect_rbd, so I can then append them without problems.
Is there a way I can do that with a for loop and regex? If not, is there some other way to do it?
Thanks!
PS: I know some things, like I can turn a column name into what I want using:
re.sub(r'\d(b|m)', '', a)
where a is the column's name. But I can't find a way to combine that with loops and column renaming.
Edit:
DataFrame(s) look like this:
df2b2015:
rbd prom_lect2b_rbd
1 5
2 6
df4b2015:
rbd prom_lect4b_rbd
1 8
2 9
etc.
Upvotes: 0
Views: 109
Reputation: 997
Something like this, with .filter(regex=)? It does assume there is only one matching column per DataFrame, but your example permits that.
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.rand(10, 3), columns=['prom_lect2b_rbd', 'foo', 'bar'])
df2 = pd.DataFrame(np.random.rand(10, 3), columns=['prom_lect4b_rbd', 'foo', 'bar'])

for df in [df1, df2]:
    # Assumes exactly one column name starts with 'prom_lect'
    colname = df.filter(regex=r'^prom_lect').columns[0]
    # rename() returns a copy by default, so rename in place
    df.rename(columns={colname: 'prom_lect_rbd'}, inplace=True)

print(df1)
print(df2)
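One thing to watch with the loop above: rename() returns a new DataFrame unless inplace=True is passed, so the result has to be kept somewhere. Here is a rough sketch that collects the renamed frames in a list and appends them with pd.concat. The two small frames are made up from the sample data in the question, and the names renamed and combined are just placeholders for illustration.

```python
import pandas as pd

# Toy frames modelled on the sample data in the question
df1 = pd.DataFrame({"rbd": [1, 2], "prom_lect2b_rbd": [5, 6]})
df2 = pd.DataFrame({"rbd": [1, 2], "prom_lect4b_rbd": [8, 9]})

renamed = []
for df in [df1, df2]:
    # Assumption: exactly one column per frame matches this pattern
    matches = df.filter(regex=r"^prom_lect\d[bm]_rbd$").columns
    # rename() returns a copy here; the original frame is left untouched
    renamed.append(df.rename(columns={matches[0]: "prom_lect_rbd"}))

# Stack the frames now that the column names agree
combined = pd.concat(renamed, ignore_index=True)
print(combined)
```

Since the originals are not modified, df1 and df2 keep their old column names, which may or may not be what you want.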
Upvotes: 0
Reputation: 6132
Managed to do it. Probably not the most Pythonic way, but it does what I wanted:
import re

dfs = [df2b2015, df4b2015, df6b2015, df8b2015, df2m2015]
cols_lect = ['prom_lect2b_rbd', 'prom_lect4b_rbd', 'prom_lect6b_rbd',
             'prom_lect8b_rbd', 'prom_lect2m_rbd']
# Strip the course code (a digit followed by b or m) from each column name
for j, k in zip(dfs, cols_lect):
    j.rename(columns={k: re.sub(r'\d(b|m)', '', k)}, inplace=True)
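A possible variation that avoids typing out the list of column names: rename() also accepts a function for columns, which it applies to every column name, so the re.sub call can be passed in directly. This is only a sketch on made-up frames shaped like the ones in the question. strip_course and combined are names I invented for the example, and it assumes no other column name contains a digit followed by b or m.

```python
import re
import pandas as pd

# Toy frames standing in for df2b2015, df4b2015 and df2m2015
df2b = pd.DataFrame({"rbd": [1, 2], "prom_lect2b_rbd": [5, 6]})
df4b = pd.DataFrame({"rbd": [1, 2], "prom_lect4b_rbd": [8, 9]})
df2m = pd.DataFrame({"rbd": [1, 2], "prom_lect2m_rbd": [3, 4]})


def strip_course(col):
    """Remove the course code (a digit followed by b or m) from a column name."""
    return re.sub(r"\d[bm]", "", col)


# rename() calls strip_course on each column name; 'rbd' has no digit, so it is unchanged
combined = pd.concat(
    [df.rename(columns=strip_course) for df in [df2b, df4b, df2m]],
    ignore_index=True,
)
print(combined)
```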
Upvotes: 1