BhishanPoudel

Reputation: 17164

How to create new column based on top and bottom parts of single dataframe in PANDAS?

I have appended two dataframes that have the same column names. Is there an easy way to add a column holding the row-wise mean of the two appended dataframes?

Maybe code explains it better.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a':[1,2,3,4],'b':[10,20,30,40]})
df2 = pd.DataFrame({'a':[1.2,2.2,3.2,4.2],'b':[10.2,20.2,30.2,40.2]})

df = pd.concat([df1, df2])  # DataFrame.append was removed in pandas 2.0
print(df)

df['a_mean'] = ???

     a     b
0  1.0  10.0
1  2.0  20.0
2  3.0  30.0
3  4.0  40.0
0  1.2  10.2
1  2.2  20.2
2  3.2  30.2
3  4.2  40.2

How can I efficiently create a new column a_mean with values [1.1, 2.1, 3.1, 4.1, 1.1, 2.1, 3.1, 4.1]?

Upvotes: 1

Views: 93

Answers (2)

BhishanPoudel

Reputation: 17164

Try this:

df['a_mean'] = np.tile( (df1.a.to_numpy() + df2.a.to_numpy())/2, 2)

As noted in the comments, there is already a great answer by anky, but this method can be extended to work from df alone by splitting it into its top and bottom halves:

df['a_mean2'] = np.tile( (df.iloc[0: len(df)//2].a.to_numpy() + df.iloc[len(df)//2:].a.to_numpy())/2, 2)

Update:

df['a_mean3'] = np.tile(df.a.to_numpy().reshape(2,-1).mean(0), 2)

Output

print(df)
     a     b   a_mean  a_mean2  a_mean3
0  1.0  10.0      1.1     1.1      1.1
1  2.0  20.0      2.1     2.1      2.1
2  3.0  30.0      3.1     3.1      3.1
3  4.0  40.0      4.1     4.1      4.1
0  1.2  10.2      1.1     1.1      1.1
1  2.2  20.2      2.1     2.1      2.1
2  3.2  30.2      3.1     3.1      3.1
3  4.2  40.2      4.1     4.1      4.1
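
To make the reshape step in the update easier to follow, here is a minimal sketch that breaks it into named pieces. It assumes df consists of exactly two equal-length halves stacked on top of each other; the names halves and col_mean are only illustrative, and pd.concat is used to build df.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [10, 20, 30, 40]})
df2 = pd.DataFrame({'a': [1.2, 2.2, 3.2, 4.2], 'b': [10.2, 20.2, 30.2, 40.2]})
df = pd.concat([df1, df2])

# Reshape the 8 values into 2 rows: row 0 is the top half, row 1 the bottom half.
halves = df['a'].to_numpy().reshape(2, -1)

# Average down the rows, giving one mean per original row position.
col_mean = halves.mean(axis=0)

# Repeat the 4 means twice so the result lines up with all 8 rows of df.
df['a_mean3'] = np.tile(col_mean, 2)
print(df)
```

Because this works on raw NumPy arrays, it ignores the index entirely and relies only on row order.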

Upvotes: 1

anky

Reputation: 75080

Using melt() on the element-wise mean of df1 and df2:

df=df.assign(a_mean=df1.add(df2).div(2).melt().value)

Or taking only df, you can do:

df=df.assign(a_mean=df.groupby(df.index)['a'].mean())

     a     b  a_mean
0  1.0  10.0     1.1
1  2.0  20.0     2.1
2  3.0  30.0     3.1
3  4.0  40.0     4.1
0  1.2  10.2     1.1
1  2.2  20.2     2.1
2  3.2  30.2     3.1
3  4.2  40.2     4.1
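
As a possible variation on the groupby approach (not part of the original answer), transform('mean') returns a result with the same shape and index as df, so the assignment does not depend on index alignment. This is a sketch that assumes both halves share the same index labels, as they do here.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3, 4], 'b': [10, 20, 30, 40]})
df2 = pd.DataFrame({'a': [1.2, 2.2, 3.2, 4.2], 'b': [10.2, 20.2, 30.2, 40.2]})
df = pd.concat([df1, df2])

# Group rows that share an index label (0, 1, 2, 3) and broadcast
# each group's mean back onto every row of that group.
df['a_mean'] = df.groupby(level=0)['a'].transform('mean')
print(df)
```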

Upvotes: 1
