Reputation: 131
I have a dataframe
L1_1 L1_2 L3_1 L2_1 L2_2 L1_3 L2_3 L3_2 ....
3 5 1 5 7 2 3 2
4 2 4 1 4 1 4 2
I need to find the mean all "L1" , then all "L2" , and then all "L3 "
I tried
data["Mean"]=data.mean(axis=1)
that give me the sum across all "L1, L2 and L3" together
I also tried
data[['L1_1','L1_2','L1_3','Mean']].head()
but I have L1_1 to L1_20
so a loop sounds good. However, I cannot get a loop to work.
for i in range(1,21):
c = "'L1_" + i + "'," + c
Is a loop a good way to go here? or Is there a better? If a loop is the way to go, How do you get a loop to work in a data frame?
Upvotes: 1
Views: 529
Reputation: 863301
Use groupby
by columns (axis=1)
with custom function of splitted values:
df1 = df.groupby(lambda x: x.split('_')[0], axis=1).mean()
#another solution
#df1 = df.groupby(df.columns.str.split('_').str[0], axis=1).mean()
print (df1)
L1 L2 L3
0 3.333333 5.0 1.5
1 2.333333 3.0 3.0
If want add nex columns to original df
add join
with add_suffix
if want also rename columns names:
df = df.join(df1.add_suffix('_mean'))
print (df)
L1_1 L1_2 L3_1 L2_1 L2_2 L1_3 L2_3 L3_2 L1_mean L2_mean L3_mean
0 3 5 1 5 7 2 3 2 3.333333 5.0 1.5
1 4 2 4 1 4 1 4 2 2.333333 3.0 3.0
Upvotes: 2