I have 2 different dataframes of coin flips. I want to write a function that finds 2 things for each user: the average score and the number of rounds played.
Is it possible to make the function dynamic for n columns?
import pandas as pd
import numpy as np
df = pd.DataFrame({'Users': ['Bob', 'Jim', 'Ted', 'Jesus', 'James'],
                   'Round 1': ['np.nan', 'H', 'np.nan', 'T', 'H'],
                   'Round 2': ['np.nan', 'H', 'H', 'H', 'T'],
                   'Round 3': ['np.nan', 'T', 'T', 'T', 'T'],
                   })
df2 = pd.DataFrame({'Users': ['Boob', 'Paul', 'Todd', 'Zeus', 'Derrik'],
                    'Round 1': ['H', 'H', 'np.nan', 'T', 'np.nan'],
                    'Round 3': ['H', 'T', 'H', 'T', 'np.nan'],
                    'Round 5': ['H', 'T', 'H', 'T', 'np.nan'],
                    'Round 7': ['H', 'H', 'H', 'H', 'H'],
                    })
df = df.set_index('Users')
df2 = df2.set_index('Users')
print(df)
print(df2)
Here is what I tried:
def score(data):
    score_map = {'H': 1, 'T': 0}
    data = data.replace(score_map)
    data['average'] =
    data['rounds played'] =

df = score(df)
I am guessing I have to use groupby, if this is possible at all.
The results should look something like this:
       Round 1  Round 2  Round 3  Average  Rounds played
Users
Bob     np.nan   np.nan   np.nan      NaN              0
Jim          1        1        0     0.67              3
Ted     np.nan        1        0     0.50              2
Jesus        0        1        0     0.33              3
James        1        0        0     0.33              3
[5 rows x 5 columns]
Reputation: 6383
In [104]: def score_map(x):
   .....:     if x=='H': return 1
   .....:     if x=='T': return 0
   .....:     return np.nan
   .....:
In [105]: def score(data):
   .....:     return_df = data.applymap(score_map)
   .....:     avg = return_df.mean(axis=1)
   .....:     nrounds = return_df.count(axis=1)
   .....:     return_df['Average'] = avg
   .....:     return_df['Rounds Played'] = nrounds
   .....:     return return_df
   .....:
In [106]: score(df)
Out[106]:
       Round 1  Round 2  Round 3   Average  Rounds Played
Users
Bob        NaN      NaN      NaN       NaN              0
Jim          1        1        0  0.666667              3
Ted        NaN        1        0  0.500000              2
Jesus        0        1        0  0.333333              3
James        1        0        0  0.333333              3
[5 rows x 5 columns]
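Nothing in the function above refers to a column by name, so it should already work for any number of round columns, and no groupby is needed. For reference, here is a minimal sketch of the same idea without the element-wise applymap call (which, as far as I know, is deprecated in pandas 2.1 and later in favor of DataFrame.map). It assumes every cell is either 'H', 'T', or something that should count as "not played" (the string 'np.nan' from the question, or a real NaN). The variable names valid, scored and result are my own, not from the question.

```python
import pandas as pd


def score(data):
    """Return a scored copy of `data` plus per-user summary columns."""
    # True only where a flip was actually recorded.
    valid = data.isin(["H", "T"])
    # H -> 1.0, T -> 0.0; anything else is masked to NaN.
    scored = data.eq("H").astype(float).where(valid)
    # Compute both summaries before adding columns, so the
    # summary columns never feed into each other.
    average = scored.mean(axis=1)        # NaN cells are skipped
    rounds_played = scored.count(axis=1)  # counts non-NaN cells only
    scored["Average"] = average
    scored["Rounds Played"] = rounds_played
    return scored


# The second frame from the question: four rounds instead of three.
df2 = pd.DataFrame({
    "Users": ["Boob", "Paul", "Todd", "Zeus", "Derrik"],
    "Round 1": ["H", "H", "np.nan", "T", "np.nan"],
    "Round 3": ["H", "T", "H", "T", "np.nan"],
    "Round 5": ["H", "T", "H", "T", "np.nan"],
    "Round 7": ["H", "H", "H", "H", "H"],
}).set_index("Users")

result = score(df2)
print(result)
```

Because the mean and the count are taken across axis=1 over whatever columns the frame has, the same call handles df (3 rounds) and df2 (4 rounds) unchanged. A user with no recorded flips gets an Average of NaN and a Rounds Played of 0, matching Bob in the output above.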