Reputation: 59
Name Runs Score
Richard 2000 2
Richard 3000 3
Richard 7000 8
Richard 8000 7
Richard 2000 4
Richard 9000 7.5
lessthan1 = str(len(df[(df['Score']>=0) & (df['Score']<1)]))
oneto5 = str(len(df[(df['Score']>=1) & (df['Score']<5)]))
fiveto10 = str(len(df[(df['Score']>=5) & (df['Score']<10)]))
For example in category >=1 to <5..am expecting (2000+3000+2000)/31000
Upvotes: 0
Views: 609
Reputation: 825
Also, you can use pandas apply functions with lambdas:
df.loc[df['Score'].apply(lambda x: x < 1 and x > 0), 'Runs'].mean()
df.loc[df['Score'].apply(lambda x: x >= 1 and x < 5), 'Runs'].mean()
df.loc[df['Score'].apply(lambda x: x >= 5 and x < 10), 'Runs'].mean()
Result:
5166.666666666667
nan
2333.3333333333335
To obtain mean averaged over total sum in score you have to change mean()
by sum()
and divide by total sum()
. For example
df.loc[df['Score'].apply(lambda x: x >= 1 and x < 5), 'Runs'].sum()/df['Runs'].sum()
Result:
0.22580645161290322 = (2000 + 3000 + 2000)/31000
Upvotes: 0
Reputation: 142216
Looks like you can use pd.cut
to bin the score data and use that to group the runs, eg:
df.groupby(pd.cut(df['Score'], [0, 1, 5, 10]))['Runs'].mean()
Gives you:
Score
(0, 1] NaN
(1, 5] 2333.333333
(5, 10] 8000.000000
Name: Runs, dtype: float64
Upvotes: 3
Reputation: 5774
import pandas as pd
array = [
["Richard", 2000, 2],
["Richard", 3000, 3],
["Richard", 7000, 8],
["Richard", 8000, 7],
["Richard", 2000, 4],
["Richard", 9000, 7.5]]
df = pd.DataFrame(array, columns=["Name", "Runs", "Score"])
print(df[(df['Score'] >= 0) & (df['Score'] < 1)]["Runs"].mean())
print(df[(df['Score'] >= 1) & (df['Score'] < 5)]["Runs"].mean())
print(df[(df['Score'] >= 5) & (df['Score'] < 10)]["Runs"].mean())
nan
2333.3333333333335
8000.0
You need to select just the Runs column and then use the .mean() method on that after you implement your boolean selection
Upvotes: 0
Reputation: 739
You could try using boolean masking as follows,
lessthan1 = df.loc[(df['Score'] >= 0) & (df['Score'] <1), 'Runs'].mean()
oneto5 = df.loc[(df['Score']>=1) & (df['Score']<5), 'Runs'].mean()
fiveto10 = df.loc[(df['Score']>=5) & (df['Score']<10), 'Runs'].mean()
print('Less than 1:', lessthan1)
print('One to Five:', oneto5)
print('Five to Ten:', fiveto10)
Upvotes: 0