Reputation: 47
I have groupby table:
df.groupby(['Age', 'Movie']).mean()
User Raitings
Age Movie
1 1 4.666667 7.666667
2 4.666667 8.000000
3 2.000000 7.500000
4 2.000000 5.500000
5 3.000000 7.000000
18 1 3.000000 7.500000
2 3.000000 8.000000
3 3.000000 8.500000
25 1 8.000000 7.250000
2 8.000000 7.500000
3 5.500000 8.500000
4 5.000000 7.000000
45 1 9.000000 7.500000
2 9.000000 7.500000
3 11.000000 7.000000
4 11.000000 6.000000
60 1 8.000000 7.000000
2 8.000000 9.000000
3 8.000000 7.000000
please, help with function, which takes integer (Age) and return Movie with MIN raitings in this Age-group. Example def(1) should return 4 (min Raitings in group Age(1) = 5.5, Movies(5.5) = 4)
I can get min Raiting:
df['Raitings'].min()
But i don't know - how to get raiting in particular group (Age)?
Upvotes: 2
Views: 8663
Reputation: 99
I will reshape and do pivot. Think it will help
df.reset_index(inplace = true)
df_Min = pd.pivot_table(df,index = [‘Movie’, ‘User’], columns =‘Age’, values = ‘Raiting’, aggfunc = min )
Upvotes: 0
Reputation: 294488
This gets all of them in one go.
df.groupby('Age').Raitings.idxmin().str[-1]
Age
1 4
18 1
25 4
45 4
60 1
Name: Raitings, dtype: int64
If you want a function, I'd use pd.DataFrame.xs
(xs is for cross section).
By default, xs
will grab from the first level of the index and subsequently drop that level. This conveniently leaves the level in which we want to draw the value in which idxmin
will hand us.
def f(df, age):
return df.xs(age).Raitings.idxmin()
f(df, 1)
4
Setup
Useful for those who try to parse this stuff.
txt = """\
Age Movie User Raitings
1.0 1 4.666667 7.666667
2 4.666667 8.000000
3 2.000000 7.500000
4 2.000000 5.500000
5 3.000000 7.000000
18.0 1 3.000000 7.500000
2 3.000000 8.000000
3 3.000000 8.500000
25.0 1 8.000000 7.250000
2 8.000000 7.500000
3 5.500000 8.500000
4 5.000000 7.000000
45.0 1 9.000000 7.500000
2 9.000000 7.500000
3 11.000000 7.000000
4 11.000000 6.000000
60.0 1 8.000000 7.000000
2 8.000000 9.000000"""
df = pd.read_fwf(pd.io.common.StringIO(txt))
df = df.ffill(downcast='infer').set_index(['Age', 'Movie'])
Upvotes: 4
Reputation: 210902
Source multi-index DF:
In [221]: x
Out[221]:
User Raitings
Age Movie
1.0 1 4.666667 7.666667
2 4.666667 8.000000
3 2.000000 7.500000
4 2.000000 5.500000
5 3.000000 7.000000
18.0 1 3.000000 7.500000
2 3.000000 8.000000
3 3.000000 8.500000
25.0 1 8.000000 7.250000
2 8.000000 7.500000
3 5.500000 8.500000
4 5.000000 7.000000
45.0 1 9.000000 7.500000
2 9.000000 7.500000
3 11.000000 7.000000
4 11.000000 6.000000
60.0 1 8.000000 7.000000
2 8.000000 9.000000
3 8.000000 7.000000
Function:
In [222]: def f(df, age):
...: return df.loc[pd.IndexSlice[age,:], 'Raitings'].idxmin()[1]
...:
Test:
In [223]: f(x, age=1)
Out[223]: 4
Upvotes: 4
Reputation: 1478
If you want the minimum for a specific age, you can do :
df["Age"==1]['Raitings'].min()
If you want to do it for the whole dataframe, you can do:
df.groupby("Age").agg({ "Raitings" : "min" })
I hope it helps,
Upvotes: 0