Reputation: 1019
I have 2 dataframes named geostat and ref, the dataframes are as follows:
geostat:
count percent grpno. state code
0 14.78 1 CA
1 0.00 2 CA
2 8.80 3 CA
3 9.60 4 FL
4 55.90 4 MA
5 0.00 2 FL
6 0.00 6 NC
7 0.00 5 NC
8 6.90 1 FL
9 59.00 4 MA
res:
grpno. MaxOfcount percent
0 1 14.78
1 2 0.00
2 3 8.80
3 4 59.00
4 5 0.00
5 6 0.00
I want to select the first(res.Maxofcount percent), res.grpno., and geostat.first(statecode) from the dataframe geostat and res inner join on columns res.Maxofcount percent = geostat.count percent AND res. grpno. = geostat.grpno. group by res.grpno.
I want to do this python pandas, I am not sure on how to do inner join with group by.Can anyone help me on this?
The output dataframe is given below:
FirstOfMaxOfState count percent state pool number FirstOfstate code
0 14.78 1 CA
1 0.00 2 CA
2 8.80 3 CA
3 59.00 4 MA
4 0.00 5 NC
5 0.00 6 NC
NOTE: FIRST(Column name) is an access function what should be equivalent of it in python?
EDITED: Changed the output dataframe.
Upvotes: 0
Views: 1982
Reputation: 42885
geostat.merge(res, left_on=['count percent', 'grpno.'], right_on=['MaxOfcount percent', 'grpno.'],how='inner')
count percent grpno. state code MaxOfcount percent
0 14.78 1 CA 14.78
1 0.00 2 CA 0.00
2 0.00 2 FL 0.00
3 8.80 3 CA 8.80
4 0.00 6 NC 0.00
5 0.00 5 NC 0.00
6 59.00 4 MA 59.00
Upvotes: 1