Reputation: 7644
I have a pandas DataFrame, euc_data, with these columns:
code1 code2 euclidean_distance
I wanted the top 50 rows for every code1 group, sorted by euclidean_distance. To get this I used:
matrix_top_50 = (euc_data.sort_values(['code1', 'euclidean_distance'])
.groupby('code1').head(50).reset_index(drop=True))
Now I want to create another matrix with the next 100 rows for every code1 group, again sorted by euclidean_distance.
For that I tried to use .iloc:
start = 51
end = 151
next_matrix = (euc_data.sort_values(['code1', 'euclidean_distance'])
.groupby('code1').iloc[start:end].reset_index(drop=True))
But I am getting this error:
Cannot access callable attribute 'iloc' of 'DataFrameGroupBy' objects, try using the 'apply' method
How can I achieve this?
Upvotes: 2
Views: 75
Reputation: 1838
Maybe there is a better solution, but you can use apply, as the error hints:
next_matrix = euc_data.sort_values(['code1', 'euclidean_distance'])\
.groupby('code1').apply(lambda x: x.iloc[start:end]).\
reset_index(drop=True)
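For anyone who wants to try this out, here is a minimal sketch of the apply approach on made-up data. The toy frame and the small start/end values are assumptions for illustration only. Note that iloc is zero-based and excludes the end position, so if head(50) covers positions 0 to 49, the following 100 rows would be iloc[50:150] rather than iloc[51:151]. Recent pandas versions may warn about, or leave out, the grouping column inside apply, so the check below only looks at the distance column.

```python
import pandas as pd

# Hypothetical toy data standing in for euc_data.
euc_data = pd.DataFrame({
    'code1': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b'],
    'code2': ['p', 'q', 'r', 's', 't', 'u', 'v', 'w'],
    'euclidean_distance': [0.5, 0.1, 0.4, 0.2, 0.3, 0.9, 0.7, 0.8],
})

# Zero-based, end-exclusive: skip the first 2 rows per group, take the next 2.
start, end = 2, 4

next_matrix = (
    euc_data.sort_values(['code1', 'euclidean_distance'])
    .groupby('code1', group_keys=False)
    .apply(lambda g: g.iloc[start:end])  # slice each sorted group by position
    .reset_index(drop=True)
)

print(next_matrix)
```

Group 'b' has only three rows, so it contributes a single row here; slicing past the end of a group with iloc just returns fewer rows.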
Upvotes: 2
Reputation: 863791
I think you need GroupBy.apply, but the data must actually contain rows between start and end, otherwise you get an error:
next_matrix = (euc_data.sort_values(['code1', 'euclidean_distance'])
.groupby('code1')
.apply(lambda x: x.iloc[start:end])
.reset_index(drop=True)
)
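If some groups might be shorter than start, one possible alternative (not from the answers above, just a sketch) is to number the rows inside each group with GroupBy.cumcount and filter on that number. The helper name rows_by_rank and the toy data are hypothetical.

```python
import pandas as pd

def rows_by_rank(df, key, sort_col, start, end):
    """Return rows whose zero-based position within each `key` group,
    after sorting by `sort_col`, falls in the half-open range [start, end)."""
    ordered = df.sort_values([key, sort_col])
    rank = ordered.groupby(key).cumcount()  # 0, 1, 2, ... within each group
    return ordered[(rank >= start) & (rank < end)].reset_index(drop=True)

# Hypothetical toy data standing in for euc_data.
euc_data = pd.DataFrame({
    'code1': ['a', 'a', 'a', 'a', 'a', 'b', 'b'],
    'code2': ['p', 'q', 'r', 's', 't', 'u', 'v'],
    'euclidean_distance': [0.5, 0.1, 0.4, 0.2, 0.3, 0.9, 0.7],
})

top_2 = rows_by_rank(euc_data, 'code1', 'euclidean_distance', 0, 2)
next_2 = rows_by_rank(euc_data, 'code1', 'euclidean_distance', 2, 4)

print(top_2)
print(next_2)
```

With the real data, the equivalent calls would use the ranges 0 to 50 and 50 to 150. Groups with too few rows simply contribute nothing, and the code1 column stays in the result.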
Upvotes: 1