Reputation: 608
I have a list of NumPy arrays, each holding index labels of a pandas DataFrame.
I need to group the DataFrame by index so that each array yields one group.
Say this is the df:
index values
0 2
1 3
2 2
3 2
4 4
5 4
6 1
7 4
8 4
9 4
and this is the list of NumPy arrays:
[array([0, 1, 2, 3]), array([6, 7, 8])]
From this data I expect to get 2 groups, without loop operations, as a single groupby object:
group1:
index values
0 2
1 3
2 2
3 2
group2:
index values
6 1
7 4
8 4
To stress again: in the end I need a single groupby object.
Thank you!
Upvotes: 0
Views: 3396
Reputation: 323226
I am still using a for loop to create the groupby key dict:
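from collections import ChainMap
import numpy as np
import pandas as pd
l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]
df = pd.DataFrame([2, 3, 2, 2, 4, 4, 1, 4, 4, 4], columns=['values'])
df.index.name = 'index'
# Map every index label to the position of the array that contains it
L = dict(ChainMap(*[dict.fromkeys(y, x) for x, y in enumerate(l)]))
list(df.groupby(L))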
Out[33]:
[(0.0, values
index
0 2
1 3
2 2
3 2), (1.0, values
index
6 1
7 4
8 4)]
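The key dict above can also be built without the comprehension over `enumerate`. The following is only a sketch, assuming the arrays are disjoint (no index label appears in two arrays); the names `arrays`, `labels`, `key` and `grouped` are made up for illustration. It repeats each array's position once per element and uses the result as a Series keyed by index label, which `groupby` aligns against the DataFrame index. Labels missing from every array get NaN and are dropped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"values": [2, 3, 2, 2, 4, 4, 1, 4, 4, 4]})
df.index.name = "index"
arrays = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]  # assumed disjoint

# Array i contributes len(arrays[i]) copies of the group label i
labels = np.repeat(np.arange(len(arrays)), [len(a) for a in arrays])

# Series mapping index label -> group label; groupby aligns it on df.index
key = pd.Series(labels, index=np.concatenate(arrays))

grouped = df.groupby(key)  # a single groupby object
groups = {name: g["values"].tolist() for name, g in grouped}
```

Because of the alignment step the group keys come out as floats (0.0, 1.0), as in the output above.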
Upvotes: 3
Reputation: 153460
This seems like an X-Y problem:
# Assumes df holds 'index' as a regular column, e.g. after df.reset_index()
l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]
df_indx = pd.DataFrame(l).stack().reset_index()
df_new = df.assign(foo=df['index'].map(df_indx.set_index(0)['level_0']))
for n, g in df_new.groupby('foo'):
    print(g)
Output:
index values foo
0 0 2 0.0
1 1 3 0.0
2 2 2 0.0
3 3 2 0.0
index values foo
6 6 1 1.0
7 7 4 1.0
8 8 4 1.0
Upvotes: 0
Reputation: 1012
import numpy as np
import pandas as pd
df = pd.DataFrame([2, 3, 2, 2, 4, 4, 1, 4, 4, 4], columns=['values'])
df.index.name = 'index'
l = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]
group1 = df.loc[pd.Series(l[0])]
group2 = df.loc[pd.Series(l[1])]
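The per-array selections above are separate DataFrames. One possible way to turn them into the single groupby object the question asks for is to concatenate them with an outer index level and group on that level. This is a sketch, not necessarily what the answer intended; the names `arrays`, `stacked` and `grouped_by_level` and the level name `group` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"values": [2, 3, 2, 2, 4, 4, 1, 4, 4, 4]})
df.index.name = "index"
arrays = [np.array([0, 1, 2, 3]), np.array([6, 7, 8])]

# Select each array's rows, then stack them; `keys` adds an outer
# index level (named "group" here) holding the array position
stacked = pd.concat(
    [df.loc[a] for a in arrays],
    keys=list(range(len(arrays))),
    names=["group", "index"],
)

grouped_by_level = stacked.groupby(level="group")  # a single groupby object
sizes = grouped_by_level.size().to_dict()
```

Unlike a key-based groupby on the original DataFrame, this variant would also tolerate an index label that appears in more than one array, because each selection is copied into `stacked` separately.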
Upvotes: 2