Reputation: 47
I would like to get {(A):[12,14], (B):[3,5], (C,E):[8,2], (D,F):[4,1,3,7]}
from the dataframe below:
class  type  c1   c2   c3
A      0     12   14   nan
B      1     nan  3    5
C      2     8    nan  2
D      3     4    1    3
E      2     nan  nan  nan
F      3     nan  7    nan
I am having trouble grouping the values from the last columns, because my dataframe can have more columns than the three shown here.
I basically do:
df.groupby('type')['class'].unique()
to get the list of classes
But I can't get the list of matching values without writing a line for each column.
Upvotes: 1
Views: 62
Reputation: 153460
Here's another way:
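def makelist(x):
    # non-null values of one column within a group, as a list
    return list(x.dropna())

(df.groupby('type')
   .agg({'class': tuple,
         'c1': makelist,
         'c2': makelist,
         'c3': makelist})
   .set_index('class')
   # summing object columns row-wise concatenates the per-column lists
   .sum(axis=1).to_frame()
   .T.to_dict(orient='records'))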
Output:
[{('A',): [12.0, 14.0],
('B',): [3.0, 5.0],
('C', 'E'): [8.0, 2.0],
('D', 'F'): [4.0, 1.0, 7.0, 3.0]}]
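The agg dict above still names c1, c2 and c3 by hand. If the number of value columns is not known in advance, they can be discovered from the frame instead. This is only a minimal sketch, not the same agg chain: it uses a plain loop over the groups, it assumes every column other than class and type holds values, and the names value_cols and result are made up for the example. Like the output above, it collects values column by column.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "class": list("ABCDEF"),
    "type": [0, 1, 2, 3, 2, 3],
    "c1": [12, np.nan, 8, 4, np.nan, np.nan],
    "c2": [14, 3, np.nan, 1, np.nan, 7],
    "c3": [np.nan, 5, 2, 3, np.nan, np.nan],
})

# Assumption: everything except 'class' and 'type' is a value column
value_cols = [c for c in df.columns if c not in ("class", "type")]

result = {}
for _, group in df.groupby("type"):
    values = []
    for col in value_cols:
        # non-null values of this column, in row order within the group
        values.extend(group[col].dropna().tolist())
    result[tuple(group["class"])] = values

print(result)
```

Adding a c4 or c5 column to df needs no change to the code, since value_cols is rebuilt from df.columns.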
Upvotes: 0
Reputation: 2417
This does the trick
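import pandas as pd

def process_row(row):
    # row is the sub-frame for one 'type'; columns from position 2 on are c1, c2, ...
    # flatten them row by row, skipping NaN
    values = [x for y in row.iloc[:, 2:].values for x in y if not pd.isnull(x)]
    return {tuple(row['class']): values}

s = df.groupby('type').apply(process_row)
# merge the one-entry dicts into a single dict
res = dict()
for di in s:
    res.update(di)
print(res)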
# {('A',): [12.0, 14.0], ('B',): [3.0, 5.0], ('C', 'E'): [8.0, 2.0], ('D', 'F'):
# [4.0, 1.0, 3.0, 7.0]}
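The iloc[:, 2:] slice relies on class and type being the first two columns of each group. A variant that picks the value columns by name does not depend on column position or on whether the grouped frame still carries the type column. This is a rough sketch under a few assumptions: the value columns are numeric, every column other than class and type is a value column, and collect_rowwise is a made-up helper name.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
df = pd.DataFrame({
    "class": list("ABCDEF"),
    "type": [0, 1, 2, 3, 2, 3],
    "c1": [12, np.nan, 8, 4, np.nan, np.nan],
    "c2": [14, 3, np.nan, 1, np.nan, 7],
    "c3": [np.nan, 5, 2, 3, np.nan, np.nan],
})

def collect_rowwise(frame, key_col="class", group_col="type"):
    """Hypothetical helper: map each group's classes to its non-null values."""
    # Assumption: all remaining columns are numeric value columns
    value_cols = [c for c in frame.columns if c not in (key_col, group_col)]
    out = {}
    for _, group in frame.groupby(group_col):
        # ravel() walks the block row by row (C order)
        flat = group[value_cols].to_numpy(dtype=float).ravel()
        out[tuple(group[key_col])] = flat[~np.isnan(flat)].tolist()
    return out

res = collect_rowwise(df)
print(res)
```

Values come out row by row, so (D, F) gives the 4, 1, 3, 7 order asked for in the question.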
Upvotes: 2