Reputation: 1311
I have a dataframe that looks like this:
ID Description
1 A
1 B
1 C
2 A
2 C
3 A
I would like to group by the ID column and get the descriptions as a list of lists, like this:
ID Description
1 [["A"],["B"],["C"]]
2 [["A"],["C"]]
3 [["A"]]
I tried df.groupby('ID')['Description'].apply(list), but this creates only the "first level" of lists.
Upvotes: 2
Views: 903
Reputation: 862481
You need to create the inner lists:
print (df)
ID Description
0 1 Aas
1 1 B
2 1 C
3 2 A
4 2 C
5 3 A
df = df['Description'].apply(lambda x: [x]).groupby(df['ID']).apply(list).reset_index()
Another solution, similar to @jp_data_analysis's, with a single apply:
df = df.groupby('ID')['Description'].apply(lambda x: [[y] for y in x]).reset_index()
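For reference, here is a minimal self-contained sketch of the two groupby variants above, assuming the sample data from the printed frame. The names two_step and one_apply are only illustrative, and the results are kept in separate variables so that df is not overwritten between variants.

```python
import pandas as pd

# Sample data matching the frame printed above (an assumption for this sketch).
df = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2, 3],
    'Description': ['Aas', 'B', 'C', 'A', 'C', 'A'],
})

# Variant 1: wrap each value in its own list, then collect the lists per ID.
two_step = (df['Description'].apply(lambda x: [x])
              .groupby(df['ID']).apply(list).reset_index())

# Variant 2: build the nested list inside a single groupby apply.
one_apply = (df.groupby('ID')['Description']
               .apply(lambda x: [[y] for y in x]).reset_index())

print(two_step['Description'].tolist())
print(one_apply['Description'].tolist())
```

Both variants should give the same nested lists per ID, so the choice is mostly a matter of readability.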
And a pure Python solution:
a = list(zip(df['ID'], df['Description']))
d = {}
for k, v in a:
    # append each description, wrapped in its own list, under its ID
    d.setdefault(k, []).append([v])
df = pd.DataFrame({'ID': list(d.keys()), 'Description': list(d.values())},
                  columns=['ID', 'Description'])
print (df)
ID Description
0 1 [[Aas], [B], [C]]
1 2 [[A], [C]]
2 3 [[A]]
Upvotes: 2
Reputation: 164623
This differs slightly from @jezrael's answer in that the strings are turned into lists via map. In addition, calling reset_index() adds "Description" explicitly to the output.
import pandas as pd
df = pd.DataFrame([[1, 'A'], [1, 'B'], [1, 'C'], [2, 'A'], [2, 'C'], [3, 'A']], columns=['ID', 'Description'])
df.groupby('ID')['Description'].apply(list).apply(lambda x: list(map(list, x))).reset_index()
#    ID      Description
# 0   1  [[A], [B], [C]]
# 1   2       [[A], [C]]
# 2   3            [[A]]
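One thing to keep in mind with map(list, x): calling list on a string splits it into characters, so this variant matches the desired output only when every description is a single character, as in the sample data here. A small sketch of the difference, using a multi-character value such as 'Aas' purely for illustration:

```python
values = ['Aas', 'B']

# list() iterates over each string, so longer strings are split into characters.
split_chars = list(map(list, values))

# Wrapping each value explicitly keeps the whole string as one element.
wrapped = [[v] for v in values]

print(split_chars)  # [['A', 'a', 's'], ['B']]
print(wrapped)      # [['Aas'], ['B']]
```

If descriptions can be longer than one character, replacing map(list, x) with the explicit wrapping shown above is probably the safer choice.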
Upvotes: 2