fortuneRice
fortuneRice

Reputation: 4204

Pandas: collapse rows with same column ID and concatenate multiple columns into 1

Without iterating over the dataframe, is it possible using pandas methods to "collapse" from this

index    ID     fruits    qty
0        101    oranges   1
1        134    apples    2
2        576    pears     3
3        837    apples    1
4        576    oranges   2
5        134    pears     3

to this (and generate a new description column in the process that sort of summarizes the info into a single string):

index    ID     description
0        101    101: orange(1)
1        134    134: apples(2) pears(3)
2        576    576: pears(3) oranges(2)
3        837    837: apples(1)

Note that this is a spin on this question.

Upvotes: 0

Views: 1078

Answers (2)

jezrael
jezrael

Reputation: 862741

First join columns fruits with qty, then aggregate join and last add ID converted to strings:

df['description'] = df['fruits'] + '(' + df['qty'].astype(str) + ')'

df = df.groupby('ID')['description'].agg(' '.join).reset_index()
df['description'] = df['ID'].astype(str)  + ': ' + df['description'] 
print (df)
    ID               description
0  101           101: oranges(1)
1  134   134: apples(2) pears(3)
2  576  576: pears(3) oranges(2)
3  837            837: apples(1)

Another idea with GroupBy.apply:

def f(x):
    d = x['fruits'] + '(' + x['qty'].astype(str) + ')'
    return x.name.astype(str)  + ': ' + ' '.join(d)

df = df.groupby('ID').apply(f).reset_index(name='description')

print (df)
    ID               description
0  101           101: oranges(1)
1  134   134: apples(2) pears(3)
2  576  576: pears(3) oranges(2)
3  837            837: apples(1)

Upvotes: 1

Raavgo
Raavgo

Reputation: 334

Using Pandas there is the apply function. Using this function you can apply lambda functions to rows, columns, or even whole dataframes.

Here is the documentation link to the function. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html

Upvotes: 0

Related Questions