Raj Rajeshwari Prasad

Reputation: 334

Finding pairs in a dataframe using Python

I am working with data stored in a pandas DataFrame. My DataFrame is:

left_id  right_id
   a        b
   a        c
   c        e

I want to write code that produces the following output:

key   value
 a     b,c
 c     e

In the input DataFrame, a occurs twice: once with b and once with c. Hence the value for a is both b and c. For c, the value is e.

How can I do this?

Upvotes: 0

Views: 67

Answers (2)

Shadab Hussain

Reputation: 804

You can group by 'left_id', then call the agg() method of the grouped pandas DataFrame on 'right_id'.

agg() applies an aggregation function to each group; given a dict, it can compute a different aggregation per column in a single call. Here it joins the 'right_id' strings of each group with a comma:

df.groupby('left_id', as_index=False).agg({'right_id': ','.join})

Or, if you would rather collect the values of each group into a list instead of a joined string, you can use:

df.groupby('left_id')['right_id'].apply(list)
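
As a minimal sketch of how the two approaches differ, the snippet below rebuilds the question's DataFrame by hand (the variable names joined and as_lists are only illustrative) and runs both. It assumes every value in 'right_id' is a string; str.join raises a TypeError on other types, so a column of numbers would need .astype(str) first.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "left_id": ["a", "a", "c"],
    "right_id": ["b", "c", "e"],
})

# Approach 1: one row per left_id, right_id values joined into a single string.
# as_index=False keeps left_id as a regular column instead of the index.
joined = df.groupby("left_id", as_index=False).agg({"right_id": ",".join})

# Approach 2: a Series indexed by left_id whose values are Python lists.
as_lists = df.groupby("left_id")["right_id"].apply(list)

print(joined)
print(as_lists)
```

The first result is a DataFrame with string values, which is convenient for display or export; the second keeps the individual values separate, which is usually easier to work with in later processing.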

Upvotes: 0

yatu

Reputation: 88236

Looks like you want groupby.agg with join:

df.groupby('left_id').right_id.agg(', '.join).reset_index()
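
To get the exact key/value column names from the question, the result can be renamed afterwards. This is a sketch under the assumption that the sample data matches the question; the rename step and the ',' separator (no space, to match the requested b,c) are additions for illustration, not part of the one-liner above.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "left_id": ["a", "a", "c"],
    "right_id": ["b", "c", "e"],
})

result = (
    df.groupby("left_id")["right_id"]
      .agg(",".join)      # join each group's strings with a comma
      .reset_index()      # turn the left_id index back into a column
      .rename(columns={"left_id": "key", "right_id": "value"})
)

print(result)
```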

Upvotes: 2
