John Mark Johnson
John Mark Johnson

Reputation: 21

How to combine two columns into one column with no repeating values in python using pandas?

How do I perform the following dataframe operation going from Dataframe A to dataframe B in pandas for python? I have tried pivot and groupby but I keep getting errors. Any support is greatly appreciated.

DataFrame A

Col A Col B
100 1
100 2
200 3
200 4

DataFrame B

Col A & B
1
2
100
3
4
200

Upvotes: 1

Views: 41

Answers (2)

PaulS
PaulS

Reputation: 25393

Another possible solution:

out = pd.DataFrame({'Col A&B': np.unique(df)})
out

Output:

   Col A&B
0        1
1        2
2        3
3        4
4      100
5      200

Upvotes: 0

mozway
mozway

Reputation: 262149

One option using groupby:

out = (df
   .groupby('Col A', group_keys=False, sort=False)
   .apply(lambda d: d.iloc[:, ::-1].unstack().drop_duplicates())
   .reset_index(drop=True).to_frame(name='Col A&B')
)

Another with concat:

out = (pd
   .concat([df['Col B'], df['Col A'].drop_duplicates(keep='last')])
   .sort_index().reset_index(drop=True).to_frame(name='Col A&B')
)

output:

   Col A&B
0        1
1        2
2      100
3        3
4        4
5      200

If order does not matter, you can stack:

out = df.stack().drop_duplicates().reset_index(drop=True).to_frame(name='Col A&B')

output:

   Col A&B
0      100
1        1
2        2
3      200
4        3
5        4

Upvotes: 1

Related Questions