Reputation: 297

Pandas swap value if the value is found in another column

I have two columns with a list of each strings, named Tag 1, Tag 2. In simple terms I need to check if a string in a cell is found anywhere in the adjacent column and if so, swap the values.

Existing Dataframe

Tag 1    Tag 2
Amazon   Twitter
Amazon   Google
eBay     Amazon
Reddit   Facebook

Desired Output

Tag 1    Tag 2
Amazon   Twitter
Amazon   Google
Amazon   eBay
Reddit   Facebook

In the desired outcome, you can see that Amazon has switched places with eBay because it was found in the Tag 1 column.

Minimum Reproducible Example

import pandas as pd

data = {'Tag 1': ['Amazon', 'Amazon', 'eBay', 'Reddit'],
        'Tag 2': ['Twitter', 'Google', 'Amazon', 'Facebook']}

df = pd.DataFrame(data)

I've been researching similar posts but can't quite seem to get it exactly right.

Here's my code so far.

Upvotes: 0

Answers (3)

Corralien

Reputation: 120559

The answer of @sophocles is probably the right.

However if you just want to reorder columns in a lexicographical order for instance, try:

tags = ['Tag 1', 'Tag 2']

df[tags] = df[tags].apply(lambda x: sorted(tuple(x)), axis=1, result_type='expand')
print(df)

# Output
      Tag 1    Tag 2
0    Amazon  Twitter
1    Amazon   Google
2    Amazon     eBay
3  Facebook   Reddit

Upvotes: 1

Lucas

Reputation: 382

This might do it

import pandas as pd
import numpy as np

data = {'Tag 1': ['Amazon', 'Amazon', 'eBay', 'Reddit'],
        'Tag 2': ['Twitter', 'Google', 'Amazon', 'Facebook']}

df = pd.DataFrame(data)

values = np.unique(df['Tag 1'].values)  # All unique values in Tag 1
swapped = []  # So we don't double swap

for val in values:
    rows = df.loc[df['Tag 2'] == val]  # Get rows in Tag 2 that matches
    for idx, row in rows.iterrows():
        if idx in swapped:  # Ignore if already swapped
            continue
        # Swap the values
        temp = df.iloc[idx]['Tag 2']
        df.iloc[idx]['Tag 2'] = df.iloc[idx]['Tag 1']
        df.iloc[idx]['Tag 1'] = temp
        swapped.append(idx)

There are probably more efficient solutions, but I think this is pretty clear. It goes through each unique value in the 'Tag 1' column and checks for matches in the 'Tag 2' column, and swaps them. I was not sure if you wanted to avoid double swaps, but that can be removed if it's not a requirement.

Upvotes: 1

sophocles

Reputation: 13841

You could use pd.where to check when a value in 'Tag 2' existis in 'Tag 1' using isin and swap the values when you assign back:

df[['Tag 1','Tag 2']] = df[['Tag 2','Tag 1']].where(
    df['Tag 2'].isin(df['Tag 1'].tolist()), df[['Tag 1','Tag 2']].values)

prints back:

    Tag 1     Tag 2
0  Amazon   Twitter
1  Amazon    Google
2  Amazon      eBay
3  Reddit  Facebook

Upvotes: 1

Pandas swap value if the value is found in another column

Answers (3)

Related Questions