quilliam
quilliam

Reputation: 182

Split one column into two columns in pandas by the values

I have a Given_Col in which I have 5 different values. I need to create new columns as follows:

Both A and B goes ---> col1

Neither A nor B goes ---> col1

A goes B doesn't go ---> col2

A doesn't go B goes ---> col2

No idea ---> both if there the value is NaN

      Given_Col                    Expexted_Col1              Expexted_Col2    

Both A and B goes                 Both A and B goes               No idea
Neither A nor B goes             Neither A nor B goes             No idea
A goes B doesn't go                    No idea               A goes B doesn't go  
A doesn't go B goes                    No idea               A doesn't go B goes 
A goes B doesn't go                    No idea               A goes B doesn't go
Neither A nor B goes             Neither A nor B goes             No idea 
No idea                                No idea                    No idea 
Both A and B goes                  Both A and B goes              No idea 

I couldn't think of any solution. What would be the practical way?

Note : I considered duplicating existing column and mapping the values maybe?

Upvotes: 0

Views: 179

Answers (3)

Chris
Chris

Reputation: 29742

One way using pandas.DataFrame.assign with fillna:

mapper = {'col1': ['Both A and B goes', 'Neither A nor B goes'],
 'col2': ["A goes B doesn't go", "A doesn't go B goes"]}

s = df["Given_Col"]
new_df = df.assign(**{k: s[s.isin(v)] for k, v in mapper.items()}).fillna("No idea")
print(new_df)

Output:

              Given_Col                  col1                 col2
0     Both A and B goes     Both A and B goes              No idea
1  Neither A nor B goes  Neither A nor B goes              No idea
2   A goes B doesn't go               No idea  A goes B doesn't go
3   A doesn't go B goes               No idea  A doesn't go B goes
4   A goes B doesn't go               No idea  A goes B doesn't go
5  Neither A nor B goes  Neither A nor B goes              No idea
6               No idea               No idea              No idea
7     Both A and B goes     Both A and B goes              No idea

Upvotes: 1

gtomer
gtomer

Reputation: 6564

You can do that with few np.where functions:

df['col1'] = np.where(df['Given_Col'] == 'Both A and B goes', 'Both A and B goes', df['col1'])
df['col2'] = np.where(df['Given_Col'] == 'Both A and B goes', 'No idea', df['col1'])
df['col1'] = np.where(df['Given_Col'] == 'Neither A nor B goes', 'Neither A nor B goes', df['col2'])
df['col2'] = np.where(df['Given_Col'] == 'Neither A nor B goes', 'No idea', df['col2'])

You can continue from here....

Upvotes: 0

Rory O'Connell
Rory O'Connell

Reputation: 76

I think two conditional column assignments should work.

Each one picks up valid entries for the column based on selection criteria. If you had more than five possibilities, this could be unweildy, but it should work well enough for this case.

df['Expexted_Col1'] = df.apply(lambda x: x['Given_Col'] if (x['Given_Col'] == 'Both A and B goes' or x['Given_Col'] == 'Neither A nor B goes') else 'No idea', axis = 1)
df['Expexted_Col2'] = df.apply(lambda x: x['Given_Col'] if (x['Given_Col'] == "A goes B doesn't go" or x['Given_Col'] == "A doesn't go B goes") else 'No idea', axis = 1)

Upvotes: 1

Related Questions