Reputation: 133
I have a data like this -
ID CITY STATE Location
1 a b a
1 c w a
2 e b e
2 a f e
I want a new column of STATE such that when CITY and Location are same then STATE is "b" for all ID's.
Expected Output:
ID CITY STATE Location STATE2
1 a b a b
1 c d w b
2 e m e m
2 u f e m
Upvotes: 0
Views: 234
Reputation: 34086
Use numpy.where
:
In [1724]: import numpy as np
In [1732]: x = df.loc[df.CITY == df.Location, 'STATE'].unique()[0]
In [1725]: df['STATE2'] = np.where(df.CITY == df.Location, df.STATE, x)
In [1726]: df
Out[1726]:
ID CITY STATE Location STATE2
0 1 a b a b
1 2 c d a b
2 3 e b a b
3 4 a b a b
Upvotes: 1
Reputation: 27
Solution 1:
df.loc[df["City"]==df["Location"],"state2"] = "b"
Solution 2: if you would like to specify a value for rows that do no match
df.apply(lambda x: "b" if x["City"]==x["Location"] else "others",axis=1 )
Solution 3: if the conditions are more complex then its better to use a mask
mask = (df['City']==df["Location"])
df["state3"] = df["state3"].mask(mask,"b")
Upvotes: 0
Reputation: 16683
Please see two solutions depending on what you are trying to do.
Solution #1: Assuming the input:
ID CITY STATE Location
0 1 a b a
1 1 c d w
2 2 e m e
3 2 u f e
You can use mask
to change STATE
values for ID
that have at least one matching row of CITY
and Location
:
m = df.loc[df['CITY'] == df['Location'], 'ID']
df['STATE 2'] = df['STATE'].mask(df['ID'].isin(m),
df.groupby('ID')['STATE'].transform('first'))
df
Out[1]:
ID CITY STATE Location STATE 2
0 1 a b a b
1 1 c d w b
2 2 e m e m
3 2 u f e m
Solution #2:
Assuming slightly different input (the approach changes a good amount depending on desired output):
ID CITY STATE Location
0 1 g f c
1 1 a b a
2 1 c d w
3 2 e m e
4 2 u f e
m = df.loc[df['CITY'] == df['Location']].drop_duplicates('CITY').index
df['State 2'] = np.where(df.index.isin(m), df['STATE'], np.nan)
df['State 2'] = df.groupby('ID')['State 2'].transform('first')
df
Out[2]:
ID CITY STATE Location State 2
0 1 g f c b
1 1 a b a b
2 1 c d w b
3 2 e m e m
4 2 u f e m
This solution assumes you want the output to be the first row within each group that matches. Please see my comment. One of these two solutions are hopefully what you are looking for.
Upvotes: 3