lmccann
lmccann

Reputation: 149

How to set a value in a third column depending on values in two other columns

I have a dataframe which looks like the below.

Key|Direction
:--|-------: 
x  | Sell
x  | Buy
x  | BUY
y  | Sell
y  | Sell
y  | Sell
Z  | Buy
Z  | Buy
a  | Buy
a  | Sell

What I want to do is create a third column where for all the same keys if there is a buy and a sell for that key the third column will say yes. If not it just says no. I was playing around with groupby but I finding it hard to re-assign the values back into the data frame. This is what I want the final column to look like

Key|Direction |Cross
:--|-------   |------
x  | Sell     | yes
x  | Buy      | yes
x  | BUY      | yes
y  | Sell     | no
y  | Sell     | no
y  | Sell     | no
Z  | Buy      | no 
Z  | Buy      | no
a  | Buy      | yes
a  | Sell     | yes

Upvotes: 1

Views: 121

Answers (2)

jezrael
jezrael

Reputation: 863166

You can use groupby + transform with comparing sets and last map by dict:

d = {True:'yes', False:'no'}
df['Cross'] = df.groupby('Key')['Direction'] \
                .transform(lambda x: set(x) == set(['Buy','Sell'])).map(d)
print (df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

Another solution with create Series of sets, map by Series for new column, compare with eq (==) and last map by dict:

d = {True:'yes', False:'no'}
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = df['Key'].map(s).eq(set(['Buy','Sell'])).map(d)
print (df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

Similar solution with numpy.where:

s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = np.where(df['Key'].map(s).eq(set(['Buy','Sell'])), 'yes', 'no')
print (df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

Upvotes: 1

Rayhane Mama
Rayhane Mama

Reputation: 2424

One way to do it is to use groupby first:

df1 = df.groupby('Key',sort=False)['Direction'].apply(', '.join).reset_index()
print(df1)

Note that you need to set sort to False

df1 looks like:

  Key         Direction
0   x    Sell, Buy, Buy
1   y  Sell, Sell, Sell
2   Z          Buy, Buy
3   a         Buy, Sell

then you just create your new column with the correct number of 'yes' or 'no' depending on how many keys you have.

Note that we use split to know how many directions there are for a single key

cross=[]
for row in df1.index:
    elem = df1.ix[row,'Direction']

    if Sell in elem and Buy in elem:
        for i in range(len(elem.split(','))):
            cross.append('yes')
    else:
        for i in range(len(elem.split(','))):
            cross.append('no')

df['Cross'] = pd.Series(cross)
print(df)

output:

  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes

PS: while creating the dataframe in your examples i added this to go faster, so please take that in consideration:

Sell='Sell'
Buy='Buy'

Upvotes: 0

Related Questions