Reputation: 149
I have a dataframe which looks like the below.
Key|Direction
:--|-------:
x | Sell
x | Buy
x | BUY
y | Sell
y | Sell
y | Sell
Z | Buy
Z | Buy
a | Buy
a | Sell
What I want to do is create a third column where for all the same keys if there is a buy and a sell for that key the third column will say yes. If not it just says no. I was playing around with groupby but I finding it hard to re-assign the values back into the data frame. This is what I want the final column to look like
Key|Direction |Cross
:--|------- |------
x | Sell | yes
x | Buy | yes
x | BUY | yes
y | Sell | no
y | Sell | no
y | Sell | no
Z | Buy | no
Z | Buy | no
a | Buy | yes
a | Sell | yes
Upvotes: 1
Views: 121
Reputation: 863166
You can use groupby
+ transform
with comparing set
s and last map
by dict
:
d = {True:'yes', False:'no'}
df['Cross'] = df.groupby('Key')['Direction'] \
.transform(lambda x: set(x) == set(['Buy','Sell'])).map(d)
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
Another solution with create Series
of set
s, map
by Series
for new column, compare with eq
(==
) and last map by dict
:
d = {True:'yes', False:'no'}
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = df['Key'].map(s).eq(set(['Buy','Sell'])).map(d)
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
Similar solution with numpy.where
:
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = np.where(df['Key'].map(s).eq(set(['Buy','Sell'])), 'yes', 'no')
print (df)
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
Upvotes: 1
Reputation: 2424
One way to do it is to use groupby first:
df1 = df.groupby('Key',sort=False)['Direction'].apply(', '.join).reset_index()
print(df1)
Note that you need to set sort to False
df1 looks like:
Key Direction
0 x Sell, Buy, Buy
1 y Sell, Sell, Sell
2 Z Buy, Buy
3 a Buy, Sell
then you just create your new column with the correct number of 'yes' or 'no' depending on how many keys you have.
Note that we use split to know how many directions there are for a single key
cross=[]
for row in df1.index:
elem = df1.ix[row,'Direction']
if Sell in elem and Buy in elem:
for i in range(len(elem.split(','))):
cross.append('yes')
else:
for i in range(len(elem.split(','))):
cross.append('no')
df['Cross'] = pd.Series(cross)
print(df)
output:
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
PS: while creating the dataframe in your examples i added this to go faster, so please take that in consideration:
Sell='Sell'
Buy='Buy'
Upvotes: 0