Reputation: 11
I'm conducting an experiment(using python 2.7, panda 0.23.4) where I have three levels of a stimulus {a,b,c} and present all different combinations to participants, and they have to choose which one was rougher? (example: Stimulus 1 = a , Stimulus 2=b, participant choose 1 indicating stimulus 1 was rougher)
After the experiment, I have a data frame with three columns like this:
import pandas as pd
d = {'Stim1': ['a', 'b', 'a', 'c', 'b', 'c'],
'Stim2': ['b', 'a', 'c', 'a', 'c', 'b'],
'Answer': [1, 2, 2, 1, 2, 1]}
df = pd.DataFrame(d)
Stim1 Stim2 Answer
0 a b 1
1 b a 2
2 a c 2
3 c a 1
4 b c 2
5 c b 1
For my analysis, the order of which stimulus came first doesn't matter. Stim1= a, Stim2= b is the same as Stim1= b, Stim2= a. I'm trying to figure out how can I swap Stim1 and Stim2 and flip their Answer to be like this:
Stim1 Stim2 Answer
0 a b 1
1 a b 1
2 a c 2
3 a c 2
4 b c 2
5 b c 2
I read that np.where
can be used, but it would do one thing at a time, where I want to do two (swap and flip).
Is there some way to use another function to do swap and flip at the same time?
Upvotes: 1
Views: 192
Reputation: 11395
You can start by building a boolean series that indicates which rows should be swapped or not:
>>> swap = df['Stim1'] > df['Stim2']
>>> swap
0 False
1 True
2 False
3 True
4 False
5 True
dtype: bool
Then build the fully swapped dataframe as follows:
>>> swapped_df = pd.concat([
... df['Stim1'].rename('Stim2'),
... df['Stim2'].rename('Stim1'),
... 3 - df['Answer'],
... ], axis='columns')
>>> swapped_df
Stim2 Stim1 Answer
0 a b 2
1 b a 1
2 a c 1
3 c a 2
4 b c 1
5 c b 2
Finally, use .mask()
to select initial rows or swapped rows:
>>> df.mask(swap, swapped_df)
Stim1 Stim2 Answer
0 a b 1
1 a b 1
2 a c 2
3 a c 2
4 b c 2
5 b c 2
NB .mask
is roughly the same as .where
, but it replaces rows where the series is True
instead of keeping the rows that are True
. This is exactly the same:
>>> swapped_df.where(swap, df)
Stim2 Stim1 Answer
0 b a 1
1 b a 1
2 c a 2
3 c a 2
4 c b 2
5 c b 2
Upvotes: 2
Reputation: 260580
Can you try if this works for you?
import pandas as pd
import numpy as np
df = pd.DataFrame(d)
# keep a copy of the original Stim1 column
s = df['Stim1'].copy()
# sort the values
df[['Stim1', 'Stim2']] = np.sort(df[['Stim1', 'Stim2']].values)
# exchange the Answer if the order has changed
df['Answer'] = df['Answer'].where(df['Stim1'] == s, df['Answer'].replace({1:2,2:1}))
output:
Stim1 Stim2 Answer
0 a b 1
1 a b 1
2 a c 2
3 a c 2
4 b c 2
5 b c 2
Upvotes: 2