Reputation: 3086
I need to create a new column, which will contain letters {a,b,c,d} based on rules:
{'a' if (df['q1']==0 & df['q2']==0),
'b' if (df['q1']==0 & df['q2']==1),
'c' if (df['q1']==1 & df['q2']==0),
'd' if (df['q1']==1 & df['q2']==1)}
so, the new third column should contain a letter which corresponds to a particular combination of {0,1} in two columns.
q1 q2
0 0 1
1 0 1
2 0 1
3 0 1
4 0 1
5 0 1
6 0 1
7 0 1
8 0 1
9 0 1
10 1 1
11 1 1
12 0 1
13 0 1
14 1 0
15 0 0
16 0 0
17 0 0
18 0 0
19 0 0
20 0 0
21 0 0
I thought about converting numbers in each row from binary to decimal format and then apply dictionary rules.
Upvotes: 2
Views: 854
Reputation: 403198
Another method with df.map
and df.transform
:
In [90]: mapping = {(0, 0) :'a', (0, 1) : 'b', (1, 0): 'c', (1, 1): 'd'}
In [91]: df['val'] = df.transform(lambda x: (x['q1'], x['q2']), axis=1).map(mapping); df
Out[91]:
q1 q2 val
0 0 1 b
1 0 1 b
2 0 1 b
3 0 1 b
4 0 1 b
5 0 1 b
6 0 1 b
7 0 1 b
8 0 1 b
9 0 1 b
10 1 1 d
11 1 1 d
12 0 1 b
13 0 1 b
14 1 0 c
15 0 0 a
16 0 0 a
17 0 0 a
18 0 0 a
19 0 0 a
20 0 0 a
21 0 0 a
You can also use zip
to generate columns, apply pd.Series
and then do the mapping:
In [119]: df['val'] = pd.Series(list(zip(df.q1, df.q2))).map(mapping); df
Out[119]:
q1 q2 val
0 0 1 b
1 0 1 b
2 0 1 b
3 0 1 b
4 0 1 b
5 0 1 b
6 0 1 b
7 0 1 b
8 0 1 b
9 0 1 b
10 1 1 d
11 1 1 d
12 0 1 b
13 0 1 b
14 1 0 c
15 0 0 a
16 0 0 a
17 0 0 a
18 0 0 a
19 0 0 a
20 0 0 a
21 0 0 a
jezrael's solution:
In [552]: %%timeit
...: idx = pd.MultiIndex.from_product([[0,1],[0,1]], names=('q1','q2'))
...: s = pd.Series(['a','b','c','d'], index=idx, name='val')
...: df.join(s, on=['q1','q2'])
...:
100 loops, best of 3: 2.84 ms per loop
Proposed in this post:
In [553]: %%timeit
...: mapping = {(0, 0) :'a', (0, 1) : 'b', (1, 0): 'c', (1, 1): 'd'}
...: df.transform(lambda x: (x['q1'], x['q2']), axis=1).map(mapping)
...:
1000 loops, best of 3: 1.7 ms per loop
Upvotes: 1
Reputation: 863651
You can use join
by Series
with MultiIndex
:
idx = pd.MultiIndex.from_product([[0,1],[0,1]], names=('q1','q2'))
s = pd.Series(['a','b','c','d'], index=idx, name='val')
print (s)
q1 q2
0 0 a
1 b
1 0 c
1 d
Name: val, dtype: object
df = df.join(s, on=['q1','q2'])
print (df)
q1 q2 val
0 0 1 b
1 0 1 b
2 0 1 b
3 0 1 b
4 0 1 b
5 0 1 b
6 0 1 b
7 0 1 b
8 0 1 b
9 0 1 b
10 1 1 d
11 1 1 d
12 0 1 b
13 0 1 b
14 1 0 c
15 0 0 a
16 0 0 a
17 0 0 a
18 0 0 a
19 0 0 a
20 0 0 a
21 0 0 a
Upvotes: 3