Reputation: 4429
All I'm trying to do is add columns data1
and data2
if in the same row letters
is a
and subtract if it is c
. multipy if it is b
. Here is my code.
import pandas as pd
a=[['Date', 'letters', 'data1', 'data2'], ['1/2/2014', 'a', 6, 1], ['1/2/2014', 'a', 3, 1], ['1/3/2014', 'c', 1, 3],['1/3/2014', 'b', 3, 5]]
df = pd.DataFrame.from_records(a[1:],columns=a[0])
df['result']=df['data1']
for i in range(0,len(df)):
if df['letters'][i]=='a':
df['result'][i]=df['data1'][i]+df['data2'][i]
if df['letters'][i]=='b':
df['result'][i]=df['data1'][i]*df['data2'][i]
if df['letters'][i]=='c':
df['result'][i]=df['data1'][i]-df['data2'][i]
>>> df
Date letters data1 data2 result
0 1/2/2014 a 6 1 7
1 1/2/2014 a 3 1 4
2 1/3/2014 c 1 3 -2
3 1/3/2014 b 3 5 15
My question: is there a way to do it in one line without looping? something to the spirit of:
df['result']=df['result'].map(lambda x:df['data1'][i]+df['data2'][i] if x =='a' df['data1'][i]-df['data2'][i] elif x =='c' else x)`
Upvotes: 0
Views: 1892
Reputation: 53668
You can use df.apply
in combination with a lambda function. You have to use the keyword argument axis=1
to ensure you work on rows as opposed to the columns.
import pandas as pd
a=[['Date', 'letters', 'data1', 'data2'], ['1/2/2014', 'a', 6, 1], ['1/2/2014', 'a', 3, 1], ['1/3/2014', 'c', 1, 3]]
df = pd.DataFrame.from_records(a[1:],columns=a[0])
from operator import add, sub, mul
d = dict(a=add, b=mul, c=sub)
df['result'] = df.apply(lambda r: d[r['letters']](r['data1'], r['data2']), axis=1)
This will use the dictionary d
to get the function you wish to use (add
, sub
, or mul
).
Original solution below
df['result'] = df.apply(lambda r: r['data1'] + r['data2'] if r['letters'] == 'a'
else r['data1'] - r['data2'] if r['letters'] == 'c'
else r['data1'] * r['data2'], axis=1)
print(df)
Date letters data1 data2 result
0 1/2/2014 a 6 1 7
1 1/2/2014 b 3 1 3
2 1/3/2014 c 1 3 -2
The lambda function is a bit complex now so I'll go into it in a bit more detail...
The lambda function uses a so-called ternary operator to make boolean conditions inside one line, a typical ternary expession is of the form
a if b else c
Unfortunately you can't have an elif
with a ternary expression, but what you can do is place another one inside the else
statement, then it becomes
a if b else c if d else e
Upvotes: 1
Reputation: 77941
You can use the .where
method:
where(cond, other=nan, inplace=False, axis=None, level=None, try_cast=False, raise_on_error=True) method of pandas.core.series.Series instance
Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
as in:
>>> df['data1'] + df['data2'].where(df['letters'] == 'a', - df['data2'])
0 7
1 4
2 -2
dtype: int64
alternatively, numpy.where
:
>>> df['data1'] + np.where(df['letters'] == 'a', 1, -1) * df['data2']
0 7
1 4
2 -2
dtype: int64
Upvotes: 1