Reputation: 503
I need to create a column that computes the difference between another column's elements:
Column A    Computed Column
10          blank    # nothing to compute for first record
9           1        # = 10-9
7           2        # = 9-7
4           3        # = 7-4
I assume this calls for a lambda function, but I am not sure how to reference the elements in 'Column A'.
Any help or direction you can provide would be great. Thanks!
Upvotes: 0
Views: 1539
Reputation: 3848
I would just do:
df = pd.DataFrame(data=[10,9,7,4], columns=['A'])
df['B'] = abs(df['A'].diff())
abs() is needed because diff() computes current - previous, whereas you want previous - current. diff() is already built into the Series class, so no custom function is required. Note that abs() only matches previous - current when the values never increase, because it discards the sign.
To demonstrate:
import pandas as pd
df = pd.DataFrame(data=[10,9,7,4], columns=['A'])
df['B'] = abs(df['A'].diff())
>>> df
# Output
A B
0 10 NaN
1 9 1.0
2 7 2.0
3 4 3.0
df2 = pd.DataFrame(data=[10,4,7,9], columns=['A'])
df2['B'] = abs(df2['A'].diff())
>>> df2
# Output
A B
0 10 NaN
1 4 6.0
2 7 3.0
3 9 2.0
To keep the sign (previous - current) and still outperform @cosmic_inquiry's solution:
import pandas as pd
df = pd.DataFrame(data=[10,9,7,4], columns=['A'])
df2 = pd.DataFrame(data=[10,4,7,9], columns=['A'])
df['B'] = df['A'].diff() * -1
df2['B'] = df2['A'].diff() * -1
>>> df
# Output:
A B
0 10 NaN
1 9 1.0
2 7 2.0
3 4 3.0
>>> df2
# Output:
A B
0 10 NaN
1 4 6.0
2 7 -3.0
3 9 -2.0
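The sign conventions discussed above can be checked directly. This is a minimal sketch, assuming the non-monotonic sample data from df2; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series([10, 4, 7, 9], name="A")

# diff() is current - previous; the question asks for previous - current.
previous_minus_current = s.shift() - s

# Negating diff() gives previous - current, sign included.
assert (-s.diff()).equals(previous_minus_current)

# abs() drops the sign, so it differs as soon as a value increases.
print(s.diff().abs().equals(previous_minus_current))  # False for this data
```

If only the magnitude of the change matters, abs() is sufficient; if the direction matters, negation is the safer choice.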
Upvotes: 0
Reputation: 1607
You can do it by shifting the column.
import pandas as pd
dict1 = {'A': [10,9,7,4]}
df = pd.DataFrame.from_dict(dict1)
df['Computed'] = df['A'].shift() - df['A']
print(df)
giving
A Computed
0 10 NaN
1 9 1.0
2 7 2.0
3 4 3.0
EDIT: The OP extended the requirement to multiple columns.
dict1 = {'A': [10,9,7,4], 'B': [10,9,7,4], 'C': [10,9,7,4]}
df = pd.DataFrame.from_dict(dict1)
columns_to_update = ['A', 'B']
for col in columns_to_update:
    df['Computed'+col] = df[col].shift() - df[col]
print(df)
The columns_to_update list selects which columns are processed. Output:
A B C ComputedA ComputedB
0 10 10 10 NaN NaN
1 9 9 9 1.0 1.0
2 7 7 7 2.0 2.0
3 4 4 4 3.0 3.0
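The loop above could also be written without iterating, since shift() and subtraction work on a whole DataFrame. This is a sketch under the same sample data, not a claim that it is faster; the name `computed` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 9, 7, 4], 'B': [10, 9, 7, 4], 'C': [10, 9, 7, 4]})
columns_to_update = ['A', 'B']

# Subtract each selected column from its shifted copy in one step,
# then prefix the resulting column names with 'Computed'.
computed = (df[columns_to_update].shift() - df[columns_to_update]).add_prefix('Computed')

# Attach the new columns to the original frame by index.
df = df.join(computed)
print(df)
```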
Upvotes: 2
Reputation: 2684
Use diff.
import pandas as pd
df = pd.DataFrame(data=[10,9,7,4], columns=['A'])
df['B'] = df.A.diff(-1).shift(1)
Output:
df
Out[140]:
A B
0 10 NaN
1 9 1.0
2 7 2.0
3 4 3.0
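To see why diff(-1).shift(1) works, it may help to split it into two steps. This is a small sketch; `forward` and `aligned` are names chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(data=[10, 9, 7, 4], columns=['A'])

# diff(-1): row i holds A[i] - A[i+1], so the last row is NaN.
forward = df['A'].diff(-1)

# shift(1): move every result down one row, so row i holds A[i-1] - A[i].
aligned = forward.shift(1)

# Same result as the shift-based approach.
assert aligned.equals(df['A'].shift() - df['A'])
```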
Upvotes: 0