Reputation:
I have a TYPE column and a VOLUME column What I'm looking to do if first check if TYPE column == 'var1'
If so I would like to make a calculation in the VOLUME column.
So far I have something like this:
data.loc[data['TYPE'] == 'var1', ['VOLUME']] * 2
data.loc[data['TYPE'] == 'var1', ['VOLUME']] * 4
This seems to set the entire column that meets the condition to the last variable. So I end up with just two values.
Out:
4
4
4
4
8
8
8
Another option:
data['VOLUME'] = data.loc[data['TYPE'] == 'var1', ['VOLUME']] * 2
This works for the first condition but show NaN for the second condition Then when I run:
data['VOLUME'] = data.loc[data['TYPE'] == 'var2', ['VOLUME']] * 4
The whole column show as NaN.
Upvotes: 1
Views: 1222
Reputation: 402553
Consider a simple example which demonstrates what is happening.
df = pd.DataFrame({'A': [1, 2, 3]})
df
A
0 1
1 2
2 3
Now, only values below 2 in column "A" are to be modified. So, try something like
df.loc[df.A < 2, 'A'] * 2
0 2
Name: A, dtype: int64
This series only has 1 row at index 0. If you try assigning this back, the implicit assumption is that the other index values are to be reset to NaN.
df.assign(A=df.loc[df.A < 2, 'A'] * 2)
A
0 2.0
1 NaN
2 NaN
What we want to do is to modify only the rows we're interested in. This is best done with the in-place modification arithmetic operator *=
:
df.loc[df.A < 2, 'A'] *= 2
In your case, it is
data.loc[data['TYPE'] == 'var1', 'VOLUME'] *= 2
Upvotes: 1
Reputation: 375535
You can use *=
with loc:
In [11]: df = pd.DataFrame([[1], [2]], columns=["A"])
In [12]: df
Out[12]:
A
0 1
1 2
In [13]: df.loc[df.A == 1, "A"] *= 3
In [14]: df
Out[14]:
A
0 3
1 2
Upvotes: 1
Reputation: 1208
You are really close. The problem is in how you are storing the result. This should work:
data.loc[data['TYPE'] == 'var1', ['VOLUME']] = data['VOLUME'] * 2
Upvotes: 1