Reputation: 9869
This is similar to this question, but differs in that I am only concerned with a subset of the DataFrame.
Suppose I have the following dataframe:
import pandas as pd
import numpy as np
np.random.seed(42)
df = pd.DataFrame(np.random.randn(5), columns=['A'])
and I wish to add a column 'B' that holds the value ["neg"] wherever A is negative. However, when I do the following, as suggested in the linked question, the values lose the list nature I was expecting:
idx = df.A < 0
df.loc[idx, "B"] = [["neg"]]*idx.sum()
>>>
Out[17]:
          A    B
0  0.496714  NaN
1 -0.138264  neg
2  0.647689  NaN
3  1.523030  NaN
4 -0.234153  neg
What am I doing wrong here? The only workaround I have found so far is to wrap the values afterwards: df.loc[idx, 'B'] = df.loc[idx, 'B'].map(lambda x: [x]).
Upvotes: 1
Views: 577
Reputation: 169434
You need to assign a pd.Series rather than a plain list. pandas aligns a Series to the selected rows by index, so build it with the length of the entire DataFrame; that way every negative row has a matching ["neg"] entry:
df.loc[idx, "B"] = pd.Series([["neg"]]*len(df))
Result:
          A      B
0  0.496714    nan
1 -0.138264  [neg]
2  0.647689    nan
3  1.523030    nan
4 -0.234153  [neg]
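Because the Series is matched to rows by index label, a variation on the same idea (a sketch, not part of the original answer) is to build a Series that carries only the labels of the negative rows and assign it as a whole column. The name neg_lists is just illustrative. The list comprehension also gives every row its own list, whereas [["neg"]]*n repeats one shared list object.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(np.random.randn(5), columns=["A"])
idx = df.A < 0

# One fresh list per negative row, labelled with that row's index.
# (neg_lists is a hypothetical name used only for this sketch.)
neg_lists = pd.Series(
    [["neg"] for _ in range(int(idx.sum()))],
    index=df.index[idx],
)

# Column assignment aligns on the index; rows without a matching label get NaN.
df["B"] = neg_lists

print(df)
```

Rows whose label is missing from neg_lists end up as NaN, so no full-length Series is needed here.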
Upvotes: 1