Reputation: 319
I'm having some trouble with pandas. I think I'm not using it properly, and I would appreciate some help doing it right.
I have a mask for the rows of a dataframe; the mask is a simple list of Boolean values. I would like to assign a 2D array to a new or existing column, so that each masked row receives one row of the array.
mask = some_row_mask()
my2darray = some_operation(dataframe.loc[mask, column])
dataframe.loc[mask, new_or_exist_column] = my2darray
# Also tried this
dataframe.loc[mask, new_or_exist_column] = [f for f in my2darray]
Example data:
dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a'], 'Data': [10, 20, 30]})
mask = dataframe['Fun']=='a'
my2darray = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
column = 'Data'
new_or_exist_column = 'NewData'
Expected output:
  Fun  Data          NewData
0   a    10  [0, 1, 2, 3, 4]
1   b    20              NaN
2   a    30  [4, 3, 2, 1, 0]
dataframe[mask] and my2darray both have exactly the same number of rows, but the assignment always ends with:
ValueError: Must have equal len keys and value when setting with an ndarray
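One plausible reading of this error, sketched below, is that the nested list is treated as a 2D block of values (2 rows by 5 columns) rather than as two list-valued cells, so it cannot line up with a single target column. The names `as_array` and `as_cells` are illustrative only and are not part of the original code.

```python
import numpy as np
import pandas as pd

my2darray = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]

# Seen as an array, the nested list is 2D: 2 rows by 5 columns.
as_array = np.array(my2darray)
shape_2d = as_array.shape

# Wrapped in a Series, each inner list becomes one object-valued cell,
# giving a 1D container with one entry per masked row.
as_cells = pd.Series(my2darray)
shape_1d = as_cells.shape
```

The 1D, object-valued form is the shape that matches "one value per selected row in one column".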
Thanks for your help!
EDIT - In context:
To add some detail: this is used to fill in folds step by step. I compute values from a subset of the dataframe and set them on that subset only. Instead of this, as suggested by Parth:
dataframe[new_or_exist_column]=pd.Series(my2darray, index=mask[mask==True].index)
I changed it to this:
dataframe.loc[mask, out] = pd.Series([f for f in features], index=mask[mask==True].index)
Otherwise, all values that were already set are overwritten by NaN. I forgot to mention this in the original question.
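A minimal sketch of that fold-by-fold filling, assuming a four-row variant of the example data and a hypothetical `fold_features` dictionary standing in for the real per-fold computation. Only `out`, `features`, and the `.loc[mask, out] = pd.Series(...)` pattern come from the post; the column is created up front with object dtype (an assumption) so that its cells can hold lists.

```python
import numpy as np
import pandas as pd

# Four-row variant of the example data (assumption, so that two folds exist).
dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a', 'b'], 'Data': [10, 20, 30, 40]})
out = 'NewData'

# Create the target column once, with object dtype, filled with NaN.
dataframe[out] = pd.Series(np.nan, index=dataframe.index, dtype=object)

# Hypothetical per-fold results; in practice these would be computed.
fold_features = {
    'a': [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]],
    'b': [[5, 6, 7, 8, 9], [9, 8, 7, 6, 5]],
}

for key, features in fold_features.items():
    mask = dataframe['Fun'] == key
    # The Series index holds the labels of the masked rows, so only those
    # rows are written; values set by earlier folds are left untouched.
    dataframe.loc[mask, out] = pd.Series(list(features), index=mask[mask].index)
```

Assigning through `.loc[mask, out]` restricts the write to the masked rows, whereas assigning to `dataframe[out]` replaces the whole column and leaves NaN wherever the Series has no label.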
Thanks!
Upvotes: 1
Views: 154
Reputation: 644
Try this:
dataframe[new_or_exist_column] = np.nan
dataframe[new_or_exist_column] = pd.Series(my2darray, index=mask[mask == True].index)
It gives the desired output:
  Fun  Data          NewData
0   a    10  [0, 1, 2, 3, 4]
1   b    20              NaN
2   a    30  [4, 3, 2, 1, 0]
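For reference, here is a self-contained sketch of this approach with the imports and the example data from the question. It uses `dataframe.index[mask]` to get the labels of the masked rows, which should be equivalent to `mask[mask==True].index`; the name `values` is illustrative only.

```python
import numpy as np
import pandas as pd

dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a'], 'Data': [10, 20, 30]})
mask = dataframe['Fun'] == 'a'
my2darray = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
new_or_exist_column = 'NewData'

# Each inner list becomes one object-valued cell, labelled with the
# index of the corresponding masked row (here labels 0 and 2).
values = pd.Series(my2darray, index=dataframe.index[mask])

# Column assignment aligns on the index; rows that have no label in
# `values` (here row 1) end up as NaN.
dataframe[new_or_exist_column] = values
```

Note that this assigns the entire column, so any values already present in rows outside the mask are replaced by NaN.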
Upvotes: 1