Romain Cendre

Reputation: 319

Pandas - Row mask and 2D ndarray assignment

I'm having trouble with pandas. I don't think I'm using it properly, and I'd like some help doing this the right way.

I have a row mask for a dataframe; the mask is a simple list of Boolean values. I would like to assign a 2D array to a new or existing column, with one row of the array going into each selected row of the dataframe.

mask = some_row_mask()
my2darray = some_operation(dataframe.loc[mask, column])
dataframe.loc[mask, new_or_exist_column] = my2darray
# Also tried this
dataframe.loc[mask, new_or_exist_column] = [f for f in my2darray]

Example data:

import pandas as pd

dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a'], 'Data': [10, 20, 30]})
mask = dataframe['Fun'] == 'a'
my2darray = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
column = 'Data'
new_or_exist_column = 'NewData'

Expected output:

  Fun  Data          NewData
0   a    10  [0, 1, 2, 3, 4]
1   b    20              NaN
2   a    30  [4, 3, 2, 1, 0]

dataframe[mask] and my2darray have exactly the same number of rows, but the assignment always ends with:

ValueError: Must have equal len keys and value when setting with an ndarray

Thanks for your help!

EDIT - In context:

To add some detail: this is used to fill the column fold by fold, step by step. At each step I compute values from a subset of the dataframe and set them on those rows only. So instead of this, as suggested by Parth:

dataframe[new_or_exist_column]=pd.Series(my2darray, index=mask[mask==True].index)

I changed to this:

dataframe.loc[mask, out] = pd.Series([f for f in features], index=mask[mask==True].index)

Otherwise, all values that were already set are overwritten with NaN. I had left this detail out of the original question.
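For anyone reading later, here is a minimal sketch of the fold-by-fold fill. The four-row frame and the `folds` list are made up for illustration; `out` and `features` play the same role as in the line above. It assumes each fold's mask selects as many rows as `features` has entries.

```python
import pandas as pd

# Hypothetical data: four rows so that each fold covers two of them.
dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a', 'b'], 'Data': [10, 20, 30, 40]})
out = 'NewData'

# Hypothetical folds: each (mask, features) pair targets a different subset of rows.
folds = [
    (dataframe['Fun'] == 'a', [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]),
    (dataframe['Fun'] == 'b', [[5, 6, 7, 8, 9], [9, 8, 7, 6, 5]]),
]

for mask, features in folds:
    # One inner list per selected row, labelled with the index of those rows.
    values = pd.Series(list(features), index=mask[mask].index)
    # .loc only writes to the masked rows, so values from earlier folds are kept.
    dataframe.loc[mask, out] = values

print(dataframe)
```

Assigning the Series to `dataframe[out]` directly would instead replace the whole column at every step, which is where the NaN values came from.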

Thanks!

Upvotes: 1

Views: 154

Answers (1)

Parth

Reputation: 644

Try this:

dataframe[new_or_exist_column]=np.nan
dataframe[new_or_exist_column]=pd.Series(my2darray, index=mask[mask==True].index)

It gives the desired output:

  Fun  Data          NewData
0   a    10  [0, 1, 2, 3, 4]
1   b    20              NaN
2   a    30  [4, 3, 2, 1, 0]
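Put together with the example data from the question, a self-contained sketch looks like this. It assumes `my2darray` is a list of lists; a real 2D NumPy array would, as far as I know, need `list(my2darray)` first so that the Series holds one row per element.

```python
import pandas as pd

dataframe = pd.DataFrame({'Fun': ['a', 'b', 'a'], 'Data': [10, 20, 30]})
mask = dataframe['Fun'] == 'a'
my2darray = [[0, 1, 2, 3, 4], [4, 3, 2, 1, 0]]
new_or_exist_column = 'NewData'

# Wrap the rows in a Series so that each inner list becomes a single cell,
# labelled with the index of the rows where the mask is True.
rows = pd.Series(list(my2darray), index=mask[mask].index)

# Column assignment aligns on the index; rows not in the Series become NaN.
dataframe[new_or_exist_column] = rows

print(dataframe)
```

Note that this replaces the whole column, so it suits the case where the column is new or can be rebuilt in one go.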

Upvotes: 1
