ruedi

Reputation: 5555

Multiple assignments in filtered datasets

While trying to do multiple assignments on a filtered dataset, I encountered strange behavior that I cannot explain. My test data:

import pandas as pd
wert = 2.5
df = pd.DataFrame([['Test', 12, None, None], ['Test2', 15, None, None]], columns=['A','B','C','D'])

My first question came up when executing this line of code:

df.loc[(df['A'] == 'Test'), ['D']] = df['B'] * wert

The filter is only on the left side, so how does df['B'] know where to assign the values? I thought df['B'] would have to be filtered as well, but this is obviously not necessary. So I went a step further and tried a multiple assignment with a condition:

df.loc[(df['A'] == 'Test'), ['C', 'D']] = [1, df['B'] * wert]

Now I get the error ValueError: cannot set using a list-like indexer with a different length than the value. My explanation would be that the array df['B'] is longer than df.loc[df['A'] == 'Test'], but since this worked fine in the first example, that cannot be the explanation. Could anyone tell me why this does not work and raises this error?

Upvotes: 3

Views: 121

Answers (2)

jezrael

Reputation: 863266

Your solution works if you filter column B by the same mask. The filtered rows need the same index values so that the data can be aligned (matching length alone is not enough):

mask = (df['A'] == 'Test')
df.loc[mask, ['C', 'D']] = [1, df.loc[mask, 'B'] * wert]
print(df)
       A   B     C     D
0   Test  12     1    30
1  Test2  15  None  None

And if you filter by another value:

mask = (df['A'] == 'Test2')
df.loc[mask, ['C', 'D']] = [1, df.loc[mask, 'B'] * wert]
print(df)
       A   B  C     D
0   Test  12  1   NaN
1  Test2  15  1  37.5

As for what this error means:

ValueError: cannot set using a list-like indexer with a different length than the value

Honestly, I do not understand the error, so I will ask on the pandas GitHub.

I will add the answer from the pandas devs later.
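
As a possible workaround, the two columns can be assigned in separate statements, which sidesteps the list on the right-hand side entirely. The sketch below is only an illustration: it rebuilds the question's frame with NaN placeholders instead of None (an assumption, made to keep the columns numeric) and relies on pandas aligning the unfiltered right-hand Series on the index labels selected by the mask.

```python
import numpy as np
import pandas as pd

wert = 2.5
# Same shape as the question's data, but with NaN placeholders (assumption)
df = pd.DataFrame({'A': ['Test', 'Test2'],
                   'B': [12, 15],
                   'C': [np.nan, np.nan],
                   'D': [np.nan, np.nan]})
mask = df['A'] == 'Test'

# A scalar is broadcast to every selected row
df.loc[mask, 'C'] = 1
# The Series is not filtered; only the values whose index labels
# match the selected rows are written
df.loc[mask, 'D'] = df['B'] * wert
print(df)
```

Rows where the mask is False keep their NaN placeholders, because no label from the right-hand side is written to them.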

Upvotes: 0

user3471881

Reputation: 2724

Why does this happen?

Because pandas will raise a ValueError when:

the indexer is an ndarray or list and the lengths don't match.

A special case is allowed when the indexer is a boolean array and the number of true values equals the length of value. In this case, no exception is raised.
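
The length rule quoted above can be reproduced on a small Series. This is a minimal sketch with made-up data (the names s and mask are illustrative, not from the question): a list whose length equals the number of True values in the boolean mask is accepted, while a list of any other length raises a ValueError.

```python
import pandas as pd

s = pd.Series([0, 0, 0, 0])
mask = pd.Series([True, False, True, False])  # two True values

# Two values for two True positions: accepted
s.loc[mask] = [10, 20]
after_ok = s.tolist()

# Three values for two True positions: rejected
try:
    s.loc[mask] = [1, 2, 3]
    raised = False
except ValueError as exc:
    raised = True
    print(type(exc).__name__, exc)
```

The exact wording of the message may differ between pandas versions, so the sketch only relies on the exception type.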

source

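You can use df.assign() if you don't want to filter df['B'] to match. Note that assign returns a new DataFrame containing only the filtered rows and does not modify df in place: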

df.loc[(df['A'] == 'Test')].assign(C=1, D=df['B'] * wert)
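
To make the behavior of the assign call concrete, here is a small sketch under the same assumptions as the question's data, except that NaN placeholders replace None (an assumption, to keep the columns numeric). It shows that the result holds only the filtered row and that the original df is left untouched; the name result is illustrative.

```python
import numpy as np
import pandas as pd

wert = 2.5
df = pd.DataFrame({'A': ['Test', 'Test2'],
                   'B': [12, 15],
                   'C': [np.nan, np.nan],
                   'D': [np.nan, np.nan]})
mask = df['A'] == 'Test'

# assign works on the filtered copy; the unfiltered Series passed as D
# is aligned to the index of that copy
result = df.loc[mask].assign(C=1, D=df['B'] * wert)

print(result)  # only the row where A == 'Test'
print(df)      # unchanged: C and D are still NaN
```

If the values need to end up in df itself, the result has to be written back explicitly, for example with separate df.loc assignments per column.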

Upvotes: 2
