Reputation: 741
Can someone please help. No matter what I do I always get some kind of length mismatch error when trying to set multiple columns.
These 2 lines work:
df.loc[(condition), ['column1', 'column2']] = 10, 20
df.loc[(condition), ['column1', 'column2']] = df['column3'] + 10
This one raises the error "Must have equal len keys and value when setting with an ndarray":
df.loc[(condition), ['column1', 'column2']] = df['column3'] + 10, df['column3'] - 10
This doesn't make sense to me, because len(column1) = len(column2)
and len(df['column3'] + 10) = len(df['column3'] - 10)
Upvotes: 1
Views: 494
Reputation: 8219
At least two things are going on here.
One is: what is condition? If it selects only some of the rows on the left-hand side, there will be a length mismatch, since the right-hand side is built from the full column.
The other: even if the lengths are the same, on the left you have an array of shape (N,2) (where N is the number of rows), and on the right you have a tuple of two arrays, so (2,N). Neither pandas nor numpy knows how to broadcast those into the same shape.
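To make the two shapes concrete, here is a minimal sketch. The toy DataFrame and the example condition are assumptions for illustration, not taken from the question. It compares the shape of the selection being assigned to with the shape of the tuple on the right:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; only the column names come from the question.
df = pd.DataFrame({"column1": [0, 0, 0, 0],
                   "column2": [0, 0, 0, 0],
                   "column3": [1, 2, 3, 4]})
condition = df["column3"] > 2  # assumed example condition, selects 2 of the 4 rows

# Shape of the block being assigned to: (selected rows, 2 columns).
target_shape = df.loc[condition, ["column1", "column2"]].shape

# Shape of the tuple of two arrays once numpy stacks it: (2 arrays, all rows).
rhs = ((df["column3"] + 10).to_numpy(), (df["column3"] - 10).to_numpy())
rhs_shape = np.asarray(rhs).shape

print(target_shape)  # (2, 2)
print(rhs_shape)     # (2, 4)
```

Both problems show up at once in this example: the right-hand side is transposed relative to the target, and it also covers all four rows rather than the two selected ones.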
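The easiest approach, I think, is to go column by column. But the closest I could get to your syntax that works is
df.loc[(condition), ['column1', 'column2']] = df.loc[(condition), ['column3']].values + np.array([10, -10])
Here numpy's broadcasting rules kick in on the right-hand side: the single selected column has shape (rows,1), and adding a length-2 array broadcasts it to (rows,2), which matches the shape on the left. Selecting column3 with the same condition keeps the row counts equal on both sides.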
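Both options can be sketched side by side. The toy data and the condition below are assumed for illustration, and the +10 / -10 offsets follow the question. Treat this as one possible way to write it rather than the only one:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; only the column names come from the question.
df = pd.DataFrame({"column1": [0, 0, 0, 0],
                   "column2": [0, 0, 0, 0],
                   "column3": [1, 2, 3, 4]})
condition = df["column3"] > 2  # assumed example condition

# Option 1: one column at a time; pandas aligns the Series on the index.
by_col = df.copy()
by_col.loc[condition, "column1"] = by_col.loc[condition, "column3"] + 10
by_col.loc[condition, "column2"] = by_col.loc[condition, "column3"] - 10

# Option 2: one assignment; (rows, 1) + (2,) broadcasts to (rows, 2).
bcast = df.copy()
selected = bcast.loc[condition, ["column3"]].to_numpy()
bcast.loc[condition, ["column1", "column2"]] = selected + np.array([10, -10])

print(by_col)
print(bcast)
```

The column-by-column version is longer but relies on index alignment, so it is harder to get the row selection wrong. The broadcast version works on raw arrays, so the same condition has to be applied on both sides.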
Upvotes: 1