kekert

Reputation: 966

Update DataFrame with Series

Given a DataFrame, I want to update a subset of columns with a Series whose length equals the number of columns being updated:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randint(0, 5, (6, 2)), columns=['col1', 'col2'])
>>> df
   col1  col2
0     1     0
1     2     4
2     4     4
3     4     0
4     0     0
5     3     1

>>> df.loc[:,['col1','col2']] = pd.Series([0,1])
...
ValueError: shape mismatch: value array of shape (6,) could not be broadcast to indexing result of shape (2,6)

It fails. However, I am able to do the same thing using a list:

>>> df.loc[:,['col1','col2']] = list(pd.Series([0,1]))
>>> df
   col1  col2
0     0     1
1     0     1
2     0     1
3     0     1
4     0     1
5     0     1

Could you please help me understand why updating with a Series fails? Do I have to perform some particular reshaping?

Upvotes: 7

Views: 3774

Answers (1)

piRSquared

Reputation: 294278

When you assign a pandas object, pandas is stricter: it treats the object's index labels as meaningful and tries to align them with the target instead of simply broadcasting the values. Only when you convert the Series to a list (or, equivalently, use pd.Series([0, 1]).values) does pandas treat the right-hand side as plain values and assign them the way you'd expect.
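As a minimal sketch of that difference (the sample values are copied from the question, and `to_numpy()` is used here as the current spelling of `.values`), stripping the index leaves a plain length-2 array that is broadcast across the rows:

```python
import pandas as pd

# Sample frame with the same values as in the question
df = pd.DataFrame({'col1': [1, 2, 4, 4, 0, 3], 'col2': [0, 4, 4, 0, 0, 1]})
s = pd.Series([0, 1])

# to_numpy() drops the index, so pandas sees a plain array of length 2
# and broadcasts it across all six rows, one value per selected column
df.loc[:, ['col1', 'col2']] = s.to_numpy()
print(df)
```

After this, every row should hold 0 in col1 and 1 in col2, the same result as the list version in the question.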

That stricter assignment also requires the labels to line up, so even with the right shape it still wouldn't have worked without matching index and column labels.
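To illustrate what "lining up" means, here is a small sketch that uses `reindex` as a stand-in for label alignment. It shows the concept only and is not a claim about what `loc` does internally; the names `target_index` and `aligned` are made up for the example.

```python
import pandas as pd

s = pd.Series([0, 1])            # default index: labels 0 and 1
target_index = pd.RangeIndex(6)  # row labels of a six-row frame

# Aligning by label keeps values only where labels match;
# labels 2..5 have no counterpart in s and become NaN
aligned = s.reindex(target_index)
print(aligned)
```

A two-element Series carries row labels 0 and 1, so by label it describes two rows, not two columns.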

# Right shape, but the default column labels (0 and 1) do not match 'col1' and 'col2'
df.loc[:, ['col1', 'col2']] = pd.DataFrame([[0, 1] for _ in range(6)])
df


# Same shape, now with column labels that match the target columns
df.loc[:, ['col1', 'col2']] = pd.DataFrame([[0, 1] for _ in range(6)], columns=['col1', 'col2'])
df

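If you would rather keep the right-hand side a pandas object, one possible sketch is to build the replacement frame from the Series and reuse the target's own labels. The non-default row labels and the names `cols` and `replacement` are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {'col1': [1, 2, 4, 4, 0, 3], 'col2': [0, 4, 4, 0, 0, 1]},
    index=list('abcdef'),  # hypothetical non-default row labels
)
cols = ['col1', 'col2']
s = pd.Series([0, 1])

# Repeat the Series values once per row, and reuse df's own row and
# column labels so the assignment has matching labels to align on
replacement = pd.DataFrame([s.to_numpy()] * len(df), index=df.index, columns=cols)
df.loc[:, cols] = replacement
print(df)
```

Passing `index=df.index` matters as soon as the frame does not use the default 0..n-1 row labels.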

Upvotes: 4
