Reputation: 487
I am trying to modify two values in a single row of a data frame. However, I get an exception that I cannot explain.
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: df = pd.DataFrame(np.random.rand(2,3), index=['one', 'two'],
columns=list('ABC'))
In [4]: df['Z'] = list(range(len(df.index)))
In [5]: df.head(1)
Out[5]:
A B C Z
one 0.977917 0.734311 0.069476 0
In [6]: df.iloc[0] = dict(B=3.5, Z=10)
/home/rajatgirotra/tools/miniconda2/envs/shriram/lib/python2.7/site-packages/pandas/core/indexing.pyc in _setitem_with_indexer(self, indexer, value)
    525
    526                 if len(labels) != len(value):
--> 527                     raise ValueError('Must have equal len keys and value '
    528                                      'when setting with an iterable')
    529
ValueError: Must have equal len keys and value when setting with an iterable
Is this approach incorrect? How can I easily modify one or more cell values in the same row?
Upvotes: 3
Views: 4560
Reputation: 294516
@jezrael's df.iloc[0] = pd.Series(d) is my preference.
But you can also use pd.DataFrame.update and wrap your dictionary in a pd.DataFrame:
df.update(pd.DataFrame(dict(B=3.5, Z=10), ['one']))
df
A B C Z
one 0.339970 3.500000 0.528206 10.0
two 0.553827 0.117207 0.784605 1.0
While I'm at it, here is a creative way using pd.DataFrame.set_value and a list comprehension. It has the advantage of avoiding the overhead of building a DataFrame, and notice that the dtype of column 'Z' is preserved:
[df.set_value('one', k, v) for k, v in dict(B=3.5, Z=10).items()];
df
A B C Z
one 0.099669 3.500000 0.248170 10
two 0.604340 0.305114 0.897305 1
Not that it matters all that much, but here are the timings on this tiny data sample:
%timeit [df.set_value('one', k, v) for k, v in dict(B=3.5, Z=10).items()];
%timeit df.update(pd.DataFrame(dict(B=3.5, Z=10), ['one']))
%timeit df.iloc[0] = pd.Series(dict(B=3.5, Z=10))
100000 loops, best of 3: 5.29 µs per loop
1000 loops, best of 3: 1.51 ms per loop
1000 loops, best of 3: 402 µs per loop
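DataFrame.set_value was deprecated in pandas 0.21 and removed in 1.0, so the list-comprehension trick above no longer runs on current versions. A minimal sketch of the same idea with the .at accessor follows; the variable name updates is only illustrative, and the frame is rebuilt as in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question
df = pd.DataFrame(np.random.rand(2, 3), index=['one', 'two'], columns=list('ABC'))
df['Z'] = list(range(len(df.index)))

updates = dict(B=3.5, Z=10)  # illustrative name; same values as in the question

# .at writes one scalar per call, so no intermediate Series or DataFrame is built
for col, val in updates.items():
    df.at['one', col] = val

print(df)
print(df.dtypes)
```

Because each value is written separately, the integer 10 goes into column 'Z' without the column being converted to float.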
Upvotes: 3
Reputation: 863531
I think you need to select only the columns named by the dict's keys, using loc or iloc; otherwise the remaining columns in that row become NaNs:
d = dict(B=3.5, Z=10)
df.loc[df.index[0], list(d)] = pd.Series(d)
print (df)
A B C Z
one 0.062352 3.500000 0.225811 10.0
two 0.655920 0.386443 0.063906 1.0
df.iloc[0, df.columns.get_indexer(list(d))] = pd.Series(d)
print (df)
A B C Z
one 0.422479 3.500000 0.951087 10.0
two 0.097426 0.702746 0.257591 1.0
df.loc[df.index[0]] = pd.Series(d)
print (df)
A B C Z
one NaN 3.500000 NaN 10.0
two 0.050399 0.917007 0.951725 1.0
df.iloc[0] = pd.Series(d)
print (df)
A B C Z
one NaN 3.500000 NaN 10.0
two 0.5356 0.844221 0.023227 1.0
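For reference, the column-restricted assignment can be tried end to end with the sketch below. It rebuilds the frame from the question and passes list(d) rather than d.keys(), on the assumption that a plain list is the safest selector across Python and pandas versions; the name before is only there to check that the other columns are left alone.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question
df = pd.DataFrame(np.random.rand(2, 3), index=['one', 'two'], columns=list('ABC'))
df['Z'] = list(range(len(df.index)))
before = df.copy()  # snapshot used only to compare afterwards

d = dict(B=3.5, Z=10)

# Select just the columns named in the dict; the Series is aligned
# to those column labels, so 'A' and 'C' are not touched
df.loc['one', list(d)] = pd.Series(d)

print(df)
print(df.loc['one', ['A', 'C']].equals(before.loc['one', ['A', 'C']]))
```

The final print compares columns 'A' and 'C' of row 'one' with their values before the assignment.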
Upvotes: 3