tim654321
tim654321

Reputation: 2248

Update a column of a pandas DataFrame using a partial map

I have a large DataFrame with many columns. I also have a smaller DataFrame with two columns - call them 'label' and 'value', both of which are in the larger DataFrame. I want to replace the 'value' column of the larger DataFrame with the 'value' column of the smaller DataFrame, for rows of the larger DataFrame where 'label' matches a row in the smaller DataFrame.

The logic in terms of a for loop is:

largeDF = pd.DataFrame([['a',1],['b',2],['c',3],['d',4],['e',5]], columns=['label','value']).set_index('label')
smallDF = pd.DataFrame([['d',6],['e',7]], columns=['label','value']).set_index('label')
for label in smallDF.index:
    largeDF.loc[label,'value'] = smallDF.loc[label,'value']

Unfortunately the obvious answer of

largeDF['value'] = smallDF['value']

doesn't work because it NaNs the labels that are not in the small DF. Using .map() has a similar problem.

The for loop I'm using seems very unpythonic, unpandas and slow. What is the better way of doing this that I am missing?

Upvotes: 1

Views: 1498

Answers (1)

jezrael
jezrael

Reputation: 863301

You can use combine_first:

print (smallDF['value'].combine_first(largeDF['value']))
label
a    1.0
b    2.0
c    3.0
d    6.0
e    7.0
Name: value, dtype: float64

All together:

largeDF = pd.DataFrame([['a',1],['b',2],['c',3],['d',4],['e',5]], columns=['label','value'])
            .set_index('label')
largeDF['a'] = 1
largeDF['b'] = 'f'
smallDF = pd.DataFrame([['d',6],['e',7]], columns=['label','value']).set_index('label')
print (largeDF)
       value  a  b
label             
a          1  1  f
b          2  1  f
c          3  1  f
d          4  1  f
e          5  1  f

print (smallDF)
       value
label       
d          6
e          7

largeDF['value'] = smallDF['value'].combine_first(largeDF['value']).astype(int)
print (largeDF)
       value  a  b
label             
a          1  1  f
b          2  1  f
c          3  1  f
d          6  1  f
e          7  1  f

Upvotes: 3

Related Questions