Ana
Ana

Reputation: 1586

filling columns based on multiple row values in Python

I have a dataframe A with some empty cells that I want to fill to have dataframe B. Here is a sample data:

A=
    Month   Type    Sale
    2016-01 A       20
    2016-02 A       10
    2016-01 B       40
    2016-02 B       30
    2016-01 C   
    2016-02 C   
    2016-01 D   
    2016-02 D   

B=
    Month   Type    Sale
    2016-01 A       20
    2016-02 A       10
    2016-01 B       40
    2016-02 B       30
    2016-01 C       60
    2016-02 C       40
    2016-01 D       60
    2016-02 D       40

Here is what I have done:

emptTypes= ['C', 'D']
x = A.groupby('Month', sort = False).Sale.sum()
B['Sale'][B['Type'].isin(emptTypes) & B['Month'].isin(x.index)]=x

And nothing happens!

Upvotes: 1

Views: 1250

Answers (2)

pbreach
pbreach

Reputation: 16987

Actually, since pandas 0.14 you can use fillna directly on the dataframe you need to fill in missing values.

A.fillna(B)
Out[8]: 
        Type  Sale
Month             
2016-01    A  20.0
2016-02    A  10.0
2016-01    B  40.0
2016-02    B  30.0
2016-01    C  60.0
2016-02    C  40.0
2016-01    D  60.0
2016-02    D  40.0 

I remember this because a PR was made to incorporate this after I asked a similar question a while back.

This will fill the missing values in A with those corresponding to B which have matching labels (index/column). This is more general as A and B do not need to be identically labelled, and the code stays the same when you have a multi-index.

Upvotes: 0

jezrael
jezrael

Reputation: 862571

I think you can use fillna by sum:

df['Sale'] = df.groupby('Month', sort = False).Sale.apply(lambda x: x.fillna(x.sum()))
print (df)
     Month Type  Sale
0  2016-01    A  20.0
1  2016-02    A  10.0
2  2016-01    B  40.0
3  2016-02    B  30.0
4  2016-01    C  60.0
5  2016-02    C  40.0
6  2016-01    D  60.0
7  2016-02    D  40.0

Upvotes: 1

Related Questions