Python/Pandas - building a new column based in columns comparison

Question

I have this dataframe:

df:
                    CNPJ     Revenues 2016    Revenues 2015     Revenues 2014
0     01.637.895/0001-32     R$ 12.696.658              NaN     R$ 10.848.213   
1     02.916.265/0001-60               NaN   R$ 162.914.526    R$ 142.412.432
2     02.932.074/0001-91               NaN              NaN      R$ 1.928.312
3     03.853.896/0001-40     R$ 19.333.453    R$ 18.891.833     R$ 12.645.986

Each row represents a company and each "Revenues" column represent the companies revenues in the referenced year.

I want to make a new column called "last_revenues" that will have the last value of revenues. If 2016 is the last we've got, will be 2016, if we don't have 2016 but have 2015, 2015 will be the one. If we don't have neither 2016 or 2015, last_revenues will have the 2014 value.

It would have to look like this:

                    CNPJ     last_revenues
0     01.637.895/0001-32     R$ 12.696.658
1     02.916.265/0001-60    R$ 162.914.526
2     02.932.074/0001-91      R$ 1.928.312
3     03.853.896/0001-40     R$ 19.333.453

Can someone suggest a way of doing it?

Scott Boston · Accepted Answer

df1 = df.set_index('CNPJ')
df1['last_revenues'] = df1.fillna(method='bfill',axis=1).iloc[:,0]

or as DSM suggests we can shorten this to

df1['last_revenues'] = df1.bfill(axis=1).iloc[:,0]
df1.reset_index()

Output:

                 CNPJ  Revenues 2016   Revenues 2015   Revenues 2014  \
0  01.637.895/0001-32  R$ 12.696.658             NaN   R$ 10.848.213   
1  02.916.265/0001-60            NaN  R$ 162.914.526  R$ 142.412.432   
2  02.932.074/0001-91            NaN             NaN    R$ 1.928.312   
3  03.853.896/0001-40  R$ 19.333.453   R$ 18.891.833   R$ 12.645.986   

    last_revenues  
0   R$ 12.696.658  
1  R$ 162.914.526  
2    R$ 1.928.312  
3   R$ 19.333.453

Python/Pandas - building a new column based in columns comparison

Answers (2)

Related Questions