Long Vuong
Long Vuong

Reputation: 181

Adding two dataframe columns with a + yield NaN, while using .add(axis=1) works as expected?

I have a dataframe (output here: https://pastebin.com/7RCPsHet; can be read with pd.DataFrame.from_dict(orient='tight')) with two columns that I want the total of. They look like:

           Tiered Rates Prices       
                         lower higher
year month                           
2022 11                  41.36   0.00
     12                  74.42   0.00
2023 1                   72.31   0.00
     2                   71.01   0.00

However, when I tried to add them using the + symbol, I got this error:

monthly_pivot[('Tiered Rates Prices', 'higher')] + monthly_pivot[('Tiered Rates Prices', 'lower')]

Expected a one-dimensional object, got a DataFrame with 2 columns instead.

Printing out the expression gave me this table of all NaN. This was a little strange, as I had expected the result to be a single column. At this point I think the most likely culprit is the fact that this dataframe has the 'year/month' multiindex, and I'm not calling the columns accurately to sum them up with the + operator. So I'm wondering where did I go wrong?

           Tiered Rates Prices      
                        higher lower
year month                          
2022 11                    NaN   NaN
     12                    NaN   NaN
2023 1                     NaN   NaN
     2                     NaN   NaN

I was able to total the two columns together, and the resulting column looks like:

monthly_pivot[(('Tiered Rates Prices'))].sum(axis=1)

year  month
2022  11        82.72
      12       148.84
2023  1        144.62
      2        142.02

Upvotes: 0

Views: 67

Answers (1)

mozway
mozway

Reputation: 262234

Assuming this reproducible input:

monthly_pivot = pd.DataFrame.from_dict(
    {
        'index': [
            (2022, 11),
            (2022, 12),
            (2023, 1),
            (2023, 2),
            (2023, 3),
            (2023, 4),
            (2023, 5),
        ],
        'columns': [
            ('Tiered Rates Prices', 'lower'),
            ('Tiered Rates Prices', 'higher'),
        ],
        'data': [
            [41.36, 0.0],
            [74.42, 0.0],
            [72.31, 0.0],
            [71.01, 0.0],
            [65.05, 0.0],
            [37.8, 0.0],
            [38.36, 0.0],
        ],
        'index_names': ['year', 'month'],
        'column_names': [None, None],
    },
    orient='tight',
)

Your code should work as expected:

monthly_pivot[('Tiered Rates Prices', 'higher')] + monthly_pivot[('Tiered Rates Prices', 'lower')]

year  month
2022  11       41.36
      12       74.42
2023  1        72.31
      2        71.01
      3        65.05
      4        37.80
      5        38.36
dtype: float64

To yield your shown output, you would need to select the columns as 2D (DataFrames), not 1D (Series):

monthly_pivot[[('Tiered Rates Prices', 'higher')]] + monthly_pivot[[('Tiered Rates Prices', 'lower')]]

           Tiered Rates Prices      
                        higher lower
year month                          
2022 11                    NaN   NaN
     12                    NaN   NaN
2023 1                     NaN   NaN
     2                     NaN   NaN
     3                     NaN   NaN
     4                     NaN   NaN
     5                     NaN   NaN

In this case, the columns wouldn't be aligned and the addition would indeed yield NaNs.

Make sure that the output of monthly_pivot[('Tiered Rates Prices', 'higher')] (and the other column) are Series:

type(monthly_pivot[('Tiered Rates Prices', 'higher')])

pandas.core.series.Series

Upvotes: 1

Related Questions