pmdaly

Reputation: 1212

How to combine rows with the same timestamp?

I'm trying to combine all rows of a DataFrame that share the same timestamp into a single row. The DataFrame is 5k rows by 20 columns.

             A      B      ...
 timestamp
    11:00    NaN    10     ...
    11:00    5      NaN    ...
    12:00    15     20     ...
    ...      ...    ...

I want to merge the two 11:00 rows so the result looks like this:

             A      B        ...
timestamp
    11:00    5      10       ...
    12:00    15     20       ...
    ...      ...    ...

Any help would be appreciated. Thank you.

I have tried:

df.groupby(df.index).sum()

Upvotes: 3

Views: 3210

Answers (4)

selwyth

Reputation: 2497

Replace the NaN values with 0 first, then group by the index and sum:

df.fillna(0, inplace=True)
df.groupby(df.index).sum()
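
As a minimal sketch of this approach, assuming a small frame shaped like the one in the question (the values and the string index are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question's layout.
df = pd.DataFrame(
    {"A": [np.nan, 5, 15], "B": [10, np.nan, 20]},
    index=pd.Index(["11:00", "11:00", "12:00"], name="timestamp"),
)

# Replace NaN with 0, then add up the rows that share an index label.
combined = df.fillna(0).groupby(level=0).sum()
print(combined)
```

One thing to keep in mind: if every value in a column is NaN for a given timestamp, this gives 0 for that cell rather than NaN.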

Upvotes: 2

selwyth

Reputation: 2497

You could melt ('unpivot') the DataFrame to convert it from wide form to long form, remove the null values, then aggregate via groupby.

import pandas as pd

df = pd.DataFrame({'timestamp' : ['11:00','11:00','12:00'],
               'A' : [None,5,15],
               'B' : [10,None,20]
              })

    A   B   timestamp
0   NaN 10  11:00
1   5   NaN 11:00
2   15  20  12:00

df2 = pd.melt(df, id_vars = 'timestamp') # specify the value_vars if needed

    timestamp   variable    value
0   11:00       A           NaN
1   11:00       A           5
2   12:00       A           15
3   11:00       B           10
4   11:00       B           NaN
5   12:00       B           20

df2.dropna(inplace=True)
df3 = df2.groupby(['timestamp', 'variable']).sum()

                        value
timestamp   variable    
11:00       A           5
            B           10
12:00       A           15
            B           20

df3.unstack()

            value
variable    A   B
timestamp       
11:00       5   10
12:00       15  20
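
The same steps can probably be chained into one expression. This is only a sketch on the toy data above; selecting the `value` column before `unstack()` is my own choice here, made so the result has flat `A`/`B` columns instead of a two-level header.

```python
import pandas as pd

df = pd.DataFrame({
    "timestamp": ["11:00", "11:00", "12:00"],
    "A": [None, 5, 15],
    "B": [10, None, 20],
})

wide = (
    pd.melt(df, id_vars="timestamp")       # wide form -> long form
    .dropna(subset=["value"])              # drop the null cells
    .groupby(["timestamp", "variable"])["value"]
    .sum()                                 # one value per (timestamp, column)
    .unstack()                             # long form -> wide form again
)
print(wide)
```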

Upvotes: 2

Alexander

Reputation: 109546

Try using resample (this needs a datetime-like index, e.g. a DatetimeIndex):

>>> df.resample('60min').sum()
                      A   B
2015-05-28 11:00:00   5  10
2015-05-28 12:00:00  15  20

More examples can be found in the Pandas Documentation.
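
A self-contained sketch of hourly resampling, written against current pandas, where the aggregation is a method call on the resampler. The dates and values below are invented to match the output shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical timestamps; resample needs a datetime-like index.
idx = pd.to_datetime(
    ["2015-05-28 11:00", "2015-05-28 11:00", "2015-05-28 12:00"]
)
df = pd.DataFrame({"A": [np.nan, 5, 15], "B": [10, np.nan, 20]}, index=idx)

# Bucket rows into 60-minute bins and sum each bin; NaN is skipped by default.
hourly = df.resample("60min").sum()
print(hourly)
```

Unlike groupby, resample also emits a row for every empty bin between the first and last timestamp, so it fits best when the data is meant to be on a regular time grid.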

Upvotes: 1

J.J

Reputation: 3607

Adding a number to NaN in Python gives NaN, so the NaN values need to be skipped during aggregation. You probably need to use .aggregate() :)
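
One possible reading of this suggestion, as a sketch only: the answer does not say which function to aggregate with, so "first" below is an assumption. It picks the first non-null value per column in each group, which fits the question's case of at most one real value per column and timestamp.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the question's layout.
df = pd.DataFrame(
    {"A": [np.nan, 5, 15], "B": [10, np.nan, 20]},
    index=pd.Index(["11:00", "11:00", "12:00"], name="timestamp"),
)

# "first" takes the first non-null value in each column of each group.
combined = df.groupby(level=0).aggregate("first")
print(combined)
```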

Upvotes: 0
