Reputation: 1568
How do I group by date?
I've tried df.grouby(df.date)
and it produces a blank df. Essentially I want the NaNs removed and values 1 and values 2 grouped by date.
date value1 value2
0 2001-01-01 800 NaN
1 2001-01-02 900 NaN
2 2001-01-03 1200 NaN
3 2001-01-04 700 NaN
4 2001-01-05 600 NaN
5 2001-01-01 NaN 1175
6 2001-01-02 NaN 1176
7 2001-01-03 NaN 1177
8 2001-01-04 NaN 1778
9 2001-01-05 NaN 1779
Desired DataFrame:
date value1 value2
0 2001-01-01 800 1175
1 2001-01-02 900 1176
2 2001-01-03 1200 1177
3 2001-01-04 700 1778
4 2001-01-05 600 1179
Upvotes: 0
Views: 66
Reputation: 75080
You can use stack()
which by default drops the nan
followed by unstack
:
df.set_index('date').stack().unstack().reset_index()
date value1 value2
0 2001-01-01 800.0 1175.0
1 2001-01-02 900.0 1176.0
2 2001-01-03 1200.0 1177.0
3 2001-01-04 700.0 1778.0
4 2001-01-05 600.0 1779.0
Upvotes: 1
Reputation: 30920
Use groupby.first
:
new_df = df.groupby('date',as_index = False).first()
print(new_df)
date value1 value2
0 2001-01-01 800.0 1175.0
1 2001-01-02 900.0 1176.0
2 2001-01-03 1200.0 1177.0
3 2001-01-04 700.0 1778.0
4 2001-01-05 600.0 1779.0
If you can have more than one value per column and date then you need:
df.groupby('date').apply(lambda x: x.ffill().bfill()).drop_duplicates()
Upvotes: 2