Count unique dates in pandas dataframe

Question

I have a dataframe of surface weather observations (fzraHrObs) organized by a station identifier code and date. fzraHrObs has several columns of weather data. The station code and date (datetime objects) look like:

usaf      dat
716270    2014-11-23 12:00:00
          2015-12-20 08:00:00
          2015-12-20 09:00:00
          2015-12-21 04:00:00
          2015-12-28 03:00:00
716280    2015-12-19 08:00:00
          2015-12-19 08:00:00

I would like to get a count of the number of unique dates (days) per year for each station - i.e. the number of days of obs per year at each station. In my example above this would give me:

    usaf      Year     Count
    716270    2014     1
              2015     3
    716280    2014     0
              2015     1

I've tried using groupby and grouping by station, year, and date: grouped = fzraHrObs['dat'].groupby([fzraHrObs['usaf'], fzraHrObs.dat.dt.year, fzraHrObs.dat.dt.date])

Count, size, nunique, etc. on this just give me the number of obs on each date, not the number of distinct dates per year. Any suggestions on getting what I want here?

akuiper · Accepted Answer

Could be something like this: group the dates by usaf and year, then count the number of unique values in each group:

import pandas as pd
# Reduce each timestamp to its calendar date, then count distinct dates per (usaf, year).
df.dat.dt.date.groupby([df.usaf, df.dat.dt.year]).nunique()

#   usaf   dat 
# 716270  2014    1
#         2015    3
# 716280  2015    1
# Name: dat, dtype: int64
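
Note that the result above has no row for station 716280 in 2014, whereas the question asked for a 0 there. A minimal sketch of one way to fill those gaps follows, assuming usaf and dat are ordinary columns (not index levels) and that dat is a datetime64 column. The sample frame is rebuilt from the rows in the question, and the names days, counts and full_index are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (assumed to be plain columns).
df = pd.DataFrame({
    "usaf": [716270, 716270, 716270, 716270, 716270, 716280, 716280],
    "dat": pd.to_datetime([
        "2014-11-23 12:00:00",
        "2015-12-20 08:00:00",
        "2015-12-20 09:00:00",
        "2015-12-21 04:00:00",
        "2015-12-28 03:00:00",
        "2015-12-19 08:00:00",
        "2015-12-19 08:00:00",
    ]),
})

# Truncate each timestamp to midnight so all obs on the same day compare equal.
days = df["dat"].dt.normalize()

# Count distinct days per (station, year).
counts = (
    days.groupby([df["usaf"], df["dat"].dt.year.rename("Year")])
    .nunique()
    .rename("Count")
)

# Reindex over every station/year combination so missing pairs appear as 0.
full_index = pd.MultiIndex.from_product(
    [counts.index.levels[0], counts.index.levels[1]], names=["usaf", "Year"]
)
counts = counts.reindex(full_index, fill_value=0)
print(counts)
```

With the sample rows this should give 1 and 3 for station 716270 (2014, 2015) and 0 and 1 for station 716280. Only years that appear somewhere in the data are filled in; a year with no obs at any station would still be absent.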
