dvdlwlr

Reputation: 35

Grouping dates together by year in Pandas

I have a dataset of property prices, currently listed by 'SALE_DATE'. I'd like to be able to count the sales by year. The dataset looks like this:

    SALE_DATE    COUNTY  SALE_PRICE
0  2010-01-01    Dublin    343000.0
1  2010-01-03     Laois    185000.0
2  2010-01-04    Dublin    438500.0
3  2010-01-04     Meath    400000.0
4  2010-01-04  Kilkenny    160000.0

This is the code I've tried:

by_year = property_prices['SALE_DATE'] = pd.to_datetime(property_prices['SALE_DATE'])
print(by_year)

I think I'm close but as a biblical noob it's quite frustrating!

Thank you for any help you can provide; this site has been awesome so far for finding little tips and tricks to make my life easier.

Upvotes: 3

Views: 79

Answers (2)

Aditya

Reputation: 66

import pandas as pd

sample_dict = {'Date':['2010-01-11', '2020-01-22', '2010-03-12'], 'Price':[1000,2000,3500]}

df = pd.DataFrame(sample_dict)

# Creating 'year' column using the Date column
df['year'] = df.apply(lambda row: row.Date.split('-')[0], axis=1)

# Group by the new 'year' column (the key must match the column name exactly)
df1 = df.groupby('year')

# Print the first row of each group
print(df1.first())

Output:

            Date  Price
year
2010  2010-01-11   1000
2020  2020-01-22   2000
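
Since the question asks for a count per year rather than the first row of each group, here is a minimal sketch of how the same string-based grouping could be turned into counts. It reuses the sample data from this answer; the name `counts` is only illustrative, and the year keys stay as strings because they come from splitting the text.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2010-01-11", "2020-01-22", "2010-03-12"],
    "Price": [1000, 2000, 3500],
})

# Year as a string: the text before the first '-' in each date
df["year"] = df["Date"].str.split("-").str[0]

# size() returns the number of rows in each group
counts = df.groupby("year").size()
print(counts)
```

With this sample, '2010' has two rows and '2020' has one.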

Upvotes: 0

sophocles

Reputation: 13821

You are close. As you did, use pd.to_datetime to convert SALE_DATE to a datetime column. Then group by the year, using dt.year to extract the year from each date, and call size() on the result, which counts the rows in each group (here, each group is one year).

property_prices['SALE_DATE'] = pd.to_datetime(property_prices['SALE_DATE'])
property_prices.groupby(property_prices.SALE_DATE.dt.year).size()

Which prints:

SALE_DATE
2010    5
dtype: int64
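
For anyone who wants to run this end to end, here is a self-contained sketch built from the five sample rows in the question. The variable name `sales_per_year` is just an illustrative choice, and the real dataset would presumably be loaded from a file instead of typed in.

```python
import pandas as pd

# The five sample rows shown in the question
property_prices = pd.DataFrame({
    "SALE_DATE": ["2010-01-01", "2010-01-03", "2010-01-04",
                  "2010-01-04", "2010-01-04"],
    "COUNTY": ["Dublin", "Laois", "Dublin", "Meath", "Kilkenny"],
    "SALE_PRICE": [343000.0, 185000.0, 438500.0, 400000.0, 160000.0],
})

# Convert the text dates to real datetimes so the .dt accessor is available
property_prices["SALE_DATE"] = pd.to_datetime(property_prices["SALE_DATE"])

# Group on the year extracted from each date and count the rows per group
sales_per_year = property_prices.groupby(
    property_prices["SALE_DATE"].dt.year
).size()
print(sales_per_year)
```

All five sample rows fall in 2010, so the result has a single entry: 2010 with a count of 5. On the full dataset there would be one row per year.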

Upvotes: 1
