Katharina Böhm

Reputation: 125

Randomly sample from panel data by 3-month periods

I have a pandas DataFrame of panel data, i.e. data on multiple customers over a timeframe. I want to sample (for bootstrapping) a continuous three-month period from a random customer, 90 times, and I always want to get full months.

I have googled a bit and found several sampling techniques, but none that sample three consecutive months. I was considering making a list of all the month names and sampling three consecutive ones (although I am not sure how to do the consecutive part). But how would I then be able to pick, e.g., Nov21-Dec21-Jan22?

Would appreciate the help a lot!

Upvotes: 0

Views: 48

Answers (1)

Aidis

Reputation: 1280

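import pandas as pd

# Daily index from 2020-01-01 through 2022-01-01, with a constant value column
date_range = pd.date_range("2020-01-01", "2022-01-01")
df = pd.DataFrame({"value": 3}, index=date_range)
# Group by calendar quarter (1-4, pooled across years) and draw 5 rows per group
df.groupby(df.index.quarter).sample(5)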

This would output:

Out[12]:
            value
2021-01-14      3
2021-02-27      3
2020-01-20      3
2021-02-03      3
2021-02-19      3
2021-04-27      3
2021-06-29      3
2021-04-12      3
2020-06-24      3
2020-06-05      3
2021-07-30      3
2020-08-29      3
2021-07-03      3
2020-07-17      3
2020-09-12      3
2020-12-22      3
2021-12-13      3
2021-11-29      3
2021-12-19      3
2020-10-18      3

This selects 5 sample rows from each quarter group.
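
Note that grouping by `df.index.quarter` uses fixed calendar quarters and pools the same quarter from different years, so it cannot return a window such as Nov21-Dec21-Jan22. A minimal sketch of a rolling three-month window per customer, using monthly periods, might look like the following. The `customer` and `date` column names, the toy data, and the `sample_three_month_window` helper are assumptions for illustration, and the sketch assumes each customer has at least three months of data with complete first and last months.

```python
import numpy as np
import pandas as pd

# Hypothetical panel: daily rows for three customers (column names are assumptions).
dates = pd.date_range("2021-01-01", "2022-06-30", freq="D")
panel = pd.DataFrame({
    "customer": np.repeat(["A", "B", "C"], len(dates)),
    "date": np.tile(dates, 3),
    "value": 3,
})

def sample_three_month_window(df, rng, n_months=3):
    """Return all rows of one random customer inside a random run of n_months months."""
    customers = df["customer"].unique()
    customer = customers[rng.integers(len(customers))]
    rows = df[df["customer"] == customer]
    months = rows["date"].dt.to_period("M")
    # Candidate start months: the window must end no later than the customer's last month.
    starts = pd.period_range(months.min(), months.max() - (n_months - 1), freq="M")
    start = starts[rng.integers(len(starts))]
    # Period arithmetic handles year boundaries, e.g. 2021-11 + 2 -> 2022-01.
    mask = (months >= start) & (months <= start + (n_months - 1))
    return rows[mask]

rng = np.random.default_rng(0)
samples = [sample_three_month_window(panel, rng) for _ in range(90)]
```

Each element of `samples` is a DataFrame holding one customer's rows for three consecutive full months. If customers have gaps in their data, the candidate start months would need an extra check.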

From here you can format the date index so that the month is written as text.
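
One possible way to do that formatting is `DatetimeIndex.strftime`. This is only a sketch: the `month` column name and the `random_state` value are assumptions, and `%b` produces locale-dependent month abbreviations (e.g. "Nov21" in an English locale).

```python
import pandas as pd

date_range = pd.date_range("2020-01-01", "2022-01-01")
df = pd.DataFrame({"value": 3}, index=date_range)
# random_state is fixed here only to make the draw reproducible.
sampled = df.groupby(df.index.quarter).sample(5, random_state=0)

# Add a text label (abbreviated month name + two-digit year) for each sampled date.
sampled["month"] = sampled.index.strftime("%b%y")
```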

Upvotes: 0
