Luckasino

Reputation: 424

Find median in nth range in Python

I am trying to find the median of the values in my dataset for every 15-day period. The dataset has three columns: index, value, and date.

I need these medians so that I can evaluate them against some conditions; each 15-day period will then get a new value according to those conditions. I have tried several approaches (mostly list comprehensions), but I am still too much of a beginner to solve this properly.

    value   date        index
14  13065   1983-07-15  14
15  13065   1983-07-16  15
16  13065   1983-07-17  16
17  13065   1983-07-18  17
18  13065   1983-07-19  18
19  13065   1983-07-20  19
20  13065   1983-07-21  20
21  13065   1983-07-22  21
22  13065   1983-07-23  22
23  .....    .........  .. 

medians = [dataset['value'].median() for range(0, len(dataset['index']), 15) in dataset['value']]   

I expect to get the medians from the DataFrame in a new variable, but instead I get this error:

SyntaxError: can't assign to function call

Upvotes: 2

Views: 514

Answers (1)

Mohit Motwani

Reputation: 4792

Assuming you have data in the below format:

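import numpy as np
import pandas as pd
# 1000 consecutive days, each with a random integer value in [1, 1000)
test = pd.DataFrame({'date': pd.date_range(start='2016/02/12', periods=1000, freq='1D'),
                     'value': np.random.randint(1, 1000, 1000)})
test.head()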

    date       value
0   2016-02-12  243
1   2016-02-13  313
2   2016-02-14  457
3   2016-02-15  236
4   2016-02-16  893

If you want the median for every 15 days, group by the date column with pd.Grouper:

test.groupby(pd.Grouper(freq='15D', key='date')).median().reset_index()

date        value
2016-02-12  457.0
2016-02-27  733.0
2016-03-13  688.0
2016-03-28  504.0
2016-04-12  591.0
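Since the data above is random, the medians will differ on every run. As a check, here is a minimal sketch on a small made-up frame (the name demo and its values are only for illustration) where the result can be verified by hand. It also uses transform to write each window's median back onto every row of that window, which may be what the question needs for the per-day conditions:

```python
import pandas as pd

# Hypothetical toy data: 30 consecutive days with values 1..30
demo = pd.DataFrame({
    'date': pd.date_range(start='1983-07-01', periods=30, freq='D'),
    'value': range(1, 31),
})

# One median per 15-day window, keyed by the window's start date
medians = (
    demo.groupby(pd.Grouper(key='date', freq='15D'))['value']
        .median()
        .reset_index()
)

# Same grouping, but the median is broadcast back to every row of its window
demo['window_median'] = (
    demo.groupby(pd.Grouper(key='date', freq='15D'))['value']
        .transform('median')
)

print(medians)
```

The first window covers the values 1 to 15 and the second covers 16 to 30, so the two medians should come out as 8.0 and 23.0.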

Note that pd.Grouper with freq only works if your date column is of datetime type. If it is not, convert it first:

test['date'] = pd.to_datetime(test['date'])
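If the windows should instead be counted in rows rather than calendar days (which is what the range(0, len(...), 15) in the question seems to be aiming at), a possible alternative is to group by the row position divided by the block size. This is only a sketch: demo, block_size and block_id are made-up names, and a block size of 3 is used so the result is easy to check; the question would use 15.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the 'value' column matters here
demo = pd.DataFrame({'value': [5, 1, 9, 3, 7, 2, 8]})

block_size = 3  # assumption for this example; use 15 for the question's data

# Rows 0-2 get label 0, rows 3-5 get label 1, and so on
block_id = np.arange(len(demo)) // block_size

# One median per block of consecutive rows (the last block may be shorter)
block_medians = demo.groupby(block_id)['value'].median()

print(block_medians.tolist())
```

Unlike pd.Grouper, this ignores the dates entirely, so missing days do not shift the windows, but the blocks no longer correspond to fixed calendar periods.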

Upvotes: 1
