nocoffeenoworkee

Reputation: 59

Time series from a list of dates in Python

I have a list of dates (ListA), where each entry represents one occurrence. How do I make a time series out of this list in Python 3? The sequence of dates would be on the X axis, and the frequency of each date on the Y axis.

ListA = ['2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
         '2016-03-05', '2016-07-11', '2017-01-01']

Desired Output:

[2016-04-05, 2], [2016-04-06, 0], [2016-04-07, 1],
[2016-04-08, 0], ..., [2017-01-01, 1]

Desired Format of output:

[[Date, Frequency],....,*]

I have the Date code as:

import pandas as pd
Date = pd.date_range('2016-04-05', '2017-01-01', freq='D')
print(Date)

Which gives:

[2016-04-05, 2016-04-06, 2016-04-07,....,]

I need something like the code below to step through Date above to get the frequency for each date.

for item in ListA:
    if item>=Date[0] and item<Date[1]:
        print(ListA.count(item))

Upvotes: 3

Views: 1807

Answers (1)

Stephen Rauch

Reputation: 49774

Using Counter from the collections module, this is very straightforward:

Code:

dates = [
    '2016-04-05',
    '2016-04-05',
    '2016-04-07',
    '2016-09-10',
    '2016-03-05',
    '2016-07-11',
    '2017-01-01'
]

from collections import Counter
counts = Counter(dates)
print(sorted(counts.items()))

Results:

[('2016-03-05', 1), ('2016-04-05', 2), 
 ('2016-04-07', 1), ('2016-07-11', 1), 
 ('2016-09-10', 1), ('2017-01-01', 1)]

Build a list over pandas.DatetimeIndex:

Building a list of lists over a range of dates is easy, because a Counter returns 0 when indexed with a key it has not counted.

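import pandas as pd

# pandas date range covering every day of interest
# (named date_range so it does not overwrite the `dates` list above)
date_range = pd.date_range('2016-04-05', '2017-01-01', freq='D')

# count the occurrence dates from the `dates` list, keyed by Timestamp
counts = Counter(pd.to_datetime(dates))

# build a list of [date, count] pairs for every date in the range
date_occurrence_sequence = [[d, counts[d]] for d in date_range]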
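
For reference, here is a minimal end-to-end sketch of the same idea that can be run on its own and produces the [[Date, Frequency], ...] format from the question. The names occurrences, date_range and series are my own choices for illustration, and formatting each date back to a string with strftime is an assumption about what the desired output should look like.

```python
from collections import Counter

import pandas as pd

# Occurrence dates from the question, as strings
occurrences = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]

# One entry per day in the period of interest
date_range = pd.date_range('2016-04-05', '2017-01-01', freq='D')

# Count the occurrences, keyed by Timestamp so they match the range entries
counts = Counter(pd.to_datetime(occurrences))

# A Counter returns 0 for missing keys, so days without occurrences get 0
series = [[d.strftime('%Y-%m-%d'), counts[d]] for d in date_range]

print(series[:4])
# [['2016-04-05', 2], ['2016-04-06', 0], ['2016-04-07', 1], ['2016-04-08', 0]]
```

Note that 2016-03-05 falls before the start of the range, so it is simply not reported.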

Add to a per-day dataframe:

Since you seem to be using pandas, let's insert the occurrence counts into a data frame indexed by day.

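import pandas as pd

# one row per day, initialised to zero
index = pd.date_range('2016-04-05', '2017-01-01', freq='D')
df = pd.DataFrame([0] * len(index), index=index)

# overwrite the zeros with the counts; `dates` is the original list of date strings
df.update(pd.DataFrame.from_dict(Counter(pd.to_datetime(dates)), orient='index'))

print(df.head())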

Results:

              0
2016-04-05  2.0
2016-04-06  0.0
2016-04-07  1.0
2016-04-08  0.0
2016-04-09  0.0
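
As an alternative to the Counter-based update above (this is a different technique, not the one shown in the answer's code), the counting can also be done inside pandas with value_counts followed by reindex. The sketch below assumes the same input strings as the question; occurrences and daily are hypothetical names. Passing fill_value=0 to reindex should keep the counts as integers rather than floats.

```python
import pandas as pd

# Occurrence dates from the question, as strings
occurrences = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]

# One entry per day in the period of interest
index = pd.date_range('2016-04-05', '2017-01-01', freq='D')

# Count each distinct date; the result is indexed by the date values
counts = pd.Series(pd.to_datetime(occurrences)).value_counts()

# Align the counts to the daily index; days with no occurrences become 0
daily = counts.reindex(index, fill_value=0)

print(daily.head())
```

The resulting Series can be plotted directly with daily.plot() if matplotlib is installed.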

Upvotes: 4
