Reputation: 59
I have a list of dates (ListA), where each entry represents one occurrence. How do I make a time series out of the list in Python 3? The sequence of dates would be on the X axis, and the frequency of each date would be on the Y axis.
ListA = ['2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
         '2016-03-05', '2016-07-11', '2017-01-01']
Desired Output:
[2016-04-05, 2], [2016-04-06, 0], [2016-04-07, 1],
[2016-04-08, 0], ..., [2017-01-01, 1]
Desired Format of output:
[[Date, Frequency],....,*]
I generate the date range with:
Date = pd.date_range('2016-04-05', '2017-01-01', freq='D')
print(Date)
Which gives:
[2016-04-05, 2016-04-06, 2016-04-07,....,]
I need something like the code below to step through Date above to get the frequency for each date.
for item in ListA:
    if item >= Date[0] and item < Date[1]:
        print(ListA.count(item))
Upvotes: 3
Views: 1807
Reputation: 49774
Using Counter from the collections module, this is very straightforward:
Code:
dates = [
'2016-04-05',
'2016-04-05',
'2016-04-07',
'2016-09-10',
'2016-03-05',
'2016-07-11',
'2017-01-01'
]
from collections import Counter
counts = Counter(dates)
print(sorted(counts.items()))
Results:
[('2016-03-05', 1), ('2016-04-05', 2),
('2016-04-07', 1), ('2016-07-11', 1),
('2016-09-10', 1), ('2017-01-01', 1)]
Build a list over pandas.DatetimeIndex:
Building a list of lists over a range of dates is easy enough, because a Counter returns 0 when indexed with a key it has never counted.
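import pandas as pd
from collections import Counter

# pandas date range covering every day of interest
# (kept separate from `dates`, the list of occurrences defined above)
date_range = pd.date_range('2016-04-05', '2017-01-01', freq='D')
# count the occurrences, with the date strings converted to Timestamps
counts = Counter(pd.to_datetime(dates))
# build a list of [date, count] pairs for every date in the range
date_occurrence_sequence = [[d, counts[d]] for d in date_range]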
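For comparison, here is a minimal sketch of the same zero-filled list built with the standard library only, in case the output should contain plain date strings in the [[Date, Frequency], ...] format from the question. The helper name daily_counts and its parameters are hypothetical, chosen for illustration; it assumes the dates are ISO-formatted strings (YYYY-MM-DD) and that the range is inclusive at both ends.

```python
from collections import Counter
from datetime import date, timedelta


def daily_counts(date_strings, start, end):
    """Return [[iso_date, count], ...] for every day from start to end inclusive."""
    counts = Counter(date_strings)
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    # A Counter returns 0 for keys it has never seen, so days without
    # any occurrence are filled with 0 automatically.
    return [[d.isoformat(), counts[d.isoformat()]] for d in days]


dates = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]
result = daily_counts(dates, date(2016, 4, 5), date(2017, 1, 1))
print(result[:4])
# [['2016-04-05', 2], ['2016-04-06', 0], ['2016-04-07', 1], ['2016-04-08', 0]]
```

Note that 2016-03-05 falls before the start of the range, so it does not appear in the result.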
Add to a per-day dataframe:
And since you seem to be using pandas, let's insert the occurrence counts into a data frame indexed per day.
import pandas as pd
index = pd.date_range('2016-04-05', '2017-01-01', freq='D')
df = pd.DataFrame([0] * len(index), index=index)
df.update(pd.DataFrame.from_dict(Counter(pd.to_datetime(dates)), 'index'))
print(df.head())
Results:
0
2016-04-05 2.0
2016-04-06 0.0
2016-04-07 1.0
2016-04-08 0.0
2016-04-09 0.0
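As an alternative sketch (not part of the approach above), the per-day counts can also be obtained with pandas alone by combining Series.value_counts with reindex. This assumes the same list of date strings and the same date range; the name per_day is just illustrative. Because the zeros come from fill_value rather than from an update, the counts should stay integers instead of becoming floats.

```python
import pandas as pd

dates = [
    '2016-04-05', '2016-04-05', '2016-04-07', '2016-09-10',
    '2016-03-05', '2016-07-11', '2017-01-01',
]
index = pd.date_range('2016-04-05', '2017-01-01', freq='D')

# value_counts tallies each distinct date; reindex aligns those tallies to
# every day in the range, fills days with no occurrences with 0, and drops
# dates that fall outside the range (here 2016-03-05).
per_day = pd.Series(pd.to_datetime(dates)).value_counts().reindex(index, fill_value=0)
print(per_day.head())
```

The resulting Series is indexed by day, so it can be plotted directly or converted to the list-of-pairs form with [[d, c] for d, c in per_day.items()].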
Upvotes: 4