Reputation: 163
The data looks like this (note that the dates are not consecutive):
   name             created     label
0  leahbirdjohnso2  2020-02-20  PATRIOTSAWAKENED
1  leahbirdjohnso2  2020-02-21  TRUMP2020
2  carol2busy       2020-02-23  TRUMP2020
3  carol2busy       2020-02-24  TRUMP2020
4  GODRUS           2020-02-25  FOXNEWS
If I set the interval to 2 days, I would like to get a dataframe like this:
created     counts  label
2020-02-20  1       PATRIOTSAWAKENED
            1       TRUMP2020
2020-02-23  1       TRUMP2020
2020-02-24  1       FOXNEWS
            1       TRUMP2020
If I set the interval to 3 days, I would like to get a dataframe like this:
created     counts  label
2020-02-20  1       PATRIOTSAWAKENED
            1       TRUMP2020
2020-02-23  2       TRUMP2020
            1       FOXNEWS
Basically, I want to count how often each label occurs within each time interval. I should be able to set any interval in days: 3 days, 7 days, 15 days, etc. I checked https://stackoverflow.com/questions/56275425/pandas-typeerror-not-supported-between-instances-of-int-and-str-when-s
but it did not work for me. I could do it with Counter, but that is more complicated. How can I do this in an elegant pandas style?
Upvotes: 0
Views: 350
Reputation: 14949
IIUC, you can try:
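import pandas as pd
df["created"] = pd.to_datetime(df["created"])
# Bin rows into 3-day windows and count rows per (window, label); change freq (e.g. "2D", "7D") for other intervals.
df = df.groupby([pd.Grouper(key="created", freq="3D"), "label"]).count()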
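For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same `pd.Grouper` idea. The sample data is rebuilt by hand from the question, and `count_labels` is a hypothetical helper name, not something from pandas or from the answer above. It swaps `.count()` for `.size()` plus `reset_index(name="counts")`, on the assumption that an explicit `counts` column is what the question is after.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "name": ["leahbirdjohnso2", "leahbirdjohnso2", "carol2busy", "carol2busy", "GODRUS"],
        "created": ["2020-02-20", "2020-02-21", "2020-02-23", "2020-02-24", "2020-02-25"],
        "label": ["PATRIOTSAWAKENED", "TRUMP2020", "TRUMP2020", "TRUMP2020", "FOXNEWS"],
    }
)
df["created"] = pd.to_datetime(df["created"])


def count_labels(frame, days):
    """Count rows per label within consecutive windows of `days` days.

    Hypothetical helper for illustration only.
    """
    return (
        frame.groupby([pd.Grouper(key="created", freq=f"{days}D"), "label"])
        .size()                       # number of rows in each (window, label) group
        .reset_index(name="counts")   # flatten the index into ordinary columns
    )


three_day = count_labels(df, 3)
two_day = count_labels(df, 2)
print(three_day)
print(two_day)
```

One thing to be aware of: the windows are anchored at midnight of the earliest date and labeled by their start. With `days=2` the second window is therefore labeled 2020-02-22 rather than 2020-02-23, even though the only row in it was created on the 23rd.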
Upvotes: 1