Reputation: 619
did some searching but nothing yields the desired result, which is grouping the data by date and counting the frequency. I am able to do this with aggregate but I'm not sure how to create a new column with the results, thanks.
data in file:
Domain Dates
twitter.com 2016-08-08
google.com 2016-08-09
apple.com 2016-08-09
linkedin.com 2016-08-09
microsoft.com 2016-08-09
slack.com 2016-08-12
instagram.com 2016-08-12
ibm.com 2016-08-12
code
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import numpy as np
df = pd.read_csv('domains.tsv', sep='\t')
df = df.groupby([pd.to_datetime(df.Dates).dt.date]).agg({'Dates':'size'})
print(df)
yields
Dates
Dates
2016-08-08 1
2016-08-09 4
2016-08-12 3
Ideally, I would like the count column to be 'count' and then I will save as a new csv.
Upvotes: 3
Views: 3047
Reputation: 1087
import pandas as pd
df = pd.read_csv('domains.tsv', sep='\t')
counter = df.groupby('Dates').count().rename(columns={'Domain': 'count'})
counter.to_csv('count.csv')
You will get count.csv including following result on your current dir.
Dates,count
2016-08-08,1
2016-08-09,4
2016-08-12,3
Upvotes: 3