jr134

Reputation: 83

How to count words and group by hour?

I am working with a large Twitter data set. I am trying to count the words in the Word column, grouped by hour using the Time column, and then display the result as a histogram so I can see how the distribution of words changes over time. Does anybody know how I can do this in R?

Sample of the data is accessible via this link: https://docs.google.com/spreadsheets/d/1JhXEyzkjPs59hVgoS3lW7e0Fcumis62QDUvuMP2q5aQ/edit?usp=sharing

Thanks, James

Upvotes: 0

Views: 476

Answers (1)

sconfluentus

Reputation: 4993

Read your file into R (the code below assumes the data is stored in a variable named x), then use the following:

library(dplyr)

# Count how often each word occurs at each timestamp
x %>%
  group_by(Time, Word) %>%
  summarise(count = n())

It returns output like this:

                  Time      Word count
                <fctr>    <fctr> <int>
1  2015/04/30 21:59:00         a     1
2  2015/04/30 21:59:00 baltimore     1
3  2015/04/30 21:59:00     check     1
4  2015/04/30 21:59:00    common     1
5  2015/04/30 21:59:00   grabbed     1
6  2015/04/30 21:59:00      have     1
7  2015/04/30 21:59:00       her     1

You can assign this result to a variable to keep it as a data table or data frame.
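
Note that the grouping above uses the full timestamp rather than the hour. A minimal sketch of hourly grouping follows, assuming Time is stored as text (or a factor) in the format shown in the output above and that UTC is an acceptable time zone. The small sample data frame, the Hour column, and the names hourly and totals are made up for illustration and are not part of the original data:

```r
library(dplyr)

# Hypothetical sample standing in for the real data; only the column names
# Time and Word are taken from the output shown above.
x <- data.frame(
  Time = c("2015/04/30 21:59:00", "2015/04/30 21:59:00",
           "2015/04/30 22:01:00", "2015/04/30 22:30:00"),
  Word = c("baltimore", "check", "baltimore", "baltimore"),
  stringsAsFactors = FALSE
)

hourly <- x %>%
  mutate(
    # Parse the timestamp; as.character() also covers the factor case.
    # The time zone is an assumption, adjust it to match your data.
    Time = as.POSIXct(as.character(Time),
                      format = "%Y/%m/%d %H:%M:%S", tz = "UTC"),
    # Truncate each timestamp to the start of its hour
    Hour = format(Time, "%Y-%m-%d %H:00")
  ) %>%
  # One row per (Hour, Word) pair with the number of occurrences
  count(Hour, Word, name = "count")

# Total number of words per hour, for plotting
totals <- hourly %>%
  group_by(Hour) %>%
  summarise(total = sum(count))

# Bar chart of word volume per hour
barplot(totals$total, names.arg = totals$Hour,
        xlab = "Hour", ylab = "Number of words")
```

To follow a single word over time instead of the overall volume, you could filter hourly on Word before plotting.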

Upvotes: 1
