Reputation: 61
I am having the list of ticket numbers for each date. Date column is transformed to data but the ticket number column is a text.
Created Ticket
01-Jan-19 a1
02-Jan-19 a2
03-Jan-19 a3
04-Jan-19 a4
05-Jan-19 a5
06-Jan-19 a6
07-Jan-19 a7
08-Jan-19 a8
09-Jan-19 a9
10-Jan-19 a10
11-Jan-19 a11
12-Jan-19 a12
13-Jan-19 a13
14-Jan-19 a14
15-Jan-19 a15
16-Jan-19 a16
17-Jan-19 a17
18-Jan-19 a18
19-Jan-19 a19
01-Feb-19 a20
02-Feb-19 a21
03-Feb-19 a22
04-Feb-19 a23
Tried to use floor data in R but since the ticket number column is character I am not able to use it.
data <- read.csv(file = 'D:\\DS Data\\SampleTickets.csv', stringsAsFactors = FALSE,header = TRUE)
str(data)
library(readr)
library(lubridate)
library(dplyr)
data <- data %>%
mutate(Created = dmy(Created))
data %>% group_by(month=floor_date(Created, "month")) %>%
summarize(amount=sum(Ticket))
I am expecting a data frame output like..
CreatedMonth CountOfTickets
1/1/2019 18
1/2/2019 4
Upvotes: 2
Views: 45
Reputation: 42544
You were nearly there: Just use n()
instead of sum(Ticket)
to count the number of rows:
library(dplyr)
library(lubridate)
data %>%
mutate(Created = dmy(Created)) %>%
group_by(month = floor_date(Created, "month")) %>%
summarize(amount = n())
# A tibble: 2 x 2 month amount <date> <int> 1 2019-01-01 19 2 2019-02-01 4
However, there is a shortcut which uses count()
:
data %>%
count(CreatedMonth = dmy(Created) %>% floor_date("month"))
# A tibble: 2 x 2 CreatedMonth n <date> <int> 1 2019-01-01 19 2 2019-02-01 4
For the sake of completeness, here is also a data.table
version:
library(lubridate)
library(data.table)
setDT(data)[, .N, by = .(CreatedMonth = floor_date(dmy(Created), "month"))]
CreatedMonth N 1: 2019-01-01 19 2: 2019-02-01 4
data <- readr::read_table("Created Ticket
01-Jan-19 a1
02-Jan-19 a2
03-Jan-19 a3
04-Jan-19 a4
05-Jan-19 a5
06-Jan-19 a6
07-Jan-19 a7
08-Jan-19 a8
09-Jan-19 a9
10-Jan-19 a10
11-Jan-19 a11
12-Jan-19 a12
13-Jan-19 a13
14-Jan-19 a14
15-Jan-19 a15
16-Jan-19 a16
17-Jan-19 a17
18-Jan-19 a18
19-Jan-19 a19
01-Feb-19 a20
02-Feb-19 a21
03-Feb-19 a22
04-Feb-19 a23")
Upvotes: 1
Reputation: 388982
Using dplyr
we can first convert Created
column to actual Date and group them by each month and count number of tickets for each group.
library(dplyr)
df %>%
mutate(Created = as.Date(Created, "%d-%b-%y")) %>%
arrange(Created) %>%
mutate(Yearmon = format(Created, "%B-%Y"),
Yearmon = factor(Yearmon, levels = unique(Yearmon))) %>%
group_by(Yearmon) %>%
summarise(count = n())
# Yearmon count
# <fct> <int>
#1 January-2019 19
#2 February-2019 4
Upvotes: 0