Reputation: 2513
Here is my data:
places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dataset <- data.frame(years, places)
The result:
years places
1 2019 London
2 2019 London
3 2020 London
4 2019 Paris
5 2019 Paris
6 2020 Rennes
I am counting by place and year:
library(dplyr)
dataset2 <- dataset %>%
  count(places, years)
places years n
1 London 2019 2
2 London 2020 1
3 Paris 2019 2
4 Rennes 2020 1
I want the table to show both years for each city, even when a city has no rows for that year:
places years n
1 London 2019 2
2 London 2020 1
3 Paris 2019 2
4 Paris 2020 NA # or better 0
5 Rennes 2019 NA # or better 0
6 Rennes 2020 1
Upvotes: 1
Views: 32
Reputation: 887251
We can use CJ from data.table:
library(data.table)
setDT(dataset)[, .N, .(years, places)][CJ(years, places, unique = TRUE), on = .(years, places)]
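The join leaves N as NA for combinations that do not occur in the data. If 0 is preferred, here is a minimal sketch, assuming the question's data rebuilt as a data.table called dt (a name used here only for illustration) and with the CJ() columns named explicitly so the join keys match:

```r
library(data.table)

places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dt <- data.table(years, places)

# Count rows per (years, places), then join onto every combination built by CJ()
res <- dt[, .N, by = .(years, places)][
  CJ(years = years, places = places, unique = TRUE),
  on = .(years, places)
]

# Combinations missing from the data come back with N = NA; set them to 0
res[is.na(N), N := 0L]
res
```

The rows come back in the order of the CJ() grid, that is sorted by years and then places.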
Upvotes: 0
Reputation: 389055
You could use complete from tidyr to fill in the missing combinations:
library(dplyr)
library(tidyr)
dataset %>% count(places, years) %>% complete(places, years, fill = list(n = 0))
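Conceptually, this builds every places/years combination, joins the observed counts onto it, and fills the gaps with 0. A rough base R sketch of the same idea, for illustration only (the names counts, grid and full are made up here, and this is not how tidyr implements complete internally):

```r
places <- c("London", "London", "London", "Paris", "Paris", "Rennes")
years <- c(2019, 2019, 2020, 2019, 2019, 2020)
dataset <- data.frame(years, places)

# Observed counts per (places, years) pair
counts <- aggregate(n ~ places + years,
                    data = transform(dataset, n = 1L),
                    FUN = sum)

# Every combination of the observed places and years
grid <- expand.grid(places = unique(dataset$places),
                    years = unique(dataset$years),
                    stringsAsFactors = FALSE)

# Left join the counts onto the grid, then turn missing counts into 0
full <- merge(grid, counts, by = c("places", "years"), all.x = TRUE)
full$n[is.na(full$n)] <- 0L
full <- full[order(full$places, full$years), ]
full
```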
If you convert years to a factor, you can specify .drop = FALSE in count:
dataset %>% mutate(years = factor(years)) %>% count(places, years, .drop = FALSE)
# places years n
# <fct> <fct> <int>
#1 London 2019 2
#2 London 2020 1
#3 Paris 2019 2
#4 Paris 2020 0
#5 Rennes 2019 0
#6 Rennes 2020 1
Upvotes: 2