Reputation: 861
Lets say I have a data table like:
year city
2026 NYC
2026 NYC
2026 NYC
2026 LA
2027 LA
2028 NYC
2028 NYC
Which can be created by:
dt <- structure(list(location = c("NYC", "NYC", "NYC","LA", "LA", "NYC", "NYC"),
year = c(2026, 2026, 2026, 2026, 2027, 2028, 2028)),
class = "data.table",
row.names = c(NA, -7L))
I want to count number of unique cities in a given year. Lets say 2026
.
So, the outcome in this case would be 2, because there is just NYC
and LA
.
What goes in the last line of following?
dt %>%
filter(year == 2026) %>%
What goes here?
Upvotes: 0
Views: 47
Reputation: 887501
We can use data.table
library(data.table)
setDT(dt)[year == 2026, .(count = uniqueN(location))]
# count
#1: 2
Or with `base R
length(table(subset(dt, year == 2026, select = location)))
#[1] 2
Upvotes: 1
Reputation: 389135
We can use n_distinct
to get the number of unique values
library(dplyr)
dt %>%
filter(year == 2026) %>%
summarise(count = n_distinct(city))
# count
#1 2
Or add the filtering step in summarise itself
dt %>% summarise(count = n_distinct(city[year == 2026]))
Or if we want it as a vector we can add pull(count)
dt %>%
filter(year == 2026) %>%
summarise(count = n_distinct(city)) %>%
pull(count)
#[1] 2
In base R, this is equivalent to
length(unique(dt$city[dt$year == 2026]))
#[1] 2
Upvotes: 1