Reputation: 8572
In a specific problem, I am trying to match a number/string to various subgroups.
set.seed(1)
n <- 1e5
groups <- list(0, 1:3, 5:9, 10:33,
35, 36:39, 41:43,
45:47, 49:53,
55:56, 58:63,
64:65, 68, 69:75,
77:82, 84, 85,
86:89, 90:93,
94:96, 97:98, 99)
dat <- sample(unlist(groups), n, TRUE)
So, for each element of dat, I want to know which element of groups contains it. One method would be using *apply or the equivalent for loop:
out <- integer(n)
for(i in seq_along(out))
out[i] <- which(sapply(groups, function(x)dat[i] %in% x))
table(out)
#out
# 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
# 1144 3384 5587 26892 1165 4501 3348 3299 5702 2177 6751 2218 1134 7810 6792 1106 1091 4526 4606 3370 2246 1151
But is there a more concise method?
Note that the final result is out; the table is only there to make the match easy to check. That is, the final result should match out and not table(out).
Upvotes: 2
Views: 103
Reputation: 887118
Another option is to use stack or enframe to put the groups into a single dataset and then do the count:
library(dplyr)
stack(setNames(groups, seq_along(groups))) %>%
group_by(ind) %>%
summarise(Count = sum(dat %in% values))
# A tibble: 22 x 2
# ind Count
# <fct> <int>
# 1 1 1144
# 2 2 3384
# 3 3 5587
# 4 4 26892
# 5 5 1165
# 6 6 4501
# 7 7 3348
# 8 8 3299
# 9 9 5702
#10 10 2177
# … with 12 more rows
Or with enframe
library(tibble)
library(tidyr)
enframe(groups) %>%
unnest(c(value)) %>%
group_by(name) %>%
summarise(Count = sum(dat %in% value))
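Both pipelines above return the per-group counts. If the per-element group index (out in the question) is needed instead, the same long format can be reused with match. This is only a minimal sketch on toy data: the small groups and dat below are made-up values, not the ones from the question, and long is a hypothetical helper name.

```r
# Toy data (assumed for illustration, much smaller than in the question)
groups <- list(0, 1:3, 5:9)
dat <- c(3, 0, 7, 1, 5)

# Long format: one row per value; `ind` holds the group number as a factor
long <- stack(setNames(groups, seq_along(groups)))

# match() locates each element of dat in long$values; go through
# as.character() so the factor labels, not the level codes, are converted
out <- as.integer(as.character(long$ind[match(dat, long$values)]))
out
# [1] 2 1 3 2 3
```

A value of dat that appears in no group would come back as NA here, because match returns NA for it.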
Upvotes: 1
Reputation: 39657
There is no need for the for loop. To get the counts you can use dat directly.
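out <- integer(length(dat))  # preallocate, in case out does not exist yet
i <- lapply(groups, function(x) which(dat %in% x))  # positions in dat per group
out[unlist(i)] <- rep(seq_along(i), lengths(i))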
table(out)
#out
# 1 2 3 4 5 6 7 8 9 10 11 12 13
# 1144 3384 5587 26892 1165 4501 3348 3299 5702 2177 6751 2218 1134
# 14 15 16 17 18 19 20 21 22
# 7810 6792 1106 1091 4526 4606 3370 2246 1151
You can also use a lookup table:
lup <- numeric(0)
# +1 because the values start at 0 and R indexes from 1
lup[unlist(groups)+1] <- rep(seq_along(groups), lengths(groups))
out <- lup[dat+1]
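To make it easier to see what the lookup vector holds, here is a small sketch on toy data (the short groups and dat are assumed values, not the ones from the question). It also shows a match-based variant, which is a different technique from the lookup vector but gives the same result and does not need the +1 offset; out_match and group_id are hypothetical names.

```r
# Toy data (assumed for illustration)
groups <- list(0, 1:3, 5:9)
dat <- c(3, 0, 7, 1, 5)

# Lookup vector: position v + 1 holds the group index of value v
lup <- integer(0)
lup[unlist(groups) + 1] <- rep(seq_along(groups), lengths(groups))
lup
# [1]  1  2  2  2 NA  3  3  3  3  3

out <- lup[dat + 1]
out
# [1] 2 1 3 2 3

# Variant with match(): look up the position of each value in the
# flattened groups, then read off the group index at that position
group_id <- rep(seq_along(groups), lengths(groups))
out_match <- group_id[match(dat, unlist(groups))]
identical(out, out_match)
# [1] TRUE
```

The NA in lup is the slot for the value 4, which belongs to no group. The lookup vector only works for non-negative integers; the match variant should also work for strings.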
Upvotes: 3