Reputation: 188
I am trying to subset a data table by the numeric values so I can perform a five number summary on the numeric variables only. However, I also need to group the variables. The way that I was trying to do it does not allow me to use the subset and the id variable that is not part of the subset. I know that data table has the .SD
command, but I cannot seem to get the right combination of apply functions and group in data table. The id
variable is not numeric and cannot be coerced into being numeric; it is also not unique in my data table.
Here is what I have tried:
library(data.table)
library(magrittr)
dt <- data.table(num1 = rep(1, 10),
num2 = rep(2, 10),
num3 = rep(100, 10),
id = c("1a", "2b", "2h", "3b", "4b", "5b", "5b", "7n", "8mn", "9y"),
char1 = rep("a", 10),
char2 = rep("b", 10))
numeric_variables <-
lapply(dt, is.numeric) %>%
unlist() %>%
as.vector()
dt[, numeric_variables, with = FALSE]
dt_summary <-
apply(dt[, numeric_variables, with = FALSE][, grep("num",
names(dt[, numeric_variables, with = FALSE]),
value = TRUE),
with = FALSE],
2,
fivenum) %>%
as.data.frame()
rownames(dt_summary) <-
c("Min", "Q1", "Med", "Q3", "Max")
dt_summary
dt[, .(numeric_variables, id), with = FALSE]
The final line does not work because id
is not in the numeric_variables category I created. If someone could direct me to using the correct by
, tapply
function with .SD
I would appreciate it.
NOTE: This is part of a larger program where the user can either select one id
to look at or compare two id
variables at once. So, it needs to work for either one or many groups (eventually).
Upvotes: 2
Views: 171
Reputation: 21507
i guess you are looking for .SDcols
ind <- sapply(dt, is.numeric)
(dt_summary <- dt[,lapply(.SD, fivenum), .SDcols = ind])
Which gives you
num1 num2 num3
1: 1 2 100
2: 1 2 100
3: 1 2 100
4: 1 2 100
5: 1 2 100
As data.table
does not accept rownames you can do:
setDF(dt_summary)
rownames(dt_summary) <-
c("Min", "Q1", "Med", "Q3", "Max")
> dt_summary
num1 num2 num3
Min 1 2 100
Q1 1 2 100
Med 1 2 100
Q3 1 2 100
Max 1 2 100
Upvotes: 3