Reputation: 3993
I have a data.table similar to this one:
require(data.table)
dt = as.data.table(iris[ names(iris) != 'Species' ])[1:26]
dt[ , id := LETTERS ]
setnames(dt, c('col1', 'col2', 'col3', 'col4', 'id'))
I want to calculate the absolute difference of col1 to every other numeric columns (i.e. col2, col3, col4) for every id/row. How can I do that with data.table or other tools in R?
The result should look like this:
+--------+-------------+-------------+-------------+------+
| "col1" | "col1_col2" | "col1_col3" | "col1_col4" | "id" |
+--------+-------------+-------------+-------------+------+
| 5.1 | 1.6 | 3.7 | 4.9 | "A" |
| 4.9 | 1.9 | 3.5 | 4.7 | "B" |
+--------+-------------+-------------+-------------+------+
Upvotes: 1
Views: 54
Reputation: 15072
Here is a dplyr
solution. mutate_at
allows you to flexibly apply a function to multiple columns, and you can do the same thing to rename the columns afterwards.
library(tidyverse)
tbl <- iris %>%
select(-Species) %>%
slice(1:26) %>%
mutate(id = LETTERS) %>%
`colnames<-`(c("col1", "col2", "col3", "col4", "id"))
tbl %>%
mutate_at(vars(col2:col4), funs(abs(col1 - .))) %>%
rename_at(vars(col2:col4), ~ str_c("col1_", .))
#> # A tibble: 26 x 5
#> col1 col1_col2 col1_col3 col1_col4 id
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 5.10 1.60 3.70 4.90 A
#> 2 4.90 1.90 3.50 4.70 B
#> 3 4.70 1.50 3.40 4.50 C
#> 4 4.60 1.50 3.10 4.40 D
#> 5 5.00 1.40 3.60 4.80 E
#> 6 5.40 1.50 3.70 5.00 F
#> 7 4.60 1.20 3.20 4.30 G
#> 8 5.00 1.60 3.50 4.80 H
#> 9 4.40 1.50 3.00 4.20 I
#> 10 4.90 1.80 3.40 4.80 J
#> # ... with 16 more rows
Created on 2018-04-04 by the reprex package (v0.2.0).
Upvotes: 0
Reputation: 49448
Remove the copy
if you don't care about modifying the original table.
cols = paste0('col', 2:4)
# or if you want to be fancy
cols = setdiff(names(dt)[sapply(dt, is.numeric)], 'col1')
copy(dt)[, (cols) := col1 - .SD, .SDcols = cols][]
# col1 col2 col3 col4 id
# 1: 5.1 1.6 3.7 4.9 A
# 2: 4.9 1.9 3.5 4.7 B
# 3: 4.7 1.5 3.4 4.5 C
# ...
Upvotes: 3