Reputation: 1335
I have a simple for-loop that works as I would like on vectors. I would like to use the same for-loop on a column of a dataframe, grouped by another column in the dataframe, e.g.:
# here is my for-loop working as expected on a simple vector:
vect <- c(0.5, 0.7, 0.1)
res <- vector(mode = "numeric", length = 3)
for (i in 1:length(vect)) {
res[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
}
res
[1] 1.9411537 0.9715143 5.5456579
And here is pseudo-code trying to do it on a column of a dataframe:
#Example data
my.df <- data.frame(let = rep(LETTERS[1:3], each = 3),
num1 = 1:3, vect = c(0.5, 0.7, 0.1), num3 = NA)
my.df
let num1 vect num3
1 A 1 0.5 NA
2 A 2 0.7 NA
3 A 3 0.1 NA
4 B 1 0.5 NA
5 B 2 0.7 NA
6 B 3 0.1 NA
7 C 1 0.5 NA
8 C 2 0.7 NA
9 C 3 0.1 NA
# My attempt:
require(tidyverse)
my.df <- my.df %>%
group_by(let) %>%
mutate(for (i in 1:length(vect)) {
num3[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
})
What the result should look like (but my pseudo-code above doesn't work):
let num1 vect num3
1 A 1 0.5 1.9411537
2 A 2 0.7 0.9715143
3 A 3 0.1 5.5456579
4 B 1 0.5 1.9411537
5 B 2 0.7 0.9715143
6 B 3 0.1 5.5456579
7 C 1 0.5 1.9411537
8 C 2 0.7 0.9715143
9 C 3 0.1 5.5456579
I feel like I am not following tidyverse logic by trying to put a for-loop inside mutate. Any suggestions are much appreciated.
Upvotes: 1
Views: 189
Reputation: 886968
Or using data.table
library(data.table)
setDT(my.df)[, num3 := unlist(lapply(seq_len(.N),
function(i) sum(exp(-2 * (vect[i] - vect[-i]))))), let]
my.df
# let num1 vect num3
#1: A 1 0.5 1.9411537
#2: A 2 0.7 0.9715143
#3: A 3 0.1 5.5456579
#4: B 1 0.5 1.9411537
#5: B 2 0.7 0.9715143
#6: B 3 0.1 5.5456579
#7: C 1 0.5 1.9411537
#8: C 2 0.7 0.9715143
#9: C 3 0.1 5.5456579
Upvotes: 1
Reputation: 7724
You can turn your for-loop into a sapply call and then use it in mutate. sapply takes a function and applies it to each element of a vector or list. In this case I'm looping over the indices 1 to n(), where n() is the number of rows in the current group.
my.df %>%
group_by(let) %>%
mutate(num3 = sapply(1:n(), function(i) sum(exp(-2 * (vect[i] - vect[-i])))))
# A tibble: 9 x 4
# Groups: let [3]
# let num1 vect num3
# <fct> <int> <dbl> <dbl>
# 1 A 1 0.5 1.94
# 2 A 2 0.7 0.972
# 3 A 3 0.1 5.55
# 4 B 1 0.5 1.94
# 5 B 2 0.7 0.972
# 6 B 3 0.1 5.55
# 7 C 1 0.5 1.94
# 8 C 2 0.7 0.972
# 9 C 3 0.1 5.55
This is essentially equivalent to the following very wrong-looking for-loop inside a mutate call. In this case, however, I'd prefer the custom function provided by A. Stam.
my.df %>%
group_by(let) %>%
mutate(num3 = {
res <- numeric(length = n())
for (i in 1:n()) {
res[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
}
res
})
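The braces are what make this work, and the reason the attempt in the question fails can be sketched in base R: a bare for-loop evaluates to NULL, whereas a braced block evaluates to its last expression. The names loop_value and block_value below are only illustrative.

```r
# A bare for-loop evaluates (invisibly) to NULL, so mutate() gets no column from it.
loop_value <- for (i in 1:3) i
is.null(loop_value)

# A braced block evaluates to its last expression, here the filled vector `res`.
vect <- c(0.5, 0.7, 0.1)
block_value <- {
  res <- numeric(length(vect))
  for (i in seq_along(vect)) {
    res[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
  }
  res
}
block_value
```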
You can also replace sapply with purrr's map_dbl.
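If you would rather stay in base R, vapply is a stricter relative of sapply that states the expected return type up front. A minimal sketch, where pairwise_sum is a hypothetical helper name and not something from the question:

```r
# Hypothetical helper: for each element i, sum exp(-2 * (v[i] - v[j])) over all j != i.
pairwise_sum <- function(v) {
  vapply(
    seq_along(v),
    function(i) sum(exp(-2 * (v[i] - v[-i]))),
    numeric(1)  # each call must return exactly one number
  )
}

vect <- c(0.5, 0.7, 0.1)
pairwise_sum(vect)
```

Inside the grouped pipeline this should then be usable as mutate(num3 = pairwise_sum(vect)).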
Upvotes: 1
Reputation: 388817
We can use map_dbl from purrr and apply the formula for the calculation.
library(dplyr)
library(purrr)
my.df %>%
group_by(let) %>%
mutate(num3 = map_dbl(seq_along(vect), ~ sum(exp(-2 * (vect[.] - vect[-.])))))
# let num1 vect num3
# <fct> <int> <dbl> <dbl>
#1 A 1 0.5 1.94
#2 A 2 0.7 0.972
#3 A 3 0.1 5.55
#4 B 1 0.5 1.94
#5 B 2 0.7 0.972
#6 B 3 0.1 5.55
#7 C 1 0.5 1.94
#8 C 2 0.7 0.972
#9 C 3 0.1 5.55
Upvotes: 2
Reputation: 2222
The simple solution is to create a custom function and pass that to mutate. A working solution:
custom_func <- function(vec) {
  # pre-allocate one result per element of the group
  res <- numeric(length(vec))
  for (i in seq_along(vec)) {
    # compare element i with every other element of the same group
    res[i] <- sum(exp(-2 * (vec[i] - vec[-i])))
  }
  res
}
library(tidyverse)
my.df %>%
group_by(let) %>%
mutate(num3 = custom_func(vect))
#> # A tibble: 9 x 4
#> # Groups: let [3]
#> let num1 vect num3
#> <fct> <int> <dbl> <dbl>
#> 1 A 1 0.5 1.94
#> 2 A 2 0.7 0.972
#> 3 A 3 0.1 5.55
#> 4 B 1 0.5 1.94
#> 5 B 2 0.7 0.972
#> 6 B 3 0.1 5.55
#> 7 C 1 0.5 1.94
#> 8 C 2 0.7 0.972
#> 9 C 3 0.1 5.55
I'm wondering whether a more elegant version of the custom function is possible. Perhaps someone smarter than me can tell you whether purrr::map, for example, could provide an alternative.
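One loop-free possibility, offered only as a sketch: build all pairwise differences with outer and drop the self-comparison, which always contributes exp(0) = 1. The name pairwise_sum_outer is hypothetical, and the approach assumes groups small enough that an n-by-n matrix per group is acceptable. The example uses base R's ave for the grouping so that it runs without extra packages.

```r
# Hypothetical vectorised helper.
# outer(v, v, "-")[i, j] is v[i] - v[j]; each row sum includes the diagonal
# term exp(0) = 1, which is subtracted to exclude j == i.
pairwise_sum_outer <- function(v) {
  rowSums(exp(-2 * outer(v, v, "-"))) - 1
}

my.df <- data.frame(let = rep(LETTERS[1:3], each = 3),
                    num1 = 1:3, vect = c(0.5, 0.7, 0.1))

# ave() applies the function within each level of `let` and keeps row order.
my.df$num3 <- ave(my.df$vect, my.df$let, FUN = pairwise_sum_outer)
my.df
```

Because of the subtraction, results may differ from the loop version by floating-point rounding. In the dplyr pipeline the same helper should drop in as mutate(num3 = pairwise_sum_outer(vect)).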
Upvotes: 2