flee

Reputation: 1335

for-loop inside mutate and append result

I have a simple for-loop that works as I would like on vectors. I would like to apply the same loop to a column of a data frame, grouped by another column in the data frame, e.g.:

# here is my for-loop working as expected on a simple vector:

vect <- c(0.5, 0.7, 0.1) 
res <- vector(mode = "numeric", length = 3) 

for (i in 1:length(vect)) {
  res[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
}

res
[1] 1.9411537 0.9715143 5.5456579

And here is pseudo-code trying to do it on a column of a data frame:

#Example data
my.df <- data.frame(let = rep(LETTERS[1:3], each = 3), 
    num1 = 1:3, vect = c(0.5, 0.7, 0.1), num3 = NA)

 my.df
   let num1 vect num3
1   A    1  0.5   NA
2   A    2  0.7   NA
3   A    3  0.1   NA
4   B    1  0.5   NA
5   B    2  0.7   NA
6   B    3  0.1   NA
7   C    1  0.5   NA
8   C    2  0.7   NA
9   C    3  0.1   NA

# My attempt:

require(tidyverse)

  my.df <- my.df %>%
      group_by(let) %>%
      mutate(for (i in 1:length(vect)) {
        num3[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
  })

What the result should look like (my pseudo-code above doesn't work):

   let num1 vect    num3
1   A    1  0.5 1.9411537
2   A    2  0.7 0.9715143
3   A    3  0.1 5.5456579
4   B    1  0.5 1.9411537
5   B    2  0.7 0.9715143
6   B    3  0.1 5.5456579
7   C    1  0.5 1.9411537
8   C    2  0.7 0.9715143
9   C    3  0.1 5.5456579

I feel like I am not following tidyverse logic by trying to put a for-loop inside mutate. Any suggestions are much appreciated.

Upvotes: 1

Views: 189

Answers (4)

akrun

Reputation: 886968

Or, using data.table:

library(data.table)
setDT(my.df)[, num3 := unlist(lapply(seq_len(.N), 
         function(i) sum(exp(-2 * (vect[i] - vect[-i]))))), let]
my.df
#   let num1 vect      num3
#1:   A    1  0.5 1.9411537
#2:   A    2  0.7 0.9715143
#3:   A    3  0.1 5.5456579
#4:   B    1  0.5 1.9411537
#5:   B    2  0.7 0.9715143
#6:   B    3  0.1 5.5456579
#7:   C    1  0.5 1.9411537
#8:   C    2  0.7 0.9715143
#9:   C    3  0.1 5.5456579

Upvotes: 1

kath

Reputation: 7724

You can turn your for-loop into an sapply call and then use it in mutate. sapply takes a function and applies it to each element of a vector or list. In this case I'm looping over the indices of the elements in each group (1 to n()).

my.df %>% 
  group_by(let) %>% 
  mutate(num3 = sapply(1:n(), function(i) sum(exp(-2 * (vect[i] - vect[-i])))))

# A tibble: 9 x 4
# Groups:   let [3]
#   let    num1  vect  num3
#   <fct> <int> <dbl> <dbl>
# 1 A         1   0.5 1.94 
# 2 A         2   0.7 0.972
# 3 A         3   0.1 5.55 
# 4 B         1   0.5 1.94 
# 5 B         2   0.7 0.972
# 6 B         3   0.1 5.55 
# 7 C         1   0.5 1.94 
# 8 C         2   0.7 0.972
# 9 C         3   0.1 5.55 

This is essentially equivalent to the very wrong-looking for-loop inside a mutate call shown below. In this case, however, I'd prefer the custom function provided by A. Stam.

my.df %>%
  group_by(let) %>%
  mutate(num3 = {
    res <- numeric(length = n())
    for (i in 1:n()) {
      res[i] <- sum(exp(-2 * (vect[i] - vect[-i])))
    }
    res
  })

You can also replace sapply with purrr's map_dbl.
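
If you would rather stay in base R, vapply gives a similar type guarantee to map_dbl: it errors unless every call returns exactly one numeric value. A minimal sketch on the plain vector from the question; the helper name pairwise_sum is made up for illustration:

```r
# Hypothetical helper (name chosen for illustration only): for each
# element i, sum exp(-2 * (x[i] - x[j])) over all other elements j.
pairwise_sum <- function(x) {
  vapply(
    seq_along(x),
    function(i) sum(exp(-2 * (x[i] - x[-i]))),
    FUN.VALUE = numeric(1)  # each call must return a single double
  )
}

vect <- c(0.5, 0.7, 0.1)
res_vapply <- pairwise_sum(vect)
```

Inside the pipeline this would be used as mutate(num3 = pairwise_sum(vect)) after group_by(let).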

Upvotes: 1

Ronak Shah

Reputation: 388817

We can use map_dbl from purrr and apply the formula for calculation.

library(dplyr)
library(purrr)

my.df %>%
  group_by(let) %>%
  mutate(num3 = map_dbl(seq_along(vect), ~ sum(exp(-2 * (vect[.] - vect[-.])))))


#   let    num1  vect  num3
#  <fct> <int> <dbl> <dbl>
#1  A         1   0.5 1.94 
#2  A         2   0.7 0.972
#3  A         3   0.1 5.55 
#4  B         1   0.5 1.94 
#5  B         2   0.7 0.972
#6  B         3   0.1 5.55 
#7  C         1   0.5 1.94 
#8  C         2   0.7 0.972
#9  C         3   0.1 5.55 

Upvotes: 2

A. Stam

Reputation: 2222

The simple solution is to create a custom function and pass that to mutate. A working solution:

custom_func <- function(vec) {
  # Preallocate the result to match the length of the input
  res <- numeric(length(vec))
  for (i in seq_along(vec)) {
    # Use the argument `vec`, not a global `vect`
    res[i] <- sum(exp(-2 * (vec[i] - vec[-i])))
  }
  res
}

library(tidyverse)

my.df %>%
  group_by(let) %>%
  mutate(num3 = custom_func(vect))

#> # A tibble: 9 x 4
#> # Groups:   let [3]
#>   let    num1  vect  num3
#>   <fct> <int> <dbl> <dbl>
#> 1 A         1   0.5 1.94 
#> 2 A         2   0.7 0.972
#> 3 A         3   0.1 5.55 
#> 4 B         1   0.5 1.94 
#> 5 B         2   0.7 0.972
#> 6 B         3   0.1 5.55 
#> 7 C         1   0.5 1.94 
#> 8 C         2   0.7 0.972
#> 9 C         3   0.1 5.55 

I'm wondering whether a more elegant version of the custom function is possible - perhaps someone smarter than me can tell you whether purrr::map, for example, could provide an alternative.
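
One possible loop-free version, offered as a sketch rather than a definitive answer: outer() builds the full matrix of pairwise differences, and because the diagonal term contributes exp(0) = 1 to every row sum, subtracting 1 drops the i == j case. The function name custom_func_vec is made up here, and the grouped step uses base R's ave() so that the sketch runs without extra packages:

```r
# Hypothetical vectorized variant (the name is illustrative only).
custom_func_vec <- function(vec) {
  # outer(vec, vec, "-")[i, j] is vec[i] - vec[j]
  pairwise <- exp(-2 * outer(vec, vec, "-"))
  # The diagonal (i == j) equals exp(0) = 1, so remove it from each row sum
  rowSums(pairwise) - 1
}

my.df <- data.frame(let = rep(LETTERS[1:3], each = 3),
                    num1 = 1:3, vect = c(0.5, 0.7, 0.1))

# ave() applies the function separately within each level of `let`
my.df$num3 <- ave(my.df$vect, my.df$let, FUN = custom_func_vec)
```

With dplyr the same function should drop in as mutate(num3 = custom_func_vec(vect)) after group_by(let). Note that the intermediate matrix grows quadratically with group size, so for very large groups the sapply or map_dbl versions may be the safer choice.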

Upvotes: 2
