Copenhagen_dpq922

Reputation: 3

dplyr mutate: Recursive indexing failed

I have an issue with mutate and a self-written function. My data is basically as follows:

license_sets <- list(x = c("A", "B"), y = c("C", "D", "E"))
license_data <- data.frame(license = c("A","B","C","D","E"), bidder = c("x","x","y","y","y"))
source_data <- expand.grid(license_i = c("A","B","C","D","E"), license_j = c("A","B","C","D","E"))
source_data$value <- c(1:25)

The function I want to apply reads as follows:

compute_set <- function(i, J) {
  tmp <- source_data %>% 
    filter(license_i == i, license_j %in% J)
  return(sum(tmp$value))
}

I now want to apply the function via mutate:

license_data %>% mutate(z = compute_set(license, license_sets[[bidder]]))

I get the following error and warning messages:

Error in mutate_impl(.data, dots) : 
  Evaluation error: Evaluation error: recursive indexing failed at level 2
..
In addition: Warning messages:
1: In is.na(e1) | is.na(e2) :
  longer object length is not a multiple of shorter object length
2: In `==.default`(license_i, i) :
  longer object length is not a multiple of shorter object length

If I run the same function in a simple for-loop, it works fine. Does anyone know what the problem is here? It must have something to do with mutate, right? I have already tried as.character(bidder) and other suggestions I found here, but nothing has worked so far. I should add that the data frames I'm dealing with are much bigger than the ones shown here, so a for-loop is not feasible. (I'm therefore also grateful for hints on simplifying the function.)

Upvotes: 0

Views: 892

Answers (1)

thothal

Reputation: 20329

The problem is that inside mutate each column is passed to your function as a whole vector, not row by row, as you can see here:

license_data %>% mutate(z = {print(list(bidder, license));
                             compute_set(license, license_sets[[bidder]])})
# [[1]]
# [1] x x y y y
# Levels: x y
# [[2]]
# [1] A B C D E
# Levels: A B C D E
# Error in license_sets[[bidder]] : recursive indexing failed at level 2

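Indexing a list in this way does not work, because [[ with an index of length greater than one indexes recursively instead of selecting several elements: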

license_sets[[license_data$bidder]]
# Error in license_sets[[license_data$bidder]] : 
#   recursive indexing failed at level 2
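
To make the recursive behaviour of [[ concrete, here is a minimal base R sketch. It reuses license_sets from the question; the numeric indices are only illustrative stand-ins for the factor codes of bidder.

```r
# The named list from the question
license_sets <- list(x = c("A", "B"), y = c("C", "D", "E"))

# A length-1 index selects exactly one list element
one_set <- license_sets[["y"]]

# A length-2 index is applied recursively:
# element 2 of the list, then entry 3 of that element
second_third <- license_sets[[c(2, 3)]]   # same as license_sets[[2]][[3]]

# A longer index tries to descend into a plain character vector and fails;
# tryCatch captures the message instead of stopping the script
msg <- tryCatch(
  license_sets[[c(1, 1, 2, 2, 2)]],
  error = function(e) conditionMessage(e)
)
print(msg)
```

The last call mirrors what happens inside mutate, where bidder arrives as a vector of length five.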

So you want to map over the vectors element by element instead (map2 comes from purrr):

license_data %>% 
  mutate(z = map2(bidder, license, ~ compute_set(.y, license_sets[[.x]])))
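
Note that map2 returns a list column. As a self-contained sketch of the same idea, the version below swaps in map2_dbl to get a plain numeric column and builds the data with stringsAsFactors = FALSE; both choices are my assumptions, not part of the original code.

```r
library(dplyr)
library(purrr)

# Data from the question, built with character columns instead of factors
license_sets <- list(x = c("A", "B"), y = c("C", "D", "E"))
license_data <- data.frame(license = c("A", "B", "C", "D", "E"),
                           bidder  = c("x", "x", "y", "y", "y"),
                           stringsAsFactors = FALSE)
source_data <- expand.grid(license_i = c("A", "B", "C", "D", "E"),
                           license_j = c("A", "B", "C", "D", "E"),
                           stringsAsFactors = FALSE)
source_data$value <- 1:25

# Scalar function from the question: i is one license, J is a set of licenses
compute_set <- function(i, J) {
  tmp <- source_data %>%
    filter(license_i == i, license_j %in% J)
  sum(tmp$value)
}

# map2_dbl calls compute_set once per row and returns a numeric vector
result <- license_data %>%
  mutate(z = map2_dbl(bidder, license, ~ compute_set(.y, license_sets[[.x]])))

print(result$z)
# 7 9 54 57 60
```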

Vectorization

As @[docendo discimus] pointed out, the issue with your function is that it is not vectorized, i.e. it can only handle a scalar i. You can vectorize your function so that it works as intended:

compute_set_v <- Vectorize(compute_set)
license_data %>% 
   ## add the list content directly to the data frame 
   mutate(bidder_set = map(bidder, ~ license_sets[[.]]),
          z          = compute_set_v(license, bidder_set))
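
Vectorize is essentially a wrapper around mapply, so the call above loops over license and bidder_set in parallel. The sketch below shows that equivalence outside of mutate; the standalone vectors license, bidder and bidder_set are illustrative stand-ins for the data frame columns.

```r
library(dplyr)

license_sets <- list(x = c("A", "B"), y = c("C", "D", "E"))
source_data <- expand.grid(license_i = c("A", "B", "C", "D", "E"),
                           license_j = c("A", "B", "C", "D", "E"),
                           stringsAsFactors = FALSE)
source_data$value <- 1:25

# Scalar function from the question
compute_set <- function(i, J) {
  tmp <- source_data %>%
    filter(license_i == i, license_j %in% J)
  sum(tmp$value)
}

# Stand-ins for the columns of license_data
license    <- c("A", "B", "C", "D", "E")
bidder     <- c("x", "x", "y", "y", "y")
bidder_set <- license_sets[bidder]   # list of length 5, one license set per row

# Both calls run compute_set once per (license, bidder_set) pair
via_vectorize <- Vectorize(compute_set)(license, bidder_set)
via_mapply    <- mapply(compute_set, license, bidder_set)

print(unname(via_vectorize))
# 7 9 54 57 60
```

Because this is still one function call per row, it is more convenient than a for-loop but not necessarily faster on large data.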

Note

Before R 4.0.0, data.frame has the nasty habit of treating strings as factors (expand.grid still does by default), so you may want to add stringsAsFactors = FALSE in your data.frame construction.
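
Why this matters here: indexing with a factor uses its integer codes, not its labels. The sketch below uses a hypothetical list, sets, whose names are deliberately in a different order than the factor levels, to show how the two lookups can disagree.

```r
# Hypothetical list whose names are NOT in the same order as the factor levels
sets <- list(y = "second", x = "first")

# A single bidder stored as a factor, as data.frame would create it
# when strings are converted to factors
f <- factor("x", levels = c("x", "y"))

# Indexing by factor uses the integer code (1), so the first element is returned
by_factor <- sets[[f]]

# Converting to character first looks the element up by name
by_name <- sets[[as.character(f)]]

print(c(by_factor, by_name))
# "second" "first"
```

In the question's data the list names and factor levels happen to line up, so the factor lookup gives the right answer by coincidence.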

Upvotes: 2
