Reputation: 5
I am trying to modify a dataframe of two columns, to add a third that returns four possible expressions depending on the contents of the other columns (i.e. whether each is positive or negative).
I have tried a couple of approaches, the 'mutate' function in dplyr as well as sapply. Unfortunately I seem to be missing something as I get the error "the condition has length > 1 and only the first element will be used". So only the first iteration is applied to each row in the new column.
A reproducible example (of the mutate approach I've tried) is as follows:
Costs <- c(2, -5, -7, 3, 12)
Outcomes <- c(-2, 5, -7, 3, -2)
results <- as.data.frame(cbind(Costs, Outcomes))
results
quadrant <- function(cost,outcome) {
  if (costs < 0 &
      outcomes < 0) {
    "SW Quadrant"
  }
  else if (costs<0 & outcomes>0){
    "Dominant"
  }
  else if (costs>0 & outcomes<0){
    "Dominated"
  }
  else{""}
}
results <- mutate(results,Quadrant = quadrant(Costs,Outcomes)
)
The full warning message is:
Warning messages:
1: Problem with `mutate()` input `Quadrant`.
i the condition has length > 1 and only the first element will be used
i Input `Quadrant` is `quadrant(results$Costs, results$Outcomes)`.
2: In if (costs < 0 & outcomes < 0) { :
  the condition has length > 1 and only the first element will be used
3: Problem with `mutate()` input `Quadrant`.
i the condition has length > 1 and only the first element will be used
i Input `Quadrant` is `quadrant(results$Costs, results$Outcomes)`.
4: In if (costs < 0 & outcomes > 0) { :
  the condition has length > 1 and only the first element will be used
5: Problem with `mutate()` input `Quadrant`.
i the condition has length > 1 and only the first element will be used
i Input `Quadrant` is `quadrant(results$Costs, results$Outcomes)`.
6: In if (costs > 0 & outcomes < 0) { :
  the condition has length > 1 and only the first element will be used
My attempt at the sapply function:
results <- sapply(results$Quadrant,quadrant(results$Costs,results$Outcomes))
This leads to the following error, along with the same warning messages as the mutate approach.
Error in get(as.character(FUN), mode = "function", envir = envir) : object 'Dominated' of mode 'function' was not found
I'm sure I'm missing something obvious here. Grateful for any suggestions.
Upvotes: 0
Views: 212
Reputation: 160447
There are two things going wrong with that function.
1. The arguments are named cost and outcome, but the body uses costs and outcomes.
2. if strictly requires a logical condition of length 1, and the code breaks that in two ways: it uses &, which should almost never be used bare like this in an if statement (use && there), and it is passed vectors, so cost < 0 returns a logical vector the same length as cost (which is greater than 1 here).
Suggestions:
quadrant_sgl <- function(cost, outcome) {
  if (cost < 0 && outcome < 0) return("SW Quadrant")
  if (cost < 0 && outcome > 0) return("Dominant")
  if (cost > 0 && outcome < 0) return("Dominated")
  return("")
}
quadrant_vec1 <- function(cost, outcome) {
  ifelse(cost < 0 & outcome < 0, "SW Quadrant",
         ifelse(cost < 0 & outcome > 0, "Dominant",
                ifelse(cost > 0 & outcome < 0, "Dominated",
                       "")))
}
quadrant_vec2 <- function(cost, outcome) {
  ifelse(cost < 0,
         ifelse(outcome < 0, "SW Quadrant", "Dominant"),
         ifelse(outcome < 0, "Dominated", ""))
}
quadrant_vec3 <- function(cost, outcome) {
  dplyr::case_when(
    cost < 0 & outcome < 0 ~ "SW Quadrant",
    cost < 0 & outcome > 0 ~ "Dominant",
    cost > 0 & outcome < 0 ~ "Dominated",
    TRUE ~ ""
  )
}
quadrant_vec4 <- function(cost, outcome) {
  data.table::fcase(
    cost < 0 & outcome < 0, "SW Quadrant",
    cost < 0 & outcome > 0, "Dominant",
    cost > 0 & outcome < 0, "Dominated",
    rep(TRUE, length(cost)), ""
  )
}
The first function (quadrant_sgl) remains single-operation (not vectorized); wrapping it in Vectorize, as in the demo below, turns it into a vectorized function. If you aren't familiar with the concept of vectorization, know that (1) R does it well, (2) R prefers it, and (3) this is not the best venue to talk at length about this. Search for "R vectorization" and you should find plenty of material on this.
Because of this, the first one is just a demonstration of what to do when a function cannot (due to time, programming skill, or something else) be rewritten in a vectorize-friendly way: use Vectorize.
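As a side note, Vectorize is documented as a wrapper around mapply, so the scalar function can also be applied pairwise with base R directly. The sketch below re-declares quadrant_sgl and the question's data so that it runs on its own; the variable name quadrants is just illustrative.

```r
# Scalar version from above: expects one cost and one outcome at a time.
quadrant_sgl <- function(cost, outcome) {
  if (cost < 0 && outcome < 0) return("SW Quadrant")
  if (cost < 0 && outcome > 0) return("Dominant")
  if (cost > 0 && outcome < 0) return("Dominated")
  return("")
}

costs <- c(2, -5, -7, 3, 12)
outcomes <- c(-2, 5, -7, 3, -2)

# mapply calls quadrant_sgl once per (cost, outcome) pair.
quadrants <- mapply(quadrant_sgl, costs, outcomes)
quadrants
# [1] "Dominated"   "Dominant"    "SW Quadrant" ""            "Dominated"
```

This is also roughly what the sapply attempt in the question was reaching for: the second argument has to be the function itself, not the result of calling it.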
The other functions are all roughly equivalent.
If you are using dplyr and friends, then I strongly recommend quadrant_vec3, since it is (IMO) much easier to read and maintain than nested ifelse calls. (BTW: if you must nest ifelse, then at least use nested dplyr::if_else calls, as they are generally safer than base R's ifelse.)
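To make the if_else remark concrete, here is a minimal sketch, assuming dplyr is installed. The name quadrant_if_else is hypothetical; it is quadrant_vec2 with dplyr::if_else swapped in. The last two lines show one commonly cited difference: base ifelse drops attributes such as the Date class, while if_else keeps them.

```r
library(dplyr)

# Hypothetical variant of quadrant_vec2 using dplyr::if_else instead of ifelse.
quadrant_if_else <- function(cost, outcome) {
  if_else(cost < 0,
          if_else(outcome < 0, "SW Quadrant", "Dominant"),
          if_else(outcome < 0, "Dominated", ""))
}

res <- quadrant_if_else(c(2, -5, -7, 3, 12), c(-2, 5, -7, 3, -2))
res
# [1] "Dominated"   "Dominant"    "SW Quadrant" ""            "Dominated"

# Attribute handling: base ifelse returns a bare number, if_else keeps the Date.
d <- as.Date("2020-01-01")
class(ifelse(TRUE, d, d))   # "numeric"
class(if_else(TRUE, d, d))  # "Date"
```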
If you are venturing into the world of data.table, then quadrant_vec4 is the equivalent using data.table's own fcase function, which works mostly the same as case_when.
Demo:
Vectorize(quadrant_sgl, vectorize.args = c("cost", "outcome"))(results$Costs, results$Outcomes)
# [1] "Dominated" "Dominant" "SW Quadrant" "" "Dominated"
quadrant_vec1(results$Costs, results$Outcomes)
# [1] "Dominated" "Dominant" "SW Quadrant" "" "Dominated"
quadrant_vec2(results$Costs, results$Outcomes)
# [1] "Dominated" "Dominant" "SW Quadrant" "" "Dominated"
quadrant_vec3(results$Costs, results$Outcomes)
# [1] "Dominated" "Dominant" "SW Quadrant" "" "Dominated"
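The demo above does not call quadrant_vec4. Assuming data.table is installed, it can be exercised the same way; the sketch below repeats the function and the question's data so it is self-contained, and the variable name res4 is only for illustration.

```r
# Same definition as quadrant_vec4 above.
quadrant_vec4 <- function(cost, outcome) {
  data.table::fcase(
    cost < 0 & outcome < 0, "SW Quadrant",
    cost < 0 & outcome > 0, "Dominant",
    cost > 0 & outcome < 0, "Dominated",
    rep(TRUE, length(cost)), ""
  )
}

results <- data.frame(Costs = c(2, -5, -7, 3, 12),
                      Outcomes = c(-2, 5, -7, 3, -2))

res4 <- quadrant_vec4(results$Costs, results$Outcomes)
res4
# [1] "Dominated"   "Dominant"    "SW Quadrant" ""            "Dominated"
```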
Upvotes: 2
Reputation: 21937
You probably want something more like:
library(dplyr)  # provides %>%, mutate() and case_when()
costs <- c(2, -5, -7, 3, 12)
outcomes <- c(-2, 5, -7, 3, -2)
results <- as.data.frame(cbind(costs, outcomes))
# case_when evaluates each condition element-wise and uses the first match per row
results <- results %>% mutate(Quadrant = case_when(
  outcomes < 0 & costs < 0 ~ "SW Quadrant",
  costs < 0 & outcomes > 0 ~ "Dominant",
  costs > 0 & outcomes < 0 ~ "Dominated",
  TRUE ~ ""))
results
#   costs outcomes    Quadrant
# 1     2       -2   Dominated
# 2    -5        5    Dominant
# 3    -7       -7 SW Quadrant
# 4     3        3
# 5    12       -2   Dominated
Upvotes: 2