Reputation: 1
I am trying to fix dates (years) using a function:
change_century <- function(x){
a <- year(x)
ifelse(test = a >2020,yes = year(x) <- (year(x)-100),no = year(x) <- a)
return(x)
}
The function works for a specific row, or when looping over one column (here, date of birth):
for (i in c(1:nrow(Df))){
Df_recode$DOB[i] <- change_century(Df$DOB[i])
}
Then I try to use mutate/across
Df_recode <- Df %>% mutate(across(list_variable_date,~change_century(.)))
It does not work. Is there something I am getting wrong? Thank you!
Upvotes: 0
Views: 139
Reputation: 160607
Try:
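library(lubridate)  # provides year()

change_century <- function(x){
  a <- year(x)
  # capture the value that ifelse() returns; do not assign inside the call
  newx <- ifelse(test = a > 2020, yes = a - 100, no = a)
  return(newx)
}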
(Frankly, the use of newx as temporary storage and then returning it was done that way solely to introduce minimal changes in your code. In general, in this case one does not need return; in fact, it theoretically adds an unnecessary function call to the evaluation stack. I would tend to have two lines in that function: a <- year(x) and ifelse(..), without assignment. The default behavior in R is to return the value of the last expression, which in my case would be the result of ifelse, which is what we want. Assigning it to newx and then calling return(newx), or even just having newx as the last expression, has exactly the same effect.)
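As a minimal sketch of that equivalence, the three hypothetical variants below should all return the same vector. The function names are mine, and they take plain numeric years so that nothing beyond base R is needed:

```r
# Variant 1: temporary variable plus explicit return()
with_return <- function(a) {
  newx <- ifelse(a > 2020, a - 100, a)
  return(newx)
}

# Variant 2: temporary variable as the last expression
with_last_name <- function(a) {
  newx <- ifelse(a > 2020, a - 100, a)
  newx
}

# Variant 3: ifelse() itself is the last expression
bare_ifelse <- function(a) {
  ifelse(a > 2020, a - 100, a)
}

years <- c(1985, 2035, 2021)  # made-up years for illustration
r1 <- with_return(years)
r2 <- with_last_name(years)
r3 <- bare_ifelse(years)
print(r1)
print(identical(r1, r2) && identical(r2, r3))
```

Which variant to use is mostly a matter of taste; the third is the shortest.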
ifelse cannot have variable assignment within it. That's not to say that it is a syntax error (it is not), but that it is counter to its intent. You are asking the function to go through each condition found in test=, and return a value based on it. Regardless of the individual conditions, yes= and no= are each evaluated completely, as whole vectors, and then ifelse joins them together as needed.
For demonstration,
ifelse(test = c(TRUE, FALSE, TRUE), yes = 1:3, no = 11:13)
The return value is something like:
c(
if (test[1]) yes[1] else no[1],
if (test[2]) yes[2] else no[2],
if (test[3]) yes[3] else no[3]
)
# c(1, 12, 3)
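One way to make the "evaluated completely" point observable, assuming a made-up vector x that contains a negative number: sqrt(x) is computed for every element, so it warns about the NaN it produces for -1 even though ifelse then discards that element. The handler below only records that the warning happened.

```r
x <- c(4, -1, 9)  # illustrative data

warned <- FALSE
roots <- withCallingHandlers(
  # yes = sqrt(x) is evaluated on all of x, not just where x >= 0
  ifelse(x >= 0, sqrt(x), NA),
  warning = function(w) {
    warned <<- TRUE                  # remember that a warning was raised
    invokeRestart("muffleWarning")   # then silence it and carry on
  }
)

print(roots)   # the NaN is not in the result ...
print(warned)  # ... but the warning was still raised
```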
To capture the results of the zipped-together yeses and nos, c(1, 12, 3), one must capture the return value from ifelse itself, not inside of the call to ifelse.
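Applied to years like those in the question, a sketch of the difference might look like this. The names a, inside, returned, and fixed are illustrative, and the years are invented:

```r
a <- c(1985, 2035, 2021)

# Assigning inside the call: yes= is evaluated first and stores the whole
# vector a - 100 in `inside`; then no= is evaluated and overwrites it with
# the whole vector a. The element-wise choice never reaches `inside`.
inside <- NULL
returned <- ifelse(test = a > 2020, yes = inside <- a - 100, no = inside <- a)
print(inside)
# [1] 1985 2035 2021

# Capturing the return value keeps the element-wise choice.
fixed <- ifelse(test = a > 2020, yes = a - 100, no = a)
print(fixed)
# [1] 1985 1935 1921
```

Note that the value ifelse returns is correct in both calls; what goes wrong in the first one is only the variable assigned inside it.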
Another point that may be relevant: ifelse(cond, yes, no) is not at all a shortcut for if (cond) { yes } else { no }. Some key differences:
In if, the cond must always be exactly length 1, no more, no less.
In R < 4.2, length 0 returns the error "argument is of length zero" (see ref), while length 2 or more produces the warning "the condition has length > 1 and only the first element will be used" (see ref1, ref2).
In R >= 4.2, both conditions (should) produce an error (no warnings).
ifelse is intended to be vectorized, so the cond can be any length. yes= and no= should either be the same length or length 1 (recycling is in effect here); cond= should really be the same length as the longer of yes= and no=.
if does short-circuiting (through the || and && operators), meaning that if (TRUE || stop("quux")) 1 will never attempt to evaluate stop. This can be very useful when one condition will fail (logically or with a literal error) if attempted on a NULL object, such as if (!is.null(quux) && quux > 5) ...
Conversely, ifelse evaluates cond= in full, and then yes= and no= in full (each one whenever at least one element of cond= selects it); there is no element-wise short-circuiting.
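The differences listed above can be sketched in base R as follows. probe_if is a hypothetical helper of mine that reports whether if ran, warned, or failed; the other names are illustrative too.

```r
# 1. `if` wants a condition of exactly length 1.
probe_if <- function(cond) {
  tryCatch(
    if (cond) "ran yes-branch" else "ran no-branch",
    warning = function(w) paste("warning:", conditionMessage(w)),
    error   = function(e) paste("error:", conditionMessage(e))
  )
}
len1 <- probe_if(TRUE)            # length 1: ordinary behaviour
len0 <- probe_if(logical(0))      # length 0: an error
len2 <- probe_if(c(TRUE, FALSE))  # length 2: warning before R 4.2, error from 4.2 on
print(c(len1, len0, len2))

# 2. ifelse() is vectorized, and a length-1 no= is recycled.
recycled <- ifelse(c(TRUE, FALSE, TRUE), 1:3, 0L)
print(recycled)

# 3. `||` and `&&` short-circuit, so the right-hand side is never evaluated.
short <- if (TRUE || stop("quux")) 1
quux <- NULL
guarded <- if (!is.null(quux) && quux > 5) "big" else "not big"
print(short)
print(guarded)
```

The exact wording of the warning and error messages depends on the R version and locale, so the sketch only distinguishes the kind of condition raised.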
Upvotes: 3