Reputation: 647
I want to recode the values in a column: if x is > 1 and <= 2, it should be recoded as 1.
Here's my code:
neu$b <- lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
Is there something wrong with it?
swl.y
2.2
1.2
3.4
5.6
I need to recode all the values actually:
neu$c <- with(neu, ifelse(swl.y>1 & swl.y <=2, 1, swl.y))
neu$c <- with(neu, ifelse(swl.y>2 & swl.y <=3, 2, swl.y))
neu$c <- with(neu, ifelse(swl.y>3 & swl.y <=4, 3, swl.y))
neu$c <- with(neu, ifelse(swl.y>4 & swl.y <=5, 4, swl.y))
neu$c <- with(neu, ifelse(swl.y>5 & swl.y <=6, 5, swl.y))
neu$c <- with(neu, ifelse(swl.y>6 & swl.y <=7, 6, swl.y))
I think I know where the problem is. Each line recomputes neu$c from swl.y, so when R runs the second line, the values recoded by the first line go back to their original values.
Upvotes: 1
Views: 871
Reputation: 887048
We don't need to loop over a single column. By using lapply(neu$swl.y, ...), we get each element of the column as a separate list element, which we don't need here. The function ifelse is vectorized and can be used directly on the column 'swl.y' with the logical condition mentioned in the OP's post.
neu$b <- with(neu, ifelse(swl.y>1 & swl.y <=2, 1, swl.y))
Alternatively, we create column 'b' as a copy of 'swl.y' and then change the values of 'b' where the logical condition holds.
neu$b <- neu$swl.y
neu$b[with(neu, swl.y>1 & swl.y <=2)] <- 1
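The indexed assignment above also extends to several ranges, because each assignment only touches the rows that match its condition and leaves earlier recodes alone. A minimal sketch, assuming the OP's ranges (1,2], (2,3], ..., (6,7] and the four sample values from the question; the loop variable k and the column name 'c' are only illustrative:

```r
# Sample values from the OP's post
neu1 <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

# Start from a copy, then overwrite only the rows that fall in each range.
# The condition always tests the original swl.y, so recodes do not cascade.
neu1$c <- neu1$swl.y
for (k in 1:6) {
  in_range <- neu1$swl.y > k & neu1$swl.y <= k + 1
  neu1$c[in_range] <- k
}
neu1$c
#[1] 2 1 3 5
```

Values outside all of the ranges (for example, anything <= 1) keep their original value with this approach.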
To better understand the problem with the OP's code, we can check the output from lapply:
lapply(neu$swl.y, function(x) x) #similar to `as.list(neu$swl.y)`
#[[1]]
#[1] 3
#[[2]]
#[1] 0
#[[3]]
#[1] 0
#[[4]]
#[1] 2
#[[5]]
#[1] 1
The output is a list with each element of the column as a separate list element. Calling ifelse once per list element is not optimal, because ifelse is already vectorized (as mentioned above). But suppose we do use ifelse inside lapply:
lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
#[[1]]
#[1] 3
#[[2]]
#[1] 0
#[[3]]
#[1] 0
#[[4]]
#[1] 1
#[[5]]
#[1] 1
A data.frame can be considered a list whose elements all have the same length. So, read as a data.frame, the above output would have 5 columns and 1 row. By assigning it to a single column 'b', we are instead creating a list column with 5 list elements.
neu$b <- lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
str(neu)
#'data.frame': 5 obs. of 2 variables:
#$ swl.y: int 3 0 0 2 1
#$ b :List of 5
# ..$ : int 3
# ..$ : int 0
# ..$ : int 0
# ..$ : num 1
# ..$ : int 1
But this is not what we wanted. What is the remedy? One way is to use sapply/vapply instead of lapply; these return a vector because every result has the same length. Alternatively, we can unlist the lapply output to create a vector.
neu$b <- sapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
str(neu)
#'data.frame': 5 obs. of 2 variables:
# $ swl.y: int 3 0 0 2 1
# $ b : num 3 0 0 1 1
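The vapply and unlist alternatives mentioned above are not shown, so here is a minimal sketch of both. It assumes the same five values that appear in the str() output (typed in directly rather than regenerated with set.seed, since sampling results can differ between R versions); recode_one is a hypothetical helper name.

```r
# Values copied from the str() output above
neu <- data.frame(swl.y = c(3L, 0L, 0L, 2L, 1L))

# Hypothetical helper wrapping the OP's condition
recode_one <- function(x) ifelse(x > 1 & x <= 2, 1, x)

# vapply declares the expected result type and length up front
# (integer results are promoted to double)
b_vapply <- vapply(neu$swl.y, recode_one, FUN.VALUE = numeric(1))

# unlist flattens the list returned by lapply into a plain vector
b_unlist <- unlist(lapply(neu$swl.y, recode_one))

b_vapply
#[1] 3 0 0 1 1
```

vapply is usually the safer choice inside functions, because it fails loudly if a result does not have the declared type or length.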
Based on the OP's edited post, if we need multiple recodes, we can use either cut or findInterval. In cut, we specify the breaks; the labels argument controls whether factor labels are returned, and labels=FALSE returns the integer interval codes instead.
with(neu1, cut(swl.y, breaks=c(-Inf,1,2,3,4,5,6,Inf), labels=FALSE)-1)
#[1] 2 1 3 5
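For comparison, a minimal sketch of the findInterval route, which the answer mentions but does not show. It assumes the same four values from the OP's post and uses left.open = TRUE (available in R >= 3.3.0) so that the intervals are (1,2], (2,3], ... as in the OP's conditions; the default would give [1,2), [2,3), ... instead.

```r
# Sample values from the OP's post
neu1 <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

# findInterval returns, for each value, how many break points lie below it;
# left.open = TRUE makes each interval open on the left and closed on the right
by_interval <- findInterval(neu1$swl.y, vec = 1:6, left.open = TRUE)

# The cut-based version, for comparison
by_cut <- cut(neu1$swl.y, breaks = c(-Inf, 1:6, Inf), labels = FALSE) - 1

by_interval
#[1] 2 1 3 5
```

Unlike cut, findInterval needs no -Inf/Inf padding and no "- 1" adjustment here, since it already counts from 0 for values at or below the first break.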
set.seed(48)
neu <- data.frame(swl.y=sample(0:5, 5, replace=TRUE))
# new data from the OP's edited post (used in the cut example above)
neu1 <- structure(list(swl.y = c(2.2, 1.2, 3.4, 5.6)),
.Names = "swl.y", class = "data.frame", row.names = c(NA, -4L))
Upvotes: 3