Reputation: 23
I am pretty new to R and am attempting to create new columns/variables in my data set, df, using information from multiple columns which already exist in it. I was hoping to use the mapply function to carry this out. The data refers to certain measurements taken on the right side of a person and also on the left. Only one of these sides is affected, however, and it is identified by df$laterality. Ultimately, I would like to create new variables/columns which hold the measurements collected from the affected side.
My data, simplified, essentially looks like the following:
recordID <- c(1, 2, 3, 4)
laterality <- c("right", "right", "left", "right")
right_1_measure <- c(2.3, 3.4, 1.7, 2.4)
right_2_measure <- c(1.3, 2.2, 3.1, 4.1)
right_3_measure <- c(2.7, 2.8, 4.2, 3.9)
left_1_measure <- c(1.5, 2.6, 4.5, 2.8)
left_2_measure <- c(1.1, 3.4, 3.5, 2.6)
left_3_measure <- c(2.6, 2.8, 3.6, 1.6)
df <- data.frame(recordID, laterality, right_1_measure, right_2_measure, right_3_measure, left_1_measure, left_2_measure, left_3_measure)
I then created vectors of the column names I wished to cycle through to make the new "affected" variables/columns. I also created a vector of the names I hoped to give the new columns, which follow the existing variable names but use the prefix "aff".
right_vars <- c("right_1_measure", "right_2_measure" , "right_3_measure")
left_vars <- c("left_1_measure", "left_2_measure" , "left_3_measure")
aff_vars <- c("aff_1_measure", "aff_2_measure", "aff_3_measure")
I then created the function which I was planning to use to conditionally create the new columns based on df$laterality:
aff_var_create <- function (x, y, z){
df$x <- ifelse(df$laterality == "Right" , df$y, ifelse (df$laterality == "Left", df$z, NA))
}
Then I created my mapply code:
mapply(FUN = aff_var_create, x = aff_vars, y = right_vars, z = left_vars)
However, when I run this I receive the following error message:
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] :
replacement has length zero
In addition: Warning message:
In rep(yes, length.out = len) :
Error in ans[ypos] <- rep(yes, length.out = len)[ypos] :
replacement has length zero
I've checked my data frame and all columns have data in them, so I am confused as to why ypos has zero length.
Ultimately, I would like my data frame to look like the following:
recordID <- c(1, 2, 3, 4)
laterality <- c("right", "right", "left", "right")
right_1_measure <- c(2.3, 3.4, 1.7, 2.4)
right_2_measure <- c(1.3, 2.2, 3.1, 4.1)
right_3_measure <- c(2.7, 2.8, 4.2, 3.9)
left_1_measure <- c(1.5, 2.6, 4.5, 2.8)
left_2_measure <- c(1.1, 3.4, 3.5, 2.6)
left_3_measure <- c(2.6, 2.8, 3.6, 1.6)
aff_1_measure <- c(2.3, 3.4, 4.5, 2.4)
aff_2_measure <- c(1.3, 2.2, 3.5, 4.1)
aff_3_measure <- c(2.7, 2.8, 3.6, 3.9)
df <- data.frame(recordID, laterality, right_1_measure, right_2_measure, right_3_measure, left_1_measure, left_2_measure, left_3_measure, aff_1_measure, aff_2_measure, aff_3_measure)
Any suggestions for fixing this issue, or for another method that achieves a similar result, would be much appreciated! Thank you.
Upvotes: 2
Views: 146
Reputation: 107687
You cannot dynamically pass a string value with $ notation. Instead, use [[. Also, since mapply does not update the data frame in place, you need to assign the results to columns:
right_vars <- c("right_1_measure", "right_2_measure" , "right_3_measure")
left_vars <- c("left_1_measure", "left_2_measure" , "left_3_measure")
aff_vars <- c("aff_1_measure", "aff_2_measure", "aff_3_measure")
aff_var_create <- function(x, y, z){
  ifelse(df$laterality == "right", df[[y]],
         ifelse(df$laterality == "left", df[[z]], NA))
}
df[aff_vars] <- mapply(FUN=aff_var_create, x=aff_vars, y=right_vars, z=left_vars)
df
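To see why the original function failed, it may help to look at what $ does with a variable. The following is a minimal sketch on a hypothetical toy data frame (toy and col are illustrative names, not taken from the question): $ looks for a column literally named col and returns NULL, while [[ evaluates the variable first.

```r
# Toy data frame; `toy` and `col` are illustrative names only
toy <- data.frame(right_1_measure = c(2.3, 3.4))
col <- "right_1_measure"

# `$` does not evaluate `col`; it looks for a column literally called "col"
via_dollar <- toy$col
is.null(via_dollar)

# `[[` evaluates `col` first and then extracts the column it names
via_brackets <- toy[[col]]
via_brackets
```

A NULL passed as the yes argument of ifelse is the likely source of the "replacement has length zero" error in the question.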
Alternatively, assign by indexing with [.
aff_cols <- paste0("aff_", 1:3, "_measure")
right_cols <- paste0("right_", 1:3, "_measure")
left_cols <- paste0("left_", 1:3, "_measure")
curr_logic <- df$laterality == "right"
# INITIALIZE COLUMNS
df[aff_cols] <- NA
# UPDATE COLUMNS BY INDEX
df[curr_logic , aff_cols] <- df[curr_logic , right_cols]
df[!curr_logic , aff_cols] <- df[!curr_logic, left_cols]
df
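Even better, use a single ifelse call, since ifelse works element-wise on matrices when the test, yes, and no arguments share the same dimensions (hence replicate, which expands the logical vector into a matrix with one column per measure).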
aff_cols <- paste0("aff_", 1:3, "_measure")
right_cols <- paste0("right_", 1:3, "_measure")
left_cols <- paste0("left_", 1:3, "_measure")
curr_logic <- df$laterality == "right"
df[aff_cols] <- ifelse(replicate(3, curr_logic),
                       as.matrix(df[right_cols]),
                       as.matrix(df[left_cols]))
df
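The shapes involved in the single ifelse call can be checked in isolation. This is a small sketch with made-up stand-in matrices (is_right, right_mat, and left_mat are hypothetical names and values, not the question's data):

```r
# Illustrative logical vector: TRUE where the affected side is the right one
is_right <- c(TRUE, TRUE, FALSE, TRUE)

# replicate() repeats the vector as columns, giving a 4 x 3 logical matrix
test_mat <- replicate(3, is_right)
dim(test_mat)

# Hypothetical stand-ins for the right-side and left-side measurements
right_mat <- matrix(1, nrow = 4, ncol = 3)
left_mat  <- matrix(2, nrow = 4, ncol = 3)

# ifelse() picks element-wise, so row 3 comes entirely from left_mat
picked <- ifelse(test_mat, right_mat, left_mat)
picked
```

Because all three arguments have the same dimensions, each cell of the result is taken from the matching cell of one of the two source matrices.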
Upvotes: 2
Reputation: 16988
It's not an mapply solution, but for this kind of data work I recommend using the tidyverse package, or at least parts of it:
library(dplyr)
library(tidyr)
df %>%
  pivot_longer(matches("_\\d+_measure"),
               names_to = c("side", "no"),
               names_pattern = "(\\w+)_(\\d+)_measure") %>%
  filter(laterality == side) %>%
  select(-side) %>%
  pivot_wider(names_from = no, names_glue = "aff_{no}_measure") %>%
  full_join(df, by = c("recordID", "laterality"))
which returns
# A tibble: 4 x 11
recordID laterality aff_1_measure aff_2_measure aff_3_measure right_1_measure right_2_measure right_3_measure
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 right 2.3 1.3 2.7 2.3 1.3 2.7
2 2 right 3.4 2.2 2.8 3.4 2.2 2.8
3 3 left 4.5 3.5 3.6 1.7 3.1 4.2
4 4 right 2.4 4.1 3.9 2.4 4.1 3.9
# ... with 3 more variables: left_1_measure <dbl>, left_2_measure <dbl>, left_3_measure <dbl>
Note: you can easily change the order of your columns so this output matches your desired output.
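One possible way to do that reordering is dplyr's relocate. The sketch below uses a small stand-in data frame (joined is a hypothetical name with only a few of the columns) rather than the full result:

```r
library(dplyr)

# Small stand-in for the joined result, with the aff_ column ahead of the others
joined <- data.frame(
  recordID        = c(1, 2),
  laterality      = c("right", "left"),
  aff_1_measure   = c(2.3, 2.6),
  right_1_measure = c(2.3, 3.4),
  left_1_measure  = c(1.5, 2.6)
)

# Move every aff_ column to the end, keeping the other columns in their current order
reordered <- joined %>%
  relocate(starts_with("aff_"), .after = last_col())

names(reordered)
```

The same relocate call could be appended to the end of the pipeline so that the aff_ columns follow the original measurements.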
What did I do?
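- Reshape the data into long format with pivot_longer. This allows us to filter the data for the correct laterality.
- Rebuild the aff_n_measure columns using pivot_wider.
- Join the result back onto the original data with full_join.
Upvotes: 0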