tobias sch

Reputation: 359

R: Replacing values of one list element with values of a second list element

I want to replace the values of one element of a list with the values of a second element of a list. Specifically,

  • I have a list containing multiple data sets.
  • Each data set has 2 variables.
  • The variables are factors.
  • The n'th element of the second variable of each data set needs to be replaced with the n'th element of the first variable in the same data set.
  • The n'th element of the first variable should then be set to "replaced".

Example data:

    # stringsAsFactors = TRUE makes both columns factors (the default became FALSE in R 4.0.0)
    dat1 <- data.frame(names1 = c("a", "b", "c", "f", "x"),
                       values = c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"),
                       stringsAsFactors = TRUE)
    dat2 <- data.frame(names1 = c("a", "b", "f2", "s5", "h"),
                       values = c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"),
                       stringsAsFactors = TRUE)
    list1 <- list(dat1, dat2)
    

The result should be the same list, with only the 5th row of each data set changed:

    [[1]]
         names1  values
    1         a  val1_1
    2         b  val2_1
    3         c  val3_1
    4         f  val4_1
    5  replaced       x
    [[2]]
         names1  values
    1         a  val1_2
    2         b  val2_2
    3        f2  val3_2
    4        s5  val4_2
    5  replaced       h
    

Upvotes: 2

Views: 2276

Answers (2)

akrun

Reputation: 886948

Here is one option with tidyverse. Loop through the list with map, extract the row of interest with slice (here it is the last row, so n() can be used), mutate the column values in that row, and bind the result back onto the original data without its last row.

    library(tidyverse)
    map(list1, ~ .x %>% 
                   slice(n()) %>%
                   mutate(values = names1, names1 = 'replaced') %>% 
                   bind_rows(.x %>% slice(-n()), .))
    #[[1]]
    #    names1 values
    #1        a val1_1
    #2        b val2_1
    #3        c val3_1
    #4        f val4_1
    #5 replaced      x
    
    #[[2]]
    #    names1 values
    #1        a val1_2
    #2        b val2_2
    #3       f2 val3_2
    #4       s5 val4_2
    #5 replaced      h
    

Alternatively, it can be made more compact with fct_c from forcats, which concatenates factors and combines their levels. It is applied here to both the 'values' and 'names1' columns.

    library(forcats)
    map(list1, ~ .x %>% 
            mutate(values = fct_c(values[-n()], names1[n()]), 
                   names1 = fct_c(names1[-n()], factor('replaced'))))
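
To make the role of fct_c clearer, here is a minimal sketch on toy factors. The names f1, f2 and combined are made up for illustration and are not part of the question's data.

```r
library(forcats)

# Two toy factors with disjoint levels (illustrative data only)
f1 <- factor(c("val1_1", "val2_1"))
f2 <- factor("x")

# fct_c concatenates the factors and combines their levels
combined <- fct_c(f1, f2)

as.character(combined)  # values in concatenation order
levels(combined)        # levels drawn from both inputs
is.factor(combined)     # the result is still a factor
```

This is why the last element of names1 can be appended to values without first editing the levels by hand. Note that fct_c expects factor inputs, so names1 has to be a factor for the answer's code to run.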
    

Or use a similar approach in base R: loop through the list with lapply, convert the data.frame to a matrix, rbind the matrix without its 5th row to the new row, and convert back to a data.frame. In R versions before 4.0.0, stringsAsFactors = TRUE was the default, so the columns became factors automatically; from R 4.0.0 onward it has to be passed explicitly, as below.

    lapply(list1, function(x) as.data.frame(
      rbind(as.matrix(x)[-5, ], c('replaced', as.character(x$names1[5]))),
      stringsAsFactors = TRUE))
    

Upvotes: 3

Ronak Shah

Reputation: 388817

A base R approach using lapply. Since both columns are factors, we need to add the new levels before assigning the new values; otherwise those values would turn into NAs.

    n <- 5
    
    lapply(list1, function(x) {
       levels(x$values) <- c(levels(x$values), as.character(x$names1[n]))
       x$values[n] <- x$names1[n]
       levels(x$names1) <- c(levels(x$names1), "replaced")
       x$names1[n] <- "replaced"
       x
    })
    
    #[[1]]
    #    names1 values
    #1        a val1_1
    #2        b val2_1
    #3        c val3_1
    #4        f val4_1
    #5 replaced      x
    
    #[[2]]
    #    names1 values
    #1        a val1_2
    #2        b val2_2
    #3       f2 val3_2
    #4       s5 val4_2
    #5 replaced      h
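
As a small illustration of why the levels are extended first, the sketch below uses a toy factor (f, g and h are made-up names, not from the question). Assigning a string that is not an existing level produces NA, while the same assignment works once the level has been added.

```r
# Toy factor (illustrative data only)
f <- factor(c("val1_1", "val2_1", "val3_1"))

# Assigning a value that is not an existing level yields NA.
# R also emits an "invalid factor level" warning, silenced here.
g <- f
suppressWarnings(g[3] <- "x")
is.na(g[3])       # TRUE

# Extending the levels first makes the same assignment succeed
h <- f
levels(h) <- c(levels(h), "x")
h[3] <- "x"
as.character(h)   # "val1_1" "val2_1" "x"
```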
    

Another approach is to convert both columns to character, replace the values at the required position, and convert them back to factors. However, since every data frame in the list can be huge, converting all the values to character and back to factor just to change one value could be computationally expensive.
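
For comparison, the character round-trip described above could look roughly like the sketch below. It is one possible reading of that description, not code from the answer. The data is rebuilt here with stringsAsFactors = TRUE so the block runs on its own, and result is a made-up name.

```r
# Rebuild the question's data with factor columns
dat1 <- data.frame(names1 = c("a", "b", "c", "f", "x"),
                   values = c("val1_1", "val2_1", "val3_1", "val4_1", "val5_1"),
                   stringsAsFactors = TRUE)
dat2 <- data.frame(names1 = c("a", "b", "f2", "s5", "h"),
                   values = c("val1_2", "val2_2", "val3_2", "val4_2", "val5_2"),
                   stringsAsFactors = TRUE)
list1 <- list(dat1, dat2)

n <- 5

result <- lapply(list1, function(x) {
  # Convert every column to character so any string can be assigned
  x[] <- lapply(x, as.character)
  x$values[n] <- x$names1[n]
  x$names1[n] <- "replaced"
  # Convert back to factors; levels are recomputed from the values present
  x[] <- lapply(x, factor)
  x
})
```

Unlike the level-extending version, this one recomputes the levels from scratch, so levels that no longer occur (such as "val5_1") are dropped.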

Upvotes: 3
