icedcoffee

Reputation: 1015

How to collapse values in a list to allow a list column in a dataframe to be converted to a vector?

I have a dataframe, df:

df <- structure(list(ID = c("ID1", "ID2", "ID3"), values = list(A = "test", 
    B = c("test2", "test3"), C = "test4")), row.names = c(NA, 
-3L), class = "data.frame")

df
   ID       values
1 ID1         test
2 ID2 test2, test3
3 ID3        test4


sapply(df, class)
         ID      values 
"character"      "list" 

I'm trying to create a function that will run through each row of df$values, and if the length is greater than one, paste the values into one string. So the data frame will look the same, but will have a different structure:

df
   ID       values
1 ID1         test
2 ID2 test2, test3
3 ID3        test4

dput(df)
structure(list(ID = c("ID1", "ID2", "ID3"), values = c("test", 
"test2, test3", "test4")), class = "data.frame", row.names = c(NA, 
-3L))

sapply(df, class)
         ID      values 
"character" "character"

(Note how in the end result, both columns are character columns, rather than a character column and a list).

I tried making a function to do this, but it doesn't work (and is very messy):

newcol <- NULL
for (i in nrow(df)) {
    row <- df$values[i] %>%
        unlist(., use.names = FALSE)

    if (length(row) == 1) {
        newcol = rbind(row, newcol)
    } else if (length(row)>1) {
        row = paste0(row[1], ", ", row[2])
        newcol = rbind(row, newcol)
    }
}
df$values <- newcol

Is there an easier way to do this (one that works) and that handles list entries of any length (e.g. if a row of df$values were c("test6", "test7", "test8", "test9"))?

Upvotes: 0

Views: 145

Answers (2)

akrun

Reputation: 887168

Using the tidyverse, map_chr() applies toString to each list element and returns a character vector:

library(dplyr)
library(purrr)
df <- df %>%
   mutate(values = map_chr(values, toString))

Upvotes: 0

Ronak Shah

Reputation: 388982

We can use sapply with toString:

df$values <- sapply(df$values, toString)
sapply(df, class)

#        ID      values 
#"character" "character" 

str(df)
#'data.frame':  3 obs. of  2 variables:
# $ ID    : chr  "ID1" "ID2" "ID3"
# $ values: chr  "test" "test2, test3" "test4"

toString(x) is shorthand for paste(x, collapse = ", ") (note the space after the comma), so the same result can be obtained with:

df$values <- sapply(df$values, paste, collapse = ", ")
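As a hedged variation on the above, vapply can be used in place of sapply so that the result is guaranteed to be a character vector with one string per row. The sketch below is self-contained and uses only base R. The fourth row (ID4, with four values) is made-up illustration data, added to show that the approach does not depend on the length of each list entry.

```r
# Build a data frame with a list column (ID4 is hypothetical example data)
df <- data.frame(ID = c("ID1", "ID2", "ID3", "ID4"), stringsAsFactors = FALSE)
df$values <- list(
  "test",
  c("test2", "test3"),
  "test4",
  c("test6", "test7", "test8", "test9")
)

# Collapse each list element into a single comma-separated string.
# character(1) makes vapply fail loudly if any element does not
# collapse to exactly one string.
df$values <- vapply(df$values, toString, character(1))

sapply(df, class)
str(df)
```

Unlike sapply, vapply raises an error rather than silently returning a list or matrix if the function ever returns something other than a single string, which makes it a somewhat safer choice inside functions.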

Upvotes: 1
