Margaret

Reputation: 5929

Convert length-one list data frame field back into data frame column

I have a dataset, and for some reason some of the text fields are coming through as length-one lists, rather than straight values. (We are still investigating why.) In the meantime, as a stopgap, I would like to convert those lists back into the standard fields they should be.

Here's an example of the kind of data structure I'm seeing:

library(dplyr)

mtcars %>%
  bind_cols(n = I(list("x"))) %>%
  str()

Which comes out looking like this:

[Screenshot of the str() output, in which n appears as a list of length-one elements rather than a chr column]

Obviously, what I am looking for is for that final column to be an ordinary column, not a whole bunch of individual lists. Since more than one column in the dataset might be affected, it would be good if the approach were flexible enough to say, "For each column, if it is a list, make it a regular column again".

Is this possible? I've found a bunch of stuff online on how to append a list to a data frame as a column, or how to pull the contents of a column out into a list, but nothing that quite fits the scenario I'm describing here.

Upvotes: 0

Views: 222

Answers (1)

Ronak Shah

Reputation: 388982

If, as in the example, every list element has length 1, you can use unlist on all list columns.

library(dplyr)

# `data` is the example data frame defined at the end of this answer
data <- data %>% mutate(across(where(is.list), unlist))
str(data)

#'data.frame':  32 obs. of  12 variables:
# $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
# $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
# $ disp: num  160 160 108 258 360 ...
# $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
# $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
# $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
# $ qsec: num  16.5 17 18.6 19.4 17 ...
# $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
# $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
# $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
# $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# $ n   : chr  "x" "x" "x" "x" ...
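If you are not sure that every list column really holds only length-one elements, one option is to unlist only the columns that pass that check and leave the rest untouched. The following is a minimal sketch, not part of the original answer: the helper `is_length_one_list` and the toy data frame `df` are hypothetical names introduced for illustration.

```r
library(dplyr)

# Hypothetical helper: TRUE only for list columns whose elements all have length 1
is_length_one_list <- function(x) is.list(x) && all(lengths(x) == 1)

# Toy data (an assumption, not the asker's dataset)
df <- tibble(
  id   = 1:3,
  ok   = list("a", "b", "c"),          # every element has length 1
  long = list("a", c("b", "c"), "d")   # one element has length 2
)

# Only `ok` is selected by the predicate, so only `ok` is flattened
out <- df %>% mutate(across(where(is_length_one_list), unlist))
str(out)
```

Here `ok` should come back as a plain character column, while `long` stays a list column so that it can be inspected separately.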

However, more often than not this happens because at least one entry has length greater than 1 (use which(lengths(data$n) > 1) to check). In that case unlist returns more values than there are rows and mutate fails, so it is safer to select the first value from each list element.

data <- data %>% mutate(across(where(is.list), ~sapply(.x, `[[`, 1)))
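One caveat with taking the first value this way is that `[[` raises an error on empty or NULL elements. A possible variant, sketched here under the assumption that the list columns hold text, maps such elements to NA and uses vapply so the result is always a character vector. The helper `first_or_na` and the toy data frame `df` are hypothetical names, not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: first value of each element as character,
# NA for empty or NULL elements (assumes the column holds text)
first_or_na <- function(x) {
  vapply(
    x,
    function(el) if (length(el) == 0) NA_character_ else as.character(el[[1]]),
    character(1)
  )
}

# Toy data (an assumption): one long entry, one empty entry, one NULL
df <- tibble(
  id = 1:4,
  n  = list("x", c("y", "z"), character(0), NULL)
)

# Rows whose entry has more than one value
which(lengths(df$n) > 1)

out <- df %>% mutate(across(where(is.list), first_or_na))
out$n
```

Note that this silently drops every value after the first, so it is worth looking at the rows flagged by the lengths check before relying on the result.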

Data used in the examples above:

data <- mtcars %>% bind_cols(n = I(list("x")))

Upvotes: 1
