Mark Miller

Reputation: 3096

Efficiently convert two or more columns from a dataframe into a row-wise list of vectors

I have a dataframe like this:

capitals <- structure(
  list(
    country = structure(
      c(1L, 3L, 2L),
      .Label = c("france",
                 "germany", "uk"),
      class = "factor"
    ),
    capital = structure(
      3:1,
      .Label = c("berlin",
                 "london", "paris"),
      class = "factor"
    ),
    currency = structure(
      c(1L,
        2L, 1L),
      .Label = c("euro", "pound"),
      class = "factor"
    )
  ),
  class = "data.frame",
  row.names = c(NA, -3L)
)

I would like to create a list of vectors, one per row, like this:

list(
  c("france", "paris", "euro"),
  c("uk", "london", "pound"),
  c("germany", "berlin", "euro")
)

This gets close to what I want to do, but it generates a list of lists (not a list of vectors) and I'm concerned that it will be slow at scale.

temp <- apply(
  X = capitals,
  MARGIN = 1,
  FUN = function(currentrow) {
    return(list(currentrow[['country']], currentrow[['capital']], currentrow[['currency']]))
  }
)

Upvotes: 0

Views: 50

Answers (2)

dvd280

Reputation: 962

A dataframe can be transposed and converted to a list. Convert the factor columns to character first, then transpose and split the result into a list:

library(magrittr)
chars <- data.frame(lapply(capitals, as.character), stringsAsFactors = FALSE)
newList <- chars %>% t %>% as.data.frame(stringsAsFactors = FALSE) %>% as.list

library(magrittr) provides the pipe operator %>%, in case you haven't seen it before.
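
If you would rather not depend on magrittr, the same transpose-then-split idea can be sketched in base R. This is only an illustration: the `capitals` data is rebuilt here so the snippet runs on its own, and `chars` and `row_list` are names I made up for it.

```r
# Rebuild the question's data; stringsAsFactors = TRUE reproduces the factor columns
capitals <- data.frame(
  country  = c("france", "uk", "germany"),
  capital  = c("paris", "london", "berlin"),
  currency = c("euro", "pound", "euro"),
  stringsAsFactors = TRUE
)

# Factor -> character for every column
chars <- data.frame(lapply(capitals, as.character), stringsAsFactors = FALSE)

# Transpose to a character matrix; each column of the result is one original row,
# so converting back to a data frame and then to a list gives one vector per row
row_list <- as.list(as.data.frame(t(chars), stringsAsFactors = FALSE))
row_list <- unname(row_list)  # drop the automatically generated element names

# Each element should be a character vector, not a list
stopifnot(all(vapply(row_list, is.character, logical(1))))
```

The `vapply()` check at the end is there because the question was specifically about getting a list of vectors rather than a list of lists.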

Upvotes: 2

Andrew

Reputation: 5138

Depending on the size of your dataframe, plain coercion may be the most efficient approach (bear in mind, though, that coercing to a matrix is not fast for very large dataframes):

as.list(as.data.frame(t(capitals), stringsAsFactors = FALSE))
$V1
[1] "france" "paris"  "euro"  

$V2
[1] "uk"     "london" "pound" 

$V3
[1] "germany" "berlin"  "euro"  

Or, if you would like named vectors you could also use:

capital_t <- t(capitals)
lapply(seq_len(ncol(capital_t)), function(i) capital_t[, i])

[[1]]
 country  capital currency 
"france"  "paris"   "euro" 

[[2]]
 country  capital currency 
    "uk" "london"  "pound" 

[[3]]
  country   capital  currency 
"germany"  "berlin"    "euro" 

EDIT: I am confused; the result is already a list of character vectors. What do you want the output to be?

sapply(as.list(as.data.frame(t(capitals), stringsAsFactors = FALSE)), class)
         V1          V2          V3 
"character" "character" "character"

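If the matrix coercion in `t()` turns out to be the bottleneck, one alternative worth trying is to walk the columns in parallel with `Map()`. This is a sketch rather than a benchmarked recommendation: I have not timed it against the versions above, and `cols` and `rows` are just illustrative names.

```r
# Rebuild the question's data; stringsAsFactors = TRUE reproduces the factor columns
capitals <- data.frame(
  country  = c("france", "uk", "germany"),
  capital  = c("paris", "london", "berlin"),
  currency = c("euro", "pound", "euro"),
  stringsAsFactors = TRUE
)

# Convert each column to character once, up front
cols <- lapply(capitals, as.character)

# Map() steps through the columns in parallel and calls c() on the i-th entries,
# so each result is a named character vector holding one row
rows <- unname(do.call(Map, c(list(f = c), cols)))

# Confirm the result is a list of character vectors, not a list of lists
stopifnot(is.list(rows), all(vapply(rows, is.character, logical(1))))
```

Each element keeps the column names (`country`, `capital`, `currency`); wrap the call in `lapply(rows, unname)` if you want bare vectors.
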
Upvotes: 1
