Devin King

Reputation: 173

Convert named vector to list in R

Suppose I have the following named numeric vector:

a <- 1:8
names(a) <- rep(c('I', 'II'), each = 4)

How can I convert this vector to a list of length 2 (shown below)?

a.list
# $I
# [1] 1 2 3 4
# $II
# [1] 5 6 7 8

Note that as.list(a) is not what I'm looking for. My very unsatisfying (and slow for large vectors) solution is:

names.uniq <- unique(names(a))
a.list <- setNames(vector('list', length(names.uniq)), names.uniq)
for(i in 1:length(names.uniq)) {
  names.i <- names.uniq[i]
  a.i <- a[names(a)==names.i]
  a.list[[names.i]] <- unname(a.i)
}

Thank you in advance for your help, Devin

Upvotes: 16

Views: 15225

Answers (5)

Researchnology

Reputation: 21

A quick solution is lapply(): the elements of the list it returns take their names from the vector or list the function was applied to. So in this case:

> library(magrittr)  # provides the %>% pipe
> a <- 1:8
> names(a) <- rep(c('I', 'II'), each = 4)
> a %>% lapply(function(x) x)
$I
[1] 1

$I
[1] 2

$I
[1] 3

$I
[1] 4

$II
[1] 5

$II
[1] 6

$II
[1] 7

$II
[1] 8

Upvotes: 0

Gwang-Jin Kim

Reputation: 10010

To also handle unnamed vectors, use:

vec_to_list <- function(vec) {
  # seq_along() is safe for zero-length input, unlike 1:length(vec)
  if (is.null(names(vec))) names(vec) <- seq_along(vec)
  split(unname(vec), names(vec))
}
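
As a quick illustration of how this helper behaves, here is a minimal usage sketch. The function is repeated (with seq_along()) so the snippet runs on its own; the input vectors are made up for the example.

```r
# Helper from the answer above, repeated so this snippet is self-contained
vec_to_list <- function(vec) {
  if (is.null(names(vec))) names(vec) <- seq_along(vec)
  split(unname(vec), names(vec))
}

# Named vector from the question: elements are grouped by name
a <- 1:8
names(a) <- rep(c('I', 'II'), each = 4)
named_result <- vec_to_list(a)

# Unnamed vector: each element gets its position as a name,
# so every element ends up in its own group
unnamed_result <- vec_to_list(c(10, 20, 30))

str(named_result)
str(unnamed_result)
```

Note that the generated names are character strings, so for longer unnamed vectors the groups come out in string order ("1", "10", "2", ...) rather than numeric order.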

Upvotes: 0

A5C1D2H2I1M1N2O1R2T1

Reputation: 193677

I'd suggest looking at packages that excel at aggregating large amounts of data, like the data.table package. With data.table, you could do:

a <- 1:5e7
names(a) <- c(rep('I',1e7), rep('II',1e7), rep('III',1e7),
              rep('IV',1e7), rep('V',1e7))

library(data.table)
temp <- data.table(names(a), a)[, list(V2 = list(a)), V1]
a.list <- setNames(temp[["V2"]], temp[["V1"]])

Here are some functions to test the various options out with:

myFun <- function(invec) {
  x <- data.table(names(invec), invec)[, list(V2 = list(invec)), V1]
  setNames(x[["V2"]], x[["V1"]])
}

rui1 <- function(invec) {
  a.list <- split(invec, names(invec))
  lapply(a.list, unname)
}

rui2 <- function(invec) {
  split(unname(invec), names(invec))
}

op <- function(invec) {
  names.uniq <- unique(names(invec))
  a.list <- setNames(vector('list', length(names.uniq)), names.uniq)
  for(i in 1:length(names.uniq)) {
    names.i <- names.uniq[i]
    a.i <- invec[names(invec) == names.i]  # subset the argument, not the global 'a'
    a.list[[names.i]] <- unname(a.i)
  }
  a.list
}

And the results of microbenchmark on 10 replications:

library(microbenchmark)
microbenchmark(myFun(a), rui1(a), rui2(a), op(a), times = 10)
# Unit: milliseconds
#      expr       min        lq      mean    median       uq      max neval
#  myFun(a)  698.1553  768.6802  932.6525  934.6666 1056.558 1168.889    10
#   rui1(a) 2967.4927 3097.6168 3199.9378 3185.1826 3319.453 3413.185    10
#   rui2(a) 2152.0307 2285.4515 2372.9896 2362.7783 2426.821 2643.033    10
#     op(a) 2672.4703 2872.5585 2896.7779 2901.7979 2971.782 3039.663    10

Also, note that in testing the different solutions, you might want to consider other scenarios, for instance, cases where you expect to have lots of different names. In that case, your for loop slows down significantly. Try, for example, the above functions with the following data:

set.seed(1)
b <- sample(100, 5e7, TRUE)
names(b) <- sample(c(letters, LETTERS, 1:100), 5e7, TRUE)

Upvotes: 5

Devin King

Reputation: 173

Testing Rui Barradas' solution vs my original solution on a larger vector:

a <- 1:5e7
names(a) <- c(rep('I',1e7), rep('II',1e7), rep('III',1e7), rep('IV',1e7), rep('V',1e7))

Rui's

st1 <- Sys.time()
a.list <- split(a, names(a))
a.list <- lapply(a.list, unname)
Sys.time() - st1
Time difference of 2.560906 secs

Mine

st1 <- Sys.time()
names.uniq <- unique(names(a))
a.list <- setNames(vector('list', length(names.uniq)), names.uniq)
for(i in 1:length(names.uniq)) {
  names.i <- names.uniq[i]
  a.i <- a[names(a)==names.i]
  a.list[[names.i]] <- unname(a.i)
}
Sys.time() - st1
Time difference of 2.712066 secs

thelatemail's

st1 <- Sys.time()
a.list <- split(unname(a), names(a))
Sys.time() - st1
Time difference of 1.62851 secs
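
The same comparison could also be sketched with base R's system.time(), which wraps the start/stop bookkeeping. This is only a rough sketch: the vector is much smaller than the one above (an assumption, so it runs quickly), the variable names are made up for illustration, and single-run timings are noisy.

```r
# Smaller vector than in the timings above so the sketch runs quickly
n <- 1e5
a <- seq_len(5 * n)
names(a) <- rep(c('I', 'II', 'III', 'IV', 'V'), each = n)

# split first, then strip names from each piece
t_split_then_unname <- system.time(
  res1 <- lapply(split(a, names(a)), unname)
)

# strip names first, then split once
t_unname_then_split <- system.time(
  res2 <- split(unname(a), names(a))
)

# Elapsed wall-clock seconds for each approach
t_split_then_unname[["elapsed"]]
t_unname_then_split[["elapsed"]]

# Both approaches should produce the same list
identical(res1, res2)
```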

Upvotes: 1

Rui Barradas

Reputation: 76651

Like I said in the comment, you can use split to create a list.

a.list <- split(a, names(a))
a.list <- lapply(a.list, unname)

A one-liner would be

a.list <- lapply(split(a, names(a)), unname)
#$I
#[1] 1 2 3 4
#
#$II
#[1] 5 6 7 8

EDIT.
Then thelatemail posted a simplification of this in a comment. I've timed it using Devin King's approach, and it's not only simpler, it's also 25% faster.

a.list <- split(unname(a),names(a))
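
One thing to be aware of: split() converts the names to a factor, so the groups come back in sorted level order, not in the order the names first appear. If first-appearance order matters, one possible sketch is to pass a factor with explicit levels. The vector b and the result names below are made up for illustration.

```r
# Names deliberately out of alphabetical order
b <- c(II = 5, I = 1, II = 6, I = 2)

# Default: groups come out in sorted level order
sorted_groups <- split(unname(b), names(b))
names(sorted_groups)
# [1] "I"  "II"

# Fix the factor levels to the order in which the names first appear
f <- factor(names(b), levels = unique(names(b)))
by_first_seen <- split(unname(b), f)
names(by_first_seen)
# [1] "II" "I"
```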

Upvotes: 28
