Emilio M. Bruna

Reputation: 359

Split a vector into three vectors of unequal length in R

A question from a relative n00b: I’d like to split a vector into three vectors of different lengths, with the values assigned to each vector at random. For example, I’d like to split the vector of length 12 below into vectors of length 2, 3, and 7.

I can get three equal sized vectors using this:

test<-1:12
split(test,sample(1:3))

Any suggestions on how to split test into vectors of length 2, 3, and 7 instead of three vectors of length 4?

Upvotes: 7

Views: 14683

Answers (5)

LMc

Reputation: 18632

library(vctrs)

test <- 1:12
vec_chop(test, sizes = c(2, 3, 7))
# [[1]]
# [1] 1 2
# 
# [[2]]
# [1] 3 4 5
# 
# [[3]]
# [1]  6  7  8  9 10 11 12

Upvotes: 0

theLudo

Reputation: 137

It is easier than you think. To split the vector into three new randomly chosen sets, run the following code:

test <- 1:12
split(sample(test), 1:3)

This way, each time you run the code you get a new random partition into three sets of equal size (handy for k-fold cross-validation).

You get:

> split(sample(test), 1:3)
$`1`
[1] 5 8 7 3

$`2`
[1]  4  1 10  9

$`3`
[1]  2 11 12  6

> split(sample(test), 1:3)
$`1`
[1] 12  6  4  1

$`2`
[1] 3 8 7 5

$`3`
[1]  9  2 10 11
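The call above works because split recycles the grouping vector 1:3 along the shuffled data. A minimal sketch that makes the recycling explicit, so the same idea also covers lengths that are not a multiple of the number of folds; the names k, fold_id and folds are illustrative and not part of the answer:

```r
test <- 1:12
k <- 3  # number of folds (assumed for illustration)

# Repeat the fold labels 1..k until they cover every element of test
fold_id <- rep_len(seq_len(k), length(test))

# Shuffle the data, then group it by the recycled labels
folds <- split(sample(test), fold_id)

# Number of elements per fold
lengths(folds)
```

With 12 elements and k = 3 every fold holds 4 values; if the length is not a multiple of k, the fold sizes differ by at most one.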

Upvotes: 1

Dason

Reputation: 61913

You could use rep to create the indices for each group and then split based on that

split(1:12, rep(1:3, c(2, 3, 7)))

If you want the items to be randomly assigned, so that it's not just the first 2 items in the first vector, the next 3 items in the second vector, and so on, you can just add a call to sample:

split(1:12, sample(rep(1:3, c(2, 3, 7))))

If you don't have the specific lengths (2,3,7) in mind but just don't want it to be equal length vectors every time then SimonO101's answer is the way to go.
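The rep and sample approach can be wrapped in a small helper. This is only a sketch: the function name split_random and its sizes argument are hypothetical and not from the answer, and it assumes the sizes add up to the length of the input.

```r
# Hypothetical helper: randomly split x into groups with the given sizes
split_random <- function(x, sizes) {
  # The requested sizes must account for every element of x
  stopifnot(sum(sizes) == length(x))

  # One group label per element, e.g. 1 1 2 2 2 3 3 3 3 3 3 3
  groups <- rep(seq_along(sizes), times = sizes)

  # Shuffle the labels by index (safe even when there is only one label)
  shuffled <- groups[sample.int(length(groups))]

  split(x, shuffled)
}

parts <- split_random(1:12, c(2, 3, 7))

# Group sizes, in the order the sizes were given
lengths(parts)
```

Shuffling the labels rather than the data keeps the values inside each group in their original order, as in the split(1:12, sample(rep(...))) call above.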

Upvotes: 15

darmat

Reputation: 728

You could use an auxiliary vector to format the way you want to split your data. Example:

Data <- c(1,2,3,4,5,6)

Format <- c("X","Y","X","Y","Z","Z")

output <- split(Data,Format)

This generates the output:

$X
[1] 1 3

$Y
[1] 2 4

$Z
[1] 5 6

Upvotes: 0

Simon O'Hanlon

Reputation: 59970

How about using sample slightly differently...

set.seed(123)
test<-1:12
split( test , sample(3, 12 , repl = TRUE) )

#$`1`
#[1] 1 6

#$`2`
#[1]  3  7  9 10 12

#$`3`
#[1]  2  4  5  8 11

set.seed(1234)
test<-1:12
split( test , sample(3, 12 , repl = TRUE) )

#$`1`
#[1] 1 7 8

#$`2`
#[1]  2  3  4  6  9 10 12

#$`3`
#[1]  5 11

The first argument in sample is the number of groups to split the vector into. The second argument is the number of elements in the vector. This will randomly assign each successive element into one of 3 vectors. For 4 vectors just do split( test , sample(4, 12 , repl = TRUE) ).
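Because each element is assigned independently, a group can end up with no elements at all, and split then drops it from the result. A minimal sketch of one way to always get back the same number of vectors, by turning the labels into a factor with fixed levels; the names k, group_id and parts, and the seed value, are illustrative assumptions:

```r
set.seed(42)  # arbitrary seed, used only to make the run reproducible
test <- 1:12
k <- 3  # number of groups

# Draw one label per element, then fix the set of levels to 1..k
group_id <- factor(sample.int(k, length(test), replace = TRUE),
                   levels = seq_len(k))

# split keeps unused factor levels as empty vectors
parts <- split(test, group_id)

# Group sizes; always k entries, some possibly 0
lengths(parts)
```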

Upvotes: 5
