didine

Reputation: 79

Split a comma separated string into defined number of pieces in R

I have a string of comma-separated values that I'd like to split into several pieces, each holding a fixed number of values.

For example, split the following string after every 5 values (that is, at every 5th comma):

txt = "120923,120417,120416,105720,120925,120790,120792,120922,120928,120930,120918,120929,61065,120421" 

The result would be:

[1] 120923,120417,120416,105720,120925
[2] 120790,120792,120922,120928,120930
[3] 120918,120929,61065,120421

Upvotes: 2

Views: 645

Answers (3)

akrun

Reputation: 886948

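Using str_extract_all from stringr:

library(stringr)
# One number followed by up to four more; {0,4} also matches a lone trailing value
str_extract_all(txt, "\\d+(,\\d+){0,4}")[[1]]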
#[1] "120923,120417,120416,105720,120925" "120790,120792,120922,120928,120930"
#[3] "120918,120929,61065,120421"   

Upvotes: 0

Tim Biegeleisen

Reputation: 520918

One base R option would be to use gregexpr with the following regex pattern:

\d+(?:,\d+){0,4}

This pattern matches one number, followed greedily by zero to four more comma-separated numbers. Because the quantifier is greedy, each match takes as many of the remaining numbers as it can, up to five per match.

txt <- "120923,120417,120416,105720,120925,120790,120792,120922,120928,120930,120918,120929,61065,120421"
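# regmatches() returns a list with one element per input string; [[1]] extracts the character vector
regmatches(txt, gregexpr("\\d+(?:,\\d+){0,4}", txt))[[1]]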

[1] "120923,120417,120416,105720,120925" "120790,120792,120922,120928,120930"
[3] "120918,120929,61065,120421"     
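
As a rough sketch of how the same idea extends to other group sizes, the pattern can be built from the size with sprintf. The helper name split_every and its argument n are made up for illustration and are not part of the answer above. The sketch also writes the pattern with [0-9] and a plain group so that it does not depend on a particular regex engine.

```r
# Hypothetical helper: split a comma-separated string of numbers
# into pieces of at most n values each.
split_every <- function(x, n = 5) {
  stopifnot(n >= 1)
  # One number, then up to n - 1 further ",number" parts (greedy).
  pattern <- sprintf("[0-9]+(,[0-9]+){0,%d}", n - 1)
  regmatches(x, gregexpr(pattern, x))[[1]]
}

txt <- "120923,120417,120416,105720,120925,120790,120792,120922,120928,120930,120918,120929,61065,120421"
split_every(txt, 5)

# A single leftover value still forms its own piece,
# because the lower bound of the quantifier is 0.
split_every("1,2,3,4,5,6", 5)
# [1] "1,2,3,4,5" "6"
```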

Upvotes: 4

Ronak Shah

Reputation: 388817

We could split the text on commas (',') and divide the resulting values into groups of 5.

temp <- strsplit(txt, ",")[[1]]
split(temp, rep(seq_along(temp), each  = 5, length.out = length(temp)))

#$`1`
#[1] "120923" "120417" "120416" "105720" "120925"

#$`2`
#[1] "120790" "120792" "120922" "120928" "120930"

#$`3`
#[1] "120918" "120929" "61065"  "120421"
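
To make the grouping step easier to follow, here is a small sketch that prints the index vector passed to split. The ceiling-based variant is only an alternative added for comparison, not something the answer uses, and the variable names grp_rep and grp_ceil are invented here.

```r
txt <- "120923,120417,120416,105720,120925,120790,120792,120922,120928,120930,120918,120929,61065,120421"
temp <- strsplit(txt, ",")[[1]]

# Index from the answer: each position repeated 5 times, cut to length(temp).
grp_rep <- rep(seq_along(temp), each = 5, length.out = length(temp))
grp_rep
# [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3

# Alternative (an assumption of this sketch): integer-divide the positions by 5.
grp_ceil <- ceiling(seq_along(temp) / 5)
all(grp_rep == grp_ceil)
# [1] TRUE
```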

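If you want each group as one concatenated string, you can use by with toString. Note that toString separates the values with ", " (a comma followed by a space):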

as.character(by(temp, rep(seq_along(temp), each  = 5, 
                      length.out = length(temp)), toString))
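
If the pieces should be joined with a bare comma, as in the expected output from the question, one possible variation is to paste each group with collapse = ",". This is a sketch that swaps by and toString for vapply and paste; the variable names grp and pieces are illustrative only.

```r
txt <- "120923,120417,120416,105720,120925,120790,120792,120922,120928,120930,120918,120929,61065,120421"
temp <- strsplit(txt, ",")[[1]]

# Same grouping index as in the answer above.
grp <- rep(seq_along(temp), each = 5, length.out = length(temp))

# Join each group back into a single string, using "," with no space.
pieces <- unname(vapply(split(temp, grp), paste, character(1), collapse = ","))
pieces
# [1] "120923,120417,120416,105720,120925" "120790,120792,120922,120928,120930"
# [3] "120918,120929,61065,120421"
```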

Upvotes: 4
