D. Fowler

Reputation: 635

how to separate 2 numbers within a column in R

In R, I want to separate numbers that are in the same column. My data appear like this:

id   time       
1    1,2    
2    3,4    
3    4,5,6

I want it to appear like this:

1    1
1    2
2    3
2    4
3    4
3    5
3    6

Though not shown above, the set of time values differs from id to id. For example:

4    1,6,7      
5    1,3,6    
6    1,4,5    
7    1,3,5    
8    2,3,4

There are 100 ids, and both the number of values in the time column and the values themselves vary by id, as shown above.

Does anyone have advice on how to do this?

Upvotes: 1

Views: 782

Answers (2)

Ben

Reputation: 30494

Using tidyverse you could try the following. Make sure time is character type, use strsplit to split each value on the commas, and then unnest the resulting list column so that each piece gets its own row.

library(tidyverse)

df %>%
  mutate(time = strsplit(as.character(time), ",")) %>%
  unnest(cols = time)

Or you can just use separate_rows and indicate comma as separator:

df %>%
  separate_rows(time, sep = ',')

Or in base R you could try this:

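# split each time string on literal commas
s <- strsplit(df$time, ',', fixed = TRUE)
# repeat each id once per split value, and flatten the split values into time
data.frame(id = rep(df$id, lengths(s)), time = unlist(s))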

Output

# A tibble: 10 x 2
      id time 
   <int> <chr>
 1     1 1    
 2     1 2    
 3     2 3    
 4     2 4    
 5     3 4    
 6     3 5    
 7     3 6    
 8     4 1    
 9     4 6    
10     4 7 

Data

df <- structure(list(id = 1:4, time = c("1,2", "3,4", "4,5,6", "1,6,7")),
                class = "data.frame", row.names = c(NA, -4L))
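
To check the base R approach end to end, here is a minimal sketch. It assumes time is stored as a comma-separated character column, as in the question. The names parts and long are only illustrative, and the as.integer() conversion is an extra step for when numeric times are wanted.

```r
# Sample data in the question's format
# (assumption: time is a comma-separated character column)
df <- data.frame(id = 1:3,
                 time = c("1,2", "3,4", "4,5,6"),
                 stringsAsFactors = FALSE)

# Split each time string on literal commas; the result is a list
# with one character vector per row of df
parts <- strsplit(df$time, ",", fixed = TRUE)

# Repeat each id once per piece, and convert the pieces to integers
long <- data.frame(id = rep(df$id, lengths(parts)),
                   time = as.integer(unlist(parts)))
long
```

Because lengths(parts) is computed per row, this should handle ids with different numbers of time values without any special casing.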

Upvotes: 1

akrun

Reputation: 887571

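An option with separate_rows, using a regex lookaround that splits between every pair of adjacent characters. This applies when time is stored as digits run together with no separator (e.g. 12), as in the data below.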

library(dplyr)
library(tidyr)
df %>% 
 separate_rows(time, sep = "(?<=.)(?=.)", convert = TRUE)
# A tibble: 4 x 2
#     id  time
#  <dbl> <int>
#1     1     1
#2     1     2
#3     2     3
#4     2     4

data

df <- structure(list(id = c(1, 2), time = c(12, 34)),
                class = "data.frame", row.names = c(NA, -2L))
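
The same idea can be sketched in base R by splitting on the empty string, which breaks a string into its individual characters. This assumes every time value is a single digit, so a value such as 10 would be split into 1 and 0. The names df2, digits and long2 are only illustrative.

```r
# Data where the times are digits run together with no separator
# (assumption: each individual time is a single digit)
df2 <- data.frame(id = c(1, 2), time = c(12, 34))

# Splitting on "" breaks each string into its single characters
digits <- strsplit(as.character(df2$time), "")

# Repeat each id once per digit and convert the digits back to integers
long2 <- data.frame(id = rep(df2$id, lengths(digits)),
                    time = as.integer(unlist(digits)))
long2
```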

Upvotes: 2
