Anya Pilipentseva
Anya Pilipentseva

Reputation: 187

Splitting and creating 2 rows out of one in R data table

I have a dataset (dt) like this in R:

n       id      val
1       1&&2    10
2       3       20
3       4&&5    30

And what I want to get is

n       id      val
1       1       10
2       2       10
3       3       20
4       4       30
5       5       30

I know that to split ids I need to do something like this: id_split <- strsplit(dt$id,"&&")

But how do I create new rows with the same val for ids which were initially together in a row?

Upvotes: 4

Views: 420

Answers (5)

linog
linog

Reputation: 6226

tstrplit by id from data.table can do the job

library(data.table)
df <- setDT(df)[,.('id' = tstrsplit(id, "&&")), by = c('n','val')]
df[,'n' := seq(.N)]

df
   n val id
1: 1  10  1
2: 2  10  2
3: 3  20  3
4: 4  30  4
5: 5  30  5

Upvotes: 2

Neel Kamal
Neel Kamal

Reputation: 1076

If anyone looking for tidy solution,

dt %>%
  separate(id, into = paste0("id", 1:2),sep = "&&") %>% 
  pivot_longer(cols = c(id1,id2), names_to = "id_name", values_to = "id") %>% 
  drop_na(id) %>% 
  select(n, id, val)

output as

# A tibble: 5 x 3
      n id      val
  <dbl> <chr> <dbl>
1     1 1        10
2     1 2        10
3     2 3        20
4     3 4        30
5     3 5        30

Edit: As suggested by @sotos, and completely missed by me. one liner solution

d %>% separate_rows(id, ,sep = "&&")

gives same output as

# A tibble: 5 x 3
      n id      val
  <dbl> <chr> <dbl>
1     1 1        10
2     1 2        10
3     2 3        20
4     3 4        30
5     3 5        30

Upvotes: 1

Frank Zhang
Frank Zhang

Reputation: 1688

A data.table solution.

library(data.table)
DT <- fread('n       id      val
1       1&&2    10
2       3       20
3       4&&5    30')


DT[,.(id=unlist(strsplit(id,split ="&&"))),by=.(n,val)][,n:=.I][]
#>    n val id
#> 1: 1  10  1
#> 2: 2  10  2
#> 3: 3  20  3
#> 4: 4  30  4
#> 5: 5  30  5

Created on 2020-05-08 by the reprex package (v0.3.0)

Note:

A more rebosut solution is by = 1:nrow(DT). But you need to play around your other columns though.

Upvotes: 1

Sotos
Sotos

Reputation: 51592

You can use the lengths from the split of id and expand your rows. Then set n to be the sequece of the length of your data frame, i.e.

l1 <- strsplit(as.character(df$id), '&&')
res_df <- transform(df[rep(seq_len(nrow(df)), lengths(l1)),], 
                    id = unlist(l1), 
                    n = seq_along(unlist(l1)))

which gives,

    n id val
1   1  1  10
1.1 2  2  10
2   3  3  20
3   4  4  30
3.1 5  5  30

You can remove the rownames with rownames(res_df) <- NULL

Upvotes: 1

jay.sf
jay.sf

Reputation: 73377

You may cbind the splits to get a column which you cbind again to the val (recycling).

res <- do.call(rbind, Map(data.frame, id=lapply(strsplit(dat$id, "&&"), cbind), 
                          val=dat$val))
res <- cbind(n=1:nrow(res), res)
res
#   n id val
# 1 1  1  10
# 2 2  2  10
# 3 3  3  20
# 4 4  4  30
# 5 5  5  30

Upvotes: 3

Related Questions