I have the following tibble in R:
df <- tibble(desc=c("test1", "test2", "test3", "test4","test1"), code=c("X00.2", "Y10", "X20.234", "Z10", "Q23.2"))
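I want to create a new data frame in which each code is expanded into all of its prefixes, from the part before the "." up to the full code: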
df <- tibble(desc=c("test1", "test1", "test2", "test3", "test3", "test3", "test3", "test4", "test1", "test1"), code=c("X00", "X00.2", "Y10", "X20", "X20.2", "X20.23", "X20.234", "Z10", "Q23", "Q23.2"))
How would I do this? I think it can be done with separate_rows (from tidyr) by manipulating the separator, but I am not sure exactly how.
Thank you in advance.
Here is one way using tidyverse functions.
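library(tidyverse)

df %>%
  # n is the number of rows each code expands into:
  # one for the part before ".", plus one per character after it
  mutate(n = nchar(sub('.*\\.', '', code)) + 1,
         # l is the position of "." in code (NA if there is none)
         l = str_locate(code, '\\.')[, 1],
         # codes without "." stay as a single row
         n = replace(n, is.na(l), 1),
         # for codes without ".", use the full length of the code instead
         l = ifelse(is.na(l), nchar(code), l),
         # r identifies the original row
         r = row_number()) %>%
  # Repeat each row n times
  uncount(n) %>%
  # For each original row (not each desc, since desc values can repeat)
  group_by(r) %>%
  # Create code value incrementing one character at a time
  mutate(code = map_chr(row_number(), ~substr(first(code), 1, l + .x - 1)),
         # Remove "." which is left at the end of the first string
         code = sub('\\.$', '', code)) %>%
  ungroup() %>%
  select(-l, -r)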
This returns:
# A tibble: 10 x 2
#    desc  code
#    <chr> <chr>
#  1 test1 X00
#  2 test1 X00.2
#  3 test2 Y10
#  4 test3 X20
#  5 test3 X20.2
#  6 test3 X20.23
#  7 test3 X20.234
#  8 test4 Z10
#  9 test1 Q23
# 10 test1 Q23.2
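The same prefix expansion can also be sketched in base R, which may make the logic easier to follow. This is only an illustrative alternative to the pipeline above, not part of it: `expand_code` is a hypothetical helper name, it uses a plain data.frame instead of a tibble, and it assumes each code contains at most one ".".

```r
# Hypothetical helper: return all prefixes of a single code, from the part
# before the "." up to the full code. Codes without "." are returned as-is.
expand_code <- function(code) {
  dot <- as.integer(regexpr(".", code, fixed = TRUE))  # position of ".", or -1
  if (dot == -1) return(code)
  stem <- substr(code, 1, dot - 1)
  n_after <- nchar(code) - dot                          # characters after "."
  longer <- vapply(seq_len(n_after),
                   function(i) substr(code, 1, dot + i),
                   character(1))
  c(stem, longer)
}

df <- data.frame(desc = c("test1", "test2", "test3", "test4", "test1"),
                 code = c("X00.2", "Y10", "X20.234", "Z10", "Q23.2"),
                 stringsAsFactors = FALSE)

# Expand every code, then repeat each desc once per generated prefix
pieces <- lapply(df$code, expand_code)
result <- data.frame(desc = rep(df$desc, lengths(pieces)),
                     code = unlist(pieces),
                     stringsAsFactors = FALSE)
result
```

Because the expansion is done per row, repeated desc values such as "test1" are handled without needing a separate row identifier.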