Reputation: 1134
I have a unique ID that should contain 13 digits in total, or 15 characters including the dashes. It should look like this:
2005-067-000043
However, some entries might look like this:
2005-067-00043 or 2005-67-000043 or 2005-067-0000043
I would like a script that enforces exactly three characters between the first and second dash: if there are more, cut the leading zeros, and if there are fewer, add leading zeros. The same goes for the last section: there should be exactly six characters after the last dash, with leading zeros added if there are fewer and cut if there are more.
Upvotes: 0
Views: 126
Reputation: 389235
You can split the data into three columns, pad or trim the 2nd and 3rd columns to exactly 3 and 6 characters respectively, and then combine the columns into one again.
library(dplyr)
library(tidyr)
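# as.integer() drops any leading zeros; '%03d' and '%06d' then zero-pad to a fixed width.
# (Zero-padding strings with '%03s' is platform-dependent in sprintf, so it is avoided here.)
separate(df, x, paste0('col', 1:3), sep = '-') %>%
  mutate(col2 = sprintf('%03d', as.integer(col2)),
         col3 = sprintf('%06d', as.integer(col3))) %>%
  unite(result, starts_with('col'), sep = '-')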
# result
#1 2005-067-000043
#2 2005-067-000043
#3 2005-067-000043
#4 2005-067-000043
Sample data used above:
x <- c('2005-067-000043', '2005-067-00043', '2005-67-000043', '2005-067-0000043')
df <- data.frame(x)
df
# x
#1 2005-067-000043
#2 2005-067-00043
#3 2005-67-000043
#4 2005-067-0000043
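If you would rather not depend on dplyr and tidyr, the same split, pad and recombine idea can be sketched in base R. The helper name `fix_id` is hypothetical (it is not part of any package), and the sketch assumes every id has exactly three dash-separated parts and that the extra characters in the 2nd and 3rd parts are only leading zeros, as in the question.

```r
# Hypothetical helper: normalise ids to the form YYYY-NNN-NNNNNN.
# Assumes each id has exactly three numeric parts separated by dashes.
fix_id <- function(id) {
  parts <- strsplit(id, "-", fixed = TRUE)
  vapply(parts, function(p) {
    # as.integer() drops leading zeros; %03d / %06d zero-pad back to fixed width
    sprintf("%s-%03d-%06d", p[1], as.integer(p[2]), as.integer(p[3]))
  }, character(1))
}

x <- c('2005-067-000043', '2005-067-00043', '2005-67-000043', '2005-067-0000043')
fixed <- fix_id(x)
print(fixed)
```

For the four sample ids, every element of `fixed` should come out as 2005-067-000043. Note that a part with more significant digits than the target width (for example 1234 in the middle section) would not be truncated by this sketch.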
Upvotes: 1