user35131

Reputation: 1134

How to add missing zeros in a unique identifier that is missing some values using R?

I have a unique ID that should contain 13 digits in total, or 15 characters including the dashes. It should look like this:

2005-067-000043

However, some entries might look like this:

2005-067-00043 or 2005-67-000043 or 2005-067-0000043

I would like a script that enforces the following rules. Between the first and second dash there should be three characters: if there are more, cut leading zeros, and if there are fewer, add leading zeros. The same goes for the last section, which should have six characters after the last dash: if there are fewer, add leading zeros, and if there are more, cut leading zeros.
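To make the target format concrete, here is a minimal sketch that flags the entries deviating from it. The vector name `ids` and the regular expression are illustrative assumptions based on the examples above, not part of the real data.

```r
# Hypothetical sample IDs taken from the examples in the question
ids <- c('2005-067-000043', '2005-067-00043', '2005-67-000043', '2005-067-0000043')

# Expected shape: 4 digits, dash, 3 digits, dash, 6 digits
well_formed <- grepl('^[0-9]{4}-[0-9]{3}-[0-9]{6}$', ids)

# Entries that need fixing
ids[!well_formed]
```

Only the first entry matches the pattern; the other three are the ones that need padding or trimming.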

Upvotes: 0

Views: 126

Answers (1)

Ronak Shah

Reputation: 389235

You can split the ID into three columns, keep only the last 3 and 6 characters of the 2nd and 3rd columns (padding with leading zeros if they are shorter), and then combine the columns into one again.

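library(dplyr)
library(tidyr)
library(stringr)

# Split on '-', keep the last 3 / 6 characters of parts 2 and 3, and
# left-pad with zeros. str_pad() is used because zero-padding strings
# with sprintf('%03s', ...) is platform-dependent (it may pad with spaces).
separate(df, x, paste0('col', 1:3), sep = '-') %>%
  mutate(col2 = str_pad(substring(col2, nchar(col2) - 2), 3, pad = '0'),
         col3 = str_pad(substring(col3, nchar(col3) - 5), 6, pad = '0')) %>%
  unite(result, starts_with('col'), sep = '-')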

#           result
#1 2005-067-000043
#2 2005-067-000043
#3 2005-067-000043
#4 2005-067-000043

Data used:

x <- c('2005-067-000043', '2005-067-00043', '2005-67-000043', '2005-067-0000043')
df <- data.frame(x)
df

#                 x
#1  2005-067-000043
#2   2005-067-00043
#3   2005-67-000043
#4 2005-067-0000043
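For comparison, the same idea can be sketched in base R without extra packages. This is only a sketch: `fix_id` is a hypothetical helper name, and it assumes every ID has exactly three dash-separated parts made up of digits. Converting each part to an integer drops any surplus leading zeros, and `sprintf('%03d')` / `sprintf('%06d')` pads it back to the required width. Unlike the `substring()` approach, a part with too many significant digits is left long rather than truncated.

```r
# Hypothetical helper: normalise IDs to the form YYYY-NNN-NNNNNN
fix_id <- function(ids) {
  parts <- strsplit(ids, '-', fixed = TRUE)
  vapply(parts, function(p) {
    # as.integer() drops leading zeros; %03d / %06d re-pad with zeros
    mid  <- sprintf('%03d', as.integer(p[2]))
    last <- sprintf('%06d', as.integer(p[3]))
    paste(p[1], mid, last, sep = '-')
  }, character(1))
}

x <- c('2005-067-000043', '2005-067-00043', '2005-67-000043', '2005-067-0000043')
fixed <- fix_id(x)
fixed
```

With the sample data, all four entries come back as 2005-067-000043.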

Upvotes: 1
