Andres Mora

Reputation: 1106

How to add leading zeroes to some values?

My dataset includes three different types of values; two of them have dashes.

df=c("20001982-02", "19933626-02", "20024861-6", "29114-1", "20109774-02", 
"19965663-01", "19992655-01", "20087008-08", "140107", "20032011-09", 
"139")

I need to add leading zeroes to the values that have a dash so that they match the pattern XXXXXXXX-XX:

df.new =c("20001982-02", "19933626-02", "20024861-06", "00029114-01", 
"20109774-02", "19965663-01", "19992655-01", "20087008-08", "140107", "20032011-09", "139")

So far I have this, but it only does part of the job (see the 4th element, which I need to be 00029114-01):

sub("^(\\d{8})-(\\d)$", "\\1-0\\2", df)

df.new = c("20001982-02", "19933626-02", "20024861-06", "29114-1", "20109774-02", 
"19965663-01", "19992655-01", "20087008-08", "140107", "20032011-09", 
"139")

Upvotes: 0

Views: 293

Answers (3)

Chris Ruehlemann

Reputation: 21400

This should work:

library(stringr)
df1 <- sub("-(\\d$)", "-0\\1", df)
df2 <- ifelse(grepl("-\\d", df1), 
              str_pad(df1, width = 11, side = "left", pad = "0"), 
              df1)
 [1] "20001982-02" "19933626-02" "20024861-06" "00029114-01" "20109774-02" "19965663-01" "19992655-01" "20087008-08"
 [9] "140107"      "20032011-09" "139"
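
If you would rather avoid the stringr dependency, the same idea can be sketched in base R: split the dashed values at the dash, left-pad each part with zeros, and paste them back together. This is only a sketch under the assumption that every dashed value has exactly one dash; `zero_pad` is a hypothetical helper name, not a base R function.

```r
# Hypothetical helper: left-pad a character vector with zeros to a fixed width
zero_pad <- function(x, width) {
  paste0(strrep("0", pmax(0, width - nchar(x))), x)
}

df <- c("20001982-02", "19933626-02", "20024861-6", "29114-1", "20109774-02",
        "19965663-01", "19992655-01", "20087008-08", "140107", "20032011-09",
        "139")

# Only touch the values that contain a dash
has_dash <- grepl("-", df, fixed = TRUE)
parts <- strsplit(df[has_dash], "-", fixed = TRUE)
left  <- vapply(parts, `[`, character(1), 1)  # part before the dash
right <- vapply(parts, `[`, character(1), 2)  # part after the dash

out <- df
out[has_dash] <- paste(zero_pad(left, 8), zero_pad(right, 2), sep = "-")
out
```

Because the parts stay character strings throughout, this does not depend on the values being small enough to fit in an integer.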

Upvotes: 1

akrun

Reputation: 887088

We can use grepl with sprintf from base R. Split the values at - with read.table, then use sprintf with a fmt that adds the leading zeros to join the two parts back into a single string. The condition in ifelse returns that new format where there is a - and the original value otherwise.

out <- ifelse(grepl('-', df), do.call(sprintf, c(fmt = '%08d-%02d', 
 read.table(text = df, header = FALSE, sep="-", fill = TRUE))), df)
identical(df.new, out)
#[1] TRUE
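
To make the steps easier to follow, here is the same approach unrolled into intermediate objects. It is a sketch that assumes both parts of every dashed value are purely numeric, so that read.table reads them as numbers; the names `parts` and `padded` are only illustrative.

```r
df <- c("20001982-02", "19933626-02", "20024861-6", "29114-1", "20109774-02",
        "19965663-01", "19992655-01", "20087008-08", "140107", "20032011-09",
        "139")

# Split at the dash: V1 is the number before it, V2 the number after it.
# fill = TRUE gives NA in V2 for the values that have no dash.
parts <- read.table(text = df, header = FALSE, sep = "-", fill = TRUE)

# Zero-pad V1 to 8 digits and V2 to 2 digits, joined by a dash
padded <- sprintf("%08d-%02d", parts$V1, parts$V2)

# Keep the padded form only where the original value had a dash
out <- ifelse(grepl("-", df, fixed = TRUE), padded, df)
out
```

The entries of `padded` for the values without a dash are never used, since ifelse falls back to the original `df` for those positions.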

Upvotes: 2

Till

Reputation: 6628

stringr::str_pad() is great for this.

library(stringr)

# Make the part after the dash two digits wide first
df_fixed <- sub("-(\\d)$", "-0\\1", df)
df_dash <- df_fixed[grepl("-", df_fixed)]

Leading zeros

str_pad(df_dash, width = 11, pad = "0")
#> [1] "20001982-02" "19933626-02" "20024861-06" "00029114-01" "20109774-02"
#> [6] "19965663-01" "19992655-01" "20087008-08" "20032011-09"

Wrap it in ifelse() to return the original character vector with intended modifications.

ifelse(grepl("-", df_fixed), str_pad(df_fixed, width = 11, pad = "0"), df_fixed)
#>  [1] "20001982-02" "19933626-02" "20024861-06" "00029114-01" "20109774-02"
#>  [6] "19965663-01" "19992655-01" "20087008-08" "140107"      "20032011-09"
#> [11] "139"

Trailing zeros

split_strings <- strsplit(df_dash, "-")
sapply(split_strings, function(x) paste(str_pad(x[1], width = 8, side = "right",  pad = 0), x[2], sep = "-"))
#> [1] "20001982-02" "19933626-02" "20024861-06" "29114000-01" "20109774-02"
#> [6] "19965663-01" "19992655-01" "20087008-08" "20032011-09"

Upvotes: 0
