Reputation: 1
I'm having the following difficulty in R: a data frame has a column of IDs stored as character strings. What's the most concise way to remove the leading 0s?
Example:
This is what I have:
ID <- as.character(c("001001","0001002","01003","001004","1005"))
order <- c("a","b","c","d","e")
df <- as.data.frame(cbind(ID, order))
This is what I want:
ID2 <- as.character(c("1001","1002","1003","1004","1005"))
order2 <- c("a","b","c","d","e")
df2 <- as.data.frame(cbind(ID2, order2))
I've tried replacing strings, but that also deletes 0s I want to keep (e.g. the first ID becomes "11").
Thanks in advance!
Upvotes: 0
Views: 233
Reputation: 1688
Use trimws from base R:
trimws(c("001001","0001002","01003","001004","1005"),which = "left",whitespace = "0")
#> [1] "1001" "1002" "1003" "1004" "1005"
Created on 2020-06-30 by the reprex package (v0.3.0)
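To connect this back to the question's data frame, here is a minimal sketch that applies the same call to the ID column. It assumes R 3.6.0 or later, since as far as I know that is the release where trimws gained the whitespace argument. Note that whitespace is treated as a regular expression, so "0" here means "a run of 0 characters".

```r
# Rebuild the question's data frame; stringsAsFactors = FALSE keeps ID
# as character on R versions before 4.0
df <- data.frame(
  ID = c("001001", "0001002", "01003", "001004", "1005"),
  order = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Strip only the leading run of "0" characters; interior zeros are untouched
df$ID <- trimws(df$ID, which = "left", whitespace = "0")
df$ID
#> [1] "1001" "1002" "1003" "1004" "1005"
```

One thing to keep in mind: an ID made up entirely of zeros would be reduced to an empty string with this approach.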
Upvotes: 7
Reputation: 886938
This is easier if we convert the column to integer or numeric, because numeric values cannot carry a leading 0. After the conversion, wrap the result in as.character if the column needs to stay character:
df$ID <- as.character(as.integer(df$ID))
df$ID
#[1] "1001" "1002" "1003" "1004" "1005"
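A caveat worth checking before relying on the conversion: it only works when every ID is a number that fits in R's integer range. The sketch below uses made-up IDs (not from the question) to show what happens otherwise; suppressWarnings is there only to keep the output quiet.

```r
# Hypothetical IDs: one clean, one containing a letter, one too large for integer
ids <- c("001001", "00A12", "00012345678901")

# Non-numeric or out-of-range values are coerced to NA (normally with a warning)
converted <- suppressWarnings(as.character(as.integer(ids)))

converted[1]      # "1001"
is.na(converted)  # FALSE TRUE TRUE: the last two IDs were lost
```

If IDs like the last two can occur in your data, a string-based approach is the safer choice.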
It could also be done with a regex (unnecessary here, though):
df$ID <- sub("^0+", "", df$ID)
In the above code, we match one or more 0s (0+) at the start (^) of the string and replace them with an empty string ("").
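The ^ anchor is what keeps the interior zeros safe. As an illustration, and assuming the attempt in the question was an unanchored replacement along the lines of gsub("0", "", ...) (a guess, since the original code isn't shown), the difference looks like this:

```r
id <- "001001"

# Unanchored: removes every 0 in the string, including the ones we want to keep
all_zeros_removed <- gsub("0", "", id)
all_zeros_removed
#> [1] "11"

# Anchored with ^: removes only the run of 0s at the start
leading_removed <- sub("^0+", "", id)
leading_removed
#> [1] "1001"
```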
If the IDs contain characters other than digits, another option is to capture the digits after the leading 0s and replace the whole match with the backreference (\\1) to the captured group. This ensures that a string such as "0xyz" is left unchanged:
df$ID <- sub("^0+(\\d+)$", "\\1", df$ID)
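A short comparison of the two patterns on mixed input may help here. The IDs below are invented for illustration; only the two sub calls come from the answer.

```r
# Hypothetical IDs: one all-digit, two containing letters
ids <- c("001001", "0xyz", "00A12")

# Plain pattern: strips leading 0s from every string, digits or not
plain <- sub("^0+", "", ids)
plain
#> [1] "1001" "xyz"  "A12"

# Capture-group pattern: only matches when everything after the 0s is digits,
# so strings with other characters are returned unchanged
digits_only <- sub("^0+(\\d+)$", "\\1", ids)
digits_only
#> [1] "1001"  "0xyz"  "00A12"
```

Which one you want depends on whether a leading 0 in a non-numeric ID is meaningful in your data.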
Upvotes: 2