Leigh

Reputation: 43

R Remove leading 0's from a factor string

I have imported multiple Excel files into R using the read.csv() function.

On the smaller files, the leading 0's in the uniqueID column have been kept e.g. 085405, 021X1B, 0051012

However on the larger files, the leading 0's have been dropped from the uniqueID's where they only contain numbers e.g. 85405, 021X1B, 51012

I would like to drop the leading 0's from all uniqueID's so that I can merge the files.

I have tried using the following code:

Test$UniqueID2 <- substr(Dataset$UniqueID, regexpr("[^0]", Dataset$UniqueID), nchar(Dataset$UniqueID))

This generated the following error:

Error in nchar(Dataset$UniqueID) : 
  'nchar()' requires a character vector

A solution which will allow me to drop leading 0's in R would be much appreciated.

Upvotes: 2

Views: 3682

Answers (1)

akrun

Reputation: 887118

We can use sub for this. The pattern matches one or more zeros (0+) at the start (^) of the string, followed by zero or more digits ([0-9]*) up to the end ($) of the string. The digits are captured as a group, and the whole match is replaced by the backreference (\\1) to that group, so only IDs made up entirely of digits are changed.

sub("^0+([0-9]*)$", "\\1", str1)
#[1] "85405"  "021X1B" "51012"

If we want to remove the leading zeros from all the IDs, including those that contain letters:

sub("^0+", "", str1)
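Since the question mentions that the column is a factor, here is a minimal sketch of applying the first pattern to a data frame column. The `Dataset` data frame below is a hypothetical stand-in built from the example IDs, and `UniqueID2` is just an illustrative name for the cleaned column. Converting with as.character() first is an assumption about how the data was imported; it makes the input type explicit before the regex is applied.

```r
# Hypothetical stand-in for the imported data; stringsAsFactors = TRUE
# mimics an import where the ID column arrives as a factor.
Dataset <- data.frame(UniqueID = c("085405", "021X1B", "0051012"),
                      stringsAsFactors = TRUE)

# Convert the factor to a character vector so string functions behave as expected
ids <- as.character(Dataset$UniqueID)

# Strip leading zeros only from IDs made up entirely of digits,
# which matches how the larger files were read in
Dataset$UniqueID2 <- sub("^0+([0-9]*)$", "\\1", ids)

Dataset$UniqueID2
#[1] "85405"  "021X1B" "51012"
```

Applying the same step to every file before merging should give consistent keys, assuming no two distinct IDs differ only by their leading zeros.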

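Or we can use the as.numeric approach: converting to numeric drops the leading zeros, and the IDs that cannot be converted are then restored from the original strings

v1 <- as.numeric(str1)            # non-numeric IDs become NA (with a coercion warning)
v1[is.na(v1)] <- str1[is.na(v1)]  # restore those IDs; v1 is now a character vector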
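One caveat with the numeric route, sketched below under the assumption that the IDs are stored as a factor: calling as.numeric() directly on a factor returns the internal level codes rather than the IDs, so the factor should go through as.character() first. The names `f`, `chr`, `num`, and `out` are illustrative only.

```r
# Example IDs stored as a factor, as in the question
f <- factor(c("085405", "021X1B", "0051012"))

# as.numeric(f) would return the integer level codes, not the IDs,
# so convert to character first
chr <- as.character(f)

# Non-numeric IDs become NA; the coercion warning is expected and suppressed here
num <- suppressWarnings(as.numeric(chr))

# Keep the original string where conversion failed, otherwise the numeric value
out <- ifelse(is.na(num), chr, as.character(num))

out
#[1] "85405"  "021X1B" "51012"
```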

data

str1 <- c("085405", "021X1B", "0051012")

Upvotes: 4
