Reputation: 43
I have imported multiple Excel files into R using the read.csv() function.
On the smaller files, the leading 0's in the uniqueID column have been kept, e.g. 085405, 021X1B, 0051012
However, on the larger files, the leading 0's have been dropped from the uniqueIDs that contain only numbers, e.g. 85405, 021X1B, 51012
I would like to drop the leading 0's from all uniqueIDs so that I am able to merge the files.
I have tried using the following code:
Test$UniqueID2 <- substr(Dataset$UniqueID,regexpr("[^0]",Dataset$UniqueID),nchar(Dataset$UniqueID))
This generated the following error:
Error in nchar(Dataset$UniqueID) :
'nchar()' requires a character vector
A solution which will allow me to drop leading 0's in R would be much appreciated.
Upvotes: 2
Views: 3682
Reputation: 887118
We can use sub for this. The pattern matches one or more zeros (0+) at the start (^) of the string, followed by zero or more digits ([0-9]*) up to the end ($) of the string. The digits are captured as a group and the whole match is replaced by the backreference (\\1) to that group, so only IDs made up entirely of digits are changed.
sub("^0+([0-9]*)$", "\\1", str1)
#[1] "85405" "021X1B" "51012"
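As a small self-contained check of that pattern (the vector ids below is a made-up example, not data from the question): an ID consisting only of zeros would be reduced to an empty string. One possible variant, sketched here, requires at least one digit in the captured group so that such an ID keeps a single 0.

```r
# Hypothetical sample IDs, including an all-zero edge case
ids <- c("085405", "021X1B", "0051012", "000")

# Pattern from above: all-digit IDs lose their leading zeros,
# but "000" is reduced to the empty string
sub("^0+([0-9]*)$", "\\1", ids)

# Variant: [0-9]+ forces the group to keep at least one digit,
# so "000" becomes "0" instead of ""
stripped <- sub("^0+([0-9]+)$", "\\1", ids)
stripped
```

IDs that contain letters, such as 021X1B, do not match either pattern and are returned unchanged.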
If we want to remove the leading zeros from all the IDs, including those that contain letters
sub("^0+", "", str1)
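To connect this to the merge in the question, here is a minimal sketch on two made-up data frames (small, large and the columns a and b are hypothetical names). It assumes the nchar() error arose because the ID column was read in as a factor, which the question does not confirm, so the column is converted with as.character() first.

```r
# Hypothetical stand-ins for a "small" and a "large" imported file
small <- data.frame(UniqueID = factor(c("085405", "021X1B", "0051012")),
                    a = 1:3)
large <- data.frame(UniqueID = c("85405", "021X1B", "51012"),
                    b = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# nchar() rejects factors, so convert to character explicitly,
# then strip the leading zeros from every ID in both data frames
small$UniqueID <- sub("^0+", "", as.character(small$UniqueID))
large$UniqueID <- sub("^0+", "", as.character(large$UniqueID))

# The keys now agree, so the two data frames can be merged
merged <- merge(small, large, by = "UniqueID")
merged
```

Stripping zeros from every ID on both sides keeps the keys consistent, at the cost of changing IDs such as 021X1B to 21X1B.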
Or we can use the as.numeric approach
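v1 <- as.numeric(str1)            # non-numeric IDs become NA (with a coercion warning)
v1[is.na(v1)] <- str1[is.na(v1)]  # put those IDs back; v1 is now a character vector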
The example data used above:
str1 <- c("085405", "021X1B", "0051012")
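A possible refinement of the as.numeric approach, offered as a sketch rather than part of the original answer: converting numbers back to text implicitly may render some large round values in scientific notation, so the numeric IDs are formatted explicitly here. The names is_num and out are illustrative.

```r
str1 <- c("085405", "021X1B", "0051012")

# Non-numeric IDs become NA; the coercion warning is expected, so silence it
v1 <- suppressWarnings(as.numeric(str1))
is_num <- !is.na(v1)

# Start from the original strings and overwrite only the all-numeric IDs,
# formatting them in fixed notation without padding
out <- str1
out[is_num] <- format(v1[is_num], scientific = FALSE, trim = TRUE)
out
```

The IDs that contain letters are never touched, so 021X1B keeps its leading zero, matching the behaviour described in the question for the larger files.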
Upvotes: 4