dc3

Reputation: 188

Matching the first digits in R

Anyone know a good way to match and categorize the first n digits of a number in R?

For example,

123451
123452
123461
123462

In this case, if we match on the first n = 1 to 4 digits, all four numbers fall into the same group. If we match on the first n = 5 digits, we get 2 groups.

I thought about doing this by converting the numeric vector to a character vector, truncating each element to its first n characters, and matching on those; however, I have a lot of numbers, and it seems there must be a better way to group by only the first n digits of a number in R. Any thoughts?
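For reference, the character-based approach described above might look roughly like the sketch below. The helper name `first_digits_chr` is hypothetical (it is not from the post), and the sketch assumes non-negative whole numbers.

```r
# Hypothetical helper (not from the original post): keep the first n
# characters of each number's decimal representation.
# Assumes x holds non-negative whole numbers.
first_digits_chr <- function(x, n) {
    # format() with scientific = FALSE avoids strings such as "1e+05",
    # which as.character() can produce for round values
    s <- format(x, scientific = FALSE, trim = TRUE)
    substr(s, 1, n)
}

nums <- c(123451, 123452, 123461, 123462)

# One factor level per distinct n-digit prefix
groups_chr <- factor(first_digits_chr(nums, 5))
```

Both `format()` and `substr()` are vectorised, so no explicit loop over the numbers is needed; whether this is fast enough for a very large vector would have to be measured.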

Thanks!

Upvotes: 1

Views: 503

Answers (1)

Ken Benoit

Reputation: 14902

Here's a vectorised solution that does not involve conversion to character:

nums <- c(123451,
          123452,
          123461,
          123462)

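# Assumes x holds positive numbers with at least n digits each.
firstDigits <- function(x, n) {
    # number of digits in the integer part of x
    ndigits <- floor(log10(x)) + 1
    # drop the trailing (ndigits - n) digits
    floor(x / 10^(ndigits - n))
}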

factor(firstDigits(nums, 4))
## [1] 1234 1234 1234 1234
## Levels: 1234
factor(firstDigits(nums, 5))
## [1] 12345 12345 12346 12346
## Levels: 12345 12346
factor(firstDigits(nums, 6))
## [1] 123451 123452 123461 123462
## Levels: 123451 123452 123461 123462
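As a possible follow-up, the prefixes can be used to group the original numbers rather than only label them. The sketch below repeats `firstDigits` so that it runs on its own; the variable names `groups` and `counts` are illustrative and not part of the answer. It assumes positive inputs with at least n digits, since `log10()` returns `-Inf` for zero and `NaN` for negative values.

```r
# Repeated from the answer so this block is self-contained
firstDigits <- function(x, n) {
    ndigits <- floor(log10(x)) + 1
    floor(x / 10^(ndigits - n))
}

nums <- c(123451, 123452, 123461, 123462)

# Illustrative names: a list of the original numbers, keyed by 5-digit prefix
groups <- split(nums, firstDigits(nums, 5))

# Number of values sharing each prefix
counts <- table(firstDigits(nums, 5))
```

Here `split()` returns a named list with one element per prefix, which is convenient when each group needs further processing.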

Upvotes: 1
