Maël

Reputation: 52004

Convert ordinal to numbers

Is there a built-in way to convert ordinal numbers to numeric vectors?

ordinal <- c("First", "Third", "Second")
ordinal_to_numeric(ordinal)
#[1] 1 3 2

ordinal2 <- c("1st", "4th", "2nd")
ordinal_to_numeric(ordinal2)
#[1] 1 4 2

One could of course build a lookup dictionary by hand, but that quickly becomes cumbersome.

Upvotes: 4

Views: 958

Answers (3)

LMc

Reputation: 18642

A solution for your second vector is straightforward. If you are open to using a non-CRAN package for your first vector, you could do something like:

# devtools::install_github("benmarwick/words2number")
library(words2number)

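text2num <- function(x) {
  # TRUE for digit-style ordinals such as "1st"
  isOrd <- grepl("\\d+", x)
  # Strip the non-digit suffix; x stays a character vector for now
  x[isOrd] <- as.numeric(gsub("\\D+", "", x[isOrd]))
  # Spelled-out ordinals are passed to words2number::to_number()
  x[!isOrd] <- to_number(x[!isOrd])
  as.numeric(x)
}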

One handy benefit is this can handle a vector that has a combination of both cases:

text2num(c(ordinal, ordinal2))
# [1] 1 3 2 1 4 2

Some drawbacks mentioned:

  1. This package is experimental
  2. It does not handle decimals (e.g. six point one)
  3. There is an upper limit that the package author specifies is in the millions
  4. It has an unnecessary dependency on magrittr for the %>% operator (unnecessary now that R has a native pipe)

Upvotes: 1

zephryl

Reputation: 17134

I’m late to the party and @DaveArmstrong’s solution is definitely simpler, but here’s a slightly more generic solution that first converts ordinals to cardinals, then passes these to nombre::uncardinal() for conversion to numeric. The str_replace_all() vector for ordinal -> cardinal conversion is based on source code for nombre::ordinal().

library(stringr)
library(nombre)

ordinal_to_numeric <- function(x) {
  w_word_stem <- function(x) {
    x |>
      str_to_lower() |>
      str_remove("st$|nd$|rd$|th$") |>
      str_replace_all(c(
        "fir$" = "one",
        "seco$" = "two",
        "thi$" = "three",
        "f$" = "ve",
        "eigh$" = "eight",
        "nin$" = "nine",
        "ie$" = "y"
      )) |>
      uncardinal()
  }
  w_num_stem <- function(x) {
    x |>
      str_extract("^-?\\d+") |>
      as.numeric()
  }
  out <- suppressWarnings(ifelse(
    str_starts(x, "-?\\d"), 
    w_num_stem(x),
    w_word_stem(x)
  ))
  if (any(is.na(out) & !is.na(x))) {
    warning("Conversion failed for some inputs")
  }
  out
}

ordinal <- c("First", "Third", "Second", "Five Hundred Thirty Eighth", "Negative Twenty-Third")
ordinal_to_numeric(ordinal)
# 1   3   2 538 -23

ordinal2 <- c("1st", "4th", "2nd", "538th", "-23rd")
ordinal_to_numeric(ordinal2)
# 1   4   2 538 -23
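
To see what the ordinal -> cardinal step does on its own, here is a minimal base R sketch of the same stem rules, without stringr or nombre. The name ordinal_stem_to_cardinal() is made up for illustration, and the function only rewrites the words; turning them into numbers is still left to something like nombre::uncardinal().

```r
# Hypothetical helper (not part of nombre or stringr): rewrites a spelled-out
# ordinal as the matching cardinal words, using the same stem rules as above.
ordinal_stem_to_cardinal <- function(x) {
  x <- tolower(x)
  # Drop the ordinal suffix: "first" -> "fir", "twentieth" -> "twentie"
  x <- sub("(st|nd|rd|th)$", "", x)
  # Repair the irregular stems left behind, applied in this order
  stems <- c(
    "fir$"  = "one",
    "seco$" = "two",
    "thi$"  = "three",
    "f$"    = "ve",
    "eigh$" = "eight",
    "nin$"  = "nine",
    "ie$"   = "y"
  )
  for (pattern in names(stems)) {
    x <- sub(pattern, stems[[pattern]], x)
  }
  x
}

ordinal_stem_to_cardinal(c("First", "Twelfth", "Twentieth", "Negative Twenty-Third"))
# "one", "twelve", "twenty", "negative twenty-three"
```

Only the end of the string is touched, which is why multi-word ordinals such as "Five Hundred Thirty Eighth" work: everything before the last word is already in cardinal form.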

Upvotes: 4

DaveArmstrong

Reputation: 21937

Not exactly built-in, but you can use Ritchie Sacramento's suggestion of the english package. First build a long vector of the ordinal values spelled out in words, then find the position of each of your words in that ordered vector:

library(english)
ordinal <- c("First", "Third", "Second")
o <- ordinal(1:1000)
match(tolower(ordinal), o)
#> [1] 1 3 2

The second, as Ritchie suggests, is less complicated. I used a slightly different method, but ultimately it does the same thing.

ordinal2 <- c("1st", "4th", "2nd")
as.numeric(stringr::str_extract(ordinal2, "\\d+"))
#> [1] 1 4 2

Created on 2023-01-11 by the reprex package (v2.0.1)

You could even put them together in a single function:

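ordinal_to_numeric <- function(x, max_ord = 1000) {
  if (any(grepl("\\d", x))) {
    # Digit-style input ("1st"): assumes every element contains digits
    as.numeric(stringr::str_extract(x, "\\d+"))
  } else {
    # Word-style input ("First"): position among the first max_ord ordinals
    o <- english::ordinal(seq_len(max_ord))
    match(tolower(x), o)
  }
}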
ordinal <- c("First", "Third", "Second")
ordinal_to_numeric(ordinal)
#> [1] 1 3 2

ordinal2 <- c("1st", "4th", "2nd")
ordinal_to_numeric(ordinal2)
#> [1] 1 4 2

Created on 2023-01-11 by the reprex package (v2.0.1)
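
Note that the combined function branches on the whole vector, so a vector that mixes "1st" and "First" would return NA for the spelled-out entries. Below is a rough base R sketch of an element-wise variant. The name ordinal_to_numeric_mixed() and the ten-entry word_ordinals table are my own placeholders; the table only stands in for english::ordinal(1:1000) so the example runs without extra packages.

```r
# Hypothetical stand-in for the english::ordinal() lookup; extend it or
# replace it with the package output to cover a larger range.
word_ordinals <- c("first", "second", "third", "fourth", "fifth",
                   "sixth", "seventh", "eighth", "ninth", "tenth")

ordinal_to_numeric_mixed <- function(x, lookup = word_ordinals) {
  has_digit <- grepl("[0-9]", x)
  out <- rep(NA_real_, length(x))
  # Digit-style ordinals: keep the first run of digits ("4th" -> 4)
  out[has_digit] <- as.numeric(
    sub("^[^0-9]*([0-9]+).*$", "\\1", x[has_digit])
  )
  # Word-style ordinals: position in the lookup, NA if not found
  out[!has_digit] <- match(tolower(x[!has_digit]), lookup)
  out
}

ordinal_to_numeric_mixed(c("First", "4th", "Third", "2nd"))
# 1 4 3 2
```

Words missing from the lookup come back as NA, which is the same behaviour match() gives in the function above when max_ord is too small.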

Upvotes: 5
