Sachin Hegde

Reputation: 51

Convert numbers written in words to numbers using R

My challenge is to convert number words such as "ten" and "one" in an input sentence to the digits 10 and 1:

example_input <- paste0("I have ten apple and one orange")

The numbers may vary depending on the user's input, and the input sentence can be tokenized. The desired output is:

my_output_toget<-paste("I have 10 apple and 1 orange")

Upvotes: 5

Views: 1405

Answers (4)

FelixST

Reputation: 301

I wrote an R package to do this (https://github.com/fsingletonthorn/words_to_numbers), which should work for more use cases.

devtools::install_github("fsingletonthorn/words_to_numbers")

library(wordstonumbers)

example_input <- "I have ten apple and one orange"

words_to_numbers(example_input)

[1] "I have 10 apple and 1 orange"

It also works for much more complex cases, such as:


words_to_numbers("The Library of Babel (by Jorge Luis Borges) describes a library that contains all possible four-hundred and ten page books made with a character set of twenty five characters (twenty two letters, as well as spaces, periods, and commas), with eighty lines per book and forty characters per line.")
#> [1] "The Library of Babel (by Jorge Luis Borges) describes a library that contains all possible 410 page books made with a character set of 25 characters (22 letters, as well as spaces, periods, and commas), with 80 lines per book and 40 characters per line."

Or

words_to_numbers("300 billion, 2 hundred and 79 cats")
#> [1] "300000000279 cats"

Upvotes: 3

tmfmnk

Reputation: 39858

The textclean package is handy for this task:

library(textclean)
mgsub(example_input, replace_number(seq_len(10)), seq_len(10))

[1] "I have 10 apple and 1 orange"

You just need to adjust the argument of seq_len() to the maximum number in your data.

Some examples:

example_input <- c("I have one hundred apple and one orange")

mgsub(example_input, replace_number(seq_len(100)), seq_len(100))

[1] "I have 100 apple and 1 orange"

example_input <- c("I have one thousand apple and one orange")

mgsub(example_input, replace_number(seq_len(1000)), seq_len(1000))

[1] "I have 1000 apple and 1 orange"

If you don't know your maximum number beforehand, you can just choose a sufficiently big number.

Upvotes: 3

boski

Reputation: 2467

Less elegant than akrun's answer, but in base R.

# Number words in order, so each word's position equals its value
nums <- c("one", "two", "three", "four", "five",
          "six", "seven", "eight", "nine", "ten")
example_input <- "I have ten apple and one orange"

# Split on spaces, then look up each token in nums (NA if not a number word)
aux <- strsplit(example_input, " ")[[1]]
idx <- match(aux, nums)
aux[!is.na(idx)] <- idx[!is.na(idx)]
example_output <- paste(aux, collapse = " ")
example_output
[1] "I have 10 apple and 1 orange"

We first split the sentence on spaces, find the tokens that match a number word, and replace each with its position in nums (which coincides with the number itself). Then we paste the tokens back together.
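
The position trick only works while the word list runs 1, 2, 3, ... without gaps. Here is a minimal sketch of the same split, match and paste idea with an explicit lookup table. The named vector `lookup` and its entries are hypothetical, chosen only for illustration:

```r
# Hypothetical lookup table: names are the number words, values are the numbers.
lookup <- c(one = 1, ten = 10, twenty = 20, hundred = 100)

sentence <- "I have twenty apple and one orange"

# Split on spaces, as in the answer above.
tokens <- strsplit(sentence, " ")[[1]]

# Replace only the tokens that appear as names in the lookup table.
hit <- tokens %in% names(lookup)
tokens[hit] <- lookup[tokens[hit]]

result <- paste(tokens, collapse = " ")
result
```

Like the original, this only handles single-word numbers separated by spaces; compound numbers such as "twenty one" would need extra logic.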

Upvotes: 1

akrun

Reputation: 887128

We can pass a list of key/value pairs as the replacement in gsubfn to replace those words with numbers:

library(english)
library(gsubfn)
gsubfn("\\w+", setNames(as.list(1:10), as.english(1:10)), example_input)
#[1] "I have 10 apple and 1 orange"
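
If installing packages is not an option, the same key/value idea can be approximated in base R with regmatches(). This is only a sketch: the helper name `replace_number_words` and the `lookup` vector are assumptions for illustration, not part of gsubfn or english.

```r
# Hypothetical helper: replace whole alphabetic words found in `lookup`
# (a named vector, names = number words) with their values.
replace_number_words <- function(x, lookup) {
  # Locate every run of letters, so punctuation stays untouched.
  m <- gregexpr("[[:alpha:]]+", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(words) {
    key <- tolower(words)              # match case-insensitively
    hit <- key %in% names(lookup)
    words[hit] <- lookup[key[hit]]     # numbers are coerced to character
    words
  })
  x
}

lookup <- setNames(1:10, c("one", "two", "three", "four", "five",
                           "six", "seven", "eight", "nine", "ten"))

result <- replace_number_words("I have Ten apples, and one orange.", lookup)
result
```

Because whole words are matched, a word such as "someone" is left alone even though it contains "one".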

Upvotes: 6
