user4826347

Reputation: 783

Regex to replace a combination of number(digits/word) and a word

I use the following code to replace a number followed by a word with a replacement text:

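// The pattern must be a string (with escaped backslashes) to pass to RegExp.
// The "i" flag already covers Apple/Apples, so only lower case is listed.
var rule = "(\\d+\\s((apple\\b|apples\\b)+))";
var search_regexp = new RegExp(rule, "ig");
return masterstring.replace(search_regexp, replacetext);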

How can a single regular expression handle both 10 apples and Ten apples? That is, one that identifies

 (a number in digits or word)+space+(a case insensitive word) 

and replaces it with 10 Oranges, in both jQuery and PHP?

Upvotes: 0

Views: 1214

Answers (1)

Theo

Reputation: 1633

If you specifically only want to match valid number 'words' you would have to literally include in your regex all the numbers you want to include.

(one|two|three|four|five|six|seven|eight|nine|ten) etc.

This could be improved by combining words that start with the same letter:

(one|t(wo|hree|en)|f(our|ive)|s(ix|even)|eight|nine)

You can then include your \d+ as your first option:

(\d+|one|t(wo|hree|en)|f(our|ive)|s(ix|even)|eight|nine)

(As some said in the comments, you are using the case-insensitive modifier, so I have written everything in lower case.)
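As a rough sketch of how the combined pattern could be plugged into the code from the question: the helper name `replaceApples` is made up for illustration, and the `\b` anchors and `\s+` are assumptions added here so that something like "someone apples" is left alone. It only covers digits and the words one to ten.

```javascript
// Hypothetical helper: replaces "<number> apple(s)" with the given text.
// Covers digits and the words one..ten only, as in the pattern above.
// Assumption: \b and \s+ were added here; they are not part of the answer's pattern.
function replaceApples(masterstring, replacetext) {
  var rule = "\\b(?:\\d+|one|t(?:wo|hree|en)|f(?:our|ive)|s(?:ix|even)|eight|nine)\\s+apples?\\b";
  // "i" makes the match case insensitive, "g" replaces every occurrence.
  var search_regexp = new RegExp(rule, "ig");
  return masterstring.replace(search_regexp, replacetext);
}

console.log(replaceApples("I ate 10 apples and Ten Apples.", "10 Oranges"));
// prints "I ate 10 Oranges and 10 Oranges."
```

The same pattern should carry over to PHP's `preg_replace` with the `i` modifier, although that variant is not shown or tested here.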

Note that if you want to go beyond ten, this becomes quite long and hard to make efficient. I've had a quick go and created a beast of a regex; I have not tried to optimise it much.

(?:
  \d+
  |t(?:en|hirteen)
  |eleven
  |twelve
  |fifteen
  |(?:
    (?:twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)
    (?:[ -](?:one|t(?:wo|hree)|f(?:our|ive)|s(?:ix|even)|eight|nine))?
  )
  |(?:one|t(?:wo|hree)|f(?:our(?:teen)?|ive)|s(?:ix|even)(?:teen)?|eight(?:een)?|nine(?:teen)?)
)[ ]apples?

I have spread this over several lines and added the 'x' modifier in the online example, which makes it much easier to read. This works in PHP but not in JavaScript, where you would have to remove the newlines and whitespace first. [See working example online here](https://regex101.com/r/zDYme7/1)
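Since JavaScript has no 'x' modifier, one possible way to keep the pattern readable there is to build it from an array of alternatives. This is only a sketch: `numberAlternatives` and `buildAppleRegExp` are names invented for the example, and it uses the spelling "forty". Joining the pieces also avoids a pitfall of blindly stripping whitespace, which would remove the literal spaces inside `[ ]` and `[ -]` as well.

```javascript
// Each array entry is one alternative of the number pattern; joining with "|"
// keeps the pieces readable without needing the x modifier.
var numberAlternatives = [
  "\\d+",
  "t(?:en|hirteen)",
  "eleven",
  "twelve",
  "fifteen",
  // Tens, optionally followed by a space or hyphen and a unit word.
  "(?:twenty|thirty|forty|fifty|sixty|seventy|eighty|ninety)" +
    "(?:[ -](?:one|t(?:wo|hree)|f(?:our|ive)|s(?:ix|even)|eight|nine))?",
  // Units and the regular "-teen" forms.
  "(?:one|t(?:wo|hree)|f(?:our(?:teen)?|ive)|s(?:ix|even)(?:teen)?|eight(?:een)?|nine(?:teen)?)"
];

// Hypothetical helper; "g" replaces all matches, "i" ignores case.
function buildAppleRegExp() {
  return new RegExp("(?:" + numberAlternatives.join("|") + ")[ ]apples?", "gi");
}

var sample = "Twenty-one apples, 14 apples and Nineteen Apple";
console.log(sample.replace(buildAppleRegExp(), "10 Oranges"));
// prints "10 Oranges, 10 Oranges and 10 Oranges"
```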

It's also worth mentioning that regex may not be the best way to do this: a string tokenizer would involve a lot less CPU time, but more code.

One example of a tokenizer: https://www.npmjs.com/package/tokenize-text
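To give an idea of what the tokenizer approach might look like, here is a minimal hand-rolled sketch in plain JavaScript. It does not use the package linked above. The names `NUMBER_WORDS`, `FRUIT_WORDS`, `isNumberToken` and `replaceByTokens` are all hypothetical, and the word list is deliberately short.

```javascript
// Assumption: only the words one..ten are listed; extend the set as needed.
var NUMBER_WORDS = new Set([
  "one", "two", "three", "four", "five",
  "six", "seven", "eight", "nine", "ten"
]);
var FRUIT_WORDS = new Set(["apple", "apples"]);

// True for a run of digits or a known number word (case insensitive).
function isNumberToken(token) {
  return /^\d+$/.test(token) || NUMBER_WORDS.has(token.toLowerCase());
}

// Hypothetical helper: walks space-separated tokens and swaps each
// "<number> <apple(s)>" pair for the replacement text.
function replaceByTokens(masterstring, replacetext) {
  var tokens = masterstring.split(" ");
  var out = [];
  for (var i = 0; i < tokens.length; i++) {
    var next = tokens[i + 1];
    if (next !== undefined && isNumberToken(tokens[i]) &&
        FRUIT_WORDS.has(next.toLowerCase())) {
      out.push(replacetext);
      i++; // skip the fruit token that was just consumed
    } else {
      out.push(tokens[i]);
    }
  }
  return out.join(" ");
}

console.log(replaceByTokens("I ate 10 apples and Ten Apples today", "10 Oranges"));
// prints "I ate 10 Oranges and 10 Oranges today"
```

This simple version splits on single spaces only, so a token with punctuation attached (for example "apples.") is not recognised; a real tokenizer would need to handle that.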

Upvotes: 1
