Omaja7

Reputation: 129

Validate predictable phrases in multiline text separated by commas

I have a textarea where the user can enter orders.

For example:

see north,
see south,
sell 7 wood,
see west

I need a regular expression that matches the following rules:

So far I have made this regular expression:

preg_match("/(((see) (north|south|west|east))|((sell|buy) ([1-9][0-9]{0,2}) (wood)))/", $order); // $order: one item from the explode() result

But the problem is that the following input is also accepted as valid, which it should not be: sell 75 wood see north (it should be either sell 75 wood OR see north)

NB! Right now I don't have comma validation in my regular expression, because I use the PHP function explode to split on commas and then pass the resulting array items to the regex. But this approach doesn't seem to work with the following input:

see north,
see south        *(no comma between two orders)*
sell 7 wood,
see west

So, I need one of the following solutions:

Upvotes: 2

Views: 98

Answers (2)

mickmackusa

Reputation: 47894

Contrary to Niet's advice, I would definitely recommend regex for validating this very predictably formatted text.

For validation, only allow a comma and a newline sequence when another valid line follows. Use start and end of string anchors (^ and $) so that the whole input must match. \R is a good idea when parsing newline sequences that are out of your control or are potentially coming from a different operating system.

^                                     #start of string
(                                     #start capture group 1
 see (?:north|east|south|west)        #1st phrase option
 |                                    #or
 (?:sell|buy) (?:[1-9]\d{0,2}) wood   #2nd phrase option
)                                     #end capture group 1
(?:                                   #start of non-capturing group for logic encapsulation
   ,                                  #literal comma
   \R                                 #a newline sequence (e.g. \n, \r\n)
   (?1)                               #repeat the subpattern of the first capture group
)*                                    #allow zero or more occurrences of the encapsulated logic                  
$                                     #end of string anchor

Validation and Splitting:

if (preg_match('/^(see (?:north|east|south|west)|(?:sell|buy) (?:[1-9]\d{0,2}) wood)(?:,\R(?1))*$/', $text)) {
    var_export(preg_split('/,\R/', $text));
}

Output:

array (
  0 => 'see north',
  1 => 'see south',
  2 => 'sell 7 wood',
  3 => 'see west',
)
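
To see the anchored pattern reject the problem inputs from the question, a minimal sketch along these lines could be used. The function name ordersAreValid and the sample strings are assumptions made for illustration; the pattern itself is the one shown above.

```php
<?php
// Hypothetical helper wrapping the pattern above; the name is illustrative only.
function ordersAreValid(string $text): bool
{
    $pattern = '/^(see (?:north|east|south|west)|(?:sell|buy) (?:[1-9]\d{0,2}) wood)(?:,\R(?1))*$/';
    return preg_match($pattern, $text) === 1;
}

// Every order is followed by a comma and a newline, except the last one.
var_dump(ordersAreValid("see north,\nsee south,\nsell 7 wood,\nsee west")); // bool(true)
// Windows-style newlines are covered by \R.
var_dump(ordersAreValid("see north,\r\nsell 7 wood"));                      // bool(true)
// Missing comma after "see south".
var_dump(ordersAreValid("see north,\nsee south\nsell 7 wood,\nsee west"));  // bool(false)
// Two orders on one line.
var_dump(ordersAreValid("sell 75 wood see north"));                         // bool(false)
```

One caveat: without the D modifier, $ also matches just before a single trailing newline, so \z (or the D modifier) may be preferable if a trailing newline should be rejected.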

Upvotes: 0

Niet the Dark Absol

Reputation: 324640

Do not use a Regex for this. This is a job for a (basic) parser.

Do whatever you need to get one command at a time. This could be explode, for instance. Use trim if necessary to remove whitespace from the start and end.
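
As a rough sketch of that first step, the input could be split on commas and each piece trimmed. The helper name splitCommands is hypothetical, and dropping empty pieces is an assumption about how a trailing comma should be treated.

```php
<?php
// Hypothetical helper: split raw textarea input into trimmed commands.
function splitCommands(string $text): array
{
    $commands = array_map('trim', explode(',', $text));
    // Drop empty pieces (e.g. from a trailing comma); this is an assumption, not a requirement.
    return array_values(array_filter($commands, fn(string $c): bool => $c !== ''));
}

foreach (splitCommands("see north,\nsee south,\nsell 7 wood,\nsee west") as $command) {
    echo $command, PHP_EOL;
}
```

With the missing-comma input from the question, "see south" and "sell 7 wood" stay together in a single piece, so it is the per-command validation that has to reject it.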

Then, $parts = explode(" ",$command);

You can now switch($parts[0]) to determine what to do based on the first keyword.

case "see":
    if( !in_array($parts[1], ["north","south","east","west"])) {
        throw new OutOfBoundsException("Invalid direction");
    }
    // do something here
    break;

Notice how validation is super easy and it's possible to provide specific error messages so that the user knows what they did wrong.

case "sell":
    $q = intval($parts[1]);
    if( $q < 1 || $q > 999) {
        throw new OutOfBoundsException("Invalid amount of things to sell");
    }
    $what = $parts[2];
    if( !in_array($what, ["wood"])) {
        throw new OutOfBoundsException("Invalid thing to sell");
    }
    // do something
    break;

default:
    throw new OutOfBoundsException("Invalid command");
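
Put together, the fragments above might look like the following sketch. The function name validateCommand, the buy case sharing the sell branch, the argument-count checks, the ctype_digit check, and the returned array shape are all assumptions added for illustration, not part of the fragments themselves.

```php
<?php
// Hypothetical function assembling the switch fragments into one runnable unit.
function validateCommand(string $command): array
{
    $parts = explode(' ', trim($command));

    switch ($parts[0]) {
        case 'see':
            // Exactly one argument, and it must be a known direction.
            if (count($parts) !== 2 || !in_array($parts[1], ['north', 'south', 'east', 'west'], true)) {
                throw new OutOfBoundsException('Invalid direction');
            }
            return ['action' => 'see', 'direction' => $parts[1]];

        case 'sell':
        case 'buy': // assumption: "buy" follows the same rules as "sell"
            if (count($parts) !== 3) {
                throw new OutOfBoundsException('Invalid command');
            }
            // ctype_digit rejects input such as "7abc", which intval() would read as 7.
            $q = ctype_digit($parts[1]) ? (int) $parts[1] : 0;
            if ($q < 1 || $q > 999) {
                throw new OutOfBoundsException("Invalid amount of things to {$parts[0]}");
            }
            if (!in_array($parts[2], ['wood'], true)) {
                throw new OutOfBoundsException("Invalid thing to {$parts[0]}");
            }
            return ['action' => $parts[0], 'quantity' => $q, 'item' => $parts[2]];

        default:
            throw new OutOfBoundsException('Invalid command');
    }
}

var_export(validateCommand('sell 7 wood'));
```

A piece such as "see south" followed by a newline and "sell 7 wood" (the missing-comma case from the question) splits into four parts, so the see branch throws "Invalid direction". Unlike a pattern based on [1-9]\d{0,2}, this sketch accepts quantities with leading zeros such as 007.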

This whole process is all about taking a big problem and breaking it down. It's also very, very easy to change how it works, what commands and parameters are allowed, etc. Changing a regex would be much harder.

Upvotes: 3
