Ferrett Steinmetz

Reputation: 332

Parsing Customized Search Syntaxes in PHP?

There's a site out there that has a customized query language that can be passed along like this:

o:target o:creature (r:mythic or r:rare) t:"artifact creature"

Now, I know I could use a rather complicated regex to parse a query like this... but there may be as many as fifty different ways of querying, and that's gonna get ridiculously bad when people use, say, nested parentheses to search for things.

So: is there a PHP library that parses such strings automatically? Or is there a best practice for parsing potentially complex things like this? (I'm finding YAML, which seems complex, but that may be the answer.)

Upvotes: 1

Views: 71

Answers (1)

Sammitch

Reputation: 32252

This gets a bit more complex than you might think, particularly if you're going to allow bracketed sub-expressions like this. You'll likely need to break the query down into logical terms and push them onto a stack, as in RPN/postfix arithmetic, using a modified Shunting-Yard algorithm.

Judging by the explicit "or" in your query, I assume that everything else should be ANDed together. If we represent AND as + and OR as *, your expression is equivalent to:

o:target + o:creature + ( r:mythic * r:rare ) + t:"artifact creature" 

Which should look something like this in RPN/postfix:

o:target o:creature + r:mythic r:rare * + t:"artifact creature" +
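As a rough illustration of that conversion, here is a minimal Shunting-Yard sketch. It assumes the query has already been split into an array of tokens, that + (AND) binds tighter than * (OR), and that both are left-associative. The function name toPostfix is made up for this example.

```php
<?php
// Sketch only: convert infix tokens to RPN/postfix order.
// Assumptions: '+' is AND, '*' is OR (the notation used above),
// AND binds tighter than OR, and both are left-associative.
function toPostfix(array $tokens): array
{
    $precedence = ['+' => 2, '*' => 1];
    $output = [];
    $stack = [];

    foreach ($tokens as $token) {
        if (isset($precedence[$token])) {
            // Pop operators of equal or higher precedence before pushing.
            while ($stack && end($stack) !== '('
                && $precedence[end($stack)] >= $precedence[$token]) {
                $output[] = array_pop($stack);
            }
            $stack[] = $token;
        } elseif ($token === '(') {
            $stack[] = $token;
        } elseif ($token === ')') {
            // Unwind to the matching "(".
            while ($stack && end($stack) !== '(') {
                $output[] = array_pop($stack);
            }
            if (!$stack) {
                throw new InvalidArgumentException('Unbalanced ")"');
            }
            array_pop($stack); // discard the "("
        } else {
            $output[] = $token; // a search term such as o:target
        }
    }

    // Flush whatever operators are left.
    while ($stack) {
        $op = array_pop($stack);
        if ($op === '(') {
            throw new InvalidArgumentException('Unbalanced "("');
        }
        $output[] = $op;
    }

    return $output;
}

$infix = ['o:target', '+', 'o:creature', '+', '(', 'r:mythic', '*', 'r:rare', ')',
          '+', 't:"artifact creature"'];
echo implode(' ', toPostfix($infix)), PHP_EOL;
```

Because the parentheses are explicit in this query, the precedence assumption does not affect the result here; it only matters for input such as a or b c.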

You would likely want to iterate through the string using strtok [with special cases to deal with parentheses with no leading/trailing whitespace as well as matching quotes] to build a logical structure that you can then use to build your query expression in whatever language you like.
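One possible way to handle those special cases is sketched below. Note that it swaps strtok for preg_match_all, since a single pattern can keep quoted phrases together and split off parentheses that have no surrounding whitespace. The function name tokenize, the mapping of the word "or" to *, and the insertion of + between adjacent terms are all assumptions of this sketch rather than anything the target site documents.

```php
<?php
// Sketch only: split a query string into tokens.
// Assumptions: "or" becomes '*', adjacent terms get an implicit '+' (AND),
// and unbalanced quotes are not validated.
function tokenize(string $query): array
{
    // Either a single parenthesis, or a run made of quoted phrases and
    // characters that are not whitespace, parentheses, or quotes.
    preg_match_all('/[()]|(?:"[^"]*"|[^\s()"])+/', $query, $matches);

    $tokens = [];
    $prev = null;
    foreach ($matches[0] as $raw) {
        $token = strcasecmp($raw, 'or') === 0 ? '*' : $raw;

        // A term or "(" directly after a term or ")" implies AND.
        $startsTerm = $token !== '*' && $token !== ')';
        $prevEndsTerm = $prev !== null && $prev !== '*' && $prev !== '(';
        if ($startsTerm && $prevEndsTerm) {
            $tokens[] = '+';
        }

        $tokens[] = $token;
        $prev = $token;
    }

    return $tokens;
}

$query = 'o:target o:creature (r:mythic or r:rare) t:"artifact creature"';
echo implode(' ', tokenize($query)), PHP_EOL;
```

The resulting token list is in infix order, so it still needs a Shunting-Yard pass before it can be evaluated or turned into a query.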

It might be a bit more complex to do it this way, but in the end you can theoretically nest statements infinitely, as well as incorporate other operators or functions.
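To show why nesting comes for free, here is a small sketch that folds an RPN token list into a nested array using a stack. The function name rpnToTree and the op/left/right node shape are hypothetical; any structure your query builder can walk would do.

```php
<?php
// Sketch only: fold RPN tokens into a nested tree.
// Assumptions: '+' is AND, '*' is OR; the node layout is made up here.
function rpnToTree(array $rpn)
{
    $stack = [];
    foreach ($rpn as $token) {
        if ($token === '+' || $token === '*') {
            if (count($stack) < 2) {
                throw new InvalidArgumentException('Operator is missing an operand');
            }
            // The most recently pushed item is the right-hand side.
            $right = array_pop($stack);
            $left = array_pop($stack);
            $stack[] = [
                'op' => $token === '+' ? 'and' : 'or',
                'left' => $left,
                'right' => $right,
            ];
        } else {
            $stack[] = $token; // plain search term
        }
    }

    if (count($stack) !== 1) {
        throw new InvalidArgumentException('Malformed expression');
    }

    return $stack[0];
}

print_r(rpnToTree(['o:target', 'o:creature', '+', 'r:mythic', 'r:rare', '*', '+']));
```

Each operator node can hold further operator nodes on either side, so deeper parentheses in the input simply produce a deeper tree, and a new operator only needs another branch in the loop.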

Upvotes: 1
