fnky

Reputation: 705

PHP - Custom query parser

I am trying to make a custom search query parser. The idea is that the user can write specific keywords to search by, e.g., artist, color, and style. For example, if the user searches for:

style:Emboss some keywords color:#333333 artist:"Tom Hank" steel

The returned result in the backend would be:

array(
    "style"  => "Emboss",
    0        => "some",
    1        => "keywords",
    "color"  => "#333333",
    "artist" => "Tom Hank", // Note the word is not broken
    2        => "steel"
)

So far I have managed to do the opposite, building a query string from an array, with no problem. However, I have a problem parsing a string into an array, mostly because of the quotes.

What I have so far is:

public function parseQuery($str) {
    $arr = array();

    $pairs = str_getcsv($str, ' '); // This bugs me

    foreach($pairs as $k => $v) {
        list($name, $value) = explode(":", $v, 2);

        if(!isset($value)) {
            $arr[] = $name;
        } else {
            $arr[$name] = $value;
        }
    }

    return $arr;
}

The problem lies in the str_getcsv function, which breaks quoted words apart if there is no space before the opening quote or after the closing one. It breaks the string down like this:

Array
(
    [0] => Some
    [1] => string
    [2] => with
    [3] => but:"some <--- This is the sinner
    [4] => string"
)

It works if there is a space between but: and "some string"; however, I do not want to require that.

My question is how this could be solved using few or no regular expressions.
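To make the behaviour above easy to reproduce, here is a minimal sketch. The helper name splitOnSpaces is made up for illustration; it only wraps str_getcsv with a space delimiter and passes the enclosure and escape arguments explicitly.

```php
<?php
// Hypothetical helper: split on spaces, treating " as the enclosure character.
// The escape argument is passed explicitly so nothing relies on its default.
function splitOnSpaces(string $str): array
{
    return str_getcsv($str, ' ', '"', '\\');
}

// Enclosure at the start of a field: the quoted phrase stays in one piece.
print_r(splitOnSpaces('Some string with "some string"'));

// Enclosure in the middle of a field: the quote is kept as literal text,
// so the phrase is split at the space inside it.
print_r(splitOnSpaces('Some string with but:"some string"'));
```

The first call should give four fields and the second five, which matches the broken output shown above.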

Upvotes: 1

Views: 248

Answers (1)

Timothy

Reputation: 4650

Try this. It's quick and dirty procedural code, but it does what you want. You'll have to refactor it to make it maintainable.

<?php
$str = 'style:Emboss some keywords color:#333333 artist:"Tom Hank" steel';

$pos = 0;
$buffer = '';
$len = strlen($str);
$quote = false;
$key = '';
$arr = array();

while ($pos < $len) {
    switch ($str[$pos]) {
        case '"':
            $quote = !$quote;
            break;
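        case ':':
            if ($quote || $key !== '') {
                // A colon inside quotes, or inside a value, is literal text
                $buffer .= $str[$pos];
            }
            else {
                // Everything read so far is the key
                $key = $buffer;
                $buffer = '';
            }
            break;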
        case ' ':
            if ($quote) {
                $buffer .= $str[$pos];
            }
            elseif (!empty($key)) {
                $arr[$key] = $buffer;
                $key = '';
                $buffer = '';
            }
            elseif ($buffer !== '') { // skip empty tokens from consecutive spaces
                $arr[] = $buffer;
                $buffer = '';
            }
            break;
        default:
            $buffer .= $str[$pos];
    }
    $pos++;
}
if (!empty($key)) {
    $arr[$key] = $buffer;
}
elseif ($buffer !== '') { // skip the empty token left by a trailing space
    $arr[] = $buffer;
}

print_r($arr);
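If it helps with the refactoring, here is one possible sketch of the same character-by-character approach wrapped in a function. The name parseQuery is borrowed from the question; the $flush closure and the exact handling of edge cases (a colon inside a value, repeated spaces) are my own assumptions, not something the question specifies.

```php
<?php
// Sketch only: tokenizes a search string into keyed and positional terms.
function parseQuery(string $str): array
{
    $arr = [];
    $key = null;     // current key, or null while reading a bare keyword
    $buffer = '';    // characters collected for the current token
    $quote = false;  // true while inside double quotes

    // Store the finished token (if any) and reset the state.
    $flush = function () use (&$arr, &$key, &$buffer) {
        if ($key !== null) {
            $arr[$key] = $buffer;
        } elseif ($buffer !== '') {
            $arr[] = $buffer;
        }
        $key = null;
        $buffer = '';
    };

    $len = strlen($str);
    for ($pos = 0; $pos < $len; $pos++) {
        $ch = $str[$pos];
        if ($ch === '"') {
            // Quotes toggle the state and are not part of the value
            $quote = !$quote;
        } elseif ($ch === ' ' && !$quote) {
            // An unquoted space ends the current token
            $flush();
        } elseif ($ch === ':' && !$quote && $key === null && $buffer !== '') {
            // The first unquoted colon separates key from value
            $key = $buffer;
            $buffer = '';
        } else {
            $buffer .= $ch;
        }
    }
    $flush();

    return $arr;
}

print_r(parseQuery('style:Emboss some keywords color:#333333 artist:"Tom Hank" steel'));
```

Only the first unquoted colon in a token is treated as the separator, so a value such as url:http://example.com keeps its own colon.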

Upvotes: 3
