kenyu73

Reputation: 681

Using MySQL REGEX anchors with wildcards inside

Say I have db content with the word "supercomputer". I want to search using the following:

select * from content where content REGEXP '[[:<:]]super*[[:>:]]';

I want the user doing the search to be able to add wildcards like ? and *. Searches should match whole words unless the user adds a wildcard as shown above, so a plain [ REGEXP 'super' ] isn't an option.

I'd think this would work, but I guess I'm still too new to regular expressions.

Upvotes: 0

Views: 758

Answers (2)

kenyu73

Reputation: 681

This ended up being my final PHP / MySQL solution for my search.

// Read the search term first; an empty term becomes "*" (match anything).
$search = (trim($get['SEARCH']) === "") ? "*" : trim($get['SEARCH']);

// Use the REGEXP branch when the term contains either wildcard (* or ?).
$bCustom = (strpos($search, "*") !== false || strpos($search, "?") !== false);

$sql = "SELECT content.*, users.user, users.lastname, users.firstname FROM content INNER JOIN users ON content.created_by=users.id ";

// NOTE: $search is placed into the SQL unescaped; escape it or bind it as a parameter before running this.
if ($bCustom) {
    // Translate the user's wildcards into regex: * becomes .* and ? becomes .
    $search = str_replace("*", ".*", $search);
    $search = str_replace("?", ".", $search);
    $sql .= "WHERE content.name REGEXP '[[:<:]]" . $search . "[[:>:]]' OR content.content REGEXP '[[:<:]]"
            . $search . "[[:>:]]' ORDER BY content.name ASC; ";
} else {
    // No wildcards: fall back to a full-text search.
    $sql .= "WHERE MATCH (name, content) AGAINST ('" . $search . "')";
}
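The query above places the search term directly into the SQL string. As a rough sketch of the same two branches with bound parameters instead, something like the following might work. The function name buildSearchQuery and the placeholder names are hypothetical, and the translation still uses the same .* and . substitutions, so other regex metacharacters typed by the user are passed through unchanged.

```php
<?php
// Hypothetical helper: returns the SQL text plus the values to bind.
// It does not touch the database, so it can be inspected on its own.
function buildSearchQuery(string $term): array
{
    $term = trim($term);
    if ($term === "") {
        $term = "*"; // same default as the code above: match anything
    }

    $sql = "SELECT content.*, users.user, users.lastname, users.firstname "
         . "FROM content INNER JOIN users ON content.created_by=users.id ";

    $hasWildcard = strpos($term, "*") !== false || strpos($term, "?") !== false;

    if ($hasWildcard) {
        // * becomes .* and ? becomes . (replacements are applied in array order)
        $pattern = "[[:<:]]" . str_replace(["*", "?"], [".*", "."], $term) . "[[:>:]]";
        $sql .= "WHERE content.name REGEXP :p1 OR content.content REGEXP :p2 "
              . "ORDER BY content.name ASC";
        // Two placeholder names, since one name cannot always be reused.
        return [$sql, [":p1" => $pattern, ":p2" => $pattern]];
    }

    // No wildcards: full-text search with the term bound as a parameter.
    $sql .= "WHERE MATCH (name, content) AGAINST (:q)";
    return [$sql, [":q" => $term]];
}

// Usage with PDO, assuming $pdo is an existing connection:
//   [$sql, $params] = buildSearchQuery($get['SEARCH']);
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
```

Binding the pattern keeps quotes in the user's input from ending the SQL string early, which the string-concatenated version cannot guarantee.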

Upvotes: 0

Godwin

Reputation: 9937

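The * character means something different in regex than it does as a search wildcard. It is a quantifier: the element just before it may repeat zero or more times. It does not stand for an arbitrary run of text. In 'super*' the * applies only to the final 'r', so the pattern means "supe" followed by zero or more "r" characters.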
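To see the difference outside the database, here is a small sketch using PHP's preg_match. PCRE is only a stand-in for MySQL's regex engine here, with \b playing the role of the [[:<:]] and [[:>:]] word boundaries; the variable names are made up for illustration.

```php
<?php
// "supe" followed by zero or more "r", then a word boundary.
$literalStar = '/\bsuper*\b/';
// "super" followed by zero or more word characters, then a word boundary.
$wordStar = '/\bsuper\w*\b/';

foreach (['supe', 'super', 'superrr', 'supercomputer'] as $word) {
    // preg_match returns 1 on a match and 0 otherwise.
    printf(
        "%-14s literal=%d word=%d\n",
        $word,
        preg_match($literalStar, $word),
        preg_match($wordStar, $word)
    );
}
```

The first pattern matches "supe", "super" and "superrr" but not "supercomputer", because after "super" the next character is still part of the word and no boundary is found. The second pattern matches "supercomputer".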

You're going to have to do some preprocessing if you want the user to be able to use wildcards like this, but you can simply replace any * with an expression for zero or more word characters (no spaces or punctuation). In engines that support it, that is \w*. The regex library behind the [[:<:]] syntax may not recognize \w, so the POSIX form [[:alnum:]_]* is the safer spelling here. The complete expression would look like:

select * from content where content REGEXP '[[:<:]]super[[:alnum:]_]*[[:>:]]';

This says you want any sequence that begins with 'super' and contains only word characters and is surrounded by word boundaries.
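The preprocessing step could be sketched in PHP along these lines. The function name wildcardToMysqlRegexp is hypothetical, and the sketch makes two assumptions not stated in the answer: it spells "word character" as the POSIX class [[:alnum:]_], and it drops every input character that is not an ASCII letter, digit, underscore, * or ?, so stray regex metacharacters never reach the pattern (at the cost of also dropping spaces and non-ASCII letters).

```php
<?php
// Hypothetical helper: turns a user term such as "super*" or "c?t" into a
// whole-word pattern for MySQL's REGEXP operator.
function wildcardToMysqlRegexp(string $term): string
{
    // Assumption: keep only ASCII word characters and the two wildcards.
    $clean = preg_replace('/[^A-Za-z0-9_*?]/', '', $term);

    // * means zero or more word characters, ? means exactly one.
    $pattern = str_replace(
        ['*', '?'],
        ['[[:alnum:]_]*', '[[:alnum:]_]'],
        $clean
    );

    // Anchor the pattern at word boundaries on both sides.
    return '[[:<:]]' . $pattern . '[[:>:]]';
}

echo wildcardToMysqlRegexp('super*'), "\n";
```

A term with no wildcard, such as "super", comes out as [[:<:]]super[[:>:]], which keeps the whole-word behaviour the question asks for. The [[:<:]] and [[:>:]] markers belong to the older MySQL regex library; MySQL 8.0.4 and later use ICU, where \b is the word-boundary marker instead.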

Upvotes: 2
