mare96
mare96

Reputation: 3869

Use SQL REGEXP (mariadb) for filter in where with special characters in Laravel

mariadb version 10.4

$query->where('column', 'REGEXP', '[[:<:]]'.$string.'[[:>:]]');

This query provides a search with whole words only and that's working fine without special characters.

Example

Searching row:
foo (bar baz)
Filter:
$string = "(bar";
$query->where('column', 'REGEXP', '[[:<:]]'.$string.'[[:>:]]');
Error:
SQLSTATE[42000]: Syntax error or access violation: 1139 Got error 'missing )

Now I tried with addcslashes and preg_quote. I succeed to escape ( character with preg_quote but the filter won't work for that string because the filter works with whole words.

Any suggestion for better filtering by the whole word will be also appreciated.

One more example

Searching row:
foo(bar baz
Filter:
$filter = 'foo(bar';
$query->where('column', 'REGEXP', '[[:<:]]'.$string.'[[:>:]]');

If ( will be escaped the search will not work

Upvotes: 7

Views: 1112

Answers (3)

LSerni
LSerni

Reputation: 57418

As you can see, the ICU regex library that you're using considers brackets to be special, and requires double slash escaping. Using add_slashes() is not useful because it would add slashes where required by other libraries that aren't ICU's, and it might give you a string that is not appropriate for it (e.g. it wouldn't escape round brackets).

You might have better luck with escapeshellcmd() due to the fact that the round and square brackets are, by chance, both shell metacharacters, but so are the dollar sign and the hash sign and this might come and bite you in the back later.

The simpler solution would be to manually replace the four characters you need (actually three, ( ) [, but for completeness' sake...).

$string = preg_replace(
    '#([(\\[\\])])#', 
    '\\\\\\\\\1', // Hilarious, isn't it?
    $string
);

$query->where('column', 'REGEXP', '[[:<:]]'.$string.'[[:>:]]');

Upvotes: 0

Rick James
Rick James

Reputation: 142472

There are a lot of punctuation marks that have special meaning in REGEXP strings. ( is just one of many.

If the goal is the search for the 4 characters (bar with a trailing "word boundary", then one of these may work -- depending on the version of MySQL/MariaDB and what escaping is needed to get from your app to the server:

\(bar[[:>:]]
\\(bar[[:>:]]
\\\\(bar[[:>:]]
[(]bar[[:>:]]

\(bar\b
\\(bar\\b
\\\\(bar\\\\b
[(]bar\b
[(]bar\\b
[(]bar\\\\b

In any case your app must do some escaping. Given that, it may be better to remove punctuation:

[[:<:]]bar[[:>:]]
\bbar\b
\\bbar\\b
\\\\bbar\\\\b

Consider also using LIKE "(bar"

Consider also using a FULLTEXT index

MATCH(col) AGAINST("bar" IN BOOLEAN MODE)

For maximal performance, this may be best (with a FULLTEXT index):

MATCH(col) AGAINST("bar" IN BOOLEAN MODE)
AND col LIKE "(bar"

Upvotes: 1

ybenhssaien
ybenhssaien

Reputation: 4115

Try this : $string = '\(bar';

You can always check your regex pattern here : https://regex101.com/

Upvotes: 0

Related Questions