Ivan Topić
Ivan Topić

Reputation: 3185

How to search content but skip what is inside square brackets?

I need to find all posts that have double quotation mark " in post_content table

SELECT * FROM `pl_posts` WHERE `post_content` LIKE '%"%'

This is a WordPress website so I get few thousands rows because each post contains shortcodes, most frequently this one:

[gallery columns="4" ids="22502,22504,22503"]

How can I search for everything in a post_content table but so that it skips what is inside square brackets [...]?

Upvotes: 0

Views: 69

Answers (1)

GMB
GMB

Reputation: 222652

Several solutions are possible for your WHERE clause, depending on your data.

Option 1 : just skip records whose content contains square brackets

post_content LIKE '%"%' AND post_content NOT LIKE '%[%'

Option 2 : if you have mixed fields content with double quotes oustide AND inside square brackets, a solution is to remove anything inside square brackets before seaching for the double quote

REGEXP_REPLACE(txt, '\\[[^\\]]*\\]', '') LIKE '%"%'

The regexp means : an opening bracket, followed by O to N characters different than a closing bracket, and then a closing bracket.

Option 3 : functionnaly identical to option 2, but using REGEXP_REPLACE with a bit more complicated regex :

REGEXP_LIKE(txt, '(^|\\])[^\\[\\]]*"')

For older versions of MySQL (< 8.0.4), this can be written :

txt REGEXP '(^|[[.right-square-bracket.]])[^[.left-square-bracket.][.right-square-bracket.]]*"'

db<>fiddle here

 WITH tests as (
     SELECT '[columns="4"]' txt
     UNION SELECT 'abc"db'
     UNION SELECT 'abc"db[columns="4"]'
     UNION SELECT '[columns="4"]abc"db'
     UNION SELECT '[columns="4"]abc"db[columns="4"]'
 )
 SELECT 
     txt, 
     txt NOT LIKE '%[%', 
     REGEXP_REPLACE(txt, '\\[[^\\]]*\\]', '') LIKE '%"%',
     REGEXP_LIKE(txt, '(^|\\])[^\\[\\]]*"')
 FROM tests

 txt                              | txt NOT LIKE '%[%' | REGEXP_REPLACE(txt, '\[[^\]]\]', '') LIKE '%"%'    | REGEXP_LIKE(txt, '(^|\])[^\[\]]"')
 :------------------------------- | -----------------: | --------------------------------------------------: | -------------------------------------:
 [columns="4"]                    |                  0 |                                                   0 |                                      0
 abc"db                           |                  1 |                                                   1 |                                      1
 abc"db[columns="4"]              |                  0 |                                                   1 |                                      1
 [columns="4"]abc"db              |                  0 |                                                   1 |                                      1
 [columns="4"]abc"db[columns="4"] |                  0 |                                                   1 |                                      1
 

Upvotes: 3

Related Questions