Reputation: 1445

Should I perform regex filtering in MySQL or PHP?

I have a MySQL column that holds various string data; it is a VARCHAR field.

The table has more than 100k records, and I'd like to filter a query on this field to SELECT only the records in which the field starts with any character other than the digits 1 through 9.

Is it faster to:

write a REGEXP in the SQL query, or
select all records and filter them in PHP with a PHP regex?

Upvotes: 0

Answers (2)

Rick James

Reputation: 142453

I reopened because the dup (Select Query | Select Entires That Don't Start With A Number - MySQL) had the inverse condition -- "rows not starting with ..."

There is a significant optimization for rows starting with some consecutive set of characters, such as 1..9:

INDEX(col)

SELECT ... WHERE col >= '1'
             AND col  < CHAR(ORD('9') + 1)

This would scan only the rows starting with 1..9 rather than the entire table. The PHP approach would have to read the whole table, and so would all three answers on the other Question.
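For the condition asked about here (rows that do not start with 1..9), the same range idea can be turned around. This is only a sketch: the table name `tbl`, the `id` column, and the index name `idx_col` are made up for illustration, and it assumes a collation in which ':' sorts immediately after '9' (true for binary or ASCII-ordered collations, but not for every collation).

```sql
-- Hypothetical names (tbl, id, idx_col), for illustration only.
ALTER TABLE tbl ADD INDEX idx_col (col);

-- Rows whose first character is NOT 1..9:
-- everything that sorts below '1', plus everything from ':'
-- (the ASCII character right after '9') upward.
-- Assumes a collation where ':' sorts directly after '9'.
SELECT id, col
FROM tbl
WHERE col < '1'
   OR col >= ':';
```

Whether the optimizer actually uses the index for these two ranges depends on how many rows they cover. If most rows qualify, it may well choose a full scan anyway, so it is worth checking with EXPLAIN.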

A second reason this is not a dup of that other one -- the Question here is more about PHP vs MySQL. The main performance argument for doing it in MySQL is to save the transmission time.

If you need a fancier REGEXP, you could switch to MariaDB, which has (I think) the same regexp engine as PHP. If you need something too complex for SQL, even with a better regexp, then you may be forced to go to PHP. But even in that case, filter as much as you can in SQL -- to minimize the amount of data being shoveled over the 'wire'.

Upvotes: 0

SrThompson

Reputation: 5748

The SQL query will be faster, hands down. This sort of thing is precisely what SQL is meant to be used for.
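As a rough sketch of what that looks like for this question (the table name `tbl` and the `id` column are placeholders, not taken from the question):

```sql
-- Hypothetical table and column names, for illustration only.
-- Keep only rows whose first character is not one of the digits 1-9.
-- Rows where col IS NULL are not returned, because NULL NOT REGEXP ... is NULL.
SELECT id, col
FROM tbl
WHERE col NOT REGEXP '^[1-9]';
```

A REGEXP predicate like this generally cannot use an index on `col`, so the server still reads every row, but only the matching rows are sent to PHP.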

To clarify for future reference: when you need the DB to return a specific data set, you should let the DB deal with constructing the dataset by using a SQL query. Your application code can then have one or more abstractions that represent and handle the resulting dataset for your business use case, but it should not do the DB engine's job.

TL;DR: building a dataset from DB tables is a data access layer concern; handling abstractions related to business entities is an application layer concern.

Upvotes: 2
