jskidd3
jskidd3

Reputation: 4783

Optimize MySQL query for group_concat function

SELECT SQL_NO_CACHE link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude
FROM service
JOIN pattern ON pattern.service = service.code
JOIN link ON link.section = pattern.section
JOIN naptan.stop ON stop.atco_code = link.stop
JOIN naptan.locality ON locality.code = stop.nptg_locality_ref
GROUP BY link.stop

The above query takes roughly 800ms - 1000ms to run.

If I append a group_concat statement the query then takes 8 - 10 seconds:

SELECT SQL_NO_CACHE link.stop, link.stop, stop.common_name, locality.name, stop.bearing, stop.latitude, stop.longitude, group_concat(service.line) lines

How can I change this query so that it runs in less than 2 seconds with the group_concat statement?

SQL Fiddle: http://sqlfiddle.com/#!9/414fe

EXPLAIN statements for both queries: https://i.sstatic.net/lN176.png

Upvotes: 1

Views: 1087

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269873

How long does this query take?

SELECT p.section, GROUP_CONCAT(s.line)
FROM pattern p join
     service s
     ON p.service = s.code
GROUP BY p.section

I am thinking that you can do the group_concat() in a subquery, so the outer query does not need an aggregation. This can speed queries when there is one table in the subquery. In your case, there are two.

The final results would be something like:

link.section = pattern.section

SELECT SQL_NO_CACHE . . .,
       (SELECT GROUP_CONCAT(s.line)
        FROM pattern p join
             service s
             ON p.service = s.code
        WHERE p.section = link.section
       ) as lines
FROM link JOIN
     naptan.stop
     ON stop.atco_code = link.stop JOIN
     naptan.locality
     ON locality.code = stop.nptg_locality_ref;

For this query, you want the following additional indexes: pattern(section, service) and service(code, line).

I don't know if this will work, but it is worth a try.

Note: this is assuming that you really don't need the group by for the rest of the columns.

Upvotes: 3

O. Jones
O. Jones

Reputation: 108676

A remark: You're using the nonstandard MySQL extension to GROUP BY. It happens to work for you because link.stop is joined to stop.atco_code, which itself is a primary key. But you need to be very careful with this.

I suggest you add some compound indexes. You join in to pattern on service and join out based on section. So add this index.

ALTER TABLE pattern ADD INDEX service_section (service, section, line);

This will let the query use just the index, and not have to hit the table itself to retrieve the information needed for the JOIN or your GROUP_CONCAT() operation. (You might also delete the index on just service, this new index makes it redundant).

Similarly, you want to create an index (section, stop) on the link table, and get rid of the index on just section.

On stop, you're using most of the columns, and you already have an index (PK) on atco_code, so let this one be.

Finally, on locality put an index on (code,name).

All this indexing monkey business should cut down the amount of work MySQL must do to satisfy your query.

Now look, as soon as you add WHERE anything = anything to the query, you may need to add a column to one or more of these indexes. You definitely should read up on multi-column indexing and grouping; good indexing is a critical success factor for your kind of data.

You should also run ANALYZE TABLE xxxx on each of your tables after inserting lots of rows, to make sure the query optimizer can see appropriate information about the content of the table and indexes.

Upvotes: 1

Related Questions