Eric
Eric

Reputation: 41

How to make a GROUP BY SUBSTRING query faster?

I have a table with the following structure:

id bigNumber           text
1  1200321030011010233 "an item with some text"
2  1200321030011014563 "another item with some more text"
3  3120323434432211133 "more...."
.
.
.

The table contains approximately 50,000 records. I want to do the following query but it is slow:

  SELECT COUNT(*), 
         bigNumber 
    FROM items 
GROUP BY substr(bigNumber, 1, X)

where X varies from 2 and 19.

Would it be quicker to create 19 columns with parts of the number in each column to be able to use an index?

Upvotes: 4

Views: 18089

Answers (3)

longneck
longneck

Reputation: 12226

I would suggest instead of 19 separate columns, perhaps just 3 or 4, like so:

alter table items
add column bignumber_l1 char(1)
, add column bignumber_l3 varchar(3)
, add column bignumber_l6 varchar(6);

update items
set bignumber_l1 = left(bignumber, 1)
 , bignumber_l3 = left(bignumber, 3)
 , bignumber_l6 = left(bignumber, 6);

alter table items
add index bignumber_l1
, add index bignumber_l3
, add index bignumber_l6;

then when you are querying for strings of x length, write the query with the longest match without going longer:

  SELECT COUNT(*), 
         bigNumber 
    FROM items 
GROUP BY bignumber_l3, substr(bigNumber, 1, 4)

this query can use an index and might improve your performance significantly. note that since you're asking for the whole table, mysql may decide you need a table scan even with the indexes above so you may need to use FORCE INDEX.

Upvotes: 2

Keith Randall
Keith Randall

Reputation: 23265

You can use an index without adding any columns, just create an index on your bigNumber column:

create index bignum_index on items (bigNumber);

The index on a varchar (bigNumber is a varchar, right?) can be used to look up any string prefix.

That said, you'll need to do a full table scan to answer your query, so an index won't help you all that much.

Upvotes: 0

Kevin Peno
Kevin Peno

Reputation: 9206

I think the result you are looking for is LIKE _X%. This will not use the index though.

SELECT count(*) FROM items WHERE bignumber LIKE "_2%"
SELECT count(*) FROM items WHERE bignumber LIKE "_19%"

_ signifies one character

% signifies any number of characters

See MySQL docs for more information.

Upvotes: -1

Related Questions