Rutger83
Rutger83

Reputation: 13

Speed up SQLite query, can I do it without a union?

Hi everybody of the stackoverflow community! I've been visiting this site for years and here comes my first post

Lets say I have a database with three tables:

Example: The candy factory.
In the candy factory 10 types of candy bags are produced out of 80 different candies.

So: There are 10 unique group types(bags) with 3 different sizes: (4,5,6); a group is combination out of 80 unique candies.

Out of this I make a database, (with some rules about which candy combinations gets into a group).

At this point I have a database with 40791 unique candy bags.

Now I want to compare a collection of candies with all the candy bags in the DB, as a result I want the bags out of the DB which are missing 3 or less candies with the compare collection.

-- restore candy status
update candies set selected = 0, blacklisted = 0;

-- set status for candies to be selected
update candies set selected = 1 where name in ('candy01','candy02','candy03','candy04');

select  groupId, GroupType, max, count(*) as remainingNum, group_concat(name,', ') as remaining
from groups natural join members natural join candies
where not selected 
group by groupid having  count(*) <= 3

UNION -- Union with groups which dont have any remaining candies and have a 100% match

select groupid, GroupType, max,  0 as remainingNum, "" as remaining
from groups natural join members natural join candies
where selected 
group by groupid having count(*) =groups.size;

The above query does this. But the thing I am trying to accomplish is to do this without the union, because speed is of the essence. And also I am new to sql and are very eager to learn/see new methods.

Greetings, Rutger

Upvotes: 1

Views: 1349

Answers (2)

Seph
Seph

Reputation: 8703

Don't update the database when you're doing a select, your first update update candies set selected = 0, blacklisted = 0; will apply to the entire table, and rewrite every record. You should try without using selected and also changing your union to UNION ALL. Further to this, you try inner join instead of natural join (but I don't know your schema for candy to members)

select  groupId, GroupType, max, count(*) as remainingNum, group_concat(name,', ') as remaining
from groups 
    inner join members on members.groupid = groups.groupid
    inner join candies on candies.candyid = member.candyid
where name NOT in ('candy01','candy02','candy03','candy04')
group by groups.groupid
having  count(*) <= 3

UNION ALL -- Union with groups which dont have any remaining candies and have a 100% match

select groupid, GroupType, max,  0 as remainingNum, "" as remaining
from groups 
    inner join members on members.groupid = groups.groupid
    inner join candies on candies.candyid = member.candyid
where name in ('candy01','candy02','candy03','candy04')
group by groupid
having count(*) = groups.size;

This should at least perform better than updating all records in the table before querying it.

Upvotes: 0

MatBailie
MatBailie

Reputation: 86765

I'm not 100% sure about what you are accomplishing through these queries, so I haven't looked at a fundamentally different approach. If you can include example data to demonstrate your logic, I can have a look at that. But, in terms of simply combining your two queries, I can do that. There is a note of caution first, however...

SQL is compiled in to query plans. If the query plan for each query is significantly different from the other, combining them into a single query may be a bad idea. What you may end up with is a single plan that works for both cases, but is not very efficient for either. One poor plan can be a lot worse than two good plans => Shorter, more compact, code does not always give faster code.


You can put selected in to your GROUP BY instead of your WHERE clause; the fact that you have two UNIONed queries shows that you are treating them as two separate groups already.

Then, the only difference between your queries is the filter on count(*), which you can accommodate with a CASE WHEN statement...

SELECT
  groups.groupID,
  groups.GroupType,
  groups.max,
  CASE WHEN Candies.Selected = 0 THEN count(*)  ELSE 0 END as remainingNum,
  CASE WHEN Candies.Selected = 0 THEN group_concat(candies.name,', ') ELSE '' END as remaining
FROM
  groups
INNER JOIN
  members
    ON members.GroupID = groups.GroupID
INNER JOIN
  candies
    ON Candies.CandyID = members.CandyID
GROUP BY
  Groups.GroupID,
  Groups.GroupType,
  Groups.max,
  Candies.Selected
HAVING
  CASE
    WHEN Candies.Selected = 0 AND COUNT(*) <= 3           THEN 1
    WHEN Candies.Selected = 1 AND COUNT(*)  = Groups.Size THEN 1
                                                          ELSE 0
  END
  =
  1

The layout changes are simply because I disagree with using NATURAL JOIN for maintenance reasons. They are a short-cut in initial build and a potential disaster in later development. But that's a different issue, you can read about it on line if you feel you want to.

Upvotes: 4

Related Questions