DHHJ
DHHJ

Reputation: 153

Getting count of rows with multiple combinations

I need help with a MySQL query. We have a database (~10K rows) which I have simplified down to this problem.

We have 7 truck drivers who visit 3 out of a possible 9 locations, daily. Each day they visit exactly 3 different locations and each day they can visit different locations than the previous day. Here are representative tables:

Table: Drivers

id name
10 Abe
11 Bob
12 Cal
13 Deb
14 Eve
15 Fab
16 Guy

Table: Locations

id day address driver.id
1   1    Oak   10
2   1    Elm   10
3   1    4th   10
4   1    Oak   16
5   1    4th   16
6   1    Toy   16
7   1    Toy   11
8   1    5th   11
9   1    Law   11
10   2    Oak   11
11   2    4th   11
12   2    Toy   11   
.........

We have data for a full year and we need to find out how many times each "route" is visited over a year, sorted from most to least.

From my high school math, I believe there are 9!/(6!3!) route combinations, or 84 in total. I want do something like:

Get count of routes where route addresses = 'Oak' and 'Elm' and '4th'

then run again where route addresses = 'Oak' and 'Elm' and '5th'

then again and again, etc.Then sort the route counts, descending. But I don't want to do it 84 times. Is there a way to do this?

Upvotes: 0

Views: 32

Answers (1)

spencer7593
spencer7593

Reputation: 108400

I'd be looking at GROUP_CONCAT

 SELECT t.day 
      , t.driver
      , GROUP_CONCAT(t.address ORDER BY t.address)
  FROM mytable t
 GROUP
    BY t.day
     , t.driver

What's not clear here, if there's an order to the stops on the route. Does the sequence make a difference, and how to we tell what the sequence is? To ask that a different way, consider these two routes:

('Oak','Elm','4th') and ('Elm','4th','Oak')

Are these equivalent (because it's the same set of stops) or are they different (because they are in a different sequence)?

If sequence of stops on the route distinguishes it from other routes with the same stops (in a different order), then replace the ORDER BY t.address with ORDER BY t.id or whatever expression gives the sequence of the stops.

Some caveats with GROUP_CONCAT: the maximum length is limited by the setting of group_concat_max_len and max_allowed_packet variables. Also, the comma used as the separator... if we combine strings that contain commas, then in our result, we can't reliably distinguish between 'a,b'+'c' and 'a'+'b,c'


We can use that query as an inline view, and get a count of the the number of rows with identical routes:

 SELECT c.route
      , COUNT(*) AS cnt
   FROM ( SELECT t.day 
               , t.driver
               , GROUP_CONCAT(t.address ORDER BY t.address) AS route
            FROM mytable t
           GROUP
              BY t.day
               , t.driver
        ) c
  GROUP
     BY c.route
  ORDER
     BY cnt DESC

Upvotes: 1

Related Questions