Andy
Andy

Reputation: 5395

Displaying MySQL rows on a JOIN query even when records do not exist in another table

I have an PHP/MySQL application which is connected to a database featuring 2 tables called 'displays' and 'display_substances'. The structure of these tables is as follows:

mysql> DESCRIBE displays;
+----------+----------------------+------+-----+---------+----------------+
| Field    | Type                 | Null | Key | Default | Extra          |
+----------+----------------------+------+-----+---------+----------------+
| id       | smallint(5) unsigned | NO   | PRI | NULL    | auto_increment |
| label    | varchar(255)         | NO   |     | NULL    |                |
+----------+----------------------+------+-----+---------+----------------+

mysql> DESCRIBE display_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field        | Type                  | Null | Key | Default | Extra          |
+--------------+-----------------------+------+-----+---------+----------------+
| id           | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
| display_id   | smallint(5) unsigned  | NO   | MUL | NULL    |                |
| substance_id | mediumint(8) unsigned | NO   | MUL | NULL    |                |
| value        | text                  | NO   |     | NULL    |                |
+--------------+-----------------------+------+-----+---------+----------------+

There is also a 'substances' table and the foreign key display_substances.substance_id is associated with substances.id.

The 'displays' table contains exactly 400 rows and the 'display_substances' approx 1.5 million rows.

What I'm trying to do is output all 400 'displays.label' into a HTML table, and then in a second column show the 'display_substances.value'. Since the page only shows 1 substance at a time it also needs to be based on the 'display_substances.substance_id'.

The issue I'm having is that records only exist in 'display_substances' when we have data available for the appropriate 'displays'. However, the output has to show all records from 'displays' and then put the text "Not Listed" next to anything where there is no corresponding record in 'display_substances'.

I've done the following - which gives me the output I want - but is flawed (see "The Problem" section below).

  1. Select all of the records in the "displays" table: SELECT label FROM displays ORDER BY label ASC

  2. Select all display_substances.display_id for the substance currently being shown: SELECT display_id FROM display_substances WHERE substance_id = 1 (assuming 1 is the current substance ID). Store the ID's in an array called $display_substances

  3. Loop through (1) and use in_array() to see if (2) exists:

    foreach ($displays as $display) { // step (1)
        $display_substances = // See step (2)
        if (in_array($display['id'], $display_substances)) { // step (3)
            $display_value = // See step (4)
        } else {
            $display_value = 'Not listed';
        }
        $output[] = ['display_label' => $display['label'], 'display_value' => $display_value]; // See step (5)
    }
    
  4. If the in_array() condition is true then I make a query to select the corresponding row of "display_substances": SELECT value FROM display_substances WHERE display_id = $display['id'] AND substance_id = 1

  5. The $output variable buffers all the data and then it gets output into a HTML table later. The output I get is exactly as I want.

The problem

Although the output is correct I want to do this all as 1 query (if possible) because I need to add features to search by either displays.label or display_substances.value - or a combination of both. The first part of this is fairly trivial because I can amend the query in (1) to:

SELECT label FROM displays WHERE label LIKE '% Foo %' ORDER BY label ASC

However, this won't make display_substances.value searchable because by the time we get to step (3) we're dealing with a single row of display_substances not the whole table. I can't see how to write it differently though since we need to know which records exist in that table for the loaded substance.

I have also written the following query - but this will not work because it misses anything that's "Not Listed":

SELECT displays.label, display_substances.`value` FROM displays JOIN display_substances ON displays.id = display_substances.display_id WHERE display_substances.substance_id = 1

I have read How do I get the records from the left side of a join that do not exist in the joined table? but that didn't help.

For further clarification

Let's say there are 120 rows in display_substances that correspond to substance ID 1 (WHERE display_substances.substance_id = 1). The output of the query should always have 400 rows. In this example, 120 should have display_substances.value next to them, and 280 should have the text "Not Listed".

Upvotes: 1

Views: 1565

Answers (4)

gview
gview

Reputation: 15361

I've realized that Ramy has about 98% of the solution.

FWIW this problem is just a variation on one that occurs all the time.

You will find other answers on SO when you search for 'left outer join with where clause' -- that address the problem. One example is this question.

Ultimately, you have a many to many resolution table (display_substances) that resolves the many to many relationship between substances and displays. You are just looking for an outer join from one of the 2 parent tables, but also requiring that you filter the results by a specific substance.

SELECT
    d.id AS display_id,
    d.label AS display_label,
    IFNULL(ds.value, 'Not Listed') AS substance_value
FROM displays AS d
LEFT JOIN display_substances AS ds ON (ds.display_id = d.id AND ds.substance_id = 1); 

This query does not generate a value of 'not listed' but it does generate NULL columns for those display rows where there is no corresponding display_substance value. You could embelish it with the IF_NULL() function demonstrated by ahmad, but as you are using PHP to go through the result set, you can just as easily handle that in the procedural loop you'll use to fetch the results.

Upvotes: 1

Ramy Nasr
Ramy Nasr

Reputation: 2527

My understanding is that you want the following:

  • Your page limits the results by substance ID
  • You want one substance per row
  • If there are displays with no substances, they should still show in the page with "Not Listed" as the substance value

I believe this should work for you:

SELECT
    d.id AS display_id,
    d.label AS display_label,
    IFNULL(ds.value, 'Not Listed') AS substance_value
FROM displays AS d
LEFT JOIN display_substances AS ds ON (ds.display_id = d.id)
WHERE ds.substance_id = 1 OR ds.substance_id IS NULL;

Upvotes: 2

Andy
Andy

Reputation: 5395

I'm posting what I've used as a solution. I don't think it's possible to do this in MySQL as one query. Several people answered but none of the answers worked.

For even further clarification - although obvious from the question - the output in the application is a table with 2 columns. The first of these columns should include all 400 rows from displays:

displays.label    |  display_substances.value
------------------|--------------------------
Display 1         
------------------|--------------------------
...
------------------|--------------------------
Display 400   
------------------|--------------------------

This is fairly trivial since at this point it's just SELECT * FROM displays.

The challenge begins when we want to populate the second column of the table with display_substances.value. The data for a given substance (assume substance ID is 1) might look like this:

id   |   display_id   |   substance_id  |  value
-----|----------------|-----------------|-------------
206  |      1         |        1            Foo
-----|----------------|-----------------|-------------
361  |      3         |        1            Bar
-----|----------------|-----------------|-------------
555  |      5         |        1            Baz
-----|----------------|-----------------|-------------

The problem: In this case we only have 3 records for substance ID 1. So if we do a JOIN query, it will only return 3 rows. But the table we are displaying in the application needs to show all 400 rows from displays and put the text "Not listed" on any row where there is no corresponding row in display_substances. So in the example above, when it encounters display_id 2, 4, 6...400 it should say "Not listed" (because we only have data for three display_substances.display_id [1,3,5] for substance ID 1).

Both columns also need to be searchable.

My solution

I don't think it's possible to do this in MySQL so I resorted to using PHP. The logic is as now follows:

  1. If the user is doing a search on column 2 (display_substances.value): SELECT display_id FROM display_substances WHERE value LIKE '% Search term column 2 %'. Store this as an array, $ds.

  2. Select all 400 records from displays. If the user is performing a search on column 1 (displays.label) then that must form part of the query: SELECT * FROM displays WHERE label LIKE '% Search term column 1 %'. Critically - if the $ds array from step (1) is not empty then the following must become part of query: WHERE displays.id IN (2, 4, 6...400). Store this as an array, $displays

  3. Get all of the display_id's associated with the substance being viewed: SELECT display_id FROM display_substances WHERE substance_id = 1

  4. Do a loop as per point (3) of the original question.

The result is that the page loads in <2 seconds, each column is searchable.

The SQL queries given in answers took - at best - around 15-20 seconds to execute and never gave all 400 rows.

If anyone can improve on this or has a pure SQL solution please post.

Upvotes: 0

ahmad
ahmad

Reputation: 2729

You need a left join & a group_concat to get all records on the left table along with group by.

But keep in mind that group_concat has a limit so you might not get all associated records, as it's usually used for small fields but since you have a 'text' field for your value there's a high probability you'd hit the limit

Anyway here's the query

SELECT d.*, GROUP_CONCAT(ds.value) `substances`
FROM displays `d`
LEFT JOIN display_substances `ds` ON `d`.`id` = `ds`.`display_id`
GROUP BY `d`.`id`

Something like this might work then if I understand correctly

SELECT d.*, IFNULL((SELECT GROUP_CONCAT(value) FROM display_substances `ds` WHERE `ds`.`display_id` = `d`.`id` GROUP BY `ds`.`display_id`), 'Not Listed') `substances`
FROM displays `d`

You can update the where & add AND substance_id = 1

Upvotes: 4

Related Questions