David
David

Reputation: 3822

SELECTing Related Rows Based on a Single Row Match

I have the following table running on Postgres SQL 9.5:

+---+------------+-------------+
|ID |  trans_id  |   message   |
+---+------------+-------------+
| 1 | 1234567    | abc123-ef   |
| 2 | 1234567    | def234-gh   |
| 3 | 1234567    | ghi567-ij   |
| 4 | 8902345    | ced123-ef   |
| 5 | 8902345    | def234-bz   |
| 6 | 8902345    | ghi567-ij   |
| 7 | 6789012    | abc123-ab   |
| 8 | 6789012    | def234-cd   |
| 9 | 6789012    | ghi567-ef   |
|10 | 4567890    | abc123-ab   |
|11 | 4567890    | gex890-aj   |
|12 | 4567890    | ghi567-ef   |
+---+------------+-------------+

I am looking for the rows for each trans_id based on a LIKE query, like this:

SELECT * FROM table 
WHERE message LIKE '%def-234%'

This, of course, returns just three rows, the three that match my pattern in the message column. What I am looking for, instead, is all the rows matching that trans_id in groups of messages that match. That is, if a single row matches the pattern, get all the rows with the trans_id of that matching row.

That is, the results would be:

+---+------------+-------------+
|ID |  trans_id  |   message   |
+---+------------+-------------+
| 1 | 1234567    | abc123-ef   |
| 2 | 1234567    | def234-gh   |
| 3 | 1234567    | ghi567-ij   |
| 4 | 8902345    | ced123-ef   |
| 5 | 8902345    | def234-bz   |
| 6 | 8902345    | ghi567-ij   |
| 7 | 6789012    | abc123-ab   |
| 8 | 6789012    | def234-cd   |
| 9 | 6789012    | ghi567-ef   |
+---+------------+-------------+

Notice rows 10, 11, and 12 were not SELECTed because there was not one of them that matched the %def-234% pattern.

I have tried (and failed) to write a sub-query to get the all the related rows when a single message matches a pattern:

SELECT sub.*
    FROM (
        SELECT DISTINCT trans_id FROM table WHERE message LIKE '%def-234%'
    ) sub
WHERE table.trans_id = sub.trans_id

I could easily do this with two queries, but the first query to get a list of matching trans_ids to include in a WHERE trans_id IN (<huge list of trans_ids>) clause would be very large, and would not be a very inefficient way of doing this, and I believe there exists a way to do it with a single query.

Thank you!

Upvotes: 1

Views: 40

Answers (3)

Ramesh
Ramesh

Reputation: 1484

You can do this.

WITH trans
AS
(SELECT DISTINCT trans_id 
   FROM t1 
  WHERE message LIKE '%def234%')
SELECT t1.*
 FROM t1,
      trans
WHERE t1.trans_id = trans.trans_id;

I think this will perform better. If you have enough data, you can do an explain on both Sub query and CTE and compare the output.

Upvotes: 1

Herv&#233; Piedvache
Herv&#233; Piedvache

Reputation: 798

This will do the job I think :

WITH sub AS (
    SELECT trans_id 
      FROM table 
     WHERE message LIKE '%def-234%'
)
SELECT *
  FROM table JOIN sub USING (trans_id);

Hope this help.

Upvotes: 1

Giorgos Betsos
Giorgos Betsos

Reputation: 72165

Try this:

SELECT ID, trans_id, message
FROM (
   SELECT ID, trans_id, message,
          COUNT(*) FILTER (WHERE message LIKE '%def234%') 
                   OVER (PARTITION BY trans_id)  AS pattern_cnt
   FROM mytable) AS t
WHERE pattern_cnt >= 1  

Using a FILTER clause in the windowed version of COUNT function we can get the number of records matching the predefined pattern within each trans_id slice. The outer query uses this count to filter out irrelevant slices.

Demo here

Upvotes: 1

Related Questions