Siva
Siva

Reputation: 302

select unique GROUP_CONCAT-ed rows based on different column

Given table can have following rows.

i.e. for a given filename, there can be two unique version_id(s).

file_id    version_id      filename
1          OS_v1           abc.update
1          App_v1          abc.update
2          OS_v2           xyz.update
2          App_v2          xyz.update
3          OS_v1           abc(1).update
3          App_v1          abc(1).update

PRIMARY KEY (`version_id`, `filename`)

How to detect there are no two different filename's having same combination of OS_App (versions) ?

In the given example, row set with file_id=3 is a duplicate of file_id=1.

Note: It's easy to define separate columns for OS and App version, but that requires a lot of code change which we dont wanted to go through.

Question: is there a SELECT query which would return just file_id = 1 and file_id = 2 and omit file_id = 3 ?

So far I have come up with this query which selects a combination of version_id grouped by filename, but row-2 is a duplicate of row-1

SELECT   DISTINCT(GROUP_CONCAT(version_id SEPARATOR '-')) ,
         filename
FROM     schema_name.table_name
GROUP BY filename;

Returns :

concat_version patch_filename
OS_V1-APP_V1   xyz.update
OS_V2-APP_V2   abc(1).update
OS_V1-APP_V1   abc.update

Upvotes: 1

Views: 45

Answers (1)

GMB
GMB

Reputation: 222442

Question: Is there a SELECT query which would return just file_id = 1 and file_id = 2 and omit file_id = 3

If you are using MySQL 8.0, you can take advantage of window function ROW_NUMBER() :

SELECT x.file_id, x.version_id, x.filename
FROM (
    SELECT t.*, ROW_NUMBER() OVER(PARTITION BY version_id ORDER BY file_id) rn
    FROM master_logs.system_patches t    
) x
WHERE x.rn = 1

The inner query assigns a row number to each record in version_id groups, ordered by file_id, and the outer query filters in records with row number 1.


With earlier versions of MySQL, one typical solution is to use a correlated subquery with a NOT EXISTS condition to filter out unwanted records :

SELECT t.file_id, t.version_id, t.filename
FROM master_logs.system_patches t 
WHERE NOT EXISTS (
    SELECT 1 
    FROM master_logs.system_patches t1
    WHERE t1.version_id = t.version_id AND t1.file_id < t.file_id
)

Upvotes: 1

Related Questions