SQL special group by on list of strings ending with *

Question

I would like to perform a "special group by" in SQL on a list of strings, some of which end with "*". I use PostgreSQL. I cannot formulate the problem clearly, even though I have partially solved it with SELECT, UNION and nested queries, which are not elegant.

For example:

1) INPUT: I have a list of strings:

thestrings
varchar(9)
--------------
1000
1000-0001
1000-0002
2000*
2000-0001
2000-0002
3000*
3000-00*
3000-0001
3000-0002

2) OUTPUT: What I would like my "special group by" to return:

1000
1000-0001
1000-0002
2000*
3000*

Because 2000-0001 and 2000-0002 are included in 2000*, and because 3000-00*, 3000-0001 and 3000-0002 are included in 3000*.

3) The SQL query I use (in pseudocode):

SELECT every string ending with *
UNION
SELECT every string whose beginning is NOT IN (SELECT every string ending with *)   <-- with multiple inelegant left() calls and NOT IN subqueries

4) What my query returns:

1000
1000-0001
1000-0002
2000*
3000*
3000-00* <-- the problem

The problem is that 3000-00* stays in my result.

So my question is: how can I generalize this, so that every string that begins with another string in the list (one ending with *) is removed? I thought of regular expressions, but how can I pass a list from a SELECT into a regex?

Thanks for your help.

Thorsten Kettner · Accepted Answer

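Select only the strings for which no master string exists in the table. A master string is a different row that ends with * and whose text before the * is a prefix of the string in question: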

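select str
from mytable
where not exists
(
  select *
  from mytable master
  -- only rows ending with * can act as a master
  where master.str like '%*'
  -- a row is never its own master
  and master.str <> mytable.str
  -- compare with trailing * stripped: the row must start with the master's prefix
  and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
);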
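
To check the query against the sample data from the question, here is a minimal sketch that runs it from Python. It rests on a few assumptions that are not part of the answer: it uses an in-memory SQLite database as a stand-in for PostgreSQL (both support two-argument rtrim, the || operator and LIKE), the helper name surviving_strings is made up for illustration, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    "1000", "1000-0001", "1000-0002",
    "2000*", "2000-0001", "2000-0002",
    "3000*", "3000-00*", "3000-0001", "3000-0002",
]

# The query from the answer, plus an ORDER BY (an addition, for stable output).
QUERY = """
select str
from mytable
where not exists
(
  select *
  from mytable master
  where master.str like '%*'
  and master.str <> mytable.str
  and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
)
order by str
"""


def surviving_strings(rows):
    """Load rows into a throwaway table and return the strings the query keeps."""
    # SQLite is an assumption here, used only so the example is self-contained.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table mytable (str varchar(9))")
        conn.executemany(
            "insert into mytable (str) values (?)", [(r,) for r in rows]
        )
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    for s in surviving_strings(ROWS):
        print(s)
```

On the sample data this keeps 1000, 1000-0001, 1000-0002, 2000* and 3000*, and drops 3000-00* because 3000* is its master. Note that the master prefix is used as a LIKE pattern, so a prefix containing % or _ would be treated as a wildcard; the sample data contains only digits and hyphens, so that does not come up here.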
