Reputation: 13

check if a substring exists in another row for the same ID

How could I retrieve only records that have a substring in one attribute that exists in the same attribute but on a different record for the same ID?..

to elaborate what I mean here's an example:

Table containing two attributes:
- ID
- version

Data is as follows:

ID version

1 'draft 1.0'
1 'final'
1 '1.0'
2 'draft 1.2'
2 'final'

In the example below we see that for ID '1' the substring of the first row has '1.0' in it. for ID '1' the '1.0' value is also present in the third record. where as for ID '2' the version '1.2' is only as a substring in the the forth record.. this ID doesn't have a record with the version number by itself..

My aim is to write an SQL in Oracle that would return only ID '1' since it has the version repeated separately in different row

any help on this would be very much appreciated :)

Rami

Upvotes: 1

Answers (3)

Abecee

Reputation: 2393

A way of doing it without regex:

SELECT
  T1.id
  , T1.version
FROM T T1
JOIN T T2
  ON T1.id = T2.id
AND INSTR(T2.version, T1.version) > 1
;

See it in action: SQL Fiddle
Please comment if and as adjustment / further detail is required.

Upvotes: 0

Sylvain Leroux

Reputation: 52060

Given what you explained, the better advice is probably the one given by @mason in a comment: "You should normalize your tables.".

Assuming this is not an option at this time, my second option would be to add some virtual column containing the version number. At the very least this will allow you to easily index those values for efficient retrieving. I tend to think this could ease transition to a proper schema too:

ALTER TABLE T ADD "versnum"
              GENERATED ALWAYS AS (REGEXP_SUBSTR("version",'\d+(\.\d+)+'))
;

CREATE INDEX t_versnum_idx ON T("versnum")
;

Now, you query is simply something like that:

select "ID", "versnum" from t group by "ID", "versnum" having count(*) > 1;

Producing:

 id | versnum 
----+---------
  1 | 1.0

If really you can't at all alter the schema, using a cte is probably your best bet:

with cte as (
    select T.*, REGEXP_SUBSTR("version",'\d+(\.\d+)+') "versnum"
    from t
)   
select "ID", "versnum" from cte group by "ID", "versnum" having count(*) > 1;

Upvotes: 1

Jean-François Savard

Reputation: 21004

This should works

with countOfVersion as
(select id, count(id) occurence
    from yourTable
    where regexp_like(version, '^*1.0$*$')
    group by id) select id from countOfVersion where occurence > 0;

If you want to check for version 1.2 then don't forget to change the 1.0 in the regex.

Upvotes: 0

check if a substring exists in another row for the same ID

Answers (3)

Related Questions