Reputation: 177
I have 2 tables name and match. The name and match table have columns type. The columns and data in the name table
ID| type |
--| ---- |
1| 1ABC |
2| 2DEF |
3| 3DEF |
4| 4IJK |
The columns and data in match table is
type | DATA |
---|---|
NOT %ABC% AND NOT %DEF% | NOT ABC AND NOT DEF |
%DEF% | DEF ONLY |
NOT %DEF% AND NOT %IJK% | NOT DEF AND NOT IJK |
I have tried using case statement. The first 3 characters will be NOT if there is a NOT in the type in match table. The below query is giving me a missing keyword error. I am not sure what I am missing here
SELECT s.id, s.type, m.data
where case when substr(m.type1,3)='NOT' then s.type not in (REPLACE(REPLACE(m.type,'NOT',''),'AND',','))
ELSE s.type in (m.type) end
from source s, match m;
I need the output to match the type in source column and display the data in match column.
The output should be
ID|type|DATA
1 |1ABC|NOT DEF AND NOT IJK
2 |2DEF|DEF ONLY
3 |3DEF|DEF ONLY
4 |4IJK|NOT ABC AND NOT DEF
Upvotes: 1
Views: 94
Reputation: 271
WITH match_cte AS (
SELECT m.data
,m.type
,decode(instr(m.type,'NOT')
,1 -- found at position 1
,0
,1) should_find_str_1
,substr(m.type
,instr(m.type,'%',1,1) + 1
,instr(m.type,'%',1,2) - instr(m.type,'%',1,1) - 1) str_1
,decode(instr(m.type,'NOT',instr(m.type,'%',1,2))
,0 -- no second NOT
,1
,0) should_find_str_2
,substr(m.type
,instr(m.type,'%',1,3) + 1
,instr(m.type,'%',1,4) - instr(m.type,'%',1,3) - 1) str_2
FROM match m
)
SELECT s.id
,s.type
,m.data
FROM source s
CROSS JOIN match_cte m
WHERE m.should_find_str_1 = sign(instr(s.type,m.str_1))
AND (m.str_2 IS NULL
OR m.should_find_str_2 = sign(instr(s.type, m.str_2))
)
ORDER BY s.id, m.data
MATCH_CTE |DATA|TYPE|SHOULD_FIND_STR_1|STR_1|SHOULD_FIND_STR_2|STR_2| |-|-|-|-|-|-| |NOT ABC AND NOT DEF|NOT %ABC% AND NOT %DEF%|0|ABC|0|DEF| |DEF|%DEF%|1|DEF|1|NULL| |NOT DEF AND NOT IJK|NOT %DEF% AND NOT %IJK%|0|DEF|0|IJK|
Upvotes: 0
Reputation: 3576
The biggest problem with your attempted query seems to be that SQL requires the WHERE
clause to come after the FROM
clause.
But your query is flawed in other ways as well. Although it can have complicated logic within it, including subqueries, a CASE
statement must ultimately return a constant. Conditions within it are not applied as if they are in a WHERE
clause of the main query (like you appear to be trying to do).
My recommendation would be to not store the match
table as you currently are. It seems much preferable to have something that contains each condition you want to evaluate. Assuming that's not possible, I suggest a CTE (or even a view) that breaks it down that way first.
This query (based on Nefreo's answer for breaking strings into multiple rows)...
SELECT
data,
regexp_count(m.type, ' AND ') + 1 num,
CASE WHEN REGEXP_SUBSTR(m.type,'(.*?)( AND |$)',1,levels.column_value) like 'NOT %' THEN 1 ELSE 0 END negate,
replace(replace(REGEXP_SUBSTR(m.type,'(.*?)( AND |$)',1,levels.column_value), 'NOT '), ' AND ') match
FROM match m INNER JOIN
table(cast(multiset(select level from dual connect by level <= regexp_count(m.type, ' AND ') + 1) as sys.OdciNumberList)) levels
ON 1=1
... breaks your match table into something more like:
DATA | NUM | NEGATE | MATCH |
---|---|---|---|
NOT ABC AND NOT DEF | 2 | 1 | %ABC% |
NOT ABC AND NOT DEF | 2 | 1 | %DEF% |
DEF ONLY | 1 | 0 | %DEF% |
NOT DEF AND NOT IJK | 2 | 1 | %DEF% |
NOT DEF AND NOT IJK | 2 | 1 | %IJK% |
So we now know each specific like
condition, whether it should be negated, and the number of conditions that need to be matched for each MATCH
row. (For simplicity, I am using match.data
as essentially a key for this since it is unique for each row in match
and is what we want to return anyway, but if you were actually storing the data this way you'd probably use a sequence of some sort and not repeat the human-readable text.)
That way, your final query can be quite simple:
SELECT name.id, name.type, criteria.data
FROM name INNER JOIN criteria
ON
(criteria.negate = 0 AND name.type LIKE criteria.match)
OR
(criteria.negate = 1 AND name.type NOT LIKE criteria.match)
GROUP BY name.id, name.type, criteria.data
HAVING COUNT(*) = MAX(criteria.num)
ORDER BY name.id
The conditions in the ON
do the appropriate LIKE
or NOT LIKE
(matches one condition from the CRITERIA
view/CTE), and the condition in the HAVING
makes sure we had the correct number of total matches to return the row (makes sure we matched all the conditions in one row of the MATCH
table).
You can see the entire thing...
WITH criteria AS
(
SELECT
data,
regexp_count(m.type, ' AND ') + 1 num,
CASE WHEN REGEXP_SUBSTR(m.type,'(.*?)( AND |$)',1,levels.column_value) like 'NOT %' THEN 1 ELSE 0 END negate,
replace(replace(REGEXP_SUBSTR(m.type,'(.*?)( AND |$)',1,levels.column_value), 'NOT '), ' AND ') match
FROM match m INNER JOIN
table(cast(multiset(select level from dual connect by level <= regexp_count(m.type, ' AND ') + 1) as sys.OdciNumberList)) levels
ON 1=1
)
SELECT name.id, name.type, criteria.data
FROM name INNER JOIN criteria
ON
(criteria.negate = 0 AND name.type LIKE criteria.match)
OR
(criteria.negate = 1 AND name.type NOT LIKE criteria.match)
GROUP BY name.id, name.type, criteria.data
HAVING COUNT(*) = MAX(criteria.num)
ORDER BY name.id
... working in this fiddle.
As a one-off, I don't think this is significantly different than the other answer already provided, but I wanted to do this since I think this is probably more maintainable if the complexity of your conditions changes.
It already handles arbitrary numbers of conditions, mixes of NOT
and not-NOT
within the same row of MATCH
, and allows for the % signs (for the like
) to be placed arbitrarily (e.g. startswith%
, %endswith
, %contains%
, start%somewhere%end
, exactmatch
should all work as expected). If in the future you want to add different types of conditions or handle OR
s, I think the general ideas here will apply.
Upvotes: 2
Reputation: 7786
Not knowing the possible other rules for selecting rows, just with your data from the question, maybe you could use this:
WITH
tbl_name AS
(
Select 1 "ID", '1ABC' "A_TYPE" From Dual Union All
Select 2 "ID", '2DEF' "A_TYPE" From Dual Union All
Select 3 "ID", '3DEF' "A_TYPE" From Dual Union All
Select 4 "ID", '4IJK' "A_TYPE" From Dual
),
tbl_match AS
(
Select 'NOT %ABC% AND NOT %DEF%' "A_TYPE", 'NOT ABC AND NOT DEF' "DATA" From Dual Union All
Select '%DEF%' "A_TYPE", 'DEF ONLY' "DATA" From Dual Union All
Select 'NOT %DEF% AND NOT %IJK%' "A_TYPE", 'NOT DEF AND NOT IJK' "DATA" From Dual
)
Select
n.ID "ID",
n.A_TYPE,
m.DATA
From
tbl_match m
Inner Join
tbl_name n ON (1=1)
Where
(
INSTR(m.A_TYPE, 'NOT %' || SubStr(n.A_TYPE, 2) || '%', 1, 1) = 0
AND
INSTR(m.A_TYPE, 'NOT %' || SubStr(n.A_TYPE, 2) || '%', 1, 2) = 0
AND
Length(m.A_TYPE) > Length(SubStr(n.A_TYPE, 2)) + 2
)
OR
(
Length(m.A_TYPE) = Length(SubStr(n.A_TYPE, 2)) + 2
AND
'%' || SubStr(n.A_TYPE, 2) || '%' = m.A_TYPE
)
Order By n.ID
Result:
ID | A_TYPE | DATA |
---|---|---|
1 | 1ABC | NOT DEF AND NOT IJK |
2 | 2DEF | DEF ONLY |
3 | 3DEF | DEF ONLY |
4 | 4IJK | NOT ABC AND NOT DEF |
Any other format of condition should be evaluated separately ...
Regards...
Upvotes: 0