Reputation: 13
I need a query to classify each record from a table based on a given description. For example, if the description contains 'CABL' the classification will be 'Cable', if it contains 'SLEV' then will be 'Slev', if it contains 'TERM' then will be 'Terminal', if it contains 'MULTI' then will be 'Multiconductor' and so on with many others classifications.
It works like this:
DESCRIPTION: CABL 150C TXL 0.500 BLU LT XLPE THIN
CLASSIFICATION: Cable
DESCRIPTION: SLEV ID=0.24 OD=0.34 L=1.04 CLR
CLASSIFICATION: Slev
DESCRIPTION: TERM 6 F 6.3 DFK AG
CLASSIFICATION: Terminal
DESCRIPTION: CABL 0.500 BLK MULTICONDUCTOR
CLASSIFICATION: Multiconductor
As you can see the problem begins when the description contains more than one 'key word' on it; in the fourth example there are two keywords on the description ('CABL' and 'MULTI') but the correct classification is 'Multiconductor'.
I'm working on a query using CASE and it looks like this:
CASE
WHEN Descrip LIKE 'CABL%'
AND Descrip NOT LIKE '%MULTI%'
AND Descrip NOT LIKE '%SLEV%'
THEN 'CABLE'
WHEN Descrip LIKE '%SEAL%'
AND Descrip NOT LIKE '%CONN%'
AND Descrip NOT LIKE '%LOCK%'
AND Descrip NOT LIKE '%TERM%'
THEN 'SEAL'
END AS Commodity
The query will work but is going to be a big one and I was thinking in use LEFT/RIGHT/INNER JOINs to select all the classifications and then segregate them, the problem is that I need to make the SELECT sentence every time I'm going to need it so I was wondering if there is a form to put the SELECT 'inside' a variable and only do the joins using them.
SELECT * FROM Table WHERE Descrip LIKE 'CABL%' AS cable_selection
SELECT * FROM Table WHERE Descrip LIKE '%MULTI%' AS multi_selection
It is just something I'm thinking on, if you know another way to do this it will help too.
Thanks for your help.
Upvotes: 0
Views: 68
Reputation: 994
I would use a table instead of canned select statements.
use tempdb;
create table DescriptionTable (
[desc] varchar( 50 ),
class varchar( 20 )
)
go
insert into DescriptionTable ( [desc] ) values
('CABL 150C TXL 0.500 BLU LT XLPE THIN'),
('SLEV ID=0.24 OD=0.34 L=1.04 CLR'),
('TERM 6 F 6.3 DFK AG'),
('CABL 0.500 BLK MULTICONDUCTOR' )
create table #classMap (
keyWord varchar( 20 ),
classification varchar( 20 ),
weight int
)
insert into #classMap ( keyWord, classification, weight ) values
( 'CABL%', 'CABLE', 1 ),
( '%SLEV%', 'SLEV', 2 ),
( '%TERM%', 'TERMINAL', 3 ),
( '%MULTICONDUCTOR', 'Multiconductor', 4 )
;;with BestWeight as (
select top 1 with ties [desc], classification
from DescriptionTable
join #classMap
on [desc] like keyWord
order by ROW_NUMBER() over ( partition by [desc] order by weight desc )
)
select * from BestWeight;
drop table #classMap;
This results in:
desc classification weight
CABL 0.500 BLK MULTICONDUCTOR Multiconductor 4
CABL 150C TXL 0.500 BLU LT XLPE THIN CABLE 1
SLEV ID=0.24 OD=0.34 L=1.04 CLR SLEV 2
TERM 6 F 6.3 DFK AG TERMINAL 3
If a simple weight value won't work, then you you'll have to set up a second table of excluded key words that you left join too. Then only keep the ones that don't match.
Upvotes: 0
Reputation: 1270473
You seem to have a priority system for the types, so take advantage of the fact that case
expressions are evaluated in order:
(CASE WHEN Descrip LIKE '%MULTI%' THEN 'MULTI'
WHEN Descrip LIKE 'CABL%' THEN 'CABLE'
WHEN Descrip LIKE '%SLEV%' THEN 'SLEV'
ELSE '???!!!'
END)
Upvotes: 1