Craig Jones
Craig Jones

Reputation: 51

Using a case to assign numerical values and pull through the highest value

I am currently trying to overhaul the current keyword search I have put into the scripting software we use (the old one was vey basic and cumbersome) to something more refined. There are limitations to the software we use, and IT are tied up, so I do not have the option of creating a function or stored procedure which I appreciate would be the ideal solution.

The situation is that the end user might be using a generic script, when they should be using a specific script for their issue, so I want to create some SQL that detects if they should have gone to a different script using a keyword search.

I have a list of words associated with each script I feel they should be using, for example:

Repair script keywords: repair, broken, chasing

Estate script keywords: dirty, cleaning, garden

What I want to do is to make some SQL that assigns a numerical value of 1 to each instance of these words within the databox '{Script.Details01}', and then works out which set of keywords has the highest tally at the end.

This is what I have tired so far, I know it's likely not working due to the syntax. Sadly the software we are using is pretty vague when giving error messages so it's not much help. We are using aliases of V and D. D is the user display, so what they physically see, V is the value that the system reads and is not seen by the user. The databox which is where the string we are searching from is '{Script.Details01}'. As this information is stored within the software virtually we do not have to use the FROM field as we would do normally when referencing this location.

SELECT 'GO TO DIFFERENT SCRIPT' D, 'GO TO DIFFERENT SCRIPT' V,
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%repair%') THEN 1 ELSE 0 END +
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%broken%') THEN 1 ELSE 0 END +
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%chasing%') THEN 1 ELSE 0 END AS REP
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%dirty%') THEN 1 ELSE 0 END +
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%cleaning%') THEN 1 ELSE 0 END +
        CASE WHEN EXISTS(SELECT '{Script.Details01}'  WHERE '{Script.Details01}' like '%garden%') THEN 1 ELSE 0 END AS EST
WHERE REP = (SELECT MAX(REP)) AND REP <> 0 AND > EST
OR EST = (SELECT MAX(EST)) AND EST <> 0 AND > REP

Essentially what I'm looking for the code to do is to tell me if there is a higher tally for REP (repair) and EST (estate) or if there are no values registered against either. Apologies if I have not explained this well, there are a few restrictions within the software we are using so trying to explain it as best I can. Any ideas would be greatly appreciated.

Upvotes: 1

Views: 119

Answers (3)

Khaled.Zeidiyeh
Khaled.Zeidiyeh

Reputation: 1

Try this, note that I have a sample text for every keywords. you need to replace these with the one you have:

DECLARE @repairtext AS varchar(max);
SET @repairtext = 'repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @repair int,@RepairCounter int
SET @repair= PATINDEX('%repair%',@repairtext)
IF @repair<>0  SET @Repaircounter=1
WHILE @repair<>0
BEGIN
SET @repairtext = STUFF(@repairtext,1,@repair+6,'')
SET @repair= PATINDEX('%repair%',@repairtext)
IF @repair<>0  SET @RepairCounter=@RepairCounter+1
END;

DECLARE @brokentext AS varchar(max);
SET @brokentext = 'repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @broken int,@BrokenCounter int
SET @broken= PATINDEX('%broken%',@brokentext)
IF @broken<>0  SET @BrokenCounter=1
WHILE @broken<>0
BEGIN
SET @brokentext = STUFF(@brokentext,1,@broken+6,'')
SET @broken= PATINDEX('%broken%',@brokentext)
IF @broken<>0  SET @BrokenCounter=@BrokenCounter+1
END;

DECLARE @chasingtext AS varchar(max);
SET @chasingtext = 'repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @chasing int,@chasingCounter int
SET @chasing= PATINDEX('%chasing%',@chasingtext)
IF @chasing<>0  SET @chasingCounter=1
WHILE @chasing<>0
BEGIN
SET @chasingtext = STUFF(@chasingtext,1,@chasing+7,'')
SET @chasing= PATINDEX('%chasing%',@chasingtext)
IF @chasing<>0  SET @chasingCounter=@chasingCounter+1
END;

DECLARE @dirtytext AS varchar(max);
SET @dirtytext ='repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @dirty int,@dirtyCounter int
SET @dirty= PATINDEX('%dirty%',@dirtytext)
IF @dirty<>0  SET @dirtyCounter=1
WHILE @dirty<>0
BEGIN
SET @dirtytext = STUFF(@dirtytext,1,@dirty+5,'')
SET @dirty= PATINDEX('%dirty%',@dirtytext)
IF @dirty<>0  SET @dirtyCounter=@dirtyCounter+1
END;

DECLARE @cleaningtext AS varchar(max);
SET @cleaningtext = 'repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @cleaning int,@cleaningCounter int
SET @cleaning= PATINDEX('%cleaning%',@cleaningtext)
IF @cleaning<>0  SET @cleaningCounter=1
WHILE @cleaning<>0
BEGIN
SET @cleaningtext = STUFF(@cleaningtext,1,@cleaning+8,'')
SET @cleaning= PATINDEX('%cleaning%',@cleaningtext)
IF @cleaning<>0  SET @cleaningCounter=@cleaningCounter+1
END;

DECLARE @gardentext AS varchar(max);
SET @gardentext = 'repair, broken, chasing, garden dirty, cleaning, garden'
DECLARE @garden int,@gardenCounter int
SET @garden= PATINDEX('%garden%',@gardentext)
IF @garden<>0  SET @gardenCounter=1
WHILE @garden<>0
BEGIN
SET @gardentext = STUFF(@gardentext,1,@garden+6,'')
SET @garden= PATINDEX('%garden%',@gardentext)
IF @garden<>0  SET @gardenCounter=@gardenCounter+1
END;

DECLARE @REP int = @RepairCounter + @BrokenCounter + @chasingCounter
,@EST int = @dirtyCounter + @cleaningCounter + @gardenCounter;

IF @REP > @EST
SELECT @REP AS REP
ELSE IF  @REP < @EST
 SELECT @EST AS EST
 ELSE SELECT @REP AS REP;

Upvotes: 0

Me.Name
Me.Name

Reputation: 12544

I don't know if you can store the keywords in the database, but that would imho be preferable over hard coding. That way you could keep the maintenance of the keywords outside the function. The t-sql below works as is, but the keywords table can be ported to the db itself instead:

declare @keywords table(word varchar(50), type varchar(10)) --preferrably when stored in db, the type would be normalized to another table
insert into @keywords values
    ('repair', 'Rep'),
    ('broken', 'Rep'),
    ('chasing', 'Rep'),
    ('dirty', 'EST'),
    ('cleaning', 'EST'),
    ('garden', 'EST')

declare @teststring varchar(512) = 'When the film "chasing cars" was broken, we tried to repair it. It turned out it was dirty from lying in the garden, so we gave it a thorough cleaning' 

select top 1 k.type, COUNT(1) cnt from @keywords k where  @teststring like '%' + k.word + '%' group by k.type order by COUNT(1) desc

For each word inside the table, a like is performed (wildcards can be used inside the words as well). The group by checks the number of occurrences for each type and the order by and top 1 make sure you only get the one with most occurences (you could add an extra gravity to the types to sort on, in case both types have an equal number of occurences)

edit Since storing in the db isn't an option: the same can also be done without a table variable:

select top 1 k.type from (values
    ('repair', 'Rep'),
    ('broken', 'Rep'),
    ('chasing', 'Rep'),
    ('dirty', 'EST'),
    ('cleaning', 'EST'),
    ('garden', 'EST')
     ) k(word,type) 
 where  @teststring like '%' + k.word + '%' group by k.type order by COUNT(1) desc

Upvotes: 2

user1429080
user1429080

Reputation: 9166

Here is a first stab at solving this to some extent. It uses a CTE which might or might not be feasible in your case:

declare @inputText nvarchar(2000)
set @inputText = 'Caller wanting to complain about the repair they have chasing for days, as their boiler is garden broken and needs repair'

--------

declare @inputText nvarchar(2000)
set @inputText = 'Caller wanting to complain about the repair they have chasing for days, as their boiler is garden broken and needs repair'

;with SplitIntoWords(DataItem, Data) as (
    select cast(LEFT(@inputText, CHARINDEX(' ', @inputText + ' ') - 1) as nvarchar(2000)),
        cast(STUFF(@inputText, 1, CHARINDEX(' ', @inputText + ' '), '') as nvarchar(2000))
    union all
    select cast(LEFT(Data, CHARINDEX(' ', Data + ' ') - 1) as nvarchar(2000)),
        cast(STUFF(Data, 1, CHARINDEX(' ', Data + ' '), '') as nvarchar(2000))
    from SplitIntoWords
    where Data > ''
)
select  (
    select count(*) from SplitIntoWords where DataItem in ('repair','broken','chasing')
) as rep,
(
    select count(*) from SplitIntoWords where DataItem in ('dirty','cleaning','garden')
) as est,
(
    select count(*) from SplitIntoWords where DataItem not in ('dirty','cleaning','garden','repair','broken','chasing')
) as other

Note: The CTE is an adaptation of a CTE in this answer.

So what does it actually do? It uses the CTEto split the input text into individual words, then it searches for the given keywords and performs a count. Output from sample:

+-----+-----+-------+
| rep | est | other |
+-----+-----+-------+
|   4 |   1 |    16 |
+-----+-----+-------+

So it has found 1 keyword which belongs to the Estate group, 4 keywords from the Repair group, and 16 others.

To adapt the sample for your use case, replace @inputText with '{Script.Details01}'.

--- EDIT ---

Try this then:

;with SplitIntoWords(DataItem, Data) as (
    select cast(LEFT(@inputText, CHARINDEX(' ', @inputText + ' ') - 1) as nvarchar(2000)),
        cast(STUFF(@inputText, 1, CHARINDEX(' ', @inputText + ' '), '') as nvarchar(2000))
    union all
    select cast(LEFT(Data, CHARINDEX(' ', Data + ' ') - 1) as nvarchar(2000)),
        cast(STUFF(Data, 1, CHARINDEX(' ', Data + ' '), '') as nvarchar(2000))
    from SplitIntoWords
    where Data > ''
)
select top 1 scriptType, count(*) as typeCount
from (
    select case when DataItem in ('repair','broken','chasing') then 'rep' when DataItem in ('dirty','cleaning','garden') then 'est' else '' end as scriptType,
        DataItem
    from SplitIntoWords
) as sub
where scriptType != ''
group by scriptType
order by count(*) desc

Output:

+------------+-----------+
| scriptType | typeCount |
+------------+-----------+
| rep        |         4 |
+------------+-----------+

--- ONE MORE EDIT, Wildcard searching ---

Replace

    select case when DataItem in ('repair','broken','chasing') then 'rep' when DataItem in ('dirty','cleaning','garden') then 'est' else '' end as scriptType,

with

    select 
        case when 
            DataItem like '%repair%' 
            or DataItem like '%broken%'
            or DataItem like '%chasing%' then 'rep'
        when 
            DataItem like '%dirty&'
            or DataItem like '%cleaning%'
            or DataItem like '%garden%' then 'est'
        else '' end as scriptType,

Upvotes: 2

Related Questions