Reputation: 131
I am trying to write some SQL to query a bank statement table, I would like to populate a column as 'Exclude' if any of the 5 bank statement description columns contain 'SURPLUS CASH','CHQ IN AT' or 'ELECTR CHARGE'
My current attempt is
select statement_no
,bank_text_1
,bank_text_2
,bank_text_3
,bank_text_4
,bank_text_5
,amount
,case
when bank_text_1 like '%SURPLUS CASH%'
or bank_text_1 like '%CHQ IN AT%'
or bank_text_1 like '%ELECTR CHARGE%'
or bank_text_2 like '%SURPLUS CASH%'
or bank_text_2 like '%CHQ IN AT%'
or bank_text_2 like '%ELECTR CHARGE%'
or bank_text_3 like '%SURPLUS CASH%'
or bank_text_3 like '%CHQ IN AT%'
or bank_text_3 like '%ELECTR CHARGE%'
or bank_text_4 like '%SURPLUS CASH%'
or bank_text_4 like '%CHQ IN AT%'
or bank_text_4 like '%ELECTR CHARGE%'
or bank_text_5 like '%SURPLUS CASH%'
or bank_text_5 like '%CHQ IN AT%'
or bank_text_5 like '%ELECTR CHARGE%'
then 'Exclude'
else ''
end as checker;
It's very mess and doesn't seem very efficient. Does anyone have suggestions for a better way?
Many Thanks
Example outcome data:
Statement_no | bank_text_1 | bank_text_2 | bank_text_3 | bank_text_4 | bank_text_5 | amount | checker
0001 | SURPLUS CASH| | | | |125.00 |Exclude
0002 | M THOMAS |TRAINING FEE |CHQ IN AT 100217| | |470.00 |Exclude
0003 |SWALEC |Training Centre One |Abergaveny | |Electr Charges|700.00 |Exlcude
0004 |M Thomas |Training Centre Hire | | | |850.00 |
Upvotes: 3
Views: 106
Reputation: 897
You should try :
select statement_no
,bank_text_1
,bank_text_2
,bank_text_3
,bank_text_4
,bank_text_5
,amount
,case
when q.item is not null
then 'Exclude'
else ''
end as checker
from queries
left join
(
select '%CHQ IN AT%' item union
select '%ELECTR CHARGE%' item union
select '%SURPLUS CASH%' item
) q on bank_text_1 like item or
bank_text_2 like item or
bank_text_3 like item or
bank_text_4 like item or
bank_text_5 like item
I am not sure about the performance but in my opinion join are lighter than cross apply. If you can share the benchmarks I will be interested.
Upvotes: 0
Reputation: 38073
select *
, checker = case
when charindex('SURPLUS CASH',concat(bank_text_1, bank_text_2, bank_text_3, bank_text_4, bank_text_5))>0
or charindex('CHQ IN AT',concat(bank_text_1, bank_text_2, bank_text_3, bank_text_4, bank_text_5))>0
or charindex('ELECTR CHARGE',concat(bank_text_1, bank_text_2, bank_text_3, bank_text_4, bank_text_5))>0
then 'Exclude'
else ''
end
from t
results: http://rextester.com/VBOC27547
+--------------+---------------+-----------------------+------------------+-------------+----------------+--------+---------+
| statement_no | bank_text_1 | bank_text_2 | bank_text_3 | bank_text_4 | bank_text_5 | amount | checker |
+--------------+---------------+-----------------------+------------------+-------------+----------------+--------+---------+
| 0001 | SURPLUS CASH | | | | | 125,00 | Exclude |
| 0002 | M THOMAS | TRAINING FEE | CHQ IN AT 100217 | | | 470,00 | Exclude |
| 0003 | SWALEC | Training Centre One | Abergaveny | | Electr Charges | 700,00 | Exclude |
| 0004 | M Thomas | Training Centre Hire | | | | 850,00 | |
+--------------+---------------+-----------------------+------------------+-------------+----------------+--------+---------+
I tested the performance of this compared to John Cappelletti's answer over 1,000,000 rows and I would say they are comparable.
My personal preference would be John's answer, using cross apply()
.
In the sets below, the first query in each set uses charindex(...,concat(..)
, the second is John's cross apply()
version.
The hardware specs on this instance are pretty weak though, so your mileage will vary.
-----------------------
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 3010 ms, elapsed time = 8750 ms.
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5585 ms, elapsed time = 8725 ms.
-----------------------
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 3307 ms, elapsed time = 8685 ms.
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5663 ms, elapsed time = 8913 ms.
-----------------------
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 3073 ms, elapsed time = 8414 ms.
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5647 ms, elapsed time = 9049 ms.
-------------------
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 3011 ms, elapsed time = 8706 ms.
(1000000 row(s) affected)
Table 't'. Scan count 1, logical reads 8153, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 5491 ms, elapsed time = 8704 ms.
-----------------------
Upvotes: 1
Reputation: 72225
This looks a bit tidier, it should perform equally though:
SELECT statement_no,
bank_text_1, bank_text_2, bank_text_3, bank_text_4, bank_text_5,
amount,
IIF(x.IsChecked = 1, 'Exclude', '')
FROM mytable
CROSS APPLY (
SELECT SUM(CASE
WHEN (t.v LIKE '%SURPLUS CASH%') OR
(t.v LIKE '%CHQ IN AT%') OR
(t.v LIKE '%ELECTR CHARGE%')
THEN 1
ELSE 0
END) AS IsChecked
FROM
(
VALUES (bank_text_1), (bank_text_2), (bank_text_3), (bank_text_4), (bank_text_5)
) AS t(v)
) AS x
Upvotes: 0
Reputation: 82020
One option is a CROSS APPLY
Select statement_no
, bank_text_1
, bank_text_2
, bank_text_3
, bank_text_4
, bank_text_5
, amount
, B.Checker
From YourTable A
Cross Apply (Select Checker = case when count(*)>0 then 'Exclude' else '' end
From ( values (A.bank_text_1)
,(A.bank_text_2)
,(A.bank_text_3)
,(A.bank_text_4)
,(A.bank_text_5) ) C1 (Value)
Where charindex('SURPLUS CASH',Value)
+charindex('CHQ IN AT',Value)
+charindex('ELECTR CHARGE',Value)
>0
) B
Returns
Upvotes: 2