Reputation: 73
I have a SQL query that fetches one row (LIMIT 1) for a specific client code from the TABLEA based on several conditions. Here's the query:
SELECT
`TABLEA`.`client_code`
FROM
`TABLEA`
WHERE
(
`TABLEA`.`accounting_date` = 202408
AND (
`TABLEA`.`FLAG1` = ''
OR `TABLEA`.`FLAG1` IS NULL
)
AND `TABLEA`.`client_code` ='A12'
AND (
`TABLEA`.`FLAG2` = ''
OR `TABLEA`.`FLAG2` IS NULL
)
AND (
`TABLEA`.`FLA3` = ''
OR `TABLEA`.`FLA3` IS NULL
)
AND (
`TABLEA`.`payment_closing_flag` = ''
OR `TABLEA`.`payment_closing_flag` IS NOT NULL
)
AND NOT (
`TABLEA`.`FLAG4` = 1
AND `TABLEA`.`FLAG4` IS NOT NULL
AND `TABLEA`.`record_status` = 'D'
AND `TABLEA`.`record_status` IS NOT NULL
AND `TABLEA`.`ABC_CODE` IN (0, 370, 380, 390, 400, 410, 420, 900)
AND `TABLEA`.`ABC_CODE` IS NOT NULL
)
) LIMIT 1;
The query works for a single client_code, but I want to execute it for multiple client_code values (e.g., 'A12', 'A13', 'A14', etc.) and retrieve the LIMIT 1 result for each of those client codes.
Here are my constraints:
I cannot use GROUP BY or DISTINCT due to the large number of rows (millions per client), and I only need to check the existence of one record for each client code. Performance is critical, and using DISTINCT or GROUP BY is too slow for this dataset. Is there a way to iterate over multiple client codes and execute this LIMIT 1 query for each client code in a single query or within a single execution block?
Upvotes: 0
Views: 94
Reputation: 142208
As written, you need
INDEX(accounting_date, client_code)
"fetches one row" -- But which row? The most popular? The newest? Something else? What should the ORDER BY
be?
Wherever you say a='' OR a IS NULL
, consider changing the logic to either use ''
or use NULL
for a
. That is, having two values connected by OR
leads to difficulty in optimization.
To get the one from each client_code
, see the tag I added or see Groupwise-Max
Upvotes: 0
Reputation: 7758
I'm afraid that you'll need some kind of aggregation per client_code to make this work. With this, the execution time depends on (beside the table definition, indexes, constraints, etc...) the scope of codes to be fetched. If you need to fetch just a few codes (or a limited reasonable number of predefined codes) out of many in the table then, one of the options could be using Row_Number() Over() analytic function to get just 1 row per client_code.
-- S a m p l e D a t a :
Create Table tbl ( client_code Text, accounting_date Date,
flag1 Text, flag2 Text, flag3 Text, flag4 Text,
payment_closing_flag Text, record_status Text, ABC_CODE Int);
Insert Into tbl VALUES
( 'A12', '2024-08-01', 'F1', 'F2', 'F3', 'F4', 'PCF', 'D', 0 ),
( 'A12', '2024-08-01', null, null, null, null, 'PCF', 'D', 0 ),
( 'A12', '2024-08-01', null, null, null, null, 'PCF', 'D', 0 ),
( 'A13', '2024-10-01', 'F1', 'F2', 'F3', 'F4', 'PCF', 'D', 0 ),
( 'A13', '2024-10-01', null, null, null, null, 'PCF', 'D', 0 ),
( 'A14', '2024-08-01', 'F1', 'F2', 'F3', 'F4', 'PCF', 'D', 0 ),
( 'A14', '2024-08-01', '', null, null, null, 'PCF', 'D', 0 ),
( 'A15', '2024-08-01', 'F1', 'F2', 'F3', 'F4', 'PCF', 'D', 0 ),
( 'A16', '2024-08-01', '', null, null, null, 'PCF', 'D', 0 );
1. Predefined client_code(s) of interest
-- S Q L : ( filtered by predefined list of client_codes )
WITH
codes (client_code) AS
( Select 'A12' Union All
Select 'A13' Union All
Select 'A14'
)
SELECT c.client_code
FROM ( Select c.client_code, Row_Number() Over(Partition By t.client_code) as RN
From codes c
Inner Join tbl t ON(t.client_code = c.client_code)
Where ( t.accounting_date = '2024-08-01'
AND ( t.FLAG1 = '' OR t.FLAG1 IS NULL )
AND ( t.FLAG2 = '' OR t.FLAG2 IS NULL )
AND ( t.FLAG3 = '' OR t.FLAG3 IS NULL )
AND ( t.payment_closing_flag = '' OR t.payment_closing_flag IS NOT NULL )
AND NOT ( t.FLAG4 = '1' AND t.FLAG4 IS NOT NULL
AND t.record_status = 'D' AND t.record_status IS NOT NULL
AND t.ABC_CODE IN (0, 370, 380, 390, 400, 410, 420, 900) AND t.ABC_CODE IS NOT NULL
)
)
) c
WHERE RN = 1;
R e s u l t :
client_code |
---|
A12 |
A14 |
If you need all client_code(s) that satisfy conditions - this would be performance costly depending on structure of your actual data, but also on other factors as with any SQL dealing with large tables data.
2. for all client_codes
-- S Q L : ( for all client_codes )
SELECT t.client_code
FROM ( Select t.client_code, Row_Number() Over(Partition By t.client_code) as RN
From tbl t
Where ( t.accounting_date = '2024-08-01'
AND ( t.FLAG1 = '' OR t.FLAG1 IS NULL )
AND ( t.FLAG2 = '' OR t.FLAG2 IS NULL )
AND ( t.FLAG3 = '' OR t.FLAG3 IS NULL )
AND ( t.payment_closing_flag = '' OR t.payment_closing_flag IS NOT NULL )
AND NOT ( t.FLAG4 = '1' AND t.FLAG4 IS NOT NULL
AND t.record_status = 'D' AND t.record_status IS NOT NULL
AND t.ABC_CODE IN (0, 370, 380, 390, 400, 410, 420, 900) AND t.ABC_CODE IS NOT NULL
)
)
) t
WHERE RN = 1;
R e s u l t :
client_code |
---|
A12 |
A14 |
A16 |
Note:
Sample data are "dummy" for simplicity and using date and/or flag1 for distinction of your complex Where conditions. Conditions are taken as in your code without analysys. There might be some possibilities for simplification, but you know better your data - so I leave it as it was provided.
Upvotes: 0
Reputation: 21
Lets say you have a table with the list of all clients TABLE_ALL_CLIENTS
. Then try this:
SELECT `TABLE_ALL_CLIENTS`.`client_code`,
CASE
WHEN EXISTS (
SELECT *
FROM `TABLEA`
WHERE `TABLEA`.`client_code` = `TABLE_ALL_CLIENTS`.`client_code`
)
THEN 1
ELSE 0
END AS IfExists
FROM `TABLE_ALL_CLIENTS`
As a result you have table of clients with value 1 or 0, then you can make one more sub-request from it to get the final result set
Upvotes: 1