s.brody
s.brody

Reputation: 185

How to optimize a stored procedure that takes too long to execute?

I've written a stored procedure to generate random SMS records/events.

When inserting 1.2 million rows, the query takes hundreds of minutes

exec insert_random_sms 1200000

I've coded the stored procedure in a 'procedural' way. But, from what I see, SQL is not very efficient in this respect.

create proc insert_random_sms 
    @number_of_records int
as
begin
    declare @cnt int = 0;   -- loop counter
    declare @phone_id int;
    declare @dest_id int;

    while (@cnt < @number_of_records)
    begin
        declare @charge int = rand() * 100; -- will generate a random charge value between 0 and 100.
        declare @tarrif_plan int = round(rand() * 5, 0); 

        select top 1 @phone_id = phone_no 
        from tbl_phone_agenda 
        order by newid();

        select top 1 @dest_id = phone_no 
        from tbl_phone_agenda 
        order by newid();

        insert into tbl_sms (phone_id, dest_id, charge, tarrif_plan) 
        values (@phone_id, @dest_id, @charge, 
                convert(nvarchar(50), @tarrif_plan));

        set @cnt += 1;
    end
end
go

What is the way to optimize this stored procedure?

Upvotes: 4

Views: 1270

Answers (2)

s.brody
s.brody

Reputation: 185

With a minor change in the way random phone_no are fetched from the existing table tbl_phone_agenda, I've achived the insertion of 1.2 million records in ~50 sec. No doubt that GarethD's solution is the fastest, though.

-- create stored procedure to insert random records into the sms table,     automatically   |   tried and tested
create proc insert_random_sms @number_of_records int
as
begin
declare @cnt int = 0;   -- loop counter
declare @phone_id int;
declare @dest_id int;
while (@cnt < @number_of_records)
    begin
        declare @charge int = rand() * 100; -- will generate a random charge value between 0 and 100.
        declare @tarrif_plan int = round(rand() * 5, 0); 
        -- here come the changes
        select top 1 @phone_id = phone_no from tbl_phone_agenda where (abs(cast((binary_checksum(*) * rand()) as int)) % 100) < 10
        select top 1 @dest_id = phone_no from tbl_phone_agenda where (abs(cast((binary_checksum(*) * rand()) as int)) % 100) < 10
        insert into tbl_sms (phone_id, dest_id, charge, tariff_plan) values (@phone_id, @dest_id, @charge , convert(nvarchar(50), @tarrif_plan));
    set @cnt += 1;
    end
end
go

The inspiration for my solution can be found here: MSDN article - Selecting Rows Randomly from a Large Table

Upvotes: 0

GarethD
GarethD

Reputation: 69769

The method I like to use for generating x number of records is the stacked CTE method (having read this article by Aaron Bertrand who credits Itzik Ben-Gan for the stacked CTE approach):

WITH N1 (N) AS 
(   SELECT 1 
    FROM (VALUES 
            (1), (1), (1), (1), (1), 
            (1), (1), (1), (1), (1)
        ) n (Number)
),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)
SELECT COUNT(*)
FROM N4

This simply starts of with 10 rows, and keeps cross joining, until in the case of the above there are 100,000,000 rows. This would be how I would generate your rows,

When you are using a set based approach you can no longer use RAND() on its own because it is a run time constant, in order to get a new evaluation for each row you need to combine RAND() with NEWID() which is unique per row, so the following will generate a random number between 0 and 100 that is different for each row:

SELECT  CAST(ROUND(RAND(CHECKSUM(NEWID())) * 100, 0) AS INT)

The next thing I would do is put all your phonenumbers into a temp table so that they have a sequential ID (this will be used to allocate randomly):

CREATE TABLE #Phone
(
    ID INT IDENTITY NOT NULL PRIMARY KEY,
    PhoneNo VARCHAR(50) NOT NULL
);
INSERT #Phone (PhoneNo)
SELECT PhoneNo 
FROM tbl_phone_agenda;

So your final query will be

CREATE PROC insert_random_sms @number_of_records IN
AS
BEGIN
    CREATE TABLE #Phone
    (
        ID INT IDENTITY NOT NULL PRIMARY KEY,
        PhoneNo VARCHAR(50) NOT NULL
    );
    INSERT #Phone (PhoneNo)
    SELECT PhoneNo 
    FROM tbl_phone_agenda;

    -- NEEDED SO WE KNOW WHAT NUMBER TO GENERATE A RANDOM
    -- NUMBER IN THE RIGHT RANGE LATER
    DECLARE @PhoneCount INT = (SELECT COUNT(*) FROM #Phone);

    WITH N1 (N) AS 
    (   SELECT 1 
        FROM (VALUES 
                (1), (1), (1), (1), (1), 
                (1), (1), (1), (1), (1)
            ) n (Number)
    ),
    N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
    N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
    N4 (N) AS (SELECT 1 FROM N3 AS N1 CROSS JOIN N3 AS N2)

    INSERT tbl_sms (phone_id, dest_id, charge, tarrif_plan) 
    SELECT  TOP (@number_of_records)
            p.PhoneNo,
            d.PhoneNo,
            Charge = CAST(ROUND(RAND(CHECKSUM(NEWID())) * 100, 0) AS INT),
            tarrif_plan = CAST(ROUND(RAND(CHECKSUM(NEWID())) * 5, 0) AS INT)
    FROM    N4
            INNER JOIN #Phone p
                ON p.ID = CAST(CEILING(RAND(CHECKSUM(NEWID())) * @PhoneCount) AS INT)
            INNER JOIN #Phone d
                ON d.ID = CAST(CEILING(RAND(CHECKSUM(NEWID())) * @PhoneCount) AS INT)

END

In my tests this ran in about 20-30 seconds to generate 1.2m records, looking up against 100,000 phone numbers.

Upvotes: 7

Related Questions