broke
broke

Reputation: 8302

Dealing with large amounts of data, and a query with 12 inner joins in SQL Server 2008

There is an old SSIS package that pulls a lot of data from oracle to our Sql Server Database everyday. The data is inserted into a non-normalized database, and I'm working on a stored procedure to select that data, and insert it into a normalized database. The Oracle databases were overly normalized, so the query I wrote ended up having 12 inner joins to get all the columns I need. Another problem is that I'm dealing with large amounts of data. One table I'm selecting from has over 12 million records. Here is my query:

Declare @MewLive Table
(

UPC_NUMBER VARCHAR(50),
ITEM_NUMBER VARCHAR(50),
STYLE_CODE VARCHAR(20),
COLOR  VARCHAR(8),
SIZE VARCHAR(8),
UPC_TYPE INT,
LONG_DESC VARCHAR(120),
LOCATION_CODE VARCHAR(20),
TOTAL_ON_HAND_RETAIL NUMERIC(14,0),
VENDOR_CODE VARCHAR(20),
CURRENT_RETAIL NUMERIC(14,2)


)

INSERT INTO @MewLive(UPC_NUMBER,ITEM_NUMBER,STYLE_CODE,COLOR,[SIZE],UPC_TYPE,LONG_DESC,LOCATION_CODE,TOTAL_ON_HAND_RETAIL,VENDOR_CODE,CURRENT_RETAIL)
SELECT     U.UPC_NUMBER, REPLACE(ST.STYLE_CODE, '.', '') 
                      + '-' + SC.SHORT_DESC + '-' + REPLACE(SM.PRIM_SIZE_LABEL, '.', '') AS ItemNumber, 
                      REPLACE(ST.STYLE_CODE, '.', '') AS Style_Code, SC.SHORT_DESC AS Color, 
                      REPLACE(SM.PRIM_SIZE_LABEL, '.', '') AS Size, U.UPC_TYPE, ST.LONG_DESC, L.LOCATION_CODE, 
                      IB.TOTAL_ON_HAND_RETAIL, V.VENDOR_CODE, SD.CURRENT_RETAIL
FROM         MewLive.dbo.STYLE AS ST INNER JOIN
                      MewLive.dbo.SKU AS SK ON ST.STYLE_ID = SK.STYLE_ID INNER JOIN
                      MewLive.dbo.UPC AS U ON  SK.SKU_ID = U.SKU_ID INNER JOIN
                      MewLive.dbo.IB_INVENTORY_TOTAL AS IB ON SK.SKU_ID = IB.SKU_ID INNER JOIN
                      MewLive.dbo.LOCATION AS L ON IB.LOCATION_ID = L.LOCATION_ID INNER JOIN
                      MewLive.dbo.STYLE_COLOR AS SC ON ST.STYLE_ID = SC.STYLE_ID INNER JOIN
                      MewLive.dbo.COLOR AS C ON SC.COLOR_ID = C.COLOR_ID INNER JOIN
                      MewLive.dbo.STYLE_SIZE AS SS ON ST.STYLE_ID = SS.STYLE_ID INNER JOIN
                      MewLive.dbo.SIZE_MASTER AS SM ON SS.SIZE_MASTER_ID = SM.SIZE_MASTER_ID INNER JOIN
                      MewLive.dbo.STYLE_VENDOR AS SV ON ST.STYLE_ID = SV.STYLE_ID INNER JOIN
                      MewLive.dbo.VENDOR AS V ON SV.VENDOR_ID = V.VENDOR_ID INNER JOIN
                      MewLive.dbo.STYLE_DETAIL AS SD ON ST.STYLE_ID = SD.STYLE_ID
WHERE     (U.UPC_TYPE = 1) AND (ST.ACTIVE_FLAG = 1) 

That query pretty much crashes our server. I tried to fix the problem by breaking the query up into smaller queries, but the temp table variable I use causes the tempdb database to fill the hard drive. I figure this is because the server runs out of memory, and crashes. Is there anyway to solve this problem?

Upvotes: 2

Views: 1221

Answers (4)

Roy
Roy

Reputation: 273

Have you tried using a real table instead of a temporary one. You can use SELECT INTO to create a real table to store the results instead of a temporary one.

Syntax would be:

SELECT 
    U.UPC_NUMBER, 
    REPLACE(ST.STYLE_CODE, '.', '').
    ....
INTO 
    MEWLIVE 
FROM
    MewLive.dbo.STYLE AS ST INNER JOIN
    ...

The command will create the table, and may help with the memory issues you are seeing.

Additionally try looking at the execution plan in query analyser or try the index tuning wizard to suggest some indexes that may help speed up the query.

Upvotes: 4

Nick Vaccaro
Nick Vaccaro

Reputation: 5504

Try running the query from the Oracle server rather than from the SQL server. As it stands, there's most likely going to be a lot of communication over the wire as the query tries to process.

By pre-processing the joins (maybe with a view), you'll only be sending over the results.

Regarding the over-normalization: have you tested whether or not it's an issue in terms of speed? I find it hard to believe that it could be too normalized.

Upvotes: 2

Oleg Dok
Oleg Dok

Reputation: 21766

Proper indexing will definitely help

IF

amount of rows in this query not over "zillions" of rows.

Try the following:

  • Join on dbo.COLOR is excessive if there is FKey on dbo.STYLE_COLOR(COLOR_ID)=>dbo.COLOR(COLOR_ID)

Proper index (excessive, should be reviewed)

USE MewLive
CREATE INDEX ix1 ON dbo.STYLE_DETAIL (STYLE_ID)
INCLUDE (STYLE_CODE, LONG_DESC)
WHERE ACTIVE_FLAG = 1
GO
CREATE INDEX ix2 ON dbo.UPC (SKU_ID)
INCLUDE(UPC_NUMBER)
WHERE UPC_TYPE = 1
GO
CREATE INDEX ix3 ON dbo.SKU(STYLE_ID)
INCLUDE(SKU_ID)
GO
CREATE INDEX ix3_alternative ON dbo.SKU(SKU_ID)
INCLUDE(STYLE_ID)
GO
CREATE INDEX ix4 ON dbo.IB_INVENTORY_TOTAL(SKU_ID, LOCATION_ID)
INCLUDE(TOTAL_ON_HAND_RETAIL)
GO
CREATE INDEX ix5 ON dbo.LOCATION(LOCATION_ID)
INCLUDE(LOCATION_CODE)
GO
CREATE INDEX ix6 ON dbo.STYLE_COLOR(STYLE_ID)
INCLUDE(SHORT_DESC,COLOR_ID)
GO
CREATE INDEX ix7 ON dbo.COLOR(COLOR_ID)
GO
CREATE INDEX ON dbo.STYLE_SIZE(STYLE_ID)
INCLUDE(SIZE_MASTER_ID)
GO
CREATE INDEX ix8 ON dbo.SIZE_MASTER(SIZE_MASTER_ID)
INCLUDE(PRIM_SIZE_LABEL)
GO
CREATE INDEX ix9 ON dbo.STYLE_VENDOR(STYLE_ID)
INCLUDE(VENDOR_ID)
GO
CREATE INDEX ixA ON dbo.VENDOR(VENDOR_ID)
INCLUDE(VENDOR_CODE)
GO
CREATE INDEX ON dbo.STYLE_DETAIL(STYLE_ID)
INCLUDE(CURRENT_RETAIL)

In SELECT list replace U.UPC_TYPE, to 1 as UPC_TYPE,

Upvotes: 1

MarcE
MarcE

Reputation: 3731

Can you segregate the imports - batch them by SKU/location/vendor/whatever and run multiple queries to get the data over? Is there a particular reason it all needs to go across in one hit? (apart from the ease of writing the query)

Upvotes: 1

Related Questions