Reputation: 10067
We have a very big database WriteDB
, which store raw trading data and we use this table to fast writes. Then with sql scripts I import data from WriteDB
into ReadDB
in comparatively the same table, but extended with some extra values + relation added. Import script is like that:
TRUNCATE TABLE [ReadDB].[dbo].[Price]
GO
INSERT INTO [ReadDB].[dbo].[Price]
SELECT a.*, 0 as ValueUSD, 0 as ValueEUR
from [WriteDB].[dbo].[Price] a
JOIN [ReadDB].[dbo].[Companies] b ON a.QuoteId = b.QuoteID
So initially there is around 130 mil. rows in this table (~50GB). Each day some of them added, some of them changes, so right now we decide not over complicate logic and just re-import all data. The problem that for some reason with time this script works longer and longer, on the almost same amount of data. First run it's take ~1h, now it's already taken 3h
Also SQL Server after import work not well. After import (or during it) if I try to run different queries, even the simplest they often fail with timeout errors.
What is the reason of such bad behavior and how to fix this?
Upvotes: 7
Views: 1101
Reputation: 13161
One theory is that your first 50GB dataset has filled available memory for caching. Upon truncating the table, your cache is now effectively empty. This alternating behavior makes effective use of the cache difficult and incurs a substantial number of cache misses / increased IO time.
Consider the following sequence of events:
You could test this theory by comparing the SQL Server cache miss ratio during your first and second load operations.
Some ways to improve performance might be to:
MERGE
statement instead of dumping / loading 50GB of data at a time.Upvotes: 5