xin geng
xin geng

Reputation: 23

Better ways for merging data from two sources using Linq

I have some C#/Linq code used to merge data from excel file into db, which needs better performance.

There are
1. A List read from excel file: List<Score> newScoreList
2. A DB table named Scores, primary keys peopleId and testDate

I need to merge data from the list to the table, and if there is any duplicate data, update it.

My current solution is:

1) Find the duplicate data with this LINQ expression:

var dupliData =  
    from newScore in newScoreList  
    from oldScore in db.Scores  
    where newScore.peopleId == oldScore.peopleId && newScore.testDate == oldScore.testDate  
    select oldScore;  

2) Delete the duplicate data.

db.Scores.DeleteAllOnSubmit(dupliData); 

3) Insert the new data from list.

db.Scores.InsertAllOnSubmit(newScoreList);

Could anybody give me a better solution?

Upvotes: 2

Views: 1249

Answers (2)

Pleun
Pleun

Reputation: 8920

If you are using SQL 2008 you can use the Merge command http://www.builderau.com.au/program/sqlserver/soa/Using-SQL-Server-2008-s-MERGE-statement/0,339028455,339283059,00.htm

Upvotes: 0

Josh Smeaton
Josh Smeaton

Reputation: 48730

I really hate stored procedures in general, but this is probably a perfect case for using one. My TSQL is rusty, but this should give an idea.

CREATE PROCEDURE dbo.InsertOrUpdateScore
(
    @id as Int,
    @date as DateTime,
    @result as varchar(20)
)
AS

if not exists(SELECT id FROM Scores WHERE id = @id AND date = @date)
begin
   INSERT INTO Scores (id, date, result) values (@id, @date, @result)
end
else
begin
   UPDATE Scores
   SET result = @result
   WHERE id = @id AND date = @date
end
GO 

Now in your LINQ server browser, select the Score entity, and change its INSERT and UPDATE behaviour to use the stored procedure you just created. Make sure the user accessing the database has EXECUTE permission to the SPROC.

This should perform quite a bit quicker than your version. You're trading an IN clause for N SELECTs on an index which may be quicker. However, the result set of the IN clause is not transported back to the client over the network, which could save quite a bit of time.

Profile exactly how long your method is taking before implementing this, so you can gauge if this is truly quicker.

I'm not sure if this is the only way to create a Score in your application, but you might want to consider the case where you're INSERTing a record that doesn't yet have an ID. You'll need to modify the SPROC to allow @id as null, and handle the INSERT appropriately.

Then it should just be:

db.Scores.InsertAllOnSubmit(newScoreList);

Upvotes: 1

Related Questions