Reputation: 1718
I have data with the same schema in a pipe delimited text file and in a database table, including the primary key column.
I have to check if each row in the file is present in the table, if not generate an INSERT statement for that row.
The table has 30 columns, but here I've simplified for this example:
ID Name Address1 Address2 City State Zip
ID is the running identity column; so if a particular ID value from the file is found in the table, there should be no insert
statement generated for that.
Here's my attempt, which doesn't feel correct:
foreach (var item in RecipientsInFile)
{
if (!RecipientsInDB.Any(u => u.ID == item.ID ))
{
Console.WriteLine(GetInsertSql(item));
}
}
Console.ReadLine();
EDIT: Sorry, I missed the asking the actual question; how to do this? Thank you very much for all the help.
EDIT: The table has a million plus rows, while the file has 50K rows. This a one time thing, not a permanent project.
Upvotes: 1
Views: 96
Reputation: 100607
Try comparing the ID lists using .Except()
List<int> dbIDs = Recipients.Select(x=>x.ID).ToList();
List<int> fileIDs = RecipientsFile.Select(x=>x.ID).ToList();
List<int> toBeInserted = fileIDs.Except(dbIDs).ToList();
toBeInserted.ForEach(x=>GetInsertSqlStatementForID(x));
For the pedantic and trollish among us in the comments, please remember the above code (like any source code you find on the interwebs) shouldn't be copy/pasted into your production code. Try this refactoring:
foreach (var item in RecipientsFile.Select(x=>x.ID)
.Except(DatabaseRecipients.Select(x=>x.ID)))
{
GetInsertSqlStatementForID(item);
}
Upvotes: 2
Reputation: 40413
Lots of ways of accomplishing this. Yours is one way.
Another would be to always generate SQL, but generate it in the following manner:
if not exists (select 1 from Recipients where ID == 1234)
insert Recipients (...) values (...)
if not exists (select 1 from Recipients where ID == 1235)
insert Recipients (...) values (...)
Another would be to retrieve the entire contents of the database into memory beforehand, loading the database IDs into a HashSet
, then only checking that HashSet
to see if it exists - would take a little longer to get started, but would be faster for each record.
Any of these three techniques would work - it all depends on how big your database table is, and how big your file is. If they're both relatively small (maybe 10,000 records or so), then any of these should work fine.
EDIT
And there's always option D: Insert all records from the file into a temporary table (could be a real table or a SQL temp table, doesn't really matter) in the database, then use SQL to join the two tables together and retrieve the differences (using not exists
or in
or whatever technique you want), and insert the missing records that way.
Upvotes: 0
Reputation: 3599
I would add all the RecipientsInDB Ids in a HashSet and then test if the set contains the item Id.
var recipientsInDBIds = new Hashset(RecipientsInDB.Select(u => u.ID));
foreach (var item in RecipientsInFile)
{
if (!recipientsInDBIds.Contains(item.ID ))
{
Console.WriteLine(GetInsertSql(item));
}
}
Console.ReadLine();
Upvotes: 2