aleroot
aleroot

Reputation: 72686

Fastest way to import text file into database

I've a text file containing record of a dump of database table in a custom format, this text have a character in a specific position that identifies the operation with the record :

Therefore if i find a D record in text file i need to delete record into database, instead if i find an M record i need to Insert the record if not exist in the database, if already exist i need to update it.

What is the better and fastest way to import a similar text file in a database table using .NET Framework and c#? I have 300000 record of average in this text file.

Thanks

Upvotes: 0

Views: 5737

Answers (5)

THEn
THEn

Reputation: 1936

Insert Into Temporary table then JOIN to UPDATE or DELETE.

Upvotes: 0

Dr. Wily's Apprentice
Dr. Wily's Apprentice

Reputation: 10280

If you are using SQL Server, you could take advantage of the Bulk Insert functionality. That should be the fastest way to insert data from a file into the database. The first thing I would do is insert the data in your file into a "landing table" (i.e. a table whose structure matches the structure of your file). Also note: .NET 2.0 introduced SqlBulkCopy, which would be similarly useful if you already have the data in memory or are reading it with some kind of DataReader.

Once the contents of your file are inserted into your landing table, you can then execute a series of SQL statements to merge your landing table into your target table(s). Below is an example implementation of those SQL statements (Disclaimer: I did not check these for correctness):

DELETE FROM MyTable
WHERE EXISTS (
    SELECT 1
    FROM LandingTable
    WHERE
        LandingTable.RecordType = 'D'
        AND LandingTable.KeyField1 = MyTable.KeyField1
        AND LandingTable.KeyField2 = MyTable.KeyField2


UPDATE MyTable SET
    MyTable.Field1 = LandingTable.Field1,
    MyTable.Field2 = LandingTable.Field2,
    -- ...
FROM MyTable
INNER JOIN LandingTable ON
    LandingTable.KeyField1 = MyTable.KeyField1
    AND LandingTable.KeyField2 = MyTable.KeyField2
where
    LandingTable.RecordType = 'U'

INSERT INTO MyTable (
    Field1,
    Field2,
    -- ...
)
SELECT
    LandingTable.Field1,
    LandingTable.Field2,
    -- ...
FROM LandingTable
WHERE
    LandingTable.RecordType = 'I'

-- Consider how to handle "Insert" records where there is already a record in MyTable with the same key
-- Possible solution below: treat as an "Update"
UPDATE MyTable SET
    MyTable.Field1 = LandingTable.Field1,
    MyTable.Field2 = LandingTable.Field2,
    -- ...
FROM MyTable
INNER JOIN LandingTable ON
    LandingTable.KeyField1 = MyTable.KeyField1
    AND LandingTable.KeyField2 = MyTable.KeyField2
where
    LandingTable.RecordType = 'I'

-- Now only insert records from LandingTable where there is no corresponding record in MyTable with the same key (determined with a left outer join)
INSERT INTO MyTable (
    Field1,
    Field2,
    -- ...
)
SELECT
    LandingTable.Field1,
    LandingTable.Field2,
    -- ...
FROM LandingTable
LEFT OUTER JOIN MyTable ON
    MyTable.KeyField1 = LandingTable.KeyField1
    AND MyTable.KeyField2 = LandingTable.KeyField2
WHERE
    LandingTable.RecordType = 'I'
    and MyTable.KeyField1 is null

Links that I found after doing a quick search:

Upvotes: 0

Tyug
Tyug

Reputation: 443

There is no easy way to do this, you'll have to parse the text no matter what to determine what SQL statement you need to run. You'll have to decide for yourself whether or not it's an update or an insert statement, hopefully you can do batches, otherwise hitting the database every time you hit an "M" so often isn't gonna be a good idea.

Upvotes: 0

Falcon
Falcon

Reputation: 3180

The easiest way is probably to use ADO.NET to create a typed datatable to load the data into and set the datarowstate accordingly, then flush the data via a DataAdapter.

The fastest way is probably creating a bulk SQL-Script to execute. LinQ can save you a lot of time when selecting the data (you can probably transform it on the fly).

There are also platform specific solutions that should be considered. See here a bulk insert for SQLServer.

http://dotnetslackers.com/Community/blogs/xun/archive/2008/04/15/sql-bulk-insert-and-ado-net-sqlbulkcopy.aspx

Upvotes: 2

nickytonline
nickytonline

Reputation: 6981

Why not parse the text and generate insert, update and delete statements then just run the script you generated?

Upvotes: 1

Related Questions