Reputation: 579
I have a huge XML file, about 60 GB. I've managed to create an XML reader with a StringBuilder where I can extract the specific elements I need. This works great. I've also connected my application to a Microsoft SQL Server on the LAN.
How do I use bulk insert to insert all my rows as efficiently as possible? I have about 8,400,000 rows which I need to insert the fastest way possible.
My StringBuilder sets up the string like this:
"FirstVariable, SecondVariable, ThirdVariable; FirstVariable, SecondVariable, ThirdVariable; FirstVariable, SecondVariable, ThirdVariable;"
I need to import this like a CSV file with bulk insert :) Please help
Upvotes: 1
Views: 1715
Reputation: 579
I finally figured it out. I created a DataTable before the while loop and added each element to the DataTable as the data was extracted. I also kept a counter in the while loop: every 5,000 elements it would connect to the database, bulk insert the rows, and then empty the DataTable. This way the application only uses a few MB of RAM, and I'm able to run through the entire 60 GB file and parse all 8,400,000 elements into my database in about 12 minutes. The bulk insert code I used was pretty standard; here is part of my solution:
Using bulkCopy As SqlBulkCopy =
        New SqlBulkCopy("Server=testserver;Database=test1;User=test1;Pwd=12345;")
    bulkCopy.DestinationTableName = "dbo.testtable"
    bulkCopy.BatchSize = 5000
    Try
        ' Write the accumulated rows from the DataTable to the destination table.
        bulkCopy.WriteToServer(table)
    Catch ex As Exception
        MsgBox(ex.Message)
    End Try
End Using
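For completeness, the surrounding loop could look roughly like this. This is only a sketch of the batching approach described above, not the exact original code: `reader`, `connectionString`, and the column names are assumed placeholders.

```vbnet
' Sketch of the batch-every-5000-rows loop (assumed names, not the original code).
Dim table As New DataTable()
table.Columns.Add("FirstVariable", GetType(String))
table.Columns.Add("SecondVariable", GetType(String))
table.Columns.Add("ThirdVariable", GetType(String))

Dim counter As Integer = 0
While reader.Read() ' the XML reader that extracts one element at a time
    ' Values parsed from the current element (placeholders).
    table.Rows.Add(firstValue, secondValue, thirdValue)
    counter += 1
    If counter Mod 5000 = 0 Then
        Using bulkCopy As New SqlBulkCopy(connectionString)
            bulkCopy.DestinationTableName = "dbo.testtable"
            bulkCopy.WriteToServer(table)
        End Using
        table.Clear() ' empty the DataTable so memory stays at a few MB
    End If
End While

' Flush whatever is left after the loop ends.
If table.Rows.Count > 0 Then
    Using bulkCopy As New SqlBulkCopy(connectionString)
        bulkCopy.DestinationTableName = "dbo.testtable"
        bulkCopy.WriteToServer(table)
    End Using
End If
```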
Upvotes: 1
Reputation: 67311
If I understand you correctly, you convert your huge XML file into a CSV file.
With this syntax you can insert a CSV file in one go:
CREATE TABLE testTable(int1 INT,int2 INT, int3 INT);
--Assuming a file with the following content: 1, 2, 3; 4, 5, 6; 7, 8, 9;
BULK INSERT testTable FROM 'F:\testCSV.txt' WITH (FIELDTERMINATOR=',', ROWTERMINATOR=';');
SELECT * FROM testTable;
/*
Result
int1 int2 int3
1 2 3
4 5 6
7 8 9
*/
DROP TABLE testTable;
You might try to avoid the CSV conversion and import the XML directly, but this will probably load the whole file in one piece, and 60 GB is very much...
CREATE TABLE testTable(int1 INT,int2 INT, int3 INT);
/*
Assuming a file with the following content:
<data>
<row>
<a>1</a>
<b>2</b>
<c>3</c>
</row>
<row>
<a>4</a>
<b>5</b>
<c>6</c>
</row>
<row>
<a>7</a>
<b>8</b>
<c>9</c>
</row>
</data>
*/
INSERT INTO testTable
SELECT RowData.value('a[1]','int')
,RowData.value('b[1]','int')
,RowData.value('c[1]','int')
FROM
(
SELECT CAST(x.y AS XML) AS XmlData
FROM OPENROWSET(BULK 'F:\testXML.xml',SINGLE_CLOB) AS x(y)
) AS XmlData
CROSS APPLY XmlData.nodes('/data/row') AS x(RowData)
SELECT * FROM testTable;
/*
Result
int1 int2 int3
1 2 3
4 5 6
7 8 9
*/
DROP TABLE testTable;
Last but not least, you'll find an explanation of how to use BULK INSERT directly against an XML file with an explicitly specified format file here: https://msdn.microsoft.com/en-us/library/ms191184.aspx
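As a rough sketch of that approach (the exact layout is on the linked page; the file names, version number, and column name here are assumptions): a single-column format file with an empty terminator reads the whole file into one value, which you can then cast to XML and shred as above.

```
9.0
1
1  SQLBINARY  0  0  ""  1  XmlData  ""
```

It would then be referenced like this:

```
INSERT INTO xmlStaging(XmlCol)
SELECT * FROM OPENROWSET(BULK 'F:\testXML.xml', FORMATFILE='F:\xmltable.fmt') AS X(XmlCol);
```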
Upvotes: 3