Reputation: 57
I'm working with the awesome DuckDB.NET library and C# for this example. I'm specifically working with the ADO.NET provider.
My database contains the following table:
CREATE TABLE tbl01 (
Id INTEGER,
TextListTest TEXT[],
DateTest DATE
);
In the program each record is encapsulated by a class:
class Record
{
public int Id { get; set; }
public List<string> TextListTest { get; set; };
public DateTime TextListTest { get; set; };
}
and is appended to a List<Record>
. This list gets
very large so I would like to avoid the per-row overhead of INSERT
statements in a
loop. The documentation says that if I absolutely must use
inserts in this manner I should also wrap them in calls of BEGIN TRANSACTION
and COMMIT
. I
really don't want to miss out on insert performance here. Is there another approach I can use with
the library I'm using?
I noticed in the DuckDB.NET
sample with the
LowLevelBindingsSample()
method that I could use prepared statements but I'm not sure that if that would confer any performance benefits.
Is there an approach I'm missing - perhaps the appender? If someone could provide an example using the 3 specific data
types in the above table that would be greatly appreciated (I'm having trouble figuring out the
LIST
column).
using DuckDB.NET.Data;
namespace DuckTest;
class Record
{
public int Id { get; set; }
public List<string> TextListTest { get; set; }
public DateTime DateTest { get; set; }
}
class Program
{
public static void Main(string[] args)
{
// pretend this is a really big list!
List<Record> recordList = new List<Record>
{
new Record { Id = 1, TextListTest = new List<string> { "Ball", "Horse" }, DateTest = new DateTime(1994, 12, 3) },
new Record { Id = 2, TextListTest = new List<string> { "Matthew", "Giorgi is cool!" }, DateTest = new DateTime(1998, 11, 28) },
new Record { Id = 3, TextListTest = new List<string> { "Red", "Black", "Purple" }, DateTest = new DateTime(1999, 9, 13) },
new Record { Id = 4, TextListTest = new List<string> { "Cat" }, DateTest = new DateTime(1990, 2, 5) },
};
using (var duckDBConnection = new DuckDBConnection("Data Source=db01.duckdb"))
{
duckDBConnection.Open();
var command = duckDBConnection.CreateCommand();
command.CommandText = "CREATE TABLE tbl01 ( Id INTEGER, TextListTest TEXT[], DateTest DATE );";
var executeNonQuery = command.ExecuteNonQuery();
// I could do this in a loop but there's probably a better way...
command.CommandText = "INSERT INTO tbl01 VALUES (1, ['Ball', 'Horse'], '1994-12-03');";
executeNonQuery = command.ExecuteNonQuery();
}
}
}
I would be willing to use the low level bindings library if needed.
Upvotes: 1
Views: 2950
Reputation: 30883
The library has support for appenders so that should be the fastest way to import data.
Check out the Bulk Data Loading for more details and options.
Upvotes: 2