Reputation: 233
I want to copy a file to an SQL Server FileTable using T-SQL. I cannot use the Win32 API (or the SqlFileStream class, for that matter) because the server does not expose SMB. The following code works very well for smaller files (up to 1 GB):
using (SqlConnection conn = new SqlConnection(connStr))
using (SqlCommand cmd = conn.CreateCommand())
using (Stream file = File.OpenRead(@"c:\Path\To\Large\file.bin"))
{
    conn.Open();
    cmd.CommandTimeout = 3600;
    cmd.CommandText = "UPDATE file_table SET file_stream = @stream WHERE path_locator = GetPathLocator(@path)";
    cmd.Parameters.AddWithValue("@path", @"\\HOST\INSTANCE\DB\file_table\largetest.bin");
    // Size -1 = varbinary(max); passing the Stream as the value avoids buffering the file on the client.
    cmd.Parameters.Add("@stream", System.Data.SqlDbType.Binary, -1).Value = file;
    cmd.ExecuteNonQuery();
}
The performance is decent (about 100 MB/s), and passing the stream works well and is easy on the client's memory. However, I see that the server's memory usage spikes during this query - apparently SQL Server copies the entire stream into memory before writing it to disk. I also see disk IO against tempdb. The query sometimes succeeds and sometimes fails for the same file. Errors start occurring around a 1.5 GB file size, and I have never succeeded with a 4 GB file. When the query fails, I get the following exception:
System.Data.SqlClient.SqlException: 'Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.'
Is there a better way to put a large file into a filetable?
Upvotes: 0
Views: 1400
Reputation: 233
I ended up using SqlBulkCopy for this. I don't have a copy-and-pasteable working example, so I'll describe the steps to get there.
The idea is stolen from this answer and adapted to FileTables. Streaming the data is a very good idea since we're handling large files. A FileTable requires a minimum of two columns when inserting new data: name and file_stream. However, that would place the new file in the root of the FileTable, so we also need to supply the path_locator column, which is of type HierarchyId. The minimum working DbDataReader therefore has three columns: name (of type String), path_locator (of type SqlHierarchyId) and file_stream (of type Stream). You obviously have the name, and the linked answer shows how to get the contents for file_stream, which leaves the path_locator.
The missing link is generating a FileTable HierarchyId in code (there are examples of how to create them in SQL, but that would be cumbersome):
public string GetNewHierarchyId()
{
    // Derive a single-level hierarchyid path component ("x.y.z/") from a new GUID,
    // split into 6 + 6 + 4 byte chunks - the same shape a FileTable generates for new rows.
    var g = Guid.NewGuid().ToByteArray();
    var used = 0;
    var strings = new List<string>();
    foreach (var i in new[] { 6, 6, 4 })
    {
        byte[] buf = new byte[8];
        // Right-align the chunk on big-endian platforms so BitConverter reads it as intended.
        Array.Copy(g, used, buf, (BitConverter.IsLittleEndian ? 0 : 8 - i), i);
        used += i;
        strings.Add(BitConverter.ToUInt64(buf, 0).ToString());
    }
    return string.Join(".", strings) + "/";
}
To get the new path_locator, prepend the parent directory's path_locator to the value returned by GetNewHierarchyId(); you can get the parent's path_locator via SELECT GetPathLocator(@parentDir) FROM filetable.
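If it helps, here is a rough sketch of fetching the parent's path_locator as a string. The helper name and the exact query shape are just illustrative; it only assumes the GetPathLocator function mentioned above:
// Hypothetical helper: returns the parent directory's path_locator as a string,
// ready to be prefixed to the value returned by GetNewHierarchyId().
public string GetParentPathLocator(SqlConnection conn, string parentDirUncPath)
{
    using (SqlCommand cmd = conn.CreateCommand())
    {
        // Assign to a hierarchyid variable first so we can call ToString() on it.
        cmd.CommandText = "DECLARE @h hierarchyid = GetPathLocator(@parentDir); SELECT @h.ToString();";
        cmd.Parameters.AddWithValue("@parentDir", parentDirUncPath);
        return (string)cmd.ExecuteScalar();
    }
}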
Next, you need the Microsoft.SqlServer.Types NuGet package, because SqlBulkCopy needs to be passed a value of type SqlHierarchyId, which you can create like this: Microsoft.SqlServer.Types.SqlHierarchyId.Parse(parentPathLocator + GetNewHierarchyId())
I had to modify the bindingRedirects in my app.config for the Microsoft.SqlServer.Types package to work properly, by adding this:
<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="Microsoft.SqlServer.Types" publicKeyToken="89845dcd8080cc91" culture="neutral" />
      <!-- <bindingRedirect oldVersion="0.0.0.0-11.0.0.0" newVersion="11.0.0.0" /> -->
      <bindingRedirect oldVersion="0.0.0.0-14.0.0.0" newVersion="14.0.0.0" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>
Naturally, you can add additional columns to the SqlBulkCopy mapping if you want to specify the file's attributes.
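For example (a sketch; the destination names are the standard FileTable attribute columns, while the source column names are assumptions about your reader), the extra mappings might look like this:
// Assuming the reader also exposes the file's timestamps under these source column names.
bulk.ColumnMappings.Add("creation_time", "creation_time");
bulk.ColumnMappings.Add("last_write_time", "last_write_time");
bulk.ColumnMappings.Add("last_access_time", "last_access_time");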
Upvotes: 0