final

Reputation: 233

SQL Server Filetables/FileStream: Insert more than 2GB via Transact SQL?

I want to copy a file to an SQL Server Filetable using T-SQL. I cannot use the Win32 API (or SqlFileStream class for that matter) because the server does not expose SMB. The following code works very well for smaller files (up to 1GB):

using (SqlConnection conn = new SqlConnection(connStr))
using (SqlCommand cmd = conn.CreateCommand())
using (Stream file = File.OpenRead(@"c:\Path\To\Large\file.bin"))
{
    conn.Open();
    cmd.CommandTimeout = 3600;
    cmd.CommandText = "UPDATE file_table SET file_stream = @stream WHERE path_locator = GetPathLocator(@path)";
    cmd.Parameters.AddWithValue("@path", @"\\HOST\INSTANCE\DB\file_table\largetest.bin");
    cmd.Parameters.Add("@stream", System.Data.SqlDbType.Binary, -1).Value = file;
    cmd.ExecuteNonQuery();
}

The performance is decent (about 100MB/s), and passing the stream also works well and is easy on the client's memory. However, I see that the server's memory usage spikes during this query - apparently, SQL Server copies the entire stream into memory before writing it to disk. I also see disk IO on tempdb. The query sometimes succeeds and sometimes fails for the same file. Errors start occurring at around 1.5GB file size; I've never been successful with a 4GB file. If the query fails, I get the following exception: System.Data.SqlClient.SqlException: 'Internal error: An expression services limit has been reached. Please look for potentially complex expressions in your query, and try to simplify them.'

Is there a better way to put a large file into a filetable?

Upvotes: 0

Views: 1400

Answers (2)

final

Reputation: 233

I ended up using SqlBulkCopy for this. I don't have a copy-and-pasteable working example, so I'll describe the steps to get there.

The idea is stolen from this answer and adapted to FileTables. Streaming the data is essential since we're handling large files. FileTables require a minimum of two columns when inserting new data, which are name and file_stream. However, that would place the new file in the root of the FileTable, so we also need to specify the path_locator column, which is of type hierarchyid. The minimum working DbDataReader therefore has three columns: name (of type String), path_locator (of type SqlHierarchyId) and file_stream (of type Stream). You obviously have the name, and the linked answer shows how to get the contents for file_stream, which leaves the path_locator. A sketch of the bulk copy itself follows below.
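This is a minimal sketch of the SqlBulkCopy call that consumes such a reader - not my verbatim code. FileTableReader is a hypothetical DbDataReader exposing exactly the three columns described above, and pathLocator is the SqlHierarchyId built in the next steps:

using (SqlConnection conn = new SqlConnection(connStr))
using (Stream file = File.OpenRead(@"c:\Path\To\Large\file.bin"))
{
    conn.Open();
    using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
    {
        bulk.DestinationTableName = "file_table";
        bulk.BulkCopyTimeout = 3600;
        // EnableStreaming keeps the client from buffering the whole file.
        bulk.EnableStreaming = true;
        bulk.ColumnMappings.Add("name", "name");
        bulk.ColumnMappings.Add("path_locator", "path_locator");
        bulk.ColumnMappings.Add("file_stream", "file_stream");
        // FileTableReader is hypothetical; see the linked answer for the pattern.
        bulk.WriteToServer(new FileTableReader("largetest.bin", pathLocator, file));
    }
}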

The missing link is generating a FileTable HierarchyId in code (there are examples how to create them in SQL, but this would be cumbersome):

public string GetNewHierarchyId()
{
    // Split a random GUID into chunks of 6, 6 and 4 bytes and render each
    // chunk as an unsigned integer, yielding a path segment like "x.y.z/".
    var g = Guid.NewGuid().ToByteArray();
    var used = 0;
    var strings = new List<string>();
    foreach (var i in new[] { 6, 6, 4 })
    {
        // Copy the chunk into an 8-byte buffer so it reads as a UInt64,
        // placing it at the low-order end for the current endianness.
        byte[] buf = new byte[8];
        Array.Copy(g, used, buf, (BitConverter.IsLittleEndian ? 0 : 8 - i), i);
        used += i;
        strings.Add(BitConverter.ToUInt64(buf, 0).ToString());
    }
    return string.Join(".", strings) + "/";
}

To get the new path_locator, prepend the new file's parent directory's path_locator, which you can get via SELECT GetPathLocator(@parentDir).
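For example, a sketch reusing the open connection (the UNC path is the same one from the question):

string parentPathLocator;
using (SqlCommand cmd = conn.CreateCommand())
{
    // hierarchyid.ToString() returns the locator as a string like "/1.2.3/"
    cmd.CommandText = "SELECT GetPathLocator(@parentDir).ToString()";
    cmd.Parameters.AddWithValue("@parentDir", @"\\HOST\INSTANCE\DB\file_table");
    parentPathLocator = (string)cmd.ExecuteScalar();
}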

Next, you need the Microsoft.SqlServer.Types NuGet package, because SqlBulkCopy needs to pass an argument of type SqlHierarchyId, which you can then create like this: Microsoft.SqlServer.Types.SqlHierarchyId.Parse(parentPathLocator + GetNewHierarchyId())
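Putting the two parts together - again a sketch, where pathLocator is the value passed to the reader in the bulk-copy sketch above:

// The parent locator already ends with "/", so plain concatenation yields
// a valid child path, e.g. "/1.2.3/" + "x.y.z/" => "/1.2.3/x.y.z/".
SqlHierarchyId pathLocator = Microsoft.SqlServer.Types.SqlHierarchyId.Parse(
    parentPathLocator + GetNewHierarchyId());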

I had to modify the bindingRedirects in my app.config for the Microsoft.SqlServer.Types package to work properly, by adding this:

<runtime>
  <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
    <dependentAssembly>
      <assemblyIdentity name="Microsoft.SqlServer.Types" publicKeyToken="89845dcd8080cc91" culture="neutral" />
      <!-- <bindingRedirect oldVersion="0.0.0.0-11.0.0.0" newVersion="11.0.0.0" /> -->
      <bindingRedirect oldVersion="0.0.0.0-14.0.0.0" newVersion="14.0.0.0" />
    </dependentAssembly>
  </assemblyBinding>
</runtime>

Naturally, you can add additional columns to SqlBulkCopy if you want to specify the file's attributes.
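For instance, creation_time is one of the FileTable attribute columns; assuming your reader exposes a matching column, the extra mapping is just:

bulk.ColumnMappings.Add("creation_time", "creation_time");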

Upvotes: 0

Ed Bangga

Reputation: 13026

I suggest using BCP for this process, then calling a stored procedure to update from the temporary table.

cmd.CommandText= "exec xp_cmdShell 'bcp.exe'"+Database + ".." + TableName + " in "  + 
                 c:\Path\To\Large\file.bin + " -c -q -U " + UserId + " -P " + Password +  "-t  ";
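A sketch of the follow-up step; the stored procedure name and parameter are assumptions, not something specified here:

// Hypothetical procedure that moves the staged row(s) into the FileTable.
cmd.CommandText = "exec dbo.usp_move_staged_file @name";
cmd.Parameters.AddWithValue("@name", "largetest.bin");
cmd.ExecuteNonQuery();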

Upvotes: 1
