Reputation: 75
I have a SQL Server database that is storing the contents of files in a table. Specifically, there are 2 fields:
Contents: varbinary(max)
field that always starts with '0x1F.....'
FileType: varchar(5)
field that has the type of file, such as PDF, docx, etc.
How can I convert the contents back into a file? I am trying to use Coldfusion, if that is possible, to convert it. If not, what are the steps to convert the binary into a file?
I tried the following (assuming a docx filetype) but it didn't produce a valid word file:
<cfset DecodedValue = BinaryDecode(contents,"hex")>
<cffile action="WRITE" output="#DecodedValue#" file="C:\decodedfile.docx">
Upvotes: 4
Views: 3810
Reputation: 6560
tldr;
The data is already binary, so ditch the binaryX() functions and save the content directly to a file. Read the first few bytes of the binary to verify the file type. In this case, turns out the document was actually stored in GZIP format, not raw DOCX.
Don't be misled by how SSMS chooses to display it. SSMS displays binary in user friendly hex format, but it's still stored as binary. Just write the binary directly to the file, without any BinaryX functions.
<cfset FileWrite("C:\decodedfile.docx", contents)>
Also, check your DSN settings and ensure the "BLOB - Enable binary large object retrieval (BLOB)" setting is enabled, so binary values aren't truncated at 64K (default buffer size).
Update 1:
The FileWrite() code above works correctly IF the "contents" column contains the binary of a valid .docx file. Perhaps the data is being stored differently than we're thinking? Run a query to retrieve the binary of a single document and output the first four bytes. What is the result? Typically, the first four bytes of .docx files should be 80, 75, 3, 4
.
<!--- print size and first 4 bytes --->
<cfoutput>
size in bytes = #arrayLen(qYourQuery.contents)#<br>
<cfloop from="1" to="4" index="x">
byte #x# = #qYourQuery.contents[1][x]#<br>
</cfloop>
</cfoutput>
Update 2:
Closest I could find to 1F 8B 08
is GZIP. Try using probeContentType()
on the saved file. What does it report?
<cfscript>
paths = createObject("java", "java.nio.file.Paths");
files = createObject("java", "java.nio.file.Files");
input = paths.get("c:/yourFileName.docx", []);
writeDump(files.probeContentType(input));
</cfscript>
Upvotes: 2
Reputation: 75
Thanks to User Ageax, the first 4 size bytes of 31,-117,8,0 show the content is stored in GZIP format instead.
I first save the content as a gzip then extract the file. My code is as follows:
<cfquery name="getfile" datasource="tempdb">
select content from table
</cfquery>
<cfset FileWrite("C:\mygzipfile.gzip", getfile.content)>
To extract gzip to a file using coldfusion, I used the solution at: http://coldfusion-tip.blogspot.com/2012/04/unzip-gz-file-in-coldfusion.html
Upvotes: 3