pteridoid
pteridoid

Reputation: 61

Retrieve compressed GZip info from SQL Server

I have a table which stores the contents of a compressed GZip file. Not sure if the original information was XML or CSV or what. I don't have access to the original. The PulledContent column is varbinary and looks like this 0x1F8B08000000000004006553C18E9B3010BD57EA3F706C0F2B63...

How can I convert this info into something readable by a human?

I thought the solution might be here: varbinary to string on SQL Server.

But when I try this: SELECT CONVERT(VARCHAR(1000), PulledContent, 2);

It just gives me a slightly different string of numbers and letters like this 1F8B08000000000004006553C18E9B3010BD57EA3F70...

What am I missing?

Upvotes: 5

Views: 9753

Answers (3)

Solomon Rutzky
Solomon Rutzky

Reputation: 48874

NOTE: This question was asked and answered prior to SQL Server 2016 being released which introduced the COMPRESS and DECOMPRESS built-in functions (which merely expose the Compress / Decompress operations of the .NET GZipStream class). Try those first if you are using SQL Server 2016 or newer.


To uncompress the VARBINARY value in T-SQL (whether in SSMS or a SQL script, etc), you can use a SQLCLR function to do that translation via a simple SELECT statement. You would use the GzipStream class that @Doug mentioned.

But, if you don't want to write any code to get this functionality, it is available in the Free version of the SQL# SQLCLR library (which I am the author of, and again, this function is free). For example:

SELECT SQL#.Util_GUnzip(PulledContent)
FROM   SchemaName.TableName
WHERE  SomeField = something;

That will return a VARBINARY value as well, but it will be the uncompressed value. At that point, you can attempt to convert it to various forms to see what it really is. A binary file will still be binary, but plain text should come out as at least partially readable (fully readable if the collation is the proper encoding):

SELECT CONVERT(VARCHAR(MAX), SQL#.Util_GUnzip(PulledContent))
FROM   SchemaName.TableName
WHERE  SomeField = something;

or:

SELECT CONVERT(NVARCHAR(MAX), SQL#.Util_GUnzip(PulledContent))
FROM   SchemaName.TableName
WHERE  SomeField = something;

Upvotes: 2

Nicholas Carey
Nicholas Carey

Reputation: 74345

Shouldn't be much more complicated than this, using the most excellent DotNetZip:

static byte[] FetchAndDeCompressData( int key )
{
  using ( SqlConnection connection = new SqlConnection("your-connect-string-here") )
  using ( SqlCommand cmd = connection.CreateCommand() )
  {
    cmd.CommandType = CommandType.StoredProcedure ;
    cmd.CommandText = "dbo.fetch_gzipped_data" ;
    cmd.Parameters.AddWithValue( "@key" , key ) ;
    byte[] bytes = (byte[]) cmd.ExecuteScalar() ;
    return Ionic.Zlib.GZipStream.UncompressBuffer( bytes ) ;
  }
}

What you do with the byte[] of decompressed octets is dependent on what it is. If you know that it's text, you could simply run it through something like

Encoding.Default.GetString( bytes ) ;

But you'll need to know what the underlying encoding is.


Edited To Note: DotNetZip used to live at Codeplex. Codeplex has been shut down. The old archive is still available at Codeplex. It looks like the code has migrated to Github:


Upvotes: 1

Doug
Doug

Reputation: 5328

The data you return from the "PulledContent" column is in binary GZip compressed format. In order to transform this binary data into a human readable format you will need to perform the following tasks.

1) Uncompress the binary data using an appropriate library. If you are in the windows environment you could use the .net class GZipStream to perform this task.

2) Convert the binary data to a string. You can accomplish this by using the .net Encoding class.

string data = Encoding.Default.GetString(bytes); 

Enjoy!

Upvotes: 3

Related Questions