AZur

Reputation: 49

C# - how to decompress an ISO-8859-1 string

I have a C# program that receives a zipped byte stream as a string in the ISO-8859-1 character set. I need to recover the string that was compressed. The result should be equivalent to this Python code:

zlib.decompress(bytes(bytearray(json_string, 'iso8859')), 15+32).
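For readers unfamiliar with the Python line above, here is a minimal sketch of what it does, step by step. The helper name `decompress_latin1_string` and the sample payload are made up for illustration. The string is first turned back into bytes (ISO-8859-1 maps each character to exactly one byte), and `15 + 32` asks zlib to use the largest window and to accept either a zlib or a gzip header automatically.

```python
import gzip
import zlib


def decompress_latin1_string(text: str) -> bytes:
    """Hypothetical helper mirroring the one-liner above."""
    # ISO-8859-1 maps every character U+0000..U+00FF to the byte of the same value
    raw = text.encode("iso-8859-1")
    # wbits = 15 + 32: 32 KiB window, automatic zlib/gzip header detection
    return zlib.decompress(raw, 15 + 32)


payload = b'{"key": "value"}'  # illustrative sample data

# A gzip-wrapped stream carried inside an ISO-8859-1 string
gzip_text = gzip.compress(payload).decode("iso-8859-1")
# A zlib-wrapped stream carried the same way
zlib_text = zlib.compress(payload).decode("iso-8859-1")

print(decompress_latin1_string(gzip_text))
print(decompress_latin1_string(zlib_text))
```

Both calls return the original payload, so the Python reference does not care which of the two wrappers the sender used.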

I tried this code for decompress:

        Encoding iso_8859_1 = Encoding.GetEncoding("iso-8859-1");
        byte[] isoBytes = iso_8859_1.GetBytes(inputString);

        // then do GZip extract
        MemoryStream objMemStream = new MemoryStream();
        objMemStream.Write(isoBytes, 0, isoBytes.Length);
        objMemStream.Seek(0, SeekOrigin.Begin);

        GZipStream objDecompress = new GZipStream(objMemStream, CompressionMode.Decompress);

But objDecompress.Read failed, so I must have done something wrong.

***** Edit 31/03

The Java code that does the compression is:

    ByteArrayOutputStream out = new ByteArrayOutputStream();
    GZIPOutputStream gzip = new GZIPOutputStream(out);
    gzip.write(JsonStr.getBytes());
    gzip.close();
    return out.toString("ISO-8859-1");

I need C# code to recover JsonStr. I would appreciate some help.

Upvotes: 1

Views: 680

Answers (2)

Mertuarez

Reputation: 944

I think it should be more like this:

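Encoding iso_8859_1 = Encoding.GetEncoding("iso-8859-1");
string inputData = "";   // the ISO-8859-1 string that carries the gzip bytes
string outputData = "";

// Recover the raw bytes with ISO-8859-1 (ASCII would replace every byte above 0x7F)
byte[] inData = iso_8859_1.GetBytes(inputData);

// then do GZip extract
using (MemoryStream compressedData = new MemoryStream(inData))
using (GZipStream decompressor = new GZipStream(compressedData, CompressionMode.Decompress))
using (MemoryStream uncompressedData = new MemoryStream())
{
    // A GZipStream in Decompress mode is read-only, so copy out of it instead of writing to it
    decompressor.CopyTo(uncompressedData);
    outputData = iso_8859_1.GetString(uncompressedData.ToArray());
}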
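A note on the encoding used to turn the string back into bytes, since it decides whether the compressed data survives. Below is a small sketch in Python (the language of the reference line in the question). It assumes that an ASCII encoder substitutes `?` for anything above 0x7F, which as far as I know is also what C#'s `Encoding.ASCII` does by default. The variable names are only for illustration.

```python
# Every possible byte value, standing in for arbitrary compressed data
original = bytes(range(256))

# What the sender does: present the bytes as an ISO-8859-1 string
text = original.decode("iso-8859-1")

# ISO-8859-1 is a one-to-one mapping, so the bytes come back unchanged
restored = text.encode("iso-8859-1")

# ASCII only covers 0x00..0x7F; with replacement, the upper half becomes '?'
damaged = text.encode("ascii", errors="replace")

print(restored == original)          # lossless round trip
print(damaged[128:] == b"?" * 128)   # upper 128 byte values were destroyed
```

Compressed streams use the full byte range, so a single replaced byte is typically enough to make decompression fail.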

Upvotes: 1

Christos Lytras

Reputation: 37288

There are built-in DeflateStream and GZipStream classes, but I couldn't get them to reverse this, probably because ZLibNative has default constants like public const int Deflate_DefaultWindowBits = -15 (a negative window-bits value selects raw DEFLATE, without the zlib header). There are discussions on this subject, such as the DotNet runtime issue "System.IO.Compression to support zlib thin wrapper over DEFLATE?".

There is a zlib.net NuGet package which you can use to decompress the data. A simple compress/decompress implementation is shown in the question "Compression and decompression problem with zlib.Net".
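To illustrate the window-bits point, here is a rough Python sketch of how a zlib-wrapped stream relates to raw DEFLATE. It reuses the sample JSON from this answer; the helper name `inflate_raw` is made up. The assumption being shown is that a raw inflater (window bits -15) rejects the 2-byte zlib header, but accepts the same data once the header and the 4-byte Adler-32 trailer are stripped.

```python
import zlib

payload = b'{"aaaaaaaaaa": 1111111111, "bbbbbbbbbbb": "cccccccccccc"}'

# zlib.compress produces: 2-byte header + raw DEFLATE data + 4-byte Adler-32
wrapped = zlib.compress(payload, 2)


def inflate_raw(data: bytes) -> bytes:
    """Hypothetical helper: inflate headerless DEFLATE data (wbits = -15)."""
    return zlib.decompress(data, -15)


# Feeding the wrapped stream to a raw inflater fails on the header bytes
try:
    inflate_raw(wrapped)
    raw_accepts_wrapper = True
except zlib.error:
    raw_accepts_wrapper = False

# Strip the header (first 2 bytes) and the checksum trailer (last 4 bytes)
stripped = wrapped[2:-4]
recovered = inflate_raw(stripped)

# The trailer is the Adler-32 of the uncompressed data, stored big-endian
trailer = int.from_bytes(wrapped[-4:], "big")

print(hex(wrapped[0]))        # first header byte of a zlib stream
print(raw_accepts_wrapper)
print(recovered == payload)
```

This is only meant to show why a decompressor configured for raw DEFLATE and one configured for the zlib wrapper disagree about the same bytes. Stripping the header by hand skips the checksum verification that a zlib-aware library performs.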

Python compress

import zlib
import binascii

json_string = '{"aaaaaaaaaa": 1111111111, "bbbbbbbbbbb": "cccccccccccc"}'

compressed_data = zlib.compress(bytes(bytearray(json_string, 'iso8859')), 2)
decompressed_data = zlib.decompress(compressed_data, 15+32)

print('Compressed HEX data: %s' % (binascii.hexlify(compressed_data)))
print('Decompressed data: %s' % (decompressed_data))

Will output:

Compressed HEX data: b'785eab564a8403252b054338d051504a4200a09452321250aa0500e4681153'
Decompressed data: b'{"aaaaaaaaaa": 1111111111, "bbbbbbbbbbb": "cccccccccccc"}'

C# decompress

// Requires: using System; using System.IO; using System.Linq; using System.Text;
// plus the namespace of the zlib.net package for ZOutputStream.
static void Main(string[] args) {
    // Hex dump produced by the Python snippet above
    var extCompressedHex = "785eab564a8403252b054338d051504a4200a09452321250aa0500e4681153";
    var extCompressed = HexStringToByteArray(extCompressedHex);

    byte[] extDecompressedData;
    DecompressData(extCompressed, out extDecompressedData);

    // The payload was encoded as ISO-8859-1 on the Python side
    string extDecompressedJson = Encoding.GetEncoding("ISO-8859-1").GetString(extDecompressedData);

    Console.WriteLine("Hex ext compressed: {0}", ByteArrayToHex(extCompressed));
    Console.WriteLine("Raw ext decompressed: {0}", extDecompressedJson);
}


// Static so that it can be called from the static Main above
static void DecompressData(byte[] inData, out byte[] outData)
{
    using (MemoryStream outMemoryStream = new MemoryStream())
    using (ZOutputStream outZStream = new ZOutputStream(outMemoryStream))
    using (Stream inMemoryStream = new MemoryStream(inData))
    {
        // Writing compressed bytes into the ZOutputStream inflates them into outMemoryStream
        CopyStream(inMemoryStream, outZStream);
        outZStream.finish();
        outData = outMemoryStream.ToArray();
    }
}

// Helper functions ___________________________________________________

static string ByteArrayToHex(byte[] bytes)
{
    StringBuilder sw = new StringBuilder();

    foreach (byte b in bytes)
    {
        sw.AppendFormat("{0:x2}", b);
    }

    return sw.ToString();
}

static void CopyStream(System.IO.Stream input, System.IO.Stream output)
{
    byte[] buffer = new byte[2000];
    int len;
    while ((len = input.Read(buffer, 0, 2000)) > 0)
    {
        output.Write(buffer, 0, len);
    }
    output.Flush();
}

static byte[] HexStringToByteArray(string hex)
{
    return Enumerable.Range(0, hex.Length)
                     .Where(x => x % 2 == 0)
                     .Select(x => Convert.ToByte(hex.Substring(x, 2), 16))
                     .ToArray();
}

Will output:

Hex ext compressed: 785eab564a8403252b054338d051504a4200a09452321250aa0500e4681153
Raw ext decompressed: {"aaaaaaaaaa": 1111111111, "bbbbbbbbbbb": "cccccccccccc"}

You can check it working in this .NET Fiddle.

Upvotes: 3
