Matt

Reputation: 6963

Sending and receiving compressed data over a TCP socket

I need help sending and receiving compressed data over a TCP socket.

The code works perfectly fine if I don't use compression, but something very strange happens when I do. Basically, the problem is that the stream.Read() operation appears to get skipped, and I don't know why.

My code:

using (var client = new TcpClient())
{
    client.Connect("xxx.xxx.xx.xx", 6100);
    using (var stream = client.GetStream())
    {
        // SEND REQUEST
        byte[] bytesSent = Encoding.UTF8.GetBytes(xml);

        // send compressed bytes (if this is used, stream.Read() below doesn't work)
        //var compressedBytes = bytesSent.ToStream().GZipCompress();
        //stream.Write(compressedBytes, 0, compressedBytes.Length);

        // send normal bytes (uncompressed)
        stream.Write(bytesSent, 0, bytesSent.Length);

        // GET RESPONSE
        byte[] bytesReceived = new byte[client.ReceiveBufferSize];
        // PROBLEM HERE: when using compression, this line just gets skipped over very quickly
        stream.Read(bytesReceived, 0, client.ReceiveBufferSize);

        //var decompressedBytes = bytesReceived.ToStream().GZipDecompress();
        //string response = Encoding.UTF8.GetString(decompressedBytes);

        string response = Encoding.UTF8.GetString(bytesReceived);

        Console.WriteLine(response);
    }
}

You will notice some extension methods above. Here is the code in case you are wondering if something is wrong there.

public static MemoryStream ToStream(this byte[] bytes)
{
    return new MemoryStream(bytes);
}


public static byte[] GZipCompress(this Stream stream)
{
    using (var memoryStream = new MemoryStream())
    {
        using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress))
        {
            stream.CopyTo(gZipStream);
        }
        return memoryStream.ToArray();
    }
}

public static byte[] GZipDecompress(this Stream stream)
{
    using (var memoryStream = new MemoryStream())
    {
        using (var gZipStream = new GZipStream(stream, CompressionMode.Decompress))
        {
            gZipStream.CopyTo(memoryStream);
        }
        return memoryStream.ToArray();
    }
}

The extensions work quite well in the following, so I'm sure they're not the problem:

string original = "the quick brown fox jumped over the lazy dog";
byte[] compressedBytes = Encoding.UTF8.GetBytes(original).ToStream().GZipCompress();
byte[] decompressedBytes = compressedBytes.ToStream().GZipDecompress();
string result = Encoding.UTF8.GetString(decompressedBytes);
Console.WriteLine(result);

Does anyone have any idea why the Read() operation is being skipped when the bytes being sent are compressed?

EDIT

I received a message from the API provider after showing them the above sample code. They had this to say:

at a first glance I guess the header is missing. The input must start with a 'c' followed by the length of the input (sprintf(cLength,"c%09d",hres) in our example). We need this because we can't read until we find a binary 0 to recognize the end.

They previously provided some sample code in C, which I don't fully understand, as follows:

example in C:

#include <zlib.h>

uLongf hres;
char cLength[COMPRESS_HEADER_LEN + 1] = {'\0'};

n = read(socket,buffer,10);
// check if input is compressed
if(msg[0]=='c') {
     compressed = 1;
}

n = atoi(msg+1);
read.....


hres = 64000;
res = uncompress((Bytef *)msg, &hres, (const Bytef *)buffer /* compressed */, n);
if(res == Z_OK && hres > 0 ){
     msg[hres]=0; //original
}
else // errorhandling

hres = 64000;

if (compressed){
res = compress((Bytef *)buffer,   &hres, (const Bytef *)msg, strlen(msg));
     if(res == Z_OK && hres > 0 ) {
         sprintf(cLength,"c%09d",hres);
         write(socket,cLength,10);
         write(socket, buffer, hres);
     }
     else // errorhandling

makefile: add "-lz" to the libs

They're using zlib. I don't expect that to make any difference, but I did try using zlib.net and I still get no response.
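One detail worth checking here: zlib's compress() and uncompress() work with the zlib container format (RFC 1950), whereas .NET's GZipStream produces the gzip format (RFC 1952), so the two outputs are not interchangeable. The sketch below only illustrates that difference by inspecting the leading bytes. It assumes .NET 6 or later (for ZLibStream), and FormatComparison and its methods are names made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class FormatComparison
{
    // gzip container (RFC 1952), as produced by GZipStream.
    public static byte[] GZip(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // zlib container (RFC 1950), the format zlib's compress()/uncompress() use.
    // Assumption: ZLibStream is available, i.e. .NET 6 or later.
    public static byte[] ZLib(byte[] input)
    {
        using (var output = new MemoryStream())
        {
            using (var zlib = new ZLibStream(output, CompressionMode.Compress))
            {
                zlib.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // gzip data always starts with the magic bytes 0x1F 0x8B.
    public static bool LooksLikeGZip(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }

    // A zlib header has compression method 8 in the low nibble of the first byte,
    // and the first two bytes, read as a big-endian number, are a multiple of 31.
    public static bool LooksLikeZLib(byte[] data)
    {
        return data.Length >= 2
            && (data[0] & 0x0F) == 8
            && ((data[0] << 8) | data[1]) % 31 == 0;
    }
}
```

If the server passes the payload straight to uncompress(), as the C sample suggests, output that passes LooksLikeGZip is unlikely to be accepted.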

Can someone give me an example of how exactly I'm supposed to send this input length in C#?

EDIT 2

In response to @quantdev, here is what I am trying now for the length prefix:

using (var client = new TcpClient())
{
    client.Connect("xxx.xxx.xx.xx", 6100);
    using (var stream = client.GetStream())
    {
        // SEND REQUEST
        byte[] bytes = Encoding.UTF8.GetBytes(xml);
        byte[] compressedBytes = ZLibCompressor.Compress(bytes);

        byte[] prefix = Encoding.UTF8.GetBytes("c" + compressedBytes.Length);

        byte[] bytesToSend = new byte[prefix.Length + compressedBytes.Length];
        Array.Copy(prefix, bytesToSend, prefix.Length);
        Array.Copy(compressedBytes, 0, bytesToSend, prefix.Length, compressedBytes.Length);

        stream.Write(bytesToSend, 0, bytesToSend.Length);

        // WAIT
        while (client.Available == 0)
        {
            Thread.Sleep(1000);
        }

        // GET RESPONSE
        byte[] bytesReceived = new byte[client.ReceiveBufferSize];
        stream.Read(bytesReceived, 0, client.ReceiveBufferSize);

        byte[] decompressedBytes = ZLibCompressor.DeCompress(bytesReceived);
        string response = Encoding.UTF8.GetString(decompressedBytes);

        Console.WriteLine(response);
    }
}
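For comparison with the C sample: the prefix in the code above has a variable width ("c" plus the length as plain digits), while the C code formats the header with "c%09d" and always writes 10 bytes. Below is a minimal sketch of a fixed-width header in C#. It assumes the number in the header is the compressed byte count, as in the C code; FrameBuilder and BuildFrame are hypothetical names, and the provider's message does not confirm that the server accepts exactly this layout.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class FrameBuilder
{
    // Assumption: 1 byte 'c' followed by 9 zero-padded decimal digits,
    // mirroring sprintf(cLength, "c%09d", hres) in the C sample.
    public const int HeaderLength = 10;

    public static byte[] BuildFrame(byte[] compressedBytes)
    {
        if (compressedBytes.Length > 999999999)
        {
            throw new ArgumentException(
                "Payload too large for a 9-digit length field.", nameof(compressedBytes));
        }

        // "D9" pads the decimal length with leading zeros to 9 digits.
        string headerText = "c" + compressedBytes.Length.ToString("D9", CultureInfo.InvariantCulture);
        byte[] header = Encoding.ASCII.GetBytes(headerText);

        // Header first, compressed payload immediately after it.
        byte[] frame = new byte[HeaderLength + compressedBytes.Length];
        Buffer.BlockCopy(header, 0, frame, 0, HeaderLength);
        Buffer.BlockCopy(compressedBytes, 0, frame, HeaderLength, compressedBytes.Length);
        return frame;
    }
}
```

The result would be sent in one call, for example stream.Write(frame, 0, frame.Length), in place of the bytesToSend array above.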

Upvotes: 1

Views: 5631

Answers (3)

cshu

Reputation: 5954

Encoding.UTF8.GetString shouldn't be used on an arbitrary byte array. For example, compressed bytes can contain NUL bytes and byte sequences that are not valid UTF-8, so the decoded string will not represent the data faithfully.

If you want to print the received bytes for debugging, maybe you should just print them as integers.
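A small sketch of that idea, printing the bytes as hexadecimal values rather than decoding them as text. ByteDump and ToHex are hypothetical names; the count parameter is meant to be the value returned by Read(), so that unused buffer space is not printed.

```csharp
using System;

public static class ByteDump
{
    // Formats the first 'count' bytes as hyphen-separated hex pairs.
    public static string ToHex(byte[] buffer, int count)
    {
        return BitConverter.ToString(buffer, 0, count);
    }
}
```

For example, Console.WriteLine(ByteDump.ToHex(bytesReceived, bytesRead)), where bytesRead is assumed to hold the return value of stream.Read().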

Upvotes: 0

Sathish

Reputation: 29

I think you may be hitting the end of the stream. Can you try setting the stream position before reading from it?

stream.Position = 0;

http://msdn.microsoft.com/en-us/library/vstudio/system.io.stream.read

Upvotes: 1

quantdev

Reputation: 23813

You need to check the return value of the Read() calls you are making on the TCP stream: it is the number of bytes actually read.

MSDN says:

Return Value

The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available, or zero (0) if the end of the stream has been reached.

  • If the socket is closed, the call will return 0 immediately (which is what might be happening here).
  • If it is not 0, then you must check how many bytes you actually received. If it is less than the number you expect, you will need additional calls to Read() to retrieve the remaining bytes.
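The two cases above can be handled with a small read loop. This is a sketch rather than the only way to do it: StreamReadHelper and ReadExactly are names invented for the example, and it assumes the caller already knows how many bytes to expect (for instance from a length header).

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Keeps calling Read() until exactly 'count' bytes have arrived.
    // Throws if the stream ends (Read returns 0) before that.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;

        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException(
                    "Stream ended after " + offset + " of " + count + " bytes.");
            }
            offset += read;
        }

        return buffer;
    }
}
```

Because Read() on a NetworkStream blocks until some data arrives or the connection closes, a loop like this also removes the need to poll before reading. .NET 7 and later ship a built-in Stream.ReadExactly method that serves the same purpose.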

Prior to your call to Read(), check that some data is actually available on the socket:

while(client.Available == 0)
// wait ...

http://msdn.microsoft.com/en-us/library/system.net.sockets.tcpclient.available%28v=vs.110%29.aspx

Upvotes: 1
