omninonsense

Reputation: 6892

Ruby packet buffering and splitting

So, this is a little broad, I suppose, but I'll try to narrow it down as much as I can. I have a server (using EventMachine), and sometimes a packet arrives split across several reads, while at other times several packets arrive buffered together in one read. So I tried writing a function that would buffer/un-buffer them. I did manage to make something, but it's not working as expected. To be quite honest, I doubt I can even call it 'barely functional.'

Before anything else, I'd like to point out the packet structure:

Note: lenf is the raw format of len, so a string, it's not that important, I think.


The Bufferer Code

def split(data)
    if ($packet_buffer != "" && !$packet_buffer.nil?)
        data = $packet_buffer + data
        $packet_buffer = ""
    end
    last = 0
    packets = []
    loop do
        if data[last..-1].length < 8
            $packet_buffer = data[last..-1]
            break
        end
        name = data[last...last+=4]
        lenf = data[last...last+4]

        len = 0
        data[last...last+=4].each_byte {|b| len+=b}

        if !data[last+4..-1].nil? && data[last+4..-1].length < len
            $packet_buffer = data
            break
        end

        ref = data[last...last+=4]
        msg = data[last...last+=len]

        packets << (name << lenf << ref << msg)

        break if data[last..-1].nil?
    end
    packets
end

TLDR

How to split buffered and buffer split packets/data (passed by EventMachine) in Ruby?

Update: The packets are sent over TCP. The data comes from a client made in C, so yeah it is a stream of bytes.

I am not sure what exactly is going wrong, but the method doesn't seem to split or buffer the packets properly. It works fine while it receives small amounts of data (which, I assume, are neither buffered nor split).

Sometimes it even splits packets successfully when they arrive buffered together, but holding on to a partial packet until the rest arrives doesn't seem to work at all.


I'm fairly sure I'm messing up some 'logic' part here, however I just can't figure out what it is. Any help will be greatly appreciated.

Thanks

Upvotes: 1

Views: 500

Answers (2)

omninonsense

Reputation: 6892

Okay, after some thinking I figured out a way to do it properly. Thanks to David Grayson for all his help, since his answer cleared up a lot of the confusion and doubts I had:

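def split(data)
    packets = []
    loop do
        # Prepend whatever was left over from the previous call.
        if !$packet_buffer.nil?
            data = $packet_buffer << data
            $packet_buffer = nil
        end

        # Not enough bytes to read the length field yet; wait for more data.
        if data.length < 8
            $packet_buffer = data
            break
        end

        # calc_Uint32 (not shown) converts the 4 raw length bytes to an integer.
        len = calc_Uint32(data[4...8])

        # The full packet (12-byte header plus message) has not arrived yet.
        if data.length-12 < len
            $packet_buffer = data
            break
        end

        # Cut one complete packet off the front of the data.
        packets << data[0...12+len]
        data[0...12+len] = ''

        break if data.length == 0
    end
    packets
end #split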

I sincerely doubt many people will find it useful, since it's not that universal, but I hope someone can find a use for it eventually.
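
For anyone adapting this, here is a rough sketch of the same framing logic with the leftover bytes kept in an object instead of a global, so each connection can have its own buffer. The PacketBuffer class is a hypothetical name, and the little-endian length decoding stands in for calc_Uint32 as an assumption; adjust it if your client writes the length differently.

```ruby
# Hypothetical PacketBuffer: same framing as split above, but the leftover
# bytes live in an instance variable rather than in $packet_buffer.
class PacketBuffer
  HEADER_SIZE = 12 # 4-byte name + 4-byte length + 4-byte ref

  def initialize
    @buffer = String.new # empty binary (ASCII-8BIT) string
  end

  # Append newly received data and return every complete packet now available.
  def split(data)
    @buffer << data.b # .b gives a binary copy, avoiding encoding clashes
    packets = []
    while @buffer.bytesize >= HEADER_SIZE
      # Assumption: the length field is a little-endian uint32.
      total = HEADER_SIZE + @buffer.byteslice(4, 4).unpack1('V')
      break if @buffer.bytesize < total # message not fully received yet

      packets << @buffer.slice!(0, total) # remove one packet from the front
    end
    packets
  end
end

buffer = PacketBuffer.new
packet = "CHAT" + [5].pack('V') + "REF1" + "hello"
early = buffer.split(packet[0, 10])   # only part of the header has arrived
late  = buffer.split(packet[10..-1])  # the rest arrives in a second read
```

With EventMachine, one such object could be created in post_init and fed from receive_data, which keeps one client's partial packet from being mixed with another's.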

Upvotes: 0

David Grayson

Reputation: 87541

Well here's one error that jumps out at me:

len = 0
data[last...last+=4].each_byte {|b| len+=b}

You didn't specify what format you are storing the length in, but if it's a little-endian integer then, for each byte b, you should do something like len = (len>>8) + (b<<24) instead of just adding all the bytes together like you are doing now. Your current algorithm only works when len is less than 256, because then the three upper bytes are zero and the sum equals the lowest byte.
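
To make the difference concrete, here is a small sketch that decodes the same 4-byte field three ways. It assumes the length really is a little-endian unsigned 32-bit integer, which the question doesn't confirm; the method names are made up for the example. String#unpack1 with the 'V' directive is the built-in way to do the same decoding.

```ruby
# Assumption: the length field is a little-endian uint32.
def le_uint32_by_shifting(bytes)
  len = 0
  # Each new byte enters at the top and earlier bytes move down by 8 bits,
  # so after four bytes the first byte ends up as the least significant one.
  bytes.each_byte { |b| len = (len >> 8) + (b << 24) }
  len
end

def le_uint32_by_unpack(bytes)
  bytes.unpack1('V') # 'V' = 32-bit unsigned integer, little-endian
end

field   = [300].pack('V')       # the bytes 0x2C 0x01 0x00 0x00
shifted = le_uint32_by_shifting(field)
decoded = le_uint32_by_unpack(field)
summed  = field.each_byte.sum   # what adding the bytes together produces
```

For a length of 300, both decoders give 300, while the byte sum gives 45, so the parser would expect a much shorter message than was actually sent.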

There may be other logic errors hiding in here. I don't like your use of confusing expressions like data[last..-1].nil?; I would rewrite them as simple inequalities involving data.length and last.

If you want to really clean up your code then I would recommend taking a different approach: feed the bytes into a new function called process_byte one at a time. That function would be in charge of keeping track of any state information it needs (e.g. what part of the message it is expecting to receive next), assembling the bytes into complete messages, and passing each complete message on to higher-level code. The process_byte function would be unaware of how the bytes were packetized, so right away you will crush a certain class of bugs your program might have.
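
A minimal sketch of that idea might look like the following. The ByteParser class and its callback are hypothetical names; the layout (4-byte name, 4-byte length, 4-byte ref, then the message) is inferred from the code in the question, and the little-endian length is an assumption.

```ruby
# Hypothetical byte-at-a-time parser. It never sees how the bytes were
# grouped into reads; it only tracks how many bytes it still needs.
class ByteParser
  HEADER_SIZE = 12 # 4-byte name + 4-byte length + 4-byte ref

  def initialize(&on_packet)
    @on_packet = on_packet # called with each complete packet
    reset
  end

  def process_byte(byte)
    @packet << byte # appending an Integer to a binary string adds one byte
    return if @packet.bytesize < @needed

    unless @in_body
      # Header complete: read the length (assumed little-endian uint32).
      @needed = HEADER_SIZE + @packet[4, 4].unpack1('V')
      @in_body = true
      return if @packet.bytesize < @needed # keep reading unless the message is empty
    end

    @on_packet.call(@packet)
    reset
  end

  private

  def reset
    @packet = String.new   # empty binary string for the next packet
    @needed = HEADER_SIZE  # total size to reach before the next decision
    @in_body = false
  end
end

packets = []
parser = ByteParser.new { |pkt| packets << pkt }
stream = "CHAT" + [5].pack('V') + "REF1" + "hello" + "PING" + [0].pack('V') + "REF2"
# Feed the stream in two arbitrary chunks; the split point does not matter.
[stream[0, 7], stream[7..-1]].each do |chunk|
  chunk.each_byte { |b| parser.process_byte(b) }
end
```

Calling a method once per byte is slower than slicing a buffer, so this trades some throughput for simpler logic.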

You could use Ruby fibers to implement the process_byte function in a nice way that allows you to write code that looks synchronous (e.g. len += get_next_byte()) but would actually be asynchronous.
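
As a rough illustration of the fiber approach, the sketch below reads the fields in straight-line order while the fiber suspends whenever it needs another byte. build_parser and the two lambdas are hypothetical names, and the packet layout and little-endian length are assumptions taken from the question's code.

```ruby
# Sketch of a fiber-based parser that appends complete packets to `packets`.
def build_parser(packets)
  Fiber.new do
    # Fiber.yield suspends the parser; it returns the byte given to the next resume.
    get_next_byte = -> { Fiber.yield }
    read = ->(n) { Array.new(n) { get_next_byte.call }.pack('C*') }

    loop do
      name = read.call(4)
      lenf = read.call(4)
      ref  = read.call(4)
      msg  = read.call(lenf.unpack1('V')) # assumed little-endian uint32 length
      packets << (name + lenf + ref + msg)
    end
  end
end

packets = []
parser = build_parser(packets)
parser.resume # run to the first Fiber.yield so the parser is waiting for a byte

first  = "CHAT" + [5].pack('V') + "REF1" + "hello"
second = "PING" + [0].pack('V') + "REF2"
# Chunk boundaries are arbitrary, as they would be with TCP reads.
chunks = [first[0, 6], first[6..-1] + second]
chunks.each { |chunk| chunk.each_byte { |b| parser.resume(b) } }
```

The initial resume with no argument is needed because the first value passed to a fiber goes to its block rather than to Fiber.yield.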

Upvotes: 1
