Ashwin

Socket: Incorrect length returned by read() function

I am pretty new to socket programming. I have a function call similar to:

len = read(FD, buf, 1500);

[which is responsible for reading data from a telnet connection]

A printf on the next line shows buf to be more than 300 characters long, but (int)len gives only 89! Because of this, all further parsing of the returned string fails.

I have read a lot of questions about async socket reads returning less data than requested, but in the above case it is returning sufficient data but the length reported is all wrong (the returned string is always the same, and the length is always the same wrong value).

Also, the above function works properly when the returned string is small (typically in the range of 100 characters).

Any pointers would be extremely helpful!

--Ashwin

Upvotes: 2

Views: 1882

Answers (4)

paxdiablo

Reputation: 881373

but in the above case it is returning sufficient data but the length reported is all wrong

No, you are wrong about that. It is returning what it says it's returning, 89 bytes. The problem is that those 89 bytes don't include a nul terminator so that, when you printf the buffer, it keeps going, printing whatever was already in the rest of the buffer before your read happened.

What you should be doing (but see caveat below) is something like:

len = read(FD, buf, 1500);
if (len > 0)  /* -1 means error, 0 means end-of-file */
    printf ("%.*s\n", (int) len, buf);

to ensure you don't print beyond the end of the data that was actually read.

What you're seeing is equivalent to:

char buff[500];
strcpy (buff, "Hello there");
memcpy (buff, "Goodbye", 7);
printf ("%s", buff);

Because you're not transferring the nul character in the memcpy, the buffer you're left with is:

               +---+---+---+---+---+---+---+---+---+---+---+---+
After strcpy : | H | e | l | l | o |   | t | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
After memcpy : | G | o | o | d | b | y | e | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+

giving the string "Goodbyehere".

Caveat:

If there are nul characters within your data stream, that printf won't work since it'll stop at the first nul character it finds. The read function reads binary data from a file descriptor and it doesn't have to stop at the first newline or nul character.

That would be equivalent to:

char buff[500];
strcpy (buff, "Hello there");
memcpy (buff, "Go\0dbye", 8);
printf ("%s", buff);

               +---+---+---+---+---+---+---+---+---+---+---+---+
After strcpy : | H | e | l | l | o |   | t | h | e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+
After memcpy : | G | o | \0| d | b | y | e | \0| e | r | e | \0|
               +---+---+---+---+---+---+---+---+---+---+---+---+

giving the string "Go".
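When the data may contain nul characters, one option is to carry the byte count around explicitly instead of relying on a terminator. The sketch below is only an illustration: escape_bytes is a hypothetical helper (not a library function) that renders exactly len bytes into a printable string, showing every non-printable byte, nul included, as \xNN.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy len bytes of buf into out, rendering each
 * non-printable byte (nul included) as \xNN. Stops early if out is full.
 * out is always nul-terminated; returns the number of characters stored,
 * not counting that terminator. */
size_t escape_bytes(const char *buf, size_t len, char *out, size_t outcap)
{
    size_t used = 0;

    assert(buf != NULL && out != NULL && outcap > 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (isprint(c)) {
            if (used + 1 >= outcap)      /* keep room for the terminator */
                break;
            out[used++] = (char)c;
        } else {
            if (used + 4 >= outcap)      /* 4 characters plus terminator */
                break;
            snprintf(out + used, 5, "\\x%02X", (unsigned)c);
            used += 4;
        }
    }
    out[used] = '\0';
    return used;
}
```

Calling escape_bytes("Go\0dbye", 7, out, sizeof out) with a 64-byte out leaves the text Go\x00dbye in out, so the bytes after the embedded nul are no longer hidden.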

If you really want to process nul- or newline-terminated strings on what is a binary channel, the following (pseudo-code) is one way to do it:

while true:
    while buffer has no terminator character:
        read some more data into buffer, break on error or end-of-file.
    break on error or end-of-file.
    while buffer has at least one terminator character:
        process data up to first terminator character.
        remove that section from buffer.

It's a process that reads data until you have at least one "unit of work", then processes those units of work until you don't have a complete unit of work left.
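The pseudo-code above could be sketched in C roughly as follows. This is a minimal illustration under several assumptions that are not in the original answer: the terminator is '\n', the descriptor is blocking, a record never exceeds a fixed 4096-byte buffer, and an unterminated tail at end-of-file is simply discarded. The names process_lines, line_fn, line_stats and count_line are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define BUF_CAP 4096   /* assumed upper bound on one record */

/* Hypothetical callback: receives one record, terminator excluded. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Example callback that only counts records and their bytes. */
struct line_stats { size_t lines; size_t bytes; };

void count_line(const char *line, size_t len, void *ctx)
{
    struct line_stats *st = ctx;
    (void)line;
    st->lines++;
    st->bytes += len;
}

/* Reads fd until end-of-file, calling on_line for each complete record.
 * Returns 0 on end-of-file, -1 on error or if a record overflows buf. */
int process_lines(int fd, line_fn on_line, void *ctx)
{
    char buf[BUF_CAP];
    size_t used = 0;

    assert(on_line != NULL);
    for (;;) {
        size_t start = 0;
        char *nl;

        /* Process every complete unit of work currently buffered. */
        while ((nl = memchr(buf + start, '\n', used - start)) != NULL) {
            on_line(buf + start, (size_t)(nl - (buf + start)), ctx);
            start = (size_t)(nl - buf) + 1;
        }

        /* Remove the processed section, keeping any partial record. */
        memmove(buf, buf + start, used - start);
        used -= start;
        if (used == sizeof buf)
            return -1;                    /* record too long */

        /* Read some more data after whatever is left over. */
        ssize_t n = read(fd, buf + used, sizeof buf - used);
        if (n < 0) {
            if (errno == EINTR)
                continue;                 /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return 0;                     /* end-of-file */
        used += (size_t)n;
    }
}
```

The key point is that the length returned by each read() is the only thing that advances used; nothing ever assumes the buffer is nul-terminated.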

Upvotes: 2

Aif

Reputation: 11220

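You can nul-terminate the buffer at the read offset using:

if (len >= 0) buf[len] = 0;

The printf call should then be OK. Note that this needs room for one extra byte: if buf is exactly 1500 bytes, ask read() for at most 1499 so that buf[len] is never past the end of the buffer.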
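One way to make the terminator always safe is to reserve a byte for it at the point of the read. The following is a small sketch, not a drop-in fix: read_terminated is a hypothetical wrapper name, and it assumes the data contains no embedded nul characters.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical wrapper: reads at most cap - 1 bytes so that there is
 * always room for the terminator. Returns whatever read() returned;
 * buf is nul-terminated in every case (empty on error or end-of-file). */
ssize_t read_terminated(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    ssize_t n = read(fd, buf, cap - 1);
    buf[n > 0 ? (size_t)n : 0] = '\0';   /* covers n == 0 and n == -1 too */
    return n;
}
```

With this, a call such as read_terminated(FD, buf, sizeof buf) can be followed by printf("%s", buf) without picking up stale bytes from earlier use of the buffer.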

Upvotes: 2

dfa

Reputation: 116334

read() attempts to read up to 1500 bytes from file descriptor FD into the buffer starting at buf. On success, the number of bytes read is returned. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file

Generally you need to call read() within a loop.

Upvotes: 0

Indy9000

Reputation: 8851

Usually for async sockets you have to read until the desired byte count is received. This means you have to manage the buffers. (i.e. increment the buffer pointer according to the bytes received, etc.)
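The buffer management described here might look like the sketch below. It assumes a blocking descriptor (a non-blocking socket would additionally need EAGAIN handling, for example with poll() or select()), and read_exact is a hypothetical helper name rather than a standard function.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: keeps calling read() until want bytes have
 * arrived, end-of-file is reached, or an error occurs. Returns the
 * number of bytes stored (less than want only at end-of-file), or -1. */
ssize_t read_exact(int fd, char *buf, size_t want)
{
    size_t got = 0;

    assert(buf != NULL || want == 0);
    while (got < want) {
        /* Advance the buffer pointer by what has already been received. */
        ssize_t n = read(fd, buf + got, want - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break;               /* end-of-file before want bytes */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

This only helps when the desired byte count is known in advance, for instance from a fixed-size header; for terminator-delimited data the count has to come from scanning the bytes received so far.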

Your observation that the buffer holds the correct amount of data while the reported size is wrong is probably caused by stale data left over from a previous read. To confirm this, you could clear the buffer before each read.

Upvotes: 1
