Eamorr

Reputation: 10012

C sockets: recv(...) not returning correct bytes

If I telnet to www.xlhi.com on port 80 (telnet www.xlhi.com 80) and send the following GET request:

GET http://www.xlhi.com/ HTTP/1.1
Host: www.xlhi.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Proxy-Connection: keep-alive
Cookie: CG=IE:04:Cork
Cache-Control: max-age=0

I get the following response:

HTTP/1.1 200 OK
Date: Tue, 06 Dec 2011 10:35:08 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.9
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48
Content-Type: text/html

��(�ͱ���I�O����H�����ч��
                          �4�@�

Everything is fine and as expected. I'm interested in the gzipped binary data returned ("Hello").

Now, I have this C function which sends a GET request to a server (in this case www.xlhi.com):

char* applyGetReq(char* url,char* data,int len){
        int sockfd, numbytes;
        struct addrinfo hints, *servinfo, *p;
        int rv;
        char s[INET6_ADDRSTRLEN];

        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;
        printf("Server name: %s\n\n",url);
        if ((rv = getaddrinfo(url,"80", &hints, &servinfo)) != 0) {
                fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
                exit(1);
        }

        // loop through all the results and connect to the first we can
        for(p = servinfo; p != NULL; p = p->ai_next) {
                if ((sockfd = socket(p->ai_family, p->ai_socktype,p->ai_protocol)) == -1) {
                        perror("client: socket");
                        continue;
                }
                if (connect(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
                        close(sockfd);
                        perror("client: connect");
                        continue;
                }
                break;
        }

        if (p == NULL) {
                fprintf(stderr, "client: failed to connect\n");
                exit(1);
        }

        inet_ntop(p->ai_family, get_in_addr((struct sockaddr *)p->ai_addr),s, sizeof s);
        //printf("client: connecting to %s\n", s);

        sendall(sockfd,data,&len);

        freeaddrinfo(servinfo); // all done with this structure

        char* buf=malloc(MAXDATASIZE*sizeof(char));
        if ((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
                perror("recv");
                exit(1);
        }
        //printf("numbytes:%d\n",numbytes);
        buf[numbytes] = '\0';
        close(sockfd);
        return buf;
}

Now, when I call this function and print out the result:

    ...
    int len = strlen(data);   //data is a char[] and contains the exact same GET request as mentioned above
    char* buf=NULL;
    buf=applyGetReq(stripped_url,data,len);
    printf("%s\n",buf);

I get the following response from the server:

HTTP/1.1 200 OK
Date: Tue, 06 Dec 2011 10:03:13 GMT
Server: Apache/2.2.14 (Ubuntu)
X-Powered-By: PHP/5.3.2-1ubuntu4.9
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 48
Content-Type: text/html

�

As you can see, the page contents (binary data) are cut short for no reason I can find. I should be getting:

��(�ͱ���I�O����H�����ч��
                              �4�@�

I've been looking at this for two hours now and can't seem to get to the bottom of it so I thought I'd ask the community.

Upvotes: 1

Views: 841

Answers (1)

cnicutar

Reputation: 182619

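That's how printf works. With %s it treats the buffer as a C string and stops at the first NUL (0) byte. Gzip-compressed data routinely contains such bytes, so the output ends early regardless of how many bytes recv returned. Use a function that takes an explicit length instead. Here numbytes is the value recv returned, so applyGetReq has to hand it back to the caller along with buf: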

fwrite(buf, 1, numbytes, stdout);

Upvotes: 4
