Javad Kouhi

Reputation: 135

Receiving a web page from a web server in a C program

Here is my code to get a web page from a server (actually google.com):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <string.h>
#include <unistd.h>

/* HTTP header lines are terminated by CRLF ("\r\n"), not a bare "\n" */
char http[] = "GET / HTTP/1.1\r\nAccept: */*\r\nHost: www.google.com\r\nAccept-Charset: utf-8\r\nConnection: keep-alive\r\n\r\n";
char page[BUFSIZ];

int main(int argc, char **argv)
{
    struct addrinfo hint, *res, *res0;

    char *address = "www.google.com";
    char *port = "80";

    int ret, sockfd;

    memset(&hint, '\0', sizeof(struct addrinfo));

    hint.ai_family = AF_INET;
    hint.ai_socktype = SOCK_STREAM;
/*  hint.ai_protocol = IPPROTO_TCP; */

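    if((ret = getaddrinfo(address, port, &hint, &res0)) != 0)
    {
        /* getaddrinfo() reports errors through its return value, not errno */
        fprintf(stderr, "getaddrinfo(): %s\n", gai_strerror(ret));
        exit(EXIT_FAILURE);
    }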

    for(res = res0; res; res = res->ai_next)
    {
        sockfd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);

        if(-1 == sockfd)
        {
            perror("socket()");
            continue;
        }

        ret = connect(sockfd, res->ai_addr, res->ai_addrlen);

        if(-1 == ret)
        {
            perror("connect()");
            close(sockfd); /* don't leak the descriptor before trying the next address */
            sockfd = -1;
            continue;
        }

        break;

    }

    if(-1 == sockfd)
    {
        printf("Can't connect to the server...");
        exit(EXIT_FAILURE);
    }

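    if(-1 == send(sockfd, http, strlen(http), 0))
    {
        perror("send()");
        exit(EXIT_FAILURE);
    }

    /* recv() does not null-terminate, so leave room for the terminator */
    ret = recv(sockfd, page, sizeof page - 1, 0);

    if(-1 == ret)
    {
        perror("recv()");
        exit(EXIT_FAILURE);
    }

    page[ret] = '\0';

    printf("%s\n", page);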

    return 0;
}

I've defined an array of BUFSIZ chars to store the web page. BUFSIZ is 1024 on my operating system, so I can store a page of up to 1024 characters. But what if the page is larger than that? How can I store a page that is longer than 1024 characters? I could define an array of 2048, 4096 or even 10,000 chars, but I don't think that is the conventional way.

Thanks.

Upvotes: 1

Views: 304

Answers (3)

colin

Reputation: 301

A dynamic array is one way to handle the "unknown length" problem: when more bytes arrive than the array can hold, you enlarge it.

You can also parse the HTTP response header first. If it contains a "Content-Length" field, you know the length of the response body (which holds the requested document) and can allocate a large enough page buffer up front.
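As a rough sketch of the header check described above, the helper below scans a header block for the Content-Length field. The function name `content_length` is hypothetical, and it assumes the headers have already been received into a NUL-terminated string with CRLF line endings. Responses that use chunked transfer encoding carry no Content-Length, so a caller still needs a fallback for the -1 case.

```c
#include <assert.h>
#include <stdlib.h>
#include <strings.h>   /* strncasecmp (POSIX) */
#include <string.h>

/* Hypothetical helper: look for "Content-Length:" in a NUL-terminated
   header block. Returns the value, or -1 if the field is absent or
   malformed. Only lines before the blank line that ends the headers
   are examined. */
long content_length(const char *headers)
{
    static const char name[] = "Content-Length:";
    const char *line = headers;

    assert(headers != NULL);

    while (*line != '\0') {
        if (strncasecmp(line, name, sizeof name - 1) == 0) {
            const char *value = line + sizeof name - 1;

            /* skip optional spaces/tabs after the colon */
            while (*value == ' ' || *value == '\t')
                value++;
            if (*value < '0' || *value > '9')
                return -1;          /* no digits: treat as malformed */
            return strtol(value, NULL, 10);
        }

        const char *eol = strstr(line, "\r\n");
        if (eol == NULL || eol == line)
            break;                  /* no more lines, or blank line reached */
        line = eol + 2;             /* advance to the start of the next line */
    }
    return -1;
}
```

With the length known, the body buffer could be allocated with a single malloc() of that size (plus one byte if a terminating NUL is wanted).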

Upvotes: 0

unwind

Reputation: 400129

What you typically do is store the data in a dynamic array, which in C is implemented using realloc() to grow a block of memory.

You usually use a smaller statically allocated array like yours to do repeated reads into, and then once you've gotten a new block of bytes you append it to the dynamic array, growing it if needed.

You're going to have to keep track of the dynamic array's actual length (the number of downloaded characters that have been stored in it) and its allocated length (the number of bytes available to store data in) separately.
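A minimal sketch of such a dynamic array might look like the following. The `struct dynbuf` type and `dynbuf_append` function are hypothetical names, and the doubling growth policy and 1024-byte starting size are assumptions rather than anything the answer prescribes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; names are illustrative only. */
struct dynbuf {
    char  *data;   /* heap block, NULL until the first append */
    size_t len;    /* actual length: bytes stored so far */
    size_t cap;    /* allocated length: bytes available in data */
};

/* Append n bytes from src, growing the block with realloc() if needed.
   Returns 0 on success, -1 if realloc() fails (buffer left unchanged). */
int dynbuf_append(struct dynbuf *b, const char *src, size_t n)
{
    assert(b != NULL);

    if (n == 0)
        return 0;

    if (n > b->cap - b->len) {
        size_t newcap = b->cap ? b->cap : 1024;   /* assumed starting size */

        while (newcap - b->len < n)
            newcap *= 2;            /* doubling keeps appends cheap on average */

        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;              /* old block is still valid */
        b->data = p;
        b->cap = newcap;
    }

    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

Starting from `struct dynbuf b = {0};`, each block read from the socket can be passed to `dynbuf_append(&b, page, nread)`, and the whole thing is released with `free(b.data)` when the page is no longer needed.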

Upvotes: 1

cnicutar

Reputation: 182754

One typical solution is to call recv(2) in a loop and process (print?) the received bytes as they arrive. That way you can receive pages of any size.

ssize_t nread;

while ((nread = recv(sockfd, page, sizeof page, 0)) > 0) {
    /* .... */
}

if (nread < 0)
    perror("recv");
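To make the loop body concrete, here is one possible sketch that writes each received chunk to a stdio stream. The helper name `drain_socket` and the 4096-byte chunk size are assumptions. Note that recv() returns 0 only once the peer closes its side of the connection, so with "Connection: keep-alive" the loop may block after the last byte; sending "Connection: close" or stopping after Content-Length bytes avoids that.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: copy everything readable from sockfd to out.
   Returns the total number of bytes copied, or -1 if recv() or
   fwrite() fails. */
long drain_socket(int sockfd, FILE *out)
{
    char chunk[4096];               /* assumed chunk size */
    ssize_t nread;
    long total = 0;

    assert(out != NULL);

    while ((nread = recv(sockfd, chunk, sizeof chunk, 0)) > 0) {
        /* the chunk is not NUL-terminated, so write exactly nread bytes */
        if (fwrite(chunk, 1, (size_t)nread, out) != (size_t)nread)
            return -1;
        total += nread;
    }

    return nread < 0 ? -1 : total;  /* nread == 0 means the peer closed */
}
```

In the question's program this could be called as `drain_socket(sockfd, stdout)` right after the request has been sent.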

Upvotes: 2
