banad

Reputation: 491

Fetching a web page using C socket programming

I am trying to build a proxy server in C. My problem is as follows.

I have a function fetch_response() which connects to example.com and queries the server with an HTTP GET request.

int fetch_response() {
    int sockfd, portno, n;
    struct sockaddr_in serv_addr;
    struct hostent *server;

    char buffer[4096];
    char *host = "example.com";

    portno = 80;
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) 
        error("ERROR opening socket");
    server = gethostbyname(host);
    if (server == NULL) {
        fprintf(stderr,"ERROR, no such host\n");
        exit(0);
    }
    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr, 
         (char *)&serv_addr.sin_addr.s_addr,
         server->h_length);
    serv_addr.sin_port = htons(portno);
    if (connect(sockfd,(struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0) error("ERROR connecting");
    const char * request = "GET / HTTP/1.0\r\nHost: example.com\r\nConnection: close\r\n\r\n";
    n = write(sockfd,request,strlen(request));
    if (n < 0) error("ERROR writing to socket");
    bzero(buffer,4096);
    n = read(sockfd,buffer,4095);
    if (n < 0) error("ERROR reading from socket");
    printf("%d\n", (int)strlen(buffer));
    printf("%s\n",buffer);
    close(sockfd);
    return 0;
}

It runs fine when tested on its own, for example:

int main() {
    fetch_response();
    return 0;
}

However, in my proxy server I am trying to handle multiple client requests, so my main() function looks like this:

while(1) {
   new_socket = accept(params);
   if(new_socket < 0) error("Error on Connect");
   pid = fork();
   if(pid < 0) error("Error on fork");
   if(pid == 0) {
      fetch_response();
      exit(0);
   }
   else close(new_socket);
}

In this case I run into a problem: I receive only the first 1328 bytes of the requested page, no matter what my buffer size is. I have tested it with different domains and the result is the same. For example, in the case of example.com, the expected result is:

<html>
<head></head>
<body><h1> Example Domain </h1>
      < Some remaining body here >
</body>
</html>

But instead I get:

<html>
<head></head>
<body><h1> Example Domain </h1>

I cannot understand why this is happening. Please help.

Thanks !

PS: This is not the actual code of the proxy server. For debugging, I commented everything out and tested the code as above.

Upvotes: 2

Views: 3670

Answers (1)

Joel C

Reputation: 3158

You need to enclose your reading code in a loop, something like this:

while (1) {
  bzero(buffer,4096);
  n = recv(sockfd,buffer,4095, 0);
  if (n < 0) {
    error("ERROR reading from socket");
    break;
  }
  if (n == 0) {
    // far end has closed socket
    break;
  }
  // printf("%d\n", (int)strlen(buffer));
  printf("%d\n", n);
  printf("%s\n",buffer);
}

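This keeps reading from the socket until the far end closes it. TCP is a byte stream, so a single read() or recv() call returns only the bytes that have arrived so far, which is not necessarily the whole response. Each call to recv returns the number of bytes it placed in the buffer. When it returns 0, the far end has closed the socket and there is nothing more to read.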
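If you need the whole response in memory rather than printed chunk by chunk (a proxy usually wants to forward it), the same loop can append to a growing buffer. The following is only a sketch: read_until_eof is a hypothetical helper name, not something from your code, and it assumes the server closes the connection when the response is complete, as your "Connection: close" request asks it to.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): read from fd until
 * the peer closes the connection, growing a heap buffer as needed.
 * Returns a NUL-terminated buffer that the caller must free(), or NULL
 * on error. On success, *out_len receives the number of bytes read. */
char *read_until_eof(int fd, size_t *out_len)
{
    assert(out_len != NULL);

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        /* Grow when full, always keeping one spare byte for the NUL. */
        if (len + 1 >= cap) {
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }

        ssize_t n = read(fd, buf + len, cap - len - 1);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            free(buf);
            return NULL;
        }
        if (n == 0)
            break;          /* peer closed the connection: no more data */
        len += (size_t)n;
    }

    buf[len] = '\0';
    *out_len = len;
    return buf;
}
```

In fetch_response() this would replace the single read() call: pass sockfd, then write out exactly the returned length (for example with fwrite) instead of relying on strlen, since a response body may contain zero bytes.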

Upvotes: 3
