Ólafur Waage

Reputation: 70001

Tips for developing a web server

After searching here, I found next to no questions on developing a web server.

I'm mainly doing this for two reasons: as a side project, and to learn more about developing a server program. This is not going to turn into a usable application; it is more of a learning tool.

So the questions are simple.

Links to helpful sites are welcome, but don't link to a working project that is open source, since this is about the process of learning.

Upvotes: 11

Views: 7709

Answers (8)

Abdelghani Tassi

Reputation: 23

I've already developed a web server in C that serves HTML and PHP. It's not that complicated: you need to know how to use TCP/IP sockets, threads to handle multiple requests, and fork to create a child process that runs the PHP command line (I used execvp).

I think the hardest parts are handling strings in C and passing POST/GET requests to the command line.

Good luck

Upvotes: 0

Lloyd

Reputation: 8406

local-web-server is an example of a simple development web server written in Node.js. It's more reliable and has more features than python -m SimpleHTTPServer.

Upvotes: 1

Alex Gartrell

Reputation: 2554

The course I TAed had a proxy assignment so I can kind of shed some light here, I think.

So, you're going to end up doing a lot of header changing just to make your life easier. Namely, HTTP/1.0 is far easier to deal with than HTTP/1.1. You don't want to have to deal with managing timeouts, keep-alives, and the like. One connection per transaction is easiest.
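To make the "one connection per transaction" idea concrete, here is a minimal sketch in Python of what such a reply could look like on the wire. build_response is a made-up helper name, and the exact set of headers is my assumption rather than anything from the assignment.

```python
def build_response(status: int, reason: str, body: bytes,
                   content_type: str = "text/html") -> bytes:
    """Build a complete HTTP/1.0 response as bytes (hypothetical helper)."""
    headers = [
        f"HTTP/1.0 {status} {reason}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(body)}",
        "Connection: close",  # one transaction per connection, no keep-alive
    ]
    # Header lines are separated by \r\n; an empty line ends the header block.
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + body
```

After sending these bytes the server simply closes the socket, so there is no keep-alive state or timeout to track.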

You're going to be doing lots and lots of parsing. Parsing is hard in C. I'd advise you to write a function that is something like

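int readline(char *buff, int maxLen) {
    int i = 0;
    char c;
    /* s is the connection's socket, assumed to be in scope; stop at maxLen - 1 to leave room for the terminator */
    while (i < maxLen - 1 && (c = readNextCharFromSocket(&s)) && c != '\n')
      buff[i++] = c;
    buff[i] = '\0';  /* terminate so the C string functions can be used on the line */
    return i;
}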

and handle it one line at a time, solely because it's easiest to use the existing C string functions on one line at a time. Also, remember lines are \r\n separated and headers are terminated with a \r\n\r\n.
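For comparison, here is roughly the same line-at-a-time idea sketched in Python, where the string handling is less painful. parse_request_head is a hypothetical name, and the sketch assumes the whole header block has already been read from the socket.

```python
def parse_request_head(raw: bytes):
    """Split a raw request into (method, path, version, headers).

    Hypothetical helper; assumes `raw` already contains the complete
    header block, terminated by \r\n\r\n.
    """
    head, _, _body = raw.partition(b"\r\n\r\n")   # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    # First line looks like: GET /index.html HTTP/1.0
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers
```

Printing the returned tuple for each request is also a cheap way to do the header sanity checks when something breaks.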

The main hard thing will be parsing; so long as you can read files, everything else will work as expected.

For debugging, you'll probably want to print out headers that are passed around to sanity test them when stuff breaks.

Upvotes: 1

Charlie Martin

Reputation: 112366

A web server starts out as being an extremely simple piece of code:

  • open a TCP/IP socket on port 80
  • while not terminated
    • wait for connections on that socket
    • when someone sends you HTTP headers
      • find the path to the file
      • copy the file to the socket

So the outline of the code is easy.
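The outline above can be sketched in a few lines of Python. This is only an illustration under some assumptions: serve_once and serve_forever are names I made up, the default port is 8080 because binding port 80 usually needs elevated privileges, and the request is assumed to arrive in a single read.

```python
import socket
from pathlib import Path

def serve_once(listener: socket.socket, root: Path) -> None:
    """Accept one connection, send the requested file, close the connection."""
    conn, _addr = listener.accept()              # wait for a connection
    with conn:
        # Assumption: the whole request head arrives in one read.
        request = conn.recv(65536).decode("iso-8859-1")
        parts = request.split(" ")
        if len(parts) < 2:
            return                               # empty or malformed request
        target = root / parts[1].lstrip("/")     # find the path to the file
        if target.is_file():
            # Copy the file to the socket after a minimal status line.
            conn.sendall(b"HTTP/1.0 200 OK\r\n\r\n" + target.read_bytes())
        else:
            conn.sendall(b"HTTP/1.0 404 Not Found\r\n\r\nNot found")

def serve_forever(root: Path, port: int = 8080) -> None:
    """Open a listening socket and handle connections one at a time."""
    with socket.create_server(("127.0.0.1", port)) as listener:
        while True:
            serve_once(listener, root)
```

Calling serve_forever(Path("."), 8080) would serve the current directory. The request path is deliberately not sanitised here, so this is for local experiments only.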

Now, you have some complexities to handle:

  • in the simplest version of the code, while you're talking to one browser, all the others can't connect. You need to come up with some way of handling multiple connections.
  • it's often convenient to be able to send out something more than just a static file (although the first HTTP servers did exactly that) so you need to be able to run other programs.

Handling the possibility of multiple connections is also relatively easy, with a number of possible choices.

  • the simplest version (again, this is the way it was done originally) is to have the code that listens to port 80 set up a specific socket for that connection, then fork a copy of itself to handle that one connection. That process runs until the socket is closed, and then terminates. However, that's relatively expensive: a fork takes tens of milliseconds in general, so that limits how fast you can run.
  • The second choice is to create a lightweight process (a.k.a. a thread) to process the request.
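As a rough illustration of the thread-per-connection choice, here is a Python sketch. handle and accept_loop are hypothetical names, the handler just sends a fixed reply, and the max_connections parameter exists only so the loop can be stopped when experimenting.

```python
import socket
import threading

def handle(conn: socket.socket) -> None:
    """Placeholder handler: read the request and send a fixed reply."""
    with conn:
        conn.recv(65536)
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\n")

def accept_loop(listener: socket.socket, max_connections=None) -> None:
    """Hand each accepted connection to its own thread.

    max_connections is an assumption for testing; a real server would
    keep looping until it is shut down.
    """
    served = 0
    while max_connections is None or served < max_connections:
        conn, _addr = listener.accept()
        # The accepting thread goes straight back to accept();
        # the new thread owns conn from here on.
        threading.Thread(target=handle, args=(conn,), daemon=True).start()
        served += 1
```

The fork-based variant has the same shape, with the child process taking the place of the thread.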

Running a program is actually fairly easy too. In general, you define a special path to a CGI directory; for a URL whose path goes through that directory, the server interprets the path name as the path to a program. The server then creates a subprocess using fork/exec, with STDOUT connected to the socket. The program runs, sending its output to STDOUT, and that output is sent on to the client browser.
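Here is a small Python sketch of that pattern. It differs from the description in one respect: instead of wiring the child's STDOUT straight to the socket, it captures the output and returns it, which is easier to try out on any platform. run_cgi is a hypothetical name, and the sketch assumes the program is a Python script; QUERY_STRING and REQUEST_METHOD are standard CGI environment variables.

```python
import os
import subprocess
import sys

def run_cgi(script_path: str, query_string: str = "") -> bytes:
    """Run a script in a child process and return what it wrote to stdout.

    Hypothetical helper. Assumption: the program is a Python script, so it
    is started with the current interpreter.
    """
    env = {**os.environ, "QUERY_STRING": query_string, "REQUEST_METHOD": "GET"}
    result = subprocess.run(
        [sys.executable, script_path],
        env=env,
        stdout=subprocess.PIPE,   # capture instead of connecting to the socket
        timeout=10,               # don't let a stuck script hang the server
        check=False,
    )
    return result.stdout
```

The server would then write its status line followed by the returned bytes to the client socket.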

This is the basic pattern; everything else a web server does is just adding frills and additional functionality to this basic pattern.

Here are some other sources for example code:


It does pretty much nothing of what you really wanted, but for simplicity it's hard to beat this one from http://www.commandlinefu.com (this is the Python 2 form; on Python 3 the equivalent is python3 -m http.server):

$ python -m SimpleHTTPServer

Upvotes: 20

rein

Reputation: 33455

Firstly, please don't let this become a usable project - getting security right for web servers is really hard.

Ok, here are things to keep in mind:

  1. The thread that accepts connections needs to hand off to background threads as soon as possible.
  2. You can't have a thread for every single connection - with large volumes you'll run out of your thread limit.
  3. Use some kind of a worker thread pool to handle your requests.
  4. Ensure that you scrub the URL path when you get an HTTP GET request, so that a request like http://localhost/../../Users/blah/ can't reach files outside the directory you are serving.
  5. Ensure you always set the relevant Content-Type header (MIME type) on responses.
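Points 4 and 5 can be sketched in Python along these lines. resolve_path and content_type_for are names I made up, and this is one possible approach, not a complete defence: it resolves the requested path and refuses anything that ends up outside the document root, then guesses the MIME type from the file name.

```python
import mimetypes
from pathlib import Path
from urllib.parse import unquote, urlsplit

def resolve_path(doc_root: Path, url_path: str):
    """Map a request path to a file under doc_root, or None if it escapes it."""
    root = doc_root.resolve()
    # Drop any query string and decode percent-escapes such as %2e%2e.
    relative = unquote(urlsplit(url_path).path).lstrip("/")
    candidate = (root / relative).resolve()      # collapses ".." segments
    if candidate != root and root not in candidate.parents:
        return None                              # outside the document root
    return candidate

def content_type_for(path: Path) -> str:
    """Guess a Content-Type from the file name, with a generic fallback."""
    guessed, _encoding = mimetypes.guess_type(path.name)
    return guessed or "application/octet-stream"
```

With this, a request for /../../Users/blah/ resolves to None and the server can answer with a 403 or 404 instead of reading the file.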

Good luck - this is a hell of a job.

Upvotes: 10

Serapth

Reputation: 7162

I wrote a light webserver in Python a few years back, also as a learning project.

The simplest piece of advice I can give, especially for a learning project, is to build a core that works, then iterate on the design from there. Don't aim for the moon right off the hop; start very small, then add features, refine, and continue. I would recommend using a tool that encourages experimentation, like Python, where you can literally type and test code at the same time.

Upvotes: 1

Will Hartung

Reputation: 118681

The networking et al. are pretty standard fare, so don't worry so much about that. (There are several "instant" sample network servers in most any language.)

Instead, focus on actually implementing the HTTP specification. You'll be amazed at a) what you don't know and b) how many things that are supposed to be HTTP compliant really aren't, but fake it well.

Then you'll marvel that the web works at all.

When you're done with HTTP, enjoy trying to implement IMAP.

Upvotes: 2

Evan Meagher

Reputation: 4555

I was thinking of starting the same project as a way to learn Python better. There's a BaseHTTPServer module (http.server in Python 3) that's a pretty good starting point.

Here are some tutorial-style articles: 1 & 2
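As a rough idea of what that starting point looks like, here is a minimal sketch using the Python 3 module name, http.server. HelloHandler and make_server are hypothetical names, and the default port 8000 is an arbitrary choice.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = f"You asked for {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging to stderr

def make_server(port: int = 8000) -> HTTPServer:
    """Create (but do not start) a server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Calling make_server().serve_forever() starts it. The module does the socket handling and request parsing for you, so it is better for learning the HTTP side than the socket side.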

Upvotes: 0
