Thomas Wagenaar

Reputation: 6759

What is faster: preloading files or loading them on command?

I have been making a webserver with Python and I thought it would be faster to preload all the .html files in a dictionary like so:

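from os import getcwd, listdir
from os.path import isfile, join

# snippets maps each HTML file's name (without ".html") to its contents
snippets = {}
snippet_dir = join(getcwd(), "snippets")
for name in listdir(snippet_dir):
    if isfile(join(snippet_dir, name)):
        with open(join(snippet_dir, name), 'r') as f:  # close the file after reading
            snippets[name.replace('.html', '')] = f.read()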

This runs before I start my server. When the server needs to retrieve data, it simply calls clientsocket.send(snippets['nameofhtmlfile']) for every request. However, is this ACTUALLY faster than doing this for every request:

with open('nameofhtmlfile', 'r') as f:  # the file is closed automatically
    body = f.read()
c.send(body.encode('utf-8'))

Using the Chrome Developer Tools (F12), I can see that the first option is faster, but I want to know why it's actually faster.

Upvotes: 0

Views: 419

Answers (2)

Myst

Reputation: 19221

The memory option should be faster, as both disk access and the number of operations required to load content from a file are more expensive than reading data from memory.
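If you want to see the difference on your own machine, a rough timing sketch along these lines may help. The file name, file size and run count are made up for illustration, and the numbers will vary by system. Note that the operating system usually caches recently read files, so the "disk" variant mostly measures the cost of opening, reading and closing the file rather than physical disk access.

```python
import os
import tempfile
import timeit

# Hypothetical setup: one small HTML file in a temporary directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "index.html")
with open(path, "w", encoding="utf-8") as f:
    f.write("<html><body>hello</body></html>" * 100)

# Preloaded variant: read once, keep the bytes in a dictionary.
with open(path, "rb") as f:
    snippets = {"index": f.read()}

def from_memory():
    # A single dictionary lookup, no system calls.
    return snippets["index"]

def from_disk():
    # Open, read and close the file on every call.
    with open(path, "rb") as f:
        return f.read()

runs = 2000
mem_time = timeit.timeit(from_memory, number=runs)
disk_time = timeit.timeit(from_disk, number=runs)
print(f"memory: {mem_time:.6f}s  disk: {disk_time:.6f}s  for {runs} reads")
```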

...BUT...

I'm a firm believer in the lazy load option for these scenarios.

If your server is "sleeping" - let's assume you deployed on Heroku - load times could be very important for the responsiveness of that first request.

If you're loading every file into the cache during that first load, startup would be significantly slower...

...however, lazy loading - by which I mean that you store a file in the cache the first time it's actually requested - allows you to balance load times, response times and other considerations.
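In Python, a minimal sketch of that idea could look like the following. The class name LazyFileCache and the ".html" suffix handling are my own assumptions for illustration, not something from the question's code.

```python
import os

class LazyFileCache:
    """Reads a file on first request and keeps its bytes in memory."""

    def __init__(self, root):
        self.root = root      # directory that holds the .html files
        self._cache = {}      # name -> file contents as bytes

    def get(self, name):
        # Only touch the disk the first time a given name is requested.
        if name not in self._cache:
            path = os.path.join(self.root, name + ".html")
            with open(path, "rb") as f:
                self._cache[name] = f.read()
        return self._cache[name]
```

With this, the server would create snippets = LazyFileCache("snippets") at startup (which costs almost nothing) and call clientsocket.send(snippets.get("nameofhtmlfile")) per request. Only the first request for each file pays for the disk read. If the name comes from the request itself, it should be validated before being joined into a path.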

Having dynamic cache management also allows you to store template information upon first load (such as template objects) and maybe other information that isn't a file but is frequently used.

For instance, I have a project in Ruby where I stored Haml, Slim, SASS and CoffeeScript template rendering engines in the cache, so that it's possible to re-run the engine for every request without recreating the engine object (and obviously without re-accessing the template file).

Also, the cache API was open to the framework's users, and that certainly expanded its usability.

I know it's Ruby code, but if you want to have a look at the cache system, you can find the code here.

Good luck.

Upvotes: 1

llogiq

Reputation: 14541

You may want to look into load tests (e.g. with http_load) to answer your performance question.

Also, if you really care about performance, you should probably not write your webserver in Python, but reuse an existing one written in a lower-level language, for example h2o or nxweb.

Upvotes: 1
