user1340802

Reputation: 1157

Using Python Ladon on Apache: loading resources once for a web service

I am not really aware of the underlying strategies or protocols used by Ladon, web services and Apache (I am using Ladon and Python with mod_wsgi.so, originally on a Windows Apache server, since switched to an Ubuntu system).

I wonder whether it is possible to load some resources for Python once, so that the exposed methods can use these resources without having to reload them every time a new query to the web service is served.

Do you have any clue on how to achieve this, or any workaround if it is not possible?

Typically, I am loading some huge dictionaries from files, which takes too much time (I/O). Because they are reloaded for each new Ladon query, the web service is too slow. I would have liked to tell Ladon: "load this when Apache starts, and make it available to all my Python web services/code as a dictionary for as long as Apache is running". I will not modify this data, so I just need to be able to read/access it.
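In other words, the behaviour I am after looks roughly like this (only a sketch; the path, file format and names are placeholders), since under mod_wsgi the interpreter stays alive between requests, so module-level code runs only once per process:

import json

# placeholder path and format -- loaded once, when the module is first imported
with open("/home/mydata.file") as f:
    HUGE_DICT = json.load(f)

def lookup(key):
    # request handlers just read HUGE_DICT without ever reloading it
    return HUGE_DICT.get(key)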

Best regards

First EDIT: in case this helps, it looks like on my Ubuntu box (I switched from my Windows configuration to Ubuntu to be more "standard"; I hope that was the right move) Apache2 is built with the prefork MPM rather than the worker MPM (as suggested by Jakob Simon-Gaarde), as read from:

@: sudo /usr/sbin/apache2 -l
Compiled in modules:
  core.c
  mod_log_config.c
  mod_logio.c
  prefork.c
  http_core.c
  mod_so.c
@: sudo /usr/sbin/apache2 -l | grep MPM
@:

I am going to check how this can be done; maybe I will also put some simplified code here, because for now I am at a dead end even with your helpful answers (I can't make anything work here :/).

When installing the worker MPM, I found how to do it here: $ sudo apt-get install apache2-mpm-worker

Last EDIT:

Here is the skeleton of my web service code:

import sys
import codecs
import datetime
import glob
import os
import re

import numpy

import mywrapperfordata  # my own module that wraps the data loading

from ladon.ladonizer import ladonize
from ladon.types.ladontype import LadonType
from ladon.compat import PORTABLE_STRING

MODEL_DIR = "/home/mydata.file"

class Singleton(type):
    """Metaclass creating at most one instance per class."""
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class LDtest(object):
    __metaclass__ = Singleton  # Python 2 metaclass syntax (Python 2.6 here)
    modeldir = MODEL_DIR

    def __init__(self):
        self.load()

    def load(self):
        modeldir = LDtest.modeldir
        self.data = mywrapperfordata.mywrapperfordata(modeldir)
        b = datetime.datetime.now()
        self.features = self.data.load()  # loading is wrapped here
        c = datetime.datetime.now()
        print("loading: %s done." % (c - b))

    def letsdoit(self, myinput):
        return []  # actually the main logic, i.e. complex stuff accessing self.features

    @ladonize(PORTABLE_STRING, rtype=[PORTABLE_STRING])  # one declared type per parameter (self excluded)
    def ws(self, myinput):
        result = self.letsdoit(myinput)
        return result

a = datetime.datetime.now()
myLDtest = LDtest()
b = datetime.datetime.now()
print("LDtest: %s" % (b - a))

About loading time, from my apache2 log (note that module 1 is required and imported by module 2, and is also provided as a standalone web service) - it looks like the singleton is not built, or not built quickly enough?

[Tue Jul 09 11:09:11 2013] [notice] caught SIGTERM, shutting down
[Tue Jul 09 11:09:12 2013] [notice] Apache/2.2.16 (Debian) mod_wsgi/3.3 Python/2.6.6 configured -- resuming normal operations
[Tue Jul 09 11:09:50 2013] [error] Module 4: 0:00:02.885693.
[Tue Jul 09 11:09:51 2013] [error] Module 0: 0:00:03.061020
[Tue Jul 09 11:09:51 2013] [error] Module 1: 0:00:00.026059.
[Tue Jul 09 11:09:51 2013] [error] Module 1: 0:00:00.012517.
[Tue Jul 09 11:09:51 2013] [error] Module 2: 0:00:00.012678.
[Tue Jul 09 11:09:51 2013] [error] Module (dbload): 0:00:00.402387 (22030)
[Tue Jul 09 11:09:54 2013] [error] Module 3: 0:00:00.000036.
[Tue Jul 09 11:13:00 2013] [error] Module 0: 0:00:03.055841
[Tue Jul 09 11:13:01 2013] [error] Module 1: 0:00:00.026215.
[Tue Jul 09 11:13:01 2013] [error] Module 1: 0:00:00.012600.
[Tue Jul 09 11:13:01 2013] [error] Module 2: 0:00:00.012643.
[Tue Jul 09 11:13:01 2013] [error] Module (dbload): 0:00:00.322444 (22030)
[Tue Jul 09 11:13:03 2013] [error] Module 3: 0:00:00.000035.

Upvotes: 0

Views: 689

Answers (2)

Jakob Simon-Gaarde

Reputation: 725

We use Ladon extensively at my work for all our web projects, and I have the privilege of being able to develop my private project (I am the Ladon developer) and getting paid for it ;-) Some of our services have very heavy resource consumption; for instance, we have a text-to-speech service that loads around 1 GB of data into memory per supported language, and a word-prediction service that loads around 100 MB per supported language.

mod_wsgi is fine - we use that as well. What you need to do is make sure that your Apache server is compiled as mpm-worker (http://httpd.apache.org/docs/2.2/mod/worker.html). In this configuration your service runs in a multi-threaded environment instead of a multi-process environment. The effect is that you only fire up one interpreter per server process, which then runs your service in several underlying threads that share resources. The caveat is that you have to make sure that your service does not step on its own toes, meaning you will have to protect global variables and class-static variables shared between service class instances with mutex.acquire()/mutex.release().
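For example, a minimal sketch of that kind of protection (using Python's standard threading module; the shared-data names are placeholders):

import threading

_mutex = threading.Lock()
shared_features = {}  # placeholder for state shared between service instances

def update_features(key, value):
    with _mutex:  # equivalent to mutex.acquire()/mutex.release() around the write
        shared_features[key] = value

def read_features(key):
    # data that is loaded once and never modified can usually be read without
    # the lock, but take it whenever anything may mutate the shared state
    with _mutex:
        return shared_features.get(key)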

Other than that, Ladon as a framework is built for multi-threaded environments.

Best regards, Jakob Simon-Gaarde

Upvotes: 1

Simon

Reputation: 12488

mod_wsgi launches one or more Python processes upon startup and leaves them running to handle requests. If you load a module or set a global variable, they'll still be there when you handle the next request - however, each Python process has its own separate block of memory, so if you configure mod_wsgi to launch 8 processes and load a 1 GB dataset, eventually you'll be using 8 GB of memory. Maybe you should consider using a database?

edit: Thanks Graham :-) So with only one process and multiple threads, you can share one copy of your huge dictionary between all worker threads.
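For example, in mod_wsgi's daemon mode that would look something like this (the directives are standard mod_wsgi; the group name, thread count and paths are placeholders):

WSGIDaemonProcess myservice processes=1 threads=15
WSGIProcessGroup myservice
WSGIScriptAlias /myservice /path/to/handler.wsgi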

Upvotes: 2
