Wboy

Reputation: 2552

Persistent library instead of repeated imports in Django

I'm using Django to host a machine learning service, which returns predictions when queried with parameters.

My problem is that every time a new request comes in, it has to import TensorFlow and all the various libraries all over again, which makes it really slow. (TensorFlow spews a bunch of messages whenever it's imported and takes about 4 seconds to load.)

Is there a way to make the libraries and models persistent?

Current Architecture (Only that service):

main_app/
    manage.py
    classifiers/
        __init__.py
        util.py
        views.py
        lstm_predictor.py

util.py: (TensorFlow is reloaded every time a new request comes in!)

from sklearn.externals import joblib
import pandas as pd
import xgboost as xgb
from keras.models import load_model
from keras.preprocessing import sequence
from nltk.corpus import stopwords
import os,calendar,re
import logging
from lstm_predictor import lstm_predict
logger = logging.getLogger(__name__)

# Load models here to avoid reload every time
ensemble_final_layer = joblib.load("final_ensemble_layer.pkl")
text_processor = joblib.load("text_processor.pkl")
lstm = load_model("LSTM_2017-07-18_V0")

views.py

import json, pdb, os, hashlib
import logging

from django.core.serializers.json import DjangoJSONEncoder
from django.http.response import HttpResponse
from django.shortcuts import render
from django.views.decorators.csrf import csrf_exempt
from classifiers.util import *

logger = logging.getLogger(__name__)


@csrf_exempt
def predict(request):
    result = get_prediction(params)
    result_hash = {"routing_prediction":result}
    data = json.dumps(result_hash, cls=DjangoJSONEncoder)
    return HttpResponse(data, content_type="application/json")

Is there somewhere I can shift the imports to, so that they're only loaded once when the app starts?

Thank you! :)

Upvotes: 0

Views: 224

Answers (2)

Grimmy

Reputation: 4137

This is not really a direct answer to the problem, but it points out a potentially critical issue with web APIs that rely on doing memory- and/or time-hungry work directly in Django views.

Web APIs are supposed to be light and respond as quickly as possible. They should also be easy to scale up with more processes without costing too many resources.

When using TensorFlow in Django directly, each Django process will initialise its own TensorFlow module and data files. These processes also tend to be restarted based on rules in the master process, but I don't know what the default behaviour is on Heroku. (I'm guessing they use uwsgi or gunicorn.)
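As a hypothetical illustration of those master-process rules (the question doesn't say which server Heroku actually runs): with gunicorn, `--preload` imports the app, and therefore TensorFlow, once in the master before forking workers, and `--max-requests` is one such worker-restart rule. The `main_app.wsgi` module path is assumed from the directory layout in the question, not shown in it.

```shell
# Hypothetical gunicorn invocation; main_app.wsgi is assumed from the layout.
# --preload      imports the app (and tensorflow) once before forking workers
# --max-requests recycles each worker after N requests (a "restart rule")
gunicorn --preload --workers 2 --max-requests 1000 main_app.wsgi
```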

A better way is to move the work to separate worker processes that wait for incoming jobs on a queue. Your predict view then just pushes a new job onto the queue and returns a unique job_id in the response (taking only a few milliseconds). The client using the API can then periodically poll the status of that job_id; when the job has completed successfully, it returns the JSON result.

This way you can have a really light and responsive api server. The number of workers can be scaled up and down depending on the needs. The workers can also run on a different server/container.

One way of achieving this is to use django_celery, but there are probably many other options.
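The push-a-job / poll-by-id flow above can be sketched in-process with just the standard library. This is only an illustration of the pattern, not a production setup (a real deployment would use Celery workers behind a broker); `submit`, `worker`, and the fake prediction are all made up for the example.

```python
import queue
import threading
import uuid

jobs = {}                      # job_id -> {"status": ..., "result": ...}
work_queue = queue.Queue()

def submit(params):
    """Enqueue a job and return its id immediately (the predict view's role)."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}
    work_queue.put((job_id, params))
    return job_id

def worker():
    """Worker loop: in a real setup the heavy model is loaded once, here."""
    while True:
        job_id, params = work_queue.get()
        # stand-in for the expensive model call (hypothetical)
        jobs[job_id] = {"status": "done",
                        "result": {"routing_prediction": sum(params)}}
        work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

job_id = submit([1, 2, 3])     # returns immediately with a pollable id
work_queue.join()              # a real client would poll jobs[job_id] instead
```

The important property is that `submit` never touches the model: the view stays fast no matter how slow the prediction is.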

Upvotes: 2

Daniel Roseman

Reputation: 599926

Django does not import modules on each request; a module is loaded only once per process. Normally, your server will start up multiple processes to serve your site; each one will import tensorflow - and any other modules - once only.
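A quick way to see this: a module's top-level code runs only the first time it is imported in a process, and later imports are served from `sys.modules`. The `fake_util` module below is a made-up stand-in for the question's util.py and its expensive model loading.

```python
import builtins
import importlib
import pathlib
import sys
import tempfile

# fake_util is a made-up module whose top level bumps a counter every time
# it actually executes -- a stand-in for util.py's joblib.load/load_model calls.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "fake_util.py").write_text(
    "import builtins\n"
    "builtins.load_calls = getattr(builtins, 'load_calls', 0) + 1\n"
)
sys.path.insert(0, str(tmp))

import fake_util                      # top-level code runs here, once
import fake_util                      # cached in sys.modules: no re-run
importlib.import_module("fake_util")  # still served from the cache

print(builtins.load_calls)            # -> 1: one load per process, not per import
```

So module-level loading, exactly as util.py already does it, pays the TensorFlow cost once per process, not once per request.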

Upvotes: 1
