NewbieProgrammer

Reputation: 142

TypeError: cannot pickle 'LockType' object

I am trying to create a Flask application with RQ and Redis that stores tasks whose results are the data scraped by the Playwright library.

What I am trying to do is create a single global Playwright browser instance, so that when different users request the data from Flask, the same instance is reused. But I run into a problem when sending the browser instance as an argument to the task function through q.enqueue.

Using monkey.patch_all() from gevent doesn't seem to help either.

My code is as follows (app.py):

from gevent import monkey
monkey.patch_all()

import redis
from rq import Queue
from playwright.sync_api import sync_playwright
from flask import (
    Flask,
    render_template,
    request,
    make_response,
)
from src.flask.utils import (
    return_map_info,
    get_cookie
)

# Global Playwright browser, intended to be shared across all requests
playwright = sync_playwright().start()
browser = playwright.chromium.launch(headless=True)
r = redis.Redis(host='localhost')
q = Queue(connection=r)
app = Flask(__name__)


@app.route('/add', methods=('GET', 'POST'))
def add_task():
    """
        This function is used to get the data from the post form
        and then add the task into the redis queue, which data will be
        returned later after the task is complete
    """
    jobs = q.jobs
    message = None
    if request.method == "POST":
        url = request.form['url']
        search_type = request.form['search_type']

        # Passing the live browser object here is what raises the
        # pickling error, since job arguments must be stored in Redis
        task = q.enqueue(return_map_info,
                         args=(browser,),
                         kwargs={
                             'url': url,
                             'type': search_type
                         })
        job_id = task.id
        cookie_key = get_cookie(request.cookies.get('cookieid'))
        jobs = q.jobs
        q_length = len(q)
        r.hset(cookie_key, url, job_id)

        message = f"The result is {task} and the jobs queued are {q_length}"
        resp = make_response(render_template(
            "add.html", message=message, jobs=jobs))

        resp.set_cookie("cookieid", cookie_key)
        return resp
    return render_template(
        "add.html", message=message, jobs=jobs)

Upvotes: 0

Views: 85

Answers (1)

Lemon Reddy

Reputation: 637

From what I understand, you want to show the scraped results to each and every user. If that is all you need, you don't need a queue: instead, find an appropriate way to store the scraped data, filter it based on the user input, and send it in the response.

But if you are running an RQ worker to scrape sites based on user input, you have to initialise the Playwright instance inside the worker, run the job that scrapes the data, and store the results in a database, which can later be used as described above.
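
For example, a minimal sketch of what the worker-side task could look like (tasks.py; scrape_map_info and the returned fields are hypothetical, adapt them to whatever return_map_info actually does):

from playwright.sync_api import sync_playwright


def scrape_map_info(url, search_type):
    """Runs inside the RQ worker, so the browser never crosses Redis."""
    # Launch Playwright inside the worker process; only the
    # serialisable result travels back through Redis
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        title = page.title()
        browser.close()
    # Return (or persist to a database) plain, picklable data only
    return {'url': url, 'type': search_type, 'title': title}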

RQ is a task queue built on Redis. A separate RQ worker process runs, listening on the configured Redis queues, so whatever input (args, kwargs) you give to a job must be serialisable in order to be stored in Redis; the listening worker then reads and deserialises that data to get the actual input. A live browser instance holds OS-level resources such as thread locks, which is exactly why pickling it fails with TypeError: cannot pickle 'LockType' object.
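
The enqueue call in app.py then only needs to pass plain, picklable values, for example (a sketch, assuming the hypothetical tasks module above lives at src.flask.tasks):

from src.flask.tasks import scrape_map_info  # hypothetical module path

task = q.enqueue(scrape_map_info,
                 args=(url,),
                 kwargs={'search_type': search_type})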

Upvotes: 1
