user1881957

Reputation: 3378

Is multiprocessing or threading appropriate in this case in Python/Django?

I have a function like this in Django:

import os
import subprocess

from django.contrib.auth.models import User
from django.shortcuts import render_to_response
from django.template import RequestContext

from myapp.models import File, Host  # app-specific models; "myapp" stands in for the real app name


def uploaded_files(request):
    global source
    global password
    global destination
    username = request.user.username
    log_id = request.user.id
    b = File.objects.filter(users_id=log_id, flag='F')  # Files flagged 'F' for this user; append .delete() to remove them
    source = '[email protected]:/home/sachet/my_files'
    password = 'password'
    destination = '/home/zurelsoft/my_files/'
    a = Host.objects.all()  # Lists hosts
    # List the remote directory over rsync (blocks until rsync exits)
    command = subprocess.Popen(['sshpass', '-p', password, 'rsync', '--recursive', source],
                               stdout=subprocess.PIPE)
    command = command.communicate()[0]
    lines = (x.strip() for x in command.split('\n'))
    remote = [x.split(None, 4)[-1] for x in lines if x]
    base_name = [os.path.basename(ok) for ok in remote]
    files_in_server = base_name[1:]
    total_files = len(files_in_server)
    # Dry run to read the size and date columns from rsync's listing
    info = subprocess.Popen(['sshpass', '-p', password, 'rsync', source, '--dry-run'],
                            stdout=subprocess.PIPE)
    information = info.communicate()[0]
    command = information.split()
    filesize = command[1]
    #st = int(os.path.getsize(filesize))
    #filesize = size(filesize, system=alternative)
    date = command[2]
    users_b = User.objects.all()
    return render_to_response('uploaded_files.html',
                              {'files': b, 'username': username, 'host': a,
                               'files_server': files_in_server, 'file_size': filesize,
                               'date': date, 'total_files': total_files,
                               'list_users': users_b},
                              context_instance=RequestContext(request))

The main use of the function is to transfer files from the server to the local machine and write the data into the database. What I want: there is a single file of 10GB which will take a long time to copy. Since the copying happens via rsync on the command line, I want to let the user use the other menus while the file is being transferred. How can I achieve that? For example, when the user presses OK, the file transfers on the command line, and I want to show the user a "The file is being transferred" message and stop the rolling cursor, or something like that. Is multiprocessing or threading appropriate in this case? Thanks

Upvotes: 0

Views: 277

Answers (4)

Sunny

Reputation: 143

Every web server has a facility for uploading files, and what it does for large files is divide the file into chunks and merge them after every chunk is received. What you can do here is have a hidden tag in your HTML page with a value attribute; whenever your upload web service returns an OK message, change that hidden value to something relevant, and also write a function that keeps reading the value of the hidden element to check whether the file upload has finished.
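The server-side half of that could be as small as a view the page polls to read the current value. A minimal sketch, assuming a hypothetical Transfer model and URL wiring that are not part of the question:

import json

from django.http import HttpResponse

from myapp.models import Transfer  # hypothetical status model

def transfer_flag(request, transfer_id):
    # The page's polling function reads this value into the hidden element
    status = Transfer.objects.get(pk=transfer_id)
    return HttpResponse(json.dumps({'done': status.done}),
                        content_type='application/json')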

Upvotes: 0

agoebel

Reputation: 401

RaviU's solutions would certainly work.

Another option is to call a blocking subprocess in its own Thread. This thread could be responsible for setting a flag or information (in memcache, the db, or just a file on the hard drive) as well as clearing it when the transfer is complete. Personally, there is no love lost between me and parsing rsync's stdout, so I usually just ask the OS for the file size.
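A minimal sketch of that approach, using a flag file on disk (the path and function names are illustrative, not from any answer):

import os
import subprocess
import threading

FLAG = '/tmp/transfer_in_progress'  # hypothetical flag location

def transfer(source, password, destination):
    open(FLAG, 'w').close()  # set the flag before starting
    try:
        # Blocking call: this worker thread waits, the request thread does not
        subprocess.call(['sshpass', '-p', password,
                         'rsync', '--recursive', source, destination])
    finally:
        os.remove(FLAG)  # clear the flag when the transfer finishes

def start_transfer(source, password, destination):
    t = threading.Thread(target=transfer, args=(source, password, destination))
    t.daemon = True  # do not keep the process alive just for the copy
    t.start()

def bytes_copied(destination, filename):
    # Ask the OS for the size instead of parsing rsync's stdout
    return os.path.getsize(os.path.join(destination, filename))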

Also, if you don't need the file absolutely ASAP, adding "-c" to do a checksum can be good for those giant files (source: personal experience trying to transfer giant video files over a spotty campus network).

I will say the one problem with all of the solutions so far is that they don't work for "N" files. Even if you make sure each file can only be transferred once at a time, if you have a lot of different files it will eventually bog down the system. You might be better off just using some sort of task queue unless you know it will only ever be the one file at a time. I haven't used one recently, but a quick Google search yielded Celery, which doesn't look too bad.

Upvotes: 0

Krzysztof Szularz

Reputation: 5249

What you need is Celery.

It lets you spawn the job as a parallel task and return the HTTP response right away.
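A minimal sketch of how that could look, assuming a configured Celery setup (shared_task is the Celery 3.1+ decorator; the view and template names are illustrative):

import subprocess

from celery import shared_task
from django.shortcuts import render_to_response

@shared_task
def transfer_file(source, password, destination):
    # Runs in a Celery worker process, outside the request/response cycle
    subprocess.call(['sshpass', '-p', password,
                     'rsync', '--recursive', source, destination])

def start_transfer(request):
    # delay() enqueues the job and returns immediately, so the browser
    # is not held open while the 10GB file copies
    transfer_file.delay(source, password, destination)  # globals from the question
    return render_to_response('transfer_started.html')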

Upvotes: 1

RaviU

Reputation: 1213

Assuming that function runs inside a view, your browser will time out before the 10GB file has finished transferring. Maybe you should re-think your architecture for this?

There are probably several ways to do this, but here are some that come to my mind right now:

One solution is to have an intermediary store the status of the file transfer. Before you begin the process that transfers the file, set a flag somewhere, like a database, saying the process has begun. Then, if you make your subprocess call blocking, wait for it to complete, check the output of the command if possible, and update the flag you set earlier.

Then have whatever front end you have poll the status of the file transfer.
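A minimal sketch of that flow, with a hypothetical Transfer model standing in for the intermediary (the front end would poll a view that returns the status field):

import subprocess

from myapp.models import Transfer  # hypothetical status model

def run_transfer(transfer_id, source, password, destination):
    Transfer.objects.filter(pk=transfer_id).update(status='STARTED')
    # Blocking call: returns only when rsync exits
    rc = subprocess.call(['sshpass', '-p', password,
                          'rsync', '--recursive', source, destination])
    Transfer.objects.filter(pk=transfer_id).update(
        status='DONE' if rc == 0 else 'FAILED')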

Another solution: if you make the subprocess call non-blocking, as in your example, you should use a thread which sits there reading stdout and updating an intermediary store which your front end can query to get a more 'real time' update of the transfer process.
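A minimal sketch of the non-blocking variant, reusing the hypothetical Transfer model (rsync's --progress output format varies, so treat the stored line as raw progress text):

import subprocess
import threading

from myapp.models import Transfer  # hypothetical status model

def watch(transfer_id, source, password, destination):
    proc = subprocess.Popen(['sshpass', '-p', password,
                             'rsync', '--progress', source, destination],
                            stdout=subprocess.PIPE)
    # Tail rsync's stdout and keep the latest line as the progress value
    for line in iter(proc.stdout.readline, ''):  # sentinel is b'' on Python 3
        Transfer.objects.filter(pk=transfer_id).update(progress=line.strip())
    proc.wait()
    Transfer.objects.filter(pk=transfer_id).update(status='DONE')

def start_watch(transfer_id, source, password, destination):
    # Start the watcher without blocking the request thread
    threading.Thread(target=watch,
                     args=(transfer_id, source, password, destination)).start()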

Upvotes: 1
