Reputation: 9969
I'm experimenting with multithreading for the first time. I'm using Queue.Queue and putting data into it after I've created a set of objects that inherit from threading.Thread. The script downloads a series of files and works perfectly well; it has proven much faster than my old one.
However, my thread begins with a print command to show that it has begun downloading. Just a simple "Downloading C:\foo.bar". When the queue is first created, all these print commands are stuck together, and then the newlines all appear afterwards.
Here's a basic idea of the code involved:
import Queue
import threading
import time

queue = Queue.Queue()


class ThreadDownload(threading.Thread):
    """Threaded Download"""

    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            data = self.queue.get()
            print ("Downloading {}".format(data))
            #download_file(data)
            time.sleep(10)
            self.queue.task_done()


for i in range(4):
    t = ThreadDownload(queue)
    t.setDaemon(True)
    t.start()

#for d in data:
for d in range(20):
    queue.put(d)

queue.join()
Note that download_file is a function from a third-party library that people are unlikely to know or have easy access to, so I've left it out; substitute any other time-consuming call when testing. Likewise, the form of data is irrelevant to the question, so I suggest using range in order to test easily.
Here's what output might look like:
Downloading C:\foo.barDownloading C:\foo.barDownloading C:\foo.barDownloading C:\foo.bar
Downloading C:\foo.bar
Downloading C:\foo.bar
Downloading C:\foo.bar
This seems to happen because the threads start their run simultaneously. If I add time.sleep(0.01) I can prevent it, but that's a hacky approach, and I'm concerned that the problem would recur if two downloads coincidentally started in the same split second.
Is there any way to actually enforce a separation here so I don't get this problem? I have heard that you shouldn't have threads handle UI, though that's usually in the context of something like redrawing a progress bar. I'm also not sure whether there's a convenient way to note when an item from the queue has been taken by a thread; perhaps I've missed it.
Upvotes: 0
Views: 1129
Reputation: 4325
An implementation of Sorin's answer:
# printt.py
from __future__ import annotations

from queue import Queue
import threading
from typing import Optional, TextIO


class _Param:
    """Holds the arguments of one deferred print() call."""

    def __init__(self,
                 *args,
                 sep: str=' ',
                 end: str='\n',
                 file: Optional[TextIO]=None,
                 flush: bool=False):
        self._args = args
        self._sep: str = sep
        self._end: str = end
        self._file: Optional[TextIO] = file
        self._flush: bool = flush

    @property
    def args(self):
        return self._args

    @property
    def sep(self) -> str:
        return self._sep

    @property
    def end(self) -> str:
        return self._end

    @property
    def file(self) -> Optional[TextIO]:
        return self._file

    @property
    def flush(self) -> bool:
        return self._flush


_print_queue: Queue[_Param] = Queue()


def _printer():
    # Only this thread ever calls print(), so output is never interleaved.
    while True:
        p = _print_queue.get()
        print(*p.args, sep=p.sep, end=p.end, file=p.file, flush=p.flush)


# Note: this thread is not a daemon and never returns, so it keeps the
# interpreter alive after the main thread finishes.
_print_task = threading.Thread(target=_printer)
_print_task.start()


def printt(*args,
           sep: str=' ',
           end: str='\n',
           file: Optional[TextIO]=None,
           flush: bool=False):
    """
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

    Thread safe print. Prints the values to a stream, or to sys.stdout by default.

    Optional keyword arguments:
    sep: string inserted between values, default a space.
    end: string appended after the last value, default a newline.
    file: a file-like object (stream); defaults to the current sys.stdout.
    flush: whether to forcibly flush the stream.
    """
    _print_queue.put(_Param(*args, sep=sep, end=end, file=file, flush=flush))
Obviously, for fully thread-safe printing, you have to use printt() everywhere you would have used print(), because it just serializes the use of print() under the hood.
Upvotes: 0
Reputation: 11968
Have a queue with a single thread (call it the console thread) that is responsible for writing the messages out. To write something, you generate the output, then put it in the queue, and the console thread will write it correctly when it gets to it.
This way there's a single thread responsible for writing to the console, and you can control exactly how things are output.
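A minimal sketch of that idea, written for Python 3 (so the module is queue rather than Queue). The names console_writer, worker, run_demo and the _STOP sentinel are made up for illustration and are not from the question; the download itself is left out, and each worker only reports which item it took.

```python
import queue
import sys
import threading

_STOP = object()  # hypothetical sentinel that tells the console thread to exit


def console_writer(messages, stream):
    """Console thread: the only place that ever writes to the stream."""
    while True:
        msg = messages.get()
        if msg is _STOP:
            break
        # Text and newline go out in one write call from one thread,
        # so messages from different workers cannot run together.
        stream.write(msg + "\n")


def worker(name, jobs, messages):
    """Take items until the job queue is empty, reporting each via the message queue."""
    while True:
        try:
            item = jobs.get_nowait()
        except queue.Empty:
            return
        messages.put("Downloading {} ({})".format(item, name))
        # The real download would happen here (omitted in this sketch).


def run_demo(stream, n_workers=4, n_jobs=20):
    jobs = queue.Queue()
    messages = queue.Queue()
    for i in range(n_jobs):
        jobs.put(i)

    console = threading.Thread(target=console_writer, args=(messages, stream))
    console.start()

    workers = [
        threading.Thread(target=worker, args=("worker-%d" % i, jobs, messages))
        for i in range(n_workers)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()

    # All workers are done, so everything they reported is already queued;
    # the sentinel lets the console thread finish after writing it all.
    messages.put(_STOP)
    console.join()


if __name__ == "__main__":
    run_demo(sys.stdout)
```

Because the workers never touch the stream themselves, each message arrives on its own line regardless of how closely together the threads start. The sentinel is one way to shut the console thread down cleanly; a daemon thread would also let the program exit, but messages still in the queue at that point could be lost.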
Upvotes: 3