Reputation: 11605
It all began last night when I was writing a script that required 8 or so packages, including pygame.mixer, which takes a few seconds to import on my computer.
This meant that before the script even started I had to wait 10 or so seconds for all the imports to load. Since I obviously want the script to be as fast as possible, could I start running the script while loading the imports in the background, with something like this:
import threading

def import_modules():
    import tkinter as tk
    from pygame import mixer
    import json
    import webbrowser
    print('imports finished')

a = threading.Thread(target=import_modules)
a.start()

for i in range(10000):
    print('Getting Modules')
So my question is:
Is this considered bad practice and will it cause problems?
If so, are there alternatives I could use?
Or is it OK to do this?
Upvotes: 4
Views: 4567
Reputation: 1959
I understand this is an old thread, but I was looking for a way to minimize the loading time of my application, and I wanted the user to see the GUI so they can interact with it while the other modules are being imported in the background.
I read some answers suggesting lazy import techniques, which I found complicated (for me). Then I stumbled on the suggestion here to use threading to import modules in the background, gave it a shot, and found it is a brilliant idea that fits my needs.
Below is the code for an example GUI application using PySimpleGUI. It asks the user to enter a URL and opens it in the default browser window. The only module required to do so is webbrowser
, so this job can be done while the other modules are still loading.
I added comments in this code to explain almost all parts; I hope it helps someone. Tested with Python 3.6 on Windows 10.
Please note: this is just dummy code as a showcase.
# import essentials first
import PySimpleGUI as sg
import time, threading

# global names referencing the imported modules; binding the imports to
# globals avoids the problem of the modules existing only in the
# importing function's local namespace
pg = None
js = None
wb = None

progress = 0  # for our progress bar

def importer():
    # we will simulate time-consuming modules with time.sleep()
    global progress, pg, js, wb

    progress = 10
    start = time.time()
    import pygame as pg
    time.sleep(3)
    print(f'done importing pygame mixer in {time.time()-start} seconds')

    progress = 40
    start = time.time()
    import webbrowser as wb
    time.sleep(2)
    print(f'done importing webbrowser in {time.time()-start} seconds')

    progress = 70
    start = time.time()
    import json as js
    time.sleep(10)
    print(f'done importing json in {time.time()-start} seconds')

    progress = 100
    print('imports finished')

# start our importer in a separate thread
threading.Thread(target=importer).start()

# main app
def main():
    # window layout
    layout = [[sg.Text('Enter url:', size=(15, 1)), sg.Input(default_text='https://google.com', size=(31, 1), key='url')],
              [sg.Text('Loading modules:', size=(15, 1), key='status'),
               sg.ProgressBar(max_value=100, orientation='horizontal', size=(20, 10), key='progress')],
              [sg.Button('Open url', disabled=True, key='open_url'), sg.Button('joysticks', disabled=True, key='joysticks'), sg.Cancel()]]

    window = sg.Window('test application for lazy imports', layout=layout)  # our window

    while True:  # main application loop
        event, values = window.Read(timeout=10)  # non-blocking read from our gui
        if event in [None, 'Cancel']:
            window.Close()
            break
        elif event == 'open_url':
            wb.open(values['url'])
        elif event == 'joysticks':
            # show the number of joysticks currently connected
            pg.init()
            n = pg.joystick.get_count()  # get count of joysticks
            sg.Popup(f'joysticks number currently connected to computer = {n}')

        # the buttons are disabled by default and get enabled once the
        # module each one needs has been imported
        if wb:
            window.Element('open_url').Update(disabled=False)
        if pg:
            window.Element('joysticks').Update(disabled=False)

        # progress bar
        window.Element('progress').UpdateBar(progress)
        if progress >= 100:
            window.Element('status').Update('Loading completed', background_color='green')

main()
Upvotes: 4
Reputation: 43505
If you are using CPython, this might not yield as much improvement as you'd expect.
CPython has a Global Interpreter Lock ("GIL") that ensures that only one thread at a time can be executing Python bytecode.
So whenever the import thread is executing Python code, the other thread is not running. However, a thread releases the GIL when it is, for example, waiting on I/O, and imports spend part of their time reading files from disk, so there will be some time savings because of that.
There is a difference of opinion as to whether tkinter is truly thread-safe. It is still considered wise to run the tkinter main loop in the original thread and not to invoke tkinter calls from other threads, because doing so can lead to crashes.
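To illustrate that advice, here is a minimal sketch of the usual workaround (the worker function and the 200 ms polling interval are invented for this example): the worker thread never touches tkinter itself, but puts its result on a queue.Queue that the main thread polls with after:
import queue
import threading
import tkinter as tk

results = queue.Queue()

def worker():
    # stand-in for a slow import or computation; it communicates
    # with the GUI only through the thread-safe queue
    import json  # placeholder for a slow module
    results.put('imports finished')

def poll_queue():
    try:
        message = results.get_nowait()
        label.config(text=message)  # safe: this runs in the main thread
    except queue.Empty:
        root.after(200, poll_queue)  # nothing yet; check again in 200 ms

root = tk.Tk()
label = tk.Label(root, text='loading...')
label.pack()

threading.Thread(target=worker).start()
root.after(200, poll_queue)
root.mainloop()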
The GIL can also cause problems for GUI programs. If you are using a second thread for a long-running calculation, the user interface might become less responsive. There are at least two possible solutions. The first is to split the long-running calculation into small pieces, each of which is scheduled with tkinter's after method. The second is to run the calculation in a different process.
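As a rough sketch of the first approach (the workload and chunk size below are invented for illustration), each slice of work reschedules itself with after, so the event loop keeps running in between:
import tkinter as tk

root = tk.Tk()
label = tk.Label(root, text='working...')
label.pack()

total = 0
remaining = 1_000_000  # invented amount of work

def do_chunk():
    global total, remaining
    # do a small slice of the calculation, then yield back to the event loop
    chunk = min(remaining, 10_000)
    total += sum(range(chunk))
    remaining -= chunk
    label.config(text=f'{remaining} steps left')
    if remaining:
        root.after(1, do_chunk)  # reschedule the next slice

root.after(1, do_chunk)
root.mainloop()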
Follow-up questions from the comments:
Is there anything else to speed up execution time?
The first thing you must do is measure: find out what exactly causes the problem. Then you can look into the problem areas and try to improve them.
Take module load times, for example: run your app under a profiler to see how long the module loads take and why.
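As a starting point, here is a minimal sketch of such a measurement (recent CPython versions also offer a built-in python -X importtime switch that prints a per-module breakdown):
import time

start = time.perf_counter()
import pygame.mixer  # the slow import under investigation
elapsed = time.perf_counter() - start
print(f'importing pygame.mixer took {elapsed:.2f} seconds')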
If pygame.mixer
takes too long to load, you could use your platform's native mixer. UNIX-like operating systems generally have a /dev/mixer
device, while MS Windows has different APIs for it. Using those definitely won't take 10 seconds.
There is a cost associated with this: you will lose portability between operating systems.
What are the alternatives?
Using multiple cores is the usual tactic for speeding things up. Currently, on CPython, the only general way to get code to run in parallel on multiple cores is with multiprocessing
or concurrent.futures
.
However, whether this tactic can work depends on the nature of your problem.
If your problem involves doing the same calculation over a huge set of data, it is relatively easy to parallelize. In that case you can expect a maximal speedup roughly proportional to the number of cores you use.
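For instance, a minimal sketch of that data-parallel case, assuming an invented CPU-bound work function:
from concurrent.futures import ProcessPoolExecutor

def work(n):
    # stand-in for a CPU-bound calculation on one piece of data
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    data = [1_000_000] * 8  # invented dataset
    with ProcessPoolExecutor() as pool:
        # each item is processed in a separate process, sidestepping the GIL
        results = list(pool.map(work, data))
    print(results)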
It could be that your problem consists of multiple steps, each of which depends on the result of a previous step. Such problems are serial in nature and are much harder to execute in parallel.
Another possible way to speed things up is to use a different Python implementation like PyPy. Or you could use Cython together with type annotations to convert performance-critical parts to compiled C code.
Upvotes: 3