Reputation: 33
I am trying to create an image array from scratch. I got the code running but it takes arrounds 30 secs to run it. I feel it could be faster by using numpy native functions. How can I do this?
import cv2
import numpy as np
import time
volumes = np.random.randint(low=0, high=200, size=10000)
print(volumes)
image_heigh = 128
image_width = 256
image_channel = 3
show_img = False
def nomralized(data, data_min, data_max, maximum_value):
nomamized_data = maximum_value * ((data - data_min) / (data_max - data_min))
return nomamized_data
start_time = time.time()
for ii in range(len(volumes)-image_width):
# ===================== part to optimize start
final_image = np.zeros((image_heigh, image_width, image_channel))
start = ii
end = ii + image_width
current_vols = volumes[start:end]
# nomalize data
vol_min = 0
vol_max = np.max(current_vols)
vol_norm = nomralized(data=current_vols,
data_min=vol_min,
data_max=vol_max,
maximum_value=image_heigh)
for xxx in range(image_width):
final_image[:int(vol_norm[xxx]), xxx, :] = 1
# ===================== part to optimize end
if show_img:
image = np.float32(final_image)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.imshow("ok", image)
cv2.waitKey(27)
print("total running time: ", (time.time() - start_time))
How can I do to make this image array creation faster? I need to create the image every timesteps because I want to simulate real live data stream that come every new timesteps.
This is why I would like to optimize only this part of the code :
for xxx in range(image_width):
final_image[:int(vol_norm[xxx]), xxx, :] = 1
How can I do this?
Upvotes: 3
Views: 512
Reputation: 16737
First simplest optimizations are next:
np.arange(...)
instead of inner loop.All these above optimizations give huge speedup (10x
times), and my running time is 2.6 sec
instead of 27 sec
before.
Also another very useful optimization that I didn't do is that you don't need to recompute previous image pixels in a case when max/min of whole data within current window didn't change. You need to recompute previous image data only in the case if max/min changed. And I expect that your real-life data is gradually changing like Forex or Bitcoin prices, hence max/min change within a window is very non-often.
Optimizations 1)-3) mentioned above are implemented in the next code:
import cv2
import numpy as np
import time
volumes = np.random.randint(low=0, high=200, size=10000)
print(volumes)
image_heigh = 128
image_width = 256
image_channel = 3
show_img = False
def nomralized(data, data_min, data_max, maximum_value):
nomamized_data = maximum_value * ((data - data_min) / (data_max - data_min))
return nomamized_data
start_time = time.time()
aranges = np.arange(image_heigh, dtype = np.int32)[:, None]
for ii in range(len(volumes)-image_width):
# ===================== part to optimize start
#final_image = np.zeros((image_heigh, image_width, image_channel), dtype = np.float32)
start = ii
end = ii + image_width
current_vols = volumes[start:end]
# nomalize data
vol_min = 0
vol_max = np.max(current_vols)
vol_norm = nomralized(data=current_vols,
data_min=vol_min,
data_max=vol_max,
maximum_value=image_heigh)
final_image = (aranges < vol_norm[None, :].astype(np.int32)).astype(np.uint8) * 255
# ===================== part to optimize end
if show_img:
cv2.imshow('ok', final_image)
cv2.waitKey(27)
print("total running time: ", (time.time() - start_time))
For above code I just did one more optimization of inner loop which speed-up code above even 2x
times more to have timings of 1.3 sec
. But also I put back 3 channels plus float32, this reduced speed resulting in final 2.8 sec
, here is the code
Another next optimization is possible if re-computing old images data is not needed.
Main thing to be optimized was that you were re-computing almost same whole image on each step with 1 pixel shift-step along width. Instead of this you need to compute whole image once, then shift right not 1 pixel but whole image width.
Then after this optimization running time is 0.08 sec
.
And do 1 pixel stepping only for showing animation, not for computing image data, image data should be computed just once if you need speed.
import cv2
import numpy as np
import time
volumes = np.random.randint(low=0, high=200, size=10000)
print(volumes)
image_heigh = 128
image_width = volumes.size #256
image_channel = 3
screen_width = 256
show_img = False
def nomralized(data, data_min, data_max, maximum_value):
nomamized_data = maximum_value * ((data - data_min) / (data_max - data_min))
return nomamized_data
start_time = time.time()
for ii in range(0, len(volumes), image_width):
# ===================== part to optimize start
final_image = np.zeros((image_heigh, image_width, image_channel))
start = ii
end = ii + image_width
current_vols = volumes[start:end]
# nomalize data
vol_min = 0
vol_max = np.max(current_vols)
vol_norm = nomralized(data=current_vols,
data_min=vol_min,
data_max=vol_max,
maximum_value=image_heigh)
for xxx in range(image_width):
final_image[:int(vol_norm[xxx]), xxx, :] = 1
# ===================== part to optimize end
if show_img:
for start in range(0, final_image.shape[1] - screen_width):
image = np.float32(final_image[:, start : start + screen_width])
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.imshow("ok", image)
cv2.waitKey(27)
print("total running time: ", (time.time() - start_time))
I also created animation image out of your data:
If you want to create same animation just append next piece of code to the end of script above:
# Needs: python -m pip install pillow
import PIL.Image
imgs = [PIL.Image.fromarray(final_image[:, start : start + screen_width].astype(np.uint8) * 255) for start in range(0, final_image.shape[1] - screen_width, 6)]
imgs[0].save('result.png', append_images = imgs[1:], save_all = True, lossless = True, duration = 100)
I've implemented also simulation of real-time live stream data rendering/visualizing.
live_stream()
generator spits out random amount of data at random points of time, this is to simulate data generation process.stream_fetcher()
listens to live stream and records all data received to python queue q0
, this fetcher is run in one thread.renderer()
gets data recorded by fetcher and renders it into image through your mathematical formulas and normalization process, it renders as much data as available, resulting in images with varying widths, rendered images are saved to another queue q1
.visualizer()
visualizes rendered data by fetching as much rendered images as available.All functions run in separate threads not to block whole process. Also if any of threads works to slow then it skips some of data to catch-up with current real-time data, thus every queue doesn't overflow.
Also you may see that visualized process is jumpy, it is not because functions are somewhat slow, but because live stream spits out different amount of data in each time step, this is how usually real-time data may behave.
In the next code I did also extra optimization mentioned before, that is not-recomputing image if min/max didn't change.
import cv2, numpy as np
import time, random, threading, queue
image_height = 256
image_width = 512
# Make results reproducible and deterministic
np.random.seed(0)
random.seed(0)
def live_stream():
last = 0.
while True:
a = np.random.uniform(low = -1., high = 1., size = random.randint(1, 20)).astype(np.float64).cumsum() + last
yield a
last = a[-1]
time.sleep(random.random() * 0.1)
q0 = queue.Queue()
def stream_fetcher():
for e in live_stream():
q0.put(e)
threading.Thread(target = stream_fetcher, daemon = True).start()
aranges = np.arange(image_height, dtype = np.int32)[:, None]
q1 = queue.Queue()
def renderer():
def normalized(data, data_min, data_max, maximum_value):
nomamized_data = maximum_value * ((data - data_min) / (data_max - data_min))
return nomamized_data
prev_image = np.zeros((image_height, 0), dtype = np.uint8)
prev_vols = np.zeros((0,), dtype = np.float64)
while True:
data = []
data.append(q0.get())
try:
while True:
data.append(q0.get(block = False))
except queue.Empty:
pass
vols = np.concatenate(data)[-image_width:]
prev_vols = prev_vols[-(image_width - vols.size) or prev_vols.size:]
concat_vols = np.concatenate((prev_vols, vols))[-image_width:]
vols_min, vols_max = np.amin(concat_vols), np.amax(concat_vols)
if prev_vols.size > 0 and (vols_min < np.amin(prev_vols) - 10 ** -8 or vols_max > np.amax(prev_vols) + 10 ** -8):
vols = concat_vols
prev_image = prev_image[:, :-prev_vols.size]
prev_vols = prev_vols[:0]
vols_norm = normalized(
data = vols, data_min = vols_min,
data_max = vols_max, maximum_value = image_height,
)
image = (aranges < vols_norm.astype(np.int32)[None, :]).astype(np.uint8) * 255
whole_image = np.concatenate((prev_image, image), axis = 1)[:, -image_width:]
q1.put(whole_image)
prev_image = whole_image
prev_vols = concat_vols
threading.Thread(target = renderer, daemon = True).start()
def visualizer():
imgs = []
while True:
data = []
data.append(q1.get())
try:
while True:
data.append(q1.get(block = False))
except queue.Empty:
pass
image = np.concatenate(data, axis = 1)[:, -image_width:]
cv2.imshow('ok', image)
cv2.waitKey(1)
if imgs is not None:
try:
# Needs: python -m pip install pillow
import PIL.Image
has_pil = True
except:
has_pil = False
imgs = None
if has_pil:
imgs.append(PIL.Image.fromarray(np.pad(image, ((0, 0), (image_width - image.shape[1], 0)), constant_values = 0)))
if len(imgs) >= 1000:
print('saving...', flush = True)
imgs[0].save('result.png', append_images = imgs[1:], save_all = True, lossless = True, duration = 100)
imgs = None
print('saved!', flush = True)
threading.Thread(target = visualizer, daemon = True).start()
while True:
time.sleep(0.1)
Above live process simulation is rendered into result.png
which I show down below:
I've also decided to improve visualization, by using more advanced matplotlib
instead of cv2
to be able to show axes and doing real-time plot drawing. Visualization image is down below:
Next is a matplotlib-based code corresponding to last image above:
import cv2, numpy as np
import time, random, threading, queue
image_height = 256
image_width = 512
save_nsec = 20
dpi, fps = 100, 15
# Make results reproducible and deterministic
np.random.seed(0)
random.seed(0)
def live_stream():
last = 0.
pos = 0
while True:
a = np.random.uniform(low = -1., high = 1., size = random.randint(1, 30)).astype(np.float64).cumsum() + last
yield a, pos, pos + a.size - 1
pos += a.size
last = a[-1]
time.sleep(random.random() * 2.2 / fps)
q0 = queue.Queue()
def stream_fetcher():
for e in live_stream():
q0.put(e)
threading.Thread(target = stream_fetcher, daemon = True).start()
aranges = np.arange(image_height, dtype = np.int32)[:, None]
q1 = queue.Queue()
def renderer():
def normalized(data, data_min, data_max, maximum_value):
nomamized_data = maximum_value * ((data - data_min) / (data_max - data_min))
return nomamized_data
prev_image = np.zeros((image_height, 0), dtype = np.uint8)
prev_vols = np.zeros((0,), dtype = np.float64)
while True:
data = []
data.append(q0.get())
try:
while True:
data.append(q0.get(block = False))
except queue.Empty:
pass
data_vols = [e[0] for e in data]
data_minx, data_maxx = data[0][1], data[-1][2]
vols = np.concatenate(data_vols)[-image_width:]
prev_vols = prev_vols[-(image_width - vols.size) or prev_vols.size:]
concat_vols = np.concatenate((prev_vols, vols))[-image_width:]
vols_min, vols_max = np.amin(concat_vols), np.amax(concat_vols)
if prev_vols.size > 0 and (vols_min < np.amin(prev_vols) - 10 ** -8 or vols_max > np.amax(prev_vols) + 10 ** -8):
vols = concat_vols
prev_image = prev_image[:, :-prev_vols.size]
prev_vols = prev_vols[:0]
vols_norm = normalized(
data = vols, data_min = vols_min,
data_max = vols_max, maximum_value = image_height,
)
image = (aranges < vols_norm.astype(np.int32)[None, :]).astype(np.uint8) * 255
whole_image = np.concatenate((prev_image, image), axis = 1)[:, -image_width:]
q1.put((whole_image, data_maxx - whole_image.shape[1] + 1, data_maxx, vols_min, vols_max))
prev_image = whole_image
prev_vols = concat_vols
threading.Thread(target = renderer, daemon = True).start()
def visualizer():
import matplotlib.pyplot as plt, matplotlib.animation
def images():
while True:
data = []
data.append(q1.get())
try:
while True:
data.append(q1.get(block = False))
except queue.Empty:
pass
minx = min([e[1] for e in data])
maxx = min([e[2] for e in data])
miny = min([e[3] for e in data])
maxy = min([e[4] for e in data])
image = np.concatenate([e[0] for e in data], axis = 1)[:, -image_width:]
image = np.pad(image, ((0, 0), (image_width - image.shape[1], 0)), constant_values = 0)
image = np.repeat(image[:, :, None], 3, axis = -1)
yield image, minx, maxx, miny, maxy
it = images()
im = None
fig = plt.figure(figsize = (image_width / dpi, image_height / dpi), dpi = dpi)
def animate_func(i):
nonlocal it, im, fig
image, minx, maxx, miny, maxy = next(it)
print(f'.', end = '', flush = True)
if im is None:
im = plt.imshow(image, interpolation = 'none', aspect = 'auto')
else:
im.set_array(image)
im.set_extent((minx, maxx, miny, maxy))
return [im]
anim = matplotlib.animation.FuncAnimation(fig, animate_func, frames = round(save_nsec * fps), interval = 1000 / fps)
print('saving...', end = '', flush = True)
#anim.save('result.mp4', fps = fps, dpi = dpi, extra_args = ['-vcodec', 'libx264'])
anim.save('result.gif', fps = fps, dpi = dpi, writer = 'imagemagick')
print('saved!', end = '', flush = True)
plt.show()
threading.Thread(target = visualizer, daemon = True).start()
while True:
time.sleep(0.1)
Then I've decided to play a bit and colored last image with RGB palette, the higher the peak is more red-ish it is, if it is more in the middle then it is more green-ish, if it is low enough then it is more blue-ish. Resulting image below was achieved by this coloring code:
And another one colored animation below, line-style instead of bar-style, with the help of this code:
Upvotes: 4