gaut

Reputation: 5958

PyTorch with yolov5: color channel and result display

I have a script that grabs an application's screenshot and displays it. It works quite nicely on my machine, like a video at around 60 FPS. Now I want to run a YOLOv5 object detection model on these frames with PyTorch Hub, as advised here.

The following works:

import os
os.getcwd()
from PIL import ImageGrab
import numpy as np
import cv2
import pyautogui
import win32gui
import time
from mss import mss
from PIL import Image
import tempfile
os.system('calc')
sct = mss()
xx=1
tstart = time.time()
while xx<10000:
    hwnd = win32gui.FindWindow(None, 'Calculator')
    left_x, top_y, right_x, bottom_y = win32gui.GetWindowRect(hwnd)
    #screen = np.array(ImageGrab.grab( bbox = (left_x, top_y, right_x, bottom_y ) ) )
    bbox = {'top': top_y, 'left': left_x, 'width': right_x-left_x, 'height':bottom_y-top_y }
    screen = sct.grab(bbox)
    scr = np.array(screen)
    
    cv2.imshow('window', scr)
    if cv2.waitKey(25) & 0xFF == ord('q'):
        cv2.destroyAllWindows()
        break
    xx+=1
cv2.destroyAllWindows()
tend = time.time()
print(xx/(tend-tstart))
print((tend-tstart))
os.system('taskkill /f /im calculator.exe')

Below I try to import torch and use my previously trained model:

screen = sct.grab(bbox)
scr = np.array(screen)    
result = model(scr, size=400)  
result.save("test.png") #this gives a TypeError: save() takes 1 positional argument but 2 were given
result.show() #this opens a new Paint instance for every frame instead of keeping the same window. 
# The shown image is also in a wrong color channel
scr = cv2.imread("test.png")
# How can I use the `result` as argument to cv2.imshow(),
# without saving to disk if possible?

My questions:

  1. result.show() shows an image with the wrong color channel order compared to cv2.imshow(). How can I ensure that the image fed to the model has the correct channel order?
  2. Classification and detection performance drops drastically compared to the validation results from training, perhaps because of 1?
  3. How can I display the result image with bounding boxes in a single window, the way cv2.imshow() does? (result.show() opens a new Paint process for each frame.) How can I save this result image to disk, and where can I find more documentation on how to interact with model objects in general?

Upvotes: 1

Views: 2761

Answers (3)

Elmir

Reputation: 180

Answer to 3: You can draw the boxes and labels onto the image with render(), then display the result in a window of your choice with cv2.imshow():

renderedResult = result.render()
cv2.imshow("Result", renderedResult[0])
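One caveat, sketched below under the assumption that the model was fed an RGB frame: the arrays returned by render() would then also be in RGB order, while cv2.imshow() interprets its input as BGR, so the channels may need flipping back before display. The helper name rgb_to_bgr is hypothetical and only NumPy is used; a synthetic array stands in for a rendered frame.

```python
import numpy as np

def rgb_to_bgr(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (RGB <-> BGR)."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    # Reversed slicing yields a negatively strided view; copying it into a
    # contiguous array avoids layout problems in calls that require one.
    return np.ascontiguousarray(image[:, :, ::-1])

# Stand-in for one rendered frame: a 2 x 2 pure-red image in RGB order.
rendered_rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rendered_rgb[:, :, 0] = 255

# After the flip, red sits in the last channel, as OpenCV expects.
display_bgr = rgb_to_bgr(rendered_rgb)
```

In the capture loop this would be used as cv2.imshow("Result", rgb_to_bgr(renderedResult[0])), followed by the same cv2.waitKey() check as in the question.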

Upvotes: 1

Glenn Jocher

Reputation: 86

I believe the cvtColor operation should be identical to the channel order inversion shown in the YOLOv5 PyTorch Hub tutorial. The following prints True in the two environments I tested (a Colab notebook with Python 3.6 and macOS with Python 3.9):

import cv2
import numpy as np

file = 'data/images/bus.jpg'
im1 = cv2.imread(file)[:, :, ::-1]
im2 = cv2.cvtColor(cv2.imread(file), cv2.COLOR_BGR2RGB)
print(np.allclose(im1, im2))

Upvotes: 1

gaut

Reputation: 5958

The following worked:

result = model(cv2.cvtColor(scr, cv2.COLOR_BGR2RGB), size=400)

This solved the accuracy problem. result.save() has predefined output names which are not currently changeable; it takes no arguments. result.show() shows the correct colors when the model is fed an image in the correct channel order.
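For reference, np.array() on an mss screenshot gives an H x W x 4 array in BGRA order, so the conversion above both drops the alpha channel and reverses the remaining three. A minimal NumPy-only sketch of the same step follows; the helper name bgra_to_rgb is hypothetical, and a synthetic array stands in for a captured frame.

```python
import numpy as np

def bgra_to_rgb(frame: np.ndarray) -> np.ndarray:
    """Convert an H x W x 4 BGRA frame to an H x W x 3 RGB frame."""
    if frame.ndim != 3 or frame.shape[2] != 4:
        raise ValueError("expected an H x W x 4 BGRA frame")
    # Channels 2, 1, 0 are R, G, B; the alpha channel (index 3) is dropped.
    # The copy makes the result contiguous instead of a strided view.
    return np.ascontiguousarray(frame[:, :, 2::-1])

# Stand-in for a captured frame, with a distinct value per channel.
frame = np.zeros((2, 3, 4), dtype=np.uint8)
frame[:, :, 0] = 10   # blue
frame[:, :, 1] = 20   # green
frame[:, :, 2] = 30   # red
frame[:, :, 3] = 255  # alpha

rgb = bgra_to_rgb(frame)
```

The output of bgra_to_rgb(scr) could then be passed to the model in place of the cvtColor() result.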

Upvotes: 2
