OpenVINO inference is running much slower than CPU when using integrated Intel(R) UHD Graphics 620 (iGPU)

Question

Environment: Python 3.10.0, Windows 10, OpenVINO 2024.3.0

Available devices:
CPU
IMMUTABLE PROPERTIES:
AVAILABLE_DEVICES : ""
RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
RANGE_FOR_STREAMS : 1 8
EXECUTION_DEVICES : CPU
FULL_DEVICE_NAME : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
OPTIMIZATION_CAPABILITIES : FP32 FP16 INT8 BIN EXPORT_IMPORT
DEVICE_TYPE : integrated
DEVICE_ARCHITECTURE : intel64
MUTABLE PROPERTIES:
NUM_STREAMS : 1
AFFINITY : NONE
INFERENCE_NUM_THREADS : 0
PERF_COUNT : NO
INFERENCE_PRECISION_HINT : f32
PERFORMANCE_HINT : LATENCY
EXECUTION_MODE_HINT : PERFORMANCE
PERFORMANCE_HINT_NUM_REQUESTS : 0
ENABLE_CPU_PINNING : YES
SCHEDULING_CORE_TYPE : ANY_CORE
MODEL_DISTRIBUTION_POLICY : ""
ENABLE_HYPER_THREADING : YES
DEVICE_ID : ""
CPU_DENORMALS_OPTIMIZATION : NO
LOG_LEVEL : LOG_NONE
CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE : 1
DYNAMIC_QUANTIZATION_GROUP_SIZE : 0
KV_CACHE_PRECISION : f16

GPU
IMMUTABLE PROPERTIES:
AVAILABLE_DEVICES : 0
RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
RANGE_FOR_STREAMS : 1 2
OPTIMAL_BATCH_SIZE : 1
MAX_BATCH_SIZE : 1
DEVICE_ARCHITECTURE : GPU: vendor=0x8086 arch=v9.0.0
FULL_DEVICE_NAME : Intel(R) UHD Graphics 620 (iGPU)
DEVICE_UUID : 00000000000000000000000000000000
DEVICE_LUID : 0000000000000000
DEVICE_TYPE : integrated
DEVICE_GOPS : {f16:844.8,f32:422.4,i8:422.4,u8:422.4}
OPTIMIZATION_CAPABILITIES : FP32 BIN FP16 EXPORT_IMPORT
GPU_DEVICE_TOTAL_MEM_SIZE : 3379195904
GPU_UARCH_VERSION : 9.0.0
GPU_EXECUTION_UNITS_COUNT : 24
GPU_MEMORY_STATISTICS : ""
MUTABLE PROPERTIES:
PERF_COUNT : NO
MODEL_PRIORITY : MEDIUM
GPU_HOST_TASK_PRIORITY : MEDIUM
GPU_QUEUE_PRIORITY : MEDIUM
GPU_QUEUE_THROTTLE : MEDIUM
GPU_ENABLE_LOOP_UNROLLING : YES
GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
CACHE_DIR : ""
CACHE_MODE : optimize_speed
PERFORMANCE_HINT : LATENCY
EXECUTION_MODE_HINT : PERFORMANCE
COMPILATION_NUM_THREADS : 8
NUM_STREAMS : 1
PERFORMANCE_HINT_NUM_REQUESTS : 0
INFERENCE_PRECISION_HINT : f16
ENABLE_CPU_PINNING : NO
DEVICE_ID : 0

The IR model is based on CodeFormer and compressed to FP16.

Configuration

import openvino as ov
import openvino.properties.hint as hints

core = ov.Core()

# Per-device hints: latency-oriented, performance execution mode
device_property = {
    "GPU": {
        hints.execution_mode: hints.ExecutionMode.PERFORMANCE,
        hints.performance_mode: hints.PerformanceMode.LATENCY,
        hints.inference_precision: ov.Type.f16,
    },
    "CPU": {
        hints.execution_mode: hints.ExecutionMode.PERFORMANCE,
        hints.performance_mode: hints.PerformanceMode.LATENCY,
        hints.inference_precision: ov.Type.f32,
    },
}

# Device priorities for HETERO; this only applies when compiling with device_name="HETERO"
core.set_property("HETERO", {"MULTI_DEVICE_PRIORITIES": "GPU,CPU"})
core.set_property("GPU", device_property["GPU"])
core.set_property("CPU", device_property["CPU"])

When I run inference with the model compiled for the CPU:

compiled_model = core.compile_model(model=model, device_name="CPU")

the time taken is 4.196 s. With the model compiled for the GPU:

compiled_model = core.compile_model(model=model, device_name="GPU")

the time taken is 22.595 s.

Is this caused by a wrong configuration or by the limitations of the integrated GPU? Are there any other suggestions to improve performance?
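The timings above do not say whether they include model compilation or only the inference call, so here is a minimal sketch of how the two could be measured separately. The helper name benchmark_device and the warmup and runs parameters are hypothetical and chosen only for illustration; the only OpenVINO calls it relies on are core.compile_model(...) and calling the compiled model directly with the inputs.

```python
import time
from statistics import median


def benchmark_device(core, model, device_name, inputs, warmup=2, runs=5):
    """Return (compile_seconds, median_inference_seconds) for one device.

    `core` is expected to behave like ov.Core() and `model` like the object
    passed to compile_model in the question (assumption, not verified here).
    """
    # Time compilation on its own so it does not inflate the inference number.
    start = time.perf_counter()
    compiled_model = core.compile_model(model=model, device_name=device_name)
    compile_seconds = time.perf_counter() - start

    # Warm-up calls are discarded; the first calls may include one-time setup.
    for _ in range(warmup):
        compiled_model(inputs)

    # Time each subsequent call and report the median to reduce noise.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        compiled_model(inputs)
        timings.append(time.perf_counter() - start)

    return compile_seconds, median(timings)
```

Calling benchmark_device(core, model, "CPU", inputs) and benchmark_device(core, model, "GPU", inputs) with the same inputs would show whether the gap comes from compilation or from the inference call itself.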

