Prashant

Reputation: 921

OpenVINO inference runs much slower on the integrated Intel(R) UHD Graphics 620 (iGPU) than on the CPU

Python: 3.10.0, Windows: 10, OpenVINO: 2024.3.0

Available devices:
CPU
IMMUTABLE PROPERTIES:
AVAILABLE_DEVICES : ""
RANGE_FOR_ASYNC_INFER_REQUESTS : 1 1 1
RANGE_FOR_STREAMS : 1 8
EXECUTION_DEVICES : CPU
FULL_DEVICE_NAME : Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
OPTIMIZATION_CAPABILITIES : FP32 FP16 INT8 BIN EXPORT_IMPORT
DEVICE_TYPE : integrated
DEVICE_ARCHITECTURE : intel64
MUTABLE PROPERTIES:
NUM_STREAMS : 1
AFFINITY : NONE
INFERENCE_NUM_THREADS : 0
PERF_COUNT : NO
INFERENCE_PRECISION_HINT : f32
PERFORMANCE_HINT : LATENCY
EXECUTION_MODE_HINT : PERFORMANCE
PERFORMANCE_HINT_NUM_REQUESTS : 0
ENABLE_CPU_PINNING : YES
SCHEDULING_CORE_TYPE : ANY_CORE
MODEL_DISTRIBUTION_POLICY : ""
ENABLE_HYPER_THREADING : YES
DEVICE_ID : ""
CPU_DENORMALS_OPTIMIZATION : NO
LOG_LEVEL : LOG_NONE
CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE : 1
DYNAMIC_QUANTIZATION_GROUP_SIZE : 0
KV_CACHE_PRECISION : f16

GPU
IMMUTABLE PROPERTIES:
AVAILABLE_DEVICES : 0
RANGE_FOR_ASYNC_INFER_REQUESTS : 1 2 1
RANGE_FOR_STREAMS : 1 2
OPTIMAL_BATCH_SIZE : 1
MAX_BATCH_SIZE : 1
DEVICE_ARCHITECTURE : GPU: vendor=0x8086 arch=v9.0.0
FULL_DEVICE_NAME : Intel(R) UHD Graphics 620 (iGPU)
DEVICE_UUID : 00000000000000000000000000000000
DEVICE_LUID : 0000000000000000
DEVICE_TYPE : integrated
DEVICE_GOPS : {f16:844.8,f32:422.4,i8:422.4,u8:422.4}
OPTIMIZATION_CAPABILITIES : FP32 BIN FP16 EXPORT_IMPORT
GPU_DEVICE_TOTAL_MEM_SIZE : 3379195904
GPU_UARCH_VERSION : 9.0.0
GPU_EXECUTION_UNITS_COUNT : 24
GPU_MEMORY_STATISTICS : ""
MUTABLE PROPERTIES:
PERF_COUNT : NO
MODEL_PRIORITY : MEDIUM
GPU_HOST_TASK_PRIORITY : MEDIUM
GPU_QUEUE_PRIORITY : MEDIUM
GPU_QUEUE_THROTTLE : MEDIUM
GPU_ENABLE_LOOP_UNROLLING : YES
GPU_DISABLE_WINOGRAD_CONVOLUTION : NO
CACHE_DIR : ""
CACHE_MODE : optimize_speed
PERFORMANCE_HINT : LATENCY
EXECUTION_MODE_HINT : PERFORMANCE
COMPILATION_NUM_THREADS : 8
NUM_STREAMS : 1
PERFORMANCE_HINT_NUM_REQUESTS : 0
INFERENCE_PRECISION_HINT : f16
ENABLE_CPU_PINNING : NO
DEVICE_ID : 0

The IR model is based on CodeFormer and compressed to FP16.

Configuration

import openvino as ov
import openvino.properties.hint as hints
core = ov.Core()
# Per-device hints: favor performance and low latency on both devices
device_property = {
    "GPU": {
        hints.execution_mode: hints.ExecutionMode.PERFORMANCE,
        hints.performance_mode: hints.PerformanceMode.LATENCY,
        hints.inference_precision: ov.Type.f16,
    },
    "CPU": {
        hints.execution_mode: hints.ExecutionMode.PERFORMANCE,
        hints.performance_mode: hints.PerformanceMode.LATENCY,
        hints.inference_precision: ov.Type.f32,
    },
}

core.set_property("HETERO", {"MULTI_DEVICE_PRIORITIES": "GPU,CPU"})
core.set_property("GPU", device_property["GPU"])
core.set_property("CPU", device_property["CPU"])

When I run inference with

compiled_model = core.compile_model(model=model, device_name="CPU")

the time taken is 4.196 s, whereas with

compiled_model = core.compile_model(model=model, device_name="GPU")

the time taken is 22.595 s.
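The post does not say whether these timings cover a single call or an average, or whether they include one-off setup costs. To make the two numbers easier to compare, here is a minimal sketch of a timing helper that reports warm-up calls separately from repeated runs. The helper name `benchmark` and its parameters are hypothetical and not part of OpenVINO; the workload in the demo is a stand-in, not the CodeFormer model.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(infer_fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> Dict[str, float]:
    """Time a zero-argument callable, keeping warm-up calls apart from measured runs.

    `benchmark`, `warmup` and `runs` are illustrative names, not an OpenVINO API.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls are timed as a block so any one-off cost stays visible.
    start = time.perf_counter()
    for _ in range(warmup):
        infer_fn()
    warmup_total = time.perf_counter() - start

    # Measured runs are timed individually.
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer_fn()
        timings.append(time.perf_counter() - start)

    return {
        "warmup_total_s": warmup_total,
        "mean_s": statistics.mean(timings),
        "median_s": statistics.median(timings),
        "min_s": min(timings),
    }


if __name__ == "__main__":
    # Stand-in workload (assumption); replace with the real inference call.
    result = benchmark(lambda: sum(range(100_000)), warmup=1, runs=5)
    for key, value in result.items():
        print(f"{key}: {value:.6f}")
```

With the compiled models from this question, the call could look like `benchmark(lambda: compiled_model(inputs))`, where `inputs` is a placeholder for the actual input tensor. If the warm-up time is far larger than the median, much of the gap may come from first-run overhead rather than steady-state inference.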

Is this caused by a wrong configuration or by the limitations of the integrated GPU? Are there any other suggestions for improving the performance?

Upvotes: 1

Views: 172

Answers (0)
