Configuring Compilation with OpenVINO
```python
detector = core.compile_model(model=detector_ir,
                              device_name='CPU',
                              config={"PERFORMANCE_HINT": "LATENCY",
                                      "NUM_STREAMS": 3,
                                      "INFERENCE_NUM_THREADS": 6,
                                      "AFFINITY": "NUMA"})
segmentor = core.compile_model(model=segmentor_ir,
                               device_name='CPU',
                               config={"PERFORMANCE_HINT": "LATENCY",
                                       "NUM_STREAMS": 3,
                                       "INFERENCE_NUM_THREADS": 6,
                                       "AFFINITY": "NUMA"})
```

Even though all supported devices in OpenVINO™ offer low-level performance settings, using them is not recommended outside of a few special cases. The preferred way to configure performance in OpenVINO Runtime is with performance hints. Hints are a future-proof solution, fully compatible with the automatic device selection inference mode and designed with portability in mind.
The hints also put the configuration flow in the right order. Instead of mapping application needs onto low-level performance settings, and maintaining application logic to configure each possible device separately, a hint expresses the target scenario with a single config key and lets the device configure itself in response.
While an application is free to create more requests if needed (for example, to support asynchronous input population), it is important to run at least ov::optimal_number_of_infer_requests inference requests in parallel; this is recommended for efficiency, i.e. full device utilization.
| Configuration | Number of inference requests | FPS achieved (avg) |
|---|---|---|
| Performance Hint: Throughput | 4 for each model | 7.5879 |
| Performance Hint: Latency | 4 for each model | 7.6491 |
| Streams: 6, Threads: 8, Affinity: NUMA | 4 for each model | 9.4229 |
| Performance Hint: Latency, Multi-Stream | 4 for each model | 10.077 |
(all results were measured on a CPU with 6 cores)
For more information, visit the OpenVINO Docs.