Best practices for nested parallel_for under a constrained arena #1816
Closed
slomp
started this conversation in
Design discussions
Replies: 1 comment 1 reply
-
In general, this should be fine. However, if the resource constraints are qualitative (for example, a different NUMA node or a different core efficiency type) rather than quantitative, you might observe some performance penalty, depending on the workload.
-
Hello,
I have the following situation:
There's an "outer" context which issues a top-level parallel_for. This outer level uses a task arena that often "undersubscribes" in terms of threads. I do not have control over that level, but it does this because its workload is more task-oriented and resource-constrained.
Now, here's where I have control and can make decisions: eventually, the tasks of the outer parallel_for will end up calling parallel_for themselves. These "inner" parallel_fors do not have resource constraints and are "just compute".
If I run the inner parallel_fors "as-is", they will be implicitly constrained by the task arena of the outer context, which is sub-optimal. So instead, I am executing the inner parallel_fors under the global/default task arena (using a default-constructed task_arena), in the hope of getting ideal parallelism without oversubscription.
Is this a sensible, recommended approach?
Are there any gotchas or subtleties I need to be aware of?