Commit d32e5dd
Parallelize by a factor of 1
The CPU threading model differs from the GPU thread group model: GPU
shared memory is not visible outside a single GPU thread group.
Whenever nested parallelism is enabled in the Mullapudi2016
autoscheduler, always map parallelizable loop dimensions to
`gpu_block`. To do this, split each such dimension by a factor of 1
(outer `zo`, inner `zi`): `f.split(z, zo, zi, 1)`.
This makes the autoscheduler's per-warp `last_level_cache` estimates
more robust to variations in nested parallelism.
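The effect of splitting by a factor of 1 can be illustrated with a small model of the loop nest. This is a minimal sketch in plain C++, not Halide code: `SplitLoop`, `split_extent`, and `visited_indices` are hypothetical names introduced only for illustration, and the rounding and tail guard are assumptions about how a split covers the original range. It shows that a factor of 1 leaves the outer loop with the full extent and the inner loop with a single iteration, so every iteration of the original dimension can be assigned to its own outer (block-level) index.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one loop dimension after a split; not a Halide type.
struct SplitLoop {
    int outer_extent;  // iterations of the outer loop (the candidate for gpu_block)
    int inner_extent;  // iterations of the inner loop (equal to the split factor)
};

// Split a loop of `extent` iterations by `factor`, rounding the outer extent up.
SplitLoop split_extent(int extent, int factor) {
    assert(extent >= 0 && factor > 0);
    return {(extent + factor - 1) / factor, factor};
}

// Enumerate the original indices visited by the split loop nest, in order.
std::vector<int> visited_indices(int extent, int factor) {
    const SplitLoop s = split_extent(extent, factor);
    std::vector<int> visited;
    for (int zo = 0; zo < s.outer_extent; ++zo) {      // outer loop: one step per block
        for (int zi = 0; zi < s.inner_extent; ++zi) {  // inner loop: trivial when factor == 1
            const int z = zo * factor + zi;            // recompose the original index
            if (z < extent) {
                visited.push_back(z);                  // skip out-of-range tail iterations
            }
        }
    }
    return visited;
}
```

With `factor == 1`, `split_extent(n, 1)` yields an outer extent of `n` and an inner extent of 1, so the iteration space is unchanged and only the outer loop carries any parallelism.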
In `*/apps/`, remove all manual overrides of `last_level_cache_size`
and use the default estimate of 47 kB per thread group.

1 parent e46ac2c, commit d32e5dd
File tree
5 files changed, +6 -6 lines changed
- apps
  - bgu
  - iir_blur
  - lens_blur
  - stencil_chain
- src/autoschedulers/mullapudi2016
Diff (line contents not preserved):
- Four files each have one line replaced: line 22 in three of them, line 24 in the fourth.
- A fifth file has two lines replaced: lines 1433 and 1441.