Commit a5436a5
[CPU] add onednn context cache for qlinear to improve performance (pytorch#168150)
**Summary**
We noticed significant framework overhead in `qlinear`. Calling a oneDNN primitive requires preparing a number of data structures as its arguments, and building them on every call is expensive. In the past, these structures were cached in a context attached to the torch jit graph. However, Inductor does not support non-tensor data on the graph.
This PR caches those data structures in a static `std::unordered_map`, whose key is the weight data address as an `int64` and whose value is a struct containing all the data needed to run a primitive.
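The address-keyed lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LinearContext`, `build_context`, `get_or_create_context`, and `context_builds` are hypothetical names, the "expensive" setup is a placeholder, and thread safety is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for everything a primitive needs at run time.
struct LinearContext {
  std::vector<int64_t> weight_dims;  // shape recorded when the entry was built
};

// Counts how many times the (pretend) expensive setup ran.
inline int& context_builds() {
  static int builds = 0;
  return builds;
}

// Placeholder for the costly preparation of primitive arguments.
inline LinearContext build_context(const std::vector<int64_t>& dims) {
  ++context_builds();
  return LinearContext{dims};
}

// Look up the context by weight data address; build it only on a miss.
// Note: this sketch is not thread-safe.
inline LinearContext& get_or_create_context(
    const void* weight_data, const std::vector<int64_t>& dims) {
  static std::unordered_map<int64_t, LinearContext> cache;
  const auto key =
      static_cast<int64_t>(reinterpret_cast<std::intptr_t>(weight_data));
  auto it = cache.find(key);
  if (it == cache.end()) {
    it = cache.emplace(key, build_context(dims)).first;
  }
  return it->second;
}
```

Calling `get_or_create_context` twice with the same weight pointer returns the same entry and runs the setup once; a different pointer creates a second entry. The lookup never inspects the weight contents, only the address.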
This cache is safe in the common case where the weight data address does not change during inference and weight data is not reused by different layers. However, since we cannot guarantee these assumptions, we define an environment variable `"ONEDNN_CACHE_CONTEXT_UNSAFE"` to control this feature. Users should use it at their own risk.
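One common way to gate a feature on an environment variable is to read it once and remember the result. The sketch below shows that pattern only; the helper names `parse_flag` and `cache_enabled` are hypothetical, and the rule that the value `1` enables the cache is an assumption made for illustration, not something this commit message states.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Assumption for this sketch: only the exact value "1" enables the feature;
// an unset variable or any other value leaves it disabled.
inline bool parse_flag(const char* value) {
  return value != nullptr && std::string(value) == "1";
}

// Reads the variable on first use and caches the answer, so the
// per-call cost is a single static read.
inline bool cache_enabled() {
  static const bool enabled =
      parse_flag(std::getenv("ONEDNN_CACHE_CONTEXT_UNSAFE"));
  return enabled;
}
```

Because the value is read once, changing the variable after the first call has no effect in this sketch; the process would need to be restarted.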
We observed a >5% end-to-end performance gain when running ViT with PT2E static quantization on a 6th-generation Intel Xeon CPU.
**Test plan**
```
pytest -sv test/test_quantization.py -k "qlinear and pt2e"
```
Pull Request resolved: pytorch#168150
Approved by: https://github.com/mingfeima, https://github.com/jerryzh168
1 parent ca3e8b3 commit a5436a5
File tree
3 files changed, +125 -33 lines changed
- aten/src/ATen/native/quantized/cpu
- test/quantization/core