[APPack] Iterative Re-Packing #3171

Open
wants to merge 1 commit into master from feature-ap-iterative-repacking

Conversation

AlexandreSinger (Contributor)

Because it is based on the packer, APPack also uses iterative re-packing when a dense enough clustering cannot be found. APPack has some special options it can use to increase the density of the clustering further without hurting quality as much as the default flow does.

Updated the iterative re-packing algorithm to use these options if needed.

Having this safer fall-back has allowed me to tune some numbers that I knew would improve the quality of most circuits but were causing a few circuits to fail packing. Those few failing circuits should now hit this fall-back path, which resolves the issue.
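For readers unfamiliar with the mechanism, the sketch below shows the general shape of such a fallback loop. Everything in it (the option names, the thresholds, the `try_pack` stub) is hypothetical and only illustrates the idea of relaxing density-related options one step at a time instead of jumping straight to the default, quality-hurting fallback; it is not VPR's actual API.

```cpp
// Hypothetical sketch of an iterative re-packing fallback loop (not VPR's
// real API). Start with the tightest APPack settings and, if the packing does
// not fit the device, progressively enable the density-increasing options.
#include <cstdio>
#include <vector>

struct PackingOptions {
    float max_displacement;   // how far an atom may stray from its flat placement
    float target_density;     // how full each cluster is asked to be
    bool  allow_unrelated;    // allow unrelated atoms to fill clusters
};

// Stand-in for the packer; here the packing "fits" only when the clusters are
// packed densely enough, just to keep the example self-contained.
bool try_pack(const PackingOptions& opts) {
    return opts.target_density >= 0.95f || opts.allow_unrelated;
}

int main() {
    // Each successive attempt trades a little quality for a denser clustering.
    std::vector<PackingOptions> attempts = {
        {5.0f, 0.80f, false},   // tuned, tightest settings
        {10.0f, 0.80f, false},  // relax the displacement threshold
        {10.0f, 1.00f, true},   // pack to full density, allow unrelated clustering
    };

    for (size_t i = 0; i < attempts.size(); ++i) {
        if (try_pack(attempts[i])) {
            std::printf("Packing fit on attempt %zu\n", i + 1);
            return 0;
        }
    }
    std::printf("Fell back to the default flow\n");
    return 1;
}
```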

github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels on Jun 29, 2025
AlexandreSinger (Contributor, Author) commented on Jun 29, 2025

Results on VTR:

| Metric | Normalized to AP Baseline |
| --- | --- |
| post_gp_hpwl | 1.14 |
| post_fl_hpwl | 1.02 |
| post_dp_hpwl | 0.98 |
| total_wirelength | 0.99 |
| post_gp_cpd | 1.00 |
| post_fl_cpd | 0.98 |
| post_dp_cpd | 0.98 |
| crit_path_delay | 0.98 |

Since I was able to tighten the max displacement threshold (as well as other parameters that may increase density), we see roughly a 2% improvement in basically everything after global placement. Post-FL HPWL is the exception; however, only one circuit (MCML) got much worse, while everything else saw similar improvements.

Note: the post-GP HPWL got worse because I set the target density of CLBs/LABs to 0.8 in the partial legalizer. This is expected to make the post-GP HPWL worse, but I found that it made the overall solution quality better.
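As a rough illustration of what a target density does in a partial legalizer, the snippet below scales the usable capacity of each bin by the target. The names and numbers are assumptions for the example, not the actual VPR implementation.

```cpp
// Illustrative only: how a CLB/LAB target density might feed into a partial
// legalizer's bin capacities. A target below 1.0 shrinks the capacity the
// legalizer is allowed to fill, forcing it to spread blocks further (hurting
// post-GP HPWL) but leaving the packer slack to form legal clusters.
#include <cmath>
#include <cstdio>

int usable_bin_capacity(int physical_slots, float target_density) {
    return static_cast<int>(std::floor(physical_slots * target_density));
}

int main() {
    const int lab_slots = 10; // e.g. 10 ALMs per LAB on a Stratix-IV-like device
    std::printf("target 1.0 -> %d slots usable\n", usable_bin_capacity(lab_slots, 1.0f));
    std::printf("target 0.8 -> %d slots usable\n", usable_bin_capacity(lab_slots, 0.8f));
    return 0;
}
```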

AlexandreSinger force-pushed the feature-ap-iterative-repacking branch from c574bc5 to 695d205 on June 29, 2025 at 03:20
AlexandreSinger (Contributor, Author) commented on Jun 29, 2025

Results on Titan (timing driven, no fixed blocks):

| circuit | post_gp_hpwl | post_fl_hpwl | post_dp_hpwl | total_wirelength | post_gp_cpd | post_fl_cpd | post_dp_cpd | crit_path_delay |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LU230_stratixiv_arch_timing.blif | 1.143 | 0.923 | 0.928 | 0.929 | 1.018 | 1.022 | 1.011 | 1.022 |
| LU_Network_stratixiv_arch_timing.blif | 0.807 | 0.706 | 1.052 | 1.031 | 0.792 | 0.951 | 1.734 | 1.723 |
| SLAM_spheric_stratixiv_arch_timing.blif | 1.758 | 1.913 | 1.217 | 1.205 | 1.224 | 1.196 | 1.014 | 1.017 |
| bitcoin_miner_stratixiv_arch_timing.blif | 0.701 | 0.900 | 0.918 | 0.925 | 0.685 | 0.992 | 0.962 | 1.029 |
| bitonic_mesh_stratixiv_arch_timing.blif | 1.055 | 0.927 | 0.788 | 0.839 | 1.177 | 1.029 | 1.004 | 0.997 |
| cholesky_bdti_stratixiv_arch_timing.blif | 1.046 | 1.071 | 1.031 | 1.022 | 1.095 | 1.119 | 1.072 | 1.068 |
| cholesky_mc_stratixiv_arch_timing.blif | 1.021 | 0.929 | 0.983 | 0.975 | 0.731 | 0.894 | 0.980 | 0.999 |
| dart_stratixiv_arch_timing.blif | 1.325 | 1.236 | 1.003 | 0.994 | 1.278 | 1.197 | 1.080 | 1.104 |
| denoise_stratixiv_arch_timing.blif | 1.042 | 1.028 | 0.946 | 0.946 | 1.016 | 0.998 | 0.982 | 0.981 |
| des90_stratixiv_arch_timing.blif | 1.020 | 0.879 | 0.833 | 0.870 | 0.982 | 0.933 | 0.932 | 0.929 |
| directrf_stratixiv_arch_timing.blif | 1.048 | 0.704 | 0.891 | 0.898 | 0.947 | 0.940 | 1.012 | 0.964 |
| gsm_switch_stratixiv_arch_timing.blif | 1.109 | 0.809 | 1.007 | 0.992 | 0.956 | 0.991 | 0.905 | 0.878 |
| mes_noc_stratixiv_arch_timing.blif | 1.005 | 0.879 | 0.890 | 0.915 | 1.023 | 0.926 | 0.866 | 0.856 |
| minres_stratixiv_arch_timing.blif | 1.093 | 0.948 | 0.880 | 0.898 | 1.179 | 0.933 | 0.897 | 1.230 |
| neuron_stratixiv_arch_timing.blif | 0.971 | 1.044 | 1.084 | 1.055 | 0.988 | 1.029 | 1.136 | 1.142 |
| openCV_stratixiv_arch_timing.blif | 0.899 | 0.774 | 0.878 | 0.889 | 0.983 | 0.974 | 1.122 | 1.096 |
| segmentation_stratixiv_arch_timing.blif | 1.045 | 0.904 | 0.967 | 0.987 | 1.023 | 0.979 | 0.987 | 0.985 |
| sparcT1_chip2_stratixiv_arch_timing.blif | 1.249 | 1.029 | 0.914 | 0.926 | 1.152 | 1.099 | 1.037 | 1.047 |
| sparcT1_core_stratixiv_arch_timing.blif | 0.798 | 0.721 | 0.929 | 0.941 | 1.072 | 0.845 | 1.082 | 1.002 |
| sparcT2_core_stratixiv_arch_timing.blif | 1.117 | 0.931 | 0.894 | 0.923 | 1.095 | 1.104 | 1.012 | 0.956 |
| stap_qrd_stratixiv_arch_timing.blif | 0.931 | 0.671 | 0.900 | 0.913 | 0.777 | 0.698 | 1.022 | 1.040 |
| stereo_vision_stratixiv_arch_timing.blif | 1.090 | 1.080 | 0.987 | 0.987 | 1.092 | 1.104 | 1.004 | 1.001 |
| **Geomean** | 1.040 | 0.930 | 0.947 | 0.954 | 1.001 | 0.991 | 1.029 | 1.038 |

All numbers are normalized to AP baseline (the AP flow just prior to this PR).

We have a 5% improvement in post-routed wirelength. It looks as if we have a 4% degradation in CPD; however, this is caused by two outliers (LU_Network and minres) which had better post-FL CPD but worse post-DP CPD. This implies to me that more tuning is needed for the annealer, but I think we can recover everything with a bit more tuning of the overall flow.
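For clarity on how to read these tables: each per-circuit value is the metric for this PR divided by the AP-baseline value for the same circuit, and the Geomean row is the geometric mean of those ratios, i.e. exp of the mean of the logs. A minimal sketch with made-up numbers:

```cpp
// Geometric mean of per-circuit ratios (metric / baseline metric).
// The ratios below are illustrative values only, not results from the tables.
#include <cmath>
#include <cstdio>
#include <vector>

double geomean(const std::vector<double>& ratios) {
    double log_sum = 0.0;
    for (double r : ratios)
        log_sum += std::log(r);
    return std::exp(log_sum / ratios.size());
}

int main() {
    // e.g. total_wirelength of this PR / total_wirelength of the AP baseline,
    // one ratio per circuit (made-up numbers).
    std::vector<double> wl_ratios = {0.93, 1.03, 0.92, 0.84, 1.02};
    std::printf("geomean WL ratio: %.3f\n", geomean(wl_ratios));
    return 0;
}
```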

The runtime results are also interesting:

| circuit | ap_runtime | ap_gp_runtime | ap_fl_runtime | ap_dp_runtime | route_runtime | total_runtime |
| --- | --- | --- | --- | --- | --- | --- |
| LU230_stratixiv_arch_timing.blif | 0.996 | 1.090 | 0.974 | 0.983 | 0.835 | 0.990 |
| LU_Network_stratixiv_arch_timing.blif | 0.983 | 0.988 | 0.968 | 0.983 | 1.019 | 0.986 |
| **SLAM_spheric_stratixiv_arch_timing.blif** | 1.633 | 0.792 | 2.735 | 1.541 | 1.271 | 1.564 |
| bitcoin_miner_stratixiv_arch_timing.blif | 0.980 | 1.462 | 1.018 | 0.916 | 0.797 | 0.974 |
| bitonic_mesh_stratixiv_arch_timing.blif | 1.026 | 0.989 | 0.992 | 1.078 | 0.783 | 0.992 |
| cholesky_bdti_stratixiv_arch_timing.blif | 0.948 | 1.040 | 1.001 | 0.877 | 1.046 | 0.967 |
| cholesky_mc_stratixiv_arch_timing.blif | 1.025 | 1.152 | 1.017 | 0.892 | 0.983 | 1.020 |
| dart_stratixiv_arch_timing.blif | 0.967 | 0.958 | 0.923 | 1.027 | 1.008 | 0.972 |
| denoise_stratixiv_arch_timing.blif | 0.959 | 1.006 | 0.962 | 0.943 | 0.900 | 0.956 |
| des90_stratixiv_arch_timing.blif | 0.999 | 0.960 | 0.974 | 1.077 | 0.799 | 0.974 |
| directrf_stratixiv_arch_timing.blif | 0.994 | 1.020 | 0.950 | 0.992 | 0.948 | 0.993 |
| gsm_switch_stratixiv_arch_timing.blif | 0.971 | 0.905 | 0.936 | 1.009 | 0.923 | 0.973 |
| mes_noc_stratixiv_arch_timing.blif | 0.945 | 0.999 | 1.015 | 0.889 | 1.326 | 0.965 |
| minres_stratixiv_arch_timing.blif | 1.007 | 0.988 | 1.015 | 1.021 | 0.933 | 1.008 |
| **neuron_stratixiv_arch_timing.blif** | 1.525 | 0.929 | 2.694 | 1.116 | 1.054 | 1.401 |
| **openCV_stratixiv_arch_timing.blif** | 1.233 | 1.009 | 1.807 | 1.024 | 0.882 | 1.175 |
| segmentation_stratixiv_arch_timing.blif | 0.971 | 0.933 | 1.040 | 0.959 | 1.003 | 0.975 |
| **sparcT1_chip2_stratixiv_arch_timing.blif** | 1.204 | 0.973 | 1.815 | 1.024 | 0.991 | 1.186 |
| sparcT1_core_stratixiv_arch_timing.blif | 1.043 | 1.224 | 1.047 | 0.918 | 0.975 | 1.033 |
| sparcT2_core_stratixiv_arch_timing.blif | 0.983 | 0.965 | 0.918 | 1.021 | 0.884 | 0.982 |
| stap_qrd_stratixiv_arch_timing.blif | 1.011 | 1.097 | 1.040 | 0.978 | 0.834 | 1.004 |
| stereo_vision_stratixiv_arch_timing.blif | 1.017 | 0.994 | 0.954 | 1.006 | 0.969 | 1.029 |
| **Geomean** | 1.053 | 1.014 | 1.142 | 1.005 | 0.954 | 1.042 |

Four circuits actually hit this fallback path, which caused their full legalization (APPack) runtime to increase (shown in bold in the table above). This brought the overall geomean runtime up; however, for the rest of the circuits the runtime actually decreased by around 10%. I think with more tuning of the fall-back options I can reduce these as well.

@vaughnbetz We did hit a very big milestone. With this change, on Titan we are now 9.4% better in WL and 2% worse in CPD than without using AP (practically tied if we ignore LU_Network), at the cost of around 9% more run time. This puts us very close to the prior state of the art on Titan. I think with a bit more tuning I can get these numbers even better!

@amin1377 FYI
