Issue: Segmentation Fault with Crossbar Configuration
I encountered a Segmentation fault (core dumped) when running the Timeloop mapper with a specific architecture. The traceback shows that the crash surfaces when run_example_designs.py invokes the mapper through joblib.Parallel.
The full output is as follows:
root@tutorial:/home/workspace/example_designs# python3 run_example_designs.py --architecture einsumDNN_like
Running mapper1 for target einsumDNN_like in /home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem...
Running mapper3 for einsumDNN_like with problem default_problem...
Running mapper3 for einsumDNN_like with problem default_problem...
input file: /home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem/parsed-processed-input.yaml
_______ __
/_ __(_)___ ___ ___ / /___ ____ ____
/ / / / __ `__ \/ _ \/ / __ \/ __ \/ __ \
/ / / / / / / / / __/ / /_/ / /_/ / /_/ /
/_/ /_/_/ /_/ /_/\___/_/\____/\____/ .___/
/_/
Problem configuration complete.
execute:/usr/local/bin/accelergy /home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem/parsed-processed-input.yaml --oprefix timeloop-mapper. -o ./ > timeloop-mapper.accelergy.log 2>&1
Generate Accelergy ERT (energy reference table) to replace internal energy model.
Generate Accelergy ART (area reference table) to replace internal area model.
Architecture configuration complete.
Sparse optimization configuration complete.
Using threads = 4
Mapper configuration complete.
Initializing Index Factorization subspace.
Factorization options along problem dimension C = 6
Factorization options along problem dimension M = 252
Factorization options along problem dimension R = 6
Factorization options along problem dimension S = 6
Factorization options along problem dimension N = 1
Factorization options along problem dimension P = 756
Factorization options along problem dimension Q = 756
Mapspace Dimension [IndexFactorization] Size: 31109847552
Mapspace Dimension [LoopPermutation] Size: 193491763200000
Mapspace Dimension [Spatial] Size: 64
Mapspace Dimension [DatatypeBypass] Size: 1
Mapspace split! Per-split Mapping Dimension [IndexFactorization] Size: 7777461888 Residue: 0
Mapspace construction complete.
Search configuration complete.
Start Parsering Layout
No Layout specified, so using bandwidth based modeling
[ 0] Utilization = 0.56 | pJ/Compute = 1247.502 | L5[WIO] M4 P2 Q7 S3 - L4[IO] P14 - L3[W] Q2 - L2[] Q1 C3X R3X - L1[] Q1 M2Y P4Y - L0[O] Q8 M4 | Cycles = 150528
[ 1] Utilization = 0.19 | pJ/Compute = 1482.627 | L5[WIO] M4 P2 Q7 S3 - L4[IO] P14 - L3[W] C3 Q2 - L2[] Q1 R3X - L1[] Q1 M2Y P4Y - L0[O] Q8 M4 | Cycles = 451584
[ 3] Utilization = 0.19 | pJ/Compute = 1482.627 | L5[WIO] M4 P2 Q7 C3 S3 - L4[IO] P14 - L3[W] Q2 - L2[] Q1 R3X - L1[] Q1 M2Y P4Y - L0[O] Q8 M4 | Cycles = 451584
[ 2] Utilization = 0.19 | pJ/Compute = 1482.627 | L5[WIO] M4 P2 Q7 S3 - L4[IO] C3 P14 - L3[W] Q2 - L2[] Q1 R3X - L1[] Q1 M2Y P4Y - L0[O] Q8 M4 | Cycles = 451584
[ 3] Utilization = 0.88 | pJ/Compute = 1567.922 | L5[WIO] C3 P56 - L4[IO] M2 S3 - L3[W] R3 M4 - L2[] Q1 Q14X - L1[] Q1 P2Y M4Y - L0[O] Q8 | Cycles = 96768
[ 2] Utilization = 0.88 | pJ/Compute = 1558.438 | L5[WIO] P56 - L4[IO] M2 S3 C3 - L3[W] R3 M4 - L2[] Q1 Q14X - L1[] Q1 P2Y M4Y - L0[O] Q8 | Cycles = 96768
[ 1] Utilization = 0.88 | pJ/Compute = 1558.438 | L5[WIO] P56 - L4[IO] M2 S3 - L3[W] R3 C3 M4 - L2[] Q1 Q14X - L1[] Q1 P2Y M4Y - L0[O] Q8 | Cycles = 96768
[ 0] Utilization = 0.75 | pJ/Compute = 1146.322 | L5[WIO] M4 S3 P8 - L4[IO] P7 Q2 - L3[W] Q14 P2 - L2[] Q1 Q2X M2X R3X - L1[] Q1 M4Y Q2Y - L0[O] C3 | Cycles = 112896
[ 0] Utilization = 0.75 | pJ/Compute = 1128.111 | L5[WIO] P14 Q14 - L4[IO] C3 M2 - L3[W] M2 P2 R3 - L2[] Q1 S3X M4X - L1[] Q1 Q4Y M2Y - L0[O] Q2 P4 | Cycles = 112896
[ 1] Utilization = 0.75 | pJ/Compute = 1137.595 | L5[WIO] C3 P14 Q14 - L4[IO] M2 - L3[W] M2 P2 R3 - L2[] Q1 S3X M4X - L1[] Q1 Q4Y M2Y - L0[O] Q2 P4 | Cycles = 112896
[ 2] Utilization = 0.88 | pJ/Compute = 1205.734 | L5[WIO] Q2 P28 - L4[IO] S3 C3 - L3[W] Q2 M4 P2 - L2[] Q1 M2X Q7X - L1[] Q1 M2Y Q4Y - L0[O] P2 R3 M2 | Cycles = 96768
[ 2] Utilization = 1.00 | pJ/Compute = 1087.899 | L5[WIO] Q7 P7 - L4[IO] Q2 P2 - L3[W] S3 Q8 M2 R3 - L2[] Q1 M16X - L1[] Q1 P8Y - L0[O] C3 | Cycles = 84672
[ 3] Utilization = 0.88 | pJ/Compute = 1215.217 | L5[WIO] C3 Q2 P28 - L4[IO] S3 - L3[W] Q2 M4 P2 - L2[] Q1 M2X Q7X - L1[] Q1 M2Y Q4Y - L0[O] P2 R3 M2 | Cycles = 96768
[ 1] Utilization = 0.88 | pJ/Compute = 1205.734 | L5[WIO] Q2 P28 - L4[IO] S3 - L3[W] Q2 C3 M4 P2 - L2[] Q1 M2X Q7X - L1[] Q1 M2Y Q4Y - L0[O] P2 R3 M2 | Cycles = 96768
[ 3] Utilization = 0.75 | pJ/Compute = 1009.646 | L5[WIO] Q2 P14 M2 - L4[IO] M2 Q7 - L3[W] P2 R3 - L2[] Q1 M8X Q2X - L1[] Q1 C3Y P2Y - L0[O] Q4 P2 S3 | Cycles = 112896
[ 0] Utilization = 0.75 | pJ/Compute = 1010.482 | L5[WIO] Q7 M2 P2 - L4[IO] Q2 - L3[W] Q4 M2 - L2[] Q1 S3X P4X - L1[] Q1 P2Y M4Y - L0[O] Q2 P7 R3 M2 C3 | Cycles = 112896
[ 0] Utilization = 0.88 | pJ/Compute = 1088.840 | L5[WIO] M2 Q14 - L4[IO] C3 P2 - L3[W] P4 Q4 S3 - L2[] Q1 M8X P2X - L1[] Q1 P7Y - L0[O] Q2 R3 M2 | Cycles = 96768
[ 1] Utilization = 0.88 | pJ/Compute = 1088.943 | L5[WIO] M2 Q14 C3 - L4[IO] P2 - L3[W] P4 Q4 S3 - L2[] Q1 M8X P2X - L1[] Q1 P7Y - L0[O] Q2 R3 M2 | Cycles = 96768
[ 0] Utilization = 0.88 | pJ/Compute = 1088.049 | L5[WIO] P7 Q4 - L4[IO] M2 R3 - L3[W] M2 Q2 S3 - L2[] Q1 M2X Q7X - L1[] Q1 P4Y Q2Y - L0[O] P4 M4 C3 | Cycles = 96768
[ 3] Utilization = 0.88 | pJ/Compute = 1097.498 | L5[WIO] C3 P28 - L4[IO] M2 - L3[W] P2 M2 S3 R3 - L2[] Q1 Q14X - L1[] Q1 M8Y - L0[O] Q8 P2 | Cycles = 96768
[ 1] Utilization = 0.88 | pJ/Compute = 1088.014 | L5[WIO] P28 - L4[IO] M2 - L3[W] C3 P2 M2 S3 R3 - L2[] Q1 Q14X - L1[] Q1 M8Y - L0[O] Q8 P2 | Cycles = 96768
[ 0] Utilization = 0.88 | pJ/Compute = 1062.363 | L5[WIO] Q4 M16 P4 - L4[IO] P7 - L3[W] P2 - L2[] Q1 Q14X - L1[] Q1 M2Y P2Y Q2Y - L0[O] S3 R3 C3 | Cycles = 96768
[ 2] STATEMENT: 500 suboptimal mappings found since the last upgrade, terminating search.
[ 3] Utilization = 0.88 | pJ/Compute = 1062.363 | L5[WIO] Q4 M16 P4 - L4[IO] P7 - L3[W] P2 C3 - L2[] Q1 Q14X - L1[] Q1 M2Y P2Y Q2Y - L0[O] S3 R3 | Cycles = 96768
[ 0] Utilization = 1.00 | pJ/Compute = 1088.840 | L5[WIO] M2 Q14 - L4[IO] C3 P56 - L3[W] Q1 - L2[] Q1 M2X P2X Q4X - L1[] Q1 M4Y Q2Y - L0[O] S3 R3 M2 | Cycles = 84672
[ 1] STATEMENT: 500 suboptimal mappings found since the last upgrade, terminating search.
[ 0] Utilization = 1.00 | pJ/Compute = 1087.942 | L5[WIO] P14 M2 - L4[IO] C3 - L3[W] Q14 P2 S3 - L2[] Q1 Q2X P2X M4X - L1[] Q1 M2Y Q2Y P2Y - L0[O] Q2 R3 M2 | Cycles = 84672
[ 3] STATEMENT: 500 suboptimal mappings found since the last upgrade, terminating search.
[ 0] STATEMENT: 500 suboptimal mappings found since the last upgrade, terminating search.
Segmentation fault
Traceback (most recent call last):
File "/home/workspace/example_designs/run_example_designs.py", line 125, in <module>
joblib.Parallel(n_jobs=args.n_jobs)(
File "/usr/local/lib/python3.10/dist-packages/joblib/parallel.py", line 1918, in __call__
return output if self.return_generator else list(output)
File "/usr/local/lib/python3.10/dist-packages/joblib/parallel.py", line 1847, in _get_sequential_output
res = func(*args, **kwargs)
File "/home/workspace/example_designs/run_example_designs.py", line 69, in run_mapper
tl.call_mapper(spec, output_dir=output_dir, dump_intermediate_to=output_dir)
File "/usr/local/lib/python3.10/dist-packages/pytimeloop/timeloopfe/common/backend_calls.py", line 222, in call_mapper
return _parse_output(
File "/usr/local/lib/python3.10/dist-packages/pytimeloop/timeloopfe/common/backend_calls.py", line 184, in _parse_output
raise RuntimeError(errmsg)
RuntimeError:
========================================================================================================================
Timeloop mapper failed with return code 139. Please check the output files in /home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem for more information. To debug, you can edit the file:
/home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem/parsed-processed-input.yaml
and run
tl mapper /home/workspace/example_designs/example_designs/einsumDNN_like/outputs/default_problem/parsed-processed-input.yaml
to see the error. If you're running the mapper and Timeloop can't find a vaild mapping, try setting 'diagnostics: true' in the mapper input specification.
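For reference, return code 139 in the message above is consistent with the "Segmentation fault" line: POSIX shells report a process killed by a signal as 128 plus the signal number, and 139 - 128 = 11, which is SIGSEGV on Linux. A minimal sketch of decoding such a code is below; the helper name describe_return_code is hypothetical and not part of Timeloop or pytimeloop.

```python
import signal


def describe_return_code(code: int) -> str:
    """Describe a process exit status, decoding signal-based terminations.

    Hypothetical helper for illustration; not part of Timeloop.
    """
    if code < 0:
        # Python's subprocess module reports death-by-signal as -signum.
        signum = -code
    elif code > 128:
        # POSIX shells report death-by-signal as 128 + signum.
        signum = code - 128
    else:
        return f"exited with status {code}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        # The number does not correspond to a known signal on this platform.
        return f"exit status {code} (no matching signal)"
    return f"terminated by {name} (signal {signum})"


print(describe_return_code(139))
```

On Linux this reports SIGSEGV for 139, which suggests the crash happens inside the native mapper binary rather than in the Python front end.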
Here is the crossbar configuration that I would like to use (as an example):
- !Component
  name: InputCrossbar
  class: XY_NoC
  subclass: crossbar
  attributes:
    n_inputs: 32
    n_outputs: 16
    width: 24
    latency: 1
Removing the crossbar lets me get a valid mapping at the end. With the crossbar, the ERT and ART are still generated, but the mapping is not. Has anyone already run into this problem?
Thanks