Description
I noticed that the `getFields` function of the RISC-V example is often called just to read a single field, with calls along the lines of `let funct3 := get(getFields(inst), funct3)`. Since computing every field only to extract one of them seemed like overkill, I tried replacing `getFields` with individual `getXxx` functions (e.g. `getFunct3`). This gives a decent boost to simulation performance:
-- Running tests with Cuttlesim --
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_qsort.rv32 -1
[before] real: 0m0.018s user: 0m0.017s sys: 0m0.001s
[after ] real: 0m0.013s user: 0m0.013s sys: 0m0.000s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/rvbench_median.rv32 -1
[before] real: 0m0.012s user: 0m0.012s sys: 0m0.000s
[after ] real: 0m0.008s user: 0m0.007s sys: 0m0.001s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/img.rv32 -1
[before] real: 0m0.216s user: 0m0.195s sys: 0m0.020s
[after ] real: 0m0.160s user: 0m0.142s sys: 0m0.018s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/primes.rv32 -1
[before] real: 0m4.421s user: 0m4.417s sys: 0m0.001s
[after ] real: 0m3.090s user: 0m3.084s sys: 0m0.001s
_objects/rv32i.v/rvcore.cuttlesim.opt tests/_build/rv32i/integ/morse.rv32 -1
[before] real: 0m1.584s user: 0m1.583s sys: 0m0.000s
[after ] real: 0m1.086s user: 0m1.085s sys: 0m0.000s
-- Running tests with Verilator --
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_qsort.vmh -1
[before] real: 0m0.033s user: 0m0.032s sys: 0m0.001s
[after ] real: 0m0.030s user: 0m0.030s sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/morse.vmh -1
[before] real: 0m2.689s user: 0m2.687s sys: 0m0.000s
[after ] real: 0m2.186s user: 0m2.184s sys: 0m0.000s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/rvbench_median.vmh -1
[before] real: 0m0.022s user: 0m0.022s sys: 0m0.000s
[after ] real: 0m0.018s user: 0m0.017s sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/primes.vmh -1
[before] real: 0m8.358s user: 0m8.349s sys: 0m0.001s
[after ] real: 0m6.778s user: 0m6.767s sys: 0m0.001s
_objects/rv32i.v/obj_dir.opt/Vtop +VMH=tests/_build/rv32i/integ/img.vmh -1
[before] real: 0m0.399s user: 0m0.385s sys: 0m0.014s
[after ] real: 0m0.327s user: 0m0.310s sys: 0m0.017s
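One way to condense the timings above into a single number per simulator is the ratio of total wall-clock time before and after the change. The sketch below is in Python and only illustrates the arithmetic. The aggregation method is an assumption, not something stated in this report, and `total_speedup` is a hypothetical helper name. The `real` values are copied from the output above.

```python
# "real" times in seconds, copied from the benchmark output: (before, after).
CUTTLESIM = {
    "rvbench_qsort": (0.018, 0.013),
    "rvbench_median": (0.012, 0.008),
    "img": (0.216, 0.160),
    "primes": (4.421, 3.090),
    "morse": (1.584, 1.086),
}
VERILATOR = {
    "rvbench_qsort": (0.033, 0.030),
    "morse": (2.689, 2.186),
    "rvbench_median": (0.022, 0.018),
    "primes": (8.358, 6.778),
    "img": (0.399, 0.327),
}


def total_speedup(timings):
    """Ratio of summed 'before' time to summed 'after' time (assumed metric)."""
    before = sum(b for b, _ in timings.values())
    after = sum(a for _, a in timings.values())
    return before / after


if __name__ == "__main__":
    for name, timings in (("Cuttlesim", CUTTLESIM), ("Verilator", VERILATOR)):
        print(f"{name}: {total_speedup(timings):.2f}x")
```

Because the totals are dominated by the long-running `primes` and `morse` tests, the very short runs (a few milliseconds, close to timer resolution) have little influence on the result.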
Overall, the change yields a speedup of roughly 1.43x for Cuttlesim and 1.23x for Verilator compared to master. I have not yet checked the effects on synthesis, but I might do so soon. I assume that this kind of optimization is applied automatically during synthesis (isn't it?), so I don't expect any change on that side (alas). See commit 04cfdf8 for a demonstration.
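To make the idea concrete outside of the hardware description language, here is a minimal Python sketch of the two access patterns. It is not the code from the commit: `get_fields` and `get_funct3` are hypothetical stand-ins for `getFields` and `getFunct3`, and only the standard bit positions of the RISC-V R-type encoding are assumed.

```python
def get_fields(inst):
    """Decode every R-type field of a 32-bit instruction (stand-in for getFields)."""
    return {
        "opcode": inst & 0x7F,           # bits 6:0
        "rd": (inst >> 7) & 0x1F,        # bits 11:7
        "funct3": (inst >> 12) & 0x7,    # bits 14:12
        "rs1": (inst >> 15) & 0x1F,      # bits 19:15
        "rs2": (inst >> 20) & 0x1F,      # bits 24:20
        "funct7": (inst >> 25) & 0x7F,   # bits 31:25
    }


def get_funct3(inst):
    """Extract only funct3 (stand-in for getFunct3): one shift and one mask."""
    return (inst >> 12) & 0x7


if __name__ == "__main__":
    inst = 0x0020F1B3  # and x3, x1, x2
    # Before: build all six fields, then keep one of them.
    print(get_fields(inst)["funct3"])
    # After: compute just the field that is needed.
    print(get_funct3(inst))
```

Both calls return the same value; the second simply avoids computing five fields that are thrown away. A simulator that executes the model as written pays for that extra work on every call, which would be consistent with the speedups measured above.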