@gus-massa
Contributor

This version works and reduces many cases of +, in particular making (+ x 1) as fast as (add1 x) in some useful common cases. It can be merged after fixing some typos, but I'd like to add a similar reduction for - to keep the balance of the universe.

The main idea is that if x is a flonum, then (#3%fl+ x 1.0) is 4 or 6 times faster than (+ x 1), even if we add some corner cases for 0, 0.0 and -0.0. I found too many corner cases, so I'm sharing this as an advanced draft to get feedback.

I'll update this PR with a version for - next week, probably, because I'm not sure how many unexpected corner cases I'll find. (The version for * is quite different; I'll wait a few months before attempting it. And / has even more cases!)


Reduce + with some combinations of values that are known at compile time to be real

(+ <fx> <fx> ...) => ($fxx+ <fx> <fx> ...)
(+ <fl> <fx> ...) => (fl+ <fl> (fixnum->flonum <fx>) ...)
(+ <fl> <real> ...) => (fl+ <fl> (real->flonum <real>) ...)
(+ <fl> <fl> ...) => (fl+ <fl> <fl> ...)

with some special cases for 0, in particular

(+ <fl> 1 ...) => (fl+ <fl> 1.0 ...)
(+ <fl> 0 ...) => (fl+ <fl> -0.0 ...)
(+ 1.0 0 ...) => (fl+ 1.0 0.0 ...)

In particular, I actually had to reduce

(+ <fl> <fx>) => (fl+ <fl> (if (eq? <fx> 0) -0.0 (fixnum->flonum <fx>)))

in case the first argument is -0.0. (This case was pointed out by @mflatt.)

(+ -0.0 0) => -0.0
(+ -0.0 0.0) => 0.0
(+ -0.0 -0.0) => -0.0

I found no version of (fixnum->flonum <fx>) that returns -0.0, and it looks like it's implemented using an assembler function with the same behavior, so the only solution was to use an if. The code tries to avoid adding this check when possible, in particular when the second argument is never 0 or the first argument is never -0.0.
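
The corner case can be checked at a Chez Scheme REPL. This is only a minimal sketch: the helper name fx->fl/keep-sign is hypothetical and mirrors the reduced form shown above, it is not the code in cptypes.

```scheme
;; Hypothetical helper mirroring the reduced form above:
;; an exact 0 becomes -0.0, so adding it never changes the flonum.
(define (fx->fl/keep-sign fx)
  (if (eq? fx 0) -0.0 (fixnum->flonum fx)))

;; Generic + treats the exact 0 as the identity, so -0.0 survives.
(+ -0.0 0)                         ; -0.0

;; A naive conversion turns 0 into 0.0 and loses the sign.
(fl+ -0.0 (fixnum->flonum 0))      ; 0.0

;; With the check, the flonum version agrees with generic +.
(fl+ -0.0 (fx->fl/keep-sign 0))    ; -0.0
(fl+ 1.5 (fx->fl/keep-sign 2))     ; 3.5
```

Results are compared with eqv? in practice, because = does not distinguish 0.0 from -0.0.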

Also, some reductions apply only when the left associativity is enabled.

@gus-massa
Contributor Author

Two more questions:


In a previous PR I added an unsafe implementation of $fxx+ to prims.ss and a safe version to mathprims.ss. After taking a deeper look I think that was unnecessary/wrong and one of them was overwriting the other. So now I removed the version from prims.ss and extended the safe version in mathprims.ss to handle multiple arguments.

Where is the best place to put this primitive? Is it better to keep the safe version, or just assume it will always be called in unsafe mode as expected?


I'm wondering if I have been too cautious with the -0.0 case. From https://en.wikipedia.org/wiki/Signed_zero

  • x + (+0) = x and x + (-0) = x for x!= 0
  • (-0) + (-0) = (-0) - (+0) = -0
  • (+0) + (+0) = (+0) - (-0) = +0
  • x - x = x + (-x) = +0 (for any finite x, -0 when rounding toward negative)

Is it possible to enable "rounding toward negative" in Chez Scheme? If not, the only way to get a -0.0 from a + is something like (+ 0 -0.0 -0.0 0 -0.0), so if any of the arguments is neither 0 nor -0.0 the result is never -0.0. If this is correct (on all platforms), then I can simplify my implementation and avoid the (eq? x 0) tests in a few more cases.
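
The rules quoted above can be spot-checked with a few sums. A small sketch, assuming the default round-to-nearest mode; neg-zero? is a hypothetical helper introduced only for this illustration.

```scheme
;; eqv? distinguishes 0.0 from -0.0, unlike =.
(define (neg-zero? x) (eqv? x -0.0))

(neg-zero? (fl+ -0.0 -0.0))          ; #t: (-0) + (-0) = -0
(neg-zero? (fl+ -0.0 0.0))           ; #f: mixed zeros give +0
(neg-zero? (fl+ 1.0 -1.0))           ; #f: x + (-x) = +0
(neg-zero? (fl+ 1.0 -0.0))           ; #f: a nonzero x is unchanged

;; Only exact 0 and -0.0 arguments, so the sum stays -0.0.
(neg-zero? (+ 0 -0.0 -0.0 0 -0.0))   ; #t
```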

@mflatt
Contributor

mflatt commented Jul 31, 2025

On $fxx+: I think "mathprims.ss" is the right choice, and I'd keep the safe version.

On rounding toward negative: I think probably not. R6RS certainly specifies rounding toward even for flround, and I expect that the implementation of flround relies on the default rounding mode in that case. More generally, I expect that consistent results rely on using default rounding modes, and so it seems ok to assume that rounding toward negative is off.

Reduce + with some combinations of values that are
known at compile time to be real

(+ <fx> <fx> ...) => ($fxx+ <fx> <fx> ...)
(+ <fl> <fx> ...) => (fl+ <fl> (fixnum->flonum <fx>) ...)
(+ <fl> <real> ...) => (fl+ <fl> (real->flonum <real>) ...)
(+ <fl> <fl> ...) => (fl+ <fl> <fl> ...)

with some special cases for 0, in particular

(+ <fl> 1 ...) => (fl+ <fl> 1.0 ...)
(+ <fl> 0 ...) => (fl+ <fl> -0.0 ...)
(+ 1.0 0 ...) => (fl+ 1.0 0.0 ...)
@gus-massa
Contributor Author

A few minor changes, probably not worth reviewing until the final version. In particular a rebase because in a previous PR the internal representation of fixnums changed from 'fixnum to something like (union 'non-zero-fixnum 0). Also, after reading the docs and looking here and there, I think that the only way to get a -0.0 in a + is something like (+ 0 -0.0 -0.0 0 -0.0) so I'm skipping the conversion to -0.0 when one of the parameters is neither 0 nor -0.0.


A few days ago I asked on HN in case someone there knew of another example. sparkie@HN replied with an example in C that uses FE_DOWNWARD: https://godbolt.org/z/5qvqsdh9P

Later, the reply explains how to use vfixupimm to change only 0.0 to -0.0 without branches. I don't know enough about the later passes of Chez Scheme to be sure how to use vfixupimm, and IIUC it's not available on all microprocessors, so I won't use it. Anyway, it looks like an interesting instruction because it can make a lot of other specific changes to flonums without branching.

And small changes to the reductions of `abs` and
`sub1` in cptypes

Reduce - with some combinations of values that are
known at compile time to be real

(- <fx> <fx> ...) => ($fxx- <fx> <fx> ...)
(- <fl> <fx> ...) => (fl- <fl> (fixnum->flonum <fx>) ...)
(- <fl> <real> ...) => (fl- <fl> (real->flonum <real>) ...)
(- <fl> <fl> ...) => (fl- <fl> <fl> ...)

with some special cases for 0, in particular

(- <fl> 1 ...) => (fl- <fl> 1.0 ...)
(- 1 <fl> ...) => (fl- 1.0 <fl> ...)
(- <fl> 0 ...) => (fl- <fl> 0.0 ...)
(- 0 <fl> ...) => (fl- -0.0 <fl> ...)
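
The choice of 0.0 on the right and -0.0 on the left follows from IEEE signed-zero arithmetic. A short sketch using only fl-, assuming the default rounding mode; it illustrates the identities and is not taken from the cptypes code.

```scheme
;; x - 0.0 = x for every flonum x, including -0.0,
;; so an exact 0 on the right can become 0.0.
(fl- -0.0 0.0)    ; -0.0
(fl- 2.5 0.0)     ; 2.5

;; -0.0 - x = -x for every flonum x,
;; so an exact 0 on the left can become -0.0.
(fl- -0.0 0.0)    ; -0.0, the negation of 0.0
(fl- -0.0 -0.0)   ; 0.0, the negation of -0.0

;; Using 0.0 on the left instead would lose the sign.
(fl- 0.0 0.0)     ; 0.0, not -0.0
```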
@gus-massa
Contributor Author

99% done. But I got a surprising error.

add reductions for + in cptypes

As discussed before. No changes since the intermediate update. Note that it avoids using -0.0 in cases like (+ 1.0 <fx>)

Add $fxx- primitive

Add the $fxx- primitive and use it in the reduction for (abs <fx>) to get faster code for small positive fixnums, in exchange for making the (most-negative-fixnum) case slower. Also change the reduction of (sub1 <fx>) just because it looks more natural.

add reductions for - in cptypes

Similar to the reduction for +. The first argument is fixed in the first spot, so some rules can be simplified. IIUC the other arguments are never reordered, but I avoid relying on that because in the future someone may reorder them. I only assume they are not reordered when (enable-arithmetic-left-associative) is enabled.

Bad fix changing a lot of - to fx-

With the previous commit, it compiles fine on Linux and OSX but it breaks the 5 WinNT runs: https://github.com/gus-massa/ChezScheme/actions/runs/16943355792

After looking for a while, the problem is that in 7.ss there is a (- n i). First cp0 inlines and uses constant propagation to transform it into (- 4 i). Then cptypes changes it to (#3%$fxx- 4 i) because the type of i is fixnum. Then cp0 tries to apply partial-fold-minus, and to check that the first argument has the correct type it uses an implicit call to ($fxx- 4 0).

Now ($fxx- 4 0) is calculated using mathprims.ss, which should call (#3%$fxx- 4 0) in cpprim.ss, but instead it calls mathprims.ss again, which calls mathprims.ss again, which calls mathprims.ss again ...

I added some debugging output to show most of it. It prints a few weird things until it reaches the loop and prints [*********. https://github.com/gus-massa/ChezScheme/actions/runs/16941090669

It can be solved by changing a few - to fx- to avoid the reduction to $fxx-: https://github.com/gus-massa/ChezScheme/actions/runs/16941087640

Also, after a successful compilation that change can be reverted and the code compiles nicely, so I guess that upgrading the bootfiles is enough to fix it.

I'm not sure that this fixes the problem or if my code has some very bad assumption.

@mflatt
Contributor

mflatt commented Aug 13, 2025

I do wonder whether it helps to rebuild pb boot files with ./configure --pb && make re.boot instead of changing - to fx-. I think you'll need to add $fxx- and $fxx+ to reboot.ss, since it seems that they're needed at compile time. Bootstrapping is tricky, and I can't work out exactly how it would go wrong, but maybe.

@gus-massa changed the title from "Add reductions for + in cptypes" to "[WIP] Add reductions for + in cptypes" on Aug 14, 2025
@gus-massa
Contributor Author

I'm optimistic, so I removed [WIP] from the title.

I updated the version. Is that ok?


Some minor remarks:

  • I'm using partial-fold-minus, so some expressions like
    (#3%$fxx- (most-negative-fixnum) 7)

are not reduced, even though

    (- (most-negative-fixnum) 7)

is constant folded to a bignum. I may try to fix this, but partial-fold-minus has already too many options and I don't want to add even more. #3%$fxx- is strange because the arguments are restricted to fixnums, but the result is not restricted.

  • Fixing the problem, I noticed a lot of cases of
    (- <positive-fixnum>  <positive-fixnum>)

that are reduced to

    (#3%$fxx- <positive-fixnum>  <positive-fixnum>)

but a smart enough compiler could reduce them to

    (#3%fx- <positive-fixnum>  <positive-fixnum>)

which avoids the overflow check, so it's slightly faster.

The problem is that cptypes does not distinguish positive and negative fixnums. I'm adding this to my wish list.
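
Both remarks can be illustrated with the fixnum range limits. A minimal sketch using only standard Chez Scheme procedures; the names lo and hi are introduced here for illustration, and generic - stands in for $fxx-, which is an internal primitive.

```scheme
(define lo (most-negative-fixnum))
(define hi (most-positive-fixnum))

;; First remark: fixnum arguments, but the result leaves the fixnum range.
(fixnum? (- lo 7))   ; #f
(bignum? (- lo 7))   ; #t

;; Second remark: the extreme differences of two non-negative fixnums
;; are -hi and hi, and both are still fixnums because lo = -hi - 1.
(fixnum? (- 0 hi))   ; #t
(fixnum? (- hi 0))   ; #t
(fx- 0 hi)           ; no overflow, so the check is redundant here
```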

@mflatt merged commit 6b333ee into cisco:main on Aug 15, 2025
16 checks passed