Skip to content

Conversation

JohnoKing
Copy link

@JohnoKing JohnoKing commented Jun 5, 2025

This is the first of a thirteen(!!) part patch series that enables ksh93 to operate within a 64-bit address space (currently dubbed thickfold). These changes were accomplished by fixing most (but not all) of the warnings that materialize during compilation when the following clang compiler flags are passed:
-Wsign-compare -Wshorten-64-to-32 -Wsign-conversion -Wimplicit-int-conversion

Originally, this patch was going to take the suggested approach of replacing int with ssize_t. While that does work in practice, it's a bad idea because POSIX does not guarantee ssize_t will accept any negative value besides -1, despite its signed nature1:

The type ssize_t shall be capable of storing values at least
in the range [-1, {SSIZE_MAX}].

As such, for correctness and portability these patches will prefer usage of the C89 ptrdiff_t, which is practically guaranteed to accept the full range of negative numbers an int can accept. In practice, ptrdiff_t and size_t will be of the same size, being both 32-bit or both 64-bit. There are edge cases where this might not be so2, but I've chosen to use ptrdiff_t despite that since the caveats of such edge cases aren't nearly as bad as that of a platform's ssize_t potentially rejecting values lower than -1.

In some areas ssize_t results from e.g. sfvalue() may end up being used with ptrdiff_t variables. This is not pedantically correct, but it's certainly better than the prior int hell status quo, wherein ssize_t was usually shortened to int.

Conspicuous changes with noteworthiness:

  • Fixed many, many compiler warnings by adding a considerable number of casts and changing the types of more than quite a few variables. This is a consequence of using compiler warnings as an aid for implementing 64-bit memory allocation; while the warnings were useful, there were so many warnings fixed in the process that it increased the patch sizes greatly.
  • Transitioned the strgrpmatch() and strngrpmatch() calls to use a ssize_t* pointer rather than an int* pointer.
  • For the sh_options macros/function, enforce uint64_t as the main argument type to fix compiler warnings.
  • The libast stk sublibrary already uses the ssize_t/size_t types, which made it rather easy to integrate proper ptrdiff_t usage. In fact, stktell() has always returned a ptrdiff_t result. The documentation in stk(3) has been incorrectly claiming since < 1995 it returns int, which is both pedantically and in actuality wrong. That error has been rectified alongside the other updates to stk.

The thickfold patch series was accomplished by fixing compiler warnings, so I've decided to provide metrics as to the change in the total number of warnings occurring during compilation.
Change in the number of warnings on Linux when compiling with clang using -Wsign-compare -Wshorten-64-to-32 -Wsign-conversion -Wimplicit-int-conversion:
4,042 => 3,808 => 248 (progression from cc5e069 => part 1 => part 13)

Side note: To re-emphasize, this patch is one part of a whole (although it can be used on it's own). The full suite of changes can be found on the thickfold-size_t branch. This first part has been submitted severed from the other changes to make code review less hellish laborious (I hope). The full patch series can be found here (last updated 2025-06-16):
https://github.com/JohnoKing/ksh/tree/891120d26/patches

Progresses #592

Footnotes

  1. https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_types.h.html#tag_14_70

  2. I'll note there are also edge cases where ssize_t and size_t are of differing bit sizes, size_t being unsigned 64-bit and ssize_t being signed 32-bit. I presume this situation is extremely uncommon, but it's yet another reason why ptrdiff_t is a better choice.

JohnoKing added 3 commits June 4, 2025 20:53
…on, init and stk(3)

This is the first of a thirteen(!!) part patch series that enables
ksh93 to operate within a 64-bit address space (currently dubbed
thickfold). These changes were accomplished by fixing most (but not
all) of the warnings that materialize during compilation when the
following clang compiler flags are passed:
'-Wsign-compare -Wshorten-64-to-32 -Wsign-conversion -Wimplicit-int-conversion'

Originally, this patch was going to take the suggested approach of
replacing int with ssize_t. While that does work in practice, it's
a bad idea because POSIX does not guarantee ssize_t will accept
any negative value besides -1, despite its signed nature[1]:
    > The type ssize_t shall be capable of storing values at least
    > in the range [-1, {SSIZE_MAX}].
As such, for correctness and portability these patches will prefer
usage of the C89 ptrdiff_t, which is practically guaranteed to
accept the full range of negative numbers an int can accept. In
practice, ptrdiff_t and size_t will be of the same size, being
both 32-bit or both 64-bit. There are edge cases where this might
not be so, but I've chosen to use ptrdiff_t despite that since
the caveats of such edge cases aren't nearly as bad as that of
a platform's ssize_t rejecting values lower than -1.
[1]: https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/sys_types.h.html#tag_14_70

In some areas ssize_t results from e.g. sfvalue() may end up being
used with ptrdiff_t variables. This is not pedantically correct,
but it's certainly better than the prior int hell status quo,
wherein ssize_t was usually shortened to int.

Conspicuous changes with noteworthyness:
- Fixed many, many compiler warnings by adding a considerable
  number of casts and changing the types of more than quite
  a few variables. This is a consequence of using compiler
  warnings as an aid for implementing 64-bit memory allocation;
  while the warnings were useful, there were so many warnings
  fixed in the process that it increased the patch sizes greatly.
- Transitioned the strgrpmatch() and strngrpmatch calls to use
  a ssize_t* pointer rather than an int* pointer.
- For the sh_options macros/function, enforce uint64_t as the main
  argument type to fix compiler warnings.
- The libast stk sublibrary already uses the ssize_t/size_t types,
  which made it rather easy to integrate proper ptrdiff_t usage.
  In fact, stktell() has *always* returned a 'ptrdiff_t' result.
  The documentation in stk(3) has been incorrectly claiming
  since < 1995 it returns 'int', which is wrong. That error
  has been rectified alongside the other updates to stk.

The thickfold patch series was accomplished by fixing compiler warnings,
so I've decided to provide metrics as to the change in the total number
of warnings occurring during compilation.
Change in the number of warnings on Linux when compiling with clang using
-Wsign-compare -Wshorten-64-to-32 -Wsign-conversion -Wimplicit-int-conversion:
4,042 => 3,808 => 248 (progression from cc5e069 => part 1 => part 13)

Side note: To re-emphasize, this patch is one part of a whole
(although it can be used on it's own). The full suite of
changes can be found on the thickfold-size_t branch.
This first part has been submitted severed from the other
changes to make code review less laborious (I hope).
The full patch series can be found here:
https://github.com/JohnoKing/ksh/tree/ad588c003d/patches

Progresses ksh93#592
This code is set to be deleted to fix ACE vulns,
so undo the ptrdiff_t change to avoid a merge conflict.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant