Skip to content

Conversation

@chrisnovakovic
Copy link
Contributor

There is currently an assumption that the Python interpreter used while building .pex files (i.e. the value of DefaultInterpreter) is also the Python interpreter that runs them. This can be overridden with the shebang parameter on individual python_binary and python_test targets, which sets the shebang in the generated .pex file to a specific value. The functionality this provides is limited:

  • If additional interpreter arguments are specified by the target, the generated .pex file will depend on a shell at /bin/sh in the run-time environment (which isn't guaranteed to exist, e.g. in distroless containers), as well as /usr/bin/env (also not guaranteed to exist).
  • It is possible to build the .pex file using a Python interpreter defined as a build target in the repo, but it isn't possible to run the .pex file using that interpreter because the .pex file's shebang is (by definition) static, but the interpreter's path relative to the .pex file's path varies depending on the context (e.g. whether the .pex file exists inside a build environment vs outside of one). This was discussed in CI: temporarily remove in-repo Python tests #247.

Provide more flexibility in how .pex files are run by prepending a native-code preamble that implements dynamic searching of Python interpreters. The behaviour of this preamble is controlled at run time by a special zip member within the .pex file,
.bootstrap/PLZ_PREAMBLE_CONFIG, which contains a list of candidate paths for Python interpreters under which the .pex file should be executed and a list of arguments to pass to the interpreter when attempting to execute it. The lists can be customised per python_binary or python_test target via the runtime_interpreters and runtime_interpreter_args parameters respectively; new plugin configuration options, DefaultRuntimeInterpreters and DefaultRuntimeInterpreterArgs, provide default values. If both are unspecified, the plugin's current behaviour is maintained (i.e. the same interpreter is used to both build and run the code). The flexibility offered by the new .pex preamble is sufficient to be able to remove python_binary and python_test's shebang parameter (and the DefaultShebang plugin configuration option) altogether.

The new .pex preamble is compact enough to prepend to all .pex files in a repo without collectively making them prohibitively large: ~98KB statically linked with musl on Linux, ~83KB dynamically linked on Darwin (which doesn't have a stable kernel ABI, and strongly discourages static linking).

Among other possibilities, this allows for the creation of .pex files that are truly portable between run-time environments of the same operating system and architecture - e.g., the preamble can be configured to attempt to execute an interpreter that only exists within a production environment (such as a distroless container with its interpreter at /python/bin/python3), falling back on an interpreter defined as a build target in the repo if the .pex file is still located within Please's development/testing environment.

test:
name: Run tests
runs-on: ${{ (inputs.platform == 'freebsd_amd64' || inputs.platform == 'linux_amd64') && 'ubuntu-24.04' || (inputs.platform == 'linux_arm64' && 'ubuntu-24.04-arm' || (inputs.platform == 'darwin_amd64' && 'macos-15-intel' || (inputs.platform == 'darwin_arm64' && 'macos-15' || 'unknown'))) }}
steps:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proof that all of this works is in the "in-repo please_pex tool" test matrix passing - the others will fail because they use the build defs from this PR but the older stable version of please_pex, which is missing features added by this PR. The GHA workflow is now structured in such a way that only the "in-repo please_pex tool" test matrix needs to pass in order for a new version of please_pex to be released, so after I've tagged a new please_pex version and updated //tools:please_pex to download it, the other test matrices will also start passing again.

Copy link
Contributor

@toastwaffle toastwaffle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed up to and including tools/please_pex/preamble/path.h; will get to the rest tomorrow hopefully

Comment on lines +798 to +801
# If the target name ends with an entry point name, strip it - runtime_deps won't accept
# target names containing entry point names (which isn't really a problem for us,
# because the entire target will have to be built for the entry point to be provided
# anyway).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this comment be attached to line 784? Or should line 784 be beneath this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it makes more sense to move the definition of i_target here.

@@ -0,0 +1,25 @@
subinclude("///cc//build_defs:c")

remote_file(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for not doing this as a github_repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenBSD has a pretty big code base and it seemed overkill to download the entire thing for the sake of a 21KB header file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Might be worth leaving a comment to explain.

// NewWriter constructs a new Writer.
func NewWriter(entryPoint, interpreter string, options []string, stamp string, zipSafe, noSite bool) *Writer {
func NewWriter(entryPoint string, interpreters []string, interpreterArgs []string, stamp string, zipSafe, noSite bool) *Writer {
if interpreterArgs == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs an explanation comment

Copy link
Contributor Author

@chrisnovakovic chrisnovakovic Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed offline, you're correct that this is unnecessary. Thanks, removed it.

@chrisnovakovic
Copy link
Contributor Author

The head of the branch addresses all the comments with a 👍 beneath them, plus (before you get around to reviewing it) a tidy-up of get_config to return an err_t rather than logging errors directly. This also required more liberal use of const in pexerror.c, which can only be a good thing.

@chrisnovakovic
Copy link
Contributor Author

The FreeBSD tests are failing because new Python packages were pushed to Ports, so the hashes changed. I can't wait until the next Quarterly Ports release comes out so we can migrate to that...

@chrisnovakovic
Copy link
Contributor Author

The following rebase will pick up the FreeBSD hash changes, which should cause the tests in the please_pex matrix to start passing again. Apologies for the noise.

There is currently an assumption that the Python interpreter used while
building .pex files (i.e. the value of `DefaultInterpreter`) is also the
Python interpreter that runs them. This can be overridden with the
`shebang` parameter on individual `python_binary` and `python_test`
targets, which sets the shebang in the generated .pex file to a specific
value. The functionality this provides is limited:

- If additional interpreter arguments are specified by the target, the
  generated .pex file will depend on a shell at `/bin/sh` in the
  run-time environment (which isn't guaranteed to exist, e.g. in
  distroless containers), as well as `/usr/bin/env` (also not guaranteed
  to exist).
- It is possible to build the .pex file using a Python interpreter
  defined as a build target in the repo, but it isn't possible to run
  the .pex file using that interpreter because the .pex file's shebang
  is (by definition) static, but the interpreter's path relative to the
  .pex file's path varies depending on the context (e.g. whether the
  .pex file exists inside a build environment vs outside of one). This
  was discussed in please-build#247.

Provide more flexibility in how .pex files are run by prepending a
native-code preamble that implements dynamic searching of Python
interpreters. The behaviour of this preamble is controlled at run time
by a special zip member within the .pex file,
`.bootstrap/PLZ_PREAMBLE_CONFIG`, which contains a list of candidate
paths for Python interpreters under which the .pex file should be
executed and a list of arguments to pass to the interpreter when
attempting to execute it. The lists can be customised per
`python_binary` or `python_test` target via the `runtime_interpreters`
and `runtime_interpreter_args` parameters respectively; new plugin
configuration options, `DefaultRuntimeInterpreters` and
`DefaultRuntimeInterpreterArgs`, provide default values. If both are
unspecified, the plugin's current behaviour is maintained (i.e. the same
interpreter is used to both build and run the code). The flexibility
offered by the new .pex preamble is sufficient to be able to remove
`python_binary` and `python_test`'s `shebang` parameter (and the
`DefaultShebang` plugin configuration option) altogether.

The new .pex preamble is compact enough to prepend to all .pex files in
a repo without collectively making them prohibitively large: ~98KB
statically linked with musl on Linux, ~83KB dynamically linked on Darwin
(which doesn't have a stable kernel ABI, and strongly discourages static
linking).

Among other possibilities, this allows for the creation of .pex files
that are truly portable between run-time environments of the same
operating system and architecture - e.g., the preamble can be configured
to attempt to execute an interpreter that only exists within a
production environment (such as a distroless container with its
interpreter at `/python/bin/python3`), falling back on an interpreter
defined as a build target in the repo if the .pex file is still located
within Please's development/testing environment.
@@ -0,0 +1,25 @@
subinclude("///cc//build_defs:c")

remote_file(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Might be worth leaving a comment to explain.

errmsg_t *errmsg = NULL;
size_t len = strlen(msg);

if (wrapped == NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this check be before line 53? seems odd to check for nullness when you've already used wrapped

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - to be honest the null check doesn't really make a whole lot of sense (it's not something that can happen at run-time). I'll remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe just call SLIST_FIRST(wrapped) after the check. I'll do that instead.

err_t *err_wrap(const char *msg, err_t *wrapped);
char *err_str(err_t *err);

#endif /* __PEXERROR_H__ */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#endif /* __PEXERROR_H__ */
#endif /* !__PEXERROR_H__ */

I think? the other 2 header files have a ! here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks. It's only stylistic, but it's good to be consistent.

@chrisnovakovic chrisnovakovic merged commit 4c0d0c0 into please-build:master Dec 3, 2025
24 of 75 checks passed
@chrisnovakovic chrisnovakovic deleted the runtime-interpreters branch December 3, 2025 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants