Skip to content

[RAPTOR-12499] SemGrep detected root user in dronin environments #1423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

peterzdeb
Copy link
Contributor

This repository is public. Do not put here any private DataRobot or customer's data: code, datasets, model artifacts, .etc.

Summary

All the reported images were updated with non-root user. For Changuard images we have nonroot and nobody for other Debian-type images.

Rationale

SemGrep security scanner is configured for all DataRobot repositories.
For the environment images with DRUM it reports two issues:

  • last-user-is-root: The last user in the container is root. This is a security hazard because if an attacker gains control of the container they will have root access. Switch back to another user after running commands as root.
  • missing-user-entrypoint: By not specifying a USER, a program in the container may run as root. This is a security hazard. If an attacker can control a process running as root, they may have control over the container. Ensure that the last USER in a Dockerfile is a USER other than root.

@engprod-2 engprod-2 bot changed the title SemGrep detected root user in dronin environments [RAPTOR-12499] SemGrep detected root user in dronin environments May 12, 2025
@devexp-slackbot
Copy link

The Needs Review labels were added based on the following file changes.

Team @datarobot/buzok (#buzok) was assigned because of changes in files:

public_dropin_environments/python311_genai/Dockerfile

Team @datarobot/codegen (#scoring-code) was assigned because of changes in files:

public_dropin_environments/python311_genai/Dockerfile

Team @datarobot/core-modeling (#predictive-ai) was assigned because of changes in files:

custom_model_runner/CHANGELOG.md
public_dropin_environments/python3_keras/Dockerfile
public_dropin_environments/python3_pmml/Dockerfile
public_dropin_environments/python3_pytorch/Dockerfile
public_dropin_environments/python3_sklearn/Dockerfile
public_dropin_environments/python3_xgboost/Dockerfile
public_dropin_environments/r_lang/Dockerfile

Team @datarobot/custom-models (#custom-models) was assigned because of changes in files:

custom_model_runner/CHANGELOG.md
example_dropin_environments/julia_mlj/Dockerfile
example_dropin_environments/python311_genai-gpu/Dockerfile
example_dropin_environments/python39_genai/Dockerfile
public_dropin_environments/java_codegen/Dockerfile
public_dropin_environments/python311/Dockerfile
public_dropin_environments/python3_keras/Dockerfile
public_dropin_environments/python3_onnx/Dockerfile
public_dropin_environments/python3_pmml/Dockerfile
public_dropin_environments/python3_pytorch/Dockerfile
public_dropin_environments/python3_sklearn/Dockerfile
public_dropin_environments/python3_xgboost/Dockerfile
public_dropin_environments/r_lang/Dockerfile
public_dropin_environments_sandbox/fips_python3_keras/Dockerfile
public_dropin_environments_sandbox/python3_keras/Dockerfile
public_dropin_nim_environments/nim_sidecar/Dockerfile

Team @datarobot/tracking-agent (#tracking-agent-reviews) was assigned because of changes in files:

public_dropin_nim_environments/nim_sidecar/Dockerfile

If you think that there are some issues with ownership, please discuss with C&A domain at #sdtk slack channel and create PR to update DRCODEOWNERS\CODEOWNERS file.

Comment on lines -46 to -53

RUN sh -c "python -m venv ${VIRTUAL_ENV} && \
. ${VIRTUAL_ENV}/bin/activate && \
python -m ensurepip --default-pip && \
python -m pip install --upgrade pip && \
python -m pip install --no-cache-dir -r requirements.txt && \
find ${VIRTUAL_ENV} -type d -name '__pycache__' -exec rm -rf {} + && \
rm -rf /usr/bin/rm /usr/bin/find"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All that commands do not need a separate shell when building Docker image. I moved the dependencies installation step into build stage.

Comment on lines -36 to -37
COPY --from=build /usr/bin/rm /usr/bin/rm
COPY --from=build /usr/bin/find /usr/bin/find
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No cleanup needed as we remove __pycache__ in the build stage. The rm and find commands won't be needed in the runtime image

Comment on lines -129 to -130
python ensure_sh_shebang.py ${R_HOME}/bin && \
mkdir -p /usr/share/doc/R/html && \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved these two lines into a separate steps in the Dockerfile.

Copy link
Contributor

@shopin32 shopin32 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think we do not own those anymore

/public_dropin_environments/python311_genai @datarobot/buzok @datarobot/codegen
/public_dropin_environments/python311_genai_agents @datarobot/buzok @datarobot/codegen
, custom-model is now in charge.

Copy link
Contributor

@tsdaemon tsdaemon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Buzok

@@ -24,4 +24,6 @@ ENV WITH_ERROR_SERVER=1
#Uncomment the following line to switch to production mode
#ENV PRODUCTION=1 MAX_WORKERS=4 SHOW_STACKTRACE=1

USER nobody
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not matter too much. In LRS we run the container with a specific set of user id / group id (defaults are 1000:1000), also we force chown on the /opt/code to these.
I'm not sure if setting USER to something else would do it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you test building such image and testing on staging and/or locally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I can test this on staging before merging the PR. Will try

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@klichukb I tested this image on Staging. It really lacks permissions if running for any non-root user (when tried to install model dependencies in the exec env).
The only solution to this is to grant permissions to all users for the $HOME directory which is "/opt". That directory keeps pip cache, virtualenvs the the code.

We can't know which user will run this container but we definitely can't run the containers from root. Chainguard will never let us to use 1000 as a user - there is no way to create custom users, it only gives us nonroot user that we can use when running the container - that's requirement from STIG/FIPS compliance.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also sorry, I didnt realize I commented on specifically julia.. Its an irrelevant env, so if any tests - definitely not the Julia one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, for that reason I updated all the Dockerfiles in this PR.

find ${VIRTUAL_ENV} -type d -name '__pycache__' -exec rm -rf {} + && \
rm -rf /usr/bin/rm /usr/bin/find"
# Copy preinstalled virtualenv
COPY --from=build --chown=nonroot ${VIRTUAL_ENV} ${VIRTUAL_ENV}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this definitely needs testing with actual datarobot. we use 1000:1000 to run by default.

@peterzdeb
Copy link
Contributor Author

Drop-in environments are running with security context by LRS. Whatever we set in the Dockerfile can be overriden on runtime. Users who request that custom environments may wish to reinstall packages, including DRUM

Planning to close this PR as won't do.
cc @Mint-Martini

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants