A new training mode called "both" has been added. This allows simultaneous training with both PPO and SAC, each on a separate instance. For example, if you have 4 environments, the even-numbered ones will use PPO, and the odd-numbered ones will use SAC. Therefore, it's necessary to specify even values for num_envs. #6233

Open. VictorBarbosa wants to merge 51 commits into base: main

Conversation

@VictorBarbosa VictorBarbosa commented Aug 3, 2025

Proposed change(s)

Describe the changes made in this PR.

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

miguelalonsojr and others added 30 commits October 9, 2023 12:58
* Bumped numpy version.

* Updated CHANGELOG.

* Lowered upper python version to <1.24
added feature to toggle between light and dark mode
* Fixed mkdocs config.

* Updated README.
* Added no graphics monitor feature.

* Fixed precommit issues.

* Fixed installation docs for incorrect python version causing conflicts in Windows.
…grade to Sentis 1.3.0-pre.2 (Unity-Technologies#6013)

* Upgraded to pytorch 2.1.1, fixed some windows related test issues.

* Upgraded to Sentis 1.3.0-pre.2

* Updated changelog.
* Upgraded to Sentis 1.3.0-pre.2

* Fixed tensor disposal bug in ModelRunner.
* Added new 3DBall sample package.

* Updated CHANGELOG.
* Updated yamato CI to use updated bokken images.

* Fixed performance tests.
…nologies#6063)

Fixes pre-commit errors of the form:
```
Run actions/setup-ruby@v1
  with:
    ruby-version: 2.6
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.10.13/x64
    LD_LIBRARY_PATH: /opt/hostedtoolcache/Python/3.10.13/x64/lib
    ImageOS: ubuntu20
------------------------
NOTE: This action is deprecated and is no longer maintained.
Please, migrate to https://github.com/ruby/setup-ruby, which is being actively maintained.
------------------------
Error: Version 2.6 not found
```

While we're upgrading this pre-commit, also upgrade from ruby 2.6 to ruby 2.7.
* Specify python 3.10.12 for nightly runs

Both ml-agents and ml-agents-envs only allow Python versions <=3.10.12: make sure the nightly uses a valid version.

(We might want to consider allowing any 3.10 version so that we can be using the latest security bugfixes, such as 3.10.13: https://www.python.org/downloads/release/python-31013/ )

Sample failure of nightly full-pytest before: https://github.com/alex-mccarthy-unity/ml-agents/actions/runs/8152427823/job/22281884176
Sample passing run afterwards: https://github.com/alex-mccarthy-unity/ml-agents/actions/runs/8153333182/job/22284499278

* Fix dead links in documentation

Together with Unity-Technologies#6065, fix the `markdown-link-check-full` component of nightly runs.

Sample failing run before: https://github.com/alex-mccarthy-unity/ml-agents/actions/runs/8152427823/job/22281884377
Sample passing run after: https://github.com/alex-mccarthy-unity/ml-agents/actions/runs/8154489456/job/22288022888
Before installing `grpcio` on my Apple Silicon mac, running `mlagents-learn --help` threw the following error:

```
ImportError: dlopen(/Users/alex.mccarthy/miniconda3/envs/mlagents/lib/python3.10/site-packages/grpc/_cython/cygrpc.cpython-310-darwin.so, 0x0002): symbol not found in flat namespace '_CFRelease'
```

After installing `grpcio` (which I did from conda, rather than pip), `mlagents-learn --help` ran cleanly.
This fixes builds of ONNX on OS X while installing ml-agents.

OS X builds use Xcode by default, and the Xcode compiler defaults to using C++98 mode for C++ ( https://stackoverflow.com/a/21349148 ). This causes errors building protocol buffer libraries, which need to be compiled with support for C++14 or newer ( protocolbuffers/protobuf#12393 (comment) ).

[This ONNX commit](onnx/onnx@a979e75) changes its compilation to use C++14 mode: releases that include this commit (1.15.0 or newer) build with Xcode by default.

ONNX 1.15.0 uses a newer protocol buffer library, so allow newer versions here too.
…hnologies#6064)

These references were missed when upgrading from pytorch 1.x to 2.x in Unity-Technologies#6013

References found by running `grep -R '1\.13\.1' .`

Install command chosen from the guide at https://pytorch.org/get-started/locally/
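
The `grep -R` search above can also be approximated in Python where `grep` is not available. This is a minimal sketch, not part of the commit: `find_references` is a hypothetical helper, it matches a literal substring rather than a regular expression, and unlike `grep -R` it skips the `.git` directory by default.

```python
import os


def find_references(root, needle, skip_dirs=(".git",)):
    """Yield (path, line_number, line) for every line that contains `needle`.

    Hypothetical helper: literal substring match, roughly `grep -R -F needle root`.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line:
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue


if __name__ == "__main__":
    for path, number, line in find_references(".", "1.13.1"):
        print(f"{path}:{number}: {line}")
```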
* Fix list being rendered incorrectly in webdocs

I assume this extra blank line will fix the list not being correctly formatted on https://unity-technologies.github.io/ml-agents/#releases-documentation

* Fix typos in docs

* Fix more mis-rendered lists

Add a blank line before bulleted lists in markdown files to avoid them being rendered as in-paragraph sentences that all start with hyphens.

* Fix typos in python comments used to generate docs
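
The blank-line fix for mis-rendered lists described above could be automated along these lines. This is only a sketch under stated assumptions, not the change made in the commit: `add_blank_line_before_lists` is a hypothetical helper, it recognizes only `-`, `*` and `+` bullets, and it does not skip fenced code blocks.

```python
def add_blank_line_before_lists(text):
    """Insert a blank line before a bulleted list that directly follows text.

    Hypothetical helper; numbered lists and fenced code blocks are not handled.
    """
    markers = ("- ", "* ", "+ ")
    out = []
    in_list = False
    for line in text.split("\n"):
        is_item = line.lstrip().startswith(markers)
        # Only the first item of a list needs a separating blank line.
        if is_item and not in_list and out and out[-1].strip():
            out.append("")
        if is_item:
            in_list = True
        elif not line.strip():
            # A blank line ends the current list for the purposes of this sketch.
            in_list = False
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    print(add_blank_line_before_lists("Options:\n- first\n- second"))
```

Input that already has the blank line is returned unchanged, so the helper can be run repeatedly over the same file.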
Ignore spurious dead links to tensorflow.org pages, since we're seeing an infinite redirect for some reason that doesn't reproduce in the browser or with `curl`.
…o and torchvideo (Unity-Technologies#6074)

* Fix GPU continuous build: correct torchaudio version

torchaudio 0.17 doesn't exist, but 2.2 does.

Use a slightly older cuda version, since that makes cuda detection work on the machines we're running on (RTX 2080's on Ubuntu 18.04, which presumably aren't compatible with CUDA 12).

(I'm not sure if the tests actually need torchaudio or torchvision, since those aren't listed dependencies of any of our software, but let's at least install a valid one)

Passing yamato run with this change: https://unity-ci.cds.internal.unity3d.com/job/34869354/logs

* Don't install torchaudio and torchvision, since they're unused

Sample passing GPU test run: https://unity-ci.cds.internal.unity3d.com/job/34891013/logs
This is the minimum version supported by the latest Sentis release 1.3.0: https://discussions.unity.com/t/about-sentis-beta/260899

Remove the TextMeshPro package, which is deprecated in 2023.2 (and causes duplicate symbol errors): https://forum.unity.com/threads/2023-2-latest-development-on-textmesh-pro.1434757/

Tested: ran models in the 3DBall scene on OS X
Added the rest of the incomplete sentence around using anaconda (conda for using mlagents).
Removing "or higher" because:
(i) ./ml-agents/setup.py requires >=3.10.1,<=3.10.12 
(ii) python 3.10.13 is the default conda install, and 3.10.13 does not work correctly with numpy 1.21.2
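
The version window quoted in (i) can be checked before installing. The following is a minimal sketch, not code from the repository: `is_supported_python` is a hypothetical helper, and the bounds are copied from the constraint cited above.

```python
import sys

# Bounds copied from the constraint quoted above: >=3.10.1,<=3.10.12
MIN_PYTHON = (3, 10, 1)
MAX_PYTHON = (3, 10, 12)


def is_supported_python(version=None):
    """Return True if (major, minor, micro) lies inside the inclusive window."""
    if version is None:
        version = sys.version_info[:3]
    return MIN_PYTHON <= tuple(version[:3]) <= MAX_PYTHON


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status}")
```

Under these bounds, 3.10.13 (the conda default mentioned in (ii)) is rejected, which is why "or higher" was misleading.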
alex-mccarthy-unity and others added 21 commits March 13, 2024 13:23
…(2022 and trunk) (Unity-Technologies#6079)

* Fix com.unity.ml-agents test 2022.3 on win by upgrading Unity versions

Run all CI tests against 2023.2 (required by Sentis), not 2022.3

Sample failing sub-jobs before: https://unity-ci.cds.internal.unity3d.com/job/35022030/dependency-graph
Sample passing sub-jobs after: https://unity-ci.cds.internal.unity3d.com/job/35033024/dependency-graph

Note that `trunk` jobs are still failing after this fix.
Those will be investigated separately since they've been failing since March 8: https://unity-ci.cds.internal.unity3d.com/job/34919178/dependency-graph

* Ignore yamato-parser output files

* Disable `trunk` tests, which break with Unity 6

Clean run of "Run All Combinations of Editors/Platforms Tests" after this change: https://unity-ci.cds.internal.unity3d.com/job/35037130/dependency-graph

* Print full diffs when a sensor mismatch occurs in tests

* Refactor tests to use positions instead of hardcoded numbers

* Disable `RigidBodySensorTests.TestBodiesWithJoint` which fails in 2023.2

* Fix editor version in test_versions.metafile (use 2023.2)
…indows (Unity-Technologies#6082)

Fixes Unity-Technologies#6047

Fixes the following errors when installing ml-agents-envs on windows if numpy 1.21.2 is already installed:
```
  Building wheel for numpy (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for numpy (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [326 lines of output]
      setup.py:63: RuntimeWarning: NumPy 1.21.2 may not yet support Python 3.10.
        warnings.warn(
      Running from numpy source directory.
```
…es#6083)

The 8.x release should contain dotnet/runtime#90342 which fixes dotnet/runtime#80619.

I hope this will fix flaky failures like https://github.com/Unity-Technologies/ml-agents/actions/runs/8268945605/job/22623023348 of the form:
```
dotnet-format............................................................Failed
- hook id: dotnet-format
- exit code: 1

System.IO.IOException: The system cannot open the device or file specified. : 'NuGet-Migrations'
   at System.Threading.Mutex.CreateMutexCore(Boolean initiallyOwned, String name, Boolean& createdNew)
   at System.Threading.Mutex..ctor(Boolean initiallyOwned, String name)
   at NuGet.Common.Migrations.MigrationRunner.Run(String migrationsDirectory)
   at Microsoft.DotNet.Configurer.DotnetFirstTimeUseConfigurer.Configure()
   at Microsoft.DotNet.Cli.Program.ConfigureDotNetForFirstTimeUse(IFirstTimeUseNoticeSentinel firstTimeUseNoticeSentinel, IAspNetCertificateSentinel aspNetCertificateSentinel, IFileSentinel toolPathSentinel, Boolean isDotnetBeingInvokedFromNativeInstaller, DotnetFirstRunConfiguration dotnetFirstRunConfiguration, IEnvironmentProvider environmentProvider, Dictionary`2 performanceMeasurements)
   at Microsoft.DotNet.Cli.Program.ProcessArgs(String[] args, TimeSpan startupTime, ITelemetry telemetryClient)
   at Microsoft.DotNet.Cli.Program.Main(String[] args)
```
* Upgraded to Sentis v1.4.0-pre.3

* Disabling sonar qube yamato job. Will add gha sonar scanner at a later date.

* Addressing PR comments.

* Addressing feedback.

* Upgraded to sentis 2.0.0.

* Fixed failing tests.

* Fixed soccertwos policy.

* Fixed pytorch deprecation message during training startup. Updated Installation.md

* Updated installation docs.

* Fixed failing torch utils test.
* Update PerformancProject and DevProject.

* Removed mac perf tests.
* adding wrench

* correct build path

* release branch and 6.0 target

* XmlDoc update

* adressing xml docs

* more docs

* updating the release

* test xmldoc fixes

* more xml doc fixes

* Uncompress the 3DBall sample

* Fix API documentation

* more xml doc fixes

* Revert "Uncompress the 3DBall sample"

This reverts commit d67dc94.

* reformat MaxStep xml

* more xml doc fixes

* fix more xml doc issues

* fix summary tag

* Updated changelog for missing PRs.

* Removed tabs from .tests.json.

* Updated changelog.

* Removed tabs from CHANGELOG.

* Fix failing ci post upgrade (Unity-Technologies#6141) (Unity-Technologies#6145)

* Update PerformancProject and DevProject.

* Removed mac perf tests.

* Removing standalone tests dep from wrench packaging.

* Fixed package works issues. Updated com.unity.ml-agents.md.

* Updated com.unity.ml-agents.md.

* Updated package version in Academy.cs

* Adding back in package pack deps.

* Updated package pack testing deps..

* Regenerated wrench ymls.

* License update.

* Extensions License update.

* Another license tweak.

* Another license tweak.

* Upgraded to sentis 2.1.0.

* Updated standalone yamato build test to using new ml-agents ubuntu ci bokken image.

---------

Co-authored-by: alexandre-ribard <[email protected]>
Co-authored-by: Aurimas Petrovas <>
* Bumped versions.

* Bumped versions.

* Updated unity projects.

* Updated version validation.

* Fixed failing GPU test.
* adding wrench

* correct build path

* release branch and 6.0 target

* XmlDoc update

* adressing xml docs

* more docs

* updating the release

* test xmldoc fixes

* more xml doc fixes

* Uncompress the 3DBall sample

* Fix API documentation

* more xml doc fixes

* Revert "Uncompress the 3DBall sample"

This reverts commit d67dc94.

* reformat MaxStep xml

* more xml doc fixes

* fix more xml doc issues

* fix summary tag

* Updated changelog for missing PRs.

* Removed tabs from .tests.json.

* Updated changelog.

* Removed tabs from CHANGELOG.

* Fix failing ci post upgrade (Unity-Technologies#6141) (Unity-Technologies#6145)

* Update PerformancProject and DevProject.

* Removed mac perf tests.

* Removing standalone tests dep from wrench packaging.

* Fixed package works issues. Updated com.unity.ml-agents.md.

* Updated com.unity.ml-agents.md.

* Updated package version in Academy.cs

* Adding back in package pack deps.

* Updated package pack testing deps..

* Regenerated wrench ymls.

* License update.

* Extensions License update.

* Another license tweak.

* Another license tweak.

* Upgraded to sentis 2.1.0.

* Updated standalone yamato build test to using new ml-agents ubuntu ci bokken image.

* Bumped python and extensions package versions.

* Changed ci image for pytest gpu yamato test.

* Changed default cuda dtype to torch.float32.

* Updated version validation and extensions version.

* Fixed failing GPU test.

* Fixed failing GPU test.

* Updated readme table and make_readme_table.py

* Updated publish to pypi gha.

---------

Co-authored-by: alexandre-ribard <[email protected]>
Co-authored-by: Aurimas Petrovas <>
…dge setup in Installation.md (Unity-Technologies#6164)

* fix(docs): remove --branch flag from git clone command for bleeding edge setup in Installation.md

The `--branch` flag was causing a fatal error during the cloning process:
"fatal: You must specify a repository to clone."

* docs: correct config setting name from `conditioning_type` to `goal_conditioning_type`

Updated the documentation to reflect the correct config setting name `goal_conditioning_type` instead of `conditioning_type` in the config YAML. This ensures accuracy and prevents potential confusion for users following the documentation.
* Upgrade projects to Unity 6000.0

* Upgrade obsolete Unity API

* Use stable version of upm-ci-utils

* Upgrade Wrench configuration to Unity 6000.0

* Use ubuntu-ci v1.0.0 for Yamato tests

* Use b1.medium VM for the Pack test

* Rely solely on IsEqualUsingDot quaternion comparison in the Pose inverse test due to float inaccuracy

* Re-enable Unity trunk ml-agents tests

* Use ubuntu-24.04 for pre-commit

* Use ubuntu-22.04 for colab

* Add missing PR references to the changelog
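
To illustrate the dot-product comparison mentioned in the Pose inverse test item above: two unit quaternions that differ only by floating-point error have a dot product very close to 1, so comparing the dot product against a tolerance is more robust than comparing components exactly. This is a sketch of the general idea only; the function names and the tolerance are assumptions, not the code used in the ML-Agents tests.

```python
import math

EPSILON = 1e-6  # assumed tolerance, not taken from the test in question


def quaternion_dot(a, b):
    """Dot product of two quaternions given as (x, y, z, w) tuples."""
    return sum(x * y for x, y in zip(a, b))


def is_equal_using_dot(a, b, epsilon=EPSILON):
    """Treat two unit quaternions as equal when their dot product is within epsilon of 1."""
    return quaternion_dot(a, b) > 1.0 - epsilon


if __name__ == "__main__":
    identity = (0.0, 0.0, 0.0, 1.0)
    norm = math.sqrt(1e-7 ** 2 + 1.0)
    nearly_identity = (1e-7 / norm, 0.0, 0.0, 1.0 / norm)
    print(identity == nearly_identity)                   # exact comparison
    print(is_equal_using_dot(identity, nearly_identity)) # tolerant comparison
```

Note that in this sketch `q` and `-q`, which describe the same rotation, compare as different because the sign of the dot product is kept.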
* Update grpcio version in setup
* Update to Inference Engine 2.2.1

* Update documentation
…Technologies#6227)

* Move all files from the extension package to the main package

* Update the extension tests

* Move Runtime Input tests to a separate assembly

* Move Runtime example test to Tests

* Update CHANGELOG.md

* Update the doc

* Change namespace to Unity.MLAgents.Input

* Add MovedFrom tags

* Upgrade upm-pvp
* Remove Sample from the package

* Update CHANGELOG

* Remove sample description in package.json
…neous training with both PPO and SAC, each on a separate instance. For example, if you have 4 environments, the even-numbered ones will use PPO, and the odd-numbered ones will use SAC. Therefore, it's necessary to specify even values for num_envs.

CLAassistant commented Aug 3, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
8 out of 10 committers have signed the CLA.

✅ miguelalonsojr
✅ alex-mccarthy-unity
✅ Siddhu2502
✅ xyz2022
✅ alhasacademy96
✅ alexander-suvorov
✅ maryamziaa
✅ louisgthier
❌ hamidrexa
❌ VictorBarbosa
You have signed the CLA already but the status is still pending? Let us recheck it.

  "Pillow>=4.2.1",
- "protobuf>=3.6,<3.20",
+ "protobuf>=3.6,<3.21",

Cycode: Security vulnerabilities found in newly introduced dependency.

Ecosystem: PyPI
Dependency: protobuf
Dependency Paths: protobuf 3.20.3
Direct Dependency: Yes

The following vulnerabilities were introduced:

| GHSA | CVE | Severity | Fixed Version |
| --- | --- | --- | --- |
| GHSA-8qvm-5x2c-j2w7 | CVE-2025-4565 | HIGH | 4.25.8 |

Highest fixed version: 4.25.8
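
The gap between the pinned range and the fixed version can be made concrete with a small check. This is a sketch under stated assumptions: `parse_version` and `satisfies` are hypothetical helpers that handle only plain numeric versions such as `3.20.3`, and the bounds are the ones shown in the diff above (`>=3.6,<3.21`).

```python
def parse_version(text, width=3):
    """Turn '3.6' into (3, 6, 0). Only plain numeric versions are handled."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, lower_inclusive, upper_exclusive):
    """Return True if lower_inclusive <= version < upper_exclusive."""
    parsed = parse_version(version)
    return parse_version(lower_inclusive) <= parsed < parse_version(upper_exclusive)


if __name__ == "__main__":
    # The flagged version is allowed by the specifier from the diff ...
    print(satisfies("3.20.3", "3.6", "3.21"))
    # ... while the fixed version reported by the scanner is not.
    print(satisfies("4.25.8", "3.6", "3.21"))
```

In other words, the fix reported by the scanner cannot be installed without raising the upper bound of the specifier.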

Description

Detects when new vulnerabilities affect your dependencies.

Tell us how you wish to proceed using one of the following commands:

| Tag | Short Description |
| --- | --- |
| `#cycode_vulnerable_package_fix_this_violation` | Fix this violation via a commit to this branch |
| `#cycode_ignore_manifest_here <reason>` | Applies to this manifest in this request only |

⚠️ When commenting on Github, you may need to refresh the page to see the latest updates.

  "Pillow>=4.2.1",
- "protobuf>=3.6,<3.20",
+ "protobuf>=3.6,<3.21",

Cycode: Security vulnerabilities found in newly introduced dependency.

Ecosystem: PyPI
Dependency: protobuf
Dependency Paths: protobuf 3.20.3
Direct Dependency: Yes

The following vulnerabilities were introduced:

| GHSA | CVE | Severity | Fixed Version |
| --- | --- | --- | --- |
| GHSA-8qvm-5x2c-j2w7 | CVE-2025-4565 | HIGH | 4.25.8 |

Highest fixed version: 4.25.8

Description

Detects when new vulnerabilities affect your dependencies.

Tell us how you wish to proceed using one of the following commands:

| Tag | Short Description |
| --- | --- |
| `#cycode_vulnerable_package_fix_this_violation` | Fix this violation via a commit to this branch |
| `#cycode_ignore_manifest_here <reason>` | Applies to this manifest in this request only |

⚠️ When commenting on Github, you may need to refresh the page to see the latest updates.

@@ -72,7 +72,7 @@ def run(self):
  "attrs>=19.3.0",
  "huggingface_hub>=0.14",
  'pypiwin32==223;platform_system=="Windows"',
- "onnx==1.12.0",
+ "onnx==1.15.0",

Cycode: Security vulnerabilities found in newly introduced dependency.

Ecosystem: PyPI
Dependency: onnx
Dependency Paths: onnx 1.15.0
Direct Dependency: Yes

The following vulnerabilities were introduced:

| GHSA | CVE | Severity | Fixed Version |
| --- | --- | --- | --- |
| GHSA-h36j-8vv3-cj52 | CVE-2024-7776 | HIGH | 1.17.0 |
| GHSA-6rq9-53c3-f7vj | CVE-2024-5187 | HIGH | 1.16.2 |
| GHSA-whh8-fjgc-qp73 | CVE-2024-27318 | HIGH | 1.16.0 |

Highest fixed version: 1.17.0

Description

Detects when new vulnerabilities affect your dependencies.

Tell us how you wish to proceed using one of the following commands:

| Tag | Short Description |
| --- | --- |
| `#cycode_vulnerable_package_fix_this_violation` | Fix this violation via a commit to this branch |
| `#cycode_ignore_manifest_here <reason>` | Applies to this manifest in this request only |

⚠️ When commenting on Github, you may need to refresh the page to see the latest updates.

VictorBarbosa (Author) commented Aug 3, 2025

We often find ourselves unsure whether to use PPO or SAC, and sometimes we only realize we should’ve chosen the other one after spending a long time training. So why not train both at the same time?

I’ve added a feature that allows simultaneous training with both PPO and SAC. Each will run on its own separate instance. For this to work properly, it's important to set num_envs to a value divisible by 2.

Below, you can see an example of how the YAML configuration looks for this training setup.

[image: screenshot of the example YAML configuration]
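
The even/odd split described in this PR can be sketched as follows. This is an illustration only, not the PR's implementation: `assign_trainers` is a hypothetical function, and it assumes environments are numbered from 0 so that indices 0, 2, ... use PPO and 1, 3, ... use SAC.

```python
def assign_trainers(num_envs):
    """Return the trainer name for each environment index in 'both' mode.

    Hypothetical helper: even indices get "ppo", odd indices get "sac".
    """
    if num_envs <= 0 or num_envs % 2 != 0:
        # An odd count would leave one algorithm with fewer environments.
        raise ValueError("num_envs must be a positive even number in 'both' mode")
    return ["ppo" if index % 2 == 0 else "sac" for index in range(num_envs)]


if __name__ == "__main__":
    print(assign_trainers(4))  # → ['ppo', 'sac', 'ppo', 'sac']
```

Rejecting odd values up front keeps the two algorithms on the same number of environments, which is what makes their results comparable.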
