Releases: NVIDIA/NVFlare
2.7.1rc1: New features and bug fixes
What's Changed
- Fix webpage links by @nvkevlu in #3812
- Bump vite from 6.3.6 to 6.4.1 in /web by @dependabot[bot] in #3807
- Update Hello-PT and Stats by @holgerroth in #3810
- Bump astro from 5.13.2 to 5.14.4 in /web by @dependabot[bot] in #3776
- Use Python 3.9 typing by @cyyever in #3611
- Fix docs and add missing diagram by @nvkevlu in #3815
- Simplify quantization code by @cyyever in #3612
- Adjust supported minimum Python versions to 3.9 by @cyyever in #3665
- Enhance MLflow receiver by @YuanTingHsieh in #3657
- Fix edge simulator by @YuanTingHsieh in #3813
- Deployment guide of confidential ACI [skip ci] by @IsaacYangSLA in #3816
- Add Azure CVM deployment guide [skip ci] by @IsaacYangSLA in #3820
- Update release notes by @YuanTingHsieh in #3818
- Fix a syntax error in ACI doc [skip ci] by @IsaacYangSLA in #3823
- Update sub_start.sh with one additional option by @IsaacYangSLA in #3821
- Update cc docs by @YuanTingHsieh in #3824
- Add FedNCA publication by @nickl1234567 in #3806
- CC document updates and others by @IsaacYangSLA in #3826
- Add Tensor Stream component for efficient safetensors-based model tensor streaming by @rfilgueiras in #3741
- Add GPU CC docs by @YuanTingHsieh in #3825
- Update FAQ by @nvkevlu in #3832
- Android app enhancements for state management and status on UI by @nvkevlu in #3819
- New CC token verification mechanism by @YuanTingHsieh in #3829
New Contributors
- @nickl1234567 made their first contribution in #3806
Full Changelog: 2.7.0...2.7.1rc1
2.7.0: Major release
2.7.0 Release Contributors (Chronological Order)
The following list shows all contributors to the NVFLARE 2.7.0 release (from 2.6.2 to 2.7.0), ordered by when they first contributed.
- Zhijin - 4 commits
- Holger Roth - 36 commits
- Zhihong Zhang - 19 commits
- Sean Yang - 8 commits
- Ziyue Xu - 26 commits
- Yuan-Ting Hsieh (謝沅廷) - 82 commits
- Yan Cheng - 43 commits
- Isaac Yang - 10 commits
- Chester Chen - 59 commits
- Kevin Lu - 17 commits
- Emmanuel Ferdman - 1 commit
- Georg Slamanig - 1 commit
- Ruben Bagan Benavides - 1 commit
- Peixin - 1 commit
- Yuanyuan Chen - 5 commits
- Francesco Farina - 1 commit
- Suizhi Huang - 1 commit
Welcome First-Time Contributors!
We would like to extend a special thank you to the following contributors who made their first commits to NVFLARE in this release:
- Emmanuel Ferdman (@emmanuel-ferdman) - 1 commit - PR #3469
- Georg Slamanig (@gslama12) - 1 commit - PR #3495
- Ruben Bagan Benavides (@rbagan) - 1 commit - PR #3506
- Yuanyuan Chen (@cyyever) - 5 commits - PR #3573
- Suizhi Huang (@JeanDiable) - 1 commit - PR #3629
Thank you for your contributions to the NVFLARE community!
Highlights of This Release
For the complete list of updates and features, please see the full release notes.
Below are some of the key highlights in this version:
Confidential Federated AI
Read more in the FLARE Confidential Federated AI Guide.
First-of-its-Kind End-to-End IP Protection
This release introduces a first-of-its-kind, end-to-end intellectual property (IP) protection solution for federated learning using confidential computing. The solution supports on-premise deployments on bare metal with AMD CPUs and NVIDIA GPUs running inside Confidential VMs (CVMs).
End-to-End Protection
End-to-end protection means both safeguarding runtime IP (model and code) and ensuring deployment integrity against CVM tampering or unauthorized modification.
Key Capabilities
- Secure Aggregation (Server-Side): Prevents privacy leaks through shared model parameters.
- Model Theft Protection (Client-Side): Protects proprietary model IP during collaboration.
- Data Leak Prevention (Client-Side): Only pre-approved, certified code runs, and no one can alter the code inside the CVM.
Job Recipe
Introducing the new Flare Job Recipe: a lightweight way to capture the code needed to specify the client training logic and the server-side algorithm. The same Job Recipe can run seamlessly in SimEnv, PoCEnv, or ProdEnv, from local experiments to production deployments.
With Flare Job Recipe, we are making the federated learning workflow dramatically simpler for data scientists. In most cases, constructing a complete federated learning job requires only 6 or so lines of Python code. When combined with the Client API (typically 4 or more lines), building and running federated learning experiments becomes almost effortless.
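For readers new to the server-side algorithm that a FedAvg recipe configures, the core idea is a sample-weighted average of client weights. The following is a minimal plain-Python sketch of that idea only. It is not NVFlare's implementation, and the `fedavg` function, the `Weights` alias, and the sample data are hypothetical names chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def fedavg(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its sample count."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    result: Weights = {}
    for key, first in updates[0][0].items():
        acc = [0.0] * len(first)
        for weights, n in updates:
            for i, value in enumerate(weights[key]):
                # Each client contributes in proportion to its data size.
                acc[i] += value * n / total
        result[key] = acc
    return result


# Two illustrative clients with 10 and 30 local samples.
client_a = ({"layer": [1.0, 2.0]}, 10)
client_b = ({"layer": [3.0, 4.0]}, 30)
global_weights = fedavg([client_a, client_b])
print(global_weights)  # {'layer': [2.5, 3.5]}
```

The client with more samples pulls the average toward its own weights, which is why the result sits closer to `client_b`.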
Example: FedAvg Job Recipe
n_clients = args.n_clients
num_rounds = args.num_rounds
batch_size = args.batch_size

recipe = FedAvgRecipe(
    name="hello-pt",
    min_clients=n_clients,
    num_rounds=num_rounds,
    initial_model=SimpleNetwork(),
    train_script="client.py",
    train_args=f"--batch_size {batch_size}",
)
add_experiment_tracking(recipe, tracking_type="tensorboard")

env = SimEnv(num_clients=n_clients)
run = recipe.execute(env)
print()
print("Result can be found in :", run.get_result())
print("Job Status is:", run.get_status())
print()

Enhanced Communication: Port Consolidation and New HTTP Driver
Port Consolidation
Previously, FLARE's server required two separate ports: one for FL client/server communication and another for Admin client/server communication. In 2.7, these are merged into a single configurable port, reducing network configuration complexity. Dual-port mode remains available for environments with stricter network policies.
New HTTP Driver
The HTTP driver has been rewritten using aiohttp to address prior performance limitations. It now matches gRPC performance, while maintaining the same API, TLS support, and backward compatibility with existing deployments.
Key Benefits
- Consolidated Port: Reduced from two ports to a single port, simplifying deployment.
- Standard Port Compatibility: Use the standard HTTPS port 443, so IT does not need to open additional ports.
- High Performance: New HTTP driver matches gRPC in speed and reliability.
Develop Edge Applications with FLARE
FLARE 2.7 extends federated learning to edge devices with features that directly address the unique challenges of edge environments:
Key Features
- Scalability: Hierarchical Federated Architecture
  Hierarchical FLARE allows millions of edge devices to participate efficiently without connecting each directly to the server.
- Intermittent Device Participation: Asynchronous FL based on FedBuff
  FLARE handles devices that may join, leave, or fail to return local training results due to network or power interruptions.
- Cross-Platform & No Device Programming Required
  Data scientists can deploy models to iOS and Android with FLARE Mobile Development without writing Swift, Objective-C, Java, or Kotlin. FLARE handles PyTorch → ExecuTorch conversion and device training code automatically.
- Simulation Tools: Device Simulator for Large Scale Testing
  Test and validate edge deployments at scale before production deployment.
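To make the FedBuff-style asynchronous aggregation mentioned above more concrete, here is a minimal plain-Python sketch of the buffering idea: the server does not wait for every device, and instead updates the global model whenever a fixed number of client deltas has arrived. This is an illustration under simplifying assumptions, not FLARE's implementation. The `FedBuffServer` class, its parameters, and the flat-list model are hypothetical, and staleness weighting of late updates is omitted.

```python
from typing import List


class FedBuffServer:
    """Toy buffered asynchronous aggregator (hypothetical, for illustration)."""

    def __init__(self, model: List[float], buffer_size: int, server_lr: float = 1.0):
        if buffer_size < 1:
            raise ValueError("buffer_size must be at least 1")
        self.model = list(model)
        self.buffer_size = buffer_size
        self.server_lr = server_lr
        self.version = 0  # incremented on every global update
        self._buffer: List[List[float]] = []

    def submit(self, delta: List[float]) -> bool:
        """Buffer one client delta (local minus global weights).

        Returns True if this submission triggered a global model update.
        """
        if len(delta) != len(self.model):
            raise ValueError("delta has the wrong length")
        self._buffer.append(list(delta))
        if len(self._buffer) < self.buffer_size:
            return False  # keep waiting; devices that never report do not block

        # Buffer is full: apply the mean delta, then start a new buffer.
        k = len(self._buffer)
        for i in range(len(self.model)):
            mean_delta = sum(d[i] for d in self._buffer) / k
            self.model[i] += self.server_lr * mean_delta
        self._buffer.clear()
        self.version += 1
        return True


server = FedBuffServer(model=[0.0, 0.0], buffer_size=2)
server.submit([1.0, 1.0])    # buffered, no update yet
server.submit([3.0, -1.0])   # buffer full, global model updated
print(server.model, server.version)  # [2.0, 0.0] 1
```

Because an update needs only `buffer_size` results rather than all of them, a device that drops out mid-round simply never fills a buffer slot, and training continues with whichever devices report next.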
Complete Self-Paced Training Tutorials
Welcome to the five-part course on Federated Learning with NVIDIA FLARE! This course covers everything from the fundamentals to advanced applications, system deployment, privacy, security, and real-world industry use cases.
This comprehensive tutorial is now complete with 100+ notebooks and 80 instructional videos, all fully recorded and ready for learning. See details in Self-Paced Training Tutorials.
Full Changelog: 2.6.0...2.7.0
2.7.0rc10: Bug fixes. Final RC.
What's Changed
- Architecture documentation fix [skip ci] by @chesterxgchen in #3797
- [HE Tutorial] add missing codes by @holgerroth in #3793
- fix requirements bug by @chesterxgchen in #3798
Full Changelog: 2.7.0rc9...2.7.0rc10
2.7.0rc9: Bug fixes and document updates
What's Changed
- Add diagrams and docs improvements [skip ci] by @nvkevlu in #3788
- Fix broken links by @YuanTingHsieh in #3789
- Documentation structure and what is new updates [skip ci] by @chesterxgchen in #3779
- Fix image_stats integration test by @YuanTingHsieh in #3790
- Add System architecture and Security Architecture documentations [skip ci] by @chesterxgchen in #3794
- Updated CC related User Guide [skip ci] by @nvidianz in #3795
- Add allow_out_ports by @YuanTingHsieh in #3796
Full Changelog: 2.7.0rc8...2.7.0rc9
2.7.0rc8: Bug fixes
What's Changed
- FedStats: Improve error messages by @holgerroth in #3770
- Fix Conftest.py by @chesterxgchen in #3771
- Remove production board text [skip ci] by @YuanTingHsieh in #3772
- Add missing init in tf recipes by @YuanTingHsieh in #3773
- Hello-word documentation update [skip ci] by @chesterxgchen in #3778
- Fix colab commands in hello-pt by @holgerroth in #3781
- Lower the log severity on timeout. No change in logic. by @IsaacYangSLA in #3782
- Misc Notebook updates by @chesterxgchen in #3777
- Force new token generation when check a new job by @YuanTingHsieh in #3787
- Make admin server timeout configurable by @YuanTingHsieh in #3786
- Do not generate start_all.sh and use async grpc by default by @nvidianz in #3783
- [BioNeMo] Use decomposer register widget by @holgerroth in #3784
- Include a section describing how to build KBS docker images [skip ci] by @IsaacYangSLA in #3785
- Fixed a certificate issue with newer OpenSSL by @nvidianz in #3775
Full Changelog: 2.7.0rc7...2.7.0rc8
2.7.0rc7: Bug fixes
What's Changed
- Docs examples, notebook updates [skip ci] by @nvkevlu in #3757
- update requirements.txt for Hierarchical Fed Stats [skip ci] by @chesterxgchen in #3758
- Fix federated Stats Advanced folders by @chesterxgchen in #3753
- hello-pt: restore requirements installation and handle Colab by @chesterxgchen in #3760
- fix code that overwrite the original notebook. Fix Colab issue for hello-pt by @chesterxgchen in #3761
- Add colab support 3 by @chesterxgchen in #3762
- Update the swarm learning example. by @IsaacYangSLA in #3759
- Reset task_data in client after executor is run. by @nvidianz in #3763
- Fix SNPAuthorizer issue by @YuanTingHsieh in #3764
- fix image_stats.ipynb by @chesterxgchen in #3768
- Fixed XGB Plugin Compiling Errors by @nvidianz in #3767
- Updates on llm and xgb examples by @ZiyueXu77 in #3765
- fix notebooks in advanced directory 4 by @chesterxgchen in #3769
- Update CC docs [skip ci] by @YuanTingHsieh in #3756
Full Changelog: 2.7.0rc6...2.7.0rc7
2.7.0rc6: Bug fixes
What's Changed
- Fix some typos by @cyyever in #3720
- Add min_clients to CyclicRecipe by @YuanTingHsieh in #3724
- Add PT/TF specific CyclicRecipe for better UX by @YuanTingHsieh in #3723
- Remove heartbeat_timeout in job templates to use default value by @YuanTingHsieh in #3728
- Support native decompose using for tensor using safetensors by @yanchengnv in #3732
- Update What's new in documentation [skip ci] by @chesterxgchen in #3729
- Documentation: add side bar title: somehow the original title got lost [skip ci] by @chesterxgchen in #3736
- Videos: Add Tutorial Videos [skip ci] by @chesterxgchen in #3734
- Sidebar tweak [skip ci] by @chesterxgchen in #3737
- Documentation update: what's new etc. [skip ci ] by @chesterxgchen in #3739
- update doc by @chesterxgchen in #3742
- update by @chesterxgchen in #3743
- Fix admin script user name by @YuanTingHsieh in #3735
- Fix job process start commands by @yanchengnv in #3733
- Adjust get task timeout for proper LLM execution by @ZiyueXu77 in #3744
- Add and Update Edge and Android Docs by @nvkevlu in #3727
- Update hello pt by @holgerroth in #3738
- Fix filter race issue by @ZiyueXu77 in #3745
- [Flower] Add Flower supernode (client) node config information by @holgerroth in #3712
- fix job name due to argument name change by @chesterxgchen in #3746
- Fix admin user name in poc mode by @YuanTingHsieh in #3748
- Enhance snp with retry by @YuanTingHsieh in #3749
- Add pytest nbmake plugable to enable auto notebooks test by @chesterxgchen in #3750
- Fix simulator hanging when client script raise OOM by @YuanTingHsieh in #3725
- Support decomposer registration in recipe by @yanchengnv in #3747
- [MONAI] Remove multi-gpu job by @holgerroth in #3754
- Reduced Memory Footprint by @nvidianz in #3751
- Enhance logging error message by @YuanTingHsieh in #3752
Full Changelog: 2.7.0rc5...2.7.0rc6
2.7.0rc5: Bug fixes
What's Changed
- Add job recipe notebook by @ZiyueXu77 in #3643
- Increase default timeout value by @YuanTingHsieh in #3708
- update xgb doc [skip ci] by @chesterxgchen in #3709
- Improve cli error message by @YuanTingHsieh in #3711
- Update async_num assessor's selection list content by @ZiyueXu77 in #3715
- Fix android connection issues by @nvkevlu in #3716
- Add overall summary for XGB encryption and acceleration [skip ci] by @ZiyueXu77 in #3710
- make safetensors as required dependency by @chesterxgchen in #3718
- 2.7.0 Documentations [skip ci] by @chesterxgchen in #3713
- Update cc provision by @YuanTingHsieh in #3719
Full Changelog: 2.7.0rc4...2.7.0rc5
2.7.0rc4: Bug fixes
What's Changed
- handle None quantile returns. by @chesterxgchen in #3659
- [NVFlare Summer of Code] Adding FedHCA2 (CVPR2024) to the research part. by @JeanDiable in #3629
- Logistics Regression Recipe by @chesterxgchen in #3650
- Move DFStats example from job_api to hello-world by @chesterxgchen in #3660
- Bug Fixes : FLARE-2645, 2644 by @chesterxgchen in #3662
- Documentation Update: Recipes, Quick Start and Tutorials by @chesterxgchen in #3667
- Hello-LR Update by @chesterxgchen in #3663
- Fix documentation issues in Cyclic and Logistics Regression [skip ci] by @chesterxgchen in #3670
- Support extra properties in ExecEnv by @yanchengnv in #3672
- Add wait_for_device info and timeout by @ZiyueXu77 in #3632
- Add SSL support for Android Edge App by @nvkevlu in #3668
- Remove white space by @cyyever in #3664
- Fix use_aio_grpc processing by @yanchengnv in #3676
- Add decomposer register by @yanchengnv in #3678
- Add auto version cleaning for aggr and global model version keeping by @ZiyueXu77 in #3642
- Move tf getting started example by @holgerroth in #3682
- move hello-world/hello-fedavg from advanced/fedavg-with-early-stopping by @chesterxgchen in #3681
- Get started Examples removal -Part 1, PyTorch and Lightning Example by @chesterxgchen in #3680
- Add FlowerRecipe by @holgerroth in #3673
- remove getting_started example by @chesterxgchen in #3683
- Bump astro from 5.12.8 to 5.13.2 in /web by @dependabot[bot] in #3617
- Bump devalue from 5.1.1 to 5.3.2 in /web by @dependabot[bot] in #3648
- Bump vite from 6.3.5 to 6.3.6 in /web by @dependabot[bot] in #3674
- Update edge examples to use prod environment by @holgerroth in #3677
- Sign local folders of startup kits. by @IsaacYangSLA in #3679
- Fix missing job name in edge simulation feg_api by @yanchengnv in #3692
- Fix android job selection logic by @nvkevlu in #3688
- Add Python utility code to verify startup kits by @IsaacYangSLA in #3690
- Increase the grace period so error msg won't show up by @YuanTingHsieh in #3689
- Add job name to sample edge simulator config by @YuanTingHsieh in #3691
- Update the sub_start.sh to support pre launch verification by @IsaacYangSLA in #3694
- Update SimEnv get_status by @YuanTingHsieh in #3693
- Make via-file decomposers configurable by @yanchengnv in #3684
- Consolidate and improve hello examples for numpy by @nvkevlu in #3675
- Documentation Structure Re-org [skip ci] by @chesterxgchen in #3686
- Doc reg-org 2 [skip ci] by @chesterxgchen in #3700
- fix broken links by @chesterxgchen in #3703
- broken Links by @chesterxgchen in #3704
- fix broken links by @chesterxgchen in #3706
- CC provision with docker image workload by @YuanTingHsieh in #3639
- Refactor Run and ExecEnv by @YuanTingHsieh in #3699
- Fix hello numpy by @nvkevlu in #3702
New Contributors
- @JeanDiable made their first contribution in #3629
Full Changelog: 2.7.0rc3...2.7.0rc4
2.7.0rc3: Feature enhancements
What's Changed
- Fix job name in StatsJob by @holgerroth in #3653
- Enhance recipe and flare api abort job to return msg by @YuanTingHsieh in #3656
- handle cases where quantile config is un-defined for certain features by @chesterxgchen in #3654
- Fix examples for latest SFT by @cyyever in #3613
- Update Android edge app and documentation by @nvkevlu in #3651
- Enhance recipe experiment tracking by @YuanTingHsieh in #3655
Full Changelog: 2.7.0rc2...2.7.0rc3