source/funding/index.md (2 changes: 1 addition & 1 deletion)

@@ -49,13 +49,13 @@ Any opinions, findings, and conclusions of our research projects are those of th
 ![Ford](images/ford.png)
 ![Cisco](images/cisco.png)
 ![Google](images/google.png)
-![Slalom](images/slalom.png)
 </div>
 
 ## Past Sponsors
 
 <div class='flex-row'>
 
+![Slalom](images/slalom.png)
 ![VMware](images/vmware.png)
 ![Salesforce](images/salesforce.png)
 ![Meta](images/meta.png)
source/publications/index.md (1 change: 1 addition & 0 deletions)

@@ -159,6 +159,7 @@ venues:
   name: The 7th Conference on Machine Learning and Systems
   date: 2024-05-13
   url: https://mlsys.org/Conferences/2024
+  acceptance: 22.02%
 - key: MLSys'23
   name: The 6th Conference on Machine Learning and Systems
   date: 2023-06-04
source/research/index.md (16 changes: 6 additions & 10 deletions)

@@ -20,35 +20,31 @@ One of our key focus areas is multi-scale resource sharing for AI accelerators f
 We also work on planning and optimizing executions of distributed AI systems.
 Major projects include [Salus](https://github.com/SymbioticLab/Salus) and [Tiresias](https://github.com/SymbioticLab/Tiresias).
 
+## [Energy-Efficient Systems](/publications/#/topic:Energy-Efficient%20Systems)
+The energy consumption of computing systems is increasing with the rising popularity of Big Data and AI.
+While the hardware community has invested considerable effort in energy optimizations, we observe that comparable efforts on the software side are significantly lacking.
+[Our initiative](https://ml.energy) to understand and optimize the energy consumption of modern AI workloads is exposing new ways to understand energy consumption from software.
+Major projects include [Zeus](https://ml.energy/zeus), the first GPU energy-vs-training-performance tradeoff optimizer for DNN training.
+
 ## [Disaggregation](/publications/#/topic:Disaggregation)
 Modern datacenters often overprovision application memory to avoid performance cliffs, leading to 50% underutilization on average.
 Our research addresses this fundamental problem via practical memory disaggregation, whereby an application can use both local and remote memory over high-speed networks and, more recently, emerging CXL technology.
 We are building disaggregated systems whose remote-memory access latency is within hundreds of nanoseconds.
 We are generally interested in disaggregating all resources for fully utilized datacenters.
 Major projects include [Infiniswap](https://infiniswap.github.io/), the first practical memory disaggregation software, and [TPP](https://arxiv.org/abs/2206.02878).
 
-
 ## [Wide-Area Computing](/publications/#/topic:Wide-Area%20Computing)
 Most data is generated outside cloud datacenters.
 Collecting voluminous remote data at a central location not only presents a bandwidth and storage problem but is also increasingly likely to violate privacy regulations such as the General Data Protection Regulation (GDPR).
 In these settings, data systems must minimize communication instead.
 We are developing systems, algorithms, and benchmarks to analyze data distributed across multiple cloud datacenters and end-user devices, enabling geo-distributed/federated learning and analytics.
 Major projects include [FedScale](https://fedscale.ai/), the largest benchmark and a scalable, extensible platform for federated learning.
 
-
-## [Energy-Efficient Systems](/publications/#/topic:Energy-Efficient%20Systems)
-The energy consumption of computing systems is increasing with the rising popularity of Big Data and AI.
-While the hardware community has invested considerable effort in energy optimizations, we observe that comparable efforts on the software side are significantly lacking.
-[Our initiative](https://ml.energy) to understand and optimize the energy consumption of modern AI workloads is exposing new ways to understand energy consumption from software.
-Major projects include [Zeus](https://ml.energy/zeus), the first GPU energy-vs-training-performance tradeoff optimizer for DNN training.
-
-
 ## [Datacenter Networking](/publications/#/topic:Datacenter%20Networking)
 We also work on network resource management schemes to isolate Big Data and AI systems at the edge and inside the datacenter network.
 Our recent focus has primarily been on emerging networking technologies such as low-latency RDMA-enabled networks, programmable switches, and SmartNICs.
 We are also interested in improving existing networking infrastructure, for example by providing better QoS for low-latency RPCs in datacenters.
 Major projects include [Aequitas](https://github.com/SymbioticLab/Aequitas) and [Justitia](https://github.com/SymbioticLab/Justitia).
 
-
 ## [Big Data Systems](/publications/#/topic:Big%20Data%20Systems)
 In the recent past, we worked on designing and improving big data systems via new algorithms for resource scheduling, in-memory data caching, and dynamic query planning to improve resource efficiency, application performance, and fairness.
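
A few illustrative sketches of the ideas behind the projects named in source/research/index.md above may help; each is a simplified, hypothetical rendering, not the project's actual implementation. First, Tiresias schedules DNN training jobs using variants of least attained service. A minimal sketch of that idea, with made-up job numbers:

```python
# Minimal least-attained-service (LAS) sketch in the spirit of Tiresias:
# prefer the job that has consumed the least GPU service so far, where
# attained service = GPUs held * seconds run. Tiresias itself uses a
# discretized two-dimensional variant with priority queues, which this
# sketch does not reproduce.

def pick_next(jobs):
    """jobs: list of (job_id, num_gpus, seconds_run) tuples."""
    return min(jobs, key=lambda j: j[1] * j[2])[0]

queue = [("a", 4, 120.0), ("b", 1, 300.0), ("c", 8, 10.0)]
print(pick_next(queue))  # "c" has only 80 GPU-seconds of attained service
```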
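
The energy-vs-training-performance tradeoff that Zeus optimizes can be sketched as choosing a GPU power limit that minimizes a weighted cost of energy and time. The profiling numbers and the `eta` weight below are hypothetical placeholders; Zeus's actual online optimizer (see https://ml.energy/zeus) also tunes parameters such as batch size.

```python
# Hypothetical sketch (not Zeus itself): pick the GPU power limit that
# minimizes eta * energy + (1 - eta) * max_power * time, a weighted cost
# in the style of Zeus's energy-time cost metric. Numbers are made up.

# power limit (W) -> (epoch time in s, epoch energy in J)
profile = {
    300: (100.0, 30000.0),  # fastest, most energy-hungry
    250: (108.0, 27000.0),
    200: (122.0, 24400.0),
    150: (155.0, 23250.0),  # slowest, most energy-frugal
}

def best_power_limit(profile, eta=0.5):
    """eta in [0, 1]: how much energy savings are valued over speed.
    Scaling time by the maximum power limit puts both terms in Joules."""
    max_power = max(profile)
    def cost(limit):
        time_s, energy_j = profile[limit]
        return eta * energy_j + (1 - eta) * max_power * time_s
    return min(profile, key=cost)

print(best_power_limit(profile, eta=0.5))  # 250 W wins under this profile
```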
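
Memory disaggregation and CXL tiering, as in Infiniswap and TPP, hinge on keeping hot pages in fast local DRAM while colder pages live in slower remote or CXL-attached memory. The toy policy below only illustrates that placement problem; the capacity, counters, and promotion rule are invented, and TPP's actual kernel policy differs.

```python
# Toy hot/cold page placement across a local tier and a remote tier
# (e.g., network- or CXL-attached memory). All thresholds are invented;
# this is not TPP's actual promotion/demotion policy.

LOCAL_CAPACITY = 4  # pages of fast local DRAM (tiny, for illustration)

class TieredMemory:
    def __init__(self):
        self.local, self.remote = set(), set()
        self.hits = {}  # page -> access count, a crude hotness signal

    def access(self, page):
        self.hits[page] = self.hits.get(page, 0) + 1
        if page in self.local:
            return "local hit"        # fast path: nanosecond-class DRAM
        self.remote.discard(page)     # page was remote (or newly touched)
        self._promote(page)
        return "promoted"             # paid a slow remote access first

    def _promote(self, page):
        # Demote the coldest local page to the remote tier if out of room.
        if len(self.local) >= LOCAL_CAPACITY:
            coldest = min(self.local, key=lambda p: self.hits.get(p, 0))
            self.local.remove(coldest)
            self.remote.add(coldest)
        self.local.add(page)

mem = TieredMemory()
for p in [1, 2, 3, 4, 1, 1, 5, 2]:
    print(p, mem.access(p))
```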
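
For the federated learning work around FedScale, the core aggregation step of the classic FedAvg algorithm is easy to state: average client updates weighted by local dataset size. A self-contained sketch follows; the client numbers are made up, and FedScale's value lies in benchmarking realistic client populations, stragglers, and system heterogeneity, none of which appear here.

```python
# FedAvg-style aggregation sketch: average client model parameters
# weighted by the number of local training samples. Client data is made up.

def fedavg(updates):
    """updates: list of (num_samples, params) with params a list of floats."""
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * params[i] for n, params in updates) / total
            for i in range(dim)]

clients = [
    (100, [0.2, -0.1]),
    (300, [0.4,  0.3]),
    (600, [0.1,  0.0]),
]
print(fedavg(clients))  # [0.2, 0.08]: larger clients pull the average
```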
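
Finally, QoS for low-latency RPCs, the problem Aequitas targets, can be illustrated with a crude SLO-driven admission controller: admit fewer RPCs into a priority class when its measured tail latency violates the SLO. The AIMD-style constants below are invented, and this is not Aequitas's actual algorithm.

```python
# Crude SLO-driven admission control sketch (not Aequitas's algorithm):
# back off multiplicatively on SLO violations, recover additively otherwise.
import random

class AdmissionController:
    def __init__(self, slo_us, floor=0.1, step=0.01):
        self.slo_us, self.floor, self.step = slo_us, floor, step
        self.admit_prob = 1.0

    def observe(self, tail_latency_us):
        if tail_latency_us > self.slo_us:
            self.admit_prob = max(self.floor, self.admit_prob / 2)
        else:
            self.admit_prob = min(1.0, self.admit_prob + self.step)

    def admit(self):
        # Admitted RPCs keep high priority; others would be downgraded.
        return random.random() < self.admit_prob

ctrl = AdmissionController(slo_us=100)
ctrl.observe(tail_latency_us=180)  # violation: halve admission probability
print(ctrl.admit_prob)             # 0.5
```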