Milestone v0.3.0 #433

Open
@kerthcet

Description

What would you like to be added:

See the whole list: https://github.com/InftyAI/llmaz/milestone/3

We'll focus on three main things:

  • xPyD serving with heterogeneous devices. This needs a new orchestration layer built on top of lws, covering:

    • disaggregated PD serving
    • aggregated PD serving
  • More advanced routing policies, e.g. based on request profile and GPU type

  • GPU spot instance scaling that is ready for production environments

Nice to have:

  • Advanced Pod scaling with a dedicated scaler

Why is this needed:

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

Metadata

Assignees

No one assigned

    Labels

    feature: Categorizes issue or PR as related to a new feature.
    needs-priority: Indicates a PR lacks a label and requires one.
    needs-triage: Indicates an issue or PR lacks a label and requires one.
