[WIP][AQUA] GPU Shape Recommendation #1221

elizjo · 2025-07-07T20:41:19Z

Added a POST API and an aqua CLI command that recommend GPU shapes for a given model.

ads aqua deployment recommend_shape --model_id 'ocid1.datasciencemodel.oc1.<ocid>'

Request body (POST API)

{
  "model_ocid" : "ocid1.datasciencemodel.oc1.<ocid>"
}

Returns

{
    "display_name": "Almawave/Velvet-14B",
    "recommendations": [
        {
            "shape_details": {
                "available": false,
                "core_count": null,
                "memory_in_gbs": null,
                "name": "BM.GPU.MI300X.8",
                "shape_series": "GPU",
                "gpu_specs": {
                    "gpu_memory_in_gbs": 1536,
                    "gpu_count": 8,
                    "gpu_type": "MI300X",
                    "quantization": [
                        "fp8",
                        "gguf"
                    ],
                    "ranking": {
                        "cost": 90,
                        "performance": 90
                    }
                }
            },
            "configurations": [
                {
                    "model_details": {
                        "model_size_gb": 28.16,
                        "kv_cache_size_gb": 26.84,
                        "total_model_gb": 55.0
                    },
                    "deployment_params": {
                        "quantization": "bfloat16",
                        "max_model_len": 131072,
                        "params": ""
                    },
                    "recommendation": "No override PARAMS needed. \n\nModel fits well within the allowed compute shape (55.0GB used / 1536.0GB allowed)."
                }
            ]
        }
    ],
    "troubleshoot": ""
}
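
For anyone consuming this payload, here is a minimal sketch of how the fit check implied by the "recommendation" string could be reproduced client-side. It assumes the total is simply weights plus KV cache, which matches the numbers above (28.16 + 26.84 = 55.0). The helper name fits_on_shape is hypothetical and not part of ADS.

```python
import json

# Trimmed copy of the response shown above (only the fields used here).
REPORT = json.loads("""
{
  "display_name": "Almawave/Velvet-14B",
  "recommendations": [
    {
      "shape_details": {
        "name": "BM.GPU.MI300X.8",
        "gpu_specs": {"gpu_memory_in_gbs": 1536, "gpu_count": 8}
      },
      "configurations": [
        {
          "model_details": {
            "model_size_gb": 28.16,
            "kv_cache_size_gb": 26.84,
            "total_model_gb": 55.0
          }
        }
      ]
    }
  ]
}
""")


def fits_on_shape(recommendation: dict) -> list:
    """Hypothetical helper: one (total_gb, allowed_gb, fits) tuple per configuration."""
    allowed = recommendation["shape_details"]["gpu_specs"]["gpu_memory_in_gbs"]
    results = []
    for config in recommendation["configurations"]:
        details = config["model_details"]
        # Assumption: total memory = model weights + KV cache.
        total = round(details["model_size_gb"] + details["kv_cache_size_gb"], 2)
        results.append((total, allowed, total <= allowed))
    return results


for rec in REPORT["recommendations"]:
    print(rec["shape_details"]["name"], fits_on_shape(rec))
```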

Status: business logic works, API works, unit tests finished, rich diff CLI table finished

github-actions · 2025-07-07T21:10:04Z

📌 Cov diff with main:

📌 Overall coverage:

mrDzurb · 2025-08-01T17:24:07Z

ads/aqua/resources/gpu_shapes_index.json

    },
    "VM.GPU.A10.1": {
      "gpu_count": 1,
      "gpu_memory_in_gbs": 24,
-      "gpu_type": "A10"
+      "gpu_type": "A10",


Let's add FP8 for the A10 shapes as well.

mrDzurb · 2025-08-01T17:31:29Z

ads/aqua/common/utils.py

@@ -1287,6 +1287,7 @@ def load_gpu_shapes_index(

    # Merge: remote shapes override local
    local_shapes = local_data.get("shapes", {})
+    remote_data = {}


Why do we need this?
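
One possible reason, not confirmed in this thread: if the remote fetch fails before remote_data is assigned, the merge that follows would raise a NameError. A minimal sketch of that situation, where fetch_remote is a hypothetical callable standing in for the real loader and the fallback behavior is an assumption:

```python
from typing import Callable, Dict


def merge_shapes(local_data: Dict, fetch_remote: Callable[[], Dict]) -> Dict:
    """Merge shape indexes; remote entries override local ones."""
    local_shapes = local_data.get("shapes", {})
    remote_data: Dict = {}  # default so the merge still works if the fetch fails
    try:
        remote_data = fetch_remote()
    except Exception:
        # Assumption: a failed remote load falls back to the local index only.
        pass
    remote_shapes = remote_data.get("shapes", {})
    return {**local_shapes, **remote_shapes}
```

Without the default, the except branch would leave remote_data undefined and the later .get call would fail.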

mrDzurb · 2025-08-01T17:37:41Z

ads/aqua/extension/__init__.py

@@ -13,6 +13,7 @@
 from ads.aqua.extension.evaluation_handler import __handlers__ as __eval_handlers__
 from ads.aqua.extension.finetune_handler import __handlers__ as __finetune_handlers__
 from ads.aqua.extension.model_handler import __handlers__ as __model_handlers__
+from ads.aqua.extension.recommend_handler import __handlers__ as __gpu_handlers__


Maybe we can name it as __shape_handler?

mrDzurb · 2025-08-01T18:31:03Z

ads/aqua/shaperecommend/llm_config.py

+        Detects quantization bit-width as a string (e.g., '4bit', '8bit') from Hugging Face config dict.
+        """
+        if raw.get("load_in_8bit"):
+            return "8bit"


It would be better to move this into constants.
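
A rough sketch of what this could look like, assuming the detection only depends on boolean flags in the config dict. The constant and function names are hypothetical; load_in_8bit comes from the diff above, and load_in_4bit is the analogous bitsandbytes-style flag in Hugging Face configs.

```python
from typing import Optional

# Hypothetical constants module content: config flag -> bit-width label.
QUANT_FLAG_TO_BITS = {
    "load_in_8bit": "8bit",
    "load_in_4bit": "4bit",
}


def detect_bit_width(raw: dict) -> Optional[str]:
    """Return the bit-width label for the first truthy flag, or None if un-quantized."""
    for flag, label in QUANT_FLAG_TO_BITS.items():
        if raw.get(flag):
            return label
    return None
```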

mrDzurb · 2025-08-01T18:31:27Z

ads/aqua/shaperecommend/llm_config.py

+        If model is un-quantized, uses the weight size.
+        If model is pre-quantized, uses the quantization level.
+        """
+        key = (self.quantization or self.weight_dtype or "float32").lower()


Let's move "float32" to constants
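
For example, something along these lines (a sketch only: the constant names, the set of table entries, and the fallback for unknown keys are assumptions, not what the PR implements):

```python
from typing import Optional

# Hypothetical constants; bytes needed to store one parameter.
DEFAULT_WEIGHT_DTYPE = "float32"
BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "float16": 2.0,
    "8bit": 1.0,
    "4bit": 0.5,
}


def bytes_per_parameter(quantization: Optional[str], weight_dtype: Optional[str]) -> float:
    """Prefer the quantization level, then the weight dtype, then the default."""
    key = (quantization or weight_dtype or DEFAULT_WEIGHT_DTYPE).lower()
    # Assumption: unknown keys fall back to the default dtype's size.
    return BYTES_PER_PARAM.get(key, BYTES_PER_PARAM[DEFAULT_WEIGHT_DTYPE])
```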

mrDzurb · 2025-08-01T18:32:04Z

ads/aqua/shaperecommend/llm_config.py

+        """
+        vals = []
+        curr = min_len
+        max_seq_len = 16384 if not self.max_seq_len else self.max_seq_len


Let's move the numbers like 16384 to constants and add some description there
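
A minimal sketch of the idea. The diff only shows the start of the loop, so the doubling step is an assumption, and the constant and function names are hypothetical:

```python
from typing import List, Optional

# Fallback context length used when the model config does not declare one.
DEFAULT_MAX_SEQ_LEN = 16384


def candidate_context_lengths(min_len: int, max_seq_len: Optional[int] = None) -> List[int]:
    """List context lengths from min_len up to the limit (assumed: doubling each step)."""
    if min_len <= 0:
        raise ValueError("min_len must be positive")
    limit = max_seq_len or DEFAULT_MAX_SEQ_LEN
    vals = []
    curr = min_len
    while curr <= limit:
        vals.append(curr)
        curr *= 2
    return vals
```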

mrDzurb · 2025-08-04T01:51:51Z

ads/aqua/extension/recommend_handler.py

+    """
+
+    @handle_exceptions
+    def post(self, *args, **kwargs):  # noqa: ARG002


Would it make sense to move this handler to deployment_handler.py under the AquaDeploymentHandler class?

We already have a list_shape method there, so I suggest adding a new method called list_recommended_shape.

The endpoint path could be /aqua/deployments/recommended_shapes.

The main reason for this change is that we’ll be implementing similar logic for service models as well. Having a unified handler will allow us to return the recommended list of shapes for both service and custom models in a consistent way.

mrDzurb · 2025-08-04T04:29:51Z

ads/aqua/shaperecommend/recommend.py

+    ShapeRecommendationReport,
+    ShapeReport,
+)
+from ads.config import COMPARTMENT_OCID


Looks like this const variable is not used?

mrDzurb · 2025-08-04T04:32:15Z

ads/aqua/shaperecommend/recommend.py

+        return self.rich_diff_table(shape_recommend_report)
+
+    @staticmethod
+    def validate_model_ocid(ocid: str) -> DataScienceModel:


If we are not planning to use this method outside of the class, it would be better to make it private: def _validate_model_ocid. Please check the other methods as well.

mrDzurb · 2025-08-04T04:33:38Z

ads/aqua/shaperecommend/recommend.py

+from ads.model.datascience_model import DataScienceModel
+
+
+class AquaRecommendApp(AquaApp):


NIT: Maybe we can name it as AquaShapeRecommendApp?

Technically, the shape recommendation feature falls under the broader Model Deployment functionality. Instead of creating a new App class, I suggest we implement the business logic in a regular helper class and then integrate it into the existing AquaDeploymentApp.

Since AquaDeploymentApp already includes a list_shapes method, we can add a new method called list_recommended_shapes to maintain backward compatibility.

Looking ahead, if we need similar logic for other modules like fine-tuning or evaluation, we can follow the same pattern and add the corresponding methods in those apps as well.

As for naming, we could rename the current AquaRecommendApp to AquaShapeRecommend. I don't think we need to inherit from AquaApp in this case, since it's primarily a utility class focused on recommendation logic.
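
To make the suggestion concrete, here is a rough sketch of the composition. All classes below are simplified stand-ins with placeholder bodies, not the real ADS implementations; only the names AquaShapeRecommend, AquaDeploymentApp, list_shapes, list_recommended_shapes and which_shapes come from this review.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ShapeRecommendationReport:
    """Stand-in for the report model in this PR (fields reduced)."""

    display_name: str
    recommendations: List[dict] = field(default_factory=list)


class AquaShapeRecommend:
    """Plain helper class: no AquaApp inheritance, only recommendation logic."""

    def which_shapes(self, model_id: str) -> ShapeRecommendationReport:
        # Placeholder: the real logic would inspect the model config.
        return ShapeRecommendationReport(display_name=model_id)


class AquaDeploymentApp:
    """Stand-in for the existing app; bodies are placeholders."""

    def list_shapes(self) -> List[str]:
        return []  # existing method, left untouched

    def list_recommended_shapes(self, model_id: str) -> ShapeRecommendationReport:
        # New method: delegate to the helper instead of a separate App class.
        return AquaShapeRecommend().which_shapes(model_id)
```

The same delegation pattern could be repeated in the fine-tuning or evaluation apps later on.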

mrDzurb · 2025-08-04T04:55:15Z

ads/aqua/shaperecommend/recommend.py

+        Use `ads aqua recommend which_gpu --help` to get more details on available parameters.
+    """
+
+    def which_gpu(self, **kwargs) -> ShapeRecommendationReport:


Maybe make it a bit more generic: which_shape?

mrDzurb · 2025-08-04T04:56:56Z

ads/aqua/shaperecommend/recommend.py

+        return recommendations
+
+    @staticmethod
+    def rich_diff_table(shape_report: ShapeRecommendationReport) -> Table:


I think we can make it private?

mrDzurb · 2025-08-04T05:54:39Z

ads/aqua/shaperecommend/recommend.py

+        ValueError
+            If the file cannot be opened, parsed, or the 'shapes' key is missing.
+        """
+        user_shapes = AquaDeploymentApp().list_shapes(compartment_id=compartment_id)


To avoid calling AquaDeploymentApp().list_shapes, I think it would be cleaner to add a shapes method directly to the OCIDataScienceModelDeployment class.

For example:

class OCIDataScienceModelDeployment(
    OCIDataScienceMixin,
    oci.data_science.models.ModelDeployment,
):
    @classmethod
    def shapes(
        cls,
        compartment_id: str = None,
        **kwargs,
    ) -> List[oci.data_science.models.ModelDeploymentShapeSummary]:
        return oci.pagination.list_call_get_all_results(
            cls().client.list_model_deployment_shapes,
            compartment_id or COMPARTMENT_ID,
            **kwargs
        ).data

This makes the logic more self-contained within the deployment model and avoids relying on external app instances just to retrieve shape info.

The usage could be:

user_shapes = OCIDataScienceModelDeployment.shapes(compartment_id=compartment_id)

mrDzurb · 2025-08-04T05:57:02Z

ads/aqua/shaperecommend/recommend.py

+            if name in set_user_shapes:
+                compute_shape = set_user_shapes.get(name)
+                compute_shape.available = True
+                compute_shape.shape_series = "GPU"


The oci.data_science.models.ModelDeploymentShapeSummary class already has a shape_series attribute (def shape_series(self, shape_series):). Maybe we can use it instead of hard-coding "GPU"?

elizjo · 2025-08-04T23:22:20Z

ads/aqua/modeldeployment/deployment.py

+
+        valid_shapes = []
+        # only loops through GPU shapes, update later to include CPU shapes
+        for name, spec in gpu_shapes_metadata.items():


In the future, we will need a list of all possible CPU shapes (similar to load_gpu_shapes_index()). We need this because we make recommendations based on which shapes are possible, not on which shapes are currently available.
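
A minimal sketch of that distinction: every shape in the index is a candidate, and availability is only a flag on top. The function name and the sample data layout are hypothetical; the shape names and GPU specs are taken from this PR.

```python
from typing import Dict, List, Set


def mark_availability(possible_shapes: Dict[str, dict], user_shape_names: Set[str]) -> List[dict]:
    """Keep every possible shape and flag the ones the user can currently launch."""
    result = []
    for name, spec in possible_shapes.items():
        result.append({**spec, "name": name, "available": name in user_shape_names})
    return result


# Index of possible GPU shapes (the kind of data load_gpu_shapes_index() returns).
gpu_shapes_metadata = {
    "VM.GPU.A10.1": {"gpu_count": 1, "gpu_memory_in_gbs": 24, "gpu_type": "A10"},
    "BM.GPU.MI300X.8": {"gpu_count": 8, "gpu_memory_in_gbs": 1536, "gpu_type": "MI300X"},
}

# Assumption for the example: only the A10 shape is available in the tenancy.
for shape in mark_availability(gpu_shapes_metadata, {"VM.GPU.A10.1"}):
    print(shape["name"], shape["available"])
```

A CPU index could be passed through the same function once it exists.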

mrDzurb · 2025-08-04T23:32:18Z

ads/aqua/extension/deployment_handler.py

@@ -57,6 +57,16 @@ def get(self, id: Union[str, List[str]] = None):
            return self.get_deployment_config(
                model_id=id.split(",") if "," in id else id
            )
+        elif paths.startswith("aqua/deployments/recommend_shapes"):


NIT: /recommended_shapes would be better.

mrDzurb · 2025-08-04T23:36:13Z

ads/aqua/extension/deployment_handler.py

+
+        compartment_id = self.get_argument("compartment_id", default=COMPARTMENT_OCID)
+
+        generate_table = (


Is this related to a different output format? If so, the handler should always return a JSON response regardless of the format.

mrDzurb · 2025-08-04T23:39:05Z

ads/aqua/extension/deployment_handler.py

@@ -57,6 +57,16 @@ def get(self, id: Union[str, List[str]] = None):
            return self.get_deployment_config(
                model_id=id.split(",") if "," in id else id
            )
+        elif paths.startswith("aqua/deployments/recommend_shapes"):
+            id = id or self.get_argument("model_id", default=None)


Why do we need this? It looks like we don't check "model_id" in the aqua/deployments/config case.

mrDzurb · 2025-08-04T23:42:14Z

ads/aqua/modeldeployment/constants.py

@@ -11,3 +11,5 @@

 DEFAULT_WAIT_TIME = 12000
 DEFAULT_POLL_INTERVAL = 10
+
+SHAPE_MAP = {"NVIDIA_GPU": "GPU"}


Just in case, here is the full list of supported series:

SHAPE_SERIES_AMD_ROME = "AMD_ROME"
SHAPE_SERIES_INTEL_SKYLAKE = "INTEL_SKYLAKE"
SHAPE_SERIES_NVIDIA_GPU = "NVIDIA_GPU"
SHAPE_SERIES_GENERIC = "GENERIC"
SHAPE_SERIES_LEGACY = "LEGACY"
SHAPE_SERIES_ARM = "ARM"

I did research this. The issue is that I'm not 100% sure whether these series map to GPU or CPU types. For now, I will map all of them to CPU types except for NVIDIA_GPU, since the AMD GPU shape (MI300X) did not have AMD_ROME as its shape_series parameter. I also did not see any of these series (except for NVIDIA_GPU) when we queried for GPU-only shapes.
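
The interim mapping described here could be sketched as follows. SHAPE_MAP is the constant from the diff; DEFAULT_SHAPE_TYPE and resolve_shape_type are hypothetical names, and treating every other series as CPU is the stated temporary assumption, not a confirmed rule.

```python
from typing import Optional

SHAPE_MAP = {"NVIDIA_GPU": "GPU"}
DEFAULT_SHAPE_TYPE = "CPU"  # interim assumption for every other series


def resolve_shape_type(shape_series: Optional[str]) -> str:
    """Map an OCI shape series to 'GPU' or 'CPU'; unknown or missing series count as CPU."""
    return SHAPE_MAP.get(shape_series, DEFAULT_SHAPE_TYPE)


for series in ("NVIDIA_GPU", "AMD_ROME", "INTEL_SKYLAKE", "GENERIC", "LEGACY", "ARM"):
    print(series, "->", resolve_shape_type(series))
```

One caveat with this default: a GPU shape whose series is not NVIDIA_GPU would be classified as CPU unless it is handled separately.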

mrDzurb · 2025-08-04T23:54:39Z

ads/aqua/shaperecommend/recommend.py

+    Must be used within a properly configured and authenticated OCI environment.
+    """
+
+    def which_shapes(self, **kwargs) -> Union[ShapeRecommendationReport, Table]:


Maybe instead of kwargs, we can accept RequestRecommend?

def which_shapes(self, request: RequestRecommend)

I think that's how it was done initially?
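
Roughly what I have in mind, as a sketch. RequestRecommend here is a dataclass stand-in, and its fields (model_id, compartment_id, generate_table) are assumptions based on the arguments seen elsewhere in this PR, not the actual model definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RequestRecommend:
    """Stand-in request object; field names are assumed for illustration."""

    model_id: str
    compartment_id: Optional[str] = None
    generate_table: bool = False

    def __post_init__(self):
        if not self.model_id:
            raise ValueError("model_id is required")


def which_shapes(request: RequestRecommend) -> dict:
    # Placeholder body: echo the validated inputs instead of building a report.
    return {"model_id": request.model_id, "as_table": request.generate_table}


# CLI-style kwargs can still be supported by building the request up front;
# unknown keys then fail early with a TypeError instead of being ignored.
kwargs = {"model_id": "ocid1.datasciencemodel.oc1.<ocid>", "generate_table": True}
print(which_shapes(RequestRecommend(**kwargs)))
```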

mrDzurb · 2025-08-05T00:08:36Z

ads/aqua/shaperecommend/recommend.py

+        ocid : str
+           OCID of the model to recommend feasible compute shapes.
+
+        available_shapes : List[ComputeShapeSummary]


If we want to prepare the available shapes in advance and pass them to the recommender, I think it would make more sense to move this parameter into the constructor:

class AquaShapeRecommend(BaseModel):
    available_shapes: List[ComputeShapeSummary] = Field(
        ..., description="List of available shapes in OCI."
    )

This way, we can decouple the shape fetching logic from the recommender logic and keep the design cleaner.

Alternatively, as previously discussed, we could define a method like def available_shapes() directly within the AquaShapeRecommend class. This method can internally use the OciDataScienceModelDeployment.shapes() function to retrieve the initial list of shapes. We can then move the valid_compute_shapes method from deployment.py there.

My vote would be for the second approach.

elizjo requested review from darenr, mayoor, mrDzurb, VipulMascarenhas, qiuosier and ahosler as code owners July 7, 2025 20:41

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Jul 7, 2025

mrDzurb changed the title ~~GPU Shape Recommendation~~ [AQUA] GPU Shape Recommendation Jul 7, 2025

mrDzurb changed the title ~~[AQUA] GPU Shape Recommendation~~ [WIP][AQUA] GPU Shape Recommendation Jul 15, 2025

elizjo added 4 commits July 25, 2025 10:27

inital code for GPU Shape Recommendator

18a92c4

modifications to handler

bd026e7

init implementation for gpu recommendations

4461af7

fixed docstrings and unused imports

26e08a2

elizjo force-pushed the ODSC-74228/GPU-Shape-Recommendation branch from 2f54f8b to 26e08a2 Compare July 25, 2025 17:29

added unit tests

7ce57c8

added rich diff table

a17b035

fixed unit tests

e94b6f1

Merge branch 'main' into ODSC-74228/GPU-Shape-Recommendation

fbfdb91

mrDzurb reviewed Aug 1, 2025

addressed comments, fixed rich diff table

ba605ee

mrDzurb reviewed Aug 4, 2025

Adds shapes method to the OciDataScienceModelDeployment

300aa17

addressed comments

96f5543

elizjo commented Aug 4, 2025

fixed formatting

3a431fa

mrDzurb reviewed Aug 5, 2025
