Add shm_size support to AzureML Orchestrator #4329

@dani-diffusedrive

Description

Contact Details [Optional]

No response

Feature Description

Add support for configurable shared memory (shm_size) parameter in AzureMLOrchestratorSettings. The underlying AzureML SDK v2 (JobResourceConfiguration) already supports this parameter, but ZenML's integration does not expose it to users.
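For reference, the parameter is already usable when talking to the AzureML SDK v2 directly. A minimal sketch (the environment and compute names are placeholders):

from azure.ai.ml import command
from azure.ai.ml.entities import JobResourceConfiguration

# Build a plain AzureML v2 command job (names below are placeholders).
job = command(
    command="python train.py",
    environment="my-training-env@latest",
    compute="gpu-cluster",
)

# JobResourceConfiguration already exposes shm_size at the SDK level;
# ZenML only needs to forward the value.
job.resources = JobResourceConfiguration(shm_size="16g")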

Problem or Use Case

Problem:
When running ML pipelines on AzureML through ZenML, containers default to around 64MB of shared memory (/dev/shm). This is insufficient for:

  • PyTorch dataloaders with multi-worker loading (see the sketch after this list)
  • Large model caching in RAM
  • In-memory data processing for large-scale ML workflows
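
To make the first point concrete, a hypothetical sketch: with num_workers > 0, PyTorch hands batches between worker processes through /dev/shm, so even a modest image batch overflows the default limit:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset of ImageNet-sized float tensors.
dataset = TensorDataset(torch.randn(2_000, 3, 224, 224))

# With num_workers > 0, batches move between processes via shared memory.
# A single 256 x 3 x 224 x 224 float32 batch is ~150MB, well past the
# default 64MB /dev/shm, which typically crashes the workers.
loader = DataLoader(dataset, batch_size=256, num_workers=4)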

Use Case:
Users working with large models (e.g., diffusion models, LLMs) need to cache model weights and intermediate tensors in shared memory. Without a configurable shm_size, these pipelines fail with shared-memory out-of-memory errors even when the compute instance has plenty of RAM.

Current Limitation:
The AzureML SDK v2's JobResourceConfiguration class supports shm_size, but ZenML's AzureMLOrchestratorSettings does not expose this parameter, requiring users to manually patch the ZenML source code.

Proposed Solution

Add a shm_size field to the AzureMLOrchestratorSettings class and pass it through to the underlying AzureML JobResourceConfiguration.

Implementation Changes:

1. Update AzureMLOrchestratorSettings:

File: zenml/integrations/azure/flavors/azureml_orchestrator_flavor.py

# Imports needed for the new field (typically already present in the module):
from typing import Optional

from pydantic import Field


class AzureMLOrchestratorSettings(AzureMLComputeSettings):
    """Settings for the AzureML orchestrator."""

    synchronous: bool = Field(
        default=True,
        description="Whether the orchestrator runs synchronously or not.",
    )

    # Add shm_size field
    shm_size: Optional[str] = Field(
        default=None,
        description="Size of the shared memory block (e.g. '2g', '200g').",
    )

2. Update AzureMLOrchestrator:

File: zenml/integrations/azure/orchestrators/azureml_orchestrator.py

Add import at the top of the file:

from azure.ai.ml.entities import (
    CommandComponent,
    JobResourceConfiguration,  # Add this import
    # ... other imports
)

Update the _create_command_component method signature and implementation:

@staticmethod
def _create_command_component(
    step: Step,
    image: str,
    command: List[str],
    environment_variables: Dict[str, str],
    environment: str,
    shm_size: Optional[str] = None,  # Add this parameter
) -> CommandComponent:
    """Create an AzureML CommandComponent for a pipeline step.
    
    Args:
        step: The ZenML step to create a component for.
        image: Docker image to use.
        command: Command to execute.
        environment_variables: Environment variables.
        environment: Environment name.
        shm_size: Optional shared memory size (e.g. '2g', '200g').
    
    Returns:
        CommandComponent configured for the step.
    """
    # Create resource configuration if shm_size is specified
    resources = None
    if shm_size:
        resources = JobResourceConfiguration(shm_size=shm_size)
    
    return CommandComponent(
        name=step.config.name,
        display_name=step.config.name,
        description=step.config.docstring,
        command=" ".join(command),
        environment=environment,
        environment_variables=environment_variables,
        resources=resources,  # Pass the resources configuration
    )

Update the submit_pipeline method to pass shm_size:

def submit_pipeline(
    self,
    deployment: "PipelineDeploymentResponse",
    stack: "Stack",
) -> None:
    """Submit a pipeline to AzureML."""
    # ... existing code ...
    
    
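    # `settings` below is assumed to be the resolved
    # AzureMLOrchestratorSettings for this deployment (in ZenML,
    # e.g. via self.get_settings(deployment)).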
    for step_name, step in deployment.step_configurations.items():

        # ... existing code ...

        components[step_name] = self._create_command_component(
            step=step,
            image=docker_image_name,
            command=command,
            environment_variables=env_vars,
            environment=environment_name,
            shm_size=settings.shm_size,  # Pass shm_size from settings
        )

User Experience:

from zenml.integrations.azure.flavors import AzureMLOrchestratorSettings

# Configure orchestrator with custom shared memory size
settings = AzureMLOrchestratorSettings(
    mode="compute-cluster",
    compute_name="gpu-cluster",
    shm_size="200g"  # Set shared memory to 200GB
)

# Use in pipeline execution
pipeline.with_options(settings={"orchestrator": settings})()
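
The same settings can equivalently be attached when the pipeline is defined (a sketch; my_pipeline is a placeholder):

from zenml import pipeline

@pipeline(settings={"orchestrator": AzureMLOrchestratorSettings(
    mode="compute-cluster",
    compute_name="gpu-cluster",
    shm_size="200g",
)})
def my_pipeline():
    ...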

Proposed Documentation:

**shm_size** (`Optional[str]`): Size of the shared memory block for the container (e.g., `"2g"`, `"200g"`).
Useful when working with large models, PyTorch dataloaders with multiple workers, or RAM-based caching.
When unset, the AzureML default applies (typically 64MB).

Alternatives Considered

1. Manual Patching (Current Workaround):

  • Manually edit ZenML source files in site-packages
  • Drawbacks:
    • Not maintainable across ZenML updates
    • Requires modifying installed packages
    • Not portable across environments

2. Using Docker --shm-size flag:

  • This approach doesn't apply to AzureML's managed job execution
  • AzureML's container configuration requires using the SDK's JobResourceConfiguration

Additional Context

Code Implementation Reference

The manual patch has been successfully tested and requires changes to two files:

File 1: zenml/integrations/azure/flavors/azureml_orchestrator_flavor.py

  • Add shm_size: Optional[str] field to AzureMLOrchestratorSettings

File 2: zenml/integrations/azure/orchestrators/azureml_orchestrator.py

  • Import: from azure.ai.ml.entities import JobResourceConfiguration
  • Modify: _create_command_component method to accept and use shm_size
  • Modify: submit_pipeline method to pass settings.shm_size to component creation

Technical Details

  • The AzureML SDK v2 already supports this via JobResourceConfiguration.shm_size
  • Format: String with size notation (e.g., "2g", "200g", "1024m")
  • Default behavior: When None, AzureML uses default (typically 64MB)
  • No breaking changes: The parameter is optional with default=None
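
Optionally, the new field could validate the size notation up front so misconfigurations fail at definition time rather than at job submission. A standalone sketch (not part of the tested patch; the pattern only covers the plain number-plus-suffix forms shown above):

import re
from typing import Optional

from pydantic import BaseModel, field_validator

# Accepts e.g. "2g", "200g", "1024m", "512"; the b/k/m/g suffixes follow
# Docker-style size notation. Illustrative, not exhaustive.
_SHM_SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?[bkmg]?$", re.IGNORECASE)

# Standalone stand-in; the real field would live on AzureMLOrchestratorSettings.
class ShmSettingsSketch(BaseModel):
    shm_size: Optional[str] = None

    @field_validator("shm_size")
    @classmethod
    def _validate_shm_size(cls, value: Optional[str]) -> Optional[str]:
        if value is not None and not _SHM_SIZE_PATTERN.match(value):
            raise ValueError(
                f"Invalid shm_size {value!r}; expected e.g. '2g', '200g', '1024m'."
            )
        return value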

Note

The Code of Conduct link in this feature request template appears to be broken (shows "Not found" error).

Priority

High - Critical for my use case

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Labels

  • core-team: Issues that are being handled by the core team
  • planned: Planned for the short term
  • snack
  • x-squad: Issues that are being handled by the x-squad

Status

Released
