Do Not Merge_Sample Pull request_MCP Example #3615

Open · wants to merge 6 commits into main
94 changes: 94 additions & 0 deletions .github/workflows/sdk-assets-sdkmcpsample-sdkmcpsample.yml
@@ -0,0 +1,94 @@
# This code is autogenerated.
# Code is generated by running custom script: python3 readme.py
# Any manual changes to this file may cause incorrect behavior.
# Any manual changes will be overwritten if the code is regenerated.

name: sdk-assets-sdkmcpsample-sdkmcpsample
# This file is created by sdk/python/readme.py.
# Please do not edit directly.
on:
  workflow_dispatch:
  schedule:
    - cron: "4 15/12 * * *"
  pull_request:
    branches:
      - main
    paths:
      - sdk/python/assets/sdkmcpsample/**
      - .github/workflows/sdk-assets-sdkmcpsample-sdkmcpsample.yml
      - sdk/python/dev-requirements.txt
      - infra/bootstrapping/**
      - sdk/python/setup.sh

permissions:
  id-token: write
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: check out repo
        uses: actions/checkout@v2
      - name: setup python
        uses: actions/setup-python@v2
        with:
          python-version: "3.10"
      - name: pip install notebook reqs
        run: pip install -r sdk/python/dev-requirements.txt
      - name: azure login
        uses: azure/login@v1
        with:
          client-id: ${{ secrets.OIDC_AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.OIDC_AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.OIDC_AZURE_SUBSCRIPTION_ID }}
      - name: bootstrap resources
        run: |
          echo '${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}';
          bash bootstrap.sh
        working-directory: infra/bootstrapping
        continue-on-error: false
      - name: setup SDK
        run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash setup.sh
        working-directory: sdk/python
        continue-on-error: true
      - name: validate readme
        run: |
          python check-readme.py "${{ github.workspace }}/sdk/python/assets/sdkmcpsample"
        working-directory: infra/bootstrapping
        continue-on-error: false
      - name: setup-cli
        run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash setup.sh
        working-directory: cli
        continue-on-error: true
      - name: Eagerly cache access tokens for required scopes
        run: |
          # Workaround for azure-cli's lack of support for ID token refresh
          # Taken from: https://github.com/Azure/login/issues/372#issuecomment-2056289617

          # Management
          az account get-access-token --scope https://management.azure.com/.default --output none
          # ML
          az account get-access-token --scope https://ml.azure.com/.default --output none
      - name: run assets/sdkmcpsample/sdkmcpsample.ipynb
        run: |
          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" generate_workspace_config "../../.azureml/config.json";
          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" replace_template_values "sdkmcpsample.ipynb";
          [ -f "../../.azureml/config" ] && cat "../../.azureml/config";
          papermill -k python sdkmcpsample.ipynb sdkmcpsample.output.ipynb
        working-directory: sdk/python/assets/sdkmcpsample
      - name: upload notebook's working folder as an artifact
        if: ${{ always() }}
        uses: ./.github/actions/upload-artifact
        with:
          name: sdkmcpsample
          path: sdk/python/assets/sdkmcpsample
1 change: 1 addition & 0 deletions sdk/python/README.md
@@ -43,6 +43,7 @@ Test Status is for branch - **_main_**
|assets|data|[working_with_mltable](assets/data/working_with_mltable.ipynb)|Read, write and register a data asset|[![working_with_mltable](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-working_with_mltable.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-data-working_with_mltable.yml)|
|assets|environment|[environment](assets/environment/environment.ipynb)|Create custom environments from docker and/or conda YAML|[![environment](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-environment-environment.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-environment-environment.yml)|
|assets|model|[model](assets/model/model.ipynb)|Create model from local files, cloud files, Runs|[![model](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-model-model.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-model-model.yml)|
|assets|sdkmcpsample|[sdkmcpsample](assets/sdkmcpsample/sdkmcpsample.ipynb)|Advanced Azure ML SDK v2 sample with MCP integration, comprehensive analytics, and enterprise-grade ML workflows|[![sdkmcpsample](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-sdkmcpsample-sdkmcpsample.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-assets-sdkmcpsample-sdkmcpsample.yml)|
|basic-installation|basic-installation|[basic-installation](basic-installation/basic-installation.ipynb)|Install and do a basic sanity of azure-ai-ml by listing the workspaces available in the resource group.||
|data-wrangling|interactive_data_wrangling.ipynb|[interactive_data_wrangling](data-wrangling/interactive_data_wrangling.ipynb)|*no description* - _This sample is excluded from automated tests_|[![interactive_data_wrangling](https://github.com/Azure/azureml-examples/actions/workflows/sdk-data-wrangling-interactive_data_wrangling.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-data-wrangling-interactive_data_wrangling.yml)|
|endpoints|batch|[custom-output-batch](endpoints/batch/deploy-models/custom-outputs-parquet/custom-output-batch.ipynb)|*no description*|[![custom-output-batch](https://github.com/Azure/azureml-examples/actions/workflows/sdk-endpoints-batch-deploy-models-custom-outputs-parquet-custom-output-batch.yml/badge.svg?branch=main)](https://github.com/Azure/azureml-examples/actions/workflows/sdk-endpoints-batch-deploy-models-custom-outputs-parquet-custom-output-batch.yml)|
256 changes: 256 additions & 0 deletions sdk/python/assets/sdkmcpsample/README.md
@@ -0,0 +1,256 @@
# Azure ML SDK MCP Sample

This folder contains an advanced sample notebook demonstrating comprehensive usage of Azure Machine Learning SDK v2 with a focus on MCP (Model Context Protocol) integration and enterprise-grade ML workflows.

## Overview

The `sdkmcpsample.ipynb` notebook provides an in-depth exploration of Azure ML SDK v2, covering advanced concepts and operations for experienced machine learning practitioners and MLOps engineers.

## What You'll Learn

- **Advanced Workspace Connection**: Robust authentication patterns with fallback mechanisms
- **Comprehensive Asset Management**: In-depth analysis of data, model, environment, and component assets
- **Advanced Compute Management**: Compute resource optimization and analysis
- **Sophisticated Data Operations**: Multi-dataset management with rich metadata and comprehensive analysis
- **Advanced Data Exploration**: Statistical analysis, visualizations, and data quality assessment
- **Job and Experiment Analytics**: Comprehensive workflow analysis and performance metrics
- **Model and Endpoint Management**: Advanced deployment strategies and endpoint health monitoring
- **Workspace Health Monitoring**: Performance metrics, health scoring, and optimization recommendations

## Prerequisites

Before running this notebook, ensure you have:

1. **Azure Subscription**: An active Azure subscription with a configured Azure ML workspace
2. **Python Environment**: Python 3.8+ with the following packages:
   ```bash
   pip install azure-ai-ml azure-identity pandas scikit-learn matplotlib seaborn
   ```
3. **Authentication**: Proper credentials for accessing your Azure ML workspace
4. **Permissions**: Appropriate RBAC permissions for asset management and compute operations
5. **Basic ML Knowledge**: Understanding of machine learning concepts and workflows

## Getting Started

1. **Clone the Repository**:
   ```bash
   git clone https://github.com/Azure/azureml-examples
   cd azureml-examples/sdk/python/assets/sdkmcpsample
   ```

2. **Set Up Environment**:
   ```bash
   # Create virtual environment
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate

   # Install dependencies
   pip install azure-ai-ml azure-identity pandas scikit-learn matplotlib seaborn
   ```

3. **Open the Notebook**:
   - Open `sdkmcpsample.ipynb` in Jupyter Lab, Jupyter Notebook, or VS Code
   - Alternatively, use Azure ML Studio notebooks for cloud execution

4. **Configure Workspace Details**:
   Update the workspace configuration in the notebook:
   ```python
   subscription_id = "<your-subscription-id>"
   resource_group_name = "<your-resource-group>"
   workspace_name = "<your-workspace-name>"
   ```

5. **Run the Notebook**:
   Execute the cells sequentially to work through the advanced Azure ML SDK v2 capabilities. A minimal connection sketch using the values from step 4 follows this list.
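
The sketch below shows how the values from step 4 typically feed into an `MLClient`; it is a minimal illustration rather than the notebook's exact code, and assumes `azure-ai-ml` and `azure-identity` are installed (see Prerequisites).

```python
# Minimal connection sketch (illustrative, not the notebook's exact code).
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

subscription_id = "<your-subscription-id>"
resource_group_name = "<your-resource-group>"
workspace_name = "<your-workspace-name>"

# DefaultAzureCredential resolves Azure CLI, environment, or managed identity credentials.
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=subscription_id,
    resource_group_name=resource_group_name,
    workspace_name=workspace_name,
)

# Fetch the workspace to confirm the client can reach it.
workspace = ml_client.workspaces.get(workspace_name)
print(f"Connected to workspace: {workspace.name} ({workspace.location})")
```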

## Notebook Structure

### 1. Setup and Installation
- Advanced package management and environment verification
- Comprehensive import statements for ML workflows

### 2. Advanced Workspace Connection
- Robust authentication with multiple credential fallbacks
- Connection validation and error handling

### 3. Comprehensive Workspace Assets Analysis
- Detailed analysis of all asset types with metadata
- Asset categorization and distribution analysis
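
As a rough illustration of the kind of inventory this section builds, the hedged sketch below counts assets by type; it assumes an authenticated `ml_client` from the connection step above.

```python
# Illustrative asset inventory, assuming an authenticated `ml_client`.
asset_counts = {
    "data": sum(1 for _ in ml_client.data.list()),
    "models": sum(1 for _ in ml_client.models.list()),
    "environments": sum(1 for _ in ml_client.environments.list()),
    "components": sum(1 for _ in ml_client.components.list()),
}
for asset_type, count in asset_counts.items():
    print(f"{asset_type:>12}: {count}")
```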

### 4. Advanced Compute Resource Management
- Compute resource optimization and scaling analysis
- Performance metrics and cost optimization
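
A minimal sketch of the compute survey this section performs might look like the following, again assuming an authenticated `ml_client`; the exact metrics in the notebook may differ.

```python
# Enumerate compute targets with their type and provisioning state (illustrative).
for compute in ml_client.compute.list():
    print(f"{compute.name}: type={compute.type}, state={compute.provisioning_state}")
```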

### 5. Advanced Data Asset Management
- Multi-dataset registration with rich metadata
- Data lineage and versioning strategies

### 6. Advanced Data Exploration and Analysis
- Statistical analysis with comprehensive visualizations
- Data quality assessment and anomaly detection

### 7. Advanced Job and Experiment Management
- Workflow analysis and performance tracking
- Experiment comparison and optimization insights
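
For orientation, a hedged sketch of the job analytics this section covers is shown below; it assumes an authenticated `ml_client` and `pandas`, and caps the listing at 50 recent jobs for a quick look.

```python
# Pull recent jobs into a DataFrame for status and experiment analysis (illustrative).
import itertools
import pandas as pd

jobs = list(itertools.islice(ml_client.jobs.list(), 50))
jobs_df = pd.DataFrame(
    {
        "name": [j.name for j in jobs],
        "experiment": [j.experiment_name for j in jobs],
        "status": [j.status for j in jobs],
        "type": [j.type for j in jobs],
    }
)
print(jobs_df.groupby("status").size())
```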

### 8. Model and Endpoint Management
- Advanced deployment strategies and patterns
- Endpoint health monitoring and performance analysis

### 9. Workspace Health and Performance Metrics
- Comprehensive health scoring system
- Performance optimization recommendations

## Key Features

### Advanced Analytics
- **Statistical Analysis**: Comprehensive data profiling and quality assessment
- **Visualization Suite**: Professional-grade charts and insights
- **Performance Metrics**: Detailed workspace and resource utilization analysis

### Enterprise Features
- **Health Monitoring**: Automated workspace health scoring
- **Optimization Recommendations**: AI-driven suggestions for improvement
- **Cost Analysis**: Resource utilization and cost optimization insights

### Production Readiness
- **Error Handling**: Robust exception management and fallback strategies
- **Scalability**: Patterns for large-scale ML operations
- **Best Practices**: Enterprise-grade coding standards and patterns

## Sample Data

The notebook works with multiple datasets for comprehensive demonstration:

- **Titanic Dataset**: Classification problem for passenger survival prediction
- **Diabetes Dataset**: Regression problem for medical outcome prediction
- **Custom Data**: Extensible framework for your own datasets
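
As an example of how one of these datasets might be registered, the sketch below loads the scikit-learn diabetes table and registers it as a data asset; the asset name, description, and tags are illustrative, and an authenticated `ml_client` is assumed.

```python
# Register the scikit-learn diabetes table as a URI file data asset (illustrative).
from sklearn.datasets import load_diabetes
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

frame = load_diabetes(as_frame=True).frame
frame.to_csv("diabetes_sample.csv", index=False)

data_asset = Data(
    path="diabetes_sample.csv",
    type=AssetTypes.URI_FILE,
    name="diabetes-sample",  # illustrative name
    description="Diabetes regression sample for the sdkmcpsample notebook",
    tags={"source": "scikit-learn", "task": "regression"},
)
registered = ml_client.data.create_or_update(data_asset)
print(f"Registered {registered.name}, version {registered.version}")
```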

## Authentication Options

The notebook supports multiple authentication methods with intelligent fallback:

- **DefaultAzureCredential**: Automatic authentication in Azure environments
- **InteractiveBrowserCredential**: Browser-based interactive authentication
- **Service Principal**: Programmatic authentication for automated workflows
- **Managed Identity**: Azure-managed identity for secure, passwordless authentication
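
A minimal fallback chain covering the first two credential types above could look like this; the notebook's own logic may be more elaborate.

```python
# Try ambient credentials first, then fall back to an interactive browser login.
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

try:
    credential = DefaultAzureCredential()
    # Force a token request so failures surface here rather than on the first SDK call.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    credential = InteractiveBrowserCredential()
```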

## Advanced Use Cases

This sample is ideal for:

- **MLOps Engineers**: Implementing enterprise-grade ML operations
- **Data Scientists**: Advanced model development and experimentation
- **Platform Teams**: Setting up and managing ML infrastructure
- **Architects**: Designing scalable ML solutions
- **DevOps Teams**: Integrating ML into CI/CD pipelines

## Performance Optimization

### Resource Management
- Compute auto-scaling strategies
- Data caching and optimization
- Model serving optimization

### Cost Optimization
- Resource utilization monitoring
- Automated cost alerts and budgeting
- Right-sizing recommendations

### Security Best Practices
- RBAC implementation patterns
- Network security configurations
- Data encryption and compliance

## Related Examples

After completing this advanced sample, explore these related notebooks:

- **[Basic SDK Sample](../sdksample/sdksample.ipynb)**: Foundational SDK concepts
- **[MLOps Pipelines](../../../tutorials/mlops/)**: Production pipeline development
- **[AutoML Integration](../../../tutorials/automl/)**: Automated machine learning
- **[Distributed Training](../../../tutorials/distributed-training/)**: Large-scale model training
- **[Model Deployment](../../endpoints/)**: Advanced deployment strategies

## Troubleshooting

### Common Issues

1. **Authentication Failures**:
   ```bash
   # Check Azure CLI login
   az login
   az account show

   # Verify workspace access
   az ml workspace show --name <workspace-name> --resource-group <resource-group>
   ```

2. **Package Dependencies**:
   ```bash
   # Update packages
   pip install --upgrade azure-ai-ml azure-identity

   # Check versions
   pip list | grep azure
   ```

3. **Memory Issues with Large Datasets** (see the chunked-processing sketch after this list):
   - Use data streaming techniques
   - Implement chunked processing
   - Consider distributed computing options

4. **Compute Resource Issues**:
   - Verify compute quotas and limits
   - Check resource availability in your region
   - Review RBAC permissions for compute management
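
The chunked-processing approach referenced in item 3 above could look like the following; the file path and chunk size are placeholders.

```python
# Compute per-column means of a large CSV without loading it all into memory.
import pandas as pd

row_count = 0
column_sums = None
for chunk in pd.read_csv("large_dataset.csv", chunksize=100_000):
    row_count += len(chunk)
    sums = chunk.select_dtypes("number").sum()
    column_sums = sums if column_sums is None else column_sums.add(sums, fill_value=0)

print(f"rows processed: {row_count}")
print(column_sums / row_count)
```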

### Performance Tuning

1. **Data Loading Optimization**:
- Use Azure ML data streams for large datasets
- Implement data caching strategies
- Consider data preprocessing pipelines

2. **Compute Optimization**:
- Right-size compute instances
- Use spot instances for cost savings
- Implement auto-scaling policies

3. **Network Optimization**:
- Use VNet integration for security
- Optimize data transfer patterns
- Consider regional data placement

### Getting Help

- [Azure ML Documentation](https://docs.microsoft.com/azure/machine-learning/)
- [SDK v2 Migration Guide](https://docs.microsoft.com/azure/machine-learning/how-to-migrate-from-v1)
- [Azure ML Community](https://techcommunity.microsoft.com/t5/azure-ai/ct-p/Azure-AI)
- [GitHub Issues](https://github.com/Azure/azureml-examples/issues)
- [Stack Overflow](https://stackoverflow.com/questions/tagged/azure-machine-learning)

## Contributing

We welcome contributions to improve this advanced sample! Please see the [CONTRIBUTING.md](../../../../CONTRIBUTING.md) file for guidelines on:

- Code standards and best practices
- Testing requirements
- Documentation standards
- Pull request process

## Security

For security-related concerns, please review our [SECURITY.md](../../../../SECURITY.md) file and follow responsible disclosure practices.

## License

This project is licensed under the MIT License - see the [LICENSE](../../../../LICENSE) file for details.

## Changelog

### Version 1.0
- Initial release with comprehensive Azure ML SDK v2 coverage
- Advanced analytics and visualization capabilities
- Enterprise-grade health monitoring and optimization
- Production-ready error handling and best practices