diff --git a/mlops-template-terraform/LICENSE.txt b/mlops-template-terraform/LICENSE.txt new file mode 100644 index 00000000..a44c3b63 --- /dev/null +++ b/mlops-template-terraform/LICENSE.txt @@ -0,0 +1,16 @@ +Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. + +SPDX-License-Identifier: MIT-0 + +Permission is hereby granted, free of charge, to any person obtaining a copy of this +software and associated documentation files (the "Software"), to deal in the Software +without restriction, including without limitation the rights to use, copy, modify, +merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/mlops-template-terraform/README.md b/mlops-template-terraform/README.md new file mode 100644 index 00000000..827b4133 --- /dev/null +++ b/mlops-template-terraform/README.md @@ -0,0 +1,62 @@ +## MLOps Terraform Template for SageMaker Projects + +An important aspect of Machine Learning (ML) projects is the transition from the manual experimentation with +Jupyter notebooks and similar to an architecture, where workflows for building, training, deploying and maintaining ML models +in production are automated and orchestrated. In order to achieve this, an operating model between different personas such as Data Scientists, +Data Engineers, ML Engineers, DevOps Engineers, IT and business stakeholders needs to be established. 
Further, the data and +model lifecycle and the underlying workflows need to be defined, as well as the responsibilities of the different personas +in these areas. This collection of practices is called Machine Learning Operations (MLOps). + +This repository contains a set of baseline infrastructure for an MLOps environment on AWS, currently scoped to a single AWS account. +The infrastructure is defined with Terraform and is built around the Amazon SageMaker service. + +The three main components in the repository are: + +### Component 1: ./mlops_infra: + +This Terraform project is used to bootstrap an account for ML and includes the following modules: +- modules/networking: Deploy VPC, subnets & VPC endpoints for SageMaker +- modules/code_commit: Deploy CodeCommit repository & associate it as a SageMaker repository +- modules/kms: Deploy KMS key for encryption of SageMaker resources +- modules/iam: Deploy SageMaker roles & policies +- modules/sagemaker_studio: Deploy SageMaker Studio with users and the default JupyterServer app, as well as enabling SageMaker projects + +### Component 2: ./mlops_templates: + +This Terraform project is used to bootstrap AWS Service Catalog with a portfolio and an example Terraform-based SageMaker project. +It allows deploying many different organizational SageMaker project templates. + +- modules/sagemaker_project_template: Create Service Catalog Portfolio & products + +### Component 3: ./mlops_templates/templates/mlops_terraform_template/seed_code + +These folders contain the "seed code", which is the code that will be initialized when a new SageMaker project is created in SageMaker Studio. +The seed code is associated with the corresponding template in the mlops_templates code. The seed code should be 100% generic +and should provide the baseline for new ML projects to build on.
+ +- seed_code/build_app: Example Terraform-based model build application using SageMaker Pipelines, CodeCommit & CodeBuild +- seed_code/deploy_app: Example Terraform-based model deployment application that deploys trained models to SageMaker endpoints + +## Prerequisites + +- Terraform +- Git +- AWS CLI v2 + +## Architecture overview and workflows + +![Architecture Diagram](./mlops_templates/diagrams/mlops-terraform-template-overview.png) + + +## How to use + +### Step 1: Deploy mlops_infra into a fresh account + +Navigate to the 'mlops_infra' directory with `cd mlops_infra` and follow the instructions: +[mlops_infra](mlops_infra/README.md) + + +### Step 2: Deploy mlops_templates into the same account + +Navigate to the 'mlops_templates' directory with `cd mlops_templates` and follow the instructions: +[mlops_templates](mlops_templates/README.md) diff --git a/mlops-template-terraform/mlops_infra/Makefile b/mlops-template-terraform/mlops_infra/Makefile new file mode 100644 index 00000000..e9940532 --- /dev/null +++ b/mlops-template-terraform/mlops_infra/Makefile @@ -0,0 +1,20 @@ +.DEFAULT_GOAL = help +.PHONY: help bootstrap init plan apply destroy + +help: + @grep -E '^[a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' + +bootstrap: + @./scripts/terraform-account-setup.sh + +init: + @cd terraform && terraform init + +plan: + @cd terraform && terraform plan + +apply: + @cd terraform && terraform apply + +destroy: + @cd terraform && terraform destroy
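The `help` target in the Makefile above uses a common self-documenting pattern: grep extracts every target annotated with a `## description` comment and awk prints a formatted list. Note that the targets in this Makefile carry no `##` comments yet, so `make help` prints an empty list until descriptions are added. A minimal standalone sketch of the mechanism (the example Makefile and its descriptions are made up):

```shell
# Sketch of the self-documenting Makefile help pattern: targets annotated
# with "## <description>" are extracted and printed. Makefile.example and
# the target descriptions below are hypothetical.
cat > Makefile.example <<'EOF'
bootstrap: ## Create the Terraform state bucket
init: ## Run terraform init
EOF

# grep picks up annotated targets, awk splits on the ": ## " separator.
grep -E '^[a-zA-Z_-]+: ## ' Makefile.example | sort | \
  awk 'BEGIN {FS = ": ## "} {printf "%-12s %s\n", $1, $2}'
```

To make `make help` useful in the Makefile above, append `## <description>` after each target name, e.g. `bootstrap: ## Create the Terraform state bucket`.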
It includes the modules to deploy: + +- `networking`: Sets up a basic VPC with subnets and the VPC endpoints required for running SageMaker Studio in private subnets +- `iam`: Sets up basic IAM roles and IAM policies +- `kms`: Creates a KMS key and key policies +- `sagemaker_studio`: Deploys and configures Amazon SageMaker Studio, including automatically enabling Amazon SageMaker projects + +## Architecture overview + +![Architecture Diagram](./diagrams/SageMaker_dev_env.png) + +## Getting started + + +How to apply the resources: + +Expose AWS credentials via environment variables (https://registry.terraform.io/providers/hashicorp/aws/latest/docs#environment-variables) + + +Create the state bucket via CloudFormation, generate terraform/provider.tf, and initialize Terraform: +```bash +make bootstrap +make init +``` + +Modify the locals in `terraform/main.tf`. In particular, change the `prefix` to a unique name for your project/use case. This allows deploying multiple versions of the infrastructure side by side, one per prefix. + +## Deploying your infrastructure + +```bash +make plan +make apply +``` + +## Adding new users to the Amazon SageMaker Studio domain + +To add additional users to the Amazon SageMaker Studio domain: + +1. Open `terraform/main.tf` +2. Modify the user list +3. `make plan && make apply` + +## How to destroy the resources + +```bash +make destroy +``` + +> Note: SageMaker Studio related resources (e.g. user profiles) currently have to be deleted manually, as the destroy command fails while they are still in use.
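The note above mentions that SageMaker Studio users currently have to be removed manually before a destroy can succeed. A hedged sketch of that cleanup with the AWS CLI (the helper name, domain id, and user name are placeholders; it assumes all of the user's apps are of type JupyterServer):

```shell
#!/usr/bin/env bash
# Hypothetical helper for the manual cleanup described above: a Studio user
# profile can only be deleted once all of the user's running apps are gone.
delete_studio_user() {
  local domain_id="$1" user="$2"
  # Delete every running app of the user. This sketch assumes JupyterServer
  # apps; KernelGateway apps would need --app-type KernelGateway instead.
  for app in $(aws sagemaker list-apps \
      --domain-id-equals "$domain_id" \
      --user-profile-name-equals "$user" \
      --query 'Apps[].AppName' --output text); do
    aws sagemaker delete-app --domain-id "$domain_id" \
      --user-profile-name "$user" --app-type JupyterServer --app-name "$app"
  done
  aws sagemaker delete-user-profile --domain-id "$domain_id" --user-profile-name "$user"
}

# Example invocation (placeholders): delete_studio_user "d-xxxxxxxxxxxx" "user1"
```

After the user profiles are gone, `make destroy` should be able to remove the remaining resources.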
\ No newline at end of file diff --git a/mlops-template-terraform/mlops_infra/diagrams/SageMaker_dev_env.png b/mlops-template-terraform/mlops_infra/diagrams/SageMaker_dev_env.png new file mode 100644 index 00000000..7f22a365 Binary files /dev/null and b/mlops-template-terraform/mlops_infra/diagrams/SageMaker_dev_env.png differ diff --git a/mlops-template-terraform/mlops_infra/scripts/bootstrap/bootstrap_cfn.yaml b/mlops-template-terraform/mlops_infra/scripts/bootstrap/bootstrap_cfn.yaml new file mode 100644 index 00000000..3995a397 --- /dev/null +++ b/mlops-template-terraform/mlops_infra/scripts/bootstrap/bootstrap_cfn.yaml @@ -0,0 +1,13 @@ +Resources: + S3Bucket: + Type: 'AWS::S3::Bucket' + DeletionPolicy: Retain + Properties: + BucketName: !Sub 'mlops-${AWS::AccountId}-${AWS::Region}-tf-state' +Outputs: + BucketName: + Value: !Ref S3Bucket + Description: Name of the Terraform state bucket + BucketRegion: + Value: !Sub "${AWS::Region}" + Description: Region of the state bucket diff --git a/mlops-template-terraform/mlops_infra/scripts/bootstrap/provider.template b/mlops-template-terraform/mlops_infra/scripts/bootstrap/provider.template new file mode 100644 index 00000000..687a0b11 --- /dev/null +++ b/mlops-template-terraform/mlops_infra/scripts/bootstrap/provider.template @@ -0,0 +1,12 @@ +provider "aws" { + region = "$BUCKET_REGION" +} + +terraform { + required_version = ">= 1.0.0" + backend "s3" { + bucket = "$BUCKET_NAME" + key = "mlops.tfstate" + region = "$BUCKET_REGION" + } +} diff --git a/mlops-template-terraform/mlops_infra/scripts/terraform-account-setup.sh b/mlops-template-terraform/mlops_infra/scripts/terraform-account-setup.sh new file mode 100755 index 00000000..efb6d29e --- /dev/null +++ b/mlops-template-terraform/mlops_infra/scripts/terraform-account-setup.sh @@ -0,0 +1,33 @@ +#!/bin/bash +STACK_NAME="mlops-tf-bootstrap" +AWS_ACCOUNT=$(aws sts get-caller-identity --query "Account" --output text) +AWS_REGION=${AWS_REGION:-$(aws
configure get region)} + +bootstrap() { + echo "---------------------BOOTSTRAPPING---------------------" + aws cloudformation deploy --template ./scripts/bootstrap/bootstrap_cfn.yaml --stack-name $STACK_NAME --region $AWS_REGION + export BUCKET_NAME=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --query "Stacks[0].Outputs[?OutputKey=='BucketName'].OutputValue" --output text) + export BUCKET_REGION=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --query "Stacks[0].Outputs[?OutputKey=='BucketRegion'].OutputValue" --output text) + + # Update provider.tf to use bucket and region + envsubst < "./scripts/bootstrap/provider.template" > "./terraform/provider.tf" + + # Re-initialize terraform + terraform init -reconfigure || true + echo "State Bucket: $BUCKET_NAME" + echo "State Region: $BUCKET_REGION" + echo "-----------------------COMPLETE------------------------" + echo "Make sure to set your AWS_REGION environment variable to $AWS_REGION for Terraform to select the correct region" +} + +read -r -p "Bootstrap $AWS_ACCOUNT in $AWS_REGION? 
[Y/n] " CONFIRMATION +case "$CONFIRMATION" in + [yY][eE][sS]|[yY]) + # Deploy bucket for state files + bootstrap + ;; + *) + echo "Skipping bootstrap" + ;; +esac + diff --git a/mlops-template-terraform/mlops_infra/terraform/main.tf b/mlops-template-terraform/mlops_infra/terraform/main.tf new file mode 100644 index 00000000..6adbe2bf --- /dev/null +++ b/mlops-template-terraform/mlops_infra/terraform/main.tf @@ -0,0 +1,41 @@ +locals { + prefix = "mlops" + user_profile_names = ["user1", "user2"] + domain_name = "${local.prefix}-studio-domain" + kms_key_alias = "${local.prefix}-kms-key" +} + +data "aws_caller_identity" "current" {} + +data "aws_region" "current" {} + +module "networking" { + source = "./modules/networking" + prefix = local.prefix + vpc_cidr_block = "10.0.0.0/16" + private_subnet_cidr_block = "10.0.1.0/24" + public_subnet_cidr_block = "10.0.0.0/24" + availability_zone = "${data.aws_region.current.name}a" # TODO: make the availability zone configurable instead of hardcoding the first AZ +} + +module "kms" { + source = "./modules/kms" + kms_key_alias = local.kms_key_alias +} + +module "iam" { + source = "./modules/iam" + prefix = local.prefix + kms_key_arn = module.kms.kms_key_arn +} + +module "sm_studio" { + source = "./modules/sagemaker_studio" + domain_name = local.domain_name + vpc_id = module.networking.vpc_id + subnet_ids = [module.networking.private_subnet_ids] + sm_execution_role_arn = module.iam.sm_execution_role_arn + kms_key_arn = module.kms.kms_key_arn + user_profile_names = local.user_profile_names +} + diff --git a/mlops-template-terraform/mlops_infra/terraform/modules/iam/main.tf b/mlops-template-terraform/mlops_infra/terraform/modules/iam/main.tf new file mode 100644 index 00000000..c0f0353d --- /dev/null +++ b/mlops-template-terraform/mlops_infra/terraform/modules/iam/main.tf @@ -0,0 +1,267 @@ +# TODO: Consider scoping down IAM roles depending on your needs + +locals { + prefix = var.prefix + sm_execution_role_name = "${local.prefix}-sagemaker-execution-role" + kms_key_arn = var.kms_key_arn +} + 
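The `make bootstrap` flow shown earlier generates `terraform/provider.tf` by running `envsubst` over `scripts/bootstrap/provider.template`. A standalone sketch of just that substitution step (the bucket name and region below are made-up values; it assumes the gettext `envsubst` utility, with a sed fallback for environments without it):

```shell
# Standalone sketch of the provider.tf generation done by the bootstrap
# script: envsubst replaces $BUCKET_NAME / $BUCKET_REGION in the template
# with the values exported from the CloudFormation stack outputs.
# The bucket name and region are made up for illustration.
workdir=$(mktemp -d)
export BUCKET_NAME="mlops-123456789012-eu-west-1-tf-state"
export BUCKET_REGION="eu-west-1"

# Portability fallback in case gettext's envsubst is not installed.
command -v envsubst >/dev/null 2>&1 || envsubst() {
  sed -e "s|\$BUCKET_NAME|$BUCKET_NAME|g" -e "s|\$BUCKET_REGION|$BUCKET_REGION|g"
}

# Quoted 'EOF' keeps the $ placeholders literal in the template file.
cat > "$workdir/provider.template" <<'EOF'
provider "aws" {
  region = "$BUCKET_REGION"
}

terraform {
  backend "s3" {
    bucket = "$BUCKET_NAME"
    key    = "mlops.tfstate"
    region = "$BUCKET_REGION"
  }
}
EOF

envsubst < "$workdir/provider.template" > "$workdir/provider.tf"
cat "$workdir/provider.tf"
```

After this step, `terraform init -reconfigure` picks up the freshly written S3 backend configuration.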
+data "aws_caller_identity" "current" {} + +data "aws_region" "current" {} +################################################################################################## +# Roles & Policies +################################################################################################## + +resource "aws_iam_role" "sagemaker_execution_role" { + name = local.sm_execution_role_name + + assume_role_policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Action = "sts:AssumeRole" + Effect = "Allow" + Sid = "AllowRoleAssume" + Principal = { + Service = "sagemaker.amazonaws.com" + } + }, + ] + }) +} + +resource "aws_iam_role_policy_attachment" "aws_sagemaker_full_access" { + role = aws_iam_role.sagemaker_execution_role.name + policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerFullAccess" +} + +resource "aws_iam_role_policy_attachment" "aws_sagemaker_cloudformation_poweruser" { + role = aws_iam_role.sagemaker_execution_role.name + policy_arn = "arn:aws:iam::aws:policy/AWSCloudFormationFullAccess" +} + + +resource "aws_iam_policy" "codecommit_policy" { + name = "${local.prefix}-codecommit-policy" + description = "${local.prefix} policy for SM Studio codecommit access" + + policy = < Control Panel -> Your User --> Launch App --> Studio +3. Wait for Amazon SageMaker Studio to launch +4. In the left bar -> SageMaker Resources -> In the drop down tab select Projects +5. Click "Create project" +6. Select the "Organization templates" tab +7. Select the "mlops_terraform_template & click on "Select project template" + - Note: Your template name might be different if you used a different prefix. +8. Enter a unique project name and the required fields +9. Click "Create" +10. Wait for the SageMaker Project instance to be created from the template +11. Open project by double clicking -> Repositories Tab +12. Click Clone Repository (for both) + +After a new project has been created. 
It needs to be bootstrapped: + +- Top Menu: Git -> Open a Terminal in SageMaker Studio and navigate to the modelbuild Git repository +- `$ make bootstrap` # press Enter to accept the defaults +- In case you get a permission denied error, consider using the command `chmod u+x ./infra_scripts/bootstrap.sh` +- Ensure entries are correct and complete after the bootstrapping process. This will generate the terraform/provider.tf & terraform/terraform.tfvars files and install Terraform +- `$ make init` +- `$ make plan` +- `$ make apply` # Deploys the CI/CD pipeline for building models and running SageMaker pipelines on new commits +- `$ git add -A` +- `$ git commit -m "bootstrapped"` +- `$ git push` # Will trigger CodePipeline & deploy/train the SageMaker pipeline +- Approve the resulting model in SageMaker Studio under Model Groups of the SageMaker Project + + +## 3. Create Inference Infrastructure based on your trained model +- Navigate to the modeldeploy repository in the terminal window of SageMaker Studio to deploy your inference infrastructure: +- `$ make bootstrap` +- Ensure entries are correct and complete after the bootstrapping process.
This will generate the terraform/provider.tf & terraform/terraform.tfvars files and install Terraform +- `$ make init` +- `$ make plan` +- `$ make apply` \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/diagrams/mlops-terraform-template-overview.png b/mlops-template-terraform/mlops_templates/diagrams/mlops-terraform-template-overview.png new file mode 100644 index 00000000..e4af013c Binary files /dev/null and b/mlops-template-terraform/mlops_templates/diagrams/mlops-terraform-template-overview.png differ diff --git a/mlops-template-terraform/mlops_templates/diagrams/sm_project_template.png b/mlops-template-terraform/mlops_templates/diagrams/sm_project_template.png new file mode 100644 index 00000000..1fe710db Binary files /dev/null and b/mlops-template-terraform/mlops_templates/diagrams/sm_project_template.png differ diff --git a/mlops-template-terraform/mlops_templates/scripts/bootstrap/bootstrap_cfn.yaml b/mlops-template-terraform/mlops_templates/scripts/bootstrap/bootstrap_cfn.yaml new file mode 100644 index 00000000..3995a397 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/scripts/bootstrap/bootstrap_cfn.yaml @@ -0,0 +1,13 @@ +Resources: + S3Bucket: + Type: 'AWS::S3::Bucket' + DeletionPolicy: Retain + Properties: + BucketName: !Sub 'mlops-${AWS::AccountId}-${AWS::Region}-tf-state' +Outputs: + BucketName: + Value: !Ref S3Bucket + Description: Name of the Terraform state bucket
+ BucketRegion: + Value: !Sub "${AWS::Region}" + Description: Region of the state bucket diff --git a/mlops-template-terraform/mlops_templates/scripts/bootstrap/provider.template b/mlops-template-terraform/mlops_templates/scripts/bootstrap/provider.template new file mode 100644 index 00000000..d44d64b2 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/scripts/bootstrap/provider.template @@ -0,0 +1,12 @@ +provider "aws" { + region = "$BUCKET_REGION" +} + +terraform { + required_version = ">= 1.0.0" + backend "s3" { + bucket = "$BUCKET_NAME" + key = "mlops-templates.tfstate" + region = "$BUCKET_REGION" + } +} diff --git a/mlops-template-terraform/mlops_templates/scripts/terraform-account-setup.sh b/mlops-template-terraform/mlops_templates/scripts/terraform-account-setup.sh new file mode 100755 index 00000000..f2ee6909 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/scripts/terraform-account-setup.sh @@ -0,0 +1,35 @@ +#!/bin/bash +STACK_NAME="mlops-tf-bootstrap" +AWS_ACCOUNT=$(aws sts get-caller-identity --query "Account" --output text) +AWS_REGION=${AWS_REGION:-$(aws configure get region)} + +# Deploy bucket for state files +bootstrap() { + echo "---------------------BOOTSTRAPPING---------------------" + aws cloudformation deploy --template ./scripts/bootstrap/bootstrap_cfn.yaml --stack-name $STACK_NAME --region $AWS_REGION + export BUCKET_NAME=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --query "Stacks[0].Outputs[?OutputKey=='BucketName'].OutputValue" --output text) + export BUCKET_REGION=$(aws cloudformation describe-stacks --stack-name $STACK_NAME --query "Stacks[0].Outputs[?OutputKey=='BucketRegion'].OutputValue" --output text) + + # Update provider.tf to use bucket and region + envsubst < "./scripts/bootstrap/provider.template" > "./terraform/provider.tf" + + # Re-initialize terraform + terraform init -reconfigure || true + echo "State Bucket: $BUCKET_NAME" + echo "State Region: $BUCKET_REGION" + echo
"-----------------------COMPLETE------------------------" + echo "Make sure to set your AWS_REGION environment variable to $AWS_REGION for Terraform to select the correct region" +} + + +read -r -p "Bootstrap $AWS_ACCOUNT in $AWS_REGION? [Y/n] " CONFIRMATION +case "$CONFIRMATION" in + [yY][eE][sS]|[yY]) + # Deploy bucket for state files + bootstrap + ;; + *) + echo "Skipping bootstrap" + ;; +esac + \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/main.tf b/mlops-template-terraform/mlops_templates/terraform/main.tf new file mode 100644 index 00000000..518502d2 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/main.tf @@ -0,0 +1,49 @@ +locals { + prefix = "mlops" + templates = { + mlops_terraform_template : { + name : "${local.prefix}_terraform_template" + owner : "central-IT" + template_key = "${local.prefix}/resources/service_catalog/templates/terraform_template/service_catalog_product_template.yaml" + template_local_path = "./templates/mlops_terraform_template/service_catalog_product_template.yaml.tpl" + seed_code_map : { + "build" : { + local_path : "./templates/mlops_terraform_template/seed_code/build_app" + key : "${local.prefix}/resources/service_catalog/seed_code/terraform_template/build.zip" + }, + "deploy" : { + local_path : "./templates/mlops_terraform_template/seed_code/deploy_app", + key : "${local.prefix}/resources/service_catalog/seed_code/terraform_template/deploy.zip" + } + } + } + } +} + + +data "aws_caller_identity" "current" {} +data "aws_region" "current" {} + +module "sm_project_template" { + source = "./modules/sagemaker_project_template" + template_name = local.templates.mlops_terraform_template.name + owner = local.templates.mlops_terraform_template.owner + sagemaker_execution_role_arn = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/${local.prefix}-sagemaker-execution-role" + seed_code_map = local.templates.mlops_terraform_template.seed_code_map + 
template_local_path = local.templates.mlops_terraform_template.template_local_path + template_key = local.templates.mlops_terraform_template.template_key + + # These values will be injected into the template.yaml file. + # The $${} escape allows referencing CloudFormation Parameters + template_vars = { + prefix = local.prefix + artifact_bucket_name = "${local.prefix}-project-$${SageMakerProjectName}" + state_bucket_name = "${local.prefix}-project-$${SageMakerProjectName}-tf-state" + build_code_repo_name = "${local.prefix}-project-$${SageMakerProjectName}-modelbuild" + deploy_code_repo_name = "${local.prefix}-project-$${SageMakerProjectName}-modeldeploy" + seed_code_build_key = local.templates.mlops_terraform_template.seed_code_map["build"].key, + seed_code_deploy_key = local.templates.mlops_terraform_template.seed_code_map["deploy"].key, + default_branch = "main" + } +} + diff --git a/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/main.tf b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/main.tf new file mode 100644 index 00000000..bc72c337 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/main.tf @@ -0,0 +1,130 @@ +locals { + template_name = var.template_name + template_key = var.template_key + template_local_path = var.template_local_path + seed_code_map = var.seed_code_map + template_vars = var.template_vars + sm_execution_role_arn = var.sagemaker_execution_role_arn + owner = var.owner + sagemaker_execution_role_arn = var.sagemaker_execution_role_arn +} + +data "aws_caller_identity" "current" {} + +data "aws_region" "current" {} + +# Note: the ACL is set directly on the bucket resource, since the separate +# aws_s3_bucket_acl resource only exists in AWS provider v4 and later, while +# this module pins the provider to ~> 3.0. +resource "aws_s3_bucket" "sc_bucket" { + bucket = "${replace(local.template_name, "_", "-")}-${data.aws_caller_identity.current.account_id}-${data.aws_region.current.name}-service-catalog" + acl = "private" +} + + 
+data "archive_file" "seed_source" { + for_each = local.seed_code_map + type = "zip" + source_dir = each.value.local_path + output_path = "${each.value.local_path}.zip" +} + +resource "aws_s3_bucket_object" "seed_code_archive" { + depends_on = [ + aws_s3_bucket.sc_bucket, + data.archive_file.seed_source + ] + for_each = local.seed_code_map + bucket = aws_s3_bucket.sc_bucket.id + key = each.value.key + source = "${each.value.local_path}.zip" + etag = filemd5("${each.value.local_path}.zip") +} +resource "local_file" "template_out" { + content = templatefile("${local.template_local_path}", merge({ seed_code_bucket = aws_s3_bucket.sc_bucket.id}, local.template_vars)) + filename = "${path.module}/.terraform.out/generated_template.yaml" +} + + +resource "aws_s3_bucket_object" "service_catalog_mlops_template" { + depends_on = [ + aws_s3_bucket.sc_bucket + ] + bucket = aws_s3_bucket.sc_bucket.id + key = local.template_key + content = templatefile("${local.template_local_path}", merge({ seed_code_bucket = aws_s3_bucket.sc_bucket.id}, local.template_vars)) +} + +resource "aws_servicecatalog_product" "mlops_product" { + depends_on = [ + aws_s3_bucket_object.service_catalog_mlops_template + ] + + name = local.template_name + owner = local.owner + type = "CLOUD_FORMATION_TEMPLATE" + + provisioning_artifact_parameters { + type = "CLOUD_FORMATION_TEMPLATE" + template_url = "https://${aws_s3_bucket.sc_bucket.id}.s3.amazonaws.com/${local.template_key}" + } + + tags = { + "sagemaker:studio-visibility" = "true" + } +} + + +resource "aws_iam_role" "service_launch_role" { + name = "service_launch_role" + path = "/service-role/" + assume_role_policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Action = "sts:AssumeRole" + Effect = "Allow" + Sid = "AllowRoleAssume" + Principal = { + Service = "servicecatalog.amazonaws.com" + } + }, + ] + }) +} + +resource "aws_iam_role_policy_attachment" "aws_sagemaker_full_access" { + role = aws_iam_role.service_launch_role.name + 
policy_arn = "arn:aws:iam::aws:policy/AmazonSageMakerAdmin-ServiceCatalogProductsServiceRolePolicy" +} + +resource "aws_servicecatalog_portfolio" "projects_templates" { + name = "${local.template_name}-portfolio" + description = "${local.template_name}: Terraform SageMaker Project Templates" + provider_name = local.owner + tags = { + "sagemaker:studio-visibility" = "true" + } +} + +resource "aws_servicecatalog_product_portfolio_association" "template_association" { + portfolio_id = aws_servicecatalog_portfolio.projects_templates.id + product_id = aws_servicecatalog_product.mlops_product.id +} + +resource "aws_servicecatalog_principal_portfolio_association" "sm_execution_role_principal_association" { + portfolio_id = aws_servicecatalog_portfolio.projects_templates.id + principal_arn = local.sagemaker_execution_role_arn +} + +# resource "aws_servicecatalog_constraint" "sm_execution_role_launch_constraint" { +# description = "${local.template_name} - constraint launch to Sagemaker execution role" +# portfolio_id = aws_servicecatalog_portfolio.projects_templates.id +# product_id = aws_servicecatalog_product.mlops_product.id +# type = "LAUNCH" + +# parameters = jsonencode({ +# "RoleArn" : local.sagemaker_execution_role_arn +# }) +# } + diff --git a/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/terraform.tf b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/terraform.tf new file mode 100644 index 00000000..6e498ec5 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/terraform.tf @@ -0,0 +1,13 @@ +terraform { + required_version = ">= 1.0 , <= 1.2.6" + + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 3.0" + } + local = { + version = "~> 2.1" + } + } +} diff --git a/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/variables.tf 
b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/variables.tf new file mode 100644 index 00000000..93904031 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/modules/sagemaker_project_template/variables.tf @@ -0,0 +1,33 @@ +variable "template_name" { + type = string + description = "Name of template, used to lookup seed_code and CloudFormation templates for service catalog" +} + +variable "owner" { + type = string + description = "Owner of the service catalog product" +} + +variable "sagemaker_execution_role_arn" { + type = string + description = "Sagemaker Execution Role ARN" +} + +variable "template_key" { + type = string + description = "S3 Key for storing service catalog template" +} + +variable "template_local_path" { + type = string + description = "Path to local service catalog product template file" +} + +variable "seed_code_map" { + type = map(any) + description = "Map of seed code for template" +} +variable "template_vars" { + type = map(any) + description = "Values to substitute into service catalog product template" +} diff --git a/mlops-template-terraform/mlops_templates/terraform/provider.tf b/mlops-template-terraform/mlops_templates/terraform/provider.tf new file mode 100644 index 00000000..9c40338b --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/provider.tf @@ -0,0 +1,14 @@ +provider "aws" { + region = "" +} + +terraform { + required_version = ">= 1.0.0" + backend "s3" { + bucket = "" + key = "mlops-templates.tfstate" + region = "" + } +} + + diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/.gitkeep b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/.gitignore 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/.gitignore new file mode 100644 index 00000000..ed839918 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/.gitignore @@ -0,0 +1,141 @@ +*.tfstate +*.tfstate.* +*.tfvars.json +**/.terraform/* +override.tf +override.tf.json +*_override.tf +*_override.tf.json +.terraformrc +terraform.rc +.terraform.lock.hcl +.cache/ +*.zip +*.bundle.* +lib/ +node_modules/ +*.egg-info/ +.ipynb_checkpoints +*.tsbuildinfo + +# Created by https://www.gitignore.io/api/python +# Edit at https://www.gitignore.io/?templates=python + +### Python ### +# Byte-compiled / optimized / DLL files +__pycache__/ +*.py[cod] +*$py.class + +# C extensions +*.so + +# Distribution / packaging +.Python +build/ +develop-eggs/ +dist/ +downloads/ +eggs/ +.eggs/ +lib/ +lib64/ +parts/ +sdist/ +var/ +wheels/ +pip-wheel-metadata/ +share/python-wheels/ +.installed.cfg +*.egg +MANIFEST + +# PyInstaller +# Usually these files are written by a python script from a template +# before PyInstaller builds the exe, so as to inject date/other infos into it. 
+*.manifest +*.spec + +# Installer logs +pip-log.txt +pip-delete-this-directory.txt + +# Unit test / coverage reports +htmlcov/ +.tox/ +.nox/ +.coverage +.coverage.* +.cache +nosetests.xml +coverage.xml +*.cover +.hypothesis/ +.pytest_cache/ + +# Translations +*.mo +*.pot + +# Scrapy stuff: +.scrapy + +# Sphinx documentation +docs/_build/ + +# PyBuilder +target/ + +# pyenv +.python-version + +# celery beat schedule file +celerybeat-schedule + +# SageMath parsed files +*.sage.py + +# Spyder project settings +.spyderproject +.spyproject + +# Rope project settings +.ropeproject + +# Mr Developer +.mr.developer.cfg +.project +.pydevproject + +# mkdocs documentation +/site + +# mypy +.mypy_cache/ +.dmypy.json +dmypy.json + +# Pyre type checker +.pyre/ + +# OS X stuff +*.DS_Store + +# End of https://www.gitignore.io/api/python + +_temp_extension +junit.xml +[uU]ntitled* +notebook/static/* +!notebook/static/favicons +notebook/labextension +notebook/schemas +docs/source/changelog.md +docs/source/contributing.md + +# playwright +ui-tests/test-results +ui-tests/playwright-report + +# VSCode +.vscode diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/Makefile b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/Makefile new file mode 100644 index 00000000..60b830f0 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/Makefile @@ -0,0 +1,12 @@ + +bootstrap: + @sh ./infra_scripts/bootstrap.sh + +init: + @cd terraform && terraform init + +plan: + @cd terraform && terraform plan + +apply: + @cd terraform && terraform apply \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/README.md 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/README.md new file mode 100644 index 00000000..8e95b64b --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/README.md @@ -0,0 +1,14 @@ +# Terraform Build App + +After a new project has been created and bootstrapped, you can update your SageMaker Pipelines and the scripts that are +running in the different steps. The main files and folders to consider are: +- `/ml_pipelines` for adapting any necessary changes to the SageMaker Pipeline definition, such as data sources, instance types + and similar +- `/source_scripts` for changing any of the source code for preprocessing, model evaluation or modelling logic. + +In order for your changes to become effective and retrain the ML model with the new source code, you will need to push the +code to the repository: +- `$ git add -A` +- `$ git commit -m "update pipeline"` +- `$ git push` # Will trigger CodePipeline & deploy/train the SageMaker pipeline +- Approve the resulting model in SageMaker Studio under Model Groups of the SageMaker Project \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/buildspec.yml b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/buildspec.yml new file mode 100644 index 00000000..d6032bb4 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/buildspec.yml @@ -0,0 +1,17 @@ +version: 0.2 + +phases: + install: + runtime-versions: + python: 3.8 + commands: + - pip install --upgrade --force-reinstall .
"awscli>1.20.30" + build: + commands: + - export PYTHONUNBUFFERED=TRUE + - | + run-pipeline --module-name ml_pipelines.training.pipeline \ + --role-arn $SAGEMAKER_PIPELINE_ROLE_ARN \ + --tags "[{\"Key\":\"sagemaker:project-name\", \"Value\":\"${SAGEMAKER_PROJECT_NAME}\"}, {\"Key\":\"sagemaker:project-id\", \"Value\":\"${SAGEMAKER_PROJECT_ID}\"}]" \ + --kwargs "{\"region\":\"${AWS_REGION}\",\"role\":\"${SAGEMAKER_PIPELINE_ROLE_ARN}\",\"default_bucket\":\"${ARTIFACT_BUCKET}\",\"pipeline_name\":\"${PIPELINE_NAME}\",\"model_package_group_name\":\"${MODEL_PACKAGE_GROUP_NAME}\",\"base_job_prefix\":\"${PIPELINE_NAME}\", \"bucket_kms_id\":\"${ARTIFACT_BUCKET_KMS_ID}\"}" + - echo "Create/Update of the SageMaker Pipeline and execution completed." \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap.sh b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap.sh new file mode 100644 index 00000000..db0799a3 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap.sh @@ -0,0 +1,117 @@ +#!/bin/bash + +sudo yum install -y gettext wget unzip jq + +export TERRAFORM_VERSION="1.2.4" + +echo "Attempting to install terraform" && \ +wget -q https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -P /tmp && \ +unzip -q /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /tmp && \ +sudo mv /tmp/terraform /usr/local/bin/ && \ +rm -rf /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip && \ +echo "terraform is installed successfully" + + +# Read the SM_PROJECT_ID from the folder name +DEFAULT_SM_PROJECT_ID=$(cat .sagemaker-code-config | jq -r .sagemakerProjectId) +read -p "Sagemaker Project ID (default \"$DEFAULT_SM_PROJECT_ID\"): " sm_project_id_input 
+export SM_PROJECT_ID="${sm_project_id_input:-$DEFAULT_SM_PROJECT_ID}"
+
+if [[ -z $SM_PROJECT_ID ]]; then
+    echo "No Sagemaker Project ID provided"
+    exit 1
+fi
+
+# Read the SM_PROJECT_NAME from .sagemaker-code-config
+DEFAULT_SM_PROJECT_NAME=$(cat .sagemaker-code-config | jq -r .sagemakerProjectName)
+read -p "Sagemaker Project Name (default \"$DEFAULT_SM_PROJECT_NAME\"): " sm_project_name_input
+export SM_PROJECT_NAME="${sm_project_name_input:-$DEFAULT_SM_PROJECT_NAME}"
+
+if [[ -z $SM_PROJECT_NAME ]]; then
+    echo "No Sagemaker Project Name provided"
+    exit 1
+fi
+
+
+# Fetch values from the CloudFormation stack output
+export STACK_NAME=$(aws cloudformation describe-stacks --query 'Stacks[?Tags[?Key == `sagemaker:project-id` && Value == `'$DEFAULT_SM_PROJECT_ID'`]].{StackName: StackName}' --output text)
+
+export PREFIX=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='Prefix'].OutputValue" --output text)
+
+export STATE_BUCKET_NAME=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='MlOpsProjectStateBucket'].OutputValue" --output text)
+
+export ARTIFACT_BUCKET_NAME=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='MlOpsArtifactsBucket'].OutputValue" --output text)
+
+export BUCKET_REGION=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='BucketRegion'].OutputValue" --output text)
+
+CF_DEFAULT_BRANCH=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='DefaultBranch'].OutputValue" --output text)
+
+export DEFAULT_BRANCH=${CF_DEFAULT_BRANCH:-main}
+
+# Get the AWS Account ID.
Should be set as an environment variable in SageMaker Studio
+read -p "AWS_ACCOUNT_ID (default \"$AWS_ACCOUNT_ID\"): " input_account_id
+export AWS_ACCOUNT_ID="${input_account_id:-$AWS_ACCOUNT_ID}"
+
+if [[ -z "$AWS_ACCOUNT_ID" ]]; then
+    echo "No AWS_ACCOUNT_ID provided"
+    exit 1
+fi
+
+# Get the AWS Region. Should be set as an environment variable in SageMaker Studio
+read -p "AWS_REGION (default \"$AWS_REGION\"): " input_region
+export AWS_REGION="${input_region:-$AWS_REGION}"
+
+if [[ -z "$AWS_REGION" ]]; then
+    echo "No AWS_REGION provided"
+    exit 1
+fi
+
+CODECOMMIT_ID=$(cat .sagemaker-code-config | jq -r .codeRepositoryName)
+read -p "CODECOMMIT_ID (default \"$CODECOMMIT_ID\"): " input_codecommit_id
+export CODECOMMIT_ID="${input_codecommit_id:-$CODECOMMIT_ID}"
+
+if [[ -z "$CODECOMMIT_ID" ]]; then
+    echo "No CODECOMMIT_ID provided"
+    exit 1
+fi
+
+read -p "DEFAULT_BRANCH (default \"$DEFAULT_BRANCH\"): " input_default_branch
+export DEFAULT_BRANCH="${input_default_branch:-$DEFAULT_BRANCH}"
+
+if [[ -z "$DEFAULT_BRANCH" ]]; then
+    echo "No DEFAULT_BRANCH provided"
+    exit 1
+fi
+
+
+echo "--------------Bootstrap Output-----------------"
+echo "Prefix: $PREFIX"
+echo "SageMaker Project ID: $SM_PROJECT_ID"
+echo "SageMaker Project Name: $SM_PROJECT_NAME"
+echo "AWS Account ID: $AWS_ACCOUNT_ID"
+echo "AWS Region: $AWS_REGION"
+echo "Artifact bucket name: $ARTIFACT_BUCKET_NAME"
+echo "State bucket name: $STATE_BUCKET_NAME"
+echo "CodeCommit ID: $CODECOMMIT_ID"
+echo "Default Branch: $DEFAULT_BRANCH"
+echo "-----------------------------------------------"
+
+# Update provider.tf to use the state bucket and region
+envsubst < "./infra_scripts/bootstrap/provider.template" > "./terraform/provider.tf"
+
+# Update terraform.tfvars
+envsubst < "./infra_scripts/bootstrap/terraform.tfvars.template" > "./terraform/terraform.tfvars"
+
+cd terraform
+
+terraform init
+
+cd ..
+ +echo "Commit & update tfvars, terraform provider and sagemaker-code-config" +git add .sagemaker-code-config +git add terraform/provider.tf +git add terraform/terraform.tfvars +git commit --author="SM Projects <>" -m "Bootstrapping complete" +git push +echo "Bootstrapping completed" \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/provider.template b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/provider.template new file mode 100644 index 00000000..d90101c3 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/provider.template @@ -0,0 +1,12 @@ +provider "aws" { + region = "$AWS_REGION" +} + +terraform { + required_version = ">= $TERRAFORM_VERSION" + backend "s3" { + bucket = "$STATE_BUCKET_NAME" + key = "projects/state/build/mlops-$SM_PROJECT_ID-build-state.tfstate" + region = "$AWS_REGION" + } +} diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/terraform.tfvars.template b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/terraform.tfvars.template new file mode 100644 index 00000000..d08f1e0a --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/infra_scripts/bootstrap/terraform.tfvars.template @@ -0,0 +1,8 @@ +aws_region = "$AWS_REGION" +aws_account_id = "$AWS_ACCOUNT_ID" +sm_project_id = "$SM_PROJECT_ID" +sm_project_name = "$SM_PROJECT_NAME" +codecommit_id = "$CODECOMMIT_ID" +prefix = "$PREFIX" +artifact_bucket_name = "$ARTIFACT_BUCKET_NAME" +default_branch = "$DEFAULT_BRANCH" \ No newline at end of file diff --git 
a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/README.md
new file mode 100644
index 00000000..8e309f81
--- /dev/null
+++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/README.md
@@ -0,0 +1,7 @@
+# SageMaker Pipelines
+
+This folder contains SageMaker Pipeline definitions and helper scripts to either "get" a SageMaker Pipeline definition (a JSON dictionary) with `get_pipeline_definition.py`, or "run" a SageMaker Pipeline from a SageMaker Pipeline definition with `run_pipeline.py`.
+
+These files are generic and can be reused to call any SageMaker Pipeline.
+
+Each SageMaker Pipeline definition should be treated as a module inside its own folder, for example the "training" pipeline contained inside `training/`.
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__init__.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__init__.py
new file mode 100644
index 00000000..ff79f21c
--- /dev/null
+++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__init__.py
@@ -0,0 +1,30 @@
+# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. +# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. 
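The helper scripts above pass pipeline keyword arguments around as a Python dict-literal string (the `--kwargs` value in `buildspec.yml`). A minimal sketch of that convention, mirroring the `convert_struct` helper defined in `ml_pipelines/_utils.py` — the example kwargs values here are placeholders, not real resources:

```python
import ast


def convert_struct(str_struct=None):
    # Safely evaluate a Python literal string (dict, list, ...) into its value;
    # an empty/None argument yields an empty dict, matching the CLI default.
    return ast.literal_eval(str_struct) if str_struct else {}


# Example --kwargs string as passed by the CodeBuild buildspec (values are placeholders)
kwargs = convert_struct('{"region": "eu-west-1", "pipeline_name": "my-training-pipeline"}')
print(kwargs["pipeline_name"])
```

Because `ast.literal_eval` only accepts Python literals, arbitrary code in the `--kwargs` string is rejected, which is why the CLIs use it instead of `eval`.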
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__version__.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__version__.py new file mode 100644 index 00000000..660d19ee --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/__version__.py @@ -0,0 +1,26 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+ +"""Metadata for the ml pipelines package.""" + +__title__ = "ml_pipelines" +__description__ = "ml pipelines - template package" +__version__ = "0.0.1" +__author__ = "" +__author_email__ = "" +__license__ = "Apache 2.0" +__url__ = "" diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/_utils.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/_utils.py new file mode 100644 index 00000000..581e1eb7 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/_utils.py @@ -0,0 +1,91 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. 
This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. +# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. + +# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"). You +# may not use this file except in compliance with the License. A copy of +# the License is located at +# +# http://aws.amazon.com/apache2.0/ +# +# or in the "license" file accompanying this file. This file is +# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF +# ANY KIND, either express or implied. See the License for the specific +# language governing permissions and limitations under the License. +"""Provides utilities for SageMaker Pipeline CLI.""" +from __future__ import absolute_import + +import ast + + +def get_pipeline_driver(module_name, passed_args=None): + """Gets the driver for generating your pipeline definition. + + Pipeline modules must define a get_pipeline() module-level method. + + Args: + module_name: The module name of your pipeline. + passed_args: Optional passed arguments that your pipeline may be templated by. + + Returns: + The SageMaker Workflow pipeline. 
+ """ + _imports = __import__(module_name, fromlist=["get_pipeline"]) + kwargs = convert_struct(passed_args) + return _imports.get_pipeline(**kwargs) + + +def convert_struct(str_struct=None): + """convert the string argument to it's proper type + + Args: + str_struct (str, optional): string to be evaluated. Defaults to None. + + Returns: + string struct as it's actuat evaluated type + """ + return ast.literal_eval(str_struct) if str_struct else {} + + +def get_pipeline_custom_tags(module_name, args, tags): + """Gets the custom tags for pipeline + + Returns: + Custom tags to be added to the pipeline + """ + try: + _imports = __import__(module_name, fromlist=["get_pipeline_custom_tags"]) + kwargs = convert_struct(args) + return _imports.get_pipeline_custom_tags(tags, kwargs["region"], kwargs["sagemaker_project_arn"]) + except Exception as e: + print(f"Error getting project tags: {e}") + return tags diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/get_pipeline_definition.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/get_pipeline_definition.py new file mode 100644 index 00000000..edfb6b40 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/get_pipeline_definition.py @@ -0,0 +1,77 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. 
+# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + + +"""A CLI to get pipeline definitions from pipeline modules.""" +from __future__ import absolute_import + +import argparse +import sys + +from ml_pipelines._utils import get_pipeline_driver + + +def main(): # pragma: no cover + """The main harness that gets the pipeline definition JSON. + + Prints the json to stdout or saves to file. + """ + parser = argparse.ArgumentParser("Gets the pipeline definition for the pipeline script.") + + parser.add_argument( + "-n", + "--module-name", + dest="module_name", + type=str, + help="The module name of the pipeline to import.", + ) + parser.add_argument( + "-f", + "--file-name", + dest="file_name", + type=str, + default=None, + help="The file to output the pipeline definition json to.", + ) + parser.add_argument( + "-kwargs", + "--kwargs", + dest="kwargs", + default=None, + help="Dict string of keyword arguments for the pipeline generation (if supported)", + ) + args = parser.parse_args() + + if args.module_name is None: + parser.print_help() + sys.exit(2) + + try: + pipeline = get_pipeline_driver(args.module_name, args.kwargs) + content = pipeline.definition() + if args.file_name: + with open(args.file_name, "w") as f: + f.write(content) + else: + print(content) + except Exception as e: # pylint: disable=W0703 + print(f"Exception: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/run_pipeline.py 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/run_pipeline.py new file mode 100644 index 00000000..7d07235d --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/run_pipeline.py @@ -0,0 +1,110 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +"""A CLI to create or update and run pipelines.""" +from __future__ import absolute_import + +import argparse +import json +import sys + +from ml_pipelines._utils import get_pipeline_driver, convert_struct, get_pipeline_custom_tags + + +def main(): # pragma: no cover + """The main harness that creates or updates and runs the pipeline. + + Creates or updates the pipeline and runs it. 
+ """ + parser = argparse.ArgumentParser("Creates or updates and runs the pipeline for the pipeline script.") + + parser.add_argument( + "-n", + "--module-name", + dest="module_name", + type=str, + help="The module name of the pipeline to import.", + ) + parser.add_argument( + "-kwargs", + "--kwargs", + dest="kwargs", + default=None, + help="Dict string of keyword arguments for the pipeline generation (if supported)", + ) + parser.add_argument( + "-role-arn", + "--role-arn", + dest="role_arn", + type=str, + help="The role arn for the pipeline service execution role.", + ) + parser.add_argument( + "-description", + "--description", + dest="description", + type=str, + default=None, + help="The description of the pipeline.", + ) + parser.add_argument( + "-tags", + "--tags", + dest="tags", + default=None, + help="""List of dict strings of '[{"Key": "string", "Value": "string"}, ..]'""", + ) + args = parser.parse_args() + + if args.module_name is None or args.role_arn is None: + parser.print_help() + sys.exit(2) + tags = convert_struct(args.tags) + + try: + pipeline = get_pipeline_driver(args.module_name, args.kwargs) + print("###### Creating/updating a SageMaker Pipeline with the following definition:") + parsed = json.loads(pipeline.definition()) + print(json.dumps(parsed, indent=2, sort_keys=True)) + + all_tags = get_pipeline_custom_tags(args.module_name, args.kwargs, tags) + + upsert_response = pipeline.upsert(role_arn=args.role_arn, description=args.description, tags=all_tags) + + upsert_response = pipeline.upsert( + role_arn=args.role_arn, description=args.description + ) # , tags=tags) # Removing tag momentaneously + print("\n###### Created/Updated SageMaker Pipeline: Response received:") + print(upsert_response) + + execution = pipeline.start() + print(f"\n###### Execution started with PipelineExecutionArn: {execution.arn}") + + # TODO removiong wait time as training can take some time + #print("Waiting for the execution to finish...") + #execution.wait() + 
#print("\n#####Execution completed. Execution step details:") + print("Execution started. To view the pipeline execution go to SageMaker Studio -> Projects -> Your Project -> Pipelines") + + #print(execution.list_steps()) + except Exception as e: # pylint: disable=W0703 + print(f"Exception: {e}") + sys.exit(1) + + +if __name__ == "__main__": + main() diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/README.md new file mode 100644 index 00000000..8a493ac6 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/README.md @@ -0,0 +1,7 @@ +# Training SageMaker Pipeline + +This SageMaker Pipeline definition creates a workflow that will: +- Prepare the Abalone dataset through a SageMaker Processing Job +- Train an XGBoost algorithm on the train set +- Evaluate the performance of the trained XGBoost algorithm on the validation set +- If the performance reaches a specified threshold, send the model for Manual Approval to SageMaker Model Registry. diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/__init__.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/__init__.py new file mode 100644 index 00000000..ff79f21c --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/__init__.py @@ -0,0 +1,30 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +# © 2021 Amazon Web Services, Inc. or its affiliates. All Rights Reserved. This +# AWS Content is provided subject to the terms of the AWS Customer Agreement +# available at http://aws.amazon.com/agreement or other written agreement between +# Customer and either Amazon Web Services, Inc. or Amazon Web Services EMEA SARL +# or both. +# +# Any code, applications, scripts, templates, proofs of concept, documentation +# and other items provided by AWS under this SOW are "AWS Content," as defined +# in the Agreement, and are provided for illustration purposes only. All such +# AWS Content is provided solely at the option of AWS, and is subject to the +# terms of the Addendum and the Agreement. Customer is solely responsible for +# using, deploying, testing, and supporting any code and applications provided +# by AWS under this SOW. 
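The training README above describes a condition step that only registers the model when the evaluation metric clears a threshold. A self-contained sketch of that gating logic — the report layout follows the `evaluation.json` written by the evaluation step in the Abalone seed code, and the threshold value here is an illustrative assumption (the real pipeline expresses this as a `ConditionLessThanOrEqualTo` over a `JsonGet` on the evaluation report):

```python
def should_register(evaluation_report: dict, mse_threshold: float = 6.0) -> bool:
    # Mirrors the pipeline's condition step: send the model to the registry
    # (pending manual approval) only when regression MSE is at or below the threshold.
    mse = evaluation_report["regression_metrics"]["mse"]["value"]
    return mse <= mse_threshold


# A model evaluated at MSE 4.2 clears the assumed threshold of 6.0 and would be registered
print(should_register({"regression_metrics": {"mse": {"value": 4.2}}}))
```

In the actual pipeline this decision runs inside SageMaker Pipelines at execution time, so the threshold is part of the pipeline definition, not of the evaluation script.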
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/_utils.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/_utils.py new file mode 100644 index 00000000..78330433 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/_utils.py @@ -0,0 +1,86 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+
+import logging
+
+from botocore.exceptions import ClientError
+
+logger = logging.getLogger(__name__)
+
+
+def resolve_ecr_uri_from_image_versions(sagemaker_session, image_versions, image_name):
+    """Gets the ECR URI from a list of image versions.
+
+    Args:
+        sagemaker_session: sagemaker session with a boto3 sagemaker client
+        image_versions: list of the image versions
+        image_name: name of the image
+
+    Returns:
+        ECR URI of the image version
+    """
+
+    # Fetch image details to get the base image URI
+    for image_version in image_versions:
+        if image_version["ImageVersionStatus"] == "CREATED":
+            image_arn = image_version["ImageVersionArn"]
+            version = image_version["Version"]
+            logger.info(f"Identified the latest image version: {image_arn}")
+            response = sagemaker_session.sagemaker_client.describe_image_version(ImageName=image_name, Version=version)
+            return response["ContainerImage"]
+    return None
+
+
+def resolve_ecr_uri(sagemaker_session, image_arn):
+    """Gets the ECR URI from the image ARN.
+
+    Args:
+        sagemaker_session: sagemaker session with a boto3 sagemaker client
+        image_arn: ARN of the image
+
+    Returns:
+        ECR URI of the latest image version
+    """
+
+    # Extract the image name from image_arn (^arn:aws(-[\w]+)*:sagemaker:.+:[0-9]{12}:image/[a-z0-9]([-.]?[a-z0-9])*$)
+    image_name = image_arn.partition("image/")[2]
+    try:
+        # Fetch the image versions
+        next_token = ""
+        while True:
+            response = sagemaker_session.sagemaker_client.list_image_versions(
+                ImageName=image_name, MaxResults=100, SortBy="VERSION", SortOrder="DESCENDING", NextToken=next_token
+            )
+
+            ecr_uri = resolve_ecr_uri_from_image_versions(sagemaker_session, response["ImageVersions"], image_name)
+
+            if ecr_uri is not None:
+                return ecr_uri
+
+            if "NextToken" in response:
+                next_token = response["NextToken"]
+            else:
+                break
+
+        # Raise an error if no version of the image was found
+        error_message = f"No image version found for image name: {image_name}"
+        logger.error(error_message)
+        raise Exception(error_message)
+
+    except (ClientError,
sagemaker_session.sagemaker_client.exceptions.ResourceNotFound) as e: + error_message = e.response["Error"]["Message"] + logger.error(error_message) + raise Exception(error_message) diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/pipeline.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/pipeline.py new file mode 100644 index 00000000..de25b8ba --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/ml_pipelines/training/pipeline.py @@ -0,0 +1,331 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +"""Example workflow pipeline script for abalone pipeline. + + . -RegisterModel + . + Process-> Train -> Evaluate -> Condition . + . + . -(stop) + +Implements a get_pipeline(**kwargs) method. 
+""" +import os + +import boto3 +import logging +import sagemaker +import sagemaker.session + +from sagemaker.estimator import Estimator +from sagemaker.inputs import TrainingInput +from sagemaker.model_metrics import ( + MetricsSource, + ModelMetrics, +) +from sagemaker.processing import ( + ProcessingInput, + ProcessingOutput, + ScriptProcessor, +) +from sagemaker.sklearn.processing import SKLearnProcessor +from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo +from sagemaker.workflow.condition_step import ( + ConditionStep, +) +from sagemaker.workflow.functions import ( + JsonGet, +) +from sagemaker.workflow.parameters import ( + ParameterInteger, + ParameterString, +) +from sagemaker.workflow.pipeline import Pipeline +from sagemaker.workflow.properties import PropertyFile +from sagemaker.workflow.steps import ( + ProcessingStep, + TrainingStep, +) +from sagemaker.workflow.step_collections import RegisterModel + +from botocore.exceptions import ClientError +from sagemaker.network import NetworkConfig + + +# BASE_DIR = os.path.dirname(os.path.realpath(__file__)) + +logger = logging.getLogger(__name__) + + +def get_session(region, default_bucket): + """Gets the sagemaker session based on the region. 
+
+    Args:
+        region: the AWS region to start the session in
+        default_bucket: the bucket to use for storing the artifacts
+
+    Returns:
+        `sagemaker.session.Session` instance
+    """
+
+    boto_session = boto3.Session(region_name=region)
+
+    sagemaker_client = boto_session.client("sagemaker")
+    runtime_client = boto_session.client("sagemaker-runtime")
+    session = sagemaker.session.Session(
+        boto_session=boto_session,
+        sagemaker_client=sagemaker_client,
+        sagemaker_runtime_client=runtime_client,
+        default_bucket=default_bucket,
+    )
+
+    return session
+
+
+def get_pipeline(
+    region,
+    role=None,
+    default_bucket=None,
+    bucket_kms_id=None,
+    model_package_group_name="AbalonePackageGroup",
+    pipeline_name="AbalonePipeline",
+    base_job_prefix="Abalone",
+    project_id="SageMakerProjectId",
+):
+    """Gets a SageMaker ML Pipeline instance working on abalone data.
+
+    Args:
+        region: AWS region to create and run the pipeline.
+        role: IAM role to create and run steps and pipeline.
+        default_bucket: the bucket to use for storing the artifacts
+        bucket_kms_id: KMS key id used to encrypt the job outputs
+        model_package_group_name: model package group to register the model in
+        pipeline_name: name of the pipeline
+        base_job_prefix: prefix used for the job names
+        project_id: id of the SageMaker project
+
+    Returns:
+        an instance of a pipeline
+    """
+
+    sagemaker_session = get_session(region, default_bucket)
+    if role is None:
+        role = sagemaker.session.get_execution_role(sagemaker_session)
+
+    # parameters for pipeline execution
+    processing_instance_count = ParameterInteger(name="ProcessingInstanceCount", default_value=1)
+    processing_instance_type = ParameterString(name="ProcessingInstanceType", default_value="ml.m4.xlarge")
+    training_instance_type = ParameterString(name="TrainingInstanceType", default_value="ml.m4.xlarge")
+    inference_instance_type = ParameterString(name="InferenceInstanceType", default_value="ml.m4.xlarge")
+    model_approval_status = ParameterString(name="ModelApprovalStatus", default_value="PendingManualApproval")
+    input_data = ParameterString(
+        name="InputDataUrl",
+        default_value=f"s3://sagemaker-servicecatalog-seedcode-{region}/dataset/abalone-dataset.csv",
+    )
+    processing_image_name = 
"sagemaker-{0}-processingimagebuild".format(project_id) + training_image_name = "sagemaker-{0}-trainingimagebuild".format(project_id) + inference_image_name = "sagemaker-{0}-inferenceimagebuild".format(project_id) + + # network_config = NetworkConfig( + # enable_network_isolation=True, + # security_group_ids=security_group_ids, + # subnets=subnets, + # encrypt_inter_container_traffic=True, + # ) + + # processing step for feature engineering + try: + processing_image_uri = sagemaker_session.sagemaker_client.describe_image_version( + ImageName=processing_image_name + )["ContainerImage"] + except (sagemaker_session.sagemaker_client.exceptions.ResourceNotFound): + processing_image_uri = sagemaker.image_uris.retrieve( + framework="xgboost", + region=region, + version="1.0-1", + py_version="py3", + instance_type="ml.m4.xlarge", + ) + script_processor = ScriptProcessor( + image_uri=processing_image_uri, + instance_type=processing_instance_type, + instance_count=processing_instance_count, + base_job_name=f"{base_job_prefix}/sklearn-abalone-preprocess", + command=["python3"], + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + ) + step_process = ProcessingStep( + name="PreprocessAbaloneData", + processor=script_processor, + outputs=[ + ProcessingOutput(output_name="train", source="/opt/ml/processing/train"), + ProcessingOutput(output_name="validation", source="/opt/ml/processing/validation"), + ProcessingOutput(output_name="test", source="/opt/ml/processing/test"), + ], + code="source_scripts/preprocessing/prepare_abalone_data/main.py", # we must figure out this path to get it from step_source directory + job_arguments=["--input-data", input_data], + ) + + # training step for generating model artifacts + model_path = f"s3://{default_bucket}/{base_job_prefix}/AbaloneTrain" + + try: + training_image_uri = sagemaker_session.sagemaker_client.describe_image_version(ImageName=training_image_name)[ + "ContainerImage" + ] + except 
(sagemaker_session.sagemaker_client.exceptions.ResourceNotFound): + training_image_uri = sagemaker.image_uris.retrieve( + framework="xgboost", + region=region, + version="1.0-1", + py_version="py3", + instance_type="ml.m4.xlarge", + ) + + xgb_train = Estimator( + image_uri=training_image_uri, + instance_type=training_instance_type, + instance_count=1, + output_path=model_path, + base_job_name=f"{base_job_prefix}/abalone-train", + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + ) + xgb_train.set_hyperparameters( + objective="reg:linear", + num_round=50, + max_depth=5, + eta=0.2, + gamma=4, + min_child_weight=6, + subsample=0.7, + silent=0, + ) + step_train = TrainingStep( + name="TrainAbaloneModel", + estimator=xgb_train, + inputs={ + "train": TrainingInput( + s3_data=step_process.properties.ProcessingOutputConfig.Outputs["train"].S3Output.S3Uri, + content_type="text/csv", + ), + "validation": TrainingInput( + s3_data=step_process.properties.ProcessingOutputConfig.Outputs["validation"].S3Output.S3Uri, + content_type="text/csv", + ), + }, + ) + + # processing step for evaluation + script_eval = ScriptProcessor( + image_uri=training_image_uri, + command=["python3"], + instance_type=processing_instance_type, + instance_count=1, + base_job_name=f"{base_job_prefix}/script-abalone-eval", + sagemaker_session=sagemaker_session, + role=role, + output_kms_key=bucket_kms_id, + ) + evaluation_report = PropertyFile( + name="AbaloneEvaluationReport", + output_name="evaluation", + path="evaluation.json", + ) + step_eval = ProcessingStep( + name="EvaluateAbaloneModel", + processor=script_eval, + inputs=[ + ProcessingInput( + source=step_train.properties.ModelArtifacts.S3ModelArtifacts, + destination="/opt/ml/processing/model", + ), + ProcessingInput( + source=step_process.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri, + destination="/opt/ml/processing/test", + ), + ], + outputs=[ + ProcessingOutput(output_name="evaluation", 
source="/opt/ml/processing/evaluation"),
+        ],
+        code="source_scripts/evaluate/evaluate_xgboost/main.py",
+        property_files=[evaluation_report],
+    )
+
+    # register model step that will be conditionally executed
+    model_metrics = ModelMetrics(
+        model_statistics=MetricsSource(
+            s3_uri="{}/evaluation.json".format(
+                step_eval.arguments["ProcessingOutputConfig"]["Outputs"][0]["S3Output"]["S3Uri"]
+            ),
+            content_type="application/json",
+        )
+    )
+
+    try:
+        inference_image_uri = sagemaker_session.sagemaker_client.describe_image_version(ImageName=inference_image_name)[
+            "ContainerImage"
+        ]
+    except (sagemaker_session.sagemaker_client.exceptions.ResourceNotFound):
+        inference_image_uri = sagemaker.image_uris.retrieve(
+            framework="xgboost",
+            region=region,
+            version="1.0-1",
+            py_version="py3",
+            instance_type="ml.m4.xlarge",
+        )
+    step_register = RegisterModel(
+        name="RegisterAbaloneModel",
+        estimator=xgb_train,
+        image_uri=inference_image_uri,
+        model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
+        content_types=["text/csv"],
+        response_types=["text/csv"],
+        inference_instances=["ml.m4.xlarge"],
+        transform_instances=["ml.m4.xlarge"],
+        model_package_group_name=model_package_group_name,
+        approval_status=model_approval_status,
+        model_metrics=model_metrics,
+    )
+
+    # condition step for evaluating model quality and branching execution
+    cond_lte = ConditionLessThanOrEqualTo(
+        left=JsonGet(
+            step_name=step_eval.name, property_file=evaluation_report, json_path="regression_metrics.mse.value"
+        ),
+        right=6.0,
+    )
+    step_cond = ConditionStep(
+        name="CheckMSEAbaloneEvaluation",
+        conditions=[cond_lte],
+        if_steps=[step_register],
+        else_steps=[],
+    )
+
+    # pipeline instance
+    pipeline = Pipeline(
+        name=pipeline_name,
+        parameters=[
+            processing_instance_type,
+            processing_instance_count,
+            training_instance_type,
+            inference_instance_type,
+            model_approval_status,
+            input_data,
+        ],
+        steps=[step_process, step_train, step_eval, step_cond],
+        
sagemaker_session=sagemaker_session, + ) + return pipeline diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/README.md new file mode 100644 index 00000000..c0749333 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/README.md @@ -0,0 +1,4 @@ +# Jupyter Notebooks + +This folder is intended to store your experiment notebooks. +Typically the first step would be to store your Data Science notebooks, and start defining example SageMaker pipelines in here. Once satisfied with the first iteration of a SageMaker pipeline, the code should move as python scripts inside the respective `ml_pipelines/` and `source_scripts/` folders. diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.ipynb b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.ipynb new file mode 100644 index 00000000..58db3199 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sagemaker-pipelines-preprocess-train-evaluate-batch-transform.ipynb @@ -0,0 +1,1577 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Orchestrate Jobs to Train and Evaluate Models with Amazon SageMaker Pipelines\n", + "\n", + "Amazon SageMaker Pipelines offers machine learning (ML) application developers and operations engineers the ability to orchestrate SageMaker jobs and author reproducible ML pipelines. 
It also enables them to deploy custom-built models for inference in real-time with low latency, run offline inferences with Batch Transform, and track lineage of artifacts. They can institute sound operational practices in deploying and monitoring production workflows, deploying model artifacts, and tracking artifact lineage through a simple interface, adhering to safety and best practice paradigms for ML application development.\n",
+    "\n",
+    "The SageMaker Pipelines service supports a SageMaker Pipeline domain-specific language (DSL), which is a declarative JSON specification. This DSL defines a directed acyclic graph (DAG) of pipeline parameters and SageMaker job steps. The SageMaker Python Software Development Kit (SDK) streamlines the generation of the pipeline DSL using constructs that engineers and scientists are already familiar with.\n",
+    "\n",
+    "## Runtime\n",
+    "\n",
+    "This notebook takes approximately an hour to run.\n",
+    "\n",
+    "## Contents\n",
+    "\n",
+    "1. [SageMaker Pipelines](#SageMaker-Pipelines)\n",
+    "1. [Notebook Overview](#Notebook-Overview)\n",
+    "1. [A SageMaker Pipeline](#A-SageMaker-Pipeline)\n",
+    "1. [Dataset](#Dataset)\n",
+    "1. [Define Parameters to Parametrize Pipeline Execution](#Define-Parameters-to-Parametrize-Pipeline-Execution)\n",
+    "1. [Define a Processing Step for Feature Engineering](#Define-a-Processing-Step-for-Feature-Engineering)\n",
+    "1. [Define a Training Step to Train a Model](#Define-a-Training-Step-to-Train-a-Model)\n",
+    "1. [Define a Model Evaluation Step to Evaluate the Trained Model](#Define-a-Model-Evaluation-Step-to-Evaluate-the-Trained-Model)\n",
+    "1. [Define a Create Model Step to Create a Model](#Define-a-Create-Model-Step-to-Create-a-Model)\n",
+    "1. [Define a Transform Step to Perform Batch Transformation](#Define-a-Transform-Step-to-Perform-Batch-Transformation)\n",
+    "1. [Define a Register Model Step to Create a Model Package](#Define-a-Register-Model-Step-to-Create-a-Model-Package)\n",
+    "1. 
[Define a Fail Step to Terminate the Pipeline Execution and Mark it as Failed](#Define-a-Fail-Step-to-Terminate-the-Pipeline-Execution-and-Mark-it-as-Failed)\n", + "1. [Define a Condition Step to Check Accuracy and Conditionally Create a Model and Run a Batch Transformation and Register a Model in the Model Registry, Or Terminate the Execution in Failed State](#Define-a-Condition-Step-to-Check-Accuracy-and-Conditionally-Create-a-Model-and-Run-a-Batch-Transformation-and-Register-a-Model-in-the-Model-Registry,-Or-Terminate-the-Execution-in-Failed-State)\n", + "1. [Define a Pipeline of Parameters, Steps, and Conditions](#Define-a-Pipeline-of-Parameters,-Steps,-and-Conditions)\n", + "1. [Submit the pipeline to SageMaker and start execution](#Submit-the-pipeline-to-SageMaker-and-start-execution)\n", + "1. [Pipeline Operations: Examining and Waiting for Pipeline Execution](#Pipeline-Operations:-Examining-and-Waiting-for-Pipeline-Execution)\n", + " 1. [Examining the Evaluation](#Examining-the-Evaluation)\n", + " 1. [Lineage](#Lineage)\n", + " 1. 
[Parametrized Executions](#Parametrized-Executions)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## SageMaker Pipelines\n", + "\n", + "SageMaker Pipelines supports the following activities, which are demonstrated in this notebook:\n", + "\n", + "* Pipelines - A DAG of steps and conditions to orchestrate SageMaker jobs and resource creation.\n", + "* Processing job steps - A simplified, managed experience on SageMaker to run data processing workloads, such as feature engineering, data validation, model evaluation, and model interpretation.\n", + "* Training job steps - An iterative process that teaches a model to make predictions by presenting examples from a training dataset.\n", + "* Conditional execution steps - A step that provides conditional execution of branches in a pipeline.\n", + "* Register model steps - A step that creates a model package resource in the Model Registry that can be used to create deployable models in Amazon SageMaker.\n", + "* Create model steps - A step that creates a model for use in transform steps or later publication as an endpoint.\n", + "* Transform job steps - A batch transform to preprocess datasets to remove noise or bias that interferes with training or inference from a dataset, get inferences from large datasets, and run inference when a persistent endpoint is not needed.\n", + "* Fail steps - A step that stops a pipeline execution and marks the pipeline execution as failed.\n", + "* Parametrized Pipeline executions - Enables variation in pipeline executions according to specified parameters." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Notebook Overview\n", + "\n", + "This notebook shows how to:\n", + "\n", + "* Define a set of Pipeline parameters that can be used to parametrize a SageMaker Pipeline.\n", + "* Define a Processing step that performs cleaning, feature engineering, and splitting the input data into train and test data sets.\n", + "* Define a Training step that trains a model on the preprocessed train data set.\n", + "* Define a Processing step that evaluates the trained model's performance on the test dataset.\n", + "* Define a Create Model step that creates a model from the model artifacts used in training.\n", + "* Define a Transform step that performs batch transformation based on the model that was created.\n", + "* Define a Register Model step that creates a model package from the estimator and model artifacts used to train the model.\n", + "* Define a Conditional step that measures a condition based on output from prior steps and conditionally executes other steps.\n", + "* Define a Fail step with a customized error message indicating the cause of the execution failure.\n", + "* Define and create a Pipeline definition in a DAG, with the defined parameters and steps.\n", + "* Start a Pipeline execution and wait for execution to complete.\n", + "* Download the model evaluation report from the S3 bucket for examination.\n", + "* Start a second Pipeline execution." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## A SageMaker Pipeline\n", + "\n", + "The pipeline that you create follows a typical machine learning (ML) application pattern of preprocessing, training, evaluation, model creation, batch transformation, and model registration:\n", + "\n", + "![A typical ML Application pipeline](img/pipeline-full.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Dataset\n", + "\n", + "The dataset you use is the [UCI Machine Learning Abalone Dataset](https://archive.ics.uci.edu/ml/datasets/abalone) [1]. The aim for this task is to determine the age of an abalone snail from its physical measurements. At the core, this is a regression problem.\n", + "\n", + "The dataset contains several features: length (the longest shell measurement), diameter (the diameter perpendicular to length), height (the height with meat in the shell), whole_weight (the weight of whole abalone), shucked_weight (the weight of meat), viscera_weight (the gut weight after bleeding), shell_weight (the weight after being dried), sex ('M', 'F', 'I' where 'I' is Infant), and rings (integer).\n", + "\n", + "The number of rings turns out to be a good approximation for age (age is rings + 1.5). However, to obtain this number requires cutting the shell through the cone, staining the section, and counting the number of rings through a microscope, which is a time-consuming task. However, the other physical measurements are easier to determine. You use the dataset to build a predictive model of the variable rings through these other physical measurements.\n", + "\n", + "Before you upload the data to an S3 bucket, install the SageMaker Python SDK and gather some constants you can use later in this notebook.\n", + "\n", + "> [1] Dua, D. and Graff, C. (2019). [UCI Machine Learning Repository](http://archive.ics.uci.edu/ml). 
Irvine, CA: University of California, School of Information and Computer Science."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "import sys\n",
+    "\n",
+    "!{sys.executable} -m pip install \"sagemaker>=2.99.0\"\n",
+    "\n",
+    "import boto3\n",
+    "import sagemaker\n",
+    "from sagemaker.workflow.pipeline_context import PipelineSession\n",
+    "\n",
+    "sagemaker_session = sagemaker.session.Session()\n",
+    "region = sagemaker_session.boto_region_name\n",
+    "role = sagemaker.get_execution_role()\n",
+    "pipeline_session = PipelineSession()\n",
+    "default_bucket = sagemaker_session.default_bucket()\n",
+    "model_package_group_name = \"AbaloneModelPackageGroupName\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "Now, upload the data into the default bucket. You can select your own dataset for the `input_data_uri` as is appropriate."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "!mkdir -p data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "local_path = \"data/abalone-dataset.csv\"\n",
+    "\n",
+    "s3 = boto3.resource(\"s3\")\n",
+    "s3.Bucket(\"sagemaker-sample-files\").download_file(\n",
+    "    \"datasets/tabular/uci_abalone/abalone.csv\", local_path\n",
+    ")\n",
+    "\n",
+    "base_uri = f\"s3://{default_bucket}/abalone\"\n",
+    "input_data_uri = sagemaker.s3.S3Uploader.upload(\n",
+    "    local_path=local_path,\n",
+    "    desired_s3_uri=base_uri,\n",
+    ")\n",
+    "print(input_data_uri)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "Download a second dataset for batch transformation after model creation. 
You can select your own dataset for the `batch_data_uri` as is appropriate."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "pycharm": {
+     "name": "#%%\n"
+    }
+   },
+   "outputs": [],
+   "source": [
+    "local_path = \"data/abalone-dataset-batch\"\n",
+    "\n",
+    "s3 = boto3.resource(\"s3\")\n",
+    "s3.Bucket(f\"sagemaker-servicecatalog-seedcode-{region}\").download_file(\n",
+    "    \"dataset/abalone-dataset-batch\", local_path\n",
+    ")\n",
+    "\n",
+    "base_uri = f\"s3://{default_bucket}/abalone\"\n",
+    "batch_data_uri = sagemaker.s3.S3Uploader.upload(\n",
+    "    local_path=local_path,\n",
+    "    desired_s3_uri=base_uri,\n",
+    ")\n",
+    "print(batch_data_uri)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "pycharm": {
+     "name": "#%% md\n"
+    }
+   },
+   "source": [
+    "## Define Parameters to Parametrize Pipeline Execution\n",
+    "\n",
+    "Define Pipeline parameters that you can use to parametrize the pipeline. Parameters enable custom pipeline executions and schedules without having to modify the Pipeline definition.\n",
+    "\n",
+    "The supported parameter types include:\n",
+    "\n",
+    "* `ParameterString` - represents a `str` Python type\n",
+    "* `ParameterInteger` - represents an `int` Python type\n",
+    "* `ParameterFloat` - represents a `float` Python type\n",
+    "\n",
+    "These parameters support providing a default value, which can be overridden on pipeline execution. 
The default value specified should be an instance of the type of the parameter.\n", + "\n", + "The parameters defined in this workflow include:\n", + "\n", + "* `processing_instance_count` - The instance count of the processing job.\n", + "* `instance_type` - The `ml.*` instance type of the training job.\n", + "* `model_approval_status` - The approval status to register with the trained model for CI/CD purposes (\"PendingManualApproval\" is the default).\n", + "* `input_data` - The S3 bucket URI location of the input data.\n", + "* `batch_data` - The S3 bucket URI location of the batch data.\n", + "* `mse_threshold` - The Mean Squared Error (MSE) threshold used to verify the accuracy of a model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.workflow.parameters import (\n", + " ParameterInteger,\n", + " ParameterString,\n", + " ParameterFloat,\n", + ")\n", + "\n", + "processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n", + "instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m4.xlarge\")\n", + "model_approval_status = ParameterString(\n", + " name=\"ModelApprovalStatus\", default_value=\"PendingManualApproval\"\n", + ")\n", + "input_data = ParameterString(\n", + " name=\"InputData\",\n", + " default_value=input_data_uri,\n", + ")\n", + "batch_data = ParameterString(\n", + " name=\"BatchData\",\n", + " default_value=batch_data_uri,\n", + ")\n", + "mse_threshold = ParameterFloat(name=\"MseThreshold\", default_value=6.0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define Parameters](img/pipeline-1.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Processing Step for Feature Engineering\n", + "\n", + "First, 
develop a preprocessing script that is specified in the Processing step.\n", + "\n", + "This notebook cell writes a file `preprocessing_abalone.py`, which contains the preprocessing script. You can update the script, and rerun this cell to overwrite. The preprocessing script uses `scikit-learn` to do the following:\n", + "\n", + "* Fill in missing sex category data and encode it so that it is suitable for training.\n", + "* Scale and normalize all numerical fields, aside from sex and rings numerical data.\n", + "* Split the data into training, validation, and test datasets.\n", + "\n", + "The Processing step executes the script on the input data. The Training step uses the preprocessed training features and labels to train a model. The Evaluation step uses the trained model and preprocessed test features and labels to evaluate the model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "!mkdir -p code" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "%%writefile code/preprocessing.py\n", + "import argparse\n", + "import os\n", + "import requests\n", + "import tempfile\n", + "\n", + "import numpy as np\n", + "import pandas as pd\n", + "\n", + "from sklearn.compose import ColumnTransformer\n", + "from sklearn.impute import SimpleImputer\n", + "from sklearn.pipeline import Pipeline\n", + "from sklearn.preprocessing import StandardScaler, OneHotEncoder\n", + "\n", + "\n", + "# Since we get a headerless CSV file, we specify the column names here.\n", + "feature_columns_names = [\n", + " \"sex\",\n", + " \"length\",\n", + " \"diameter\",\n", + " \"height\",\n", + " \"whole_weight\",\n", + " \"shucked_weight\",\n", + " \"viscera_weight\",\n", + " \"shell_weight\",\n", + "]\n", + "label_column = \"rings\"\n", + "\n", + "feature_columns_dtype = {\n", + " \"sex\": 
str,\n", + " \"length\": np.float64,\n", + " \"diameter\": np.float64,\n", + " \"height\": np.float64,\n", + " \"whole_weight\": np.float64,\n", + " \"shucked_weight\": np.float64,\n", + " \"viscera_weight\": np.float64,\n", + " \"shell_weight\": np.float64,\n", + "}\n", + "label_column_dtype = {\"rings\": np.float64}\n", + "\n", + "\n", + "def merge_two_dicts(x, y):\n", + " z = x.copy()\n", + " z.update(y)\n", + " return z\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " base_dir = \"/opt/ml/processing\"\n", + "\n", + " df = pd.read_csv(\n", + " f\"{base_dir}/input/abalone-dataset.csv\",\n", + " header=None,\n", + " names=feature_columns_names + [label_column],\n", + " dtype=merge_two_dicts(feature_columns_dtype, label_column_dtype),\n", + " )\n", + " numeric_features = list(feature_columns_names)\n", + " numeric_features.remove(\"sex\")\n", + " numeric_transformer = Pipeline(\n", + " steps=[(\"imputer\", SimpleImputer(strategy=\"median\")), (\"scaler\", StandardScaler())]\n", + " )\n", + "\n", + " categorical_features = [\"sex\"]\n", + " categorical_transformer = Pipeline(\n", + " steps=[\n", + " (\"imputer\", SimpleImputer(strategy=\"constant\", fill_value=\"missing\")),\n", + " (\"onehot\", OneHotEncoder(handle_unknown=\"ignore\")),\n", + " ]\n", + " )\n", + "\n", + " preprocess = ColumnTransformer(\n", + " transformers=[\n", + " (\"num\", numeric_transformer, numeric_features),\n", + " (\"cat\", categorical_transformer, categorical_features),\n", + " ]\n", + " )\n", + "\n", + " y = df.pop(\"rings\")\n", + " X_pre = preprocess.fit_transform(df)\n", + " y_pre = y.to_numpy().reshape(len(y), 1)\n", + "\n", + " X = np.concatenate((y_pre, X_pre), axis=1)\n", + "\n", + " np.random.shuffle(X)\n", + " train, validation, test = np.split(X, [int(0.7 * len(X)), int(0.85 * len(X))])\n", + "\n", + " pd.DataFrame(train).to_csv(f\"{base_dir}/train/train.csv\", header=False, index=False)\n", + " pd.DataFrame(validation).to_csv(\n", + " 
f\"{base_dir}/validation/validation.csv\", header=False, index=False\n", + " )\n", + " pd.DataFrame(test).to_csv(f\"{base_dir}/test/test.csv\", header=False, index=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Next, create an instance of a `SKLearnProcessor` processor and use that in our `ProcessingStep`.\n", + "\n", + "You also specify the `framework_version` to use throughout this notebook.\n", + "\n", + "Note the `processing_instance_count` parameter used by the processor instance." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.sklearn.processing import SKLearnProcessor\n", + "\n", + "\n", + "framework_version = \"0.23-1\"\n", + "\n", + "sklearn_processor = SKLearnProcessor(\n", + " framework_version=framework_version,\n", + " instance_type=\"ml.m4.xlarge\",\n", + " instance_count=processing_instance_count,\n", + " base_job_name=\"sklearn-abalone-process\",\n", + " role=role,\n", + " sagemaker_session=pipeline_session,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Finally, we take the output of the processor's `run` method and pass that as arguments to the `ProcessingStep`. By passing the `pipeline_session` to the `sagemaker_session`, calling `.run()` does not launch the processing job, it returns the arguments needed to run the job as a step in the pipeline.\n", + "\n", + "Note the `\"train_data\"` and `\"test_data\"` named channels specified in the output configuration for the processing job. Step `Properties` can be used in subsequent steps and resolve to their runtime values at execution. Specifically, this usage is called out when you define the training step." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.processing import ProcessingInput, ProcessingOutput\n", + "from sagemaker.workflow.steps import ProcessingStep\n", + "\n", + "processor_args = sklearn_processor.run(\n", + " inputs=[\n", + " ProcessingInput(source=input_data, destination=\"/opt/ml/processing/input\"),\n", + " ],\n", + " outputs=[\n", + " ProcessingOutput(output_name=\"train\", source=\"/opt/ml/processing/train\"),\n", + " ProcessingOutput(output_name=\"validation\", source=\"/opt/ml/processing/validation\"),\n", + " ProcessingOutput(output_name=\"test\", source=\"/opt/ml/processing/test\"),\n", + " ],\n", + " code=\"code/preprocessing.py\",\n", + ")\n", + "\n", + "step_process = ProcessingStep(name=\"AbaloneProcess\", step_args=processor_args)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Processing Step for Feature Engineering](img/pipeline-2.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Training Step to Train a Model\n", + "\n", + "In this section, use Amazon SageMaker's [XGBoost Algorithm](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html) to train on this dataset. Configure an Estimator for the XGBoost algorithm and the input dataset. A typical training script loads data from the input channels, configures training with hyperparameters, trains a model, and saves a model to `model_dir` so that it can be hosted later.\n", + "\n", + "The model path where the models from training are saved is also specified.\n", + "\n", + "Note the `instance_type` parameter may be used in multiple places in the pipeline. In this case, the `instance_type` is passed into the estimator." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.estimator import Estimator\n", + "from sagemaker.inputs import TrainingInput\n", + "\n", + "model_path = f\"s3://{default_bucket}/AbaloneTrain\"\n", + "image_uri = sagemaker.image_uris.retrieve(\n", + " framework=\"xgboost\",\n", + " region=region,\n", + " version=\"1.0-1\",\n", + " py_version=\"py3\",\n", + " instance_type=\"ml.m4.xlarge\",\n", + ")\n", + "xgb_train = Estimator(\n", + " image_uri=image_uri,\n", + " instance_type=instance_type,\n", + " instance_count=1,\n", + " output_path=model_path,\n", + " role=role,\n", + " sagemaker_session=pipeline_session,\n", + ")\n", + "xgb_train.set_hyperparameters(\n", + " objective=\"reg:linear\",\n", + " num_round=50,\n", + " max_depth=5,\n", + " eta=0.2,\n", + " gamma=4,\n", + " min_child_weight=6,\n", + " subsample=0.7,\n", + ")\n", + "\n", + "train_args = xgb_train.fit(\n", + " inputs={\n", + " \"train\": TrainingInput(\n", + " s3_data=step_process.properties.ProcessingOutputConfig.Outputs[\"train\"].S3Output.S3Uri,\n", + " content_type=\"text/csv\",\n", + " ),\n", + " \"validation\": TrainingInput(\n", + " s3_data=step_process.properties.ProcessingOutputConfig.Outputs[\n", + " \"validation\"\n", + " ].S3Output.S3Uri,\n", + " content_type=\"text/csv\",\n", + " ),\n", + " }\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Finally, we use the output of the estimator's `.fit()` method as arguments to the `TrainingStep`. By passing the `pipeline_session` to the `sagemaker_session`, calling `.fit()` does not launch the training job; instead, it returns the arguments needed to run the job as a step in the pipeline.\n", + "\n", + "Pass in the `S3Uri` of the `\"train\"` and `\"validation\"` output channels to the `.fit()` method. 
Also, use the `\"test\"` output channel for model evaluation in the pipeline. The `properties` attribute of a Pipeline step matches the object model of the corresponding response of a describe call. These properties can be referenced as placeholder values and are resolved at runtime. For example, the `ProcessingStep` `properties` attribute matches the object model of the [DescribeProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeProcessingJob.html) response object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.inputs import TrainingInput\n", + "from sagemaker.workflow.steps import TrainingStep\n", + "\n", + "\n", + "step_train = TrainingStep(\n", + " name=\"AbaloneTrain\",\n", + " step_args=train_args,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Training Step to Train a Model](img/pipeline-3.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Model Evaluation Step to Evaluate the Trained Model\n", + "\n", + "First, develop an evaluation script to be run in a Processing step that performs the model evaluation.\n", + "\n", + "After pipeline execution, you can examine the resulting `evaluation.json` for analysis.\n", + "\n", + "The evaluation script uses `xgboost` to do the following:\n", + "\n", + "* Load the model.\n", + "* Read the test data.\n", + "* Issue predictions against the test data.\n", + "* Compute regression metrics, including the mean squared error (MSE) and the standard deviation of the prediction error.\n", + "* Save the evaluation report to the evaluation directory." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "%%writefile code/evaluation.py\n", + "import json\n", + "import pathlib\n", + "import pickle\n", + "import tarfile\n", + "\n", + "import joblib\n", + "import numpy as np\n", + "import pandas as pd\n", + "import xgboost\n", + "\n", + "from sklearn.metrics import mean_squared_error\n", + "\n", + "\n", + "if __name__ == \"__main__\":\n", + " model_path = f\"/opt/ml/processing/model/model.tar.gz\"\n", + " with tarfile.open(model_path) as tar:\n", + " tar.extractall(path=\".\")\n", + "\n", + " model = pickle.load(open(\"xgboost-model\", \"rb\"))\n", + "\n", + " test_path = \"/opt/ml/processing/test/test.csv\"\n", + " df = pd.read_csv(test_path, header=None)\n", + "\n", + " y_test = df.iloc[:, 0].to_numpy()\n", + " df.drop(df.columns[0], axis=1, inplace=True)\n", + "\n", + " X_test = xgboost.DMatrix(df.values)\n", + "\n", + " predictions = model.predict(X_test)\n", + "\n", + " mse = mean_squared_error(y_test, predictions)\n", + " std = np.std(y_test - predictions)\n", + " report_dict = {\n", + " \"regression_metrics\": {\n", + " \"mse\": {\"value\": mse, \"standard_deviation\": std},\n", + " },\n", + " }\n", + "\n", + " output_dir = \"/opt/ml/processing/evaluation\"\n", + " pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)\n", + "\n", + " evaluation_path = f\"{output_dir}/evaluation.json\"\n", + " with open(evaluation_path, \"w\") as f:\n", + " f.write(json.dumps(report_dict))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Next, create an instance of a `ScriptProcessor` processor and use it in the `ProcessingStep`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.processing import ScriptProcessor\n", + "\n", + "\n", + "script_eval = ScriptProcessor(\n", + " image_uri=image_uri,\n", + " command=[\"python3\"],\n", + " instance_type=\"ml.m4.xlarge\",\n", + " instance_count=1,\n", + " base_job_name=\"script-abalone-eval\",\n", + " role=role,\n", + " sagemaker_session=pipeline_session,\n", + ")\n", + "\n", + "eval_args = script_eval.run(\n", + " inputs=[\n", + " ProcessingInput(\n", + " source=step_train.properties.ModelArtifacts.S3ModelArtifacts,\n", + " destination=\"/opt/ml/processing/model\",\n", + " ),\n", + " ProcessingInput(\n", + " source=step_process.properties.ProcessingOutputConfig.Outputs[\"test\"].S3Output.S3Uri,\n", + " destination=\"/opt/ml/processing/test\",\n", + " ),\n", + " ],\n", + " outputs=[\n", + " ProcessingOutput(output_name=\"evaluation\", source=\"/opt/ml/processing/evaluation\"),\n", + " ],\n", + " code=\"code/evaluation.py\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Use the processor's arguments returned by `.run()` to construct a `ProcessingStep`, along with the input and output channels and the code that will be executed when the step runs.\n", + "\n", + "Specifically, the `S3ModelArtifacts` from the `step_train` `properties` and the `S3Uri` of the `\"test\"` output channel of the `step_process` `properties` are passed as inputs. The `TrainingStep` and `ProcessingStep` `properties` attributes match the object model of the [DescribeTrainingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTrainingJob.html) and [DescribeProcessingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeProcessingJob.html) response objects, respectively." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.workflow.properties import PropertyFile\n", + "\n", + "\n", + "evaluation_report = PropertyFile(\n", + " name=\"EvaluationReport\", output_name=\"evaluation\", path=\"evaluation.json\"\n", + ")\n", + "step_eval = ProcessingStep(\n", + " name=\"AbaloneEval\",\n", + " step_args=eval_args,\n", + " property_files=[evaluation_report],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Model Evaluation Step to Evaluate the Trained Model](img/pipeline-4.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Create Model Step to Create a Model\n", + "\n", + "In order to perform batch transformation using the example model, create a SageMaker model.\n", + "\n", + "Specifically, pass in the `S3ModelArtifacts` from the `TrainingStep`, `step_train` properties. The `TrainingStep` `properties` attribute matches the object model of the [DescribeTrainingJob](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeTrainingJob.html) response object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.model import Model\n", + "\n", + "model = Model(\n", + " image_uri=image_uri,\n", + " model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,\n", + " sagemaker_session=pipeline_session,\n", + " role=role,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Define the `ModelStep` by providing the return values from `model.create()` as the step arguments." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.inputs import CreateModelInput\n", + "from sagemaker.workflow.model_step import ModelStep\n", + "\n", + "step_create_model = ModelStep(\n", + " name=\"AbaloneCreateModel\",\n", + " step_args=model.create(instance_type=\"ml.m4.xlarge\", accelerator_type=\"ml.eia1.medium\"),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Transform Step to Perform Batch Transformation\n", + "\n", + "Now that a model instance is defined, create a `Transformer` instance with the appropriate model type, compute instance type, and desired output S3 URI.\n", + "\n", + "Specifically, pass in the `ModelName` from the `CreateModelStep`, `step_create_model` properties. The `CreateModelStep` `properties` attribute matches the object model of the [DescribeModel](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DescribeModel.html) response object." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.transformer import Transformer\n", + "\n", + "\n", + "transformer = Transformer(\n", + " model_name=step_create_model.properties.ModelName,\n", + " instance_type=\"ml.m4.xlarge\",\n", + " instance_count=1,\n", + " output_path=f\"s3://{default_bucket}/AbaloneTransform\",\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Pass in the transformer instance and the `TransformInput` with the `batch_data` pipeline parameter defined earlier." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.inputs import TransformInput\n", + "from sagemaker.workflow.steps import TransformStep\n", + "\n", + "\n", + "step_transform = TransformStep(\n", + " name=\"AbaloneTransform\", transformer=transformer, inputs=TransformInput(data=batch_data)\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Register Model Step to Create a Model Package\n", + "\n", + "A model package is an abstraction of reusable model artifacts that packages all ingredients required for inference. Primarily, it consists of an inference specification that defines the inference image to use along with an optional model weights location.\n", + "\n", + "A model package group is a collection of model packages. A model package group can be created for a specific ML business problem, and new versions of the model packages can be added to it. Typically, customers are expected to create a ModelPackageGroup for a SageMaker pipeline so that model package versions can be added to the group for every SageMaker Pipeline run.\n", + "\n", + "To register a model in the Model Registry, we take the model created in the previous steps\n", + "```\n", + "model = Model(\n", + " image_uri=image_uri,\n", + " model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,\n", + " sagemaker_session=pipeline_session,\n", + " role=role,\n", + ")\n", + "```\n", + "and call the `.register()` function on it while passing all the parameters needed for registering the model.\n", + "\n", + "We take the output of the `.register()` call and pass it to the `ModelStep` as step arguments." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.model_metrics import MetricsSource, ModelMetrics\n", + "\n", + "model_metrics = ModelMetrics(\n", + " model_statistics=MetricsSource(\n", + " s3_uri=\"{}/evaluation.json\".format(\n", + " step_eval.arguments[\"ProcessingOutputConfig\"][\"Outputs\"][0][\"S3Output\"][\"S3Uri\"]\n", + " ),\n", + " content_type=\"application/json\",\n", + " )\n", + ")\n", + "\n", + "register_args = model.register(\n", + " content_types=[\"text/csv\"],\n", + " response_types=[\"text/csv\"],\n", + " inference_instances=[\"ml.m4.xlarge\"],\n", + " transform_instances=[\"ml.m4.xlarge\"],\n", + " model_package_group_name=model_package_group_name,\n", + " approval_status=model_approval_status,\n", + " model_metrics=model_metrics,\n", + ")\n", + "step_register = ModelStep(name=\"AbaloneRegisterModel\", step_args=register_args)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Create Model Step and Batch Transform to Process Data in Batch at Scale](img/pipeline-5.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Fail Step to Terminate the Pipeline Execution and Mark it as Failed\n", + "\n", + "This section walks you through the following steps:\n", + "\n", + "* Define a `FailStep` with a customized error message that indicates the cause of the execution failure.\n", + "* Construct the `FailStep` error message with a `Join` function, which appends a static text string to the dynamic `mse_threshold` parameter to build a more informative error message." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.workflow.fail_step import FailStep\n", + "from sagemaker.workflow.functions import Join\n", + "\n", + "step_fail = FailStep(\n", + " name=\"AbaloneMSEFail\",\n", + " error_message=Join(on=\" \", values=[\"Execution failed due to MSE >\", mse_threshold]),\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Fail Step to Terminate the Execution in Failed State](img/pipeline-8.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Condition Step to Check Accuracy and Conditionally Create a Model and Run a Batch Transformation and Register a Model in the Model Registry, Or Terminate the Execution in Failed State\n", + "\n", + "In this step, the model is registered only if the mean squared error (MSE) of the model, as determined by the evaluation step `step_eval`, does not exceed a specified threshold. Otherwise, the pipeline execution fails and terminates. A `ConditionStep` enables pipelines to support conditional execution in the pipeline DAG based on the conditions of the step properties.\n", + "\n", + "In the following section, you:\n", + "\n", + "* Define a `ConditionLessThanOrEqualTo` on the MSE value found in the output of the evaluation step, `step_eval`.\n", + "* Use the condition in the list of conditions in a `ConditionStep`.\n", + "* Pass the `CreateModelStep` and `TransformStep` steps, and the `RegisterModel` step collection into the `if_steps` of the `ConditionStep`, which are only executed if the condition evaluates to `True`.\n", + "* Pass the `FailStep` step into the `else_steps` of the `ConditionStep`, which is only executed if the condition evaluates to `False`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo\n", + "from sagemaker.workflow.condition_step import ConditionStep\n", + "from sagemaker.workflow.functions import JsonGet\n", + "\n", + "\n", + "cond_lte = ConditionLessThanOrEqualTo(\n", + " left=JsonGet(\n", + " step_name=step_eval.name,\n", + " property_file=evaluation_report,\n", + " json_path=\"regression_metrics.mse.value\",\n", + " ),\n", + " right=mse_threshold,\n", + ")\n", + "\n", + "step_cond = ConditionStep(\n", + " name=\"AbaloneMSECond\",\n", + " conditions=[cond_lte],\n", + " if_steps=[step_register, step_create_model, step_transform],\n", + " else_steps=[step_fail],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Condition Step to Check Accuracy and Conditionally Execute Steps](img/pipeline-6.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Define a Pipeline of Parameters, Steps, and Conditions\n", + "\n", + "In this section, combine the steps into a Pipeline so it can be executed.\n", + "\n", + "A pipeline requires a `name`, `parameters`, and `steps`. Names must be unique within an `(account, region)` pair.\n", + "\n", + "Note:\n", + "\n", + "* All the parameters used in the definitions must be present.\n", + "* Steps passed into the pipeline do not have to be listed in the order of execution. The SageMaker Pipelines service resolves the data dependency DAG to determine the execution order.\n", + "* Steps must be unique across the pipeline step list and all condition step if/else lists." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from sagemaker.workflow.pipeline import Pipeline\n", + "\n", + "\n", + "pipeline_name = f\"AbalonePipeline\"\n", + "pipeline = Pipeline(\n", + " name=pipeline_name,\n", + " parameters=[\n", + " processing_instance_count,\n", + " instance_type,\n", + " model_approval_status,\n", + " input_data,\n", + " batch_data,\n", + " mse_threshold,\n", + " ],\n", + " steps=[step_process, step_train, step_eval, step_cond],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "![Define a Pipeline of Parameters, Steps, and Conditions](img/pipeline-7.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### (Optional) Examining the pipeline definition\n", + "\n", + "The JSON of the pipeline definition can be examined to confirm the pipeline is well-defined and the parameters and step properties resolve correctly." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "\n", + "definition = json.loads(pipeline.definition())\n", + "definition" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Submit the pipeline to SageMaker and start execution\n", + "\n", + "Submit the pipeline definition to the Pipeline service. The Pipeline service uses the role that is passed in to create all the jobs defined in the steps." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "pipeline.upsert(role_arn=role)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Start the pipeline and accept all the default parameters." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution = pipeline.start()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Pipeline Operations: Examining and Waiting for Pipeline Execution\n", + "\n", + "Describe the pipeline execution." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.describe()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "Wait for the execution to complete." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.wait()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "List the steps in the execution. These are the steps in the pipeline that have been resolved by the step executor service." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.list_steps()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Examining the Evaluation\n", + "\n", + "Examine the resulting model evaluation after the pipeline completes. 
Download the resulting `evaluation.json` file from S3 and print the report." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "from pprint import pprint\n", + "\n", + "\n", + "evaluation_json = sagemaker.s3.S3Downloader.read_file(\n", + " \"{}/evaluation.json\".format(\n", + " step_eval.arguments[\"ProcessingOutputConfig\"][\"Outputs\"][0][\"S3Output\"][\"S3Uri\"]\n", + " )\n", + ")\n", + "pprint(json.loads(evaluation_json))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Lineage\n", + "\n", + "Review the lineage of the artifacts generated by the pipeline." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import time\n", + "from sagemaker.lineage.visualizer import LineageTableVisualizer\n", + "\n", + "\n", + "viz = LineageTableVisualizer(sagemaker.session.Session())\n", + "for execution_step in reversed(execution.list_steps()):\n", + " print(execution_step)\n", + " display(viz.show(pipeline_execution_step=execution_step))\n", + " time.sleep(5)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "### Parametrized Executions\n", + "\n", + "You can run additional executions of the pipeline and specify different pipeline parameters. The `parameters` argument is a dictionary that maps parameter names to values, which are used to override the defaults.\n", + "\n", + "Based on the performance of the model, you might want to kick off another pipeline execution on a compute-optimized instance type and set the model approval status to \"Approved\" automatically. 
This means that the model package version generated by the `RegisterModel` step is automatically ready for deployment through CI/CD pipelines, such as with SageMaker Projects." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution = pipeline.start(\n", + " parameters=dict(\n", + " ModelApprovalStatus=\"Approved\",\n", + " )\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.wait()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.list_steps()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "You might also want to lower the MSE threshold to raise the bar for the accuracy of the registered model. In this case, you can override the MSE threshold as follows:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution = pipeline.start(parameters=dict(MseThreshold=3.0))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "If the MSE threshold is not satisfied, the pipeline execution enters the `FailStep` and is marked as failed." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "try:\n", + " execution.wait()\n", + "except Exception as error:\n", + " print(error)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "execution.list_steps()" + ] + } + ], + "metadata": { + "instance_type": "ml.t3.medium", + "kernelspec": { + "display_name": "Python 3 (Data Science)", + "language": "python", + "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:eu-west-1:470317259841:image/datascience-1.0" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.10" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb new file mode 100644 index 00000000..7196902d --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/notebooks/sm_pipelines_runbook.ipynb @@ -0,0 +1,538 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import os\n", + "\n", + "import boto3\n", + "import logging\n", + "import sagemaker\n", + "import sagemaker.session\n", + "\n", + "from sagemaker.estimator import Estimator\n", + "from sagemaker.inputs import TrainingInput\n", + "from sagemaker.model_metrics import (\n", + " MetricsSource,\n", + " ModelMetrics,\n", + ")\n", + 
"from sagemaker.processing import (\n", + " ProcessingInput,\n", + " ProcessingOutput,\n", + " ScriptProcessor,\n", + ")\n", + "from sagemaker.sklearn.processing import SKLearnProcessor\n", + "from sagemaker.workflow.conditions import ConditionLessThanOrEqualTo\n", + "from sagemaker.workflow.condition_step import (\n", + " ConditionStep,\n", + ")\n", + "from sagemaker.workflow.functions import (\n", + " JsonGet,\n", + ")\n", + "from sagemaker.workflow.parameters import (\n", + " ParameterInteger,\n", + " ParameterString,\n", + ")\n", + "from sagemaker.workflow.pipeline import Pipeline\n", + "from sagemaker.workflow.properties import PropertyFile\n", + "from sagemaker.workflow.steps import (\n", + " ProcessingStep,\n", + " TrainingStep,\n", + ")\n", + "from sagemaker.workflow.step_collections import RegisterModel\n", + "\n", + "from botocore.exceptions import ClientError" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "logger = logging.getLogger(__name__)\n", + "\n", + "\"\"\"Environment Variables\"\"\"\n", + "proj_dir = \"TO_BE_DEFINED\"\n", + "region = \"TO_BE_DEFINED\"\n", + "model_artefact_bucket = \"TO_BE_DEFINED\"\n", + "role = \"TO_BE_DEFINED\"\n", + "project_name = \"TO_BE_DEFINED\"\n", + "stage = \"test\"\n", + "# Note: no trailing commas here; a trailing comma would turn these strings into tuples\n", + "model_package_group_name = \"AbalonePackageGroup\"\n", + "pipeline_name = \"AbalonePipeline\"\n", + "base_job_prefix = \"Abalone\"\n", + "project_id = \"SageMakerProjectId\"\n", + "processing_image_uri = None\n", + "training_image_uri = None\n", + "inference_image_uri = None" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "def get_session(region, default_bucket):\n", + " \"\"\"Gets the sagemaker session based on the region.\n", + "\n", + " Args:\n", + " region: the aws region to start the session\n", + " default_bucket: the bucket to use for storing the 
artifacts\n", + "\n", + " Returns:\n", + " `sagemaker.session.Session instance\n", + " \"\"\"\n", + "\n", + " boto_session = boto3.Session(region_name=region)\n", + "\n", + " sagemaker_client = boto_session.client(\"sagemaker\")\n", + " runtime_client = boto_session.client(\"sagemaker-runtime\")\n", + " return sagemaker.session.Session(\n", + " boto_session=boto_session,\n", + " sagemaker_client=sagemaker_client,\n", + " sagemaker_runtime_client=runtime_client,\n", + " default_bucket=default_bucket,\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "sagemaker_session = get_session(region, model_artefact_bucket)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "## Feature Engineering\n", + "This section describes the different steps involved in feature engineering which includes loading and transforming different data sources to build the features needed for the ML Use Case" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "processing_instance_count = ParameterInteger(name=\"ProcessingInstanceCount\", default_value=1)\n", + "processing_instance_type = ParameterString(name=\"ProcessingInstanceType\", default_value=\"ml.m5.xlarge\")\n", + "training_instance_type = ParameterString(name=\"TrainingInstanceType\", default_value=\"ml.m5.xlarge\")\n", + "inference_instance_type = ParameterString(name=\"InferenceInstanceType\", default_value=\"ml.m5.xlarge\")\n", + "model_approval_status = ParameterString(name=\"ModelApprovalStatus\", default_value=\"PendingManualApproval\")\n", + "input_data = ParameterString(\n", + " name=\"InputDataUrl\",\n", + " default_value=f\"s3://sagemaker-servicecatalog-seedcode-{region}/dataset/abalone-dataset.csv\",\n", + ")\n", + "processing_image_name = 
\"sagemaker-{0}-processingimagebuild\".format(project_id)\n", + "training_image_name = \"sagemaker-{0}-trainingimagebuild\".format(project_id)\n", + "inference_image_name = \"sagemaker-{0}-inferenceimagebuild\".format(project_id)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# processing step for feature engineering\n", + "try:\n", + " processing_image_uri = sagemaker_session.sagemaker_client.describe_image_version(\n", + " ImageName=processing_image_name\n", + " )[\"ContainerImage\"]\n", + "\n", + "except (sagemaker_session.sagemaker_client.exceptions.ResourceNotFound):\n", + " processing_image_uri = sagemaker.image_uris.retrieve(\n", + " framework=\"xgboost\",\n", + " region=region,\n", + " version=\"1.0-1\",\n", + " py_version=\"py3\",\n", + " instance_type=processing_instance_type,\n", + " )" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# Define Script Processor\n", + "script_processor = ScriptProcessor(\n", + " image_uri=processing_image_uri,\n", + " instance_type=processing_instance_type,\n", + " instance_count=processing_instance_count,\n", + " base_job_name=f\"{base_job_prefix}/sklearn-abalone-preprocess\",\n", + " command=[\"python3\"],\n", + " sagemaker_session=sagemaker_session,\n", + " role=role,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# Define ProcessingStep\n", + "step_process = ProcessingStep(\n", + " name=\"PreprocessAbaloneData\",\n", + " processor=script_processor,\n", + " outputs=[\n", + " ProcessingOutput(output_name=\"train\", source=\"/opt/ml/processing/train\"),\n", + " ProcessingOutput(output_name=\"validation\", source=\"/opt/ml/processing/validation\"),\n", + " ProcessingOutput(output_name=\"test\", 
source=\"/opt/ml/processing/test\"),\n", + " ],\n", + " code=\"source_scripts/preprocessing/prepare_abalone_data/main.py\",  # TODO: resolve this path relative to the step_source directory\n", + " job_arguments=[\"--input-data\", input_data],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Training an XGBoost model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# training step for generating model artifacts\n", + "model_path = f\"s3://{sagemaker_session.default_bucket()}/{base_job_prefix}/AbaloneTrain\"\n", + "\n", + "try:\n", + " training_image_uri = sagemaker_session.sagemaker_client.describe_image_version(ImageName=training_image_name)[\n", + " \"ContainerImage\"\n", + " ]\n", + "except (sagemaker_session.sagemaker_client.exceptions.ResourceNotFound):\n", + " training_image_uri = sagemaker.image_uris.retrieve(\n", + " framework=\"xgboost\",\n", + " region=region,\n", + " version=\"1.0-1\",\n", + " py_version=\"py3\",\n", + " instance_type=training_instance_type,\n", + " )\n", + "\n", + "xgb_train = Estimator(\n", + " image_uri=training_image_uri,\n", + " instance_type=training_instance_type,\n", + " instance_count=1,\n", + " output_path=model_path,\n", + " base_job_name=f\"{base_job_prefix}/abalone-train\",\n", + " sagemaker_session=sagemaker_session,\n", + " role=role,\n", + ")\n", + "xgb_train.set_hyperparameters(\n", + " objective=\"reg:squarederror\",  # current name for the deprecated \"reg:linear\" alias\n", + " num_round=50,\n", + " max_depth=5,\n", + " eta=0.2,\n", + " gamma=4,\n", + " min_child_weight=6,\n", + " subsample=0.7,\n", + " silent=0,\n", + ")\n", + "step_train = TrainingStep(\n", + " name=\"TrainAbaloneModel\",\n", + " estimator=xgb_train,\n", + " inputs={\n", + " \"train\": TrainingInput(\n", + " s3_data=step_process.properties.ProcessingOutputConfig.Outputs[\"train\"].S3Output.S3Uri,\n", + " 
content_type=\"text/csv\",\n", + " ),\n", + " \"validation\": TrainingInput(\n", + " s3_data=step_process.properties.ProcessingOutputConfig.Outputs[\"validation\"].S3Output.S3Uri,\n", + " content_type=\"text/csv\",\n", + " ),\n", + " },\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Evaluate the Model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# processing step for evaluation\n", + "script_eval = ScriptProcessor(\n", + " image_uri=training_image_uri,\n", + " command=[\"python3\"],\n", + " instance_type=processing_instance_type,\n", + " instance_count=1,\n", + " base_job_name=f\"{base_job_prefix}/script-abalone-eval\",\n", + " sagemaker_session=sagemaker_session,\n", + " role=role,\n", + ")\n", + "evaluation_report = PropertyFile(\n", + " name=\"AbaloneEvaluationReport\",\n", + " output_name=\"evaluation\",\n", + " path=\"evaluation.json\",\n", + ")\n", + "step_eval = ProcessingStep(\n", + " name=\"EvaluateAbaloneModel\",\n", + " processor=script_eval,\n", + " inputs=[\n", + " ProcessingInput(\n", + " source=step_train.properties.ModelArtifacts.S3ModelArtifacts,\n", + " destination=\"/opt/ml/processing/model\",\n", + " ),\n", + " ProcessingInput(\n", + " source=step_process.properties.ProcessingOutputConfig.Outputs[\"test\"].S3Output.S3Uri,\n", + " destination=\"/opt/ml/processing/test\",\n", + " ),\n", + " ],\n", + " outputs=[\n", + " ProcessingOutput(output_name=\"evaluation\", source=\"/opt/ml/processing/evaluation\"),\n", + " ],\n", + " code=\"source_scripts/evaluate/evaluate_xgboost/main.py\",\n", + " property_files=[evaluation_report],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Conditional step to push model to SageMaker Model Registry" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# register model step that will be conditionally executed\n", + "model_metrics = ModelMetrics(\n", + " model_statistics=MetricsSource(\n", + " s3_uri=\"{}/evaluation.json\".format(\n", + " step_eval.arguments[\"ProcessingOutputConfig\"][\"Outputs\"][0][\"S3Output\"][\"S3Uri\"]\n", + " ),\n", + " content_type=\"application/json\",\n", + " )\n", + ")\n", + "\n", + "try:\n", + " inference_image_uri = sagemaker_session.sagemaker_client.describe_image_version(ImageName=inference_image_name)[\n", + " \"ContainerImage\"\n", + " ]\n", + "except (sagemaker_session.sagemaker_client.exceptions.ResourceNotFound):\n", + " inference_image_uri = sagemaker.image_uris.retrieve(\n", + " framework=\"xgboost\",\n", + " region=region,\n", + " version=\"1.0-1\",\n", + " py_version=\"py3\",\n", + " instance_type=inference_instance_type,\n", + " )\n", + "step_register = RegisterModel(\n", + " name=\"RegisterAbaloneModel\",\n", + " estimator=xgb_train,\n", + " image_uri=inference_image_uri,\n", + " model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,\n", + " content_types=[\"text/csv\"],\n", + " response_types=[\"text/csv\"],\n", + " inference_instances=[\"ml.t2.medium\", \"ml.m5.large\"],\n", + " transform_instances=[\"ml.m5.large\"],\n", + " model_package_group_name=model_package_group_name,\n", + " approval_status=model_approval_status,\n", + " model_metrics=model_metrics,\n", + ")\n", + "\n", + "# condition step for evaluating model quality and branching execution\n", + "cond_lte = ConditionLessThanOrEqualTo(\n", + " left=JsonGet(\n", + " step_name=step_eval.name, property_file=evaluation_report, json_path=\"regression_metrics.mse.value\"\n", + " ),\n", + " right=6.0,\n", + ")\n", + "step_cond = ConditionStep(\n", + " name=\"CheckMSEAbaloneEvaluation\",\n", + " conditions=[cond_lte],\n", + " if_steps=[step_register],\n", + " else_steps=[],\n", + ")" + ] + }, 
+ { + "cell_type": "markdown", + "metadata": { + "pycharm": { + "name": "#%% md\n" + } + }, + "source": [ + "# Create and run the Pipeline" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "# pipeline instance\n", + "pipeline = Pipeline(\n", + " name=pipeline_name,\n", + " parameters=[\n", + " processing_instance_type,\n", + " processing_instance_count,\n", + " training_instance_type,\n", + " model_approval_status,\n", + " input_data,\n", + " ],\n", + " steps=[step_process, step_train, step_eval, step_cond],\n", + " sagemaker_session=sagemaker_session,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "\n", + "definition = json.loads(pipeline.definition())\n", + "definition" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "pipeline.upsert(role_arn=role, description=f'{stage} pipelines for {project_name}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "pycharm": { + "name": "#%%\n" + } + }, + "outputs": [], + "source": [ + "pipeline.start()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "conda_python3", + "language": "python", + "name": "conda_python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.13" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements-dev.txt 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements-dev.txt new file mode 100644 index 00000000..92709451 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements-dev.txt @@ -0,0 +1 @@ +pytest==6.2.5 diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements.txt new file mode 100644 index 00000000..db9da2db --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/requirements.txt @@ -0,0 +1,2 @@ +sagemaker +boto3 diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.cfg b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.cfg new file mode 100644 index 00000000..6f878705 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.cfg @@ -0,0 +1,14 @@ +[tool:pytest] +addopts = + -vv +testpaths = tests + +[aliases] +test=pytest + +[metadata] +description-file = README.md +license_file = LICENSE + +[wheel] +universal = 1 diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.py new file mode 100644 index 00000000..b10bb142 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/setup.py @@ -0,0 +1,77 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. 
+# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +import os +import setuptools + + +about = {} +here = os.path.abspath(os.path.dirname(__file__)) +with open(os.path.join(here, "ml_pipelines", "__version__.py")) as f: + exec(f.read(), about) + + +with open("README.md", "r") as f: + readme = f.read() + + +required_packages = ["sagemaker"] +extras = { + "test": [ + "black", + "coverage", + "flake8", + "mock", + "pydocstyle", + "pytest", + "pytest-cov", + "sagemaker", + "tox", + ] +} +setuptools.setup( + name=about["__title__"], + description=about["__description__"], + version=about["__version__"], + author=about["__author__"], + author_email=about["__author_email__"], + long_description=readme, + long_description_content_type="text/markdown", + url=about["__url__"], + license=about["__license__"], + packages=setuptools.find_packages(), + include_package_data=True, + python_requires=">=3.6", + install_requires=required_packages, + extras_require=extras, + entry_points={ + "console_scripts": [ + "get-pipeline-definition=ml_pipelines.get_pipeline_definition:main", + 
"run-pipeline=ml_pipelines.run_pipeline:main", + ] + }, + classifiers=[ + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Natural Language :: English", + "Programming Language :: Python", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.6", + "Programming Language :: Python :: 3.7", + "Programming Language :: Python :: 3.8", + ], +) diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/Dockerfile b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/Dockerfile new file mode 100644 index 00000000..7057bb4f --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/Dockerfile @@ -0,0 +1,40 @@ +FROM public.ecr.aws/docker/library/python:3.7-buster as base + +RUN apt-get -y update && apt-get install -y \ + nginx \ + ca-certificates \ + policycoreutils \ + && rm -rf /var/lib/apt/lists/* + +ENV PATH="/usr/sbin/:${PATH}" + +COPY helpers/requirements.txt /requirements.txt + +RUN pip install --upgrade pip && pip install --no-cache -r /requirements.txt && \ + rm /requirements.txt +# Set up the program in the image +COPY helpers /opt/program + + +### start of TRAINING container +FROM base as xgboost +COPY training/xgboost/requirements.txt /requirements.txt +RUN pip install --no-cache -r /requirements.txt && \ + rm /requirements.txt + +# sm vars +ENV SAGEMAKER_MODEL_SERVER_TIMEOUT="300" +ENV MODEL_SERVER_TIMEOUT="300" +ENV PYTHONUNBUFFERED=TRUE +ENV PYTHONDONTWRITEBYTECODE=TRUE +ENV PATH="/opt/program:${PATH}" + +# env vars + +# Set up the program in the image +COPY training/xgboost /opt/program + +# set permissions of entrypoint +RUN chmod +x /opt/program/__main__.py + +WORKDIR /opt/program diff --git 
a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py new file mode 100644 index 00000000..7027811e --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/main.py @@ -0,0 +1,72 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +"""Evaluation script for measuring mean squared error.""" +import json +import logging +import pathlib +import pickle +import tarfile + +import numpy as np +import pandas as pd +import xgboost + +from sklearn.metrics import mean_squared_error + +logger = logging.getLogger() +logger.setLevel(logging.INFO) +logger.addHandler(logging.StreamHandler()) + + +if __name__ == "__main__": + logger.debug("Starting evaluation.") + model_path = "/opt/ml/processing/model/model.tar.gz" + with tarfile.open(model_path) as tar: + tar.extractall(path=".") + + logger.debug("Loading xgboost model.") + model = pickle.load(open("xgboost-model", "rb")) + + logger.debug("Reading test data.") + test_path = "/opt/ml/processing/test/test.csv" + df = pd.read_csv(test_path, header=None) + + logger.debug("Splitting test data into label and features.") + y_test = df.iloc[:, 0].to_numpy() + df.drop(df.columns[0], axis=1, inplace=True) + X_test = xgboost.DMatrix(df.values) + + logger.info("Performing predictions against test data.") + predictions = model.predict(X_test) + + logger.debug("Calculating mean squared error.") + mse = mean_squared_error(y_test, predictions) + std = np.std(y_test - predictions) + report_dict = { + "regression_metrics": { + "mse": {"value": mse, "standard_deviation": std}, + }, + } + + output_dir = "/opt/ml/processing/evaluation" + pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True) + + logger.info("Writing out evaluation report with mse: %f", mse) + evaluation_path = f"{output_dir}/evaluation.json" + with open(evaluation_path, "w") as f: + f.write(json.dumps(report_dict)) diff --git 
a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/evaluate/evaluate_xgboost/requirements.txt new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/logger.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/logger.py new file mode 100644 index 00000000..bc27f7d9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/logger.py @@ -0,0 +1,16 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/requirements.txt new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/s3_helper.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/s3_helper.py new file mode 100644 index 00000000..bc27f7d9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/s3_helper.py @@ -0,0 +1,16 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/test/test_a.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/test/test_a.py new file mode 100644 index 00000000..bc27f7d9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/helpers/test/test_a.py @@ -0,0 +1,16 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py new file mode 100644 index 00000000..063a1d81 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/main.py @@ -0,0 +1,132 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
+"""Feature engineers the abalone dataset.""" +import argparse +import logging +import os +import pathlib +import requests +import tempfile + +import boto3 +import numpy as np +import pandas as pd + +from sklearn.compose import ColumnTransformer +from sklearn.impute import SimpleImputer +from sklearn.pipeline import Pipeline +from sklearn.preprocessing import StandardScaler, OneHotEncoder + +logger = logging.getLogger() +logger.setLevel(logging.INFO) +logger.addHandler(logging.StreamHandler()) + + +# Since we get a headerless CSV file we specify the column names here. +feature_columns_names = [ + "sex", + "length", + "diameter", + "height", + "whole_weight", + "shucked_weight", + "viscera_weight", + "shell_weight", +] +label_column = "rings" + +feature_columns_dtype = { + "sex": str, + "length": np.float64, + "diameter": np.float64, + "height": np.float64, + "whole_weight": np.float64, + "shucked_weight": np.float64, + "viscera_weight": np.float64, + "shell_weight": np.float64, +} +label_column_dtype = {"rings": np.float64} + + +def merge_two_dicts(x, y): + """Merges two dicts, returning a new copy.""" + z = x.copy() + z.update(y) + return z + + +if __name__ == "__main__": + logger.debug("Starting preprocessing.") + parser = argparse.ArgumentParser() + parser.add_argument("--input-data", type=str, required=True) + args = parser.parse_args() + + base_dir = "/opt/ml/processing" + pathlib.Path(f"{base_dir}/data").mkdir(parents=True, exist_ok=True) + input_data = args.input_data + bucket = input_data.split("/")[2] + key = "/".join(input_data.split("/")[3:]) + + logger.info("Downloading data from bucket: %s, key: %s", bucket, key) + fn = f"{base_dir}/data/abalone-dataset.csv" + s3 = boto3.resource("s3") + s3.Bucket(bucket).download_file(key, fn) + + logger.debug("Reading downloaded data.") + df = pd.read_csv( + fn, + header=None, + names=feature_columns_names + [label_column], + dtype=merge_two_dicts(feature_columns_dtype, label_column_dtype), + ) + os.unlink(fn) + + 
logger.debug("Defining transformers.") + numeric_features = list(feature_columns_names) + numeric_features.remove("sex") + numeric_transformer = Pipeline(steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]) + + categorical_features = ["sex"] + categorical_transformer = Pipeline( + steps=[ + ("imputer", SimpleImputer(strategy="constant", fill_value="missing")), + ("onehot", OneHotEncoder(handle_unknown="ignore")), + ] + ) + + preprocess = ColumnTransformer( + transformers=[ + ("num", numeric_transformer, numeric_features), + ("cat", categorical_transformer, categorical_features), + ] + ) + + logger.info("Applying transforms.") + y = df.pop("rings") + X_pre = preprocess.fit_transform(df) + y_pre = y.to_numpy().reshape(len(y), 1) + + X = np.concatenate((y_pre, X_pre), axis=1) + + logger.info("Splitting %d rows of data into train, validation, test datasets.", len(X)) + np.random.shuffle(X) + train, validation, test = np.split(X, [int(0.7 * len(X)), int(0.85 * len(X))]) + + logger.info("Writing out datasets to %s.", base_dir) + pd.DataFrame(train).to_csv(f"{base_dir}/train/train.csv", header=False, index=False) + pd.DataFrame(validation).to_csv(f"{base_dir}/validation/validation.csv", header=False, index=False) + pd.DataFrame(test).to_csv(f"{base_dir}/test/test.csv", header=False, index=False) diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/preprocessing/prepare_abalone_data/requirements.txt new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/README.md 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/README.md new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/__main__.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/__main__.py new file mode 100644 index 00000000..bc27f7d9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/__main__.py @@ -0,0 +1,16 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/requirements.txt new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py new file mode 100644 index 00000000..bc27f7d9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/source_scripts/training/xgboost/test/test_a.py @@ -0,0 +1,16 @@ +# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. +# +# SPDX-License-Identifier: MIT-0 +# +# Permission is hereby granted, free of charge, to any person obtaining a copy of this +# software and associated documentation files (the "Software"), to deal in the Software +# without restriction, including without limitation the rights to use, copy, modify, +# merge, publish, distribute, sublicense, and/or sell copies of the Software, and to +# permit persons to whom the Software is furnished to do so. +# +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, +# INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A +# PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT +# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/.gitkeep b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/main.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/main.tf new file mode 100644 index 00000000..fee92b0d --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/main.tf @@ -0,0 +1,28 @@ +locals { + prefix = var.prefix + sm_project_id = var.sm_project_id + sm_project_name = var.sm_project_name + aws_account_id = var.aws_account_id + aws_region = var.aws_region + target_branch = var.default_branch + codecommit_id = var.codecommit_id + artifact_bucket_name = var.artifact_bucket_name + sm_pipeline_name = "modelbuild-pipeline" + model_package_group_name = "models" + pipeline_name = "modelbuild-pipeline" +} + +module "cicd_build_pipeline" { + source = ".//modules/cicd" + + prefix = local.prefix + sm_project_id = local.sm_project_id + sm_project_name = local.sm_project_name + sm_pipeline_name = local.sm_pipeline_name + model_package_group_name = local.model_package_group_name + codecommit_id = local.codecommit_id + target_branch = local.target_branch + artifact_bucket_name = local.artifact_bucket_name + pipeline_name = local.pipeline_name +} + diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/modules/cicd/codebuild.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/modules/cicd/codebuild.tf new file mode 100644 index 00000000..942c5b58 --- 
/dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/build_app/terraform/modules/cicd/codebuild.tf @@ -0,0 +1,118 @@ +data "aws_iam_policy_document" "codebuild_assume_policy" { + statement { + effect = "Allow" + actions = ["sts:AssumeRole"] + + principals { + type = "Service" + identifiers = ["codebuild.amazonaws.com"] + } + } +} + +resource "aws_iam_role" "codebuild_role" { + name = "${local.prefix}-${local.sm_project_id}-codebuild-modelbuild" + assume_role_policy = data.aws_iam_policy_document.codebuild_assume_policy.json +} + +# TODO: SCOPE THIS DOWN!!!!!!!!!!!!!!!!!!!!! +resource "aws_iam_role_policy_attachment" "power_user" { + role = aws_iam_role.codebuild_role.id + policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess" +} + +resource "aws_iam_policy" "pass_role_to_sm_pipelines" { + description = "pass_role_to_sm_pipelines for SM Pipelines for ${local.sm_project_id}" + + policy = < Open Git Repository In Terminal +- Run the following commands + +```bash +make bootstrap +# tap enter for the defaults +make plan +make apply +``` + +Your CICD pipeline should now be deployed. Push bootstrapping changes to the repo + +``` +git add -A +git commit -m "bootstrapping" +git push +``` + +Open CodePipeline and see the pipeline execute. + +## Automated Endpoint Deployment + +Whenever a new model has been trained and approved by the modelbuild project, the CICD pipeline for this project will +trigger and attempt to deploy a SageMaker Endpoint. 
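Before deploying, the pipeline's build step runs `get_last_approved_model.py`, which scans the model package group and picks the first package whose approval status is `Approved`. The core selection can be sketched in isolation as follows; the dicts mimic the shape of entries in the `ModelPackageSummaryList` returned by boto3's `list_model_packages`, and the sample ARNs and account ID are purely illustrative:

```python
# Minimal, dependency-free sketch of the "last approved model" lookup
# performed by the deploy bootstrap step. Sample data is hypothetical.

def last_approved_arn(summaries):
    """Return the ARN of the first package marked Approved, or None.

    Assuming the summaries are ordered newest-first (as the seed script
    relies on), the first Approved entry is the most recently registered
    approved model.
    """
    for pkg in summaries:
        if pkg["ModelApprovalStatus"] == "Approved":
            return pkg["ModelPackageArn"]
    return None

summaries = [
    {"ModelPackageVersion": 3, "ModelApprovalStatus": "PendingManualApproval",
     "ModelPackageArn": "arn:aws:sagemaker:eu-west-1:111122223333:model-package/models/3"},
    {"ModelPackageVersion": 2, "ModelApprovalStatus": "Approved",
     "ModelPackageArn": "arn:aws:sagemaker:eu-west-1:111122223333:model-package/models/2"},
    {"ModelPackageVersion": 1, "ModelApprovalStatus": "Rejected",
     "ModelPackageArn": "arn:aws:sagemaker:eu-west-1:111122223333:model-package/models/1"},
]

# Version 2 is the newest Approved package, so its ARN is selected.
print(last_approved_arn(summaries))
```

When no approved package exists the function returns `None`, which is why the seed script falls back to writing `"None"` into `model.auto.tfvars` and the Terraform module guards endpoint creation with a `count` condition.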
\ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/buildspec.yml b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/buildspec.yml new file mode 100644 index 00000000..8cc367e2 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/buildspec.yml @@ -0,0 +1,23 @@ +version: 0.2 +env: + variables: + TERRAFORM_VERSION: 1.2.4 + +phases: + install: + runtime-versions: + python: 3.8 + commands: + - pip install --upgrade --force-reinstall "awscli>1.20.30" + - pip install -r requirements.txt + - export PYTHONUNBUFFERED=TRUE + - echo "Installing Terraform" + - curl -o /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip + - unzip -o /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /tmp && mv /tmp/terraform /usr/bin + - chmod +x /usr/bin/terraform + - terraform --version + + build: + commands: + - echo "Looking up last approved model and deploying infra" + - make auto-apply \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap.sh b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap.sh new file mode 100644 index 00000000..a7dd0d8a --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap.sh @@ -0,0 +1,131 @@ +#!/bin/bash + +sudo yum install -y gettext wget unzip jq + +export TERRAFORM_VERSION="1.2.4" + +echo "Attempting to install terraform" && \ +wget -q 
https://releases.hashicorp.com/terraform/${TERRAFORM_VERSION}/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -P /tmp && \ +unzip -q /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip -d /tmp && \ +sudo mv /tmp/terraform /usr/local/bin/ && \ +rm -rf /tmp/terraform_${TERRAFORM_VERSION}_linux_amd64.zip && \ +echo "terraform is installed successfully" + +FOLDER_NAME=$(basename "$PWD") + +# Read the default SM_PROJECT_ID from the SageMaker code config +DEFAULT_SM_PROJECT_ID=$(cat .sagemaker-code-config | jq -r .sagemakerProjectId) +read -p "Sagemaker Project ID (default \"$DEFAULT_SM_PROJECT_ID\"): " sm_project_id_input +export SM_PROJECT_ID="${sm_project_id_input:-$DEFAULT_SM_PROJECT_ID}" + +if [[ -z $SM_PROJECT_ID ]]; then + echo "No Sagemaker Project ID provided" + exit 1 +fi + +# Read the default SM_PROJECT_NAME from the SageMaker code config +DEFAULT_SM_PROJECT_NAME=$(cat .sagemaker-code-config | jq -r .sagemakerProjectName) +read -p "Sagemaker Project Name (default \"$DEFAULT_SM_PROJECT_NAME\"): " sm_project_name_input +export SM_PROJECT_NAME="${sm_project_name_input:-$DEFAULT_SM_PROJECT_NAME}" + +if [[ -z $SM_PROJECT_NAME ]]; then + echo "No Sagemaker Project Name provided" + exit 1 +fi + + +# Fetch Values from CloudFormation Stack Output +export STACK_NAME=$(aws cloudformation describe-stacks --query 'Stacks[?Tags[?Key == `sagemaker:project-id` && Value == `'$DEFAULT_SM_PROJECT_ID'`]].{StackName: StackName}' --output text) + +export PREFIX=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='Prefix'].OutputValue" --output text) + +export STATE_BUCKET_NAME=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='MlOpsProjectStateBucket'].OutputValue" --output text) + +export ARTIFACT_BUCKET_NAME=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='MlOpsArtifactsBucket'].OutputValue" --output text) + +export BUCKET_REGION=$(aws cloudformation describe-stacks 
--query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='BucketRegion'].OutputValue" --output text) + +CF_DEFAULT_BRANCH=$(aws cloudformation describe-stacks --query "Stacks[?StackName=='$STACK_NAME'][].Outputs[?OutputKey=='DefaultBranch'].OutputValue" --output text) + +export DEFAULT_BRANCH=${CF_DEFAULT_BRANCH:-main} + + +# Get the AWS Account ID. Should be set as environment variable in SageMaker Studio +read -p "AWS_ACCOUNT_ID (default \"$AWS_ACCOUNT_ID\"): " input_account_id +export AWS_ACCOUNT_ID="${input_account_id:-$AWS_ACCOUNT_ID}" + +if [[ -z "$AWS_ACCOUNT_ID" ]]; then + echo "No AWS_ACCOUNT_ID provided" + exit 1 +fi + +# Get the AWS Region. Should be set as environment variable in SageMaker Studio +read -p "AWS_REGION (default \"$AWS_REGION\"): " input_region +export AWS_REGION="${input_region:-$AWS_REGION}" + +if [[ -z "$AWS_REGION" ]]; then + echo "No AWS_REGION provided" + exit 1 +fi + +CODECOMMIT_ID=$(cat .sagemaker-code-config | jq -r .codeRepositoryName) +read -p "CODECOMMIT_ID (default \"$CODECOMMIT_ID\"): " input_codecommit_id +export CODECOMMIT_ID="${input_codecommit_id:-$CODECOMMIT_ID}" + +if [[ -z "$CODECOMMIT_ID" ]]; then + echo "No CODECOMMIT_ID provided" + exit 1 +fi + +read -p "DEFAULT_BRANCH (default \"$DEFAULT_BRANCH\"): " input_default_branch +export DEFAULT_BRANCH="${input_default_branch:-$DEFAULT_BRANCH}" + +if [[ -z "$DEFAULT_BRANCH" ]]; then + echo "No DEFAULT_BRANCH provided" + exit 1 +fi + + +export SM_EXECUTION_ROLE_ARN="arn:aws:iam::$AWS_ACCOUNT_ID:role/$PREFIX-sagemaker-execution-role" +export MODEL_PACKAGE_GROUP_NAME="$PREFIX-$SM_PROJECT_NAME-models" + +echo "--------------Bootstrap Output-----------------" +echo "Prefix: $PREFIX" +echo "Sagemaker Project ID: $SM_PROJECT_ID" +echo "Sagemaker Project Name: $SM_PROJECT_NAME" +echo "AWS Account ID: $AWS_ACCOUNT_ID" +echo "AWS Region: $AWS_REGION" +echo "Artifact bucket name: $ARTIFACT_BUCKET_NAME" +echo "State bucket name: $STATE_BUCKET_NAME" +echo "CodeCommit ID: $CODECOMMIT_ID" 
+echo "Default Branch: $DEFAULT_BRANCH" +echo "SageMaker Execution Role: $SM_EXECUTION_ROLE_ARN" +echo "Model Package Group Name: $MODEL_PACKAGE_GROUP_NAME" +echo "-----------------------------------------------" + +# Update provider.tf to use bucket and region +envsubst < "./infra_scripts/bootstrap/provider.template" > "./terraform/provider.tf" + +# Update terraform.tfvars +envsubst < "./infra_scripts/bootstrap/terraform.tfvars.template" > "./terraform/terraform.tfvars" + +# Update .project.env +SM_CODE_CONFIG=$(cat .sagemaker-code-config) +UPDATED_SM_CODE_CONFIG=$(echo $SM_CODE_CONFIG | jq ".prefix |= \"${PREFIX}\"" | jq ".model_package_group_name |= \"$MODEL_PACKAGE_GROUP_NAME\"" ) + +echo ${UPDATED_SM_CODE_CONFIG} | jq '.' > .sagemaker-code-config + +cd terraform + +terraform init + +cd .. + +echo "Commit & update tfvars, terraform provider and sagemaker-code-config" +git add .sagemaker-code-config +git add terraform/provider.tf +git add terraform/terraform.tfvars +git commit --author="SM Projects <>" -m "Bootstrapping complete" +git push +echo "Bootstrapping completed" + diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/provider.template b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/provider.template new file mode 100644 index 00000000..abfbbe43 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/provider.template @@ -0,0 +1,12 @@ +provider "aws" { + region = "$AWS_REGION" +} + +terraform { + required_version = ">= $TERRAFORM_VERSION" + backend "s3" { + bucket = "$STATE_BUCKET_NAME" + key = "projects/state/deploy/mlops-$SM_PROJECT_ID-deploy-state.tfstate" + region = "$AWS_REGION" + } +} diff --git 
a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/terraform.tfvars.template b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/terraform.tfvars.template new file mode 100644 index 00000000..e3926291 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/bootstrap/terraform.tfvars.template @@ -0,0 +1,10 @@ +aws_region = "$AWS_REGION" +aws_account_id = "$AWS_ACCOUNT_ID" +sm_project_id = "$SM_PROJECT_ID" +sm_project_name = "$SM_PROJECT_NAME" +codecommit_id = "$CODECOMMIT_ID" +prefix = "$PREFIX" +artifact_bucket_name = "$ARTIFACT_BUCKET_NAME" +default_branch = "$DEFAULT_BRANCH" +sm_execution_role_arn = "$SM_EXECUTION_ROLE_ARN" +model_package_group_name = "$MODEL_PACKAGE_GROUP_NAME" \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/get_last_approved_model.py b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/get_last_approved_model.py new file mode 100644 index 00000000..adb39c6d --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/infra_scripts/get_last_approved_model.py @@ -0,0 +1,44 @@ +import os +import boto3 +import json +import botocore + +region = boto3.Session().region_name +sm_client = boto3.client('sagemaker', region_name=region) + +sagemaker_code_config = open('.sagemaker-code-config') +config = json.load(sagemaker_code_config) +model_package_group_name = config["model_package_group_name"] + +mpg = sm_client.list_model_packages(ModelPackageGroupName=model_package_group_name) + +mpsl = mpg["ModelPackageSummaryList"] +last_approved_model = None +for 
model_package in mpsl: + approval_status = model_package["ModelApprovalStatus"] + model_arn = model_package["ModelPackageArn"] + # print(f'version:{model_package["ModelPackageVersion"]}\tapproval:{approval_status}\tarn:{model_arn}') + if last_approved_model is None and approval_status == "Approved": + last_approved_model = model_arn + +try: + model_response = sm_client.describe_model_package( + ModelPackageName=last_approved_model + ) + inference_image = model_response['InferenceSpecification']['Containers'][0]['Image'] + model_data_url = model_response['InferenceSpecification']['Containers'][0]['ModelDataUrl'] + +except botocore.exceptions.ParamValidationError as e: + if 'Invalid type for parameter ModelPackageName, value: None' in str(e): + inference_image = 'None' + model_data_url = 'None' + else: + raise e + +tf_vars_data = f'''inference_image = "{inference_image}" +model_data_url = "{model_data_url}"''' + +with open('terraform/model.auto.tfvars', 'w') as f: + f.write(tf_vars_data) + +print(f"Last Approved Model: {last_approved_model}") \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements-dev.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements-dev.txt new file mode 100644 index 00000000..92709451 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements-dev.txt @@ -0,0 +1 @@ +pytest==6.2.5 diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements.txt b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements.txt new file mode 100644 index 00000000..30ddf823 --- /dev/null +++ 
b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/requirements.txt @@ -0,0 +1 @@ +boto3 diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/.gitkeep b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/.gitkeep new file mode 100644 index 00000000..e69de29b diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/main.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/main.tf new file mode 100644 index 00000000..d9f7ea99 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/main.tf @@ -0,0 +1,53 @@ +locals { + prefix = var.prefix + sm_project_id = var.sm_project_id + sm_project_name = var.sm_project_name + aws_account_id = var.aws_account_id + aws_region = var.aws_region + target_branch = var.default_branch + artifact_bucket_name = var.artifact_bucket_name + codecommit_id = var.codecommit_id + initial_instance_count = 1 + initial_sampling_percentage = 30 + initial_variant_weight = 1 + instance_type = "ml.t2.medium" + model_name = "${local.sm_project_name}-model" + sagemaker_endpoint_configuration_name = "${local.prefix}-${local.sm_project_name}-config" + sagemaker_endpoint_name = "${local.prefix}-${local.sm_project_name}-endpoint" + variant_name = "v1" + sagemaker_execution_role = var.sm_execution_role_arn + inference_image = var.inference_image + model_data_url = var.model_data_url + pipeline_name = "modeldeploy-pipeline" + model_package_group_name = var.model_package_group_name +} + + +module "cicd_build_pipeline" { + source = ".//modules/cicd" + + prefix = local.prefix + sm_project_id = local.sm_project_id + sm_project_name = 
local.sm_project_name + model_package_group_name = local.model_package_group_name + codecommit_id = local.codecommit_id + target_branch = local.target_branch + artifact_bucket_name = local.artifact_bucket_name + pipeline_name = local.pipeline_name + sagemaker_execution_role = local.sagemaker_execution_role +} + +module "sm_endpoint_slim" { + source = ".//modules/sm_endpoint_slim" + + count = "${local.inference_image == "None" ? 0 : 1}" + prefix = local.prefix + instance_type = local.instance_type + initial_instance_count = local.initial_instance_count + model_name = local.model_name + sagemaker_execution_role = local.sagemaker_execution_role + sagemaker_endpoint_name = local.sagemaker_endpoint_name + variant_name = local.variant_name + inference_image = local.inference_image + model_data_url = local.model_data_url +} \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/cicd/codebuild.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/cicd/codebuild.tf new file mode 100644 index 00000000..90640134 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/cicd/codebuild.tf @@ -0,0 +1,150 @@ +data "aws_iam_policy_document" "codebuild_assume_policy" { + statement { + effect = "Allow" + actions = ["sts:AssumeRole"] + + principals { + type = "Service" + identifiers = ["codebuild.amazonaws.com"] + } + } +} + +resource "aws_iam_role" "codebuild_role" { + name = "${local.prefix}-${local.sm_project_name}-codebuild-modeldeploy-role" + assume_role_policy = data.aws_iam_policy_document.codebuild_assume_policy.json +} + +# TODO: SCOPE THIS DOWN!!!!!!!!!!!!!!!!!!!!! 
+resource "aws_iam_role_policy_attachment" "power_user" { + role = aws_iam_role.codebuild_role.id + policy_arn = "arn:aws:iam::aws:policy/PowerUserAccess" +} + +resource "aws_iam_policy" "manage_roles" { + description = "pass_role_to_sm_pipelines for SM Pipelines for ${local.sm_project_name}" + + policy = < +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.0 , < 1.1 | +| [aws](#requirement\_aws) | ~> 3.0 | + +## Providers + +| Name | Version | +|------|---------| +| [aws](#provider\_aws) | ~> 3.0 | + +## Modules + +| Name | Source | Version | +|------|--------|---------| +| [module\_tags](#module\_module\_tags) | ../tags_module | n/a | + +## Resources + +| Name | Type | +|------|------| +| [aws_sagemaker_endpoint.inference_endpoint](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sagemaker_endpoint) | resource | +| [aws_sagemaker_endpoint_configuration.inference_endpoint_configuration](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/sagemaker_endpoint_configuration) | resource | +| [aws_caller_identity.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/caller_identity) | data source | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [bucket\_name](#input\_bucket\_name) | The name of the S3 bucket. | `string` | n/a | yes | +| [capture\_modes](#input\_capture\_modes) | Data Capture Mode. Acceptable Values Input and Output. | `list(string)` |
[ "Input", "Output" ] | no | +| [csv\_content\_types](#input\_csv\_content\_types) | The CSV content type headers to capture. | `list(string)` | [ "text/csv" ] | no | +| [enable\_data\_capture\_config](#input\_enable\_data\_capture\_config) | Enable Data Capture Config. | `bool` | `false` | no | +| [initial\_instance\_count](#input\_initial\_instance\_count) | Instance Count. | `number` | `1` | no | +| [initial\_sampling\_percentage](#input\_initial\_sampling\_percentage) | Sampling Percentage. Acceptable Value Range Between 0 and 100. | `number` | `30` | no | +| [initial\_variant\_weight](#input\_initial\_variant\_weight) | Initial Variant Weight. Acceptable Value Range Between 0 and 1. | `number` | `1` | no | +| [instance\_type](#input\_instance\_type) | Instance Type. | `string` | `"ml.t2.medium"` | no | +| [kms\_key\_arn](#input\_kms\_key\_arn) | KMS key ARN. | `string` | n/a | yes | +| [model\_name](#input\_model\_name) | Approved Model Name. | `string` | n/a | yes | +| [sagemaker\_endpoint\_configuration\_name](#input\_sagemaker\_endpoint\_configuration\_name) | The name of the inference endpoint configuration. | `string` | n/a | yes | +| [sagemaker\_endpoint\_name](#input\_sagemaker\_endpoint\_name) | The name of the inference endpoint. | `string` | n/a | yes | +| [tags](#input\_tags) | Tags to be attached to the module tags of the resources. | `map(string)` | `{}` | no | +| [variant\_name](#input\_variant\_name) | The name of the variant. 
| `string` | `"variant-1"` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [inference\_endpoint](#output\_inference\_endpoint) | n/a | + diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/main.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/main.tf new file mode 100644 index 00000000..ee01ddda --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/main.tf @@ -0,0 +1,52 @@ +data "aws_caller_identity" "current" {} + +locals { + prefix = var.prefix + model_name = var.model_name + initial_instance_count = var.initial_instance_count + instance_type = var.instance_type + sagemaker_endpoint_name = var.sagemaker_endpoint_name + variant_name = var.variant_name + sagemaker_execution_role = var.sagemaker_execution_role + inference_image = var.inference_image + model_data_url = var.model_data_url +} + + +resource "random_id" "force_endpoint_update" { + keepers = { + # Generate a new id each time we switch model data url + model_data_url = local.model_data_url + } + byte_length = 8 +} + +resource "aws_sagemaker_model" "model" { + name = "${local.prefix}-${local.model_name}-${random_id.force_endpoint_update.dec}" + execution_role_arn = local.sagemaker_execution_role + + container { + image = local.inference_image + model_data_url = local.model_data_url + } +} + +resource "aws_sagemaker_endpoint_configuration" "inference_endpoint_configuration" { + name = "${local.sagemaker_endpoint_name}-${random_id.force_endpoint_update.dec}" + + production_variants { + variant_name = local.variant_name + model_name = aws_sagemaker_model.model.name + initial_instance_count = local.initial_instance_count + instance_type = local.instance_type + } + + lifecycle { + 
create_before_destroy = true + } +} + +resource "aws_sagemaker_endpoint" "inference_endpoint" { + endpoint_config_name = aws_sagemaker_endpoint_configuration.inference_endpoint_configuration.name + name = local.sagemaker_endpoint_name +} \ No newline at end of file diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/outputs.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/outputs.tf new file mode 100644 index 00000000..8677673c --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/outputs.tf @@ -0,0 +1,3 @@ +output "inference_endpoint" { + value = aws_sagemaker_endpoint.inference_endpoint +} diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/terraform.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/terraform.tf new file mode 100644 index 00000000..03d0656a --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/terraform.tf @@ -0,0 +1,9 @@ +terraform { + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 3.0" + } + } + required_version = ">= 1.0 , <= 1.2.4" +} diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/variables.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/variables.tf new file mode 100644 index 00000000..49097ad3 --- 
/dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/modules/sm_endpoint_slim/variables.tf @@ -0,0 +1,47 @@ +variable "prefix" { + type = string + description = "Namespace prefix from the template." +} + +variable "initial_instance_count" { + default = 1 + description = "Instance Count." + type = number +} + +variable "instance_type" { + default = "ml.t2.medium" + description = "Instance Type." + type = string +} + +variable "model_name" { + description = "Model Name." + type = string +} + +variable "sagemaker_endpoint_name" { + description = "The name of the sagemaker endpoint." + type = string +} + +variable "variant_name" { + default = "variant-1" + description = "The name of the variant." + type = string +} + +variable "sagemaker_execution_role" { + description = "The name of the SM execution role for the model" + type = string +} + +variable "inference_image" { + type = string + description = "SageMaker inference image" +} + +variable "model_data_url" { + type = string + description = "SageMaker model data url" +} diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/variables.tf b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/variables.tf new file mode 100644 index 00000000..16225a3c --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/seed_code/deploy_app/terraform/variables.tf @@ -0,0 +1,71 @@ +variable "sm_project_id" { + type = string + description = "Sagemaker Project ID" +} + +variable "sm_project_name" { + type = string + description = "Sagemaker Project Name" +} + +variable "aws_account_id" { + type = string + description = "AWS Account ID" +} + +variable "aws_region" { + type = string + description = "AWS Region" +} + +variable "codecommit_id" { + type = string + description 
= "CodeCommit ID." +} + +variable "prefix" { + type = string + description = "Namespace prefix from the template." +} + +variable "artifact_bucket_name" { + type = string + description = "Project artifact bucket name." +} + +variable "default_branch" { + type = string + description = "Default CodeCommit branch." +} + +variable "model_name" { + type = string + description = "SageMaker model name" + default = "" +} + +variable "sm_execution_role_arn" { + type = string + description = "SageMaker Execution role arn" +} + +variable "inference_image" { + type = string + default = "" + description = "Model inference image" +} + +variable "model_data_url" { + type = string + description = "SageMaker model data url" + default = "" +} + +variable "model_package_group_name" { + type = string + description = "Name of project's model package group" +} + + + + diff --git a/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/service_catalog_product_template.yaml.tpl b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/service_catalog_product_template.yaml.tpl new file mode 100644 index 00000000..e35cb2d8 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/templates/mlops_terraform_template/service_catalog_product_template.yaml.tpl @@ -0,0 +1,63 @@ +Description: |- + Sagemaker Projects - MLOps Template using Terraform +Parameters: + SageMakerProjectName: + Type: String + AllowedPattern: ^[a-zA-Z](-*[a-zA-Z0-9])* + Description: Name of the project + MaxLength: 32 + MinLength: 1 + SageMakerProjectId: + Type: String + Description: Service generated Id of the project. 
+Resources: + MlOpsArtifactsBucket: + Type: AWS::S3::Bucket + Properties: + BucketName: + Fn::Sub: "${artifact_bucket_name}" + DeletionPolicy: Retain + MlOpsProjectStateBucket: + Type: AWS::S3::Bucket + Properties: + BucketName: + Fn::Sub: "${state_bucket_name}" + DeletionPolicy: Retain + ModelBuildCodeCommitRepository: + Type: AWS::CodeCommit::Repository + Properties: + RepositoryName: + Fn::Sub: "${build_code_repo_name}" + Code: + BranchName: ${default_branch} + S3: + Bucket: ${seed_code_bucket} + Key: ${seed_code_build_key} + RepositoryDescription: Model Build Code + ModelDeployCodeCommitRepository: + Type: AWS::CodeCommit::Repository + Properties: + RepositoryName: + Fn::Sub: "${deploy_code_repo_name}" + Code: + BranchName: ${default_branch} + S3: + Bucket: ${seed_code_bucket} + Key: ${seed_code_deploy_key} + RepositoryDescription: Model Deploy Code +Outputs: + Prefix: + Value: "${prefix}" + Description: Prefix value for infrastructure created by this template + DefaultBranch: + Value: "${default_branch}" + Description: Default CodeCommit branch + MlOpsProjectStateBucket: + Value: !Ref MlOpsProjectStateBucket + Description: Name of State Bucket for Project + MlOpsArtifactsBucket: + Value: !Ref MlOpsArtifactsBucket + Description: Name of Artifacts Bucket for Project + BucketRegion: + Value: !Sub "$${AWS::Region}" + Description: Region of the State Bucket for Project diff --git a/mlops-template-terraform/mlops_templates/terraform/variables.tf b/mlops-template-terraform/mlops_templates/terraform/variables.tf new file mode 100644 index 00000000..139597f9 --- /dev/null +++ b/mlops-template-terraform/mlops_templates/terraform/variables.tf @@ -0,0 +1,2 @@ + +
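The `bootstrap.sh` script above renders `provider.tf` and `terraform.tfvars` from the `*.template` files by exporting shell variables and piping the templates through `envsubst`. The same `$VAR` substitution can be illustrated with Python's `string.Template`, which shares the placeholder syntax; the values below are illustrative stand-ins for the variables the script exports:

```python
# Illustrate the envsubst-style rendering bootstrap.sh applies to
# provider.template: each $VAR placeholder is replaced with the value of
# the corresponding (here: simulated) environment variable.
from string import Template

provider_template = '''provider "aws" {
  region = "$AWS_REGION"
}

terraform {
  backend "s3" {
    bucket = "$STATE_BUCKET_NAME"
    key    = "projects/state/deploy/mlops-$SM_PROJECT_ID-deploy-state.tfstate"
    region = "$AWS_REGION"
  }
}
'''

# Stand-ins for the variables bootstrap.sh exports before calling envsubst.
env = {
    "AWS_REGION": "eu-west-1",
    "STATE_BUCKET_NAME": "example-state-bucket",
    "SM_PROJECT_ID": "p-abc123",
}

rendered = Template(provider_template).substitute(env)
print(rendered)
```

This mirrors why the bootstrap step must run before `terraform init`: the backend bucket, key, and region are only known once the CloudFormation stack outputs have been resolved into the rendered `provider.tf`.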