### Install Dependencies for Building Docs

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/CONTRIBUTING.md

Initial setup for building the Sphinx documentation. These commands install the documentation requirements and the SDK in editable mode.

```shell
# Initial setup, only required for the first run
pip install -r requirements.txt
pip install -e ../
```

--------------------------------

### Setup Environment and Install Dependencies

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/recipe_override_evaluator_example.ipynb

Sets up the Python environment by adjusting sys.path and environment variables, and installs the required packages omegaconf and pyyaml.

```python
# Setup environment
import sys
import os

sys.path.insert(0, '../../sagemaker-train/src')
sys.path.insert(0, '../../sagemaker-core/src')

# Point botocore at the bundled service model
os.environ['AWS_DATA_PATH'] = os.path.abspath('../../sagemaker-core/sample')

!pip install -q omegaconf pyyaml
```

--------------------------------

### Install Dependencies and Set Environment

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/recipe_override_sft_trainer_example.ipynb

Installs necessary libraries and configures environment variables for SageMaker training. This is a prerequisite for running the subsequent code examples.

```python
import sys
import os

sys.path.insert(0, '../../sagemaker-train/src')
sys.path.insert(0, '../../sagemaker-core/src')
os.environ['AWS_DATA_PATH'] = os.path.abspath('../../sagemaker-core/sample')

!pip install -q omegaconf pyyaml
```

--------------------------------

### Install Project

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-train/CONTRIBUTING.md

Install the project locally using pip. Ensure you are in the project's root directory.

```bash
pip install .
```

--------------------------------

### Install SageMaker Python SDK V3 Core Functionality

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/installation.md

Installs only the core components of the SageMaker Python SDK V3, useful for minimal setups.

```bash
pip install sagemaker-core
```

--------------------------------

### Install SageMaker Python SDK from Source

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/README.rst

Install the SageMaker Python SDK directly from its GitHub repository. This involves cloning the repository and then running a pip install command from the root directory.

```bash
git clone https://github.com/aws/sagemaker-python-sdk.git
cd sagemaker-python-sdk
pip install .
```

--------------------------------

### Install All SageMaker Packages

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-mlops/README.md

Install all SageMaker packages together in editable mode for development.

```bash
pip install -e sagemaker-core
pip install -e sagemaker-train
pip install -e sagemaker-serve
pip install -e sagemaker-mlops
```

--------------------------------

### Install SageMaker SDK Helper

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/migration.md

Install the SageMaker SDK helper tool for migration. Use --force-reinstall to ensure the latest version is applied.

```bash
pip install --no-cache-dir https://d3azyja9oqj8z1.cloudfront.net/sagemaker_sdk_helper-0.2.0.tar.gz --force-reinstall
```

--------------------------------

### Verify SageMaker SDK Helper Installation

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/migration.md

Verify the installation of the SageMaker SDK helper by checking its executable path and running the help command.

```bash
which sagemaker-sdk-helper  # Should output the path to the executable
```

```bash
sagemaker-sdk-helper --help  # Test the server runs
```

--------------------------------

### Install Test Libraries

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/README.rst

Installs the libraries needed to run the tests. Use the first command in most shells. Zsh users need the second, which escapes the square brackets so that Zsh does not treat them as a glob pattern.

```bash
pip install --upgrade .[test]
```

```bash
pip install --upgrade .\[test\]
```

--------------------------------

### Install Helm 3

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/model_customization/finetuning_hyperpod.md

Installs Helm 3, a prerequisite for the SageMaker HyperPod CLI. The script downloads the Helm installer, makes it executable, runs it, deletes it, and then verifies the installation.

```bash
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
rm -f ./get_helm.sh
helm version  # Verify installation
```

--------------------------------

### Install Documentation Dependencies with Pip

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/README.rst

Installs documentation dependencies using pip from the requirements.txt file.

```bash
pip install -r doc/requirements.txt
```

--------------------------------

### start

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Starts a SageMaker Notebook Instance. This operation can be used to resume a previously stopped instance.

```APIDOC
## start

### Description
Starts a NotebookInstance resource.

### Method
Not specified (SDK method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
* **session** (*Session* | *None*) – Boto3 session.
* **region** (*str* | *None*) – Region name.

### Raises
* **botocore.exceptions.ClientError** – This exception is raised for AWS service related errors.
* **ResourceLimitExceeded** – You have exceeded a SageMaker resource limit.

### Return type: None
```

--------------------------------

### Install Python and Activate Virtual Environment

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/CONTRIBUTING.md

Installs a specific Python version (3.10.14), creates a virtual environment named 'py3.10.14', and activates it.

```bash
pyenv install 3.10.14
pyenv virtualenv 3.10.14 py3.10.14
pyenv activate py3.10.14
```

--------------------------------

### Quick Install SageMaker Python SDK V3

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/installation.md

Installs the latest version of the SageMaker Python SDK V3. This is the most straightforward method for general use.

```bash
pip install sagemaker
```

--------------------------------

### Install SageMaker Python SDK V3 Training Capabilities

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/installation.md

Installs the SageMaker Python SDK V3 components specifically for model training tasks.

```bash
pip install sagemaker-train
```

--------------------------------

### Setup and Configuration

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/rlvr_finetuning_example_notebook_v3_prod.ipynb

Imports necessary libraries and configures AWS credentials and region for SageMaker RLVR finetuning.

```python
# Configure AWS credentials and region
#! ada credentials update --provider=isengard --account=<> --role=Admin --profile=default --once
#! aws configure set region us-west-2

from sagemaker.train.rlvr_trainer import RLVRTrainer
from sagemaker.train.configs import InputData
from rich import print as rprint
from rich.pretty import pprint
from sagemaker.core.resources import ModelPackage
from sagemaker.train.common import TrainingType
import boto3
import os
from sagemaker.core.helper.session_helper import Session

# For MLflow native metrics in Trainer wait, run the line below with the appropriate region
os.environ["SAGEMAKER_MLFLOW_CUSTOM_ENDPOINT"] = "https://mlflow.sagemaker.us-west-2.app.aws"
```

--------------------------------

### Start a SageMaker Notebook Instance

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Starts a SageMaker Notebook Instance. Ensure you have the necessary permissions and that resource limits are not exceeded.

```python
# Assuming 'notebook_instance' is an instance of a SageMaker NotebookInstance class
notebook_instance.start()
```

--------------------------------

### Get All Training Plans

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/api/sagemaker_core.md

Retrieves a list of all TrainingPlan resources, with options for filtering by start time, sorting, and pagination.

```APIDOC
## classmethod get_all

### Description
Get all TrainingPlan resources.

### Method
classmethod

### Parameters
#### Query Parameters
- **start_time_after** (datetime) - Optional - Filter to list only training plans with an actual start time after this date.
- **start_time_before** (datetime) - Optional - Filter to list only training plans with an actual start time before this date.
- **sort_by** (str | PipelineVariable) - Optional - The training plan field to sort the results by (e.g., StartTime, Status).
- **sort_order** (str | PipelineVariable) - Optional - The order to sort the results (Ascending or Descending).
- **filters** (List[TrainingPlanFilter]) - Optional - Additional filters to apply to the list of training plans.
- **session** (boto3.session.Session) - Optional - Boto3 session.
- **region** (str | PipelineVariable) - Optional - Region name.

### Returns
ResourceIterator[[TrainingPlan](#sagemaker.core.resources.TrainingPlan)] - Iterator for listed TrainingPlan resources.

### Raises
- **botocore.exceptions.ClientError** - This exception is raised for AWS service related errors. The error message and error code can be parsed from the exception as follows:

      try:
          # AWS service call here
      except botocore.exceptions.ClientError as e:
          error_message = e.response['Error']['Message']
          error_code = e.response['Error']['Code']
```

--------------------------------

### Basic SageMaker Session Setup

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/quickstart.md

Import the SDK and create a SageMaker session. This also retrieves the execution role and default S3 bucket.

```python
from sagemaker.core.helper.session_helper import Session, get_execution_role

# Create a SageMaker session
session = Session()
role = get_execution_role()

print(f"Using role: {role}")
print(f"Default bucket: {session.default_bucket()}")
```

--------------------------------

### Get All Training Plans

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Retrieves an iterator for all SageMaker TrainingPlan resources. Supports filtering by start time, sorting, and applying additional filters.

```APIDOC
## Get All Training Plans

### Description
Retrieves an iterator for all SageMaker TrainingPlan resources. Supports filtering by start time, sorting, and applying additional filters.

### Method
classmethod

### Signature
get_all(start_time_after=Unassigned(), start_time_before=Unassigned(), sort_by=Unassigned(), sort_order=Unassigned(), filters=Unassigned(), session=None, region=None)

### Parameters
#### Optional Parameters
* **next_token** - A token to continue pagination if more results are available.
* **max_results** - The maximum number of results to return in the response.
* **start_time_after** (datetime | None) - Filter to list only training plans with an actual start time after this date.
* **start_time_before** (datetime | None) - Filter to list only training plans with an actual start time before this date.
* **sort_by** (str | PipelineVariable | None) - The training plan field to sort the results by (e.g., StartTime, Status).
* **sort_order** (str | PipelineVariable | None) - The order to sort the results (Ascending or Descending).
* **filters** (List[TrainingPlanFilter] | None) - Additional filters to apply to the list of training plans.
* **session** (Session | None) - Boto3 session.
* **region** (str | PipelineVariable | None) - Region name.

### Returns
Iterator for listed TrainingPlan resources.

### Raises
* **botocore.exceptions.ClientError** - This exception is raised for AWS service related errors.
```

--------------------------------

### Create ModelBuilder from JumpStart Configuration

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_serve.md

This example shows how to instantiate a ModelBuilder using a JumpStart configuration, specifying compute resources. It's ideal for deploying pre-trained models from SageMaker JumpStart.

```python
>>> from sagemaker.core.jumpstart.configs import JumpStartConfig
>>> from sagemaker.serve.model_builder import ModelBuilder
>>>
>>> js_config = JumpStartConfig(
...     model_id="huggingface-llm-mistral-7b",
...     model_version="*"
... )
>>>
>>> from sagemaker.core.training.configs import Compute
>>>
>>> model_builder = ModelBuilder.from_jumpstart_config(
...     jumpstart_config=js_config,
...     compute=Compute(instance_type="ml.g5.2xlarge", instance_count=1)
... )
>>>
>>> model = model_builder.build()  # Creates Model resource
>>> endpoint = model_builder.deploy()  # Creates Endpoint resource
>>> result = endpoint.invoke(data=input_data)  # Make predictions
```

--------------------------------

### Initialize SageMaker Session (V3)

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/ml_ops/lineage.md

Sets up a SageMaker session and default bucket using V3 import paths. This is the recommended approach for new projects.

```python
import boto3
from sagemaker.core.helper.session_helper import Session

region = boto3.Session().region_name
sagemaker_session = Session()
default_bucket = sagemaker_session.default_bucket()
```

--------------------------------

### Get All Training Plans with Filters

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_core.md

Retrieves all TrainingPlan resources, allowing filtering by start time, sorting, and additional custom filters. The boto3 session and the region are both optional arguments.

```python
from datetime import datetime

# Import path taken from the API reference anchor (sagemaker.core.resources.TrainingPlan)
from sagemaker.core.resources import TrainingPlan

# List training plans that started after 2023-01-01, oldest first
training_plans = TrainingPlan.get_all(
    start_time_after=datetime(2023, 1, 1),
    sort_by='StartTime',
    sort_order='Ascending',
)
```

--------------------------------

### Implement Custom Evaluator in SageMaker

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_train.md

Subclasses of BaseEvaluator must implement the evaluate method to define custom evaluation logic. This example shows how to create a pipeline definition and start an execution.

```python
>>> # In a subclass implementation
>>> class CustomEvaluator(BaseEvaluator):
...     def evaluate(self):
...         # Create pipeline definition
...         pipeline_definition = self._build_pipeline()
...         # Start execution
...         return EvaluationPipelineExecution.start(...)
```

--------------------------------

### Setup SageMaker session and create temporary directories

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/training-examples/local-training-example.ipynb

Initializes a SageMaker session, retrieves the default training image for PyTorch, and creates temporary directories for source code and data. It also sets the Docker platform for Apple Silicon compatibility.

```python
# Imports added so the cell runs on its own (paths as used elsewhere in these docs)
import os
import platform
import tempfile

from sagemaker.core import image_uris
from sagemaker.core.helper.session_helper import Session

sagemaker_session = Session()
region = sagemaker_session.boto_region_name

# Get the correct ECR image for your region
DEFAULT_CPU_IMAGE = image_uris.retrieve(
    framework="pytorch",
    region=region,
    version="2.0.0",
    py_version="py310",
    instance_type="ml.m5.xlarge",
    image_scope="training"
)

# Set Docker platform for Apple Silicon compatibility
if platform.machine() == 'arm64':
    os.environ['DOCKER_DEFAULT_PLATFORM'] = 'linux/amd64'

# Create temporary directories
temp_dir = tempfile.mkdtemp()
source_dir = os.path.join(temp_dir, "source")
data_dir = os.path.join(temp_dir, "data")
train_dir = os.path.join(data_dir, "train")
test_dir = os.path.join(data_dir, "test")

os.makedirs(source_dir, exist_ok=True)
os.makedirs(train_dir, exist_ok=True)
os.makedirs(test_dir, exist_ok=True)

print(f"Created temporary directories in: {temp_dir}")
```

--------------------------------

### LLMAsJudgeEvaluator with Built-in Metrics and Wait

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_train.md

Initiate an LLM-as-judge evaluation job using built-in metrics and wait for its completion. This example demonstrates a basic setup for evaluating a base model against an evaluator model.

```python
evaluator = LLMAsJudgeEvaluator(
    base_model="llama-3-3-70b-instruct",
    evaluator_model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    dataset="s3://my-bucket/my-dataset.jsonl",
    builtin_metrics=["Correctness", "Helpfulness"],
    s3_output_path="s3://my-bucket/output"
)

execution = evaluator.evaluate()
execution.wait()
```

--------------------------------

### V3 Resource Chaining Example

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/migration.md

Demonstrates V3 resource chaining by training a model and then using its output to build and deploy a model.

```python
# Train a model
model_trainer = ModelTrainer(...)
model_trainer.train()

# Chain training output to model builder
model_builder = ModelBuilder(model=model_trainer)

# Deploy the trained model
endpoint = model_builder.deploy()
```

--------------------------------

### Serve Sphinx Documentation Locally

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/README.rst

Starts a local Python web server to preview the built HTML documentation. Access it via http://localhost:8000.

```bash
cd _build/html
python -m http.server 8000
```

--------------------------------

### Initialize, Build, and Deploy Model with ModelBuilder

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_serve.md

Initialize ModelBuilder with a trained model, then build the SageMaker Model resource and deploy it to an endpoint. Finally, invoke the endpoint to get predictions. This example demonstrates the core workflow for deploying a model.

```python
from sagemaker.serve.model_builder import ModelBuilder
from sagemaker.serve.mode.function_pointers import Mode

# Initialize with a trained model
model_builder = ModelBuilder(
    model=my_pytorch_model,
    role_arn="arn:aws:iam::123456789012:role/SageMakerRole",
    instance_type="ml.m5.xlarge"
)

# Build the model (creates SageMaker Model resource)
model = model_builder.build()

# Deploy to endpoint (creates SageMaker Endpoint resource)
endpoint = model_builder.deploy(endpoint_name="my-endpoint")

# Make predictions
result = endpoint.invoke(data=input_data)
```

--------------------------------

### Alternative Data Mix Configurations

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/nova_data_mixing.ipynb

Examples of alternative `DataMixingConfig` setups for different specialization needs: high specialization (90% custom data), balanced mix (50% custom data), and light specialization (30% custom data).

```python
# High specialization: mostly your data
high_specialization = DataMixingConfig(
    customer_data_percent=90.0,
    nova_data_percentages={
        "reasoning": 100.0,
    },
)

# Balanced: equal split with multiple Nova categories
balanced_mix = DataMixingConfig(
    customer_data_percent=50.0,
    nova_data_percentages={
        "code": 40.0,
        "reasoning": 30.0,
        "math": 30.0,
    },
)

# Light specialization: preserve broad capabilities
light_specialization = DataMixingConfig(
    customer_data_percent=30.0,
    nova_data_percentages={
        "code": 25.0,
        "math": 25.0,
        "chat": 25.0,
        "reasoning": 25.0,
    },
)
```

--------------------------------

### Access and Modify Evaluation Hyperparameters

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_train.md

Illustrates how to access, modify, and retrieve evaluation hyperparameters using the `hyperparameters` property of the BenchMarkEvaluator. Includes examples of getting current values, setting new values with validation, converting to a dictionary, and displaying parameter information.

```python
evaluator = BenchMarkEvaluator(...)

# Access current values
print(evaluator.hyperparameters.temperature)

# Modify values (with validation)
evaluator.hyperparameters.temperature = 0.5

# Get as dictionary
params = evaluator.hyperparameters.to_dict()

# Display parameter information
evaluator.hyperparameters.get_info()
evaluator.hyperparameters.get_info('temperature')
```

--------------------------------

### SFT Trainer Nova Testing Setup

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/sft_finetuning_example_notebook_pysdk_prod_v3.ipynb

Configure environment variables and initialize an `SFTTrainer` for testing with the Nova model. Be sure to replace the placeholder values in the dataset ARN and the S3 output path.

```python
os.environ['SAGEMAKER_REGION'] = 'us-east-1'

# For fine-tuning
sft_trainer_nova = SFTTrainer(
    #model="test-nova-lite-v2",
    #model="nova-textgeneration-micro",
    model="nova-textgeneration-lite-v2",
    training_type=TrainingType.LORA,
    model_package_group="sdk-test-finetuned-models",
    mlflow_experiment_name="test-nova-finetuned-models-exp",
    mlflow_run_name="test-nova-finetuned-models-run",
    training_dataset="arn:aws:sagemaker:us-east-1:<>:hub-content/sdktest/DataSet/sft-nova-test-dataset/0.0.1",
    s3_output_path="s3:///output/"  # TODO: replace with your S3 output path
)
```

--------------------------------

### start

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Start a PipelineExecution resource.

```APIDOC
## start

### Description
Start a PipelineExecution resource.

### Method
Not explicitly defined, assumed to be a Python method call.

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
* **session** (*Session* | None) – Boto3 session.
* **region** (*str* | None) – Region name.
* **pipeline_name** (*str* | *PipelineVariable*)
* **client_request_token** (*str* | *PipelineVariable*)
* **pipeline_parameters** (*List*[*Parameter*] | None)
* **mlflow_experiment_name** (*str* | *PipelineVariable* | None)

### Raises
* **botocore.exceptions.ClientError** – This exception is raised for AWS service related errors.
* **ConflictException** – There was a conflict when you attempted to modify a SageMaker entity such as an Experiment or Artifact.
* **ResourceLimitExceeded** – You have exceeded a SageMaker resource limit.
* **ResourceNotFound** – Resource being accessed is not found.

### Return type: None
```

--------------------------------

### SageMaker Entry Point Configuration V2 vs V3

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/training/index.md

Shows how the entry point script for a training job is configured in V2 and in V3. The first snippet is the V2 `entry_point` argument; the second is the V3 equivalent, which uses a `SourceCode` object.

```python
entry_point="train.py"
```

```python
SourceCode(entry_script="train.py")
```

--------------------------------

### Install Python Dependencies

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/example_notebooks/sagemaker-core-llama-3-8B.ipynb

Installs or upgrades necessary Python packages including sagemaker-core and huggingface_hub. Use --quiet for silent installation.

```python
%pip install pip --upgrade --quiet
%pip install sagemaker-core huggingface_hub --quiet
```

--------------------------------

### Setup SageMaker Session and Retrieve Training Image

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/training-examples/aws_batch/sm-training-queues_getting_started_with_model_trainer.ipynb

Initializes logging and a SageMaker session, then retrieves the appropriate training image URI for PyTorch.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)
logging.getLogger("botocore.client").setLevel(level=logging.WARN)
logger = logging.getLogger(__name__)

from sagemaker.core.helper.session_helper import Session
from sagemaker.core import image_uris

session = Session()

image_uri = image_uris.retrieve(
    framework="pytorch",
    region=session.boto_session.region_name,
    version="2.5",
    instance_type=INSTANCE_TYPE,
    image_scope="training",
)
```

--------------------------------

### Install MLflow for Path Resolution Fixes

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/ml-ops-examples/v3-mlflow-train-inference-e2e-example.ipynb

Installs a specific version of MLflow to address known issues with model path resolution. A kernel restart is required after installation.

```python
# Install fix for MLflow path resolution issues
%pip install mlflow==3.4.0
```

--------------------------------

### Initialize SageMaker Session and Setup Directories

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/training-examples/distributed-local-training-example.ipynb

Initializes the SageMaker session and creates temporary directories for source code and data. This setup is crucial for organizing training artifacts.

```python
# Imports added so the cell runs on its own (paths as used elsewhere in these docs)
import os
import tempfile

from sagemaker.core import image_uris
from sagemaker.core.helper.session_helper import Session

sagemaker_session = Session()
region = sagemaker_session.boto_region_name

DEFAULT_CPU_IMAGE = image_uris.retrieve(
    framework="pytorch",
    region=region,
    version="2.0.0",
    py_version="py310",
    instance_type="ml.m5.xlarge",
    image_scope="training"
)

# Create temporary directories
temp_dir = tempfile.mkdtemp()
source_dir = os.path.join(temp_dir, "source")
data_dir = os.path.join(temp_dir, "data")
train_dir = os.path.join(data_dir, "train")
test_dir = os.path.join(data_dir, "test")

os.makedirs(source_dir, exist_ok=True)
os.makedirs(train_dir, exist_ok=True)
os.makedirs(test_dir, exist_ok=True)

print(f"Created temporary directories in: {temp_dir}")
print("Note: This will use multiple Docker containers locally for distributed training!")
```

--------------------------------

### start

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/api/sagemaker_mlops.md

Starts a Pipeline execution in the Workflow service with optional parameters for customization.

```APIDOC
## start

### Description
Starts a Pipeline execution in the Workflow service. Allows overriding pipeline parameters and specifying execution details.

### Method
Not specified (assumed to be a method call in a Python SDK)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
- **parameters** (Dict[str, Union[str, bool, int, float]], optional) - Values to override pipeline parameters.
- **execution_display_name** (str, optional) - The display name of the pipeline execution.
- **execution_description** (str, optional) - A description of the execution.
- **parallelism_config** (ParallelismConfiguration, optional) - Parallelism configuration that is applied to each of the executions of the pipeline. It takes precedence over the parallelism configuration of the parent pipeline.
- **selective_execution_config** (SelectiveExecutionConfig, optional) - The configuration for selective step execution.
- **mlflow_experiment_name** (str, optional) - Optional MLflow experiment name to override the experiment name specified in the pipeline’s mlflow_config. If provided, this will override the experiment name for this specific pipeline execution only, without modifying the pipeline definition.
- **pipeline_version_id** (int, optional) - Version ID of the pipeline to start the execution from. If not specified, uses the latest version ID.

### Response
#### Success Response
- **PipelineExecution** - A PipelineExecution instance, if successful.

### Request Example
```python
# Example usage (Python SDK):
parameters = {"param1": "value1"}
sagemaker_pipeline.start(parameters=parameters, execution_display_name="MyExecution")
```

### Response Example
```json
{
  "example": "PipelineExecution(arn='arn:aws:sagemaker:us-east-1:123456789012:pipeline/MyPipeline/execution/MyExecutionId')"
}
```
```

--------------------------------

### Install SageMaker MLOps Package

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-mlops/README.md

Install the sagemaker-mlops package in editable mode for development.

```bash
pip install -e sagemaker-mlops
```

--------------------------------

### Setup SageMaker Session and Create Test Files

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/training-examples/custom-distributed-training-example.ipynb

Initializes the SageMaker session, retrieves the execution role and region, and sets up temporary directories for custom driver and script files. It also retrieves the default PyTorch training image.

```python
# Imports added so the cell runs on its own (paths as used elsewhere in these docs)
import os
import tempfile

from sagemaker.core import image_uris
from sagemaker.core.helper.session_helper import Session, get_execution_role

sagemaker_session = Session()
role = get_execution_role()
region = sagemaker_session.boto_region_name

DEFAULT_CPU_IMAGE = image_uris.retrieve(
    framework="pytorch",
    region=region,
    version="2.0.0",
    py_version="py310",
    instance_type="ml.m5.xlarge",
    image_scope="training"
)

# Create temporary directories
temp_dir = tempfile.mkdtemp()
custom_drivers_dir = os.path.join(temp_dir, "custom_drivers")
scripts_dir = os.path.join(temp_dir, "scripts")

os.makedirs(custom_drivers_dir, exist_ok=True)
os.makedirs(scripts_dir, exist_ok=True)

print(f"Created temporary directories in: {temp_dir}")
```

--------------------------------

### Install Additional Packages

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/example_notebooks/sagemaker_core_overview.ipynb

Installs or upgrades scikit-learn, pandas, and boto3 to their latest versions.

```bash
# Install additional packages
!pip install -U scikit-learn pandas boto3
```

--------------------------------

### Install SageMaker Python SDK from Source

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-train/README.rst

Clone the repository, navigate to the sagemaker_train directory, and install using pip. This method is useful for development or when needing the latest unreleased changes.

```bash
git clone https://github.com/aws/sagemaker-python-sdk-staging.git
cd sagemaker-python-sdk-staging/sagemaker_train
pip install .
```

--------------------------------

### start

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Starts a MonitoringSchedule resource. This operation can be used to initiate a scheduled monitoring job.

```APIDOC
## start

### Description
Starts a MonitoringSchedule resource.

### Method
Not applicable (Python SDK method)

### Parameters
#### Path Parameters
None

#### Query Parameters
None

#### Request Body
None

### Parameters
* **session** (*Session* | *None*) – Boto3 session.
* **region** (*str* | *None*) – Region name.
### Raises
* **botocore.exceptions.ClientError** – This exception is raised for AWS service related errors.
* **ResourceNotFound** – Resource being accessed is not found.

### Return type
None
```

--------------------------------

### Intelligent Defaults Configuration Example

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/example_notebooks/intelligent_defaults_and_logging.ipynb

Example JSON structure for configuring intelligent defaults, including global and resource-specific settings for VPC configuration and training job parameters. Replace placeholder values with your actual resource IDs and ARNs. The `//` comments are annotations only; remove them before use, because JSON does not allow comments.

```json
{
  "SchemaVersion": "1.0",
  "SageMaker": {
    "PythonSDK": {
      "Resources": {
        "GlobalDefaults": {
          "vpc_config": {
            "security_group_ids": [
              "sg-xxxxxxxxxxxxxxxxx" // Replace with security group id
            ],
            "subnets": [
              "subnet-xxxxxxxxxxxxxxxxx", // Replace with subnet id
              "subnet-xxxxxxxxxxxxxxxxx" // Replace with subnet id
            ]
          }
          // ...
        },
        "TrainingJob": {
          "role_arn": "arn:aws:xxxxxxxxxxx:role/xxxxx", // Replace with role arn
          "output_data_config": {
            "s3_output_path": "s3://xxxxxxxxxxx" // Replace with S3 URI
          },
          // ...
        }
      }
    }
  }
}
```

--------------------------------

### start

Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/index.md

Starts an InferenceExperiment resource. This operation initiates the execution of a specified inference experiment.

```APIDOC
## start

### Description
Start an InferenceExperiment resource. This operation initiates the execution of a specified inference experiment.

### Method
Method

### Parameters
#### Path Parameters
None

#### Query Parameters
* **session** (Session | None) - Boto3 session.
* **region** (str | None) - Region name.

### Raises
* **botocore.exceptions.ClientError** - This exception is raised for AWS service related errors.
* **ConflictException** - There was a conflict when you attempted to modify a SageMaker entity such as an Experiment or Artifact.
* **ResourceNotFound** - Resource being accessed is not found. ### Return type None ``` -------------------------------- ### Setup SageMaker Session and Environment Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/training-examples/hyperparameter-training-example.ipynb Initializes the SageMaker session, retrieves the execution role and region, defines expected hyperparameters, and sets up a temporary directory for training artifacts. It also retrieves the default training image URI. ```python sagemaker_session = Session() role = get_execution_role() region = sagemaker_session.boto_region_name # Expected hyperparameters EXPECTED_HYPERPARAMETERS = { "integer": 1, "boolean": True, "float": 3.14, "string": "Hello World", "list": [1, 2, 3], "dict": { "string": "value", "integer": 3, "float": 3.14, "list": [1, 2, 3], "dict": {"key": "value"}, "boolean": True, }, } DEFAULT_CPU_IMAGE = image_uris.retrieve( framework="pytorch", region=region, version="2.0.0", py_version="py310", instance_type="ml.m5.xlarge", image_scope="training" ) # Create temporary directory temp_dir = tempfile.mkdtemp() source_dir = os.path.join(temp_dir, "source") os.makedirs(source_dir, exist_ok=True) print(f"Created temporary directory: {temp_dir}") ``` -------------------------------- ### Train and Deploy JumpStart Model with ModelBuilder Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/inference/index.md Train a JumpStart model using ModelTrainer.from_jumpstart_config() and then deploy it via ModelBuilder. Ensure necessary imports are present.
```python from sagemaker.train.model_trainer import ModelTrainer from sagemaker.serve.model_builder import ModelBuilder from sagemaker.core.jumpstart.configs import JumpStartConfig jumpstart_config = JumpStartConfig(model_id="huggingface-spc-bert-base-cased") trainer = ModelTrainer.from_jumpstart_config( jumpstart_config=jumpstart_config, base_job_name="js-training", hyperparameters={"epochs": 1}, ) trainer.train() model_builder = ModelBuilder(model=trainer, dependencies={"auto": False}) core_model = model_builder.build(model_name="bert-trained") endpoint = model_builder.deploy(endpoint_name="bert-endpoint") ``` -------------------------------- ### Install Additional Python Packages Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/example_notebooks/inference_and_resource_chaining.ipynb Installs or upgrades scikit-learn, pandas, and boto3 to their latest versions. ```bash !pip install -U scikit-learn pandas boto3 ``` -------------------------------- ### SageMaker Migration Tool Usage Examples Source: https://github.com/aws/sagemaker-python-sdk/blob/master/migration.md Examples of how to use the SageMaker migration tools for analyzing, transforming, and asking questions about V2 to V3 code migration. ```bash # Analyze V2 code Analyze this V2 code for migration: [paste your V2 code] # Transform V2 code to V3 Transform this V2 code to V3: [paste your V2 code] # Ask migration questions What is ModelTrainer in V3? How do I migrate Estimator to V3? ``` -------------------------------- ### Import SageMaker SDK and Initialize Session Source: https://github.com/aws/sagemaker-python-sdk/blob/master/v3-examples/model-customization-examples/rlaif_finetuning_example_notebook_v3_prod.ipynb Import necessary classes from the SageMaker SDK and initialize the SageMaker session. Configure the MLflow endpoint if needed for native metrics.
```python #!/usr/bin/env python3 from sagemaker.train.rlaif_trainer import RLAIFTrainer from sagemaker.train.configs import InputData from rich import print as rprint from rich.pretty import pprint from sagemaker.core.resources import ModelPackage import os #os.environ['SAGEMAKER_REGION'] = 'us-east-1' import boto3 from sagemaker.core.helper.session_helper import Session # For MLflow native metrics in Trainer wait, run the line below with the appropriate region os.environ["SAGEMAKER_MLFLOW_CUSTOM_ENDPOINT"] = "https://mlflow.sagemaker.us-west-2.app.aws" ``` -------------------------------- ### Verify HyperPod CLI Installation Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/model_customization/finetuning_hyperpod.md Checks if the HyperPod CLI is installed correctly by displaying its help message. ```bash hyperpod --help ``` -------------------------------- ### Deploy JumpStart Model with ModelBuilder Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/inference/index.md This snippet demonstrates deploying pre-trained models from the JumpStart hub using ModelBuilder.from_jumpstart_config(), specifying compute resources and model configuration.
```python import json from sagemaker.serve.model_builder import ModelBuilder from sagemaker.core.jumpstart.configs import JumpStartConfig from sagemaker.train.configs import Compute compute = Compute(instance_type="ml.g5.2xlarge") jumpstart_config = JumpStartConfig(model_id="huggingface-llm-falcon-7b-bf16") model_builder = ModelBuilder.from_jumpstart_config( jumpstart_config=jumpstart_config, compute=compute, ) core_model = model_builder.build(model_name="falcon-model") endpoint = model_builder.deploy(endpoint_name="falcon-endpoint") result = endpoint.invoke( body=json.dumps({"inputs": "What are falcons?", "parameters": {"max_new_tokens": 32}}), content_type="application/json" ) ``` -------------------------------- ### Install Pyenv Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/CONTRIBUTING.md Installs Pyenv, a tool for managing multiple Python versions, using a curl command. ```bash curl https://pyenv.run | bash ``` -------------------------------- ### Initialize SageMaker Session (V2 Legacy) Source: https://github.com/aws/sagemaker-python-sdk/blob/master/docs/ml_ops/lineage.md Sets up a SageMaker session and default bucket using V2 import paths. This code is for legacy projects. ```python import boto3 import sagemaker region = boto3.Session().region_name sagemaker_session = sagemaker.session.Session() default_bucket = sagemaker_session.default_bucket() ``` -------------------------------- ### start Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/api/sagemaker_core.md Starts a SageMaker PipelineExecution resource. This is the entry point for initiating a pipeline run with specified parameters. ```APIDOC ## start ### Description Start a PipelineExecution resource. ### Method Not explicitly defined, but implies an SDK method call.
### Parameters #### Path Parameters None #### Query Parameters None #### Request Body None ### Parameters * **pipeline_name** (str | PipelineVariable) - Required - The name of the pipeline to start. * **client_request_token** (str | PipelineVariable) - Required - A unique, case-sensitive identifier that you provide to ensure the idempotency of the operation. * **pipeline_parameters** (List[Parameter] | None) - Optional - A list of parameters to use for the pipeline execution. * **mlflow_experiment_name** (str | PipelineVariable | None) - Optional - The MLflow experiment name to associate with the pipeline execution. * **session** (boto3.session.Session | None) - Optional - Boto3 session. * **region** (str | None) - Optional - Region name. ### Raises * **botocore.exceptions.ClientError** - This exception is raised for AWS service related errors. * **ConflictException** - There was a conflict when you attempted to modify a SageMaker entity. * **ResourceLimitExceeded** - You have exceeded a SageMaker resource limit. * **ResourceNotFound** - Resource being accessed is not found. ``` -------------------------------- ### Create Training Job: Boto3 vs SageMaker Core Source: https://github.com/aws/sagemaker-python-sdk/blob/master/sagemaker-core/docs/sagemaker_core/index.md Compares the traditional Boto3 approach with the simplified SageMaker Core approach for creating a training job. SageMaker Core requires fewer parameters due to intelligent defaults. ```python import boto3 client = boto3.client('sagemaker') response = client.create_training_job( TrainingJobName='my-training-job', RoleArn='arn:aws:iam::123456789012:role/SageMakerRole', InputDataConfig=[ { 'ChannelName': 'training', 'DataSource': { 'S3DataSource': { 'S3DataType': 'S3Prefix', 'S3Uri': 's3://my-bucket/train', 'S3DataDistributionType': 'FullyReplicated' } } } ], # ...
many more required parameters ) ``` ```python from sagemaker.core.resources import TrainingJob from sagemaker.core.shapes import TrainingJobConfig training_job = TrainingJob.create( training_job_name="my-training-job", role_arn="arn:aws:iam::123456789012:role/SageMakerRole", input_data_config=[ { "channel_name": "training", "data_source": "s3://my-bucket/train" } ] ) ```
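--------------------------------

### Sketch: Relating SageMaker Core Argument Names to Boto3 Keys

The two snippets above name the same fields differently: Boto3 uses PascalCase keys (`TrainingJobName`, `RoleArn`), while SageMaker Core uses snake_case keyword arguments (`training_job_name`, `role_arn`). The sketch below illustrates that naming correspondence with plain Python only. `snake_to_pascal` and `to_boto3_keys` are hypothetical helpers written for this illustration and are not part of either SDK. The conversion is assumed to be a simple capitalize-and-join rule, which may not hold for every field in the real API, and it does not expand shorthand values such as the string `data_source` shown above.

```python
def snake_to_pascal(name: str) -> str:
    """Convert a snake_case name to PascalCase, e.g. role_arn -> RoleArn."""
    return "".join(part.capitalize() for part in name.split("_"))


def to_boto3_keys(value):
    """Recursively rename dict keys from snake_case to PascalCase.

    Hypothetical helper for illustration only; values are left unchanged.
    """
    if isinstance(value, dict):
        return {snake_to_pascal(k): to_boto3_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [to_boto3_keys(item) for item in value]
    return value  # strings, numbers, and other scalars pass through


# Keyword arguments in the SageMaker Core naming style
core_style = {
    "training_job_name": "my-training-job",
    "role_arn": "arn:aws:iam::123456789012:role/SageMakerRole",
    "input_data_config": [{"channel_name": "training"}],
}

# The same data with Boto3-style key names
boto3_style = to_boto3_keys(core_style)
print(boto3_style)
```

Only the key names are transformed here; whether a given Boto3 request accepts the resulting structure still depends on the service's required parameters.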