MLflow in SageMaker¶
MLflow Capabilities¶
SageMaker features a capability called Bring Your Own Container (BYOC), which allows you to run custom Docker containers on the inference endpoint. These containers must meet specific requirements, such as running a web server that exposes certain REST endpoints, having a designated container entrypoint, setting environment variables, and so on. Writing a Dockerfile and serving script that meet these requirements can be a tedious task.
How does MLflow integrate with S3 and ECR?¶
MLflow automates this process by building a Docker image from the MLflow Model on your behalf. It then pushes the image to Amazon Elastic Container Registry (ECR) and creates a SageMaker endpoint using this image. It also uploads the model artifact to an S3 bucket and configures the endpoint to download the model from there.
The container provides the same REST endpoints as a local inference server. For instance, the /invocations endpoint accepts CSV and JSON input data and returns prediction results.
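As a minimal sketch of the two input formats, the snippet below builds equivalent CSV and JSON request bodies for a model with two hypothetical feature columns, "a" and "b":

```python
import json

# Two equivalent request bodies for the /invocations endpoint.
# CSV input, sent with Content-Type: text/csv:
csv_body = "a,b\n1,2\n"

# JSON input in pandas-split orientation, sent with
# Content-Type: application/json; format=pandas-split:
json_body = json.dumps({"columns": ["a", "b"], "data": [[1, 2]]})

print(csv_body)
print(json_body)
```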
Step 1. Run model locally¶
It’s recommended to test your model locally before deploying it to a production environment. The mlflow deployments run-local command deploys the model in a Docker container with an identical image and environment configuration, making it ideal for pre-deployment testing.
$ mlflow deployments run-local -t sagemaker --name local-test -m runs:/<run_id>/model -C port=5000
You can then test the model by sending a POST request to the endpoint:
$ curl -X POST -H "Content-Type:application/json; format=pandas-split" --data '{"columns":["a","b"],"data":[[1,2]]}' http://localhost:5000/invocations
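The same request can be sent from Python. This is a sketch using only the standard library; the URL and payload mirror the curl example above, and the final call is commented out because it requires the local scoring server to be running:

```python
import json
from urllib import request

INVOCATIONS_URL = "http://localhost:5000/invocations"  # port from run-local above

def build_request(payload):
    """Encode a pandas-split payload as an HTTP POST to /invocations."""
    return request.Request(
        INVOCATIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; format=pandas-split"},
    )

def score(payload):
    """Send the payload to the local server and return the decoded prediction."""
    with request.urlopen(build_request(payload)) as resp:
        return json.loads(resp.read())

# With the run-local server running:
# print(score({"columns": ["a", "b"], "data": [[1, 2]]}))
```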
Step 2. Build a Docker Image and Push to ECR¶
The mlflow sagemaker build-and-push-container command builds a Docker image compatible with SageMaker and uploads it to ECR.
$ mlflow sagemaker build-and-push-container
Step 3. Deploy to SageMaker Endpoint¶
The mlflow deployments create command deploys the model to an Amazon SageMaker endpoint. MLflow uploads the Python Function model to S3 and automatically initiates an Amazon SageMaker endpoint serving the model.
$ mlflow deployments create -t sagemaker --name <deployment-name> -m runs:/<run_id>/model \
-C region_name=<your-region> \
-C instance_type=ml.m4.xlarge \
-C instance_count=1 \
-C env='{"DISABLE_NGINX": "true"}'
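Once the endpoint is live, you can invoke it from Python. Below is a sketch that builds the pandas-split request body and shows (commented out, since it needs AWS credentials and the boto3 package) how to send it with the SageMaker Runtime client; the endpoint name is a placeholder for the deployment name you chose above:

```python
import json

def make_body(columns, rows):
    """Build the pandas-split JSON body the MLflow scoring container expects."""
    return json.dumps({"columns": columns, "data": rows})

body = make_body(["a", "b"], [[1, 2]])

# Invoking the live endpoint requires AWS credentials and boto3:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime", region_name="<your-region>")
#   resp = runtime.invoke_endpoint(
#       EndpointName="<deployment-name>",  # placeholder for your deployment name
#       ContentType="application/json; format=pandas-split",
#       Body=body,
#   )
#   print(resp["Body"].read())
print(body)
```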