PyTorch Models
PyTorch models in Chariot are packaged as TorchServe Model Archive (`.mar`) files. This format allows you to deploy custom PyTorch models with custom preprocessing and postprocessing logic.
File Format Requirements
PyTorch models require a `.mar` (Model Archive) file containing:

- Model weights (`.pth` file)
- Model architecture definition (`model.py`)
- Request handler (`handler.py`)
- Dependencies (`model_requirements.txt`)
Quick Import Example
To import a PyTorch model into Chariot, a `.mar` file is required; instructions for creating `.mar` archive files are provided below. The following example shows how to import a PyTorch model once you have a `.mar` file:
import chariot.client
from chariot.models import import_model, ArtifactType, TaskType
# This is a path to the `.mar` file, or a directory or tar file containing the .mar file.
model_path = "path/to/torch/model.tar.gz"
# Define mapping of class labels to integer output.
class_labels = {"dog": 0}
chariot.client.connect()
# This will create a new model entry in the catalog, at the project and name specified.
model = import_model(
    name="<NAME OF MODEL>",
    # One of `project_id` or `project_name` is required.
    project_id="<PROJECT ID>",
    project_name="<PROJECT NAME>",
    version="<MODEL VERSION>",
    summary="testing pytorch model import",
    task_type=TaskType.OBJECT_DETECTION,
    artifact_type=ArtifactType.PYTORCH,
    model_path=model_path,
    class_labels=class_labels,
)
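Because `model_path` may also point to a tar file containing the `.mar`, a small helper like the one below can bundle the `mar/` directory produced by the Makefile in the next section into such an archive. This is a minimal sketch, not part of the Chariot SDK; the function name and paths are illustrative.

import tarfile

def tar_mar_directory(mar_dir: str = "mar", out_path: str = "model.tar.gz") -> str:
    """Bundle the directory containing the .mar file into a gzipped tar archive."""
    with tarfile.open(out_path, "w:gz") as tf:
        tf.add(mar_dir, arcname=".")
    return out_path

The returned path can then be passed directly as `model_path` in the example above.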
Creating Model Archive Files (`.mar`) for Chariot
This section explains how to build `.mar` files for Chariot. The `.mar` format is what the TorchServe runtime uses to serve PyTorch models for inference. However, it is not limited to PyTorch models: a `.mar` file can be modified to serve any kind of model (see the subsection below on MAR hacking).
To begin, copy the Makefile below into an empty directory and run `make install-mar-reqs`.
# MODEL_NAME can be whatever you want
MODEL_NAME:=my-model
# Path to your weights file
MODEL_WEIGHTS:=my_model_weights.pth

install-mar-reqs:
	pip install torch-model-archiver

install-deps:
	mkdir -p pkg
	pip install -r model_requirements.txt -t pkg

$(MODEL_NAME).mar:
	tar -czvf pkg.tar.gz ./pkg
	torch-model-archiver --model-name ${MODEL_NAME} --version 1.0 --model-file model.py --serialized-file ${MODEL_WEIGHTS} --handler handler.py --extra-files "pkg.tar.gz"
	rm pkg.tar.gz
	mkdir -p mar
	mv ${MODEL_NAME}.mar mar

$(MODEL_NAME)_hack.mar:
	tar -czvf pkg.tar.gz ./pkg
	torch-model-archiver --model-name ${MODEL_NAME} --version 1.0 --model-file model.py --handler handler.py --extra-files "pkg.tar.gz"
	rm pkg.tar.gz
	mkdir -p mar
	mv ${MODEL_NAME}.mar mar
You can change `MODEL_NAME` and `MODEL_WEIGHTS` in the Makefile to match your desired model name and weights file.
In this directory, you will need to provide:

- Source code for your model's architecture, placed into a directory called `pkg` (for example, `pkg/my_src_code`). Anything in `pkg` will be automatically accessible for you in your model and handler files (see below).
- A model weights file (for example, `my_model_weights.pth`).
To make your `.mar` file, you will also need to create three files called `model.py`, `handler.py`, and `model_requirements.txt` and place them in your current directory (these three files are described below). At this point, your current directory should look something like this:
.
├── Makefile
├── handler.py
├── model.py
├── model_requirements.txt
├── my_model_weights.pth
└── pkg
    └── my_src_code
        ├── module1.py
        ├── module2.py
        └── utils.py
Then, to create the `.mar` file:

- `make install-deps`: installs your model's external dependencies, specified by `model_requirements.txt`, into `pkg`.
- `make my-model.mar` (you can rename the model by editing the `Makefile`): creates a tarball of `pkg` and includes it in the creation of the `.mar` file. Note that this command assumes your model weights file is called `my_model_weights.pth`, so either rename your weights file to match or set `MODEL_WEIGHTS` in the Makefile to the appropriate file name.
After this is complete, you will find your `.mar` file in the `mar` directory.
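If you want to sanity-check what the archiver packed, a `.mar` built with the default archive format is a zip file under the hood, so a short sketch like this can list its contents (the path below assumes the default `MODEL_NAME` of `my-model`):

import zipfile

# List everything torch-model-archiver packed into the archive.
with zipfile.ZipFile("mar/my-model.mar") as mar:
    for name in mar.namelist():
        print(name)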
model.py
This should have exactly one class definition that specifies the model architecture. A good way to do this without modifying your source code (e.g., `pkg/my_src_code`) is to make a pass-through class like this example:
from my_src_code.module1 import MyModel
class MyModel_copy(MyModel):
    def __init__(self):
        # Initialize the parent model with any required parameters.
        super().__init__(
            arg1,
            arg2,
            kwarg1=kwarg1,
            kwarg2=kwarg2,
            ...
        )
The name of this class does not matter. Note that the `__init__` of this class cannot have any positional arguments and should call the `__init__` of the parent class (which can, of course, have positional arguments).
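As a concrete illustration, here is a hedged sketch of what such a pass-through class might look like if your architecture were torchvision's ResNet; the layer configuration and class count are assumptions, so substitute the constructor arguments your own model requires.

from torchvision.models.resnet import ResNet, BasicBlock

class ResNet18Classifier(ResNet):
    def __init__(self):
        # No positional arguments here; the parent __init__ receives the
        # ResNet-18 layer configuration and an assumed two-class output.
        super().__init__(BasicBlock, [2, 2, 2, 2], num_classes=2)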
Serving Other Runtimes (aka MAR Hacking)
Since TorchServe is a PyTorch runtime, whatever class you define in `model.py` needs to eventually inherit from `torch.nn.Module`. If your model does not inherit from this class, you can do the following workaround for your `model.py`:
from my_src_code.module1 import initialize_my_model
from torch import nn
class MyModel_copy(nn.Module):
    def __init__(self):
        """
        Manually instantiate your model.
        When you refer to your weights, assume they are in `pkg`.
        For example, that might look like:
        `self.model = initialize_my_model(weights="pkg/my_weights.pth")`.
        """
        super().__init__()  # required to come first
        self.model = initialize_my_model(
            arg1,
            arg2,
            kwarg1=kwarg1,
            kwarg2=kwarg2,
            ...
        )

    def forward(self, *args, **kwargs):
        """
        The inference call. Edit as necessary.
        """
        return self.model.forward(*args, **kwargs)

    def to(self, *args, **kwargs):
        """
        Method for sending your model to the device (CPU, CUDA, etc.).
        Edit this as necessary. If your model doesn't need to run on GPU,
        then you can leave this method blank.
        """
        ...

    def eval(self, *args, **kwargs):
        """
        Method for putting your model in eval mode.
        Edit this as necessary.
        """
        ...
In this case, we have forced `MyModel_copy` to inherit from `nn.Module` and endowed it with a `self.model` attribute that is the actual underlying model. This can be any model you like (TensorFlow, sklearn, etc.). You must also implement the `forward()` (which backs `.__call__()`), `.to()`, and `.eval()` methods for this class; these tell the runtime how to perform inference, send your model to a GPU, and place it into eval mode. A concrete sketch of such a wrapper is shown after the steps below.
If you are using this workaround, follow these steps:

- Place the weights in `pkg` (this wasn't necessary for PyTorch runtimes because the `make my-model.mar` command specified the weights directly).
- To make the `.mar` file, use `make my-model_hack.mar` rather than `make my-model.mar`.
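For example, a scikit-learn classifier could be served this way. The sketch below is illustrative only: the pickle path, the `joblib` dependency, and the `predict_proba` call are assumptions about your model, not requirements of Chariot or TorchServe.

import joblib
import torch
from torch import nn

class SklearnWrapper(nn.Module):
    def __init__(self):
        super().__init__()  # required to come first
        # Weights live in pkg/ because the _hack target does not pass --serialized-file.
        self.model = joblib.load("pkg/sklearn_model.pkl")

    def forward(self, x):
        # Convert the incoming tensor to numpy, run sklearn inference,
        # and hand back a tensor so postprocessing stays uniform.
        probs = self.model.predict_proba(x.detach().cpu().numpy())
        return torch.as_tensor(probs)

    def to(self, *args, **kwargs):
        # The sklearn model runs on CPU only, so there is nothing to move.
        return self

    def eval(self, *args, **kwargs):
        # sklearn models have no train/eval distinction.
        return self

If you use something like this, remember to list `scikit-learn` and `joblib` in `model_requirements.txt` so they get installed into `pkg`.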
handler.py
This file specifies how to preprocess inference requests and postprocess model outputs. It should also have exactly one class defined, and that class should inherit from `ts.torch_handler.base_handler.BaseHandler`. If your task is a computer vision task (i.e., your inference requests will be sending images), use `ts.torch_handler.vision_handler.VisionHandler` as your base.
The `handler.py` should take the following form:
from ts.torch_handler.vision_handler import VisionHandler
import torchvision.transforms as T
from PIL import Image
import tarfile
import os
import sys
# Unpack `pkg` and add it to the path, which allows us to import dependencies from there.
try:
    with tarfile.open("pkg.tar.gz") as tf:
        tf.extractall(".")
except FileNotFoundError:
    pass

current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path = [current_dir + "/pkg"] + sys.path


class ChariotClassificationHandler(VisionHandler):
    # Required to provide.
    image_processing = T.Compose([
        T.Resize(256),
        T.ToTensor(),
    ])

    def preprocess(self, data):
        """Preprocessing step (optional)."""
        # Use the base class preprocess first. This converts your payload to a
        # PIL.Image and then applies `image_processing` above.
        data = super().preprocess(data)
        # data = some_stuff(data)  # do your own preprocessing optionally
        return data

    def inference(self, data, *args, **kwargs):
        """Inference step (optional). Assume `data` is the output of `self.preprocess`."""
        data = super().inference(data, *args, **kwargs)  # calls model.__call__
        # data = some_stuff(data)  # do your own stuff optionally
        return data

    def postprocess(self, data):
        """
        Postprocessing step (optional). Assume `data` is the output of `self.inference`.
        The return value should be JSON serializable.
        """
        data = super().postprocess(data)
        # data = some_stuff(data)  # do your own postprocessing optionally
        return data
You should change the `preprocess`, `inference`, and `postprocess` methods as needed to properly execute your pipeline.
The base preprocessor handles PyTorch transforms (such as resize or normalization) using the `image_processing` attribute shown above. It should include at least the `T.ToTensor()` transform.
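As a worked example, here is a hedged sketch of a handler that overrides `postprocess` to turn raw logits into a JSON-serializable top-1 result per image. The normalization statistics and crop size are assumptions about your training pipeline, and `TopKClassificationHandler` is an illustrative name rather than anything Chariot requires.

import torch
import torchvision.transforms as T
from ts.torch_handler.vision_handler import VisionHandler

class TopKClassificationHandler(VisionHandler):
    # Assumed ImageNet-style preprocessing; match your model's training transforms.
    image_processing = T.Compose([
        T.Resize(256),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    def postprocess(self, data):
        # `data` is the batch of raw logits returned by `inference`; convert it
        # to per-image probabilities and return one JSON-serializable dict per image.
        probs = torch.softmax(data, dim=1)
        scores, indices = probs.max(dim=1)
        return [
            {"label_index": int(i), "score": float(s)}
            for i, s in zip(indices.tolist(), scores.tolist())
        ]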
model_requirements.txt
This is a regular `requirements.txt` file for any external dependencies that your model uses.
When running `make install-deps`, depending on what is in your requirements file, large packages (such as `torch`) are often installed into `pkg` unnecessarily, since TorchServe runs in an environment where `torch`, `torchvision`, etc., are already installed. So, after running `make install-deps`, you can remove any redundant large packages (usually `torch`, `torchvision`, and `nvidia`) that get put into `pkg`, which will reduce your final `.mar` file size.
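A small cleanup sketch like the following could remove those directories after `make install-deps`; the glob patterns are assumptions, so inspect what actually landed in `pkg` before deleting anything.

import shutil
from pathlib import Path

# Remove packages that the TorchServe environment already provides.
# "torch*" covers both torch and torchvision; adjust the patterns to taste.
for pattern in ("torch*", "nvidia*"):
    for path in Path("pkg").glob(pattern):
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()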