Scikit-learn Models
Scikit-learn models are traditional machine learning models well suited to structured data tasks such as classification and regression. These models are typically smaller, faster to train, and require fewer computational resources than deep learning models.
File Format Requirements
Scikit-learn models can be provided as:
- Python object: Direct model_object parameter (for trained models in memory)
- Joblib file: .joblib serialized model file
- Archive: .tar.gz or directory containing a .joblib file
Import Example
Importing a scikit-learn classification model trained on the classic Iris dataset may look like the following:
from chariot.models import import_model, ArtifactType, TaskType
# Define mapping of class labels to integer output.
class_labels = {"Setosa": 0, "Versicolour": 1, "Virginica": 2}
# Provide user-friendly names and descriptions of the input features.
input_info = [
    {
        "name": "Sepal length (cm)",
        "description": "Length of the iris' sepals. Sepals are the leaf-like structures surrounding the petals.",
    },
    {
        "name": "Sepal width (cm)",
        "description": "Width of the iris' sepals. Sepals are the leaf-like structures surrounding the petals.",
    },
    {
        "name": "Petal length (cm)",
        "description": "Length of the iris' petals.",
    },
    {
        "name": "Petal width (cm)",
        "description": "Width of the iris' petals.",
    },
]
# This will create a new model entry in the catalog, at the project and name specified.
model = import_model(
    name="<NAME OF MODEL>",
    # One of `project_id` or `project_name` is required.
    project_id="<PROJECT ID>",
    project_name="<PROJECT NAME>",
    version="<MODEL VERSION>",
    artifact_type=ArtifactType.SKLEARN,
    task_type=TaskType.STRUCTURED_DATA_CLASSIFICATION,
    class_labels=class_labels,
    summary="testing scikit-learn model import",
    input_info=input_info,
    model_object=sklearn_model,
)
where model_object is a fitted model or pipeline, e.g., an object of type RandomForestClassifier, LogisticRegression, or Pipeline.
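For context, here is a minimal sketch of how a fitted object such as sklearn_model might be produced. It assumes scikit-learn is installed. The choice of RandomForestClassifier and its hyperparameters is purely illustrative and is not something the import requires.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the Iris data: four numeric features per row, integer targets 0, 1, 2.
iris = load_iris()
X, y = iris.data, iris.target

# Illustrative estimator choice (an assumption); a LogisticRegression or a
# Pipeline would be fitted the same way.
sklearn_model = RandomForestClassifier(n_estimators=50, random_state=0)
sklearn_model.fit(X, y)

# The model's integer outputs are the values that class_labels maps names to.
print(sklearn_model.classes_)
```

The order of the four feature columns in X matches the order of the entries in input_info above (sepal length, sepal width, petal length, petal width).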
You can alternatively store your scikit-learn model as a model.joblib file and upload it with the model_path=path_to_file argument. A directory or .tar.gz containing the model.joblib is also accepted.
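The file-based route could be prepared roughly as follows. This is a sketch under stated assumptions: joblib and scikit-learn are installed, the LogisticRegression model is only a stand-in for your own estimator, the names work_dir, model_file, and archive_file are hypothetical, and placing model.joblib at the root of the archive is a guess about the expected layout rather than a documented requirement.

```python
import tarfile
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Fit a small illustrative model (stand-in for your own trained estimator).
X, y = load_iris(return_X_y=True)
sklearn_model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize the fitted model to model.joblib in a scratch directory.
work_dir = Path(tempfile.mkdtemp())
model_file = work_dir / "model.joblib"
joblib.dump(sklearn_model, model_file)

# Optionally bundle the file into a .tar.gz archive.
# Assumption: model.joblib sits at the archive root.
archive_file = work_dir / "model.tar.gz"
with tarfile.open(archive_file, "w:gz") as tar:
    tar.add(model_file, arcname="model.joblib")

# Sanity check: the reloaded model reproduces the original predictions.
reloaded = joblib.load(model_file)
assert (reloaded.predict(X) == sklearn_model.predict(X)).all()
```

Either path could then be supplied as model_path in the import_model call in place of model_object. Converting it with str(model_file) may be the safer choice if it is unclear whether Path objects are accepted.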