chariot.training_v2 package
Submodules
chariot.training_v2.blueprint module
- chariot.training_v2.blueprint.lookup_blueprint_id(name: str, version_ilike: str | None = None) -> str
Returns the id of the blueprint specified by the arguments
Parameters
- name: str
Name of the blueprint. If version_ilike is not provided, the latest version of the blueprint with this name is used.
- version_ilike: str | None
A SQL ILIKE pattern for matching a blueprint version. If multiple blueprints match the given version, the most recent one is used.
Returns
str : id of the blueprint
Raises
- BlueprintDoesNotExistError
If the blueprint does not exist or has been deleted, this will be raised.
- ValueError
If name is not provided.
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
chariot.training_v2.checkpoint module
- class chariot.training_v2.checkpoint.Checkpoint(*, id: str | None = None, run_id: str | None = None, global_step: int | None = None, project_id: str | None = None, created_at: datetime | None = None, status: str | None = None, status_updated_at: datetime | None = None, bucket_name: str | None = None, key_prefix: str | None = None)
Bases:
BaseModelWithDatetime
- bucket_name: str | None
- create_model(*, name: str, version: str, summary: str, project_id: str | None = None) -> str
Create a model from this checkpoint
Parameters
- name: str
The name to give the model
- version: str
The version to give the model. Must be in SemVer format
- summary: str
A short summary of the model
- project_id: Optional[str]
The ID of the project to create the model in. If omitted, the project ID of the associated run will be used.
Returns
- model_id: str
The ID of the created model
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
- created_at: datetime | None
- global_step: int | None
- id: str | None
- key_prefix: str | None
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- project_id: str | None
- run_id: str | None
- status: str | None
- status_updated_at: datetime | None
- chariot.training_v2.checkpoint.create_model_from_checkpoint(*, checkpoint_id: str, name: str, version: str, summary: str, project_id: str | None = None) -> str
Create a model from a checkpoint
Parameters
- checkpoint_id: str
The ID of the checkpoint to create the model from
- name: str
The name to give the model
- version: str
The version to give the model. Must be in SemVer format
- summary: str
A short summary of the model
- project_id: Optional[str]
The ID of the project to create the model in. If omitted, the project ID of the associated run will be used.
Returns
- model_id: str
The ID of the created model
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
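Because the version must be in SemVer format, it can help to check the string locally before making the request. The following is a minimal sketch, assuming a simplified SemVer 2.0.0 pattern; the helper name `is_semver` is hypothetical and not part of the SDK, and the server's exact validation rules are not stated here, so they may be stricter or looser.

```python
import re

# Simplified SemVer 2.0.0 pattern: MAJOR.MINOR.PATCH with optional
# pre-release ("-rc.1") and build metadata ("+build.5") suffixes.
# Unlike the full specification, it does not reject leading zeros
# inside numeric pre-release identifiers.
_SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?"
    r"(?:\+[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*)?"
)


def is_semver(version: str) -> bool:
    """Return True if the whole string looks like a SemVer version."""
    return _SEMVER.fullmatch(version) is not None
```

A caller could run `is_semver(version)` before passing `version` to `create_model_from_checkpoint` or `Checkpoint.create_model`, and fail early with a local error message.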
- chariot.training_v2.checkpoint.delete_checkpoints(*, ids: list[str] | None = None, run_ids: list[str] | None = None) -> None
Delete checkpoints matching the provided filters
Parameters
- ids: Optional[List[str]]
If specified, filter to checkpoints with any of the given Checkpoint IDs. Note: either ids or run_ids must be specified, in order to prevent accidental deletion of all checkpoints.
- run_ids: Optional[List[str]]
If specified, filter to checkpoints with any of the given Run IDs. Note: either ids or run_ids must be specified, in order to prevent accidental deletion of all checkpoints.
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
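The requirement that at least one filter be given can also be enforced on the caller's side before any request is sent. This is a minimal sketch; `checked_delete_filters` is a hypothetical helper, and treating an empty list the same as a missing filter is a conservative assumption rather than documented SDK behavior.

```python
from typing import Dict, List, Optional


def checked_delete_filters(
    ids: Optional[List[str]] = None,
    run_ids: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    """Return keyword arguments for a delete call, refusing an empty filter set."""
    # Assumption: an empty list is treated like "not specified" so that a
    # bug producing [] cannot turn into an unfiltered delete.
    if not ids and not run_ids:
        raise ValueError("specify at least one of ids or run_ids")
    filters: Dict[str, List[str]] = {}
    if ids:
        filters["ids"] = list(ids)
    if run_ids:
        filters["run_ids"] = list(run_ids)
    return filters
```

The returned dictionary can be unpacked into the call, for example `delete_checkpoints(**checked_delete_filters(run_ids=[run_id]))`.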
- chariot.training_v2.checkpoint.download_checkpoint(id: str, file_dir: str) -> None
Download checkpoint artifacts
Parameters
- id: str
The ID of the checkpoint to download
- file_dir: str
The directory for the downloaded checkpoint artifacts. The directory must already exist.
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
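Since the target directory must exist before the download starts, a caller may want to create it first. A minimal sketch using only the standard library; the helper name `prepare_download_dir` and the one-subdirectory-per-checkpoint layout are illustrative assumptions.

```python
from pathlib import Path


def prepare_download_dir(base_dir: str, checkpoint_id: str) -> str:
    """Create (if needed) and return a directory for one checkpoint's artifacts."""
    target = Path(base_dir) / checkpoint_id
    # parents=True creates missing intermediate directories;
    # exist_ok=True makes the call safe to repeat.
    target.mkdir(parents=True, exist_ok=True)
    return str(target)
```

The returned path can then be passed as `file_dir`, for example `download_checkpoint(id=checkpoint_id, file_dir=prepare_download_dir("checkpoints", checkpoint_id))`.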
- chariot.training_v2.checkpoint.get_checkpoints(*, ids: list[str] | None = None, run_ids: list[str] | None = None, project_ids: list[str] | None = None, global_steps: list[int] | None = None, statuses: list[Literal['incomplete', 'complete']] | None = None, created_before: datetime | None = None, created_after: datetime | None = None, select: list[Literal['id', 'created_at', 'run_id', 'project_id', 'global_step', 'status', 'status_updated_at']] | None = None, sort: list[Literal['id:asc', 'id:desc', 'created_at:asc', 'created_at:desc']] | None = None, limit: int | None = None, offset: int | None = None) -> list[Checkpoint]
Get checkpoints matching the provided filters
Parameters
- ids: Optional[List[str]]
If specified, filter to checkpoints with any of the given IDs
- run_ids: Optional[List[str]]
If specified, filter to checkpoints for any of the given Run Ids
- project_ids: Optional[List[str]]
If specified, filter to checkpoints with any of the given Project Ids
- global_steps: Optional[List[int]]
If specified, filter to checkpoints from any of the given Global Steps
- statuses: Optional[List[Literal["incomplete", "complete"]]]
If specified, filter to checkpoints with a specific status.
- created_before: Optional[datetime]
If specified, filter to checkpoints created before the given date and time. This can be used for keyset pagination
- created_after: Optional[datetime]
If specified, filter to checkpoints created after the given date and time. This can be used for keyset pagination
- select: Optional[List[Literal["id", "created_at", "run_id", "project_id", "global_step", "status", "status_updated_at"]]]
If specified, only the given fields are included in the response.
- sort: Optional[List[Literal["id:asc", "id:desc", "created_at:asc", "created_at:desc"]]]
Sort by the given fields in the given directions. Default: "created_at:desc"
- limit: Optional[int]
Limit the response to the given number of checkpoints. Default: 10
- offset: Optional[int]
Offset based pagination. Default: 0
Returns
- checkpoints: List[Checkpoint]
The checkpoints matching the filter criteria
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
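The keyset pagination mentioned for created_before can be sketched as a generator that feeds the oldest timestamp of each page back in as the next cursor. This is an illustrative sketch, not SDK code: `iter_checkpoints`, `FakeCheckpoint`, and `make_fake_fetch` are hypothetical names, and the loop assumes that created_before excludes the given timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterator, List, Optional


@dataclass
class FakeCheckpoint:
    """Stand-in carrying only the fields the cursor logic needs."""
    id: str
    created_at: datetime


def iter_checkpoints(fetch: Callable[..., list], page_size: int = 10) -> Iterator:
    """Yield every item, newest first, using created_before as the cursor."""
    cursor: Optional[datetime] = None
    while True:
        page = fetch(created_before=cursor, sort=["created_at:desc"], limit=page_size)
        if not page:
            return
        yield from page
        # Assumes created_before is exclusive. Rows that share the cursor
        # timestamp exactly would be skipped by this simple scheme.
        cursor = page[-1].created_at


def make_fake_fetch(rows: List[FakeCheckpoint]) -> Callable[..., list]:
    """Local imitation of a filtered, sorted, limited listing call."""
    def fetch(*, created_before=None, sort=None, limit=10):
        ordered = sorted(rows, key=lambda r: r.created_at, reverse=True)
        if created_before is not None:
            ordered = [r for r in ordered if r.created_at < created_before]
        return ordered[:limit]
    return fetch


start = datetime(2024, 1, 1)
rows = [FakeCheckpoint(id=str(i), created_at=start + timedelta(minutes=i)) for i in range(25)]
collected = list(iter_checkpoints(make_fake_fetch(rows), page_size=10))
```

With the SDK installed, `get_checkpoints` accepts the same `created_before`, `sort`, and `limit` keyword arguments, so it could be passed as `fetch` in place of the fake.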
chariot.training_v2.exceptions module
- exception chariot.training_v2.exceptions.ApiException(status=None, reason=None, http_resp=None, *, body: str | None = None, data: Any | None = None)
Bases:
OpenApiException
- exception chariot.training_v2.exceptions.BlueprintDoesNotExistError(id: str | None = None, name: str | None = None, version_ilike: str | None = None)
Bases:
Exception
chariot.training_v2.run module
Training run management.
- class chariot.training_v2.run.Event(*, id: str, sequence: int, run_id: str, created_at: datetime, status: str, details: dict)
Bases:
BaseModelWithDatetime
Training run event.
- created_at: datetime
- details: dict
- id: str
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- run_id: str
- sequence: int
- status: str
- class chariot.training_v2.run.Gpu(*, count: int, type: str)
Bases:
BaseModelWithDatetime
Gpu resource metadata.
All available GPU types can be found by calling the function chariot.system_resources.get_available_system_gpus.
- count: int
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- type: str
- class chariot.training_v2.run.Metric(*, id: str, created_at: datetime, run_id: str, global_step: int, tag: str, value: float | int, job_id: str | None = None)
Bases:
BaseModelWithDatetime
Training run metric.
- created_at: datetime
- global_step: int
- id: str
- job_id: str | None
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- run_id: str
- tag: str
- value: float | int
- class chariot.training_v2.run.Progress(*, operation: str, value: float | int, final_value: float | int, units: str)
Bases:
BaseModelWithDatetime
Training run progress.
- final_value: float | int
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- operation: str
- units: str
- value: float | int
- class chariot.training_v2.run.Resources(*, cpu: str, memory: str, ephemeral_storage: str | None = None, gpu: Gpu | None = None)
Bases:
BaseModelWithDatetime
Training run scheduling resources.
These values represent Kubernetes resources that will be allocated for a training run.
- Example values:
    cpu: "1"
    cpu: "500m"
    memory: "5Gi" # gibibytes (2^30 bytes each)
    memory: "5000000Ki" # kibibytes (2^10 bytes each)
    ephemeral_storage: "20Gi"
- cpu: str
- ephemeral_storage: str | None
- memory: str
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
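To make the Kubernetes quantity strings used by Resources concrete, the sketch below converts a CPU quantity to cores and a memory quantity to bytes. It is an illustration only: `parse_cpu` and `parse_bytes` are hypothetical helpers, not SDK functions, and they cover only the common suffixes (milli-CPU `m`, binary `Ki`..`Ei`, decimal `k`..`E`).

```python
from decimal import Decimal

# Binary (power-of-two) and decimal (power-of-ten) suffixes
# used by Kubernetes resource quantities.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40, "Pi": 2**50, "Ei": 2**60}
_DECIMAL = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12, "P": 10**15, "E": 10**18}


def parse_cpu(quantity: str) -> Decimal:
    """Convert a CPU quantity such as "500m" or "1" to a number of cores."""
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000  # millicores
    return Decimal(quantity)


def parse_bytes(quantity: str) -> int:
    """Convert a memory or storage quantity such as "5Gi" to bytes."""
    # Two-letter binary suffixes are checked first so "Gi" is not read as "G".
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    for suffix, factor in _DECIMAL.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(Decimal(quantity))  # plain number of bytes
```

Note that "5Gi" and "5000000Ki" are close but not equal: the first is 5,368,709,120 bytes and the second is 5,120,000,000 bytes.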
- class chariot.training_v2.run.Run(*, id: str | None = None, name: str | None = None, version: str | None = None, created_at: datetime | None = None, blueprint_id: str | None = None, project_id: str | None = None, user_id: str | None = None, progress: list[Progress] | None = None, progress_updated_at: datetime | None = None, status: str | None = None, status_updated_at: datetime | None = None, task_type: str | None = None, resources: Resources | None = None, config: dict | None = None, notes: str | None = None)
Bases:
BaseModelWithDatetime
Training run.
Use chariot.training_v2.run.Run.from_id() to get a run by id, or chariot.training_v2.run.get_runs() to look up runs by name, version, etc.
Fields marked Optional should be included by default, but might be missing if a select filter is applied to chariot.training_v2.run.get_runs().
- blueprint_id: str | None
- config: dict | None
- created_at: datetime | None
- delete()
Delete this run.
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
- get_all_metrics() -> list[Metric]
Get all metrics for this run.
Sort order is unspecified and may change in the future.
Returns
metrics: list[Metric]
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
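Because the sort order of the returned metrics is unspecified, callers that plot or compare curves will usually want to order them locally. A minimal sketch, assuming only the `tag`, `global_step`, and `value` attributes listed for Metric; `MetricPoint` and `series_by_tag` are hypothetical names used for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class MetricPoint:
    """Stand-in with the subset of Metric fields used below."""
    tag: str
    global_step: int
    value: float


def series_by_tag(metrics: Iterable) -> Dict[str, List[Tuple[int, float]]]:
    """Group metrics by tag, each series ordered by global step."""
    series: Dict[str, List[Tuple[int, float]]] = defaultdict(list)
    for m in sorted(metrics, key=lambda m: (m.tag, m.global_step)):
        series[m.tag].append((m.global_step, m.value))
    return dict(series)


unordered = [
    MetricPoint("loss", 200, 0.4),
    MetricPoint("accuracy", 100, 0.7),
    MetricPoint("loss", 100, 0.9),
]
curves = series_by_tag(unordered)
```

The same function should work on the objects returned by `get_all_metrics()`, since it only reads the three attributes named above.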
- get_checkpoints(*, ids: list[str] | None = None, project_ids: list[str] | None = None, global_steps: list[int] | None = None, statuses: list[Literal['incomplete', 'complete']] | None = None, created_before: datetime | None = None, created_after: datetime | None = None, select: list[Literal['id', 'created_at', 'run_id', 'project_id', 'global_step', 'status', 'status_updated_at']] | None = None, sort: list[Literal['id:asc', 'id:desc', 'created_at:asc', 'created_at:desc']] | None = None, limit: int | None = None, offset: int | None = None) -> list[Checkpoint]
Get checkpoints for this run.
Parameters
- ids: list[str] | None
If specified, filter to checkpoints with any of the given IDs
- project_ids: list[str] | None
If specified, filter to checkpoints with any of the given Project Ids
- global_steps: list[int] | None
If specified, filter to checkpoints from any of the given Global Steps
- statuses: list[Literal["incomplete", "complete"]] | None
If specified, filter to checkpoints with a specific status.
- created_before: datetime | None
If specified, filter to checkpoints created before the given date and time. This can be used for keyset pagination
- created_after: datetime | None
If specified, filter to checkpoints created after the given date and time. This can be used for keyset pagination
- select: list[Literal["id", "created_at", "run_id", "project_id", "global_step", "status", "status_updated_at"]] | None
If specified, only the given fields are included in the response.
- sort: list[Literal["id:asc", "id:desc", "created_at:asc", "created_at:desc"]] | None
Sort by the given fields in the given directions. Default: "created_at:desc"
- limit: int | None
Limit the response to the given number of checkpoints. Default: 10
- offset: int | None
Offset based pagination. Default: 0
Returns
- checkpoints: list[Checkpoint]
The checkpoints matching the filter criteria
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
- get_events(limit: int | None = None, offset: int | None = None, sort: list[Literal['sequence:desc', 'sequence:asc']] | None = None) -> list[Event]
Get events for this run.
Parameters
- limit: int | None
Limit the response to the given number of run events. Defaults to 100.
- offset: int | None
Offset based pagination. Defaults to 0.
- sort: list[Literal["sequence:desc", "sequence:asc"]] | None
Sort by the given fields in the given directions. The field and direction should be separated by a colon, for example sort=sequence:desc. Defaults to sequence:desc. The only valid field is sequence. Valid directions are ascending (asc) or descending (desc). If the direction is not specified, it defaults to ascending (asc).
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
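The sort rules for events (one valid field, an optional direction that falls back to ascending) can be expressed as a small parser. This is a sketch of the rules as described, not the SDK's own validation; `parse_event_sort` is a hypothetical name.

```python
from typing import Tuple


def parse_event_sort(spec: str) -> Tuple[str, str]:
    """Split "field:direction" and apply the documented fallback direction."""
    field, _, direction = spec.partition(":")
    # A missing direction falls back to ascending.
    direction = direction or "asc"
    if field != "sequence":
        raise ValueError(f"unsupported sort field: {field!r}")
    if direction not in ("asc", "desc"):
        raise ValueError(f"unsupported sort direction: {direction!r}")
    return field, direction
```

For example, `parse_event_sort("sequence")` yields `("sequence", "asc")`, while omitting the sort argument entirely corresponds to `sequence:desc`.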
- get_global_steps_with_checkpoints() -> list[int]
Get the global steps for which a checkpoint exists for this run.
Returns
list[int]: global steps
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
- get_metrics(global_steps: list[int] | None = None, tags: list[str] | None = None, limit: int = 1000, created_before: datetime | None = None) -> list[Metric]
Get metrics for this run.
Parameters
- global_steps: list[int] | None
if specified, only return metrics for these global steps
- tags: list[str] | None
if specified, only return metrics with the given tags
- limit: int (default: 1000)
limit the response to the given number of metrics
- created_before: datetime | None
if specified, filter to metrics created before the given date and time. This can be used for keyset pagination
Returns
metrics: list[Metric]
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
- id: str | None
- model_config: ClassVar[ConfigDict] = {'protected_namespaces': (), 'validate_assignment': True}
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str | None
- notes: str | None
- progress_updated_at: datetime | None
- project_id: str | None
- reload(fields: list[str] | None = None)
Reload the training run.
Parameters
- fields: list[str] | None
List of fields to reload. Options are “status”, “status_updated_at”, “progress”, “progress_updated_at”, and “notes”. If omitted, all fields will be refreshed.
Raises
- RunDoesNotExistError
If the run does not exist or has been deleted, this will be raised.
- ValueError
If fields are invalid, or the id is not set on this run.
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
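A common use of reload is polling a run until it finishes. The sketch below refreshes only the status fields on each iteration. It is illustrative: `wait_for_run` is a hypothetical helper, and the set of statuses treated as terminal is an assumption drawn from the status names listed for get_runs, not a documented guarantee.

```python
import time
from typing import Callable

# Assumption: these statuses mean the run will make no further progress.
TERMINAL_STATUSES = {"job_completed", "job_failed", "job_terminated", "job_create_failed"}


def wait_for_run(
    run,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Reload the run's status until it is terminal or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        # Refresh only the fields needed for the check.
        run.reload(fields=["status", "status_updated_at"])
        if run.status in TERMINAL_STATUSES:
            return run.status
        if clock() >= deadline:
            raise TimeoutError(f"run still in status {run.status!r} after timeout")
        sleep(poll_seconds)


class FakeRun:
    """Stand-in that steps through a fixed list of statuses on reload."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)
        self.status = None

    def reload(self, fields=None):
        self.status = next(self._statuses)


final_status = wait_for_run(
    FakeRun(["job_pending", "job_running", "job_completed"]),
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without waiting; with a real Run object the defaults would be used.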
- status: str | None
- status_updated_at: datetime | None
- stop(grace_period: timedelta | None = None) -> None
Stop the training run.
Parameters
- grace_period: timedelta | None
Time to wait before the run is force-stopped. Must be greater than or equal to 1 second. If not provided, defaults to 10 minutes.
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
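The grace period rules (at least 1 second, 10 minutes when omitted) can be mirrored locally to catch mistakes before the request is sent. A small sketch under those stated rules; `resolve_grace_period` is a hypothetical helper, not part of the SDK.

```python
from datetime import timedelta
from typing import Optional

DEFAULT_GRACE_PERIOD = timedelta(minutes=10)
MIN_GRACE_PERIOD = timedelta(seconds=1)


def resolve_grace_period(grace_period: Optional[timedelta] = None) -> timedelta:
    """Return the effective grace period, rejecting values under 1 second."""
    if grace_period is None:
        return DEFAULT_GRACE_PERIOD
    if grace_period < MIN_GRACE_PERIOD:
        raise ValueError("grace_period must be at least 1 second")
    return grace_period
```

A caller might then write `run.stop(grace_period=resolve_grace_period(timedelta(seconds=30)))`.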
- task_type: str | None
- user_id: str | None
- version: str | None
- chariot.training_v2.run.create_run(*, name: str, version: str, resources: Resources, config: dict, task_type: str, blueprint_id: str, project_id: str, notes: str | None = None) -> str
Create a training run.
Parameters
- name: str
name of the run
- version: str
version of the run
- resources: Resources
resources to allocate for scheduling the run
- config: dict
the run config
- task_type: str
task type of the run
- project_id: str
the id of the project to create the run in. To look up a project id by name, use chariot.projects.get_project_id.
- blueprint_id: str
the id of the blueprint to use. To look up a blueprint id by name, use chariot.training_v2.lookup_blueprint_id.
- notes: str, optional
notes associated with the training run
Returns
- run_id: str
the created run’s id
Raises
- ValidationError
if the provided run config is invalid according to the blueprint, or any required parameters are ill-formed.
- APIException
if api communication fails, request is unauthorized or is unauthenticated.
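The parameters above can be assembled as in the following sketch. It is an illustration under several assumptions: `build_run_kwargs` and `submit_run` are hypothetical helpers, the SDK is assumed to be installed and already authenticated, and the resource values are arbitrary examples. The SDK import is deferred into `submit_run` so that the argument-building part can be read and exercised without the package.

```python
from typing import Any, Dict, Optional


def build_run_kwargs(
    *,
    name: str,
    version: str,
    project_id: str,
    blueprint_id: str,
    task_type: str,
    config: dict,
    notes: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect the keyword arguments listed for create_run, except resources."""
    required = {
        "name": name,
        "version": version,
        "project_id": project_id,
        "blueprint_id": blueprint_id,
        "task_type": task_type,
    }
    for key, value in required.items():
        if not value:
            raise ValueError(f"{key} must be a non-empty string")
    return {**required, "config": config, "notes": notes}


def submit_run(kwargs: Dict[str, Any], *, cpu: str = "4", memory: str = "16Gi") -> str:
    """Create the run and return its id (requires the chariot SDK)."""
    # Imported here so the rest of this sketch does not depend on the SDK.
    from chariot.training_v2.run import Resources, create_run

    resources = Resources(cpu=cpu, memory=memory)
    return create_run(resources=resources, **kwargs)


example_kwargs = build_run_kwargs(
    name="my-run",
    version="0.1.0",
    project_id="<project-id>",      # placeholder, look up via the projects API
    blueprint_id="<blueprint-id>",  # placeholder, e.g. from lookup_blueprint_id
    task_type="Object Detection",
    config={},                      # keys depend on the chosen blueprint
)
```

Calling `validate_run_config(blueprint_id=..., config=...)` before `submit_run` is one way to surface a ValidationError without creating a run.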
- chariot.training_v2.run.get_runs(*, blueprint_ids: list[str] | None = None, created_after: datetime | None = None, created_before: datetime | None = None, ids: list[str] | None = None, id_after: str | None = None, limit: int | None = None, offset: int | None = None, name_ilikes: list[str] | None = None, project_ids: list[str] | None = None, select: list[Literal['*', 'id', 'project_id', 'user_id', 'created_at', 'name', 'version', 'blueprint_id', 'task_type', 'config', 'resources', 'progress', 'progress_updated_at', 'status', 'status_updated_at']] | None = None, sort: list[Literal['id:asc', 'id:desc', 'created_at:asc', 'created_at:desc']] | None = None, statuses: list[Literal['run_created', 'run_stop_requested', 'run_restart_requested', 'job_create_failed', 'job_created', 'job_submitted', 'job_pending', 'job_running', 'job_terminate_requested', 'job_terminating', 'job_terminated', 'job_failed', 'job_completed', 'job_unknown']] | None = None, task_types: list[str] | None = None, versions: list[str] | None = None, user_ids: list[str] | None = None) -> list[Run]
Get runs matching the provided criteria
Parameters
- blueprint_ids: list[str] | None
If specified, filter to runs with any of the given Blueprint IDs
- created_after: datetime | None
If specified, filter to runs created after the given date and time. This can be used for keyset pagination
- created_before: datetime | None
If specified, filter to runs created before the given date and time. This can be used for keyset pagination
- ids: list[str] | None
If specified, filter to runs with any of the given IDs.
- id_after: str | None
If specified, filter to runs with an ID after the given ID. This can be used for keyset pagination.
- limit: int | None
Limit the response to the given number of runs. Default: 10
- offset: int | None
Offset based pagination. Default: 0
- name_ilikes: list[str] | None
If specified, filter to runs with a name that matches any of the given SQL ILIKE patterns. Options for pattern matching are:
    % matches any sequence of zero or more characters.
    _ matches any single character.
To match the literal characters % or _, escape the character with a \, e.g. \%testrun. To use equality matching, simply provide a plain string with no special characters. Matching is case insensitive. For example:
    The pattern %test-run% matches test-run, FOOtest-runBAR, and test-runBAR.
    The pattern \%test_run matches %test9run and %test_run, but not FOOtest_run, %test__run, or %test_runBAR.
    The pattern test\_run matches test_run and nothing else.
- project_ids: list[str] | None
If specified, filter to runs with any of the given Project IDs. To look up a project id by name, use chariot.projects.get_project_id.
- select: list[Literal["*", "id", "project_id", "user_id", "created_at", "name", "version", "blueprint_id", "task_type", "config", "resources", "progress", "progress_updated_at", "status", "status_updated_at"]] | None
If specified, only the selected fields are included in the response. If all fields are desired, use "*". Excluded attributes will be None in the chariot.training_v2.Run responses.
- sort: list[Literal["id:asc", "id:desc", "created_at:asc", "created_at:desc"]] | None
Sort by the given fields in the given directions. The field and direction should be separated by a colon. Default: "created_at:desc"
- statuses: list[Literal["run_created", "run_stop_requested", "run_restart_requested", "job_create_failed", "job_created", "job_submitted", "job_pending", "job_running", "job_terminate_requested", "job_terminating", "job_terminated", "job_failed", "job_completed", "job_unknown"]] | None
If specified, filter to runs with any of the given statuses.
- task_types: list[str] | None
If specified, filter to runs with any of the given Task Types. Examples: "Object Detection", "Image Segmentation"
- versions: list[str] | None
If specified, filter to runs with any of the given Versions.
- user_ids: list[str] | None
If specified, filter to runs with any of the given User IDs.
Returns
- list[Run]
Runs matching the filter criteria
Raises
- APIException
If api communication fails, request is unauthorized or is unauthenticated.
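The ILIKE matching rules described for name_ilikes can be reproduced locally to test a pattern before sending it. The sketch below translates a pattern into a case-insensitive regular expression. It is a local approximation of the documented rules, not the server's implementation; `ilike_to_regex` and `ilike` are hypothetical names.

```python
import re


def ilike_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate an ILIKE pattern into an equivalent compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # A backslash makes the next character literal.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any sequence of zero or more characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def ilike(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the ILIKE pattern."""
    return ilike_to_regex(pattern).fullmatch(value) is not None
```

For instance, `ilike(r"\%test_run", "%test9run")` is true, while `ilike(r"\%test_run", "FOOtest_run")` is false because the escaped `%` must match a literal percent sign at the start.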
- chariot.training_v2.run.validate_run_config(*, blueprint_id: str, config: dict)
Validate a training run configuration against the provided blueprint id.
Parameters
- blueprint_id: str
The blueprint to validate against
- config: dict
The run configuration to validate
Raises
- ValidationError
if the provided run config is invalid
- APIException
if api communication fails, request is unauthorized or is unauthenticated.