chariot.datasets package
Submodules
chariot.datasets.annotations module
- chariot.datasets.annotations.archive_annotation(id: str, *, task_id: str | None = None) Annotation [source]
Archive (soft-delete) an annotation by id
- Parameters:
id (str) – Id of annotation to archive
task_id (Optional[str]) – Id of the task to which the datum is locked
- Returns:
Annotation details
- Return type:
models.Annotation
- chariot.datasets.annotations.create_annotation(datum_id: str, *, class_label: str | None = None, contour: list[list[Point]] | None = None, bbox: BoundingBox | None = None, oriented_bbox: OrientedBoundingBox | None = None, text_classification: TextClassification | None = None, text_generation: TextGeneration | None = None, token_classification: TokenClassification | None = None, metadata: dict[str, Any] | None = None, approval_status: ApprovalStatus | None = None, task_id: str | None = None) Annotation [source]
Create a new annotation
- Parameters:
datum_id (str) – Id of datum to add annotation to
class_label (Optional[str]) – Class label of the annotation
contour (Optional[List[List[models.Point]]]) – Contour for an Image Segmentation annotation
bbox (Optional[models.BoundingBox]) – Bounding box for an Object Detection annotation
oriented_bbox (Optional[models.OrientedBoundingBox]) – Oriented bounding box for an Oriented Object Detection annotation
text_classification (Optional[models.TextClassification]) – Text Classification annotation
text_generation (Optional[models.TextGeneration]) – Text Generation annotation
token_classification (Optional[models.TokenClassification]) – Token Classification annotation
metadata (Optional[Dict[str, Any]]) – Metadata associated with the annotation
approval_status (Optional[models.ApprovalStatus]) – Reviewer approval status for the annotation
task_id (Optional[str]) – Id of the task to which the datum is locked
- Returns:
New annotation details
- Return type:
models.Annotation
- chariot.datasets.annotations.get_annotation(id: str) Annotation [source]
Get an annotation by id
- Parameters:
id (str) – Id of annotation to get
- Returns:
Annotation details
- Return type:
models.Annotation
- chariot.datasets.annotations.update_annotation(annotation_id: str, *, class_label: str | None = None, contour: list[list[Point]] | None = None, bbox: BoundingBox | None = None, oriented_bbox: OrientedBoundingBox | None = None, text_classification: TextClassification | None = None, text_generation: TextGeneration | None = None, token_classification: TokenClassification | None = None, metadata: dict[str, Any] | None = None, approval_status: ApprovalStatus | None = None, updated_at: str | None = None, task_id: str | None = None) Annotation [source]
Update or replace an annotation
- Parameters:
annotation_id (str) – Id of annotation to be updated or replaced
class_label (Optional[str]) – Class label of the annotation
contour (Optional[List[List[models.Point]]]) – Contour for an Image Segmentation annotation
bbox (Optional[models.BoundingBox]) – Bounding box for an Object Detection annotation
oriented_bbox (Optional[models.OrientedBoundingBox]) – Oriented bounding box for an Oriented Object Detection annotation
text_classification (Optional[models.TextClassification]) – Text Classification annotation
text_generation (Optional[models.TextGeneration]) – Text Generation annotation
token_classification (Optional[models.TokenClassification]) – Token Classification annotation
metadata (Optional[Dict[str, Any]]) – Metadata associated with the annotation
approval_status (Optional[models.ApprovalStatus]) – Reviewer approval status for the annotation
updated_at (Optional[str]) – must match the updated_at time on the annotation being updated
task_id (Optional[str]) – Id of the task to which the datum is locked
- Returns:
New annotation details
- Return type:
models.Annotation
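Because updated_at must match the value currently stored on the annotation, a read-then-write pattern avoids clobbering concurrent edits. A minimal sketch — the helper and its injected callables are hypothetical stand-ins (in practice they would wrap get_annotation and update_annotation from this module):

```python
from typing import Any, Callable

def update_with_current_timestamp(
    annotation_id: str,
    get: Callable[[str], dict],    # stand-in for get_annotation
    update: Callable[..., dict],   # stand-in for update_annotation
    **changes: Any,
) -> dict:
    """Fetch the annotation first, then pass its updated_at so the
    server can reject the write if the annotation changed in between."""
    current = get(annotation_id)
    return update(annotation_id, updated_at=current["updated_at"], **changes)
```

If the server rejects the write because updated_at no longer matches, re-fetch and retry.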
chariot.datasets.datasets module
- chariot.datasets.datasets.create_dataset(*, name: str, type: DatasetType, project_id: str, description: str | None = None, is_public: bool | None = None, _is_test: bool | None = None) Dataset [source]
Create a new, empty dataset
- Parameters:
name (str) – Dataset name
type (models.DatasetType) – Dataset type
project_id (str) – Project id to create the dataset in
description (Optional[str]) – Dataset description
is_public (Optional[bool]) – When set to true, the dataset will be publicly accessible.
- Returns:
New dataset details
- Return type:
models.Dataset
- chariot.datasets.datasets.create_dataset_timeline_description(id: str, description: str, timestamp: datetime)[source]
Adds a user-defined description event for a particular timeline event group.
- Parameters:
id (str) – Id of the dataset
description (str) – Description of the timeline event group. Must be less than 200 characters
timestamp (datetime) – Timestamp representing the event time of the group leader to which this description will be added
- Raises:
APIException – If API communication fails, or the request is unauthorized or unauthenticated
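Since the description must be less than 200 characters, callers may want to fail fast client-side instead of waiting for an APIException. A small hypothetical guard (the constant and function names are illustrative, not part of the SDK):

```python
MAX_TIMELINE_DESCRIPTION_LEN = 200  # documented limit: "less than 200 characters"

def check_timeline_description(description: str) -> str:
    """Raise early, client-side, rather than round-tripping to the API."""
    if len(description) >= MAX_TIMELINE_DESCRIPTION_LEN:
        raise ValueError(
            f"timeline description must be under {MAX_TIMELINE_DESCRIPTION_LEN} "
            f"characters, got {len(description)}"
        )
    return description
```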
- chariot.datasets.datasets.delete_dataset(id: str) Dataset [source]
Delete a dataset by id. The dataset's artifacts will be deleted as well.
- Parameters:
id (str) – Id of dataset to delete
- Returns:
Deleted dataset details
- Return type:
models.Dataset
- chariot.datasets.datasets.get_authorized_dataset_ids(ids: list[str]) list[str] [source]
Given a list of Dataset Ids, return ids from the list that the user has read access to
- Parameters:
ids (List[str]) – List of dataset ids to check
- Returns:
The subset of the given ids that the user has read access to
- Return type:
List[str]
- chariot.datasets.datasets.get_dataset(id: str) Dataset [source]
Get a dataset by id
- Parameters:
id (str) – Dataset id
- Returns:
Dataset details
- Return type:
models.Dataset
- chariot.datasets.datasets.get_dataset_statistics(*, exact_name_match: bool | None = None, exclude_unlabeled: bool | None = None, limit_to_write_access: bool | None = None, name: str | None = None, project_ids: list[str] | None = None, dataset_ids: list[str] | None = None, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, type: DatasetType | None = None) DatasetStatistics [source]
Get dataset statistics with various criteria.
- Parameters:
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
exclude_unlabeled (Optional[bool]) – Should unlabeled datasets be excluded (defaults to false)
limit_to_write_access (Optional[bool]) – Should the results only include datasets that the user has write access to (defaults to false)
name (Optional[str]) – Filter by dataset name
project_ids (Optional[List[str]]) – Filter by project ids
dataset_ids (Optional[List[str]]) – Filter by dataset ids
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
type (Optional[models.DatasetType]) – Filter by dataset type
- Returns:
Statistics for the datasets
- Return type:
models.DatasetStatistics
- chariot.datasets.datasets.get_dataset_timeline(id: str, *, max_items: int | None = None, direction: SortDirection | None = None, min_groups: int | None = None, max_ungrouped_events: int | None = None) Iterator[DatasetTimelineEvent] [source]
Get a series of dataset change events ordered by time and grouped by event type.
- Parameters:
id (str) – Dataset id to get events for
max_items (Optional[int]) – Limit the returned generator to only produce this many items
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
min_groups (Optional[int]) – How many groups are required before grouping behavior is turned on
max_ungrouped_events (Optional[int]) – The maximum number of events allowed before grouping behavior is turned on
- Returns:
Events for the dataset
- Return type:
Iterator[models.DatasetTimelineEvent]
- chariot.datasets.datasets.get_datasets(*, exact_name_match: bool | None = None, exclude_unlabeled: bool | None = None, limit_to_write_access: bool | None = None, name: str | None = None, project_ids: list[str] | None = None, dataset_ids: list[str] | None = None, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, type: DatasetType | None = None, sort: DatasetSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[Dataset, None, None] [source]
Get datasets with various criteria. Returns a generator over all matching datasets.
- Parameters:
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
exclude_unlabeled (Optional[bool]) – Should unlabeled datasets be excluded (defaults to false)
limit_to_write_access (Optional[bool]) – Should the results only include datasets that the user has write access to (defaults to false)
name (Optional[str]) – Filter by dataset name
project_ids (Optional[List[str]]) – Filter by project ids
dataset_ids (Optional[List[str]]) – Filter by dataset ids
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
type (Optional[models.DatasetType]) – Filter by dataset type
sort (Optional[models.DatasetSortColumn]) – How to sort the returned datasets
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Dataset details for datasets matching the criteria
- Return type:
Generator[models.Dataset, None, None]
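Because get_datasets returns a generator, results are produced lazily and max_items bounds how many are yielded. The same bounding can be done client-side with itertools.islice; the sketch below uses a stand-in generator in place of a real get_datasets call:

```python
from itertools import islice
from typing import Any, Generator, Iterable

def take(gen: Iterable[Any], n: int) -> list:
    """Pull at most n items from a (possibly paginated) generator."""
    return list(islice(gen, n))

def fake_datasets() -> Generator[dict, None, None]:
    # Stand-in for get_datasets(...); yields lightweight dicts.
    for i in range(10):
        yield {"id": f"ds-{i}"}

first_three = take(fake_datasets(), 3)
```

Passing max_items to get_datasets is preferable when available, since it can stop pagination server-side instead of discarding fetched pages.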
- chariot.datasets.datasets.update_dataset(id: str, *, name: str | None = None, description: str | None = None) Dataset [source]
Update a dataset’s name or description
- Parameters:
id (str) – Dataset id to update
name (Optional[str]) – New name for the dataset. Name remains unmodified if set to None
description (Optional[str]) – New description for the dataset. Description remains unmodified if set to None
- Returns:
Updated dataset details
- Return type:
models.Dataset
chariot.datasets.datums module
- chariot.datasets.datums.archive_datum(id: str) Datum [source]
Archive (soft-delete) a datum by id
- Parameters:
id (str) – Id of datum to archive
- Returns:
Datum details
- Return type:
models.Datum
- chariot.datasets.datums.get_dataset_datum_count(dataset_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, asof_timestamp: datetime | None = None, unannotated: bool | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None) int [source]
Get dataset datum count with various criteria.
- Parameters:
dataset_id (str) – Id of dataset to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
asof_timestamp (Optional[datetime]) – Filter datums and/or annotations as of the given timestamp
unannotated (Optional[bool]) – Filter datums without annotation
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
- Returns:
Datum count for matching datums
- Return type:
int
- chariot.datasets.datums.get_dataset_datum_labels(dataset_id: str, *, task_type_label_filter: TaskTypeLabelFilter | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, asof_timestamp: datetime | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, max_items: int | None = None) Generator[str, None, None] [source]
Get dataset datum labels with various criteria
- Parameters:
dataset_id (str) – Id of dataset to get datums for
task_type_label_filter (Optional[models.TaskTypeLabelFilter]) – Filter by a single task type and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
asof_timestamp (Optional[datetime]) – Filter datums and/or annotations as of the given timestamp
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching datum labels
- Return type:
Generator[str, None, None]
- chariot.datasets.datums.get_dataset_datum_statistics(dataset_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, asof_timestamp: datetime | None = None, unannotated: bool | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None) DatumStatistics [source]
Get dataset datum statistics with various criteria
- Parameters:
dataset_id (str) – Id of dataset to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
asof_timestamp (Optional[datetime]) – Filter datums and/or annotations as of the given timestamp
unannotated (Optional[bool]) – Filter datums without annotation
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
- Returns:
Datum statistics for matching datums
- Return type:
models.DatumStatistics
- chariot.datasets.datums.get_dataset_datums(dataset_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, asof_timestamp: datetime | None = None, unannotated: bool | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, max_items: int | None = None) Generator[Datum, None, None] [source]
Get dataset datums with various criteria
- Parameters:
dataset_id (str) – Id of dataset to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
asof_timestamp (Optional[datetime]) – Filter datums and/or annotations as of the given timestamp
unannotated (Optional[bool]) – Filter datums without annotation
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching datums
- Return type:
Generator[models.Datum, None, None]
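When filtering by datum_ids, a very long id list may exceed request-size limits. The exact limit, if any, is not documented here; a batch size of 500 is an assumption for illustration. One way to split the list before issuing several get_dataset_datums calls:

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def chunked(items: Sequence[T], size: int = 500) -> Iterator[Sequence[T]]:
    """Yield successive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start : start + size]

# Hypothetical usage:
#   for batch in chunked(all_datum_ids):
#       results.extend(get_dataset_datums(dataset_id, datum_ids=list(batch)))
```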
- chariot.datasets.datums.get_datum(id: str, *, task_type: TaskType | None = None) Datum [source]
Get a datum by id
- Parameters:
id (str) – Id of datum to get
task_type (Optional[models.TaskType]) – Task type annotation filter
- Returns:
Datum details
- Return type:
models.Datum
- chariot.datasets.datums.get_snapshot_datum_labels(snapshot_id: str, *, task_type_label_filter: TaskTypeLabelFilter | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, split: str | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, max_items: int | None = None) Generator[str, None, None] [source]
Get snapshot datum labels with various criteria
- Parameters:
snapshot_id (str) – Id of snapshot to get datums for
task_type_label_filter (Optional[models.TaskTypeLabelFilter]) – Filter by a single task type and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
split (Optional[str]) – Filter by datum split assignment
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching datum labels
- Return type:
Generator[str, None, None]
- chariot.datasets.datums.get_snapshot_datum_statistics(snapshot_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, split: str | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None) DatumStatistics [source]
Get snapshot datum statistics with various criteria
- Parameters:
snapshot_id (str) – Id of snapshot to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
split (Optional[str]) – Filter by datum split assignment
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
- Returns:
Datum statistics for matching datums
- Return type:
models.DatumStatistics
- chariot.datasets.datums.get_snapshot_datums(snapshot_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, split: str | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, max_items: int | None = None) Generator[Datum, None, None] [source]
Get snapshot datums with various criteria
- Parameters:
snapshot_id (str) – Id of snapshot to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
split (Optional[str]) – Filter by datum split assignment
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching datums
- Return type:
Generator[models.Datum, None, None]
- chariot.datasets.datums.get_upload_datums(upload_id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, unannotated: bool | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, max_items: int | None = None) Generator[Datum, None, None] [source]
Get upload datums with various criteria
- Parameters:
upload_id (str) – Id of upload to get datums for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
unannotated (Optional[bool]) – Filter datums without annotation
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching datums
- Return type:
Generator[models.Datum, None, None]
chariot.datasets.exceptions module
- exception chariot.datasets.exceptions.UploadIncompleteError(upload_id: str, status: UploadStatus)[source]
Bases:
Exception
- status: UploadStatus
- upload_id: str
chariot.datasets.files module
- chariot.datasets.files.create_dataset_file(*, dataset_id: str, file_format: FileFormat | None = None, file_type: FileType, manifest_type: ManifestType | None = None, split: SplitName | None = None) File [source]
Create or retrieve an archive file or manifest file for a dataset, returning the file object with location if available. The function only starts the file creation process if the file does not already exist. Note: Creating archive files for datasets is not currently supported and will result in an error.
- Parameters:
dataset_id (str) – Id of dataset to create file for
file_format (Optional[models.FileFormat]) – File format
file_type (models.FileType) – File type
manifest_type (Optional[models.ManifestType]) – Manifest type
split (Optional[models.SplitName]) – Split
- Returns:
File details for the newly created or existing file
- Return type:
models.File
- chariot.datasets.files.create_dataset_file_and_wait(*, dataset_id: str, file_format: FileFormat | None = None, file_type: FileType, manifest_type: ManifestType | None = None, split: SplitName | None = None, timeout: float = 120, wait_interval: float = 0.5) File [source]
Create or retrieve an archive file or manifest file for a dataset. Returns the file object with location. The function polls the API until the presigned url for the dataset file is populated or the timeout is reached. Note: Creating archive files for datasets is not currently supported and will result in an error.
- Parameters:
dataset_id (str) – Id of dataset to create file for
file_format (Optional[models.FileFormat]) – File format
file_type (models.FileType) – File type
manifest_type (Optional[models.ManifestType]) – Manifest type
split (Optional[models.SplitName]) – Split
timeout (float) – Number of seconds to wait for file completion (default 120 seconds)
wait_interval (float) – Number of seconds between successive calls to check the file presigned url (default 0.5)
- Returns:
File details for the newly created or existing file
- Return type:
models.File
- Raises:
TimeoutError – If the timeout has been reached
- chariot.datasets.files.create_snapshot_file(*, snapshot_id: str, file_format: FileFormat | None = None, file_type: FileType, manifest_type: ManifestType | None = None, split: SplitName | None = None) File [source]
Create or retrieve an archive file or manifest file for a snapshot, returning the file object with location if available. The function only starts the file creation process if the file does not already exist.
- Parameters:
snapshot_id (str) – Id of snapshot to create file for
file_format (Optional[models.FileFormat]) – File format
file_type (models.FileType) – File type
manifest_type (Optional[models.ManifestType]) – Manifest type
split (Optional[models.SplitName]) – Split
- Returns:
File details for the newly created or existing file
- Return type:
models.File
- chariot.datasets.files.create_snapshot_file_and_wait(*, snapshot_id: str, file_format: FileFormat | None = None, file_type: FileType, manifest_type: ManifestType | None = None, split: SplitName | None = None, timeout: float = 120, wait_interval: float = 0.5) File [source]
Create or retrieve an archive file or manifest file for a snapshot. Returns the file object with location. The function polls the API until the presigned url for the snapshot file is populated or the timeout is reached.
- Parameters:
snapshot_id (str) – Id of snapshot to create file for
file_format (Optional[models.FileFormat]) – File format
file_type (models.FileType) – File type
manifest_type (Optional[models.ManifestType]) – Manifest type
split (Optional[models.SplitName]) – Split
timeout (float) – Number of seconds to wait for file completion (default 120 seconds)
wait_interval (float) – Number of seconds between successive calls to check the file presigned url (default 0.5)
- Returns:
File details for the newly created or existing file
- Return type:
models.File
- Raises:
TimeoutError – If the timeout has been reached
- chariot.datasets.files.get_dataset_files(dataset_id: str) list[File] [source]
Get files for a dataset
- Parameters:
dataset_id (str) – Dataset ID to retrieve files for.
- Returns:
File details for the dataset ID
- Return type:
List[models.File]
- chariot.datasets.files.get_snapshot_files(snapshot_id: str) list[File] [source]
Get files for a snapshot
- Parameters:
snapshot_id (str) – Snapshot ID to retrieve files for.
- Returns:
File details for the snapshot ID
- Return type:
List[models.File]
- chariot.datasets.files.wait_for_file(id: str, *, timeout: float = 120, wait_interval: float = 0.5) File [source]
Polls the given file until it has finished processing.
- Parameters:
id (str) – Id of the file to wait for
timeout (float) – Number of seconds to wait for file to complete (default 120)
wait_interval (float) – Number of seconds between successive calls to check the file for completion (default 0.5)
- Returns:
The file details
- Return type:
models.File
- Raises:
TimeoutError – If the timeout has been reached
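wait_for_file, create_dataset_file_and_wait, and create_snapshot_file_and_wait all share the same poll-with-timeout shape. A generic sketch of that loop, with the check callable and injected clock/sleep as hypothetical parameters so the loop can be exercised without a live API:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def poll_until(
    check: Callable[[], Optional[T]],
    *,
    timeout: float = 120.0,
    wait_interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `check` until it returns a non-None value or `timeout` elapses."""
    deadline = clock() + timeout
    while True:
        result = check()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"no result after {timeout} seconds")
        sleep(wait_interval)
```

A real check function would fetch the file and return it only once its presigned url is populated, mirroring the semantics documented above.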
chariot.datasets.models module
Datasets models.
- class chariot.datasets.models.Annotation(id: str, datum_id: str | None, upload_id: str | None, task_type: chariot.datasets.models.TaskType, class_label: str | None, contour: list[list[chariot.datasets.models.Point]] | None, bbox: chariot.datasets.models.BoundingBox | None, oriented_bbox: chariot.datasets.models.OrientedBoundingBox | None, text_classification: chariot.datasets.models.TextClassification | None, token_classification: chariot.datasets.models.TokenClassification | None, text_generation: chariot.datasets.models.TextGeneration | None, created_at: datetime.datetime, updated_at: datetime.datetime, archived_at: datetime.datetime | None, archived_upload_id: str | None, size: int | None, approval_status: str, metadata: dict[str, Any] | None = None, previous_annotation_id: str | None = None, datum_annotation_updated_at: str | None = None, prev_datum_annotation_updated_at: str | None = None)[source]
Bases:
Base
- approval_status: str
- archived_at: datetime | None
- archived_upload_id: str | None
- bbox: BoundingBox | None
- class_label: str | None
- created_at: datetime
- datum_annotation_updated_at: str | None = None
- datum_id: str | None
- id: str
- metadata: dict[str, Any] | None = None
- oriented_bbox: OrientedBoundingBox | None
- prev_datum_annotation_updated_at: str | None = None
- previous_annotation_id: str | None = None
- size: int | None
- text_classification: TextClassification | None
- text_generation: TextGeneration | None
- token_classification: TokenClassification | None
- updated_at: datetime
- upload_id: str | None
- class chariot.datasets.models.ApprovalStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- NEEDS_REVIEW = 'needs_review'
- NOT_REVIEWED = ''
- REJECTED = 'rejected'
- VERIFIED = 'verified'
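Note the quirk that NOT_REVIEWED is the empty string, so status filters for unreviewed annotations must pass '' rather than a literal 'not_reviewed'. A local mirror of the documented values, for illustration only (not the SDK class itself):

```python
from enum import Enum

class ApprovalStatusMirror(Enum):
    # Mirrors the documented chariot.datasets.models.ApprovalStatus values.
    NEEDS_REVIEW = "needs_review"
    NOT_REVIEWED = ""  # unreviewed annotations carry an empty string
    REJECTED = "rejected"
    VERIFIED = "verified"

def is_reviewed(status: str) -> bool:
    """True once a reviewer has looked at the annotation at all."""
    return status != ApprovalStatusMirror.NOT_REVIEWED.value
```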
- class chariot.datasets.models.BoundingBox(xmin: float, xmax: float, ymin: float, ymax: float)[source]
Bases:
Base
- xmax: float
- xmin: float
- ymax: float
- ymin: float
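The BoundingBox fields imply xmin <= xmax and ymin <= ymax. A small helper over a local stand-in dataclass (mirroring the documented fields, not importing the SDK) that validates that ordering and computes the box area:

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Local stand-in mirroring models.BoundingBox's documented fields."""
    xmin: float
    xmax: float
    ymin: float
    ymax: float

def box_area(b: Box) -> float:
    """Validate the corner ordering, then return width * height."""
    if b.xmax < b.xmin or b.ymax < b.ymin:
        raise ValueError("bounding box corners are out of order")
    return (b.xmax - b.xmin) * (b.ymax - b.ymin)
```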
- class chariot.datasets.models.Circle(center: chariot.datasets.models.GeoPoint, radius: float)[source]
Bases:
Base
- center: GeoPoint
- radius: float
- class chariot.datasets.models.ContextLabelFilter(context: str | None = None, labels: list[str] | None = None)[source]
Bases:
Base
- context: str | None = None
- labels: list[str] | None = None
- class chariot.datasets.models.Dataset(id: str, name: str, type: chariot.datasets.models.DatasetType, project_id: str, is_public: bool, is_test: bool, delete_lock: bool, created_at: datetime.datetime, updated_at: datetime.datetime, description: str | None = None, archived_at: datetime.datetime | None = None, archived_by: str | None = None, summary: chariot.datasets.models.DatasetSummary | None = None, migration_status: chariot.datasets.models.MigrationStatus | None = None)[source]
Bases:
Base
- archived_at: datetime | None = None
- archived_by: str | None = None
- created_at: datetime
- delete_lock: bool
- description: str | None = None
- id: str
- is_public: bool
- is_test: bool
- migration_status: MigrationStatus | None = None
- name: str
- project_id: str
- summary: DatasetSummary | None = None
- type: DatasetType
- updated_at: datetime
- class chariot.datasets.models.DatasetConfig(dataset_ids: list[str] | None = None, dataset_names: list[str] | None = None, exact_name_match: bool | None = None, limit_to_write_access: bool | None = None, dataset_type: str | None = None)[source]
Bases:
Base
- dataset_ids: list[str] | None = None
- dataset_names: list[str] | None = None
- dataset_type: str | None = None
- exact_name_match: bool | None = None
- limit_to_write_access: bool | None = None
- class chariot.datasets.models.DatasetSortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CREATION_TIMESTAMP = 'creation timestamp'
- DATUM_COUNT = 'datum count'
- NAME = 'name'
- UPDATED_TIMESTAMP = 'updated timestamp'
- class chariot.datasets.models.DatasetStatistics(datum_count: int, available_datum_count: int, new_datum_count: int, annotation_count: int, class_label_count: int, bounding_box_count: int, oriented_bounding_box_count: int, contour_count: int, text_classification_count: int, token_classification_count: int, text_generation_count: int, class_label_distribution: dict[str, int] | None, text_classification_distribution: list[chariot.datasets.models.Distribution] | None, token_classification_distribution: dict[str, int] | None, text_generation_distribution: dict[str, int] | None, annotation_count_by_approval_status: dict[str, int] | None, dataset_count: int, total_datum_size: int, largest_datum_size: int, unannotated_datum_count: int)[source]
Bases:
DatumStatistics
- dataset_count: int
- largest_datum_size: int
- total_datum_size: int
- unannotated_datum_count: int
- class chariot.datasets.models.DatasetSummary(datum_count: int, available_datum_count: int, new_datum_count: int, annotation_count: int, class_label_count: int, bounding_box_count: int, oriented_bounding_box_count: int, contour_count: int, text_classification_count: int, token_classification_count: int, text_generation_count: int, class_label_distribution: dict[str, int] | None, text_classification_distribution: list[chariot.datasets.models.Distribution] | None, token_classification_distribution: dict[str, int] | None, text_generation_distribution: dict[str, int] | None, annotation_count_by_approval_status: dict[str, int] | None, total_datum_size: int, largest_datum_size: int, unannotated_datum_count: int)[source]
Bases:
DatumStatistics
- largest_datum_size: int
- total_datum_size: int
- unannotated_datum_count: int
- class chariot.datasets.models.DatasetTimelineEvent(event_timestamp: datetime.datetime, dataset_id: str, event_associated_record_id: str | None, event_operation: str | None, event_user_id: str | None, datums_created: int | None, datums_deleted: str | None, datums_modified: str | None, annotations_created: str | None, annotations_deleted: str | None, annotations_modified: str | None, snapshots: list[chariot.datasets.models.Snapshot] | None, event_group_num_timestamps: int, event_group_num_users: int, event_group_start_timestamp: datetime.datetime, event_group_description: str, event_associated_task_id: str | None)[source]
Bases:
Base
- annotations_created: str | None
- annotations_deleted: str | None
- annotations_modified: str | None
- dataset_id: str
- datums_created: int | None
- datums_deleted: str | None
- datums_modified: str | None
- event_associated_record_id: str | None
- event_associated_task_id: str | None
- event_group_description: str
- event_group_num_timestamps: int
- event_group_num_users: int
- event_group_start_timestamp: datetime
- event_operation: str | None
- event_timestamp: datetime
- event_user_id: str | None
- class chariot.datasets.models.DatasetType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- IMAGE = 'image'
- TEXT = 'text'
- class chariot.datasets.models.Datum(id: str, coordinates: chariot.datasets.models.GeoPoint | None, timestamp: datetime.datetime | None, metadata: dict[str, Any] | None, created_at: datetime.datetime, archived_at: datetime.datetime | None, dataset: chariot.datasets.models.Dataset | None, annotations: list[chariot.datasets.models.Annotation] | None, presigned_url: str, signature: str, size: int, split: chariot.datasets.models.SplitName | None, datum_annotation_updated_at: str | None = None, task_lock_details: chariot.datasets.models.DatumTaskActivity | None = None, preview_presigned_urls: list[str] | None = None)[source]
Bases:
Base
- annotations: list[Annotation] | None
- archived_at: datetime | None
- created_at: datetime
- datum_annotation_updated_at: str | None = None
- id: str
- metadata: dict[str, Any] | None
- presigned_url: str
- preview_presigned_urls: list[str] | None = None
- signature: str
- size: int
- task_lock_details: DatumTaskActivity | None = None
- timestamp: datetime | None
- class chariot.datasets.models.DatumConfig(task_type_label_filters: list[chariot.datasets.models.TaskTypeLabelFilter] | None, gps_coordinates_circle: chariot.datasets.models.Circle | None, gps_coordinates_rectangle: chariot.datasets.models.Rectangle | None, gps_coordinates_polygon: list[chariot.datasets.models.GeoPoint] | None, capture_timestamp_range: chariot.datasets.models.TimestampRange | None, metadata: dict[str, str] | None, asof_timestamp: datetime.datetime | None, unannotated: bool | None, datum_ids: list[str] | None, approval_status: list[chariot.datasets.models.ApprovalStatus] | None, annotation_metadata: dict[str, str] | None)[source]
Bases:
DatumFilter
- class chariot.datasets.models.DatumFilter(task_type_label_filters: list[chariot.datasets.models.TaskTypeLabelFilter] | None, gps_coordinates_circle: chariot.datasets.models.Circle | None, gps_coordinates_rectangle: chariot.datasets.models.Rectangle | None, gps_coordinates_polygon: list[chariot.datasets.models.GeoPoint] | None, capture_timestamp_range: chariot.datasets.models.TimestampRange | None, metadata: dict[str, str] | None, asof_timestamp: datetime.datetime | None, unannotated: bool | None, datum_ids: list[str] | None, approval_status: list[chariot.datasets.models.ApprovalStatus] | None, annotation_metadata: dict[str, str] | None)[source]
Bases:
Base
- annotation_metadata: dict[str, str] | None
- approval_status: list[ApprovalStatus] | None
- asof_timestamp: datetime | None
- capture_timestamp_range: TimestampRange | None
- datum_ids: list[str] | None
- metadata: dict[str, str] | None
- task_type_label_filters: list[TaskTypeLabelFilter] | None
- unannotated: bool | None
- class chariot.datasets.models.DatumSortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CREATION_TIMESTAMP = 'creation timestamp'
- class chariot.datasets.models.DatumStatistics(datum_count: int, available_datum_count: int, new_datum_count: int, annotation_count: int, class_label_count: int, bounding_box_count: int, oriented_bounding_box_count: int, contour_count: int, text_classification_count: int, token_classification_count: int, text_generation_count: int, class_label_distribution: dict[str, int] | None, text_classification_distribution: list[chariot.datasets.models.Distribution] | None, token_classification_distribution: dict[str, int] | None, text_generation_distribution: dict[str, int] | None, annotation_count_by_approval_status: dict[str, int] | None)[source]
Bases:
Base
- annotation_count: int
- annotation_count_by_approval_status: dict[str, int] | None
- available_datum_count: int
- bounding_box_count: int
- class_label_count: int
- class_label_distribution: dict[str, int] | None
- contour_count: int
- datum_count: int
- new_datum_count: int
- oriented_bounding_box_count: int
- text_classification_count: int
- text_classification_distribution: list[Distribution] | None
- text_generation_count: int
- text_generation_distribution: dict[str, int] | None
- token_classification_count: int
- token_classification_distribution: dict[str, int] | None
- class chariot.datasets.models.DatumTask(id: str, name: str, description: str | None, created_at: datetime.datetime, updated_at: datetime.datetime, archived_at: datetime.datetime | None, created_by: str, updated_by: str, archived_by: str | None, project_id: str, dataset_config: chariot.datasets.models.DatasetConfig | None, datum_config: chariot.datasets.models.DatumConfig | None)[source]
Bases:
Base
- archived_at: datetime | None
- archived_by: str | None
- created_at: datetime
- created_by: str
- dataset_config: DatasetConfig | None
- datum_config: DatumConfig | None
- description: str | None
- id: str
- name: str
- project_id: str
- updated_at: datetime
- updated_by: str
- class chariot.datasets.models.DatumTaskActivity(dataset_id: str | None, datum_id: str | None, task_id: str, user_id: str, activity: chariot.datasets.models.DatumTaskActivityCode | None, activity_start_time: datetime.datetime | None, activity_end_time: datetime.datetime | None)[source]
Bases:
Base
- activity: DatumTaskActivityCode | None
- activity_end_time: datetime | None
- activity_start_time: datetime | None
- dataset_id: str | None
- datum_id: str | None
- task_id: str
- user_id: str
- class chariot.datasets.models.DatumTaskActivityCode(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
StrEnum
- LOCKED = 'locked'
- SKIPPED = 'skipped'
- VIEWED = 'viewed'
- class chariot.datasets.models.DatumTaskDetails(id: str, name: str, description: str | None, created_at: datetime.datetime, updated_at: datetime.datetime, archived_at: datetime.datetime | None, created_by: str, updated_by: str, archived_by: str | None, project_id: str, dataset_config: chariot.datasets.models.DatasetConfig | None, datum_config: chariot.datasets.models.DatumConfig | None, statistics: chariot.datasets.models.DatumTaskStatistics)[source]
Bases:
DatumTask
- statistics: DatumTaskStatistics
- class chariot.datasets.models.DatumTaskStatistics(datum_count: int, available_datum_count: int, new_datum_count: int, annotation_count: int, class_label_count: int, bounding_box_count: int, oriented_bounding_box_count: int, contour_count: int, text_classification_count: int, token_classification_count: int, text_generation_count: int, class_label_distribution: dict[str, int] | None, text_classification_distribution: list[chariot.datasets.models.Distribution] | None, token_classification_distribution: dict[str, int] | None, text_generation_distribution: dict[str, int] | None, annotation_count_by_approval_status: dict[str, int] | None, task_count: int, dataset_count: int, user_count: int, datum_count_by_user_id: dict[str, int] | None, activity_count: int, datum_count_by_activity_status: dict[str, int] | None)[source]
Bases:
DatumStatistics
- activity_count: int
- dataset_count: int
- datum_count_by_activity_status: dict[str, int] | None
- datum_count_by_user_id: dict[str, int] | None
- task_count: int
- user_count: int
- class chariot.datasets.models.Distribution(context: str | None, distribution: dict[str, int])[source]
Bases:
Base
- context: str | None
- distribution: dict[str, int]
- class chariot.datasets.models.File(id: str, dataset: chariot.datasets.models.Dataset | None, dataset_timestamp: datetime.datetime | None, snapshot: chariot.datasets.models.Snapshot | None, split: chariot.datasets.models.SplitName | None, type: chariot.datasets.models.FileType, manifest_type: chariot.datasets.models.ManifestType | None, file_format: chariot.datasets.models.FileFormat, presigned_url: str | None, created_at: datetime.datetime, updated_at: datetime.datetime, archived_at: datetime.datetime | None, expires_at: datetime.datetime | None, job: chariot.datasets.models.Job | None, status: chariot.datasets.models.FileStatus | None)[source]
Bases:
Base
- archived_at: datetime | None
- created_at: datetime
- dataset_timestamp: datetime | None
- expires_at: datetime | None
- file_format: FileFormat
- id: str
- manifest_type: ManifestType | None
- presigned_url: str | None
- status: FileStatus | None
- updated_at: datetime
- class chariot.datasets.models.FileFormat(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- GZ = 'gz'
- TGZ = 'tgz'
- ZIP = 'zip'
- class chariot.datasets.models.FileStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ARCHIVED = 'archived'
- COMPLETE = 'complete'
- ERROR = 'error'
- PENDING = 'pending'
- PROCESSING = 'processing'
- class chariot.datasets.models.FileType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ARCHIVE = 'archive'
- MANIFEST = 'manifest'
- class chariot.datasets.models.GeoPoint(latitude: float, longitude: float)[source]
Bases:
Base
- latitude: float
- longitude: float
- class chariot.datasets.models.Job(id: str, type: chariot.datasets.models.JobType, status: chariot.datasets.models.JobStatus, progress_message: str | None, dataset: chariot.datasets.models.Dataset | None, upload: Any | None, file: Any | None, view: chariot.datasets.models.View | None, execution_count: int, created_at: datetime.datetime, updated_at: datetime.datetime, start_after: datetime.datetime | None, schedule_cron: str | None)[source]
Bases:
Base
- created_at: datetime
- execution_count: int
- file: Any | None
- id: str
- progress_message: str | None
- schedule_cron: str | None
- start_after: datetime | None
- updated_at: datetime
- upload: Any | None
- class chariot.datasets.models.JobStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- IN_PROGRESS = 'in progress'
- READY = 'ready'
- class chariot.datasets.models.JobType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- DELETE_DATASET = 'delete_dataset'
- DELETE_FILE = 'delete_file'
- DELETE_UPLOAD = 'delete_upload'
- FILE = 'file'
- SNAPSHOT = 'snapshot'
- UPLOAD = 'upload'
- class chariot.datasets.models.ManifestType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ALL = 'all'
- ANNOTATED = 'annotated'
- class chariot.datasets.models.MigrationStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CLEANUP = 'cleanup'
- COMPLETE = 'complete'
- DOWNLOADING = 'downloading'
- ERROR = 'error'
- EXCEPTION = 'exception'
- IDENTIFIED = 'identified'
- PLANNED = 'planned'
- UPLOADING_HORIZONTALS = 'uploading_horizontals'
- UPLOADING_VERTICAL = 'uploading_vertical'
- class chariot.datasets.models.OrientedBoundingBox(cx: float, cy: float, w: float, h: float, r: float)[source]
Bases:
Base
- cx: float
- cy: float
- h: float
- r: float
- w: float
- class chariot.datasets.models.PresignedUrl(method: str, url: str)[source]
Bases:
Base
- method: str
- url: str
- class chariot.datasets.models.Rectangle(p1: chariot.datasets.models.GeoPoint, p2: chariot.datasets.models.GeoPoint)[source]
Bases:
Base
- class chariot.datasets.models.Snapshot(id: str, view: chariot.datasets.models.View, name: str, timestamp: datetime.datetime, summary: chariot.datasets.models.DatasetSummary | None, split_summaries: dict[chariot.datasets.models.SplitName, chariot.datasets.models.DatasetSummary] | None, status: chariot.datasets.models.SnapshotStatus, created_at: datetime.datetime | None, updated_at: datetime.datetime | None)[source]
Bases:
Base
- created_at: datetime | None
- id: str
- name: str
- split_summaries: dict[SplitName, DatasetSummary] | None
- status: SnapshotStatus
- summary: DatasetSummary | None
- timestamp: datetime
- updated_at: datetime | None
- class chariot.datasets.models.SnapshotSortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CREATION_TIMESTAMP = 'creation timestamp'
- ID = 'id'
- NAME = 'name'
- TIMESTAMP = 'timestamp'
- class chariot.datasets.models.SnapshotStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- COMPLETE = 'complete'
- ERROR = 'error'
- PENDING = 'pending'
- PREVIEW = 'preview'
- class chariot.datasets.models.SortDirection(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ASCENDING = 'asc'
- DESCENDING = 'desc'
- class chariot.datasets.models.SplitAlgorithm(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- RANDOM = 'random'
- class chariot.datasets.models.SplitConfig(sample_count: int | None, split_algorithm: chariot.datasets.models.SplitAlgorithm | None, apply_default_split: bool | None, splits: dict[chariot.datasets.models.SplitName, float] | None)[source]
Bases:
Base
- apply_default_split: bool | None
- sample_count: int | None
- split_algorithm: SplitAlgorithm | None
- class chariot.datasets.models.SplitName(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
StrEnum
- TEST = 'test'
- TRAIN = 'train'
- VAL = 'val'
- class chariot.datasets.models.TaskActivitySortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ACTIVITY_END_TIME = 'activity end timestamp'
- ACTIVITY_START_TIME = 'activity start timestamp'
- class chariot.datasets.models.TaskSortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ID = 'id'
- NAME = 'name'
- class chariot.datasets.models.TaskType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
StrEnum
- IMAGE_CLASSIFICATION = 'Image Classification'
- IMAGE_SEGMENTATION = 'Image Segmentation'
- OBJECT_DETECTION = 'Object Detection'
- ORIENTED_OBJECT_DETECTION = 'Oriented Object Detection'
- TEXT_CLASSIFICATION = 'Text Classification'
- TEXT_GENERATION = 'Text Generation'
- TOKEN_CLASSIFICATION = 'Token Classification'
- class chariot.datasets.models.TaskTypeLabelFilter(task_type: chariot.datasets.models.TaskType, labels: list[str] | None = None, contexts: list[str | None] | None = None, context_labels: list[chariot.datasets.models.ContextLabelFilter] | None = None)[source]
Bases:
Base
- context_labels: list[ContextLabelFilter] | None = None
- contexts: list[str | None] | None = None
- labels: list[str] | None = None
- class chariot.datasets.models.TextClassification(context: str | None, label: str)[source]
Bases:
Base
- context: str | None
- label: str
- class chariot.datasets.models.TextGeneration(context: str | None, generated_text: str | None, generated_text_presigned_url: str | None)[source]
Bases:
Base
- context: str | None
- generated_text: str | None
- generated_text_presigned_url: str | None
- class chariot.datasets.models.TimestampRange(start: datetime.datetime | None, end: datetime.datetime | None)[source]
Bases:
Base
- end: datetime | None
- start: datetime | None
- class chariot.datasets.models.TokenClassification(label: str, start: int, end: int)[source]
Bases:
Base
- end: int
- label: str
- start: int
- class chariot.datasets.models.Upload(id: str, job: chariot.datasets.models.Job | None, type: chariot.datasets.models.UploadType, is_gzipped: bool | None, split: chariot.datasets.models.SplitName | None, status: chariot.datasets.models.UploadStatus, name: str | None, size: int | None, delete_source: bool, max_validation_errors: int, image_validation: bool, validation_errors: list[str] | None, created_at: datetime.datetime, updated_at: datetime.datetime, data_created_at: datetime.datetime | None, presigned_urls: list[chariot.datasets.models.PresignedUrl] | None, source_urls: list[str] | None, datum_metadata: list[dict[str, Any]] | None, dataset: chariot.datasets.models.Dataset | None, video_options: chariot.datasets.models.VideoSamplingOptions | None)[source]
Bases:
Base
- created_at: datetime
- data_created_at: datetime | None
- datum_metadata: list[dict[str, Any]] | None
- delete_source: bool
- id: str
- image_validation: bool
- is_gzipped: bool | None
- max_validation_errors: int
- name: str | None
- presigned_urls: list[PresignedUrl] | None
- size: int | None
- source_urls: list[str] | None
- status: UploadStatus
- type: UploadType
- updated_at: datetime
- validation_errors: list[str] | None
- video_options: VideoSamplingOptions | None
- class chariot.datasets.models.UploadSortColumn(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CREATION_TIMESTAMP = 'creation timestamp'
- STATUS = 'status'
- TYPE = 'type'
- class chariot.datasets.models.UploadStatistics(upload_count: int)[source]
Bases:
Base
- upload_count: int
- class chariot.datasets.models.UploadStatus(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- CLEANUP = 'cleanup'
- COMPLETE = 'complete'
- CREATED = 'created'
- ERROR = 'error'
- PROCESSING = 'processing'
- class chariot.datasets.models.UploadType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
Enum
- ANNOTATION = 'annotation'
- ARCHIVE = 'archive'
- DATUM = 'datum'
- INFERENCE = 'inference'
- RAIC = 'raic'
- TEXT = 'text'
- VIDEO = 'video'
- class chariot.datasets.models.VideoSamplingOptions(sampling_type: chariot.datasets.models.VideoSamplingType, sampling_value: int, deinterlace: bool)[source]
Bases:
object
- deinterlace: bool
- sampling_type: VideoSamplingType
- sampling_value: int
- class chariot.datasets.models.VideoSamplingType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
str
,Enum
- NONE = 'none'
- RATE = 'rate'
- RATIO = 'ratio'
- class chariot.datasets.models.View(task_type_label_filters: list[chariot.datasets.models.TaskTypeLabelFilter] | None, gps_coordinates_circle: chariot.datasets.models.Circle | None, gps_coordinates_rectangle: chariot.datasets.models.Rectangle | None, gps_coordinates_polygon: list[chariot.datasets.models.GeoPoint] | None, capture_timestamp_range: chariot.datasets.models.TimestampRange | None, metadata: dict[str, str] | None, asof_timestamp: datetime.datetime | None, unannotated: bool | None, datum_ids: list[str] | None, approval_status: list[chariot.datasets.models.ApprovalStatus] | None, annotation_metadata: dict[str, str] | None, sample_count: int | None, split_algorithm: chariot.datasets.models.SplitAlgorithm | None, apply_default_split: bool | None, splits: dict[chariot.datasets.models.SplitName, float] | None, id: str, name: str, snapshot_count: int | None, created_at: datetime.datetime, updated_at: datetime.datetime, archived_at: datetime.datetime | None = None, archived_by: str | None = None, dataset: chariot.datasets.models.Dataset | None = None)[source]
Bases:
SplitConfig
,DatumFilter
- archived_at: datetime | None = None
- archived_by: str | None = None
- created_at: datetime
- id: str
- name: str
- snapshot_count: int | None
- updated_at: datetime
chariot.datasets.snapshots module
- chariot.datasets.snapshots.create_snapshot(*, view_id: str, name: str, timestamp: datetime, is_dry_run: bool = False) Snapshot [source]
Creates a new snapshot for a view at the specified event timestamp.
The newly created snapshot will be in status PENDING while datums are being assigned. You can call get_snapshot with the returned id to check whether the status has become COMPLETE. To do this all in one call, use create_snapshot_and_wait.
- Parameters:
view_id (str) – Id of the view that the snapshot should belong to
name (str) – Snapshot name
timestamp (datetime) – Event timestamp that the snapshot should reflect
is_dry_run (bool) – If true, the function returns a snapshot for preview, reporting the datum count for each split of the most recent snapshot (if one exists), the expected datum count for each split of the new snapshot, and the available datum counts from unassigned datums with or without default splits. Defaults to false.
- Returns:
The newly created snapshot in a PENDING status
- Return type:
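The create-then-poll workflow described above can be sketched generically. This is a stand-in only: the helper name, the status strings, and the fake status source are illustrative, and the real create_snapshot_and_wait implementation may differ.

```python
import time

def wait_for_status(get_status, target, timeout=5.0, wait_interval=0.5):
    """Poll get_status() until it returns target or the timeout elapses.

    A sketch of what create_snapshot_and_wait presumably does internally;
    in real code, get_status would wrap get_snapshot(id).status.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == target:
            return status
        if time.monotonic() >= deadline:
            raise RuntimeError("timed out waiting for snapshot status")
        time.sleep(wait_interval)

# Stand-in status source: the snapshot completes on the third poll.
statuses = iter(["pending", "pending", "complete"])
result = wait_for_status(lambda: next(statuses), "complete",
                         timeout=2.0, wait_interval=0.01)
assert result == "complete"
```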
- chariot.datasets.snapshots.create_snapshot_and_wait(*, view_id: str, name: str, timestamp: datetime, timeout: float = 5, wait_interval: float = 0.5) Snapshot [source]
Creates a new snapshot for a view at the specified event timestamp and polls the API until the snapshot is in a COMPLETE status or the timeout is reached.
- Parameters:
view_id (str) – Id of the view that the snapshot should belong to
name (str) – Snapshot name
timestamp (datetime) – Event timestamp that the snapshot should reflect
timeout (float) – Number of seconds to wait for snapshot completion (default 5)
wait_interval (float) – Number of seconds between successive calls to check the snapshot for completion (default 0.5)
- Returns:
The COMPLETE snapshot after datums have been assigned.
- Return type:
- Raises:
RuntimeError – If the timeout has been reached
- chariot.datasets.snapshots.delete_snapshot(id: str) None [source]
Delete a snapshot by id. This can only be done if the snapshot’s status is still PENDING.
This only starts the deletion process on the backend. To confirm deletion, call get_snapshot with the snapshot’s id and check that a NotFoundException is raised. To do this all in one call, use delete_snapshot_and_wait.
- Parameters:
id (str) – Id of the snapshot to delete
- chariot.datasets.snapshots.delete_snapshot_and_wait(id: str, *, timeout: float = 5, wait_interval: float = 0.5) None [source]
Delete a snapshot by id. This can only be done if the snapshot’s status is still PENDING. The function polls the snapshot to confirm deletion and returns once the snapshot can no longer be found.
- Parameters:
id (str) – Id of the snapshot to delete
timeout (float) – Number of seconds to wait for snapshot deletion (default 5)
wait_interval (float) – Number of seconds between successive calls to check the snapshot for deletion (default 0.5)
- Raises:
RuntimeError – If the timeout has been reached
- chariot.datasets.snapshots.get_all_snapshots(*, exact_name_match: bool | None = None, name: str | None = None, timestamp_interval: TimestampRange | None = None, snapshot_ids: list[str] | None = None, sort: SnapshotSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[Snapshot, None, None] [source]
Get all snapshots with optional filters. Returns a generator over all matching snapshots. Only admin users can access this function.
- Parameters:
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
name (Optional[str]) – Filter by snapshot name
timestamp_interval (Optional[models.TimestampRange]) – Filter by snapshots occurring during the interval
snapshot_ids (Optional[List[str]]) – Filter by snapshot ids
sort (Optional[models.SnapshotSortColumn]) – How to sort the returned snapshots
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – The maximum number of snapshots to return
- Returns:
Snapshot details for snapshots matching the criteria
- Return type:
Generator[models.Snapshot, None, None]
- chariot.datasets.snapshots.get_dataset_snapshots(dataset_id: str, *, exact_name_match: bool | None = None, name: str | None = None, timestamp_interval: TimestampRange | None = None, snapshot_ids: list[str] | None = None, sort: SnapshotSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[Snapshot, None, None] [source]
Get a dataset’s snapshots with optional filters. Returns a generator over all matching snapshots.
- Parameters:
dataset_id (str) – Id of the dataset that the snapshots belong to
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
name (Optional[str]) – Filter by snapshot name
timestamp_interval (Optional[models.TimestampRange]) – Filter by snapshots occurring during the interval
snapshot_ids (Optional[List[str]]) – Filter by snapshot ids
sort (Optional[models.SnapshotSortColumn]) – How to sort the returned snapshots
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – The maximum number of snapshots to return
- Returns:
Snapshot details for snapshots matching the criteria
- Return type:
Generator[models.Snapshot, None, None]
- chariot.datasets.snapshots.get_snapshot(id: str) Snapshot [source]
Get a snapshot by id
- Parameters:
id (str) – Snapshot id
- Returns:
Snapshot details
- Return type:
- chariot.datasets.snapshots.get_view_snapshot_count(view_id: str, *, exact_name_match: bool | None = None, name: str | None = None, timestamp_interval: TimestampRange | None = None, snapshot_ids: list[str] | None = None) int [source]
Get number of snapshots for the given view id with optional filters.
- Parameters:
view_id (str) – Id of the view that the snapshots belong to
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
name (Optional[str]) – Filter by snapshot name
timestamp_interval (Optional[models.TimestampRange]) – Filter by snapshots occurring during the interval
snapshot_ids (Optional[List[str]]) – Filter by snapshot ids
- Returns:
Number of snapshots matching the criteria
- Return type:
int
- chariot.datasets.snapshots.get_view_snapshots(view_id: str, *, exact_name_match: bool | None = None, name: str | None = None, timestamp_interval: TimestampRange | None = None, snapshot_ids: list[str] | None = None, sort: SnapshotSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[Snapshot, None, None] [source]
Get a view’s snapshots with optional filters. Returns a generator over all matching snapshots.
- Parameters:
view_id (str) – Id of the view that the snapshots belong to
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
name (Optional[str]) – Filter by snapshot name
timestamp_interval (Optional[models.TimestampRange]) – Filter by snapshots occurring during the interval
snapshot_ids (Optional[List[str]]) – Filter by snapshot ids
sort (Optional[models.SnapshotSortColumn]) – How to sort the returned snapshots
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – The maximum number of snapshots to return
- Returns:
Snapshot details for snapshots matching the criteria
- Return type:
Generator[models.Snapshot, None, None]
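The listing functions in this module return generators, so results are fetched lazily; max_items caps them at the source, while itertools.islice achieves the same cap on the consumer side. A minimal sketch with a stand-in generator (the real functions yield models.Snapshot objects):

```python
from itertools import islice

# Stand-in for get_view_snapshots(view_id, ...); the real generator
# yields models.Snapshot objects from the API rather than dicts.
def fake_snapshots(n=1000):
    for i in range(n):
        yield {"id": f"snap-{i}", "status": "complete"}

# Take only the first three results without exhausting the generator:
first_three = list(islice(fake_snapshots(), 3))
assert [s["id"] for s in first_three] == ["snap-0", "snap-1", "snap-2"]
```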
chariot.datasets.tasks module
- chariot.datasets.tasks.archive_task(id: str) DatumTask [source]
Archive a datum annotation task.
- Parameters:
id (str) – Datum annotation task id
- Returns:
The datum annotation task that has been archived
- Return type:
- chariot.datasets.tasks.count_task_activity(task_id: str, *, activities: list[DatumTaskActivityCode] | None = None, dataset_ids: list[str] | None = None, user_ids: list[str] | None = None) int [source]
Count the activities for the provided task and filters.
- Parameters:
task_id (str) – Id of the task
activities (Optional[List[models.DatumTaskActivityCode]]) – List of activity types to filter by
dataset_ids (Optional[List[str]]) – List of dataset ids to filter by
user_ids (Optional[List[str]]) – List of user ids to filter by
- Returns:
Number of matching task activities
- Return type:
int
- chariot.datasets.tasks.count_tasks(*, search: str | None = None, exact_name_match: bool | None = None, include_archived: bool | None = None, project_ids: list[str] | None = None, task_ids: list[str] | None = None) int [source]
Get number of tasks that match given criteria.
- Parameters:
search (Optional[str]) – Search string (full text search against name and description fields)
exact_name_match (Optional[bool]) – Require search to exactly match the task name (defaults to false)
include_archived (Optional[bool]) – If true, archived tasks will be included in the results (defaults to false)
project_ids (Optional[List[str]]) – Filter by project ids
task_ids (Optional[List[str]]) – Filter by task ids
- Returns:
Number of tasks that match given criteria
- Return type:
int
- chariot.datasets.tasks.count_tasks_activity(exact_name_match: bool | None = None, search: str | None = None, project_ids: list[str] | None = None, task_ids: list[str] | None = None, activities: list[DatumTaskActivityCode] | None = None, dataset_ids: list[str] | None = None, user_ids: list[str] | None = None) int [source]
Count the matching activities.
- Parameters:
exact_name_match (Optional[bool]) – Require search filter to match exactly (defaults to false)
search (Optional[str]) – Search string (full text search against task name and description fields)
project_ids (Optional[List[str]]) – List of project ids to filter by
task_ids (Optional[List[str]]) – List of task ids to filter by
activities (Optional[List[models.DatumTaskActivityCode]]) – List of activity types to filter by
dataset_ids (Optional[List[str]]) – List of dataset ids to filter by
user_ids (Optional[List[str]]) – List of user ids to filter by
- Returns:
Number of matching task activities
- Return type:
int
- chariot.datasets.tasks.create_task(*, name: str, project_id: str, description: str | None = None, dataset_config: DatasetConfig | None = None, datum_config: DatumConfig | None = None) DatumTask [source]
Create a new datum annotation task.
- Parameters:
name (str) – Datum annotation task name
project_id (str) – Project id that the datum annotation task belongs to
description (Optional[str]) – Datum annotation task description
- Returns:
New datum annotation task detail
- Return type:
- chariot.datasets.tasks.delete_datum_lock_for_task(id: str, task_id: str, user_id: str | None = None) None [source]
Delete the specified datum’s lock for a given task.
Must be the current user holding the lock.
- Parameters:
id (str) – The id of the datum
task_id (str) – The id of the task
user_id (Optional[str]) – The id of the user who holds the lock
- Returns:
None
- chariot.datasets.tasks.get_datum_for_task(task_id: str, *, unannotated: bool = False, random: bool = False, id_after: str | None = None, prev_datum_id: str | None = None, skip_prev: bool | None = None) Datum | None [source]
Get the next available datum for the given task. Returns None if there are no datums available.
- Parameters:
task_id (str) – The id of the task
unannotated (Optional[bool]) – If true, only unannotated datums will be returned (defaults to false)
random (Optional[bool]) – If true, returns a random available datum instead of the next available datum (defaults to false)
id_after (Optional[str]) – If provided, will return a datum that is after the given datum id (can be used to resume a task from a specific point, or to skip a specific datum)
prev_datum_id (Optional[str]) – If specified, any lock held by the user on this datum will be released if a new datum is acquired
skip_prev (Optional[bool]) – If true, the datum specified by prev_datum_id will be marked as ‘skipped’ rather than ‘viewed’ when its lock is released, putting it at the end of the task queue
- Returns:
The datum, or None if no datums matching the request are available
- Return type:
Optional[models.Datum]
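The prev_datum_id/skip_prev mechanics amount to a work queue in which skipped datums are requeued at the back. A minimal local sketch of that queueing behavior (a stand-in class, not the Chariot client):

```python
from collections import deque

class TaskQueue:
    # Local stand-in mimicking get_datum_for_task's skip behavior:
    # a datum skipped via skip_prev=True goes to the end of the queue.
    def __init__(self, datum_ids):
        self._queue = deque(datum_ids)

    def get_next(self, prev_datum_id=None, skip_prev=False):
        if skip_prev and prev_datum_id is not None:
            self._queue.append(prev_datum_id)
        return self._queue.popleft() if self._queue else None

q = TaskQueue(["d1", "d2", "d3"])
first = q.get_next()                                     # acquires "d1"
order = []
datum = q.get_next(prev_datum_id=first, skip_prev=True)  # skip "d1"
while datum is not None:
    order.append(datum)
    datum = q.get_next()
# "d1" comes back around at the end: order == ["d2", "d3", "d1"]
```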
- chariot.datasets.tasks.get_datum_for_task_by_id(id: str, task_id: str) Datum | None [source]
Get the specific datum designated by id.
- Parameters:
id (str) – The id of the datum
task_id (str) – The id of the task
- Returns:
The datum, or None if no datum is found or the datum does not apply to the specified task
- Return type:
Optional[models.Datum]
- chariot.datasets.tasks.get_task(id: str) DatumTaskDetails [source]
Get a datum annotation task by id.
- Parameters:
id (str) – Datum annotation task id
- Returns:
The datum annotation task details
- Return type:
- chariot.datasets.tasks.get_task_activity(task_id: str, *, activities: list[DatumTaskActivityCode] | None = None, dataset_ids: list[str] | None = None, user_ids: list[str] | None = None, direction: SortDirection | None = None, sort: TaskActivitySortColumn | None = None, max_items: int | None = None) Generator[DatumTaskActivity, None, None] [source]
Get the activities for the provided task and filters.
- Parameters:
task_id (str) – Id of the task
activities (Optional[List[models.DatumTaskActivityCode]]) – List of activity types to filter by
dataset_ids (Optional[List[str]]) – List of dataset ids to filter by
user_ids (Optional[List[str]]) – List of user ids to filter by
direction (Optional[models.SortDirection]) – Sort direction
sort (Optional[models.TaskActivitySortColumn]) – Sort column
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching task activities
- Return type:
Generator[models.DatumTaskActivity, None, None]
- chariot.datasets.tasks.get_task_datum_count(task_id: str) int [source]
Get the number of datums in the provided task.
- Parameters:
task_id (str) – The id of the task
- Returns:
The datum count
- Return type:
int
- chariot.datasets.tasks.get_task_statistics(id: str, *, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_rectangle: Rectangle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, capture_timestamp_range: TimestampRange | None = None, metadata: dict[str, str] | None = None, asof_timestamp: datetime | None = None, unannotated: bool | None = None, datum_ids: list[str] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None) DatumTaskStatistics [source]
Get datum task statistics with various criteria
- Parameters:
id (str) – Id of datum task to get statistics for
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Filter by task types and associated labels
gps_coordinates_circle (Optional[models.Circle]) – Filter datums within the given circle
gps_coordinates_rectangle (Optional[models.Rectangle]) – Filter datums within the given rectangle
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Filter datums within the given polygon
capture_timestamp_range (Optional[models.TimestampRange]) – Filter by datum capture timestamp
metadata (Optional[Dict[str, str]]) – Filter by datum metadata values
asof_timestamp (Optional[datetime]) – Compute statistics as of the given timestamp
unannotated (Optional[bool]) – If true, only unannotated datums are counted
datum_ids (Optional[List[str]]) – Filter datums with a list of datum ids
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
- Returns:
Datum task statistics
- Return type:
- chariot.datasets.tasks.get_tasks(*, search: str | None = None, exact_name_match: bool | None = None, include_archived: bool | None = None, project_ids: list[str] | None = None, task_ids: list[str] | None = None, sort: TaskSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[DatumTask, None, None] [source]
Get datum annotation tasks that match various criteria. Returns a generator over all matching tasks.
- Parameters:
search (Optional[str]) – Search string (full text search against name and description fields)
exact_name_match (Optional[bool]) – Require search to exactly match the task name (defaults to false)
include_archived (Optional[bool]) – If true, archived tasks will be included in the results (defaults to false)
project_ids (Optional[List[str]]) – Filter by project ids
task_ids (Optional[List[str]]) – Filter by task ids
sort (Optional[models.TaskSortColumn]) – What column to sort the tasks by (defaults to name)
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Task definitions for tasks matching the criteria
- Return type:
Generator[models.DatumTask, None, None]
- chariot.datasets.tasks.get_tasks_activity(exact_name_match: bool | None = None, search: str | None = None, project_ids: list[str] | None = None, task_ids: list[str] | None = None, activities: list[DatumTaskActivityCode] | None = None, dataset_ids: list[str] | None = None, user_ids: list[str] | None = None, direction: SortDirection | None = None, sort: TaskActivitySortColumn | None = None, max_items: int | None = None) Generator[DatumTaskActivity, None, None] [source]
Get the matching activities.
- Parameters:
exact_name_match (Optional[bool]) – Require search filter to match exactly (defaults to false)
search (Optional[str]) – Search string (full text search against task name and description fields)
project_ids (Optional[List[str]]) – List of project ids to filter by
task_ids (Optional[List[str]]) – List of task ids to filter by
activities (Optional[List[models.DatumTaskActivityCode]]) – List of activity types to filter by
dataset_ids (Optional[List[str]]) – List of dataset ids to filter by
user_ids (Optional[List[str]]) – List of user ids to filter by
direction (Optional[models.SortDirection]) – Sort direction
sort (Optional[models.TaskActivitySortColumn]) – Sort column
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
Generator over the matching task activities
- Return type:
Generator[models.DatumTaskActivity, None, None]
chariot.datasets.uploads module
- chariot.datasets.uploads.delete_upload(id: str) Upload [source]
Delete an upload by id. This can only be done if the upload’s status is not COMPLETE or CLEANUP.
- Parameters:
id (str) – Id of the upload to delete
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.delete_upload_and_wait(id: str, *, timeout: float = 5, wait_interval: float = 0.5) None [source]
Delete an upload by id. This can only be done if the upload’s status is not COMPLETE or CLEANUP. Polls for the upload, blocking until the upload has been deleted or the timeout has been reached.
- Parameters:
id (str) – Id of the upload to delete
timeout (float) – Number of seconds to wait for upload deletion (default 5)
wait_interval (float) – Number of seconds between successive calls to check the upload for deletion (default 0.5)
- Raises:
TimeoutError – If the timeout has been reached
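The *_and_wait helpers in this module share a poll-until-done contract: check the resource at wait_interval until the condition holds or the timeout elapses, then raise TimeoutError. A generic sketch of that loop, exercised against a fake resource rather than the Chariot API:

```python
import time

def wait_until_deleted(check_exists, timeout=5.0, wait_interval=0.5):
    # Poll check_exists() until it returns False; raise TimeoutError
    # if the resource still exists when the deadline passes.
    deadline = time.monotonic() + timeout
    while check_exists():
        if time.monotonic() >= deadline:
            raise TimeoutError("resource was not deleted within timeout")
        time.sleep(wait_interval)

# Fake upload that disappears after two existence checks.
polls = {"count": 0}
def fake_exists():
    polls["count"] += 1
    return polls["count"] < 3

wait_until_deleted(fake_exists, timeout=1.0, wait_interval=0.01)
```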
- chariot.datasets.uploads.get_upload_statistics(*, dataset_id: str, type: list[UploadType] | None = None, status: list[UploadStatus] | None = None) UploadStatistics [source]
Get upload statistics with various criteria.
- Parameters:
dataset_id (str) – Id of the dataset to get uploads for
type (Optional[List[models.UploadType]]) – Filter uploads by upload type
status (Optional[List[models.UploadStatus]]) – Filter uploads by upload status
- Returns:
Statistics of uploads matching the criteria
- Return type:
- chariot.datasets.uploads.get_uploads(dataset_id: str, *, type: list[UploadType] | None = None, status: list[UploadStatus] | None = None, sort: UploadSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[Upload, None, None] [source]
Get uploads for a dataset
- Parameters:
dataset_id (str) – Id of the dataset to get uploads for
type (Optional[List[models.UploadType]]) – Filter uploads by upload type
status (Optional[List[models.UploadStatus]]) – Filter uploads by upload status
sort (Optional[models.UploadSortColumn]) – How to sort the uploads
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – The maximum number of uploads to return
- Returns:
Upload details for uploads matching the criteria
- Return type:
Generator[models.Upload, None, None]
- chariot.datasets.uploads.retry_upload(id: str) Upload [source]
Retry processing of an upload that previously did not succeed.
- Parameters:
id (str) – Id of the upload to retry
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.retry_upload_and_wait(id: str, *, timeout: float = 5, wait_interval: float = 0.5) Upload [source]
Retry processing of an upload that previously did not succeed. Polls for the upload, blocking until the upload has finished processing or the timeout has been reached.
- Parameters:
id (str) – Id of the upload to retry
timeout (float) – Number of seconds to wait for upload completion (default 5)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
- chariot.datasets.uploads.upload_bytes(dataset_id: str, *, type: UploadType, data: bytes, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, datum_metadata: dict[str, Any] | None = None, video_sampling_type: VideoSamplingType | None = None, video_sampling_value: float | None = None, video_deinterlace: bool | None = None) Upload [source]
Uploads a set of bytes as a single file. Does not wait for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded.
data (bytes) – Bytes to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
datum_metadata (Optional[Dict[str, Any]]) – When uploading a single datum (type=models.UploadType.DATUM), include custom metadata on this datum
video_sampling_type (Optional[models.VideoSamplingType]) – When uploading a video, optionally control how frames are sampled (at a constant rate, by a ratio of the video’s frame rate, or none [all frames are extracted])
video_sampling_value (Optional[float]) – When uploading a video with a video_sampling_type of VideoSamplingType.RATE or VideoSamplingType.RATIO, this value controls the rate or ratio of sampling (either an FPS value or a multiplier for the video’s FPS, respectively)
video_deinterlace (Optional[bool]) – When uploading a video, optionally have a deinterlacing filter applied prior to extracting frames
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.upload_bytes_and_wait(dataset_id: str, *, type: UploadType, data: bytes, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, datum_metadata: dict[str, Any] | None = None, video_sampling_type: VideoSamplingType | None = None, video_sampling_value: float | None = None, video_deinterlace: bool | None = None, timeout: float = 3600, wait_interval: float = 0.5) Upload [source]
Uploads a set of bytes as a single file, and waits for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded.
data (bytes) – Bytes to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
datum_metadata (Optional[Dict[str, Any]]) – When uploading a single datum (type=models.UploadType.DATUM), include custom metadata on this datum
video_sampling_type (Optional[models.VideoSamplingType]) – When uploading a video, optionally control how frames are sampled (at a constant rate, by a ratio of the video’s frame rate, or none [all frames are extracted])
video_sampling_value (Optional[float]) – When uploading a video with a video_sampling_type of VideoSamplingType.RATE or VideoSamplingType.RATIO, this value controls the rate or ratio of sampling (either an FPS value or a multiplier for the video’s FPS, respectively)
video_deinterlace (Optional[bool]) – When uploading a video, optionally have a deinterlacing filter applied prior to extracting frames
timeout (float) – Number of seconds to wait for upload to complete (default 3600)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
exceptions.UploadValidationError – If the upload fails and has validation errors
exceptions.UploadUnknownError – If the upload fails without a specified reason
exceptions.UploadIncompleteError – If the upload has stopped making progress without reaching a terminal state. Upload should probably be retried
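One way to read the video_sampling_type/video_sampling_value interaction described above (this interpretation is an assumption, not taken from the Chariot source): RATE treats the value as an absolute frame rate, RATIO as a multiplier on the source video's frame rate, and no sampling type means every frame is extracted.

```python
def effective_sampling_fps(source_fps, sampling_type=None, sampling_value=None):
    # Hypothetical helper illustrating the documented semantics.
    if sampling_type == "RATE":
        return sampling_value               # constant extraction rate (FPS)
    if sampling_type == "RATIO":
        return source_fps * sampling_value  # fraction of the source FPS
    return source_fps                       # no sampling: all frames

rate_fps = effective_sampling_fps(30.0, "RATE", 5.0)    # 5 frames/second
ratio_fps = effective_sampling_fps(30.0, "RATIO", 0.5)  # 15 frames/second
```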
- chariot.datasets.uploads.upload_file(dataset_id: str, *, type: UploadType, path: str, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, datum_metadata: dict[str, Any] | None = None, video_sampling_type: VideoSamplingType | None = None, video_sampling_value: float | None = None, video_deinterlace: bool | None = None) Upload [source]
Uploads a single file. Does not wait for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded.
path (str) – Path of file to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
datum_metadata (Optional[Dict[str, Any]]) – When uploading a single datum (type=models.UploadType.DATUM), include custom metadata on this datum
video_sampling_type (Optional[models.VideoSamplingType]) – When uploading a video, optionally control how frames are sampled (at a constant rate, by a ratio of the video’s frame rate, or none [all frames are extracted])
video_sampling_value (Optional[float]) – When uploading a video with a video_sampling_type of VideoSamplingType.RATE or VideoSamplingType.RATIO, this value controls the rate or ratio of sampling (either an FPS value or a multiplier for the video’s FPS, respectively)
video_deinterlace (Optional[bool]) – When uploading a video, optionally have a deinterlacing filter applied prior to extracting frames
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.upload_file_and_wait(dataset_id: str, *, type: UploadType, path: str, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, datum_metadata: dict[str, Any] | None = None, video_sampling_type: VideoSamplingType | None = None, video_sampling_value: float | None = None, video_deinterlace: bool | None = None, timeout: float = 3600, wait_interval: float = 0.5) Upload [source]
Uploads a single file, and waits for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded.
path (str) – Path of file to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
datum_metadata (Optional[Dict[str, Any]]) – When uploading a single datum (type=models.UploadType.DATUM), include custom metadata on this datum
video_sampling_type (Optional[models.VideoSamplingType]) – When uploading a video, optionally control how frames are sampled (at a constant rate, by a ratio of the video’s frame rate, or none [all frames are extracted])
video_sampling_value (Optional[float]) – When uploading a video with a video_sampling_type of VideoSamplingType.RATE or VideoSamplingType.RATIO, this value controls the rate or ratio of sampling (either an FPS value or a multiplier for the video’s FPS, respectively)
video_deinterlace (Optional[bool]) – When uploading a video, optionally have a deinterlacing filter applied prior to extracting frames
timeout (float) – Number of seconds to wait for upload to complete (default 3600)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
exceptions.UploadValidationError – If the upload fails and has validation errors
exceptions.UploadUnknownError – If the upload fails without a specified reason
exceptions.UploadIncompleteError – If the upload has stopped making progress without reaching a terminal state. Upload should probably be retried
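The docs note that UploadIncompleteError means the upload "should probably be retried", while validation and unknown errors are terminal. A hedged sketch of a retry wrapper around the upload-and-wait pattern, using a local stand-in exception class rather than chariot.datasets.exceptions:

```python
class UploadIncompleteError(Exception):
    """Stand-in for chariot.datasets.exceptions.UploadIncompleteError."""

def upload_with_retries(do_upload, max_attempts=3):
    # Retry only the non-terminal failure mode; validation errors and
    # unknown failures propagate to the caller unchanged.
    last_error = None
    for _ in range(max_attempts):
        try:
            return do_upload()
        except UploadIncompleteError as err:
            last_error = err
    raise last_error

# Fake upload that stalls twice, then succeeds on the third attempt.
attempts = {"count": 0}
def flaky_upload():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise UploadIncompleteError("stalled without a terminal state")
    return {"status": "COMPLETE"}

result = upload_with_retries(flaky_upload)
```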
- chariot.datasets.uploads.upload_files_from_urls(dataset_id: str, *, type: UploadType, source_urls: list[str], source_urls_datum_metadata: list[dict[str, Any]] | None = None, annotations_url: str | None = None, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None) Upload [source]
Uploads a list of urls to a dataset as individual datums. Does not wait for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded. Must be one of models.UploadType.{ARCHIVE|DATUM}
source_urls (List[str]) – List of URLs from which the datums are read. len() must be equal to 1 for ARCHIVE upload type.
source_urls_datum_metadata (Optional[List[Dict[str, Any]]]) – When uploading individual datums (type=models.UploadType.DATUM), include custom metadata for datums created by each URL. List index should match the desired source_urls list index, empty array elements should include empty Dicts.
annotations_url (Optional[str]) – URL from which a gzipped annotations file in jsonl format will be downloaded and processed along datums from source_urls. Attribute path in the annotations file will be datum index in source_urls.
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.upload_files_from_urls_and_wait(dataset_id: str, *, type: UploadType, source_urls: list[str], source_urls_datum_metadata: list[dict[str, Any]] | None = None, annotations_url: str | None = None, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, timeout: float = 3600, wait_interval: float = 0.5) Upload [source]
Uploads a list of urls to a dataset as individual datums, and waits for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
type (models.UploadType) – The type of file being uploaded. Must be one of models.UploadType.{ARCHIVE|DATUM}
source_urls (List[str]) – List of URLs from which the datums are read. len() must be equal to 1 for ARCHIVE upload type.
source_urls_datum_metadata (Optional[List[Dict[str, Any]]]) – When uploading individual datums (type=models.UploadType.DATUM), include custom metadata for datums created by each URL. List index should match the desired source_urls list index and empty array elements should include empty Dicts.
annotations_url (Optional[str]) – URL from which a gzipped annotations file in jsonl format will be downloaded and processed along datums from source_urls. Attribute path in the annotations file will be datum index in source_urls.
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
timeout (float) – Number of seconds to wait for upload to complete (default 3600)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
exceptions.UploadValidationError – If the upload fails and has validation errors
exceptions.UploadUnknownError – If the upload fails without a specified reason
exceptions.UploadIncompleteError – If the upload has stopped making progress without reaching a terminal state. Upload should probably be retried
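As noted above, source_urls_datum_metadata must align index-for-index with source_urls, with empty dicts for URLs that carry no metadata. A small sketch of building that aligned list (the URLs and metadata keys are illustrative):

```python
source_urls = [
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
    "https://example.com/c.jpg",
]
# Metadata is known for only one of the datums.
metadata_by_url = {"https://example.com/b.jpg": {"camera": "north"}}

# Pad every URL without metadata with an empty dict so indices line up.
source_urls_datum_metadata = [metadata_by_url.get(url, {}) for url in source_urls]
```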
- chariot.datasets.uploads.upload_folder(dataset_id: str, *, path: str, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None) Upload [source]
Uploads the contents of a folder. Equivalent to creating an archive from that folder and then uploading that archive with type=UploadType.ARCHIVE. Does not wait for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
path (str) – Path of folder to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
- Returns:
The upload details
- Return type:
- chariot.datasets.uploads.upload_folder_and_wait(dataset_id: str, *, path: str, max_validation_errors: int | None = None, image_validation: bool | None = None, split: SplitName | None = None, timeout: float = 3600, wait_interval: float = 0.5) Upload [source]
Uploads the contents of a folder. Equivalent to creating an archive from that folder and then uploading that archive with type=UploadType.ARCHIVE. Waits for the upload to complete processing.
- Parameters:
dataset_id (str) – Id of the dataset to upload to
path (str) – Path of folder to upload
max_validation_errors (Optional[int]) – Maximum number of validation errors to tolerate before failing the upload
image_validation (Optional[bool]) – Whether or not to perform extra validations on image datums
split (Optional[models.SplitName]) – Name of split to upload datums to.
timeout (float) – Number of seconds to wait for upload to complete (default 3600)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
exceptions.UploadValidationError – If the upload fails and has validation errors
exceptions.UploadUnknownError – If the upload fails without a specified reason
exceptions.UploadIncompleteError – If the upload has stopped making progress without reaching a terminal state. Upload should probably be retried
- chariot.datasets.uploads.wait_for_upload(id: str, *, timeout: float = 3600, wait_interval: float = 0.5) Upload [source]
Polls the given upload until it has finished processing.
- Parameters:
id (str) – Id of the upload to wait for
timeout (float) – Number of seconds to wait for upload to complete (default 3600)
wait_interval (float) – Number of seconds between successive calls to check the upload for completion (default 0.5)
- Returns:
The upload details
- Return type:
- Raises:
TimeoutError – If the timeout has been reached
exceptions.UploadValidationError – If the upload fails and has validation errors
exceptions.UploadUnknownError – If the upload fails without a specified reason
exceptions.UploadIncompleteError – If the upload has stopped making progress without reaching a terminal state. Upload should probably be retried
chariot.datasets.views module
- chariot.datasets.views.create_view(*, dataset_id: str, name: str, split_algorithm: SplitAlgorithm | None = None, apply_default_split: bool | None = None, splits: dict[SplitName, float] | None = None, metadata: dict[str, str] | None = None, capture_timestamp_range: TimestampRange | None = None, gps_coordinates_circle: Circle | None = None, gps_coordinates_polygon: list[GeoPoint] | None = None, gps_coordinates_rectangle: Rectangle | None = None, task_type_label_filters: list[TaskTypeLabelFilter] | None = None, approval_status: list[str] | None = None, annotation_metadata: dict[str, str] | None = None, sample_count: int | None = None) View [source]
Create a new view in the given dataset.
- Parameters:
dataset_id (str) – Id of dataset to create new view in
name (str) – View name
split_algorithm (Optional[models.SplitAlgorithm]) – Splitting algorithm for the view (defaults to Random)
apply_default_split (Optional[bool]) – Whether default splits are used when splitting (defaults to true)
splits (Optional[Dict[models.SplitName, float]]) – Split weights for splitting datums in the view
metadata (Optional[Dict[str, str]]) – Add metadata filter to view
capture_timestamp_range (Optional[models.TimestampRange]) – Add capture timestamp range filter to view
gps_coordinates_circle (Optional[models.Circle]) – Add circle filter to view
gps_coordinates_polygon (Optional[List[models.GeoPoint]]) – Add polygon filter to view
gps_coordinates_rectangle (Optional[models.Rectangle]) – Add rectangle filter to view
task_type_label_filters (Optional[List[models.TaskTypeLabelFilter]]) – Add filter for task types and associated labels to view
approval_status (Optional[List[str]]) – Filter by annotation approval status
annotation_metadata (Optional[Dict[str, str]]) – Filter by annotation metadata values
sample_count (Optional[int]) – Sample count for the view
- Returns:
View details for the newly created view
- Return type:
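Before calling create_view with explicit splits, it is worth validating that the split weights form a sensible distribution; whether Chariot requires the weights to sum to 1.0 is an assumption here, but the check is cheap:

```python
import math

def validate_splits(splits):
    # Hypothetical pre-flight check: weights positive, summing to 1.0.
    if any(weight <= 0 for weight in splits.values()):
        raise ValueError("split weights must be positive")
    if not math.isclose(sum(splits.values()), 1.0):
        raise ValueError("split weights should sum to 1.0")
    return splits

splits = validate_splits({"train": 0.8, "val": 0.1, "test": 0.1})
```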
- chariot.datasets.views.delete_view(id: str) View [source]
Delete a view by id. The artifacts for the view will be deleted as well.
- Parameters:
id (str) – Id of view to delete
- Returns:
View that was deleted
- Return type:
- chariot.datasets.views.get_all_view_count(*, name: str | None = None, exact_name_match: bool | None = None, view_ids: list[str] | None = None) int [source]
Get the number of views across all datasets. Only admin users can access this function.
- Parameters:
name (Optional[str]) – Filter views counted by name
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
view_ids (Optional[List[str]]) – Filter by view ids
- Returns:
Number of views in all datasets
- Return type:
int
- chariot.datasets.views.get_all_views(*, name: str | None = None, exact_name_match: bool | None = None, view_ids: list[str] | None = None, sort: ViewSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[View, None, None] [source]
Get views for all datasets with various criteria. Returns a generator over all matching views. Only admin users can access this function.
- Parameters:
name (Optional[str]) – Filter by view name
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
view_ids (Optional[List[str]]) – Filter by view ids
sort (Optional[models.ViewSortColumn]) – What column to sort the views by (defaults to name)
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
View details for views matching the criteria
- Return type:
Generator[models.View, None, None]
- chariot.datasets.views.get_dataset_view_count(dataset_id: str, *, name: str | None = None, exact_name_match: bool | None = None, view_ids: list[str] | None = None) int [source]
Get the number of views in the given dataset.
- Parameters:
dataset_id (str) – Id of dataset to get number of views in
name (Optional[str]) – Filter views counted by name
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
view_ids (Optional[List[str]]) – Filter by view ids
- Returns:
Number of views in the provided dataset id
- Return type:
int
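The count endpoints accept the same name and id filters as the listing calls, which makes them useful for checking result-set size before iterating a generator. The sketch below uses a stand-in that mirrors the documented signature of `get_dataset_view_count`; the dataset id and view names are invented for illustration:

```python
from typing import List, Optional

# Invented sample data; in a real deployment the counts come from the
# Chariot service, not a local dict.
_VIEWS_BY_DATASET = {"ds-123": ["train", "train-augmented", "validation"]}

# Stand-in mirroring chariot.datasets.views.get_dataset_view_count.
def get_dataset_view_count(dataset_id: str, *, name: Optional[str] = None,
                           exact_name_match: Optional[bool] = None,
                           view_ids: Optional[List[str]] = None) -> int:
    views = _VIEWS_BY_DATASET.get(dataset_id, [])
    if view_ids is not None:
        views = [v for v in views if v in view_ids]
    if name is not None:
        if exact_name_match:
            views = [v for v in views if v == name]
        else:
            views = [v for v in views if name in v]
    return len(views)

print(get_dataset_view_count("ds-123"))                # 3
print(get_dataset_view_count("ds-123", name="train"))  # 2
```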
- chariot.datasets.views.get_dataset_views(dataset_id: str, *, name: str | None = None, exact_name_match: bool | None = None, view_ids: list[str] | None = None, sort: ViewSortColumn | None = None, direction: SortDirection | None = None, max_items: int | None = None) Generator[View, None, None] [source]
Get views in the given dataset matching various criteria. Returns a generator over all matching views.
- Parameters:
dataset_id (str) – Id of dataset to search for views in
name (Optional[str]) – Filter by view name
exact_name_match (Optional[bool]) – Require name filter to match exactly (defaults to false)
view_ids (Optional[List[str]]) – Filter by view ids
sort (Optional[models.ViewSortColumn]) – What column to sort the views by (defaults to name)
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
max_items (Optional[int]) – Limit the returned generator to only produce this many items
- Returns:
View details for views matching the criteria
- Return type:
Generator[models.View, None, None]
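The `sort`, `direction`, and `max_items` parameters combine to page through a dataset's views in a predictable order. The sketch below again substitutes a local stand-in for `chariot.datasets.views.get_dataset_views`; the `SortDirection` enum here is a hypothetical stand-in for `models.SortDirection`, and its member names are assumptions, not taken from the real SDK:

```python
from enum import Enum
from typing import Generator, Optional

# Hypothetical stand-in for models.SortDirection; member names are
# assumptions for illustration only.
class SortDirection(Enum):
    ASCENDING = "asc"
    DESCENDING = "desc"

# Stand-in mimicking chariot.datasets.views.get_dataset_views with
# invented view names; the documented default sort column is name.
def get_dataset_views(dataset_id: str, *,
                      direction: Optional[SortDirection] = None,
                      max_items: Optional[int] = None) -> Generator[str, None, None]:
    views = ["validation", "train", "test"]
    ordered = sorted(views, reverse=(direction is SortDirection.DESCENDING))
    yield from ordered[:max_items]

print(list(get_dataset_views("ds-123", direction=SortDirection.DESCENDING, max_items=2)))
# ['validation', 'train']
```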
- chariot.datasets.views.get_view_timeline(id: str, *, max_items: int | None = None, direction: SortDirection | None = None, since_last_snapshot: bool | None = None, min_groups: int | None = None, max_ungrouped_events: int | None = None) Iterator[DatasetTimelineEvent] [source]
Get a series of dataset change events affecting the given view, ordered by time and grouped by event type.
- Parameters:
id (str) – Id of view to get events for
max_items (Optional[int]) – Limit the returned generator to only produce this many items
direction (Optional[models.SortDirection]) – Whether to sort in ascending or descending order
since_last_snapshot (Optional[bool]) – Whether to return only events since the last snapshot for this view (defaults to false)
min_groups (Optional[int]) – How many groups are required before grouping behavior is turned on
max_ungrouped_events (Optional[int]) – The maximum number of events allowed before grouping behavior is turned on
- Returns:
Events for the view
- Return type:
Iterator[models.DatasetTimelineEvent]
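A typical use of the timeline is to tally which kinds of changes have affected a view since it was created. The sketch below uses a minimal stand-in for `get_view_timeline` and a simplified `DatasetTimelineEvent`; the field names and event-type strings are assumptions for illustration, not the real `models.DatasetTimelineEvent` schema:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterator, Optional

# Simplified stand-in for models.DatasetTimelineEvent; the field names
# and event-type strings are assumptions, not the real schema.
@dataclass
class DatasetTimelineEvent:
    event_type: str
    count: int

# Stand-in mimicking chariot.datasets.views.get_view_timeline with
# invented events.
def get_view_timeline(id: str, *,
                      max_items: Optional[int] = None) -> Iterator[DatasetTimelineEvent]:
    events = [
        DatasetTimelineEvent("datums_added", 120),
        DatasetTimelineEvent("annotations_added", 340),
        DatasetTimelineEvent("datums_added", 15),
    ]
    yield from events[:max_items]

# Tally how many events of each type affected the view.
tally = Counter(e.event_type for e in get_view_timeline("view-abc"))
print(tally["datums_added"])  # 2
```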