Annotation Format
Annotations of a dataset can be specified by including an annotations.json or annotations.jsonl file in the root of your archive. An annotations.json file contains an array of annotation JSON objects, whereas an annotations.jsonl file should contain one annotation JSON object per line. An annotation JSON object has the following fields:
Note: path must precede annotations, or the path will be ignored and the system will inform you that some of your annotations are missing a valid path.
Annotations may also be added by uploading an individual annotations.jsonl file using the SDK; see the SDK documentation for more information. For annotation files uploaded outside of an archive, replace the path attribute described in this section with a datum_id attribute. The datum id can be retrieved from the datum's URL in the UI or when interacting with datums via the SDK.
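For annotations uploaded this way, a line of the file might look like the following, where <datum-id> is a placeholder for an actual datum id:

{"datum_id": "<datum-id>", "annotations": [{"class_label": "dog"}]}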
- path: Absolute path to the image or text file within the compressed file structure; see the example below.
- annotations: A list of objects, one for each annotated object in the image or text file.
  - For images, there are four annotation entry options:
    - Image Classification: class_label
    - Image Segmentation: class_label and contour
    - Object Detection: class_label and bbox
    - Oriented Object Detection: class_label and oriented_bbox
  - For text, there are three annotation entry options:
    - Text Classification: context and label
    - Text Token Classification: start, end, and label
    - Text Generation: context and generated_text
All class_label values in your annotation files must be strings (e.g., "cat", "dog", "1", "2"). Do not use integers or other types for class labels; using integers will cause your dataset upload to fail.
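As a sketch of how such files might be generated, the following uses only the Python standard library; the paths and labels are hypothetical, and str() coerces every label (even a numeric one) to the required string type:

import json

# Hypothetical (path, label) pairs; note the second label is not a string yet
records = [("a/b/c/img1.png", "dog"), ("a/b/c/img2.png", 2)]

# annotations.jsonl: one annotation JSON object per line
with open("annotations.jsonl", "w") as f:
    for path, label in records:
        f.write(json.dumps({"path": path, "annotations": [{"class_label": str(label)}]}) + "\n")

# annotations.json: a single array of annotation JSON objects
with open("annotations.json", "w") as f:
    json.dump([{"path": p, "annotations": [{"class_label": str(l)}]} for p, l in records], f, indent=2)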
Annotation Examples
Task Type Field
All annotations specify a task_type field to categorize their purpose. This field is not required for uploads; the task_type is inferred from the fields included with the upload. However, it will be present in the file when a dataset archive is downloaded from Chariot. While not required, it is recommended that users include a task type so that uploaded files match what they will see in a downloaded archive of the dataset as closely as possible.
The following task types are valid:
- Image Classification
- Object Detection
- Oriented Object Detection
- Image Segmentation
- Token Classification
- Text Classification
- Text Generation
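As a sketch only, an image classification annotation that includes an explicit task type might look like the following; this mirrors the per-annotation task_type placement shown in the Text Classification example later on this page:

{"path": "a/b/c/img1.png", "annotations": [{"task_type": "Image Classification", "class_label": "dog"}]}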
Image Classification
A dataset supporting image classification may have any number of Image Classification annotations. For example, the following .jsonl file defines annotations for a dataset consisting of two images, the first of which is a dog and the second of which is a cat.
{"path": "a/b/c/img1.png", "annotations": [{"class_label": "dog"}]}
{"path": "a/b/c/img2.png", "annotations": [{"class_label": "cat"}]}
Object Detection
For a dataset to support object detection, each annotation should have a bbox field, with the keys xmin, ymin, xmax, and ymax specifying the bounding box.
In the example below, a dataset contains three images: the first contains a dog and a person, the second contains a single object (a cat), and the third contains no objects of interest.
{"path": "a/b/d/img1.png", "annotations": [{"class_label": "dog", "bbox": {"xmin": 16, "ymin": 130, "xmax": 70, "ymax": 150}}, {"class_label": "person", "bbox": {"xmin": 89, "ymin": 10, "xmax": 97, "ymax": 110}}]}
{"path": "a/b/d/img2.png", "annotations": [{"class_label": "cat", "bbox": {"xmin": 500, "ymin": 220, "xmax": 530, "ymax": 260}}]}
{"path": "a/b/d/img3.png", "annotations": []}
Oriented Object Detection
For oriented object detection, each annotation should have an oriented_bbox field, with the keys cx, cy, w, h, and r specifying the oriented bounding box.
The keys are defined as:
- cx: the center X coordinate of the bounding box, as a fraction of the image's width
- cy: the center Y coordinate of the bounding box, as a fraction of the image's height
- w: the width of the bounding box, as a fraction of the image's width
- h: the height of the bounding box, as a fraction of the image's height
- r: the rotation of the bounding box, in radians, defined as the clockwise angle between the w edge of the bounding box and the image's width axis
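Because these values are fractions of the image size rather than pixel coordinates, converting from pixel space means dividing by the image dimensions. A minimal sketch (the pixel-space box and image size here are hypothetical):

def to_oriented_bbox(cx_px, cy_px, w_px, h_px, r_rad, img_w, img_h):
    # Normalize a pixel-space rotated box into a fractional oriented_bbox
    return {
        "cx": cx_px / img_w,  # center x as a fraction of image width
        "cy": cy_px / img_h,  # center y as a fraction of image height
        "w": w_px / img_w,    # box width as a fraction of image width
        "h": h_px / img_h,    # box height as a fraction of image height
        "r": r_rad,           # clockwise angle between the w edge and the width axis, in radians
    }

print(to_oriented_bbox(332.8, 524.8, 44.8, 12.8, 0.17, 640, 640))
# ≈ {'cx': 0.52, 'cy': 0.82, 'w': 0.07, 'h': 0.02, 'r': 0.17}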
In the example below, a dataset contains two images: the first contains a dog and a person, and the second contains a single object (a cat).
{"path": "a/b/d/img1.png", "annotations": [{"class_label": "dog", "oriented_bbox": {"cx": 0.52, "cy": 0.82, "w": 0.07, "h": 0.02, "r": 0.17}}, {"class_label": "person", "oriented_bbox": {"cx": 0.13, "cy": 0.43, "w": 0.18, "h": 0.08, "r": 0.06}}]}
{"path": "a/b/d/img2.png", "annotations": [{"class_label": "cat", "oriented_bbox": {"cx": 0.85, "cy": 0.15, "w": 0.38, "h": 0.28, "r": 0.97}}]}
Image Segmentation
For segmentation tasks, polygon contours must be specified. A contour is a list of lists of points, which makes it possible to describe an occluded view of a single object using multiple regions.
For example, in an image of a car parked behind a telephone pole, the annotator can specify two regions that together describe the car: the outer list encapsulates the full contour, and the two inner lists describe the points within each visible region.
{"path": "a/b/c/img1.png", "annotations": [{"class_label": "dog", "contour": [[{"x": 10.0, "y": 15.5}, {"x": 20.9, "y": 50.2}, {"x": 25.9, "y": 28.4}]]}]}
{"path": "a/b/c/img2.png", "annotations": [{"class_label": "cat", "contour": [[{"x": 97.2, "y": 40.2}, {"x": 33.33, "y": 44.3}, {"x": 10.9, "y": 18.7}]]}]}
Text Classification
Text classification tasks assign a global label to a selection of text. For example, text within a dataset might say that "the economy in the USA has grown at twice the rate of that of the UK." The classification task might have the context "Is the content of this text pro-America?" In this case, an annotator could label this text as positive.
{"path": "a/b/c/text1.text","annotations":[{"task_type":"Text Classification","text_classification":{"context":"Pro-America?","label":"positive"}}]}
Text Token Classification
Token classification is a more granular version of text classification that labels the characters, words, or phrases within the selection of text. Using the above example, an annotator could label the word "USA" as a "place." This is a common form of token classification called named-entity recognition, in which words are labeled.
{
  "path": "a/b/c/text1.text",
  "annotations": [
    {
      "token_classification": {
        "start_position": 19,
        "end_position": 22,
        "label": "place"
      }
    }
  ]
}
The start and end positions should be specified at the character level, not the word level. Characters in the text string are indexed from 0, so the character at position 19 (the "U" in "USA") is the 20th character in the string. The annotated span begins at the start_position (19 in the above example) and excludes the character at the end_position; thus the characters of "USA" occupy positions 19, 20, and 21.
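These semantics match Python string slicing, which offers a quick way to sanity-check positions (using the example sentence from above):

text = "the economy in the USA has grown at twice the rate of that of the UK."
# start_position is inclusive and end_position is exclusive, indexing from 0
assert text[19:22] == "USA"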
Text Generation
Text generation annotations cover all forms of generated text, such as text summarization and text translation.
Following the example above, a text summarization annotation of a text file could produce the summary "USA economy is growing."
{
  "path": "a/b/c/text1.text",
  "task_type": "Text Generation",
  "text_generation": {
    "context": "summary",
    "generated_text": "USA economy is growing."
  }
}
Running a text translation annotation to translate the above example into French would produce "L'économie américaine en croissance."
{
  "path": "a/b/c/text1.text",
  "task_type": "Text Generation",
  "text_generation": {
    "context": "translation-english-french",
    "generated_text": "L'économie américaine en croissance"
  }
}
Annotation Approval Status and Metadata
All annotations may optionally specify an approval_status field with one of three values (needs_review, verified, or rejected) and a metadata field containing metadata associated with the annotation. These fields are not required for uploads; if provided, they will be present in the downloaded archive of the dataset.
{"path": "a/b/c/img1.png", "annotations": [{"class_label": "dog", "approval_status": "verified", "metadata": {"annotator": "alice", "confidence": 0.9}}]}
Example Image Annotation File
{
  "annotations": [
    {
      "path": "data_folder/a.jpg",
      "annotations": [
        {
          "contour": [
            [
              {"x": 10.0, "y": 15.5},
              {"x": 20.9, "y": 50.2},
              {"x": 25.9, "y": 28.4}
            ],
            [
              {"x": 60.0, "y": 15.5},
              {"x": 70.9, "y": 50.2},
              {"x": 75.9, "y": 28.4}
            ]
          ],
          "class_label": "standing",
          "approval_status": "needs_review",
          "metadata": {"annotator": "bryan", "confidence": 0.6}
        }
      ]
    },
    {
      "path": "data_folder/b.jpg",
      "annotations": [
        {
          "bbox": {"ymin": 174.02, "xmin": 25.89, "ymax": 448.72, "xmax": 289.08},
          "class_label": "other",
          "approval_status": "verified"
        }
      ]
    },
    {
      "path": "data_folder/c.jpg",
      "annotations": [
        {
          "oriented_bbox": {"cx": 0.52, "cy": 0.83, "w": 0.07, "h": 0.02, "r": 0.17},
          "class_label": "lying",
          "metadata": {"annotator": "felix", "note": "between lying and sitting"}
        }
      ]
    },
    {
      "path": "data_folder/d.jpg",
      "annotations": [
        {
          "class_label": "sitting",
          "approval_status": "rejected"
        }
      ]
    }
  ]
}
Example Text Annotation File
{
  "annotations": [
    {
      "path": "data_folder/a.txt",
      "annotations": [
        {
          "token_classification": {
            "start_position": 0,
            "end_position": 5,
            "label": "animal"
          },
          "approval_status": "rejected",
          "metadata": {"annotator": "bryan", "note": "invalid label value"}
        },
        {
          "token_classification": {
            "start_position": 16,
            "end_position": 22,
            "label": "noun"
          },
          "approval_status": "verified"
        }
      ]
    },
    {
      "path": "data_folder/b.txt",
      "annotations": [
        {
          "translation": {
            "origin_language": "english",
            "target_language": "spanish",
            "translation": "son las gatas malas"
          },
          "approval_status": "verified",
          "metadata": {"annotator": "bryan", "note": "straight"}
        }
      ]
    },
    {
      "path": "data_folder/c.txt",
      "annotations": [
        {
          "sentiment": {
            "context": "Can this sport be dangerous?",
            "label": "positive"
          },
          "approval_status": "rejected"
        }
      ]
    },
    {
      "path": "data_folder/d.txt",
      "annotations": [
        {
          "summarization": {
            "summary": "A notoriously bad football team lately."
          }
        }
      ]
    }
  ]
}
Example Conversion to Chariot Dataset
A Chariot dataset may be populated with datums and annotations using an archive as discussed above, or the SDK may be used to upload datums and/or annotations. Code examples converting a COCO-formatted archive (an example archive may be downloaded here) to Chariot are provided below for each approach. Note that there are additional possible implementations, e.g., uploading one datum at a time or uploading a bulk set of annotations by themselves. Depending on the scenario, different implementations may be more appropriate; for example, a compressed archive will upload faster but will fail entirely if there are any issues, such as formatting errors, while uploading individual datums and/or annotations is slower but a single incorrectly formatted upload will not cause the remaining properly formatted ones to fail.
Creating a Chariot Archive
import json
import os
import shutil

import cv2
import numpy as np

COCO_IMG_ROOT_PATH, COCO_ANN_PATH = "./coco/val2017/val2017", "./coco/anns.json"
DST_PATH = "./coco/coco_converted_to_chariot_archive"
os.makedirs(DST_PATH, exist_ok=True)

with open(COCO_ANN_PATH, "r") as f:
    coco_annotations = json.load(f)

# Copy each image into the archive directory and start an empty annotation list for it
chariot_anns = {}
for i in coco_annotations["images"]:
    chariot_anns[i["id"]] = {"path": i["file_name"], "annotations": []}
    shutil.copy(os.path.join(COCO_IMG_ROOT_PATH, i["file_name"]), os.path.join(DST_PATH, i["file_name"]))

cat_id_to_label = {c["id"]: c["name"] for c in coco_annotations["categories"]}

for a in coco_annotations["annotations"]:
    # COCO bboxes are [xmin, ymin, width, height]; Chariot expects corner coordinates
    chariot_anns[a["image_id"]]["annotations"].append(
        {
            "class_label": cat_id_to_label[a["category_id"]],
            "bbox": {
                "xmin": a["bbox"][0],
                "ymin": a["bbox"][1],
                "xmax": a["bbox"][0] + a["bbox"][2],
                "ymax": a["bbox"][1] + a["bbox"][3],
            },
        }
    )
    if not a["iscrowd"]:
        # COCO polygon segmentations are flat [x1, y1, x2, y2, ...] lists
        chariot_anns[a["image_id"]]["annotations"].append(
            {
                "class_label": cat_id_to_label[a["category_id"]],
                "contour": [[{"x": x, "y": y} for x, y in zip(a["segmentation"][0][::2], a["segmentation"][0][1::2])]],
            }
        )
    else:  # In this case, COCO segmentation is a binary mask formatted as Run Length Encoding
        height, width = a["segmentation"]["size"]
        mask = np.zeros(height * width, dtype=np.uint8)
        idx = 0
        # Binary mask in RLE format is represented by a sequence of counts of 0's and counts of 1's
        for neg_count, pos_count in zip(a["segmentation"]["counts"][::2], a["segmentation"]["counts"][1::2]):
            mask[idx + neg_count:idx + neg_count + pos_count] = 1
            idx += neg_count + pos_count
        mask = mask.reshape((height, width), order="F")
        # Pad the mask to avoid contour artifacts at the image border
        padded_mask = np.pad(mask, pad_width=1, mode="constant", constant_values=0)
        contours, _ = cv2.findContours(padded_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            contour -= 1  # Remove padding offset
            chariot_anns[a["image_id"]]["annotations"].append(
                {
                    "class_label": cat_id_to_label[a["category_id"]],
                    "contour": [[{"x": int(vertex[0][0]), "y": int(vertex[0][1])} for vertex in contour]],
                }
            )

with open(os.path.join(DST_PATH, "annotations.jsonl"), "w") as f:
    for v in chariot_anns.values():
        f.write(json.dumps(v) + "\n")
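The resulting directory can then be compressed for upload, for example with the standard library (the .zip format and archive name here are arbitrary choices):

import shutil

# Produces ./coco/coco_converted_to_chariot_archive.zip with the images and
# annotations.jsonl at the archive root
shutil.make_archive(DST_PATH, "zip", DST_PATH)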
Creating annotations via SDK
import json

import cv2
import numpy as np

from chariot.datasets.annotations import create_annotation
from chariot.datasets.datasets import create_dataset
from chariot.datasets.datums import get_upload_datums
from chariot.datasets.models import BoundingBox, DatasetType, Point, UploadType
from chariot.datasets.uploads import upload_file_and_wait

COCO_IMG_ARCHIVE_PATH, COCO_ANN_PATH = "./coco/val2017_slim.zip", "./coco/anns.json"

with open(COCO_ANN_PATH, "r") as f:
    coco_annotations = json.load(f)

# Create the dataset and upload the image archive, waiting for processing to finish
dataset = create_dataset(name="COCO conversion example via SDK", type=DatasetType.IMAGE, project_id="<project-id>")
upload = upload_file_and_wait(dataset.id, type=UploadType.ARCHIVE, path=COCO_IMG_ARCHIVE_PATH)

# Map each COCO image id to the id of the datum created by the upload
source_file_to_image_id = {i["file_name"]: i["id"] for i in coco_annotations["images"]}
image_id_to_datum_id = {source_file_to_image_id[d.metadata["SourceFile"]]: d.id for d in get_upload_datums(upload.id)}
cat_id_to_label = {c["id"]: c["name"] for c in coco_annotations["categories"]}

for a in coco_annotations["annotations"]:
    # COCO bboxes are [xmin, ymin, width, height]; Chariot expects corner coordinates
    create_annotation(
        image_id_to_datum_id[a["image_id"]],
        class_label=cat_id_to_label[a["category_id"]],
        bbox=BoundingBox(
            xmin=a["bbox"][0],
            ymin=a["bbox"][1],
            xmax=a["bbox"][0] + a["bbox"][2],
            ymax=a["bbox"][1] + a["bbox"][3],
        ),
    )
    if not a["iscrowd"]:
        # COCO polygon segmentations are flat [x1, y1, x2, y2, ...] lists
        create_annotation(
            image_id_to_datum_id[a["image_id"]],
            class_label=cat_id_to_label[a["category_id"]],
            contour=[[Point(x=x, y=y) for x, y in zip(a["segmentation"][0][::2], a["segmentation"][0][1::2])]],
        )
    else:  # In this case, COCO segmentation is a binary mask formatted as Run Length Encoding
        height, width = a["segmentation"]["size"]
        mask = np.zeros(height * width, dtype=np.uint8)
        idx = 0
        # Binary mask in RLE format is represented by a sequence of counts of 0's and counts of 1's
        for neg_count, pos_count in zip(a["segmentation"]["counts"][::2], a["segmentation"]["counts"][1::2]):
            mask[idx + neg_count:idx + neg_count + pos_count] = 1
            idx += neg_count + pos_count
        mask = mask.reshape((height, width), order="F")
        # Pad the mask to avoid contour artifacts at the image border
        padded_mask = np.pad(mask, pad_width=1, mode="constant", constant_values=0)
        contours, _ = cv2.findContours(padded_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            contour -= 1  # Remove padding offset
            create_annotation(
                image_id_to_datum_id[a["image_id"]],
                class_label=cat_id_to_label[a["category_id"]],
                contour=[[Point(x=int(vertex[0][0]), y=int(vertex[0][1])) for vertex in contour]],
            )