Supported loader formats
| Format | Status | Notes |
| --- | --- | --- |
| coco | Implemented | COCO JSON with boxes normalized from xywh to xyxy |
| csv | Implemented | Flat annotation tables with path, bbox, label, and optional split columns |
| json / jsonl / ndjson | Implemented | Simple records as a JSON list, records object, or line-delimited objects |
| yolo | Implemented | YOLO TXT labels with split inferred from label subfolders |
| voc | Implemented | Pascal VOC XML with ImageSets split files when present |
Recommended layout
dataset_root/
    annotations/
        instances_train.json
        instances_val.json
        instances_test.json
    images/
        image_0001.png
        image_0002.png
Normalized loader payload
load_dataset() returns images, annotations, samples, categories, category_map, splits, and meta for every supported adapter. Annotation boxes use x_min, y_min, x_max, and y_max. Image, annotation, and sample records carry the resolved split so training helpers can filter deterministically.
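As a rough illustration of split filtering on the normalized payload, the sketch below assumes the payload behaves like a mapping keyed by the names listed above and that each record is a dict with a `split` field. The `image_id` field, the sample values, and the `filter_by_split` helper are hypothetical and not part of the loader API.

```python
from typing import Any

# Hypothetical payload fragment: only the "annotations" key, the box field
# names, and the per-record "split" come from the description above.
payload: dict[str, Any] = {
    "annotations": [
        {"image_id": 1, "x_min": 100, "y_min": 120, "x_max": 150, "y_max": 200, "split": "train"},
        {"image_id": 2, "x_min": 10, "y_min": 20, "x_max": 60, "y_max": 90, "split": "val"},
    ],
}


def filter_by_split(records: list[dict[str, Any]], split: str) -> list[dict[str, Any]]:
    """Return the records whose resolved split matches, preserving input order."""
    return [record for record in records if record.get("split") == split]


train_annotations = filter_by_split(payload["annotations"], "train")
```

Because the filter keeps input order and depends only on the stored `split` value, repeated runs over the same payload select the same records.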
COCO example
{
  "images": [{"id": 1, "file_name": "image_0001.png", "width": 1024, "height": 1024}],
  "annotations": [{"id": 1, "image_id": 1, "category_id": 0, "bbox": [100, 120, 50, 80], "area": 4000, "iscrowd": 0}],
  "categories": [{"id": 0, "name": "vessel"}]
}
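To make the xywh-to-xyxy normalization concrete, here is a minimal sketch that parses the example above and converts its box. The `xywh_to_xyxy` helper is illustrative only and is not the loader's own function.

```python
import json

coco_text = """
{
  "images": [{"id": 1, "file_name": "image_0001.png", "width": 1024, "height": 1024}],
  "annotations": [{"id": 1, "image_id": 1, "category_id": 0, "bbox": [100, 120, 50, 80], "area": 4000, "iscrowd": 0}],
  "categories": [{"id": 0, "name": "vessel"}]
}
"""
coco = json.loads(coco_text)


def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


boxes_xyxy = [xywh_to_xyxy(ann["bbox"]) for ann in coco["annotations"]]
print(boxes_xyxy)  # [[100, 120, 150, 200]]
```

The `area` value in the example is consistent with the box: 50 × 80 = 4000.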
Practical rule
Use the generic loaders for data exploration and with the lightweight helpers. Convert to COCO JSON before using the native runtime helpers with detector_spec=.... The native datamodule resolves Annotations/train_annotations.json, Annotations/val_annotations.json, and Annotations/test_annotations.json, or the matching annotations/instances_*.json files. It supports shared and split-specific paired image/target transforms. During setup it raises NativeDataValidationError if the requested training, validation, or test split has no samples.
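Before handing a dataset to the native datamodule, it can help to check which annotation file a split would pick up. The sketch below only mirrors the two naming patterns mentioned above. The helper names and the order in which the candidates are tried are assumptions, not the datamodule's actual resolution logic.

```python
from pathlib import Path


def candidate_annotation_paths(root, split):
    """List the two documented file name patterns for one split (order is assumed)."""
    root = Path(root)
    return [
        root / "Annotations" / f"{split}_annotations.json",
        root / "annotations" / f"instances_{split}.json",
    ]


def find_annotation_file(root, split):
    """Return the first candidate that exists on disk, or None if neither does."""
    for path in candidate_annotation_paths(root, split):
        if path.is_file():
            return path
    return None


for split in ("train", "val", "test"):
    found = find_annotation_file("dataset_root", split)
    print(split, "->", found if found is not None else "no annotation file found")
```

A split that prints "no annotation file found" here is worth fixing before setup, since a requested split with no samples raises NativeDataValidationError.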