SimpleDet Docs

Datasets

SimpleDet can inspect several dataset formats, but the suite plus pipeline workflow is still centered on COCO-style annotations.

Use this page to decide which dataset formats are suitable for exploration and which ones must be converted before full training.

Supported loader formats

Format                | Status      | Notes
coco                  | Implemented | COCO JSON with boxes normalized from xywh to xyxy
csv                   | Implemented | Flat annotation tables with path, bbox, label, and optional split columns
json / jsonl / ndjson | Implemented | Simple records as a JSON list, records object, or line-delimited objects
yolo                  | Implemented | YOLO TXT labels with split inferred from label subfolders
voc                   | Implemented | Pascal VOC XML with ImageSets split files when present

Recommended layout

dataset_root/
  annotations/
    instances_train.json
    instances_val.json
    instances_test.json
  images/
    image_0001.png
    image_0002.png

Normalized loader payload

load_dataset() returns images, annotations, samples, categories, category_map, splits, and meta for every supported adapter. Annotation boxes use x_min, y_min, x_max, and y_max. Image, annotation, and sample records carry the resolved split so training helpers can filter deterministically.
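Because every record carries its resolved split, filtering can be a plain list comprehension. The sketch below uses a hand-built dictionary as a stand-in for the payload; the exact record shapes returned by load_dataset() are an assumption here, and filter_by_split is a hypothetical helper, not part of SimpleDet.

```python
from collections import Counter

# Hypothetical payload shaped like the description above. The real
# load_dataset() return value may nest or type these records differently.
payload = {
    "samples": [
        {"image_id": 1, "split": "train"},
        {"image_id": 2, "split": "val"},
        {"image_id": 3, "split": "train"},
    ],
    "annotations": [
        {"image_id": 1, "x_min": 100, "y_min": 120, "x_max": 150, "y_max": 200, "split": "train"},
        {"image_id": 2, "x_min": 10, "y_min": 20, "x_max": 60, "y_max": 90, "split": "val"},
    ],
}


def filter_by_split(records, split):
    """Return the records whose resolved split matches, preserving input order."""
    return [record for record in records if record.get("split") == split]


train_samples = filter_by_split(payload["samples"], "train")
train_annotations = filter_by_split(payload["annotations"], "train")

# Counting samples per split is a quick sanity check before training.
split_counts = Counter(record["split"] for record in payload["samples"])
```

Filtering on the stored split, rather than re-deriving it from file paths, keeps the result identical from run to run.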

COCO example

{
  "images": [{"id": 1, "file_name": "image_0001.png", "width": 1024, "height": 1024}],
  "annotations": [{"id": 1, "image_id": 1, "category_id": 0, "bbox": [100, 120, 50, 80], "area": 4000, "iscrowd": 0}],
  "categories": [{"id": 0, "name": "vessel"}]
}
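
The bbox in the example above uses the COCO xywh convention: top-left x, top-left y, width, height. The following minimal sketch shows the arithmetic behind the xywh to xyxy normalization. The function name xywh_to_xyxy is illustrative and is not taken from the SimpleDet API.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, width, height = bbox
    return [x, y, x + width, y + height]


coco_bbox = [100, 120, 50, 80]
converted = xywh_to_xyxy(coco_bbox)

# The area is unchanged by the conversion: width * height.
area = (converted[2] - converted[0]) * (converted[3] - converted[1])
```

For the example annotation this gives x_min=100, y_min=120, x_max=150, y_max=200, and the recomputed area matches the "area": 4000 field.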

Practical rule

Use the generic loaders for data exploration and the lightweight helpers. Convert to COCO JSON before using the native runtime helpers with detector_spec=.... The native datamodule resolves annotation files from Annotations/train_annotations.json, Annotations/val_annotations.json, and Annotations/test_annotations.json, or from matching annotations/instances_*.json files. It supports shared and split-specific paired image/target transforms. During setup, it raises NativeDataValidationError if the requested training, validation, or test split has no samples.
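
To check a dataset root before handing it to the native datamodule, the two file-naming patterns above can be probed with a short script. This is a sketch, not the datamodule's own resolution logic: the lookup order and the helper names candidate_annotation_paths and find_annotation_file are assumptions, and it raises the standard FileNotFoundError rather than NativeDataValidationError.

```python
import tempfile
from pathlib import Path


def candidate_annotation_paths(root, split):
    """List the annotation paths named in the docs for one split (order is assumed)."""
    root = Path(root)
    return [
        root / "Annotations" / f"{split}_annotations.json",
        root / "annotations" / f"instances_{split}.json",
    ]


def find_annotation_file(root, split):
    """Return the first existing candidate, or raise if none is present."""
    for path in candidate_annotation_paths(root, split):
        if path.is_file():
            return path
    raise FileNotFoundError(f"No annotation file found for split '{split}' under {root}")


# Demonstration against a throwaway directory that follows the recommended layout.
with tempfile.TemporaryDirectory() as tmp:
    annotations_dir = Path(tmp) / "annotations"
    annotations_dir.mkdir()
    (annotations_dir / "instances_val.json").write_text("{}")

    found_name = find_annotation_file(tmp, "val").name

    try:
        find_annotation_file(tmp, "train")
        train_missing = False
    except FileNotFoundError:
        train_missing = True
```

A check like this only confirms that a file exists. An empty split is still reported by the datamodule itself during setup.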