SimpleDet Docs

Training

Training is where the detector definition becomes an experiment record. The key idea: build() prepares the runtime, while train() executes the run and writes the workdir artifacts.

Use this page when you want to understand what the pipeline prepares, what it writes, and why the workdir matters.

Recommended training path

Use the project pipeline path when you want consistent checkpoints, logs, evaluator outputs, validation, and one place to rerun the experiment later.

from simpledet import create_project_pipeline
from simpledet.suite import build_detector, build_neck

detector_spec = build_detector(
    "vfnet",
    encoder="convnext_tiny.in12k_ft_in1k",
    neck=build_neck(name="FPN", out_channels=192, num_outs=5),
    num_classes=3,
    in_channels=3,
)

pipeline = create_project_pipeline(
    dataset_root="/data/project",
    detector_spec=detector_spec,
    result_folder="results/project",
    in_channels=3,
    tif_channels_to_load=[1, 2, 3],
    categories=("car", "building", "ship"),
    resize=768,
    batch_size=2,
    learning_rate=0.001,
    max_epochs=30,
)

pipeline.validate_paths(strict=True)
pipeline.run(stages=("build", "train"))

The detector spec controls what model is trained; the pipeline arguments control how and where it is trained. In particular, categories must match the semantic class list expected by the run, while num_classes on the detector spec sizes the prediction heads, so the two must agree (both are 3 in the example above).
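Because a categories/num_classes mismatch only surfaces once training starts, it is worth a cheap guard before launching. A minimal sketch in plain Python, mirroring the values from the example above; this is not a SimpleDet API call:

```python
# Sketch: guard against a category/num_classes mismatch before launching.
# The values mirror the example above.
categories = ("car", "building", "ship")
num_classes = 3  # must match the detector spec's prediction-head size

if len(categories) != num_classes:
    raise ValueError(
        f"num_classes={num_classes} does not match "
        f"{len(categories)} categories: {categories}"
    )
```

Running this check right before create_project_pipeline turns a mid-run shape error into an immediate, readable failure.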

Direct non-config execution

If you want the same operational path without writing a project file or handling a pipeline object explicitly, use the direct execution helper.

from simpledet import run_training

result = run_training(
    dataset_root="/data/project",
    detector_spec=detector_spec,
    categories=("car", "building", "ship"),
    in_channels=3,
    tif_channels_to_load=[1, 2, 3],
    resize=768,
    batch_size=2,
    learning_rate=0.001,
    max_epochs=30,
)

This still builds the project-layout pipeline under the hood. The difference is only the interface: you pass the run arguments directly and receive the stage outputs immediately.
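Conceptually, the helper is shorthand for the pipeline calls in the previous section. The sketch below shows that relationship with a stub standing in for the pipeline object; it is an illustration of the shape of the flow, not the SimpleDet implementation:

```python
# Hypothetical sketch of how run_training relates to the pipeline path.
# _StubPipeline stands in for create_project_pipeline's return value;
# the real implementation may differ.
class _StubPipeline:
    def __init__(self, **run_args):
        self.run_args = run_args

    def validate_paths(self, strict=True):
        return True

    def run(self, stages=("build", "train")):
        # Pretend each stage succeeded and report per-stage outputs.
        return {stage: "ok" for stage in stages}

def run_training_sketch(**run_args):
    # Same operational path: build the project-layout pipeline,
    # validate, then run both stages and hand back the outputs.
    pipeline = _StubPipeline(**run_args)
    pipeline.validate_paths(strict=True)
    return pipeline.run(stages=("build", "train"))

result = run_training_sketch(dataset_root="/data/project", batch_size=2)
```

The caller never touches the pipeline object, but the workdir artifacts are the same as in the explicit path.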

Config-driven operations

When you want a repeatable, packageable workflow, move the same settings into a project config file and run it through the CLI.

python -m simpledet --project-validate project.toml
python -m simpledet --project-run project.toml --stages build train
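A project.toml for this run would carry the same settings as the pipeline arguments above. The key names and flat layout below are an assumption for illustration, not a documented schema:

```toml
# Illustrative project.toml mirroring the pipeline arguments above.
# Key names and layout are assumptions, not a schema reference.
dataset_root = "/data/project"
result_folder = "results/project"
categories = ["car", "building", "ship"]
in_channels = 3
tif_channels_to_load = [1, 2, 3]
resize = 768
batch_size = 2
learning_rate = 0.001
max_epochs = 30
```

Validating the file first (--project-validate) catches path and type errors before any stage runs.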

What the pipeline injects

  • dataset roots and annotation paths
  • resize and dataloader settings
  • optimizer and schedule choices
  • workdir, logging, and evaluator wiring

This is why the pipeline path is better for benchmarkable work: the resulting workdir keeps the model definition, runtime configuration, checkpointing, and evaluation outputs aligned.

What gets written

  • native-manifest.json for the run summary
  • checkpoints/ for the retained checkpoint files
  • native Lightning trainer outputs in the workdir
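After a run, the workdir can be inspected programmatically. The sketch below is self-contained: it fabricates a throwaway workdir so it can execute anywhere, and only the file and directory names (native-manifest.json, checkpoints/) come from the list above; the manifest contents and checkpoint filename are placeholders:

```python
import json
import tempfile
from pathlib import Path

# Fabricate a throwaway workdir so this sketch runs standalone; in
# practice this is the result_folder the pipeline wrote.
workdir = Path(tempfile.mkdtemp())
(workdir / "checkpoints").mkdir()
(workdir / "checkpoints" / "epoch=29.ckpt").write_text("")  # placeholder
(workdir / "native-manifest.json").write_text(json.dumps({"run": "demo"}))

# The actual inspection: load the run summary and list retained checkpoints.
manifest = json.loads((workdir / "native-manifest.json").read_text())
checkpoints = sorted((workdir / "checkpoints").glob("*.ckpt"))
```

Reading the manifest and checkpoint list from the same workdir is what keeps a later evaluation or rerun tied to the exact run that produced them.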

Compatibility path

Use the lightweight path only when you want a fast local check without the full native experiment stack.

from simpledet.detectors.train import train

result = train(config={
    "dataset": "data/instances_train.json",
    "format": "coco",
    "output_dir": "runs/exp001",
    "model_name": "faster_rcnn_resnet50_fpn",
    "epochs": 2,
})