Recommended training path
Use the project pipeline path when you want consistent checkpoints, logs, evaluator outputs, validation, and one place to rerun the experiment later.
from simpledet import create_project_pipeline
from simpledet.suite import build_detector, build_neck
detector_spec = build_detector(
"vfnet",
encoder="convnext_tiny.in12k_ft_in1k",
neck=build_neck(name="FPN", out_channels=192, num_outs=5),
num_classes=3,
in_channels=3,
)
pipeline = create_project_pipeline(
dataset_root="/data/project",
detector_spec=detector_spec,
result_folder="results/project",
in_channels=3,
tif_channels_to_load=[1, 2, 3],
categories=("car", "building", "ship"),
resize=768,
batch_size=2,
learning_rate=0.001,
max_epochs=30,
)
pipeline.validate_paths(strict=True)
pipeline.run(stages=("build", "train"))
The detector spec controls what model is trained; the pipeline arguments control how and where it is trained. In particular, categories supplies the semantic class list for the run, while num_classes on the detector spec sets the size of the prediction heads, so the two must stay in agreement (here, three classes on both sides).
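A minimal sanity check for that agreement can be written in plain Python (no simpledet API involved), using the values from the example above:

```python
# Values from the example run above.
categories = ("car", "building", "ship")
num_classes = 3  # the value passed to build_detector

# The prediction heads need one output slot per semantic class,
# so the two settings must match exactly.
assert len(categories) == num_classes, (
    f"num_classes={num_classes} does not match "
    f"{len(categories)} categories"
)
```

Running a check like this before pipeline construction fails fast instead of surfacing a shape mismatch mid-training.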
Direct non-config execution
If you want the same operational path without writing a project file or handling a pipeline object explicitly, use the direct execution helper.
from simpledet import run_training
result = run_training(
dataset_root="/data/project",
detector_spec=detector_spec,
categories=("car", "building", "ship"),
in_channels=3,
tif_channels_to_load=[1, 2, 3],
resize=768,
batch_size=2,
learning_rate=0.001,
max_epochs=30,
)
This still builds the project-layout pipeline under the hood. The difference is only the interface:
you pass the run arguments directly and receive the stage outputs immediately.
Config-driven operations
When you want a repeatable operational package workflow, move the same settings into a project config file and run them through the CLI.
python -m simpledet --project-validate project.toml
python -m simpledet --project-run project.toml --stages build train
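As a sketch, a project.toml mirroring the run above might look like the following; the key names are assumptions carried over from the Python pipeline arguments, not a confirmed schema:

```toml
# Hypothetical project.toml; keys mirror the create_project_pipeline
# arguments shown above and are not a documented schema.
dataset_root = "/data/project"
result_folder = "results/project"
categories = ["car", "building", "ship"]
in_channels = 3
tif_channels_to_load = [1, 2, 3]
resize = 768
batch_size = 2
learning_rate = 0.001
max_epochs = 30
```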
What the pipeline injects
- dataset roots and annotation paths
- resize and dataloader settings
- optimizer and schedule choices
- workdir, logging, and evaluator wiring
This is why the pipeline path is better for benchmarkable work: the resulting workdir keeps the model definition,
runtime configuration, checkpointing, and evaluation outputs aligned.
What gets written
- native-manifest.json for the run summary
- checkpoints/ for the retained checkpoint files
- native Lightning trainer outputs in the workdir
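Downstream tooling can read the run summary back out of the workdir. In this sketch only the file and directory names come from the layout above; the manifest fields themselves are illustrative assumptions:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

# Synthetic workdir standing in for results/project; the manifest
# contents below are illustrative, not a documented format.
with TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    (workdir / "checkpoints").mkdir()
    (workdir / "native-manifest.json").write_text(
        json.dumps({"run": "exp001", "max_epochs": 30})
    )

    # Later, a report script can recover the run summary from disk.
    manifest = json.loads((workdir / "native-manifest.json").read_text())
    print(manifest["run"])  # → exp001
```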
Compatibility path
Use the lightweight path only when you want a fast local check without the full native experiment stack.
from simpledet.detectors.train import train
result = train(config={
"dataset": "data/instances_train.json",
"format": "coco",
"output_dir": "runs/exp001",
"model_name": "faster_rcnn_resnet50_fpn",
"epochs": 2,
})