HydraNet Docs

HydraNet Repository Overview

HydraNet is a repository for loading, training, and exporting student and teacher models, including a routerset-backed Mixture-of-Experts path called phidranet.

What This Repository Contains

  • Student model loading
  • Teacher model loading
  • MoE switcher training
  • Full training and export
  • Routerset data preparation
  • ONNX export utilities

Main Model Paths

Student

The default student path loads the HydraNet student model and its preset variants.

Teacher

The teacher path loads PhiSatNet-style downstream models for supervised tasks.

MoE Student

The MoE path shares one encoder, uses a routing switcher, and activates task-specific decoder experts.

Export

Scripts support export workflows such as ONNX conversion and packaged bundle generation.

Key Directories

  • src/hydranet/: package entrypoints, loading helpers, and training orchestration
  • src/hydranet/models/: model definitions for student, teacher, and MoE components
  • scripts/: train, smoke-test, export, and inference entrypoints
  • routerset/: dataset manifests and routerset-specific docs
  • tests/: regression coverage for training and helpers
  • docs/: static documentation pages and diagrams

Common Workflows

  1. Install or activate the Python environment.
  2. Load a student or teacher model for inference.
  3. Prepare the routerset dataset and runtime caches for MoE work.
  4. Run preflight or smoke tests before longer training runs.
  5. Train the MoE switcher or launch the full train/export flow.

The README remains the source of truth for the concrete commands and environment variables.
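
Steps 4 and 5 imply a gate: a cheap check runs first, and the long run starts only if that check passes. The following is a minimal sketch of that ordering, assuming each stage is wrapped in a Python callable that returns True on success. The function name run_stages and the stage names are hypothetical and do not come from the repository; the README's commands remain authoritative.

```python
from typing import Callable, List, Tuple

# A stage is a (name, action) pair; the action returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, action in stages:
        if not action():
            # A failed cheap check blocks the more expensive stages after it.
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder callables; in practice each would wrap a real command.
    stages = [
        ("prepare_routerset", lambda: True),
        ("smoke_test", lambda: True),
        ("train_switcher", lambda: True),
    ]
    print(run_stages(stages))
```

Returning the list of completed stage names, rather than a single boolean, makes it easy to see where a run stopped.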

Important Entry Points

  • scripts/train_moe_switcher.py: train-only switcher entrypoint
  • scripts/full_train_moe.py: full training plus export orchestration
  • scripts/smoke_test_moe.py: faster end-to-end functional smoke test, suited to checking the pipeline before longer training runs
  • scripts/export_onnx.py: ONNX export helper
  • src/hydranet/loading.py: public loading helpers
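
Before starting a long run, it can help to confirm that a checkout actually contains these files. The sketch below uses only the standard library and the paths listed above. The helper name missing_entry_points is hypothetical and is not part of the repository, and the sketch assumes it is run from the repository root.

```python
from pathlib import Path
from typing import Iterable, List

# Paths copied from the entry-point list, relative to the repository root.
ENTRY_POINTS = (
    "scripts/train_moe_switcher.py",
    "scripts/full_train_moe.py",
    "scripts/smoke_test_moe.py",
    "scripts/export_onnx.py",
    "src/hydranet/loading.py",
)


def missing_entry_points(
    repo_root: Path, paths: Iterable[str] = ENTRY_POINTS
) -> List[str]:
    """Return the listed paths that are not regular files under repo_root."""
    return [p for p in paths if not (repo_root / p).is_file()]


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    missing = missing_entry_points(Path.cwd())
    if missing:
        print("Missing entry points:", ", ".join(missing))
    else:
        print("All listed entry points are present.")
```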

Mixture-of-Experts Overview

The MoE path is the most specialized workflow in this repository. It combines one shared encoder, one learned routing switcher, and multiple task-specific decoder experts aligned to routerset.
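
The data flow described here can be illustrated with a toy sketch in plain Python. It is not the repository's implementation. The function names, the dot-product scoring, the fixed router weights, and the hard top-1 selection are all assumptions made for illustration, and the real switcher is learned rather than hand-set.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def shared_encoder(inputs: Sequence[float]) -> Vector:
    """Stand-in encoder: every task passes through this same function."""
    return [2.0 * x for x in inputs]


def route(features: Vector, router_weights: Dict[str, Vector]) -> str:
    """Stand-in switcher: score each expert with a dot product, pick the best."""
    scores = {
        name: sum(w * f for w, f in zip(weights, features))
        for name, weights in router_weights.items()
    }
    return max(scores, key=lambda name: scores[name])


def moe_forward(
    inputs: Sequence[float],
    router_weights: Dict[str, Vector],
    experts: Dict[str, Callable[[Vector], float]],
) -> Tuple[str, float]:
    """Encode once, choose one expert, and run only that expert."""
    features = shared_encoder(inputs)
    chosen = route(features, router_weights)
    # Only the selected decoder expert runs; the others stay idle.
    return chosen, experts[chosen](features)


if __name__ == "__main__":
    # Hypothetical task names and hand-set weights, for illustration only.
    router_weights = {"task_a": [1.0, 0.0], "task_b": [0.0, 1.0]}
    experts = {"task_a": sum, "task_b": max}
    print(moe_forward([3.0, 1.0], router_weights, experts))
```

The point of the sketch is the cost structure: the encoder runs once per input regardless of task, while decoder cost is paid only for the expert the switcher selects.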

If you want the model topology, routing behavior, and training explanation in one place, open the dedicated model page:

HydraNet MoE Model Guide