Core Module
The core module contains the fundamental components of PyNAS for neural architecture search.
Population
- class pynas.core.population.Population(n_individuals, max_layers, dm, max_parameters=100_000, save_directory=None)[source]
Bases:
object
- __init__(n_individuals, max_layers, dm, max_parameters=100_000, save_directory=None)[source]
Initialize a new population for the evolutionary neural architecture search.
- Parameters:
n_individuals (int) – Number of individuals in the population
max_layers (int) – Maximum number of layers in an individual’s architecture
dm (object) – Data module for model creation and evaluation
max_parameters (int, optional) – Maximum number of parameters allowed in a model. Defaults to 100,000.
save_directory (str, optional) – Directory to save models and checkpoints. Defaults to “./models_traced”.
- Raises:
ValueError – If input parameters are invalid (negative values, or a None data module)
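A minimal construction sketch; MyDataModule is a placeholder for whatever data module your project supplies as dm:
import-time and argument names below follow the signature above.
from pynas.core.population import Population

# `MyDataModule` is a hypothetical data module; Population uses `dm`
# for model creation and evaluation.
dm = MyDataModule(batch_size=32)

pop = Population(
    n_individuals=20,        # candidate architectures per generation
    max_layers=10,           # upper bound on encoder depth
    dm=dm,
    max_parameters=100_000,  # reject models above this parameter budget
    save_directory="./models_traced",
)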
- static setup_logger(log_file='./logs/population.log', log_level=logging.DEBUG)[source]
Set up a logger for the population module.
If the log file already exists, create a new one by appending a timestamp to the filename.
- Parameters:
log_file (str, optional) – Path to the log file. Defaults to './logs/population.log'.
log_level (int, optional) – Logging level to use. Defaults to logging.DEBUG.
- Returns:
Configured logger instance.
- Return type:
logging.Logger
- create_random_individual(max_attempts=5)[source]
Create a random individual with a random number of layers.
This function attempts to create a valid random individual with proper error handling and retry logic to ensure robustness.
- Parameters:
max_attempts (int) – Maximum number of attempts to create a valid individual. Defaults to 5.
- Returns:
A valid random individual.
- Return type:
Individual
- Raises:
RuntimeError – If unable to create a valid individual after max_attempts.
- check_individual(individual)[source]
Validate if an individual can be built into a functional model with acceptable parameters.
This method:
1. Validates the input individual object
2. Attempts to build a model from the individual’s genetic representation
3. Evaluates the model’s parameter count
4. Ensures the model meets size constraints
5. Updates the individual with its model_size
- Parameters:
individual (Individual) – The individual to check
- Returns:
True if the individual is valid, False otherwise
- Return type:
bool
- create_population(max_attempts=200, timeout_seconds=300)[source]
Create a population of unique, valid individuals.
This function generates random individuals and checks if they’re valid using check_individual. It includes comprehensive error handling, duplicate removal, and recovery mechanisms.
- Parameters:
max_attempts (int) – Maximum number of attempts to generate valid individuals. Defaults to 200.
timeout_seconds (int) – Time limit, in seconds, for population creation. Defaults to 300.
- Returns:
A list of unique, valid individuals.
- Return type:
list
- Raises:
RuntimeError – If unable to generate a complete population after max_attempts
- elite_models(k_best=1)[source]
Retrieve the top k_best elite models from the current population based on fitness.
The population is sorted in descending order based on the fitness attribute of each individual. This function then returns deep copies of the top k_best individuals to ensure that the original models remain immutable during further operations.
- evolve(mating_pool_cutoff=0.5, mutation_probability=0.85, k_best=1, n_random=3)[source]
Generates a new population, ensuring that the total number of individuals equals the population’s n_individuals.
- Parameters:
mating_pool_cutoff (float) – Fraction determining the size of the mating pool (top percent of individuals by fitness). Defaults to 0.5.
mutation_probability (float) – The probability to use during mutation. Defaults to 0.85.
k_best (int) – The number of best individuals from the current population to retain. Defaults to 1.
n_random (int) – The number of newly generated random individuals added to the new generation. Defaults to 3.
- Returns:
A list representing the new generation of individuals.
- Return type:
list
Note
Assumes that helper functions single_point_crossover(), mutation(), and create_random_individual() exist.
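A sketch of a full search loop built from the methods above; the method names come from this page, while the generation count and hyperparameters are illustrative:
# Assumes `pop` is the Population constructed earlier.
pop.create_population(max_attempts=200, timeout_seconds=300)

for generation in range(10):  # the generation count is arbitrary here
    # Train every not-yet-trained individual in the current generation.
    pop.train_generation(task="classification", lr=1e-3, epochs=4, batch_size=32)
    # Keep the best individual, refill the rest via crossover/mutation,
    # plus a few fresh random architectures.
    pop.evolve(mating_pool_cutoff=0.5, mutation_probability=0.85, k_best=1, n_random=3)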
- remove_duplicates(population)[source]
Remove duplicates from the given population by replacing duplicates with newly generated unique individuals.
- build_model(parsed_layers, task='segmentation')[source]
Build a model based on the provided parsed layers.
This function creates an encoder using the parsed layers and constructs a model by combining the encoder with a head layer via the ModelConstructor. The constructed model is built to process inputs defined by the data module (dm).
- Parameters:
parsed_layers – The parsed architecture configuration used by the encoder to build the network.
task (str, optional) – The task the model is built for. Defaults to 'segmentation'.
- Returns:
A PyTorch model constructed with the encoder and head layer.
- Return type:
torch.nn.Module
- evaluate_parameters(model)[source]
Calculate the total number of parameters of the given model.
- Parameters:
model (torch.nn.Module) – The PyTorch model.
- Returns:
The total number of parameters.
- Return type:
int
- save_dataframe()[source]
Save the DataFrame containing the population statistics to a pickle file.
The DataFrame is saved at a path that includes the current generation number. In case of an error during saving, the exception details are printed.
- Returns:
None
- train_individual(idx, task, epochs=20, lr=1e-3, batch_size=None)[source]
Train the individual using the data module and the specified number of epochs and learning rate.
- Parameters:
idx (int) – Index of the individual in the population.
task (str) – The task to train for (e.g., 'classification' or 'segmentation').
epochs (int) – The number of epochs to train the individual. Defaults to 20.
lr (float) – The learning rate to use during training. Defaults to 1e-3.
batch_size (int, optional) – Batch size to use during training. Defaults to None.
- Returns:
None
- train_generation(task='classification', lr=0.001, epochs=4, batch_size=32)[source]
Train all individuals in the current generation that have not been trained yet.
- save_model(LM, save_torchscript=True, ts_save_path=None, save_standard=True, std_save_path=None, save_myriad=True, openvino_save_path=None)[source]
Save the given Lightning model in one or more formats: TorchScript, standard PyTorch, and/or OpenVINO (Myriad).
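A hedged usage sketch; `best` stands in for a trained Lightning module (e.g. one obtained from an elite individual) and the paths are illustrative:
# All keyword arguments follow the signature above.
pop.save_model(
    best,
    save_torchscript=True, ts_save_path="./models_traced/best_ts.pt",
    save_standard=True, std_save_path="./models_traced/best.pt",
    save_myriad=False,  # skip the OpenVINO export if the toolchain is absent
)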
Individual
- class pynas.core.individual.Individual(max_layers, min_layers=3)[source]
Bases:
object
The Individual class represents an individual entity in the genetic algorithm or evolutionary computation context. It encapsulates the architecture, chromosome, and associated properties such as fitness, IoU (Intersection over Union), FPS (Frames Per Second), and model size. The class provides methods for converting between architecture and chromosome representations, resetting properties, and creating deep copies of the individual.
- architecture
The architecture code representing the individual’s structure.
- Type:
str
- __init__(max_layers, min_layers=3)[source]
Initializes an individual with a random architecture and its corresponding chromosome.
- architecture2chromosome(input_architecture)[source]
Converts an architecture code into a chromosome list.
- chromosome2architecture(input_chromosome)[source]
Converts a chromosome list back into an architecture code.
- architecture2chromosome(input_architecture)[source]
Converts an architecture code into a chromosome list by splitting the architecture code using ‘E’. This method also handles the case where the architecture ends with ‘EE’, avoiding an empty string at the end of the list.
- chromosome2architecture(input_chromosome)[source]
Converts the chromosome list back into an architecture code by joining the list items with ‘E’ and ensuring the architecture ends with ‘EE’.
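The conversion described above is plain string manipulation; this standalone sketch mirrors the documented behavior (split on ‘E’, terminate with ‘EE’) without depending on the class internals:
def architecture_to_chromosome(code):
    # Splitting on 'E' leaves empty strings for the trailing 'EE',
    # so drop the empties after the split.
    return [gene for gene in code.split("E") if gene]


def chromosome_to_architecture(genes):
    # Re-join the genes with 'E' and terminate the code with 'EE'.
    return "E".join(genes) + "EE"


code = "L1f2mRPE2H3EE"  # illustrative code, terminated with 'EE'
genes = architecture_to_chromosome(code)
assert chromosome_to_architecture(genes) == code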
Architecture Builder
- pynas.core.architecture_builder.generate_random_architecture_code(min_layers=3, max_layers=5)[source]
Generates a random architecture code string consisting of layers and pooling layers. The function creates a sequence of encoder layers and pooling layers, appending them to form a string representation of an architecture. Each layer and pooling layer is separated by an “E”, and the architecture ends with an additional “E”.
- Parameters:
min_layers (int) – The minimum number of layers to include in the architecture. Defaults to 3.
max_layers (int) – The maximum number of layers to include in the architecture. Defaults to 5.
- pynas.core.architecture_builder.generate_layer_code()[source]
Generates a string representation of a neural network layer configuration. This function randomly selects a layer type from a predefined vocabulary and generates a corresponding layer code based on its parameters, configured using values from a config.ini file. The generated code includes details such as activation type, kernel size, padding, stride, dropout rate, and other layer-specific attributes.
- Returns:
A string representing the configuration of the generated layer.
- Return type:
str
- pynas.core.architecture_builder.generate_pooling_layer_code()[source]
Generates a string representing a pooling layer configuration.
The function randomly selects a pooling type from the pooling_layer_vocabulary and combines it with a predefined pooling factor to create a pooling layer code.
- Returns:
A string representing the pooling layer configuration, in the format “P<pooling_type><pooling_factor>”.
- Return type:
str
- pynas.core.architecture_builder.generate_upsampling_layer_code(scale_factor=2)[source]
Generates a string representing the configuration of an upsampling layer.
This function reads the available upsampling modes from a configuration file (‘config.ini’) under the ‘Upsample’ section and randomly selects one of the modes. It then combines the scale factor and the selected mode into a formatted string.
- pynas.core.architecture_builder.generate_skip_connection_code(layer_index)[source]
Generates a string representing a skip connection identifier for a given layer index.
- pynas.core.architecture_builder.parse_architecture_code(architecture_code)[source]
Parses a given architecture code string into a list of layer configurations. The function interprets the architecture code by splitting it into segments, identifying the type of each segment (e.g., convolution, pooling, upsampling), and extracting the associated parameters based on predefined vocabularies and rules.
- Parameters:
architecture_code (str) – A string representing the architecture code. Each segment of the code corresponds to a layer or operation, with specific characters denoting the type and parameters of the layer.
- Returns:
A list of dictionaries, where each dictionary represents a parsed layer or operation. Each dictionary contains ‘layer_type’ (str), the type of the layer (e.g., “Convolution”, “Pooling”), and additional keys for parameters specific to the layer type, such as:
’scale_factor’ (int): The scale factor for upsampling layers.
’mode’ (str): The mode for certain operations.
’dropout_rate’ (float): The dropout rate for layers with dropout.
’activation’ (str): The activation function for layers with activation.
’out_channels_coefficient’ (int): Coefficient for output channels.
Other parameters as defined in the layer’s parameter vocabulary.
- Return type:
list
Notes
- The function relies on several predefined vocabularies and mappings:
convolution_layer_vocabulary: Maps codes to convolution layer types.
pooling_layer_vocabulary: Maps codes to pooling layer types.
head_vocabulary: Maps codes to head layer types.
upsampling_layer_vocabulary: Maps codes to upsampling layer types.
layer_parameters: Defines expected parameters for each layer type.
parameter_vocabulary: Maps parameter codes to parameter names.
activation_functions_vocabulary: Maps activation codes to function names.
Segments with unknown or unsupported codes are assigned “Unknown” as the layer type.
Skip connections are explicitly identified with the type “SkipConnection”.
Example
architecture_code = "L1f2mRPE2H3"
parsed_layers = parse_architecture_code(architecture_code)
# parsed_layers is a list of dictionaries representing the parsed layers.
- pynas.core.architecture_builder.generate_code_from_parsed_architecture(parsed_layers)[source]
Generates a compact string representation of a neural network architecture based on a list of parsed layer configurations. The function converts each layer’s type and parameters into a coded segment using predefined vocabularies and appends them together to form the final architecture code. Each layer segment ends with “E”, and the entire architecture code ends with “EE”.
- Parameters:
parsed_layers (list) – A list of dictionaries where each dictionary represents a layer configuration. Each dictionary must contain a ‘layer_type’ key and may include additional parameters specific to the layer type.
- Returns:
A string representing the encoded architecture.
- Return type:
str
Notes
The function uses reverse mappings of predefined vocabularies to encode layer types and parameters.
Special handling is applied for certain parameters like ‘activation’, ‘dropout_rate’, and ‘out_channels_coefficient’.
Layer types such as “Dropout”, “Upsample”, and “SkipConnection” have specific encoding rules.
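Given the two functions above, encoding should invert parsing; a hedged round-trip sketch (the example code string is the one from the parse example above):
from pynas.core.architecture_builder import (
    parse_architecture_code,
    generate_code_from_parsed_architecture,
)

code = "L1f2mRPE2H3"
parsed = parse_architecture_code(code)
# Re-encoding the parsed layers should reproduce an equivalent code
# string, terminated with "EE" as described above.
recoded = generate_code_from_parsed_architecture(parsed)
print(recoded)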
Generic U-Net
- class pynas.core.generic_unet.UNetDecoder(encoder_shapes, num_classes=2, output_shape=None)[source]
Bases:
Module
A PyTorch implementation of a U-Net decoder module. This class implements the decoder part of the U-Net architecture, which reconstructs the output from the bottleneck features by progressively upsampling and combining them with skip connections from the encoder.
- num_stages
The number of decoding stages, equal to the number of skip connections.
- Type:
int
- up_convs
A list of transposed convolution layers for upsampling.
- Type:
nn.ModuleList
- conv_blocks
A list of convolutional blocks for processing concatenated upsampled and skip connection features.
- Type:
nn.ModuleList
- out_conv
The final convolutional layer that produces the output.
- Type:
nn.Conv2d
- Parameters:
encoder_shapes (list of torch.Size) – A list of shapes of the encoder features in the order: [skip0, skip1, …, skip_(N-1), bottleneck]. Each shape is expected to be a torch.Size object.
num_classes (int, optional) – The number of output classes. Default is 2.
- forward(encoder_features)[source]
Performs the forward pass of the decoder.
- Parameters:
encoder_features (list of torch.Tensor) – A list of encoder feature maps in the order: [skip0, skip1, …, skip_(N-1), bottleneck]. The number of feature maps must match the number of stages + 1.
- Returns:
The output tensor after decoding.
- Return type:
torch.Tensor
- __init__(encoder_shapes, num_classes=2, output_shape=None)[source]
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(encoder_features, verbose=False)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
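A minimal construction sketch, assuming four encoder feature maps in the documented [skip0, …, bottleneck] ordering; the channel counts and spatial sizes are illustrative:
import torch
from pynas.core.generic_unet import UNetDecoder

# Dummy encoder features: three skips plus a bottleneck, each (N, C, H, W).
features = [
    torch.randn(1, 16, 128, 128),
    torch.randn(1, 32, 64, 64),
    torch.randn(1, 64, 32, 32),
    torch.randn(1, 128, 16, 16),  # bottleneck
]
decoder = UNetDecoder([f.shape for f in features], num_classes=2)
out = decoder(features)  # output tensor with num_classes channels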
- class pynas.core.generic_unet.GenericUNetNetwork(parsed_layers, input_channels=3, input_height=256, input_width=256, num_classes=2, MaxParams=200_000_000, encoder_only=False)[source]
Bases:
Module
GenericUNetNetwork is a PyTorch-based implementation of a generic U-Net architecture. This class allows for flexible construction of U-Net models by parsing layer configurations and dynamically building the encoder and decoder components.
- Parameters:
parsed_layers (list) – A list of layer configurations for building the encoder.
input_channels (int) – Number of input channels for the input tensor. Default is 3.
input_height (int) – Height of the input tensor. Default is 256.
input_width (int) – Width of the input tensor. Default is 256.
num_classes (int) – Number of output classes for the segmentation task. Default is 2.
max_params (int) – Maximum allowed number of parameters for the model. Default is 200,000,000.
- encoder
A list of layers forming the encoder part of the U-Net.
- Type:
nn.ModuleList
- decoder
A list of layers forming the decoder part of the U-Net.
- Type:
nn.ModuleList
- encoder_shapes
A list of shapes of the encoder outputs for use in the decoder.
- Type:
list
- total_params
Total number of parameters in the model.
- Type:
int
- config
Configuration parser for reading additional settings from ‘config.ini’.
- Type:
ConfigParser
- __init__(self, parsed_layers, input_channels=3, input_height=256, input_width=256, num_classes=2, MaxParams=200_000_000)[source]
Initializes the GenericUNetNetwork with the given parameters and builds the encoder and decoder.
- encoder_forward(self, x, features_only=True)[source]
Performs a forward pass through the encoder and optionally returns only the encoder features.
- _encoder_shapes_tracing(self)[source]
Creates a dummy forward pass through the encoder to determine the shapes of the encoder outputs.
- _build_decoder(self)[source]
Builds the decoder component of the U-Net model using the encoder shapes and number of output classes.
- forward(self, x)[source]
Defines the forward pass of the model, passing the input through the encoder and decoder.
- get_activation_fn(activation)[source]
Retrieves the specified activation function from the activations module.
- __init__(parsed_layers, input_channels=3, input_height=256, input_width=256, num_classes=2, MaxParams=200_000_000, encoder_only=False)[source]
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]
Defines the forward pass of the model.
- Parameters:
x (torch.Tensor) – Input tensor to the model.
- Returns:
Output tensor after passing through the encoder and decoder.
- Return type:
torch.Tensor
- static get_activation_fn(activation)[source]
Retrieves the specified activation function from the activations module.
- Parameters:
activation (str) – The name of the activation function to retrieve.
- Returns:
- The activation function corresponding to the given name.
If the specified activation function is not found, defaults to activations.ReLU.
- Return type:
Callable
- pynas.core.generic_unet.list_convolution_layers()[source]
Retrieves a list of all classes defined in the convolutions module.
This function uses the inspect module to dynamically inspect the convolutions module and collect all objects that are classes.
- Returns:
A list of class objects defined in the convolutions module.
- Return type:
list
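The description maps onto the standard inspect idiom; a sketch of the pattern (the import path of the convolutions module is an assumption, adjust to your installation):
import inspect

# Hypothetical import path for the convolutions module inspected by
# list_convolution_layers().
from pynas.blocks import convolutions


def list_classes(module):
    # Collect every class object visible in the module, as described above.
    return [obj for _, obj in inspect.getmembers(module, inspect.isclass)]


print([cls.__name__ for cls in list_classes(convolutions)])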
- pynas.core.generic_unet.build_layer(layer, config, current_channels, current_height, current_width, idx, get_activation_fn)[source]
Builds a neural network layer based on the provided configuration.
- Parameters:
layer (dict) – A dictionary containing the layer configuration. Must include the key ‘layer_type’, which specifies the type of layer to build.
config (dict) – A dictionary containing default configurations for various layer types.
current_channels (int) – The number of input channels to the layer.
current_height (int) – The height of the input tensor to the layer.
current_width (int) – The width of the input tensor to the layer.
idx (int) – The index of the layer in the model (used for debugging or logging purposes).
get_activation_fn (callable) – A function that takes an activation name (str) and returns the corresponding activation function.
- Returns:
A tuple containing:
layer_inst (nn.Module): The instantiated layer object.
current_channels (int): The number of output channels after the layer.
current_height (int): The height of the output tensor after the layer.
current_width (int): The width of the output tensor after the layer.
- Return type:
tuple
- Raises:
ValueError – If the ‘layer_type’ in the layer dictionary is unknown or unsupported.
- Supported Layer Types:
‘ConvAct’, ‘ConvBnAct’, ‘ConvSE’: Convolutional layers with optional batch normalization and activation.
‘MBConv’, ‘MBConvNoRes’: MobileNetV2-style inverted residual blocks.
‘CSPConvBlock’, ‘CSPMBConvBlock’: Cross Stage Partial blocks for convolution or MBConv.
‘DenseNetBlock’: DenseNet-style block with concatenated outputs.
‘ResNetBlock’: ResNet-style residual block.
‘AvgPool’, ‘MaxPool’: Pooling layers (average or max pooling).
‘Dropout’: Dropout layer for regularization.
- pynas.core.generic_unet.parse_conv_params(layer, config, key, current_channels, current_height, current_width)[source]
Parse convolutional layer parameters and calculate output dimensions.
This function extracts parameters for a convolutional layer from the provided configuration, calculates the output dimensions, and returns all necessary values for setting up a convolutional layer.
- Parameters:
layer (dict) – Dictionary containing layer-specific configuration parameters.
config (dict) – Dictionary containing default configuration parameters.
key (str) – Key to access specific configurations within the config dictionary.
current_channels (int) – Number of input channels for the current layer.
current_height (int) – Height of the input feature map.
current_width (int) – Width of the input feature map.
- Returns:
A tuple containing:
kernel_size (int): Size of the convolutional kernel.
stride (int): Stride of the convolution.
padding (int): Padding added to the input feature map.
out_channels (int): Number of output channels.
new_height (int): Height of the output feature map after convolution.
new_width (int): Width of the output feature map after convolution.
- Return type:
tuple
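The output dimensions returned above follow standard convolution arithmetic; a quick sanity-check sketch of that formula:
def conv_output_size(size, kernel_size, stride, padding):
    # Standard formula: floor((size + 2*padding - kernel_size) / stride) + 1
    return (size + 2 * padding - kernel_size) // stride + 1


# A 3x3 convolution with stride 1 and padding 1 preserves spatial size:
assert conv_output_size(256, kernel_size=3, stride=1, padding=1) == 256
# Stride 2 halves it:
assert conv_output_size(256, kernel_size=3, stride=2, padding=1) == 128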
Lightning Module
- class pynas.core.generic_lightning_module.GenericLightningNetwork(model, num_classes, learning_rate=1e-3)[source]
Bases:
LightningModule
- forward(x)[source]
Same as torch.nn.Module.forward().
- Parameters:
*args – Whatever you decide to pass into the forward method.
**kwargs – Keyword arguments are also possible.
- Returns:
Your model’s output
- training_step(batch, batch_idx)[source]
Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (Only if multiple dataloaders are used.)
- Returns:
Tensor – The loss tensor.
dict – A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.
None – In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.
In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.
Example:
def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:
def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()
Note
When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
- validation_step(batch, batch_idx)[source]
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, like accuracy.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (Only if multiple dataloaders are used.)
- Returns:
Tensor – The loss tensor.
dict – A dictionary. Can include any keys, but must include the key 'loss'.
None – Skip to the next batch.
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...


# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
If you pass in multiple val dataloaders, validation_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to validate you don’t need to implement this method.
Note
When validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
- test_step(batch, batch_idx)[source]
Operates on a single batch of data from the test set. In this step you’d normally generate examples or calculate anything of interest such as accuracy.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (Only if multiple dataloaders are used.)
- Returns:
Tensor – The loss tensor.
dict – A dictionary. Can include any keys, but must include the key 'loss'.
None – Skip to the next batch.
# if you have one test dataloader:
def test_step(self, batch, batch_idx): ...


# if you have multiple test dataloaders:
def test_step(self, batch, batch_idx, dataloader_idx=0): ...
Examples:
# CASE 1: A single test dataset
def test_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    test_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'test_loss': loss, 'test_acc': test_acc})
If you pass in multiple test dataloaders, test_step() will have an additional argument. We recommend setting the default value of 0 so that you can quickly switch between single and multiple dataloaders.
# CASE 2: multiple test dataloaders
def test_step(self, batch, batch_idx, dataloader_idx=0):
    # dataloader_idx tells you which dataset this is.
    ...
Note
If you don’t need to test you don’t need to implement this method.
Note
When test_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of the test epoch, the model goes back to training mode and gradients are enabled.
- predict_step(batch, _)[source]
Step function called during predict(). By default, it calls forward(). Override to add any processing logic.
The predict_step() is used to scale inference on multi-devices.
To prevent an OOM error, it is possible to use the BasePredictionWriter callback to write the predictions to disk or a database after each batch or at epoch end.
The BasePredictionWriter should be used while using a spawn-based accelerator. This happens for Trainer(strategy="ddp_spawn") or training on 8 TPU cores with Trainer(accelerator="tpu", devices=8), as predictions won’t be returned.
- Parameters:
batch – The output of your data iterable, normally a DataLoader.
batch_idx – The index of this batch.
dataloader_idx – The index of the dataloader that produced this batch. (Only if multiple dataloaders are used.)
- Returns:
Predicted output (optional).
Example
class MyModel(LightningModule):
    def predict_step(self, batch, batch_idx, dataloader_idx=0):
        return self(batch)


dm = ...
model = MyModel()
trainer = Trainer(accelerator="gpu", devices=2)
predictions = trainer.predict(model, dm)
- configure_optimizers()[source]
Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.
- Returns:
Any of the following options:
Single optimizer.
List or Tuple of optimizers.
Two lists – the first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).
Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.
None – Fit will run without any optimizer.
The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.
lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}
When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.
Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
Note
Some things to know:
Lightning calls .backward() and .step() automatically in case of automatic optimization.
If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default "epoch") in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.
If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.
If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.
If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.
If you need to control how often the optimizer steps, override the optimizer_step() hook.
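For instance, a minimal configure_optimizers() method body returning an Adam optimizer together with a ReduceLROnPlateau scheduler conditioned on a logged metric; the hyperparameters and metric name are illustrative:
import torch


def configure_optimizers(self):
    optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": scheduler,
            # ReduceLROnPlateau steps on a monitored metric, so the
            # "monitor" key is required here.
            "monitor": "val_loss",
        },
    }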
- class pynas.core.generic_lightning_module.GenericLightningSegmentationNetwork(model, learning_rate=1e-3)[source]
Bases:
LightningModule
GenericLightningSegmentationNetwork is a PyTorch Lightning module designed for segmentation tasks. It wraps a given model and provides training, validation, testing, and prediction steps, along with logging for loss, mean squared error (MSE), and intersection over union (IoU).
- model
The segmentation model to be trained and evaluated.
- Type:
torch.nn.Module
- loss_fn
The loss function used for training. Default is FocalLoss.
- Type:
callable
- mse
Metric to compute mean squared error.
- Type:
torchmetrics.Metric
- iou
Function to calculate intersection over union (IoU).
- Type:
callable
- _common_step(batch, batch_idx)[source]
Computes the loss, MSE, and IoU for a given batch. Used internally by training, validation, and test steps.
- training_step(batch, batch_idx)[source]
Defines the training step, computes metrics, and logs them.
- validation_step(batch, batch_idx)[source]
Defines the validation step, computes metrics, and logs them.
- predict_step(batch, batch_idx, dataloader_idx=0)[source]
Defines the prediction step, returning the model’s output for a given batch.
- configure_optimizers()[source]
Configures the optimizer for training. Uses Adam optimizer with the specified learning rate.
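A hedged end-to-end sketch of wrapping a model in this module and training it with a Lightning Trainer; the model, data module, and trainer settings are placeholders:
import lightning.pytorch as pl  # or `pytorch_lightning`, depending on your install
from pynas.core.generic_lightning_module import GenericLightningSegmentationNetwork

# `model` is any torch.nn.Module producing segmentation logits; `dm` is a
# data module yielding (image, mask) batches. Both are placeholders here.
net = GenericLightningSegmentationNetwork(model, learning_rate=1e-3)
trainer = pl.Trainer(max_epochs=4, accelerator="auto")
trainer.fit(net, datamodule=dm)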
- forward(x)[source]
Same as torch.nn.Module.forward(). Identical to the inherited description under GenericLightningNetwork.forward() above.
- training_step(batch, batch_idx)[source]
Computes and returns the training loss and additional metrics. Identical to the inherited description under GenericLightningNetwork.training_step() above.
- validation_step(batch, batch_idx)[source]
Operates on a single batch of data from the validation set. Identical to the inherited description under GenericLightningNetwork.validation_step() above.
- test_step(batch, batch_idx)[source]
Operates on a single batch of data from the test set. Identical to the inherited description under GenericLightningNetwork.test_step() above.
- predict_step(batch, batch_idx, dataloader_idx=0)[source]
Step function called during predict(). By default, it calls forward(). Identical to the inherited description under GenericLightningNetwork.predict_step() above.
- configure_optimizers()[source]
Choose what optimizers and learning-rate schedulers to use in your optimization. Identical to the inherited description under GenericLightningNetwork.configure_optimizers() above.
- class pynas.core.generic_lightning_module.GenericLightningNetwork_Custom(parsed_layers, model_parameters, input_channels, num_classes, learning_rate=1e-3)[source]
Bases:
LightningModule
- forward(x)[source]
Same as torch.nn.Module.forward(). Identical to the inherited description under GenericLightningNetwork.forward() above.
- training_step(batch, batch_idx)[source]
Computes and returns the training loss and additional metrics. Identical to the inherited description under GenericLightningNetwork.training_step() above.
- validation_step(batch, batch_idx)[source]
Operates on a single batch of data from the validation set. Identical to the inherited description under GenericLightningNetwork.validation_step() above.
- test_step(batch, batch_idx)[source]
Operates on a single batch of data from the test set. Identical to the inherited description under GenericLightningNetwork.test_step() above.
- predict_step(batch, batch_idx)[source]
Step function called during predict(). By default, it calls forward(). Identical to the inherited description under GenericLightningNetwork.predict_step() above.
- configure_optimizers()[source]
Choose what optimizers and learning-rate schedulers to use in your optimization. Identical to the inherited description under GenericLightningNetwork.configure_optimizers() above.
- pynas.core.generic_lightning_module.ce_loss(logits, targets, weight=None, use_hard_labels=True, reduction='none')[source]
Wrapper for cross-entropy loss in PyTorch.
- Parameters:
logits – Logit values, shape=[Batch size, # of classes].
targets – Integer or vector, shape=[Batch size] or [Batch size, # of classes].
weight – Weights for the loss if hard labels are used.
use_hard_labels (bool) – If True, targets have [Batch size] shape with int values. If False, the target is a vector. Defaults to True.
reduction (str) – Reduction mode for the loss. Defaults to 'none'.
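A usage sketch of both labeling modes, assuming the shapes documented above and per-sample losses under reduction='none':
import torch
from pynas.core.generic_lightning_module import ce_loss

logits = torch.randn(4, 3)  # [batch, classes]

# Hard labels: integer class indices, shape [batch].
hard_targets = torch.tensor([0, 2, 1, 1])
loss_hard = ce_loss(logits, hard_targets, use_hard_labels=True, reduction='none')

# Soft labels: one probability vector per sample, shape [batch, classes].
soft_targets = torch.softmax(torch.randn(4, 3), dim=-1)
loss_soft = ce_loss(logits, soft_targets, use_hard_labels=False)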