Trainer

Description

A handler that simplifies model training by sparing the user from manually specifying the sequence of operations. It acts as a wrapper around the operations performed on the data.

Info

Depending on where the data is located, handlers split it in one of the following ways:

  • data is stored on disk: the data is first split into macrobatches, blocks that fit entirely in GPU memory; each macrobatch is then split into smaller batches, which are fed directly to the model input;
  • data is already in GPU memory: it is split into batches, which are fed directly to the model input.
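The two-level splitting described above can be sketched in plain NumPy. This is only an illustration, not the library's implementation: the helper names (`iter_blocks`, `split_from_host`) are hypothetical, the host-to-GPU copy is represented by a comment, and keeping the trailing partial batch is an assumption.

```python
import numpy as np


def iter_blocks(array, block_size):
    """Yield consecutive slices of `array` holding at most `block_size` items."""
    for start in range(0, len(array), block_size):
        yield array[start:start + block_size]


def split_from_host(data, macro_batch_size, batch_size):
    """Return the batch sizes produced by macrobatch -> batch splitting."""
    sizes = []
    for macrobatch in iter_blocks(data, macro_batch_size):
        # A real handler would copy the macrobatch to GPU memory at this point.
        for batch in iter_blocks(macrobatch, batch_size):
            # A real handler would feed the batch to the model here.
            sizes.append(len(batch))
    return sizes


data = np.arange(25 * 4, dtype=np.float32).reshape(25, 4)
sizes = split_from_host(data, macro_batch_size=10, batch_size=4)
print(sizes)
```

When the data is already in GPU memory, only the inner loop applies: the tensor is cut directly into batches.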

Initializing

def __init__(self, mod, cost, optimizer, onBatchFinish=None, batchsize=128):

Parameters

Parameter      Allowed types  Description                                               Default
mod            Module         Trainable neural network                                  -
cost           Cost           Cost (objective) function                                 -
optimizer      Optimizer      Model optimizer                                           -
onBatchFinish  callable       Function called after each data batch has been processed  None
batchsize      int            Size of a data batch                                      128

Explanations

-

Methods

All the basic methods of handlers can be found in the documentation for the parent class Handler.

trainFromHost

def trainFromHost(self, data, target, macroBatchSize=10000, onMacroBatchFinish=None, random=True):

Functionality

Wrapper around the handleFromHost() method of the parent Handler class that accounts for the specifics of training: when called, it resets the accumulated error and switches the model to training mode (some network layers behave differently in training and inference modes).
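The wrapper pattern described here can be sketched with stand-in classes. Everything in the sketch is hypothetical (`ToyModule`, `ToyHandler`, `ToyTrainer`, `setTrainMode`, `accumulatedError`); only the method names `trainFromHost` and `handleFromHost` and the argument list come from this page. It shows the order of operations, not the library's actual code.

```python
class ToyModule:
    """Hypothetical stand-in for a network with separate training and inference modes."""

    def __init__(self):
        self.training = False

    def setTrainMode(self):
        self.training = True


class ToyHandler:
    """Hypothetical stand-in for the parent handler; it only records the calls it receives."""

    def __init__(self, mod):
        self.mod = mod
        self.calls = []

    def handleFromHost(self, data, target, macroBatchSize, onMacroBatchFinish, random):
        self.calls.append((len(data), macroBatchSize, random))


class ToyTrainer(ToyHandler):
    def __init__(self, mod):
        super().__init__(mod)
        self.accumulatedError = 0.0

    def trainFromHost(self, data, target, macroBatchSize=10000,
                      onMacroBatchFinish=None, random=True):
        self.accumulatedError = 0.0   # reset the error accumulated by earlier calls
        self.mod.setTrainMode()       # layers may behave differently while training
        self.handleFromHost(data, target, macroBatchSize, onMacroBatchFinish, random)


trainer = ToyTrainer(ToyModule())
trainer.accumulatedError = 3.5  # pretend an earlier epoch left an error value behind
trainer.trainFromHost([1, 2, 3], [0, 1, 0], macroBatchSize=2)
print(trainer.accumulatedError, trainer.mod.training, trainer.calls)
```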

Parameters

Parameter           Allowed types  Description                                               Default
data                tensor         Data tensor                                               -
target              tensor         Tensor of labels corresponding to the data                -
macroBatchSize      int            Macrobatch size used when splitting the data              10000
onMacroBatchFinish  callable       Function called after each macrobatch has been processed  None
random              bool           Whether to shuffle the data batches before processing     True

Explanations

-

train

def train(self, data, target, random=True):

Functionality

Wrapper around the handle() method of the parent Handler class that accounts for the specifics of training: when called, it resets the accumulated error and switches the model to training mode (some network layers behave differently in training and inference modes).

Parameters

Parameter  Allowed types  Description                                               Default
data       GPUArray       Data tensor located in GPU memory                         -
target     GPUArray       Label tensor for the data, located in GPU memory          -
random     bool           Whether to shuffle the data batches before processing     True
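The effect of the random flag can be illustrated with a small NumPy sketch. It assumes that shuffling permutes the samples and applies the same permutation to data and target so that each label stays with its sample; the page does not say whether the library permutes samples or whole batches. The function `shuffled_batches` and its `seed` argument are hypothetical, and NumPy arrays stand in for GPUArray objects.

```python
import numpy as np


def shuffled_batches(data, target, batch_size, random=True, seed=None):
    """Yield (data, target) batches, optionally in a randomly permuted sample order."""
    n = len(data)
    if random:
        order = np.random.default_rng(seed).permutation(n)
    else:
        order = np.arange(n)
    for start in range(0, n, batch_size):
        idx = order[start:start + batch_size]
        # The same index set is applied to both arrays to keep pairs aligned.
        yield data[idx], target[idx]


data = np.arange(10, dtype=np.float32).reshape(10, 1)
target = np.arange(10)
batches = list(shuffled_batches(data, target, batch_size=4, seed=0))
ordered = list(shuffled_batches(data, target, batch_size=4, random=False))
```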

Explanations

-

handleBatch

def handleBatch(self, batch, idx, state):

Functionality

Core method of the training handler. It computes the gradient of the cost function on the given batch, backpropagates the error, and then has the optimizer update the model weights.
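The sequence described above (forward pass, cost gradient, backward pass, optimizer update) can be sketched for a single linear layer in NumPy. All names here (`LinearModel`, `mse_and_grad`, `sgd_update`, `handle_batch`) are hypothetical, and the mean squared error cost and plain SGD are assumptions chosen to keep the example short. The batch follows the [data, target] layout documented for this method.

```python
import numpy as np


class LinearModel:
    """Hypothetical one-layer model: pred = x @ W."""

    def __init__(self, n_in, n_out):
        self.W = np.zeros((n_in, n_out))

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W

    def backward(self, grad_out):
        self.gradW = self.x.T @ grad_out  # gradient of the cost w.r.t. W


def mse_and_grad(pred, target):
    """Return the mean squared error and its gradient w.r.t. the prediction."""
    diff = pred - target
    return float(np.mean(diff ** 2)), 2.0 * diff / diff.size


def sgd_update(model, lr):
    model.W -= lr * model.gradW


def handle_batch(model, batch, lr=0.1):
    data, target = batch                      # batch is [data, target]
    pred = model.forward(data)                # forward pass
    error, grad = mse_and_grad(pred, target)  # cost value and its gradient
    model.backward(grad)                      # propagate the error back
    sgd_update(model, lr)                     # optimizer updates the weights
    return error


rng = np.random.default_rng(0)
data = rng.normal(size=(32, 3))
true_W = np.array([[1.0], [-2.0], [0.5]])
target = data @ true_W
model = LinearModel(3, 1)
errors = [handle_batch(model, [data, target]) for _ in range(200)]
```

Repeating the step on the same batch drives the error toward zero, which is a quick way to check that the gradient and the update have the right sign.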

Parameters

Parameter  Allowed types  Description                                               Default
batch      list           List of two elements: [data, target]                      -
idx        int            Index of the data batch                                   -
state      dict           Dictionary describing the data processing state           -

Explanations

-