Trainer¶
Description¶
A handler that simplifies model training by removing the need to manually prescribe the sequence of training operations. It is essentially a wrapper around the operations performed on the data.
Info
In handlers, depending on where the data resides, splitting is performed in one of two ways:
- the data resides on disk (host): it is first split into macrobatches, blocks small enough to fit entirely in GPU memory, after which each macrobatch is split into smaller batches that are fed directly to the model input;
- the data is already in GPU memory: it is split into batches that are fed directly to the model input.
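The two-level splitting described above can be sketched in pure Python. This is only an illustration of the indexing logic; the real handler operates on GPU tensors and the names `split_into_batches` and `iterate_macrobatches` are invented for this sketch:

```python
def split_into_batches(data, batchsize):
    """Split a sequence into consecutive chunks of at most `batchsize` items."""
    return [data[i:i + batchsize] for i in range(0, len(data), batchsize)]


def iterate_macrobatches(data, macro_batch_size, batchsize):
    """Two-level split: host data -> macrobatches (blocks that fit in GPU
    memory) -> batches (fed directly to the model input)."""
    for macrobatch in split_into_batches(data, macro_batch_size):
        # in the real handler, the macrobatch would be copied to the GPU here
        for batch in split_into_batches(macrobatch, batchsize):
            yield batch


batches = list(iterate_macrobatches(list(range(10)), macro_batch_size=4, batchsize=2))
print(batches)  # [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```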
Initializing¶
def __init__(self, mod, cost, optimizer, onBatchFinish=None, batchsize=128):
Parameters
Parameter | Allowed types | Description | Default |
---|---|---|---|
mod | Module | Trainable neural network | - |
cost | Cost | Target function | - |
optimizer | Optimizer | Model optimizer | - |
onBatchFinish | callable | Function called after each batch of data has been processed | None |
batchsize | int | Size of a data batch | 128 |
Explanations
-
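A hedged sketch of how the constructor arguments fit together: the trainer stores the module, cost function, and optimizer, and invokes `onBatchFinish` after every processed batch. The stub class below mimics only this callback wiring, not the real Trainer:

```python
class TrainerStub:
    """Minimal stand-in illustrating the constructor contract described above
    (not the real Trainer class)."""

    def __init__(self, mod, cost, optimizer, onBatchFinish=None, batchsize=128):
        self.module, self.cost, self.optimizer = mod, cost, optimizer
        self.onBatchFinish = onBatchFinish
        self.batchsize = batchsize

    def run_epoch(self, nbatches):
        for idx in range(nbatches):
            # ... forward pass, backpropagation and weight update happen here ...
            if self.onBatchFinish is not None:
                self.onBatchFinish(self)  # the hook fires after each batch


seen = []
trainer = TrainerStub(mod=None, cost=None, optimizer=None,
                      onBatchFinish=lambda tr: seen.append(tr.batchsize))
trainer.run_epoch(3)
print(seen)  # [128, 128, 128]
```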
Methods¶
All the basic methods of handlers can be found in the documentation for the parent class Handler.
trainFromHost¶
def trainFromHost(self, data, target, macroBatchSize=10000, onMacroBatchFinish=None, random=True):
Functionality
Wrapper function around the handleFromHost() method of the Handler parent class that takes into account the specifics of the training process: when the method is called, the accumulated error is reset and the model is switched to the training mode (some network layers can behave differently in the training and inference modes).
Parameters
Parameter | Allowed types | Description | Default |
---|---|---|---|
data | tensor | Data tensor | - |
target | tensor | Tensor of labels corresponding to the data | - |
macroBatchSize | int | Size of a macrobatch. The data will be split into macrobatches of size macroBatchSize | 10000 |
onMacroBatchFinish | callable | Function that will be called after processing the macrobatch | None |
random | bool | Whether the data batches should be shuffled randomly before processing | True |
Explanations
-
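The reset-and-delegate behaviour described above amounts to the following pattern. This is a sketch modelled on the text only: the `accumulated_error` attribute, the `training` flag, and the `handleFromHost` body are assumptions, not the real implementation:

```python
class HandlerSketch:
    """Stand-in for the Handler parent class."""

    def __init__(self):
        self.accumulated_error = 5.0  # pretend some error was accumulated earlier
        self.training = False

    def handleFromHost(self, data, target, **kwargs):
        return "processed %d samples" % len(data)


class TrainerSketch(HandlerSketch):
    def trainFromHost(self, data, target, macroBatchSize=10000,
                      onMacroBatchFinish=None, random=True):
        self.accumulated_error = 0.0  # reset the accumulated error
        self.training = True          # switch layers to training mode
        return self.handleFromHost(data, target, macroBatchSize=macroBatchSize,
                                   onMacroBatchFinish=onMacroBatchFinish,
                                   random=random)


t = TrainerSketch()
result = t.trainFromHost([1, 2, 3], [0, 1, 0])
print(t.accumulated_error, t.training)  # 0.0 True
```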
train¶
def train(self, data, target, random=True):
Functionality
Wrapper function around the handle() method of the Handler parent class that takes into account the specifics of the training process: when the method is called, the accumulated error is reset and the model is switched to the training mode (some network layers can behave differently in the training and inference modes).
Parameters
Parameter | Allowed types | Description | Default |
---|---|---|---|
data | GPUArray | Data tensor placed in the GPU | - |
target | GPUArray | Tensor of labels corresponding to the data, placed in the GPU | - |
random | bool | Whether the data batches should be shuffled randomly before processing | True |
Explanations
-
handleBatch¶
def handleBatch(self, batch, idx, state):
Functionality
Root method of the training handler. It calculates the gradient of the error function on the transmitted batch and propagates the error back, after which it starts the process of updating the model weights by the optimizer.
Parameters
Parameter | Allowed types | Description | Default |
---|---|---|---|
batch | list | List of two elements: [data, target] | - |
idx | int | Index number of the data batch | - |
state | dict | Dictionary containing information about the state of data processing | - |
Explanations
-
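The sequence handleBatch performs (forward pass, gradient of the error function, backpropagation, weight update by the optimizer) can be illustrated with a one-parameter toy model. None of this is library code; the model, cost, and optimizer step are all simplified assumptions:

```python
class ToyTrainer:
    """One-weight linear model y = w * x trained with squared-error cost
    and plain SGD, mirroring the handleBatch contract described above."""

    def __init__(self, w=0.0, lr=0.1):
        self.w, self.lr = w, lr

    def handleBatch(self, batch, idx, state):
        data, target = batch  # batch is a list of two elements: [data, target]
        # forward pass + gradient of the mean squared error w.r.t. w
        grad = sum(2 * (self.w * x - t) * x for x, t in zip(data, target)) / len(data)
        self.w -= self.lr * grad  # optimizer step (plain SGD)
        state["batches_done"] = state.get("batches_done", 0) + 1


model = ToyTrainer()
state = {}
for idx in range(50):
    model.handleBatch([[1.0, 2.0], [2.0, 4.0]], idx, state)  # learns w = 2

print(round(model.w, 3), state["batches_done"])  # 2.0 50
```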