CrossEntropy is a loss function that computes cross-entropy. Like KLDivergence, it measures the error of representing one probability distribution (the real one) by another (the predicted one).

It is used in classification problems.

The error function formula is:

H = -\sum_{c=1}^{M} y_{o,c} \cdot \log{p(y_{o,c})}


where:
M - the number of classes;
y_{o,c} - binary indicator (0 or 1) of whether object o belongs to class c;
p(y_{o,c}) - the probability, predicted by the classifier, that object o belongs to class c.
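The formula above can be sketched in plain NumPy for a single object (an illustrative example with made-up values, not part of PuzzleLib):

```python
import numpy as np

# One-hot indicator y and predicted probabilities p for a single object
# over M = 4 classes (illustrative values).
y = np.array([0, 0, 1, 0], dtype=np.float32)          # object belongs to class 2
p = np.array([0.1, 0.2, 0.6, 0.1], dtype=np.float32)  # classifier output

# H = -sum_c y_c * log(p_c); only the true-class term survives
H = -np.sum(y * np.log(p))
print(H)  # -log(0.6) ≈ 0.5108
```

Because y is one-hot, the sum collapses to the negative log-probability assigned to the true class.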


def __init__(self, maxlabels=None, weights=None):


Parameter   Allowed types   Description                         Default
maxlabels   int             index of the last possible class    None
weights     tensor          vector of class weights             None


maxlabels is needed for additional validation when working with loaded target labels: if the labels contain values greater than the value passed in this argument, the class will raise an error;

weights is a vector of class weights used to regulate the influence of each class on the value of the error function, for example when working with an imbalanced dataset.
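The effect of class weights can be illustrated with a small NumPy sketch (the weight values here are hypothetical, chosen only to show the mechanism):

```python
import numpy as np

# Hypothetical weights for a 3-class imbalanced problem:
# the rare class 2 gets a larger weight.
weights = np.array([0.5, 0.5, 2.0], dtype=np.float32)

y = np.array([0, 0, 1], dtype=np.float32)   # true class is the rare class 2
p = np.array([0.2, 0.3, 0.5], dtype=np.float32)

plain    = -np.sum(y * np.log(p))            # unweighted cross-entropy
weighted = -np.sum(weights * y * np.log(p))  # rare-class mistakes cost more

print(plain, weighted)  # weighted is 2x plain here
```

Raising a class's weight scales its contribution to the loss, pushing the optimizer to pay more attention to that class.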


Necessary imports:

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Cost import CrossEntropy


gpuarray is required to properly place the tensors on the GPU.

Synthetic target and prediction tensors:

scores = gpuarray.to_gpu(np.random.randn(20, 10, 3).astype(np.float32))
labels = gpuarray.to_gpu(np.random.randint(low=0, high=10, size=(20, 3)).astype(np.int32))


Please remember that the first dimension of the target and prediction tensors is the batch size.

Initializing the error function:

entr = CrossEntropy()

Calculating the error and the gradient on the batch:

error, grad = entr(scores, labels)
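As a sanity check, the same computation can be sketched in plain NumPy for the shapes used above. This sketch assumes the raw scores are converted to probabilities by a softmax over the class axis and that the gradient is taken with respect to the mean negative log-likelihood; these are common conventions, so verify them against the library source before relying on exact numbers:

```python
import numpy as np

np.random.seed(0)
scores = np.random.randn(20, 10, 3).astype(np.float32)  # (batch, classes, spatial)
labels = np.random.randint(0, 10, size=(20, 3))

# Softmax over the class axis (axis=1), shifted for numerical stability
e = np.exp(scores - scores.max(axis=1, keepdims=True))
probs = e / e.sum(axis=1, keepdims=True)

# Mean negative log-likelihood of the true class at each (batch, spatial) position
idx = (np.arange(20)[:, None], labels, np.arange(3)[None, :])
error = -np.log(probs[idx]).mean()

# Gradient of the mean NLL w.r.t. scores: softmax(scores) minus one-hot labels
grad = probs.copy()
grad[idx] -= 1.0
grad /= labels.size
print(error, grad.shape)
```

Per (batch, spatial) position the gradient sums to zero over the class axis, since the softmax probabilities sum to one and exactly one label term is subtracted.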