KLDivergence¶
Description¶
Loss function that computes the Kullback–Leibler divergence. Like CrossEntropy, it measures the error of representing one probability distribution (the real one) by another (the predicted one).
It is used in classification tasks.
The error function formula is:

$$KL(P \parallel Q) = \int_{\mathbb{R}^d} p(x) \log \frac{p(x)}{q(x)} \, dx$$

where

P, Q - continuous random variables in the $\mathbb{R}^d$ space;
KL(P || Q) - Kullback–Leibler divergence for distributions P and Q;
p(x), q(x) - distribution densities of P and Q respectively.
Connection with entropy and cross entropy:

$$KL(p \parallel q) = H(p, q) - H(p)$$

where

H(p) - entropy of the distribution P;
H(p, q) - cross entropy of the distributions P and Q.
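This relationship can be illustrated with a minimal NumPy sketch for discrete distributions (the variable names and example values below are purely illustrative and are not part of the PuzzleLib API):

```python
import numpy as np

# Two discrete probability distributions (hypothetical example data)
p = np.array([0.1, 0.6, 0.3])  # "real" distribution
q = np.array([0.2, 0.5, 0.3])  # "predicted" distribution

kl = np.sum(p * np.log(p / q))           # KL(p || q)
entropy = -np.sum(p * np.log(p))         # H(p)
crossEntropy = -np.sum(p * np.log(q))    # H(p, q)

assert np.isclose(kl, crossEntropy - entropy)  # KL(p || q) = H(p, q) - H(p)
```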
Initializing¶
def __init__(self, maxlabels=None, normTarget=False):
Parameters
| Parameter | Allowed types | Description | Default |
|---|---|---|---|
| maxlabels | int | Index of the last possible class | None |
| normTarget | bool | Whether to normalize the target distribution | False |
Explanations
maxlabels

- used for an additional check when working with loaded target labels: if the target labels contain values larger than the value passed in this argument, the class will raise an error;

normTarget

- when this flag is set, the values of the target tensor are normalized by the softmax function, i.e. if the target tensor is received in "raw" form with values $x_i \in \mathbb{R}$, then with the flag set $x_i \in [0, 1]$ and $\sum_{i=0}^{N} x_i = 1$.
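The effect of normTarget can be illustrated with a small NumPy sketch (a conceptual illustration of softmax normalization only, not the library's internal implementation):

```python
import numpy as np

# Hypothetical "raw" target row with arbitrary real values
raw = np.array([1.5, -0.3, 0.7], dtype=np.float32)

# Softmax turns it into a valid probability distribution
exp = np.exp(raw - raw.max())   # subtract the max for numerical stability
target = exp / exp.sum()

print(target, target.sum())     # values in [0, 1], summing to 1
```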
Examples¶
Necessary imports:
import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Cost import KLDivergence
Info
gpuarray
required to properly place tensors in GPU memory
Synthetic target and prediction tensors:
scores = gpuarray.to_gpu(np.random.randn(10, 10).astype(np.float32))
labels = gpuarray.to_gpu(np.random.randn(10, 10).astype(np.float32))
Important
Please remember that the first dimension of the target and prediction tensors is the batch size.
Initializing the error function:
div = KLDivergence(normTarget=True)
Calculating the error and the gradient on the batch:
error, grad = div(scores, labels)
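As a quick sanity check (a sketch; the exact return types depend on the backend), the gradient is expected to have the same shape as the prediction tensor:

```python
# Hypothetical follow-up check (not part of the original example)
print(error)        # KL divergence value computed on the batch
print(grad.shape)   # expected: (10, 10), matching the scores tensor
```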