SoftMax

Description

Info

Parent class: Module

Derived classes: -

This module applies the softmax function to the input tensor. Softmax generalizes the logistic function to the multidimensional case. For a vector $z$ of $K$ real values, it is defined by the formula:

$\begin{equation}\label{eq:softmax} \sigma(z)_i = \frac{e^{z_i}}{\sum \limits_{k=1}^{K} e^{z_k}} \end{equation}$

The softmax function is applied along the tensor depth (across the maps), so that at every spatial position the values lie in the range [0, 1] and sum to 1 over the maps. For an input tensor $x$ of shape $(N, C, H, W)$, where $N$ is the batch size, $C$ is the number of maps (channels), $H$ is the map height and $W$ is the map width, with elements $x_{nchw}\in\mathbb{R}$, the output $y$ satisfies $y_{nchw}\in[0, 1]$ and $\displaystyle\sum_{c=1}^C y_{nchw} = 1$ for every $n$, $h$, $w$.
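A minimal NumPy sketch of the channel-wise softmax described above, for illustration only; it is not PuzzleLib's implementation. The function name `softmax_channels` is hypothetical, and subtracting the per-position maximum before exponentiating is an assumption added here for numerical stability (it does not change the result of the formula).

```python
import numpy as np

def softmax_channels(x, axis=1):
    """Softmax along `axis` (the maps axis of an (N, C, H, W) tensor)."""
    # Shifting by the maximum leaves the result unchanged
    # but keeps np.exp from overflowing on large inputs.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 3, 5)).astype(np.float32)
y = softmax_channels(x)

print(y.shape)                            # same shape as the input
print(y.min() >= 0.0, y.max() <= 1.0)     # every element lies in [0, 1]
print(np.allclose(y.sum(axis=1), 1.0))    # sums over the maps equal 1
```

Each of the $N \cdot H \cdot W$ positions is normalized independently, which is why the sum is taken over axis 1 only.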

Softmax is often used as the last layer of deep neural networks for classification tasks. When such a network is trained, cross-entropy is typically used as the loss function.
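To make the pairing with cross-entropy concrete, here is a small NumPy sketch of the loss computed from softmax outputs. It is an illustration under stated assumptions, not PuzzleLib's cost API: the function `cross_entropy`, its `eps` guard against `log(0)`, and the sample probabilities are all hypothetical.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true classes.

    probs:  (N, C) array of softmax outputs, each row summing to 1
    labels: (N,) array of integer class indices
    """
    # Pick the predicted probability of the correct class for each sample
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + eps)))

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]], dtype=np.float32)
labels = np.array([0, 2])

loss = cross_entropy(probs, labels)
print(loss)  # -(ln 0.7 + ln 0.8) / 2, about 0.29
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.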

Initializing

def __init__(self, name=None):

Parameters

Parameter	Allowed types	Description	Default
name	str	Layer name	None

Explanations

-

Examples

Necessary imports.

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import SoftMax

Info

gpuarray is required to place the tensor on the GPU.

For simplicity, let us take maps of unit height:

np.random.seed(123)
batchsize, maps, h, w = 1, 3, 1, 3
data = gpuarray.to_gpu(np.random.randn(batchsize, maps, h, w).astype(np.float32))
print(data)

[[[[-1.0856307   0.99734545  0.2829785 ]]

  [[-1.5062947  -0.5786002   1.6514366 ]]

  [[-2.4266791  -0.42891264  1.2659363 ]]]]

softmax = SoftMax()
outdata = softmax(data)
print(outdata)

[[[[0.521327   0.69107646 0.13155064]]

  [[0.34230885 0.14292283 0.51690024]]

  [[0.13636416 0.16600075 0.35154915]]]]

print(np.sum(outdata.get(), axis=1))

[[[1. 1. 1.]]]
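As a sanity check, the output above can be compared with a plain NumPy computation on the CPU. This sketch assumes only NumPy (no GPU and no PuzzleLib), regenerates the input from the same seed, and hardcodes the values printed by the module; the names `reference` and `documented` are illustrative.

```python
import numpy as np

# Same seed and shape as in the example, generated on the CPU
np.random.seed(123)
data = np.random.randn(1, 3, 1, 3).astype(np.float32)

# Softmax over axis 1 (the maps)
e = np.exp(data - data.max(axis=1, keepdims=True))
reference = e / e.sum(axis=1, keepdims=True)

# Values printed by the SoftMax module in the example
documented = np.array([[[[0.521327,   0.69107646, 0.13155064]],
                        [[0.34230885, 0.14292283, 0.51690024]],
                        [[0.13636416, 0.16600075, 0.35154915]]]],
                      dtype=np.float32)

# Largest absolute difference between the two results
print(np.abs(reference - documented).max())
```

A difference on the order of float32 rounding error indicates that the module normalizes over the maps axis, as described.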