Parent class: Module

Derived classes: -

This module applies the softmax function to the input tensor. Softmax is a generalization of the logistic function to the multidimensional case and is defined by the formula:

\begin{equation}\label{eq:softmax} \sigma(z)_i = \frac{e^{z_i}}{\sum \limits_{k=1}^{K} e^{z_k}} \end{equation}

The softmax function is applied along the tensor depth (across the maps) and rescales the values so that they lie in the range [0, 1] and sum to 1. For example, take a tensor of shape (N, C, H, W), where N is the batch size, C is the number of maps (channels), H is the map height and W is the map width, with elements $x_{nchw}\in\mathbb{R}$. After passing through the softmax function, each output element satisfies $y_{nchw}\in[0, 1]$, and $\displaystyle\sum_{c=1}^C y_{nchw} = 1$ for every fixed n, h and w.
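As a CPU reference, the channel-wise behaviour described above can be sketched in plain NumPy. This is not PuzzleLib's implementation: the helper name `softmax_over_maps` is hypothetical, and the maximum subtraction is a common numerical-stability trick that leaves the result mathematically unchanged.

```python
import numpy as np

def softmax_over_maps(x):
    """Softmax along axis 1 (the maps/channels axis) of an (N, C, H, W) array."""
    # Subtracting the per-position maximum does not change the result,
    # but keeps np.exp from overflowing on large inputs.
    shifted = x - x.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5)).astype(np.float32)
y = softmax_over_maps(x)

# Every element lies in [0, 1], and the maps sum to 1 at each (n, h, w) position.
print(y.min() >= 0.0, y.max() <= 1.0)
print(np.allclose(y.sum(axis=1), 1.0))
```

Note that the normalization runs over axis 1 only, so each spatial position of each batch element gets its own probability distribution over the maps.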

Softmax is often used as the last layer of deep neural networks for classification tasks. When such a network is trained, cross-entropy is typically used as the loss function.
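To illustrate how softmax output feeds a cross-entropy loss, here is a minimal NumPy sketch for a plain (N, C) classification case. The function names, scores and labels are illustrative assumptions and are not part of PuzzleLib.

```python
import numpy as np

def softmax(z):
    # Softmax over the last axis (class scores), shifted for numerical stability.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability assigned to the correct class.
    picked = probs[np.arange(len(labels)), labels]
    return -np.log(picked).mean()

# Made-up class scores for two samples and three classes.
scores = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]], dtype=np.float32)
# Made-up target classes, one per sample.
labels = np.array([0, 2])

probs = softmax(scores)
loss = cross_entropy(probs, labels)
print(loss)
```

The loss is small when the largest softmax probability falls on the target class and grows as probability mass moves to other classes.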


def __init__(self, name=None):


| Parameter | Allowed types | Description | Default |
|-----------|---------------|-------------|---------|
| name | str | Layer name | None |




Necessary imports.

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import SoftMax


gpuarray is required to place the tensor on the GPU.

For simplicity, let us take maps of unit height:

batchsize, maps, h, w = 1, 3, 1, 3
data = gpuarray.to_gpu(np.random.randn(batchsize, maps, h, w).astype(np.float32))
print(data)
[[[[-1.0856307   0.99734545  0.2829785 ]]

  [[-1.5062947  -0.5786002   1.6514366 ]]

  [[-2.4266791  -0.42891264  1.2659363 ]]]]
softmax = SoftMax()
outdata = softmax(data)
print(outdata)
[[[[0.521327   0.69107646 0.13155064]]

  [[0.34230885 0.14292283 0.51690024]]

  [[0.13636416 0.16600075 0.35154915]]]]
print(np.sum(outdata.get(), axis=1))
[[[1. 1. 1.]]]
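The printed result can be cross-checked on the CPU without PuzzleLib. The sketch below copies the input and output values from the example above and recomputes softmax along axis 1 with NumPy. It is a verification aid under that assumption, not the library's code path, and the names `reference` and `expected` are chosen here for illustration.

```python
import numpy as np

# Input values copied from the example above, shape (1, 3, 1, 3).
data = np.array([[[[-1.0856307, 0.99734545, 0.2829785]],
                  [[-1.5062947, -0.5786002, 1.6514366]],
                  [[-2.4266791, -0.42891264, 1.2659363]]]], dtype=np.float32)

# Softmax along axis 1 (the maps axis), computed on the CPU.
e = np.exp(data - data.max(axis=1, keepdims=True))
reference = e / e.sum(axis=1, keepdims=True)

# Output values copied from the example above.
expected = np.array([[[[0.521327, 0.69107646, 0.13155064]],
                      [[0.34230885, 0.14292283, 0.51690024]],
                      [[0.13636416, 0.16600075, 0.35154915]]]], dtype=np.float32)

# Compare with a tolerance that allows for float32 rounding.
print(np.allclose(reference, expected, atol=1e-4))
```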