Conv3D¶

Description¶

This module performs a three-dimensional convolution operation. Please see ConvND for more information.

For an input tensor of shape $(N, C_{in}, D_{in}, H_{in}, W_{in})$ and an output tensor of shape $(N, C_{out}, D_{out}, H_{out}, W_{out})$, the operation is performed as follows (we consider the i-th element of the batch and the j-th map of the output tensor):

$out(N_{i},C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in} - 1}weight(C_{out_j}, k) \star input_i(N_{i},k)$

where

$N$ - batch size;
$C$ - number of maps in the tensor;
$D$ - tensor map depth;
$H$ - tensor map height;
$W$ - tensor map width;
$bias$ - bias tensor of the convolution layer, of shape $(1, C_{out}, 1, 1, 1)$;
$weight$ - weights tensor of the convolution layer, of shape $(C_{out}, C_{in}, size_d, size_h, size_w)$;
$\star$ - cross correlation operator.
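
To make the formula concrete, the following pure-Python sketch computes a single output map for the simplest case (stride 1, no padding, no dilation, groups=1). It is an illustrative reference only, not the library's implementation; the names `cross_correlate_3d`, `output_map` and `filled` are hypothetical helpers introduced for this example.

```python
def cross_correlate_3d(volume, kernel):
    """Valid cross-correlation of one 3D map with one 3D kernel (nested lists)."""
    d, h, w = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for z in range(d - kd + 1):
        plane = []
        for y in range(h - kh + 1):
            row = []
            for x in range(w - kw + 1):
                # The kernel is slid over the map without being flipped,
                # which is what distinguishes cross-correlation from convolution.
                acc = 0.0
                for a in range(kd):
                    for b in range(kh):
                        for c in range(kw):
                            acc += volume[z + a][y + b][x + c] * kernel[a][b][c]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def output_map(input_maps, kernels, bias):
    """out_j = bias_j + sum over k of (kernels[k] cross-correlated with input_maps[k])."""
    total = None
    for volume, kernel in zip(input_maps, kernels):
        part = cross_correlate_3d(volume, kernel)
        if total is None:
            total = part
        else:
            # Element-wise sum of the contributions of the input maps.
            total = [[[t + p for t, p in zip(t_row, p_row)]
                      for t_row, p_row in zip(t_plane, p_plane)]
                     for t_plane, p_plane in zip(total, part)]
    return [[[v + bias for v in row] for row in plane] for plane in total]


def filled(value, d, h, w):
    """Build a d x h x w nested list filled with a constant."""
    return [[[value] * w for _ in range(h)] for _ in range(d)]


# Two 3x3x3 input maps of ones, two 2x2x2 kernels of ones, bias 0.5.
result = output_map([filled(1.0, 3, 3, 3)] * 2, [filled(1.0, 2, 2, 2)] * 2, 0.5)
print(len(result), len(result[0]), len(result[0][0]))  # 2 2 2
print(result[0][0][0])  # 8 + 8 + 0.5 = 16.5
```

Each output element sums 8 products per input map, so two input maps plus the bias give 16.5 everywhere.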

Initializing¶

def __init__(self, inmaps, outmaps, size, stride=1, pad=0, dilation=1, wscale=1.0, useBias=True, name=None,
                 initscheme=None, empty=False, groups=1):

Parameters

Parameter	Allowed types	Description	Default
inmaps	int	Number of maps in the input tensor	-
outmaps	int	Number of maps in the output tensor	-
size	int	Convolution kernel size (the kernel is always equilateral)	-
stride	Union[int, tuple]	Convolution stride	1
pad	Union[int, tuple]	Map padding	0
dilation	Union[int, tuple]	Convolution window dilation	1
wscale	float	Random layer weights variance	1.0
useBias	bool	Whether to add a bias to the layer output	True
initscheme	Union[tuple, str]	Specifies the layer weights initialization scheme (see createTensorWithScheme)	None -> ("xavier_uniform", "in")
name	str	Layer name	None
empty	bool	If True, the weights and biases are left uninitialized	False
groups	int	Number of groups the maps are split into for separate processing	1

Explanations

Info

For the above input $(N, C_{in}, D_{in}, H_{in}, W_{in})$ and output $(N, C_{out}, D_{out}, H_{out}, W_{out})$ tensors, the shapes are related as follows (the division is rounded down):
\begin{equation} D_{out} = \left\lfloor \frac{D_{in} + 2pad_d - dil_d(size_d - 1) - 1}{stride_d} \right\rfloor + 1 \end{equation}
\begin{equation} H_{out} = \left\lfloor \frac{H_{in} + 2pad_h - dil_h(size_h - 1) - 1}{stride_h} \right\rfloor + 1 \end{equation}
\begin{equation} W_{out} = \left\lfloor \frac{W_{in} + 2pad_w - dil_w(size_w - 1) - 1}{stride_w} \right\rfloor + 1 \end{equation}
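
The shape relation can be checked with a few lines of plain Python. This is a minimal sketch that assumes the division is rounded down, as is usual for convolution layers; `conv_out_size` and `conv3d_out_shape` are hypothetical helper names and are not part of the library.

```python
def conv_out_size(in_size, size, stride=1, pad=0, dilation=1):
    """Output length along one axis, following the relation above."""
    return (in_size + 2 * pad - dilation * (size - 1) - 1) // stride + 1


def conv3d_out_shape(in_shape, size, stride=(1, 1, 1), pad=(0, 0, 0), dilation=(1, 1, 1)):
    """Apply the per-axis relation to (D, H, W); every argument is a 3-tuple."""
    return tuple(
        conv_out_size(i, k, s, p, dl)
        for i, k, s, p, dl in zip(in_shape, size, stride, pad, dilation)
    )


# A 6x6x6 map, as used in the examples of this page.
print(conv3d_out_shape((6, 6, 6), (2, 2, 2)))  # (5, 5, 5)
print(conv3d_out_shape((6, 6, 6), (4, 4, 4)))  # (3, 3, 3)
print(conv3d_out_shape((6, 6, 6), (4, 4, 4), stride=(1, 4, 4), pad=(0, 1, 1)))  # (3, 2, 2)
```

The last call shows how a larger stride along the height and width shrinks those axes while the depth is unaffected.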

size - the filters are always equilateral, i.e. (size_d, size_h, size_w), where size_d == size_h == size_w;

stride - either a single convolution stride value for all map axes, or a tuple (stride_d, stride_h, stride_w), where stride_d is the convolution stride along the depth of the map, stride_h along the height of the map, and stride_w along the width;

pad - either a single padding value for all sides of the maps, or a tuple (pad_d, pad_h, pad_w), where pad_d is the padding on each side along the depth of the map, pad_h along the height of the map, and pad_w along the width. Asymmetric padding (adding elements on only one side of the tensor) is not supported by this module;

dilation - either a single dilation value for all axes of the convolution kernel, or a tuple (dil_d, dil_h, dil_w), where dil_d is the filter dilation along the depth of the map, dil_h along the height of the map, and dil_w along the width;

groups - number of groups into which the set of maps is split in order to be convolved separately.

The general rule is (bear in mind that the values of the inmaps and outmaps parameters must be divisible by the value of the groups parameter): for every $\frac{inmaps}{groups}$ input maps, $\frac{outmaps}{groups}$ output maps are formed. In other words, groups independent convolutions are performed. Special cases:

if groups=1, then each output map interacts with all input maps, that is, a regular convolution occurs;
if inmaps == outmaps == groups, then a depthwise convolution occurs: one output map is formed from each input map (please see details in ConvND theory).

Thus, to obtain a full Depthwise Separable Convolution block, it is necessary to place two convolution layers from the library in sequence:

one depthwise convolution with parameters inmaps == outmaps == groups;
one pointwise convolution with a kernel of size 1.
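
The grouping rule and the saving offered by a depthwise separable block can be illustrated in plain Python. The sketch below rests on two assumptions that the text above does not state: groups are formed from consecutive blocks of maps, and the weight tensor holds $\frac{inmaps}{groups}$ input slices per output map. The helpers `group_connections` and `weight_count` are hypothetical and are not part of the library.

```python
def group_connections(inmaps, outmaps, groups):
    """Map each output map index to the input map indices it is computed from."""
    if inmaps % groups or outmaps % groups:
        raise ValueError("inmaps and outmaps must be divisible by groups")
    in_per, out_per = inmaps // groups, outmaps // groups
    # Assumption: group g covers consecutive input and output maps.
    return {
        j: list(range((j // out_per) * in_per, (j // out_per + 1) * in_per))
        for j in range(outmaps)
    }


def weight_count(inmaps, outmaps, size, groups=1):
    """Number of weights of a cubic 3D kernel, bias excluded."""
    return outmaps * (inmaps // groups) * size ** 3


print(group_connections(4, 4, 1)[0])  # [0, 1, 2, 3]: regular convolution
print(group_connections(4, 4, 4))     # {0: [0], 1: [1], 2: [2], 3: [3]}: depthwise
print(group_connections(4, 8, 2)[5])  # [2, 3]: second group

# Regular 16 -> 32 convolution with a kernel of size 3.
regular = weight_count(16, 32, 3)
# Depthwise (inmaps == outmaps == groups) followed by pointwise (size 1).
separable = weight_count(16, 16, 3, groups=16) + weight_count(16, 32, 1)
print(regular, separable)  # 13824 944
```

Under these assumptions the separable block needs far fewer weights than the regular convolution with the same numbers of input and output maps.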

Examples¶

Basic convolution example¶

Necessary imports.

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import Conv3D
from PuzzleLib.Variable import Variable

Info

gpuarray is required to properly place the tensor on the GPU.

For this module, examples are given in a simplified version. For a more visual presentation, please see the Conv2D examples.

batchsize, inmaps, d, h, w = 1, 1, 6, 6, 6
outsize = 2
data = gpuarray.to_gpu(np.random.randn(batchsize, inmaps, d, h, w).astype(np.float32))

Let us initialize the module with standard parameters (stride=1, pad=0, dilation=1, groups=1):

conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=2)
print(conv(data))

Let us leave all parameters the same except size:

conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=4)
print(conv(data))

The size parameter can be set separately for each axis of the map:

conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=(2, 4, 2))
print(conv(data))

The stride and pad parameters can also be set separately for each axis of the map:

conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=4, stride=(1, 4, 4), pad=(0, 1, 1))
print(conv(data))

If a parameter takes different values for different axes, the values for all three axes must be passed explicitly. The following example raises an error, because the stride tuple has only two values:

conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=2, stride=(1, 3))
print(conv(data))
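
The rule about per-axis values can be expressed as a small normalization step. The sketch below is a hypothetical helper written for illustration, not the library's actual code; the exception type raised by the module itself may differ.

```python
def to_triple(value, name):
    """Expand an int to (v, v, v) or check that a tuple has exactly 3 entries."""
    if isinstance(value, int):
        return (value, value, value)
    value = tuple(value)
    if len(value) != 3:
        raise ValueError(
            f"{name} must be an int or a tuple of 3 ints, got {len(value)} values"
        )
    return value


print(to_triple(1, "stride"))          # (1, 1, 1)
print(to_triple((1, 4, 4), "stride"))  # (1, 4, 4)

try:
    to_triple((1, 3), "stride")        # only two of the three axes are given
except ValueError as err:
    print("rejected:", err)
```

A single integer is unambiguous because it applies to every axis, whereas a two-element tuple leaves one axis undefined and is therefore rejected.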