Conv3D¶
Description¶
This module performs a three-dimensional convolution operation. Please see ConvND for more information.
For an input tensor of shape (N, C_{in}, D_{in}, H_{in}, W_{in}) and an output tensor of shape (N, C_{out}, D_{out}, H_{out}, W_{out}), the operation is performed as follows (we consider the i-th element of the batch and the j-th map of the output tensor):

\begin{equation} out(N_i, C_{out_j}) = bias(C_{out_j}) + \sum_{k=0}^{C_{in} - 1} weight(C_{out_j}, k) \star input(N_i, k) \end{equation}

where
N - batch size;
C - number of maps in the tensor;
D - tensor map depth;
H - tensor map height;
W - tensor map width;
bias - bias tensor of the convolution layer, of shape (1, C_{out}, 1, 1, 1);
weight - weights tensor of the convolution layer, of shape (C_{out}, C_{in}, size_d, size_h, size_w);
\star - cross-correlation operator.
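To make the formula concrete, here is a naive NumPy sketch of this cross-correlation (a loop-based reference for the default case of stride 1, no padding, no dilation, groups=1; the function name conv3dReference is ours, not part of the library):

```python
import numpy as np

def conv3dReference(data, weight, bias):
    # data: (N, Cin, Din, Hin, Win), weight: (Cout, Cin, sd, sh, sw), bias: (1, Cout, 1, 1, 1)
    N, Cin, Din, Hin, Win = data.shape
    Cout, _, sd, sh, sw = weight.shape

    out = np.zeros((N, Cout, Din - sd + 1, Hin - sh + 1, Win - sw + 1), dtype=data.dtype)

    for i in range(N):                      # i-th element of the batch
        for j in range(Cout):               # j-th output map
            for z in range(out.shape[2]):
                for y in range(out.shape[3]):
                    for x in range(out.shape[4]):
                        window = data[i, :, z:z + sd, y:y + sh, x:x + sw]
                        out[i, j, z, y, x] = np.sum(window * weight[j]) + bias[0, j, 0, 0, 0]

    return out
```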
Initializing¶
```python
def __init__(self, inmaps, outmaps, size, stride=1, pad=0, dilation=1, wscale=1.0, useBias=True, name=None,
             initscheme=None, empty=False, groups=1):
```
Parameters
Parameter | Allowed types | Description | Default |
---|---|---|---|
inmaps | int | Number of maps in the input tensor | - |
outmaps | int | Number of maps in the output tensor | - |
size | Union[int, tuple] | Convolution kernel size | - |
stride | Union[int, tuple] | Convolution stride | 1 |
pad | Union[int, tuple] | Map padding | 0 |
dilation | Union[int, tuple] | Convolution window dilation | 1 |
wscale | float | Random layer weights variance | 1.0 |
useBias | bool | Whether to use the bias vector | True |
initscheme | Union[tuple, str] | Specifies the layer weights initialization scheme (see createTensorWithScheme) | None -> ("xavier_uniform", "in") |
name | str | Layer name | None |
empty | bool | If True, the weights and biases are not initialized | False |
groups | int | Number of groups the maps are split into for separate processing | 1 |
Explanations
Info
For the above input (N, C_{in}, D_{in}, H_{in}, W_{in}) and output (N, C_{out}, D_{out}, H_{out}, W_{out}) tensors, the following relations hold between their shapes:
\begin{equation} D_{out} = \left\lfloor \frac{D_{in} + 2pad_d - dil_d(size_d - 1) - 1}{stride_d} \right\rfloor + 1 \end{equation}
\begin{equation} H_{out} = \left\lfloor \frac{H_{in} + 2pad_h - dil_h(size_h - 1) - 1}{stride_h} \right\rfloor + 1 \end{equation}
\begin{equation} W_{out} = \left\lfloor \frac{W_{in} + 2pad_w - dil_w(size_w - 1) - 1}{stride_w} \right\rfloor + 1 \end{equation}
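The relation is easy to evaluate directly. Below is a minimal sketch in plain Python (the helper name convOutSize is ours, not part of the library) that computes one output dimension:

```python
def convOutSize(insize, size, stride=1, pad=0, dilation=1):
    # floor((insize + 2 * pad - dilation * (size - 1) - 1) / stride) + 1
    return (insize + 2 * pad - dilation * (size - 1) - 1) // stride + 1

# e.g. a 6-deep map convolved with a kernel of size 2, stride 1, no padding:
print(convOutSize(6, size=2))  # 5
```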
size - possible to specify either a single kernel size for all the map axes, or a tuple (size_d, size_h, size_w), where size_d is the kernel size along the depth of the map, size_h - along the height, and size_w - along the width;
stride - possible to specify either a single value of the convolution stride along all the map axes, or a tuple (stride_d, stride_h, stride_w), where stride_d is the value of the convolution stride along the depth of the map, stride_h - along the height, and stride_w - along the width;
pad - possible to specify either a single padding value for all sides of the maps, or a tuple (pad_d, pad_h, pad_w), where pad_d is the padding value on each side along the depth of the map, pad_h - along the height, and pad_w - along the width. Asymmetric padding (filling with additional elements on only one side of the tensor) is not provided for by this module;
dilation - possible to specify either a single dilation value for all axes of the convolution kernel, or a tuple (dil_d, dil_h, dil_w), where dil_d is the filter dilation along the depth of the map, dil_h - along the height, and dil_w - along the width;
groups - number of groups into which the set of maps is split in order to be convolved separately.
The general rule is (one should bear in mind that the values of the inmaps and outmaps parameters must be divisible by the value of the groups parameter): for every \frac{inmaps}{groups} input maps, \frac{outmaps}{groups} output maps are formed. That is, we can say that we perform groups independent convolutions. Special cases:

- if groups=1, then each output map interacts with all input maps, that is, a regular convolution occurs;
- if inmaps == outmaps == groups, then a depthwise convolution occurs: one output map is formed from each input map (please see details in ConvND theory).

Thus, to obtain a full Depthwise Separable Convolution block, it is necessary to place two library convolution layers in sequence (a sketch follows this list):

- one depthwise convolution with parameters inmaps == outmaps == groups;
- one pointwise convolution with a kernel of size 1.
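As an illustration, here is a minimal sketch of such a block (the layer sizes are chosen arbitrarily; only Conv3D and the imports used in the examples below are assumed):

```python
import numpy as np

from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import Conv3D

maps = 4
data = gpuarray.to_gpu(np.random.randn(1, maps, 6, 6, 6).astype(np.float32))

# depthwise: each input map is convolved with its own filter
depthwise = Conv3D(inmaps=maps, outmaps=maps, size=3, pad=1, groups=maps)
# pointwise: a kernel of size 1 mixes the maps
pointwise = Conv3D(inmaps=maps, outmaps=8, size=1)

out = pointwise(depthwise(data))
print(out.shape)  # (1, 8, 6, 6, 6)
```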
Examples¶
Basic convolution example¶
Necessary imports.
```python
import numpy as np

from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import Conv3D
from PuzzleLib.Variable import Variable
```
Info
gpuarray is required to properly place the tensor in GPU memory.
For this module, the examples are given in simplified form. For a more detailed presentation, please see the Conv2D examples.
```python
batchsize, inmaps, d, h, w = 1, 1, 6, 6, 6
outsize = 2

data = gpuarray.to_gpu(np.random.randn(batchsize, inmaps, d, h, w).astype(np.float32))
```
Let us initialize the module with standard parameters (stride=1, pad=0, dilation=1, groups=1):
```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=2)
print(conv(data))
```
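Per the shape relations above, each spatial dimension of the output is (6 - (2 - 1) - 1) / 1 + 1 = 5, so the result has shape (1, 2, 5, 5, 5).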
Let us leave all parameters the same except size:
```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=4)
print(conv(data))
```
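Here each spatial dimension shrinks to (6 - (4 - 1) - 1) / 1 + 1 = 3, giving an output of shape (1, 2, 3, 3, 3).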
The size parameter can be set separately for each axis of the map:
```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=(2, 4, 2))
print(conv(data))
```
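With size=(2, 4, 2), the depth and width become 6 - 2 + 1 = 5 while the height becomes 6 - 4 + 1 = 3, i.e. the output shape is (1, 2, 5, 3, 5).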
The stride and pad parameters can also be set separately for each axis of the map:
```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=4, stride=(1, 4, 4), pad=(0, 1, 1))
print(conv(data))
```
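Applying the relations once more: the depth is (6 - 3 - 1) / 1 + 1 = 3, while the height and width are (6 + 2 - 3 - 1) // 4 + 1 = 2, so the output shape is (1, 2, 3, 2, 2).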
As mentioned earlier, if a parameter has different values for different axes, then all three values must be passed explicitly. The following example will throw an error:
```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=2, stride=(1, 3))
print(conv(data))
```
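The stride tuple here has only two elements, while a 3D convolution expects one value per spatial axis. A working variant passes the full triple:

```python
conv = Conv3D(inmaps=inmaps, outmaps=outsize, size=2, stride=(1, 3, 3))
print(conv(data))
```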