GroupLinear

Description

Info

Parent class: Module

Derived classes: -

This module is a group modification of a fully connected Linear layer: while a regular fully connected layer takes a vector and returns a vector, a group linear layer can return several vectors, each obtained using an independent set of weights.

Dimension

While for an ordinary fully connected layer the input shape is (N, L_{in}) and the output shape is (N, L_{out}), for the group modification the input shape is (N, G, L_{in}) and the output shape is (N, G, L_{out}) (or (G, N, L_{in}) and (G, N, L_{out}) respectively, depending on the batchDim parameter), where N is the batch size, G is the number of groups, L_{in} is the size of the input feature vector and L_{out} is the size of the output feature vector.
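The per-group computation can be sketched in plain NumPy (a sketch of the semantics, independent of the PuzzleLib implementation): each group g owns its own weight matrix W[g] of shape (L_{in}, L_{out}), and the batch of group-g input vectors is multiplied only by W[g]:

```python
import numpy as np

N, G, Lin, Lout = 2, 3, 4, 5  # batch size, groups, input size, output size

x = np.random.rand(N, G, Lin).astype(np.float32)
W = np.random.rand(G, Lin, Lout).astype(np.float32)  # one weight matrix per group
b = np.zeros((G, Lout), dtype=np.float32)            # one bias vector per group

# For each group g: out[:, g, :] = x[:, g, :] @ W[g] + b[g]
out = np.einsum("ngi,gio->ngo", x, W) + b

assert out.shape == (N, G, Lout)
```

Setting G = 1 recovers the usual (N, L_{in}) @ (L_{in}, L_{out}) matrix product, up to the extra group axis.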

Initializing

def __init__(self, groups, insize, outsize, wscale=1.0, useW=True, useBias=True, initscheme=None,
                 inmode="full", wmode="full", batchDim=0, name=None, empty=False, transpW=False):

Parameters

| Parameter | Allowed types | Description | Default |
|---|---|---|---|
| groups | int | Number of groups | - |
| insize | int | Input vector size | - |
| outsize | int | Output vector size | - |
| wscale | float | Variance of random layer weights | 1.0 |
| useW | bool | Whether to use weights | True |
| useBias | bool | Whether to use biases | True |
| initscheme | Union[tuple, str] | Specifies the initialization scheme of the layer weights (see createTensorWithScheme) | None -> ("xavier_uniform", "in") |
| inmode | str | Input mode. Possible values: full and one | full |
| wmode | str | Weights mode. Possible values: full and one | full |
| batchDim | int | Batch axis position | 0 |
| name | str | Layer name | None |
| empty | bool | Whether to skip initialization of the weights and biases matrices | False |
| transpW | bool | Whether to use a transposed matrix of weights | False |

Explanations

groups - parameter that controls the connections between inputs and outputs; when groups = 1, the layer reduces to a regular fully connected layer;


inmode - if one, a single input vector produces groups output vectors, each computed with its own independent weights, i.e. (N, 1, L_{in}) \to (N, G, L_{out}); if full, the module works in the normal mode: (N, G, L_{in}) \to (N, G, L_{out});


wmode - if one, groups input vectors form groups output vectors using a single shared weight matrix; if full, the module works in the normal mode.


batchDim - by default, the batch size axis comes first: (N, G, L_{in}); however, it is possible to swap it with the group axis by setting batchDim=1: (G, N, L_{in}).
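In NumPy terms, the two non-default modes amount to broadcasting along the group axis (a sketch of the semantics, not the PuzzleLib code itself): inmode="one" replicates a single input vector across G groups, while wmode="one" shares a single weight matrix across G groups:

```python
import numpy as np

N, G, Lin, Lout = 1, 2, 3, 4

# inmode="one": one input vector, G independent weight matrices
x_one = np.random.rand(N, 1, Lin).astype(np.float32)
W_full = np.random.rand(G, Lin, Lout).astype(np.float32)
out_in = np.einsum("ngi,gio->ngo", np.broadcast_to(x_one, (N, G, Lin)), W_full)
assert out_in.shape == (N, G, Lout)   # (N, 1, Lin) -> (N, G, Lout)

# wmode="one": G input vectors, one shared weight matrix
x_full = np.random.rand(N, G, Lin).astype(np.float32)
W_one = np.random.rand(1, Lin, Lout).astype(np.float32)
out_w = np.einsum("ngi,gio->ngo", x_full, np.broadcast_to(W_one, (G, Lin, Lout)))
assert out_w.shape == (N, G, Lout)
```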

Examples


Basic example


Necessary imports.

>>> import numpy as np
>>> from PuzzleLib.Backend import gpuarray
>>> from PuzzleLib.Modules import GroupLinear

Info

gpuarray is required to properly place the tensor on the GPU.

>>> batchsize, groups, insize = 1, 2, 3
>>> data = gpuarray.to_gpu(np.random.randint(0, 9, (batchsize, groups, insize)).astype(np.float32))
>>> print(data)
[[[5. 4. 4.]
  [4. 4. 3.]]]
>>> print(data.shape)
(1, 2, 3)

Let us initialize the module with default parameters (useW=True, useBias=True, inmode="full", wmode="full", batchDim=0) and fill the weights tensor with custom values to make the demonstration of the module operation clearer:

>>> outsize = 4
>>> grpLinear = GroupLinear(groups, insize, outsize)
>>> print(grpLinear.W.shape)
(2, 3, 4)
>>> grpLinear.W[0].fill(1)
>>> grpLinear.W[1].fill(-1)
>>> grpLinear(data)
[[[ 13.  13.  13.  13.]
  [-11. -11. -11. -11.]]]
>>> print(grpLinear.data.shape)
(1, 2, 4)
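The printed result can be verified by hand: with the first group's weights all equal to 1 and the second's all equal to -1, every output element is the plain or negated sum of the corresponding input vector. A small NumPy check, independent of PuzzleLib:

```python
import numpy as np

data = np.array([[[5., 4., 4.],
                  [4., 4., 3.]]], dtype=np.float32)           # (1, 2, 3)
W = np.stack([np.ones((3, 4), dtype=np.float32),
              -np.ones((3, 4), dtype=np.float32)])            # (2, 3, 4)

out = np.einsum("ngi,gio->ngo", data, W)
# group 0: 5 + 4 + 4 = 13, group 1: -(4 + 4 + 3) = -11
assert np.allclose(out, [[[13.] * 4, [-11.] * 4]])
```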

wmode parameter


Let us change the wmode parameter:

>>> grpLinear = GroupLinear(groups, insize, outsize, wmode="one")
>>> print(grpLinear.W.shape)
(1, 3, 4)
>>> grpLinear.W.fill(1)
>>> grpLinear(data)
[[[13. 13. 13. 13.]
  [11. 11. 11. 11.]]]
>>> print(grpLinear.data.shape)
(1, 2, 4)


inmode parameter


Let us change the inmode parameter and initialize new data (with the corresponding shape) for this example:

>>> data = gpuarray.to_gpu(np.random.randint(0, 9, (batchsize, 1, insize)).astype(np.float32))
>>> print(data)
[[[5. 1. 5.]]]
>>> print(data.shape)
(1, 1, 3)

Let us again fill the weights tensor with custom values:

>>> grpLinear = GroupLinear(groups, insize, outsize, inmode="one")
>>> print(grpLinear.W.shape)
(2, 3, 4)
>>> grpLinear.W[0].fill(1)
>>> grpLinear.W[1].fill(-1)
>>> grpLinear(data)
[[[ 11.  11.  11.  11.]
  [-11. -11. -11. -11.]]]
>>> print(grpLinear.data.shape)
(1, 2, 4)

batchDim parameter


>>> data = gpuarray.to_gpu(np.random.randint(0, 9, (groups, batchsize, insize)).astype(np.float32))
>>> print(data.shape)
(2, 1, 3)
>>> grpLinear = GroupLinear(groups, insize, outsize, batchDim=1)
>>> grpLinear(data)
>>> print(grpLinear.data.shape)
(2, 1, 4)
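With batchDim=1 the group axis leads and the batch axis comes second, while the contraction over the feature axis is unchanged. A hedged NumPy sketch of this layout (not the PuzzleLib code itself):

```python
import numpy as np

G, N, Lin, Lout = 2, 1, 3, 4

x = np.random.rand(G, N, Lin).astype(np.float32)     # (G, N, Lin): group axis first
W = np.random.rand(G, Lin, Lout).astype(np.float32)  # one weight matrix per group

# out[g, n, :] = x[g, n, :] @ W[g]
out = np.einsum("gni,gio->gno", x, W)
assert out.shape == (G, N, Lout)                     # (2, 1, 4)
```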