SpatialTf¶

Description¶

Info

Parent class: Module

Derived classes: -

This module applies a spatial (affine) transformation to the input tensor using the specified operator. The coordinates of each element of the original tensor, extended with a trailing 1, are multiplied by the transformation operator, which gives the coordinates of the corresponding element in the output tensor:

$\begin{bmatrix} \grave{x} \\ \grave{y} \end{bmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$

where

$x, y$ - coordinates of the element in the original tensor;
$\theta_{ij}$ - coefficients of the operator;
$\grave{x}, \grave{y}$ - coordinates of the element in the output tensor.
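
The formula above can be checked by hand with a minimal sketch in plain Python. The helper name `transform_point` is hypothetical and is not part of PuzzleLib. The coordinate convention (pixel indices or normalized values) is not specified here, so the sketch treats coordinates as plain numbers.

```python
def transform_point(theta, x, y):
    """Apply a 2x3 affine operator to the point (x, y).

    theta is a nested list [[t11, t12, t13], [t21, t22, t23]];
    the point is extended to (x, y, 1) before the multiplication.
    """
    (t11, t12, t13), (t21, t22, t23) = theta
    x_out = t11 * x + t12 * y + t13
    y_out = t21 * x + t22 * y + t23
    return x_out, y_out


# Identity operator: coordinates are unchanged.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(transform_point(identity, 0.5, -0.25))

# Scale by 2 along x and shift by 0.5 along y.
scale_shift = [[2.0, 0.0, 0.0], [0.0, 1.0, 0.5]]
print(transform_point(scale_shift, 0.5, -0.25))
```

The third column of the operator acts as a translation, while the left 2x2 block handles scaling, rotation and shear.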

The value of the transferred element is found using bilinear interpolation.
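
As an illustration of bilinear interpolation, here is a minimal CPU sketch in plain Python. It is not the module's implementation: the function name `bilinear_sample` is hypothetical, and clamping out-of-range positions to the grid border is an assumption of the sketch, not documented module behaviour.

```python
import math


def bilinear_sample(image, x, y):
    """Sample a 2D grid (list of rows) at the fractional position (x, y).

    x indexes columns, y indexes rows. Positions outside the grid are
    clamped to its border (an assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)

    # Indices of the four neighbouring grid points.
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    # Fractional offsets inside the cell.
    fx, fy = x - x0, y - y0

    # Interpolate along x on both rows, then along y between the rows.
    top = (1.0 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1.0 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1.0 - fy) * top + fy * bottom


grid = [[0.0, 10.0],
        [20.0, 30.0]]
print(bilinear_sample(grid, 0.5, 0.5))  # centre of the four values
```

Each sampled value is a weighted average of the four nearest elements, so the result varies smoothly with the coordinates. This is what allows gradients to flow back to the operator coefficients during training.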

The coefficients of the operator are produced by a small auxiliary neural network whose output layer has 6 elements, one per coefficient. This network is trained jointly with the main one.
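
The following plain-Python sketch shows how the 6 outputs of such a network could be arranged into the 2x3 operator. All names (`linear_layer`, `to_operator`, `n_features`) are hypothetical and do not come from PuzzleLib. Initializing the last layer with zero weights and an identity bias is a common practice for spatial transformers and is an assumption here, not something this page specifies.

```python
def linear_layer(features, weights, bias):
    """Fully connected layer: out[i] = sum_j weights[i][j] * features[j] + bias[i]."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def to_operator(params):
    """Arrange 6 outputs as the 2x3 operator [[t11, t12, t13], [t21, t22, t23]]."""
    assert len(params) == 6
    return [list(params[:3]), list(params[3:])]


n_features = 4  # hypothetical size of the feature vector fed to the output layer

# Zero weights plus an identity bias: before any training step the layer
# outputs the identity transform regardless of its input (assumed init).
weights = [[0.0] * n_features for _ in range(6)]
bias = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]

theta = to_operator(linear_layer([0.3, -1.2, 0.7, 2.0], weights, bias))
print(theta)
```

Starting from the identity means the main network initially sees its input unchanged, and the transformation is learned gradually from there.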

Additional sources¶

Arxiv
Link

Initializing¶

def __init__(self, shape=None, name=None):

Parameters

Parameter	Allowed types	Description	Default
shape	tuple	Shape of the output tensor. If `None`, the shape of the input tensor is kept	None
name	str	Layer name	None

Explanations

-

Examples¶

Necessary imports.

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import SpatialTf

Let us generate the data tensor. Pay attention to its layout: (batchsize, maps, h, w).

Info

gpuarray is required to place the tensor on the GPU

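batchsize, maps, h, w = 1, 1, 4, 4
# Input data in (batchsize, maps, h, w) layout
data = gpuarray.to_gpu(np.random.randn(batchsize, maps, h, w).astype(np.float32))
# One 2x3 operator per batch element: identity plus a shift of 0.001 along each axis
transform = gpuarray.to_gpu(np.tile(np.array([[1.0, 0.0, 0.001], [0, 1.0, 0.001]], dtype=np.float32), reps=(batchsize, 1, 1)))
spatialtf = SpatialTf()
# The module takes the data and the operator as a list of two inputs
spatialtf([data, transform])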