SpatialTf

Description

Info

Parent class: Module

Derived classes: -

This module applies a spatial transformation to a tensor using the specified operator. The coordinates of each element of the original tensor are multiplied by the transformation operator, which yields the coordinates of the corresponding element in the output tensor:

\begin{bmatrix} \grave{x} \\ \grave{y} \end{bmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} & \theta_{13} \\ \theta_{21} & \theta_{22} & \theta_{23} \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}

where

x, y - coordinates of the element in the original tensor;
\theta_{ij} - coefficients of the operator;
\grave{x}, \grave{y} - coordinates of the element in the output tensor.
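
The formula above can be illustrated with a minimal NumPy sketch. It is not part of the PuzzleLib API: the helper name `apply_affine` is hypothetical, and the coordinates are treated as plain numbers, since the coordinate convention the module uses internally (pixel indices or normalized values) is not stated here.

```python
import numpy as np

def apply_affine(theta, xs, ys):
    """Apply a 2x3 affine operator to arrays of x and y coordinates."""
    ones = np.ones_like(xs)
    coords = np.stack([xs, ys, ones])   # homogeneous coordinates, shape (3, N)
    out = theta @ coords                # transformed coordinates, shape (2, N)
    return out[0], out[1]

# Identity operator with a shift of +0.5 along x and -0.25 along y
theta = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, -0.25]], dtype=np.float32)

xs = np.array([0.0, 1.0, 2.0], dtype=np.float32)
ys = np.array([0.0, 0.0, 1.0], dtype=np.float32)

new_xs, new_ys = apply_affine(theta, xs, ys)
# new_xs is [0.5, 1.5, 2.5], new_ys is [-0.25, -0.25, 0.75]
```

The third column of the operator acts as a translation, while the left 2x2 block controls scaling, rotation and shear.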

Because the transformed coordinates generally do not fall on integer grid positions, the value of each transformed element is computed using bilinear interpolation.
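
As a rough illustration of bilinear interpolation, the sketch below samples a 2D array at a fractional position by blending its four nearest neighbours. The function `bilinear_sample` is hypothetical, and treating positions outside the array as zero is an assumption made for this example, not a statement about how the module handles borders.

```python
import numpy as np

def bilinear_sample(img, x, y):
    """Sample a 2D array at fractional pixel coordinates (x, y)."""
    h, w = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = x0 + 1, y0 + 1
    wx, wy = x - x0, y - y0   # fractional offsets inside the cell

    def pixel(row, col):
        # Assumption: positions outside the array contribute zero
        if 0 <= row < h and 0 <= col < w:
            return float(img[row, col])
        return 0.0

    # Interpolate along x on the two neighbouring rows, then along y
    top = (1 - wx) * pixel(y0, x0) + wx * pixel(y0, x1)
    bottom = (1 - wx) * pixel(y1, x0) + wx * pixel(y1, x1)
    return (1 - wy) * top + wy * bottom

img = np.arange(16, dtype=np.float32).reshape(4, 4)
value = bilinear_sample(img, 1.5, 0.5)
# value is 3.5: the average of img[0, 1], img[0, 2], img[1, 1], img[1, 2]
```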

The coefficients of the operator are produced by a small auxiliary neural network whose output layer has 6 elements, one per coefficient. This network is trained jointly with the main one.
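
To make the role of the auxiliary network more concrete, here is a minimal NumPy sketch of a regressor with 6 outputs that are reshaped into one 2x3 operator per sample. The class `TinyLocalizationNet` is hypothetical and is not a PuzzleLib component; in practice such a network would be assembled from the library's own layers. Starting from zero weights and an identity bias is a common choice assumed here, so that the initial operator leaves the tensor unchanged.

```python
import numpy as np

class TinyLocalizationNet:
    """Hypothetical one-layer regressor that outputs 6 affine coefficients per sample."""

    def __init__(self, in_features):
        # Assumption: zero weights and an identity bias, so the initial
        # operator is the identity transformation for any input
        self.W = np.zeros((in_features, 6), dtype=np.float32)
        self.b = np.array([1, 0, 0, 0, 1, 0], dtype=np.float32)

    def __call__(self, data):
        batchsize = data.shape[0]
        flat = data.reshape(batchsize, -1)   # flatten maps and spatial dims
        theta = flat @ self.W + self.b       # shape (batchsize, 6)
        return theta.reshape(batchsize, 2, 3)

data = np.random.randn(2, 1, 4, 4).astype(np.float32)
net = TinyLocalizationNet(in_features=1 * 4 * 4)
theta = net(data)
# theta has shape (2, 2, 3); each operator is [[1, 0, 0], [0, 1, 0]]
```

During training, the gradients flowing back through the transformation would update `W` and `b` together with the weights of the main network.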

Additional sources

Initializing

def __init__(self, shape=None, name=None):

Parameters

| Parameter | Allowed types | Description | Default |
|-----------|---------------|-------------|---------|
| shape | tuple | Shape of the output tensor. If None, the shape of the input tensor is preserved | None |
| name | str | Layer name | None |

Explanations

-

Examples

Necessary imports.

import numpy as np
from PuzzleLib.Backend import gpuarray
from PuzzleLib.Modules import SpatialTf

Let us generate the data tensor. Please pay attention to the format: the dimensions are the batch size, the number of maps, the height and the width.

Info

gpuarray is required to place the tensor on the GPU properly

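# Tensor dimensions: batch size, number of maps, height, width
batchsize, maps, inh, inw = 1, 1, 4, 4
data = gpuarray.to_gpu(np.random.randn(batchsize, maps, inh, inw).astype(np.float32))
# One 2x3 operator per batch element: identity with a small shift
transform = gpuarray.to_gpu(np.tile(np.array([[1.0, 0.0, 0.001], [0.0, 1.0, 0.001]], dtype=np.float32), reps=(batchsize, 1, 1)))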
spatialtf = SpatialTf()
spatialtf([data, transform])
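
The `np.tile` call in the example builds one operator per batch element. The CPU-only sketch below, which uses NumPy alone and an illustrative batch size of 3, shows the resulting layout; it does not call the module itself.

```python
import numpy as np

batchsize = 3   # illustrative value, larger than in the example above

# A single 2x3 operator: identity with a small shift along both axes
theta = np.array([[1.0, 0.0, 0.001],
                  [0.0, 1.0, 0.001]], dtype=np.float32)

# Repeat the operator along a new leading batch axis
transform = np.tile(theta, reps=(batchsize, 1, 1))

print(transform.shape)   # (3, 2, 3)
```

Each slice `transform[i]` is the operator applied to the i-th element of the batch, so different samples can in principle receive different transformations.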