SMORMS3

Description

Info

Parent class: Optimizer

Derived classes: -

SMORMS3 (squared mean over root mean squared, cubed) is one of the optimization algorithms first described in this source. It is a hybrid of RMSProp and LeCun's method.

Let us introduce:

r = \frac{1}{mem + 1}

Whereafter:

\begin{equation} m_t = r{m_{t-1}} + (1 - r)g_t \end{equation}

\begin{equation} \upsilon_t = r{\upsilon_{t-1}} + (1 - r)g_t^2 \end{equation}

Then the parameter update process would be:

\begin{equation} \theta_{t+1} = \theta_t - \frac{\min(\alpha, \frac{m_t^2}{\upsilon_t + \epsilon})}{\sqrt{\upsilon_t} + \epsilon} g_t \end{equation}

\begin{equation} mem_{t+1} = 1 + mem_t(1 - \frac{m_t^2}{\upsilon_t + \epsilon}) \end{equation}
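The formulas above can be sketched as a single update step in NumPy. This is an illustrative sketch, not the library's implementation; the function name `smorms3_step` and the choice of per-parameter state arrays are assumptions made for the example:

```python
import numpy as np

def smorms3_step(theta, grad, m, v, mem, learnRate=1e-3, epsilon=1e-16):
    # Blending coefficient r = 1 / (mem + 1), driven by the per-parameter memory.
    r = 1.0 / (mem + 1.0)

    # Leaky averages of the gradient and of its square.
    m = r * m + (1.0 - r) * grad
    v = r * v + (1.0 - r) * grad ** 2

    # m^2 / (v + eps): a signal-to-noise estimate of the gradient.
    snr = m ** 2 / (v + epsilon)

    # Parameter update: the effective step is capped by the learning rate alpha.
    theta = theta - grad * np.minimum(learnRate, snr) / (np.sqrt(v) + epsilon)

    # Memory grows while gradients look noisy and shrinks when they are consistent.
    mem = 1.0 + mem * (1.0 - snr)
    return theta, m, v, mem
```

Running this step repeatedly on a simple quadratic loss drives the parameter toward the minimum while the step size stays bounded by `learnRate`.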

Initializing

def __init__(self, learnRate=1e-3, epsilon=1e-16, nodeinfo=None):

Parameters

Parameter Allowed types Description Default
learnRate float Learning rate 1e-3
epsilon float Smoothing parameter 1e-16
nodeinfo NodeInfo Object containing information about the computational node None

Explanations

-

Examples


Necessary imports:

>>> import numpy as np
>>> from PuzzleLib.Optimizers import SMORMS3
>>> from PuzzleLib.Backend import gpuarray

Info

gpuarray is required to properly place the tensor in the GPU.

Let us set up a synthetic training dataset:

>>> data = gpuarray.to_gpu(np.random.randn(16, 128).astype(np.float32))
>>> target = gpuarray.to_gpu(np.random.randn(16, 1).astype(np.float32))

Declaring the optimizer:

>>> optimizer = SMORMS3(learnRate=0.001)

Suppose a network net has already been defined, for example, through Graph. Then, to install the optimizer on the network, the following is required:

>>> optimizer.setupOn(net, useGlobalState=True)

Info

You can read more about optimizer methods and their parameters in the description of the Optimizer parent class

Further, suppose there is a loss function, inherited from Cost, that also computes the gradient of the error. The optimization loop then looks as follows:

>>> for i in range(100):
...   predictions = net(data)
...   error, grad = loss(predictions, target)
...
...   optimizer.zeroGradParams()
...   net.backward(grad)
...   optimizer.update()
...
...   if (i + 1) % 5 == 0:
...     print("Iteration #%d error: %s" % (i + 1, error))