public class AdaDelta extends Object implements GradientUpdater
AdaDelta is inspired by AdaGrad and was developed for use primarily
in neural networks. It still maintains a per-feature learning rate; however,
unlike AdaGrad, the learning rates may increase over time and are highly
robust to any individual learning rate.
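A runnable sketch of how this updater is typically driven, shown before the API tables below. The import paths and the DenseVector class are assumptions about the surrounding JSAT library (neither is stated on this page), and the quadratic loss is purely illustrative:

```java
// A minimal usage sketch, not taken from this page: package paths and
// DenseVector are assumptions about the surrounding JSAT library.
import jsat.linear.DenseVector;
import jsat.linear.Vec;
import jsat.math.optimization.stochastic.AdaDelta;

public class AdaDeltaExample
{
    public static void main(String[] args)
    {
        Vec w = new DenseVector(3);           // weights start at zero
        AdaDelta updater = new AdaDelta(0.95);
        updater.setup(w.length());            // dimension must match w

        // Minimize f(w) = sum_i (w_i - c_i)^2 for c = (0.5, 1.0, 1.5),
        // whose gradient component i is 2 * (w_i - c_i).
        for (int iter = 0; iter < 500; iter++)
        {
            Vec grad = new DenseVector(new double[] {
                2 * (w.get(0) - 0.5),
                2 * (w.get(1) - 1.0),
                2 * (w.get(2) - 1.5)
            });
            updater.update(w, grad, 1.0);     // mutates w in place
        }
        System.out.println(w);                // should be near (0.5, 1.0, 1.5)
    }
}
```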
| Constructor | Description |
| --- | --- |
| `AdaDelta()` | Creates a new AdaDelta updater using a decay rate of 0.95 |
| `AdaDelta(AdaDelta toCopy)` | Copy constructor |
| `AdaDelta(double rho)` | Creates a new AdaDelta updater |
| Modifier and Type | Method and Description |
| --- | --- |
| `AdaDelta` | `clone()` |
| `double` | `getRho()` |
| `void` | `setRho(double rho)` Sets the decay rate used by AdaDelta. |
| `void` | `setup(int d)` Sets up this updater to update a weight vector of dimension `d` by a gradient of the same dimension. |
| `void` | `update(Vec x, Vec grad, double eta)` Updates the weight vector `x` such that `x = x - η·f(grad)`, where `f(grad)` is some function on the gradient that effectively returns a new vector. |
| `double` | `update(Vec x, Vec grad, double eta, double bias, double biasGrad)` Updates the weight vector `x` such that `x = x - η·f(grad)`, where `f(grad)` is some function on the gradient that effectively returns a new vector. |
public AdaDelta()

Creates a new AdaDelta updater using a decay rate of 0.95.

public AdaDelta(double rho)

Creates a new AdaDelta updater.

Parameters:
rho - the decay rate to use

public AdaDelta(AdaDelta toCopy)

Copy constructor.

Parameters:
toCopy - the object to copy

public void setRho(double rho)

Sets the decay rate used by AdaDelta.

Parameters:
rho - the decay rate in (0, 1) to use

public double getRho()

Returns the decay rate in use.
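For orientation (not stated on this page): in Zeiler's original ADADELTA formulation, the decay rate ρ governs the exponential moving averages of squared gradients and squared updates that stand in for a global learning rate, roughly:

```latex
E[g^2]_t = \rho\, E[g^2]_{t-1} + (1-\rho)\, g_t^2,
\qquad
\Delta x_t = -\frac{\sqrt{E[\Delta x^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}}\; g_t
```

A ρ near 1 averages over a long window of past gradients; smaller values adapt faster but are noisier.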
public void update(Vec x, Vec grad, double eta)

Description copied from interface: GradientUpdater

Updates the weight vector x such that x = x - η·f(grad), where f(grad) is
some function on the gradient that effectively returns a new vector. It is
not necessary for the internal implementation to ever explicitly form any of
these objects, so long as x is mutated to have the correct result.

Specified by:
update in interface GradientUpdater

Parameters:
x - the vector to mutate such that it has been updated by the gradient
grad - the gradient to update the weight vector x from
eta - the learning rate to apply

public double update(Vec x, Vec grad, double eta, double bias, double biasGrad)

Description copied from interface: GradientUpdater

Updates the weight vector x such that x = x - η·f(grad), where f(grad) is
some function on the gradient that effectively returns a new vector. It is
not necessary for the internal implementation to ever explicitly form any of
these objects, so long as x is mutated to have the correct result.

Specified by:
update in interface GradientUpdater

Parameters:
x - the vector to mutate such that it has been updated by the gradient
grad - the gradient to update the weight vector x from
eta - the learning rate to apply
bias - the bias term of the vector
biasGrad - the gradient for the bias term

Returns:
the value to subtract from the bias term, such that bias = bias - returnValue
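Unlike the weight vector, the bias term is a primitive double and cannot be mutated through the parameter, so this overload returns the step and the caller applies it. A short sketch of the calling pattern (placeholder values; same DenseVector assumption as the earlier example):

```java
// Sketch of the bias-aware overload; gradient values are placeholders.
Vec w = new DenseVector(3);
double bias = 0.0;
AdaDelta updater = new AdaDelta(0.95);
updater.setup(w.length());

Vec grad = new DenseVector(new double[] { 0.1, -0.2, 0.3 }); // placeholder gradient
double biasGrad = 0.05;                                      // placeholder bias gradient

// update() mutates w in place and returns the step for the bias term;
// the caller applies it, per the contract bias = bias - returnValue.
double step = updater.update(w, grad, 1.0, bias, biasGrad);
bias = bias - step;
```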
public AdaDelta clone()

Specified by:
clone in interface GradientUpdater

Overrides:
clone in class Object
public void setup(int d)

Description copied from interface: GradientUpdater

Sets up this updater to update a weight vector of dimension d by a gradient
of the same dimension.

Specified by:
setup in interface GradientUpdater

Parameters:
d - the dimension of the weight vector that will be updated