public class AdaDelta extends Object implements GradientUpdater
AdaDelta is inspired by AdaGrad and was developed for use primarily
in neural networks. It still maintains a per-feature learning rate; however,
unlike AdaGrad, the learning rates may increase over time and are highly
robust to any individual learning rate.
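A runnable sketch of how this updater is typically driven, shown before the API tables below. The import paths and the DenseVector class are assumptions about the surrounding JSAT library (neither is stated on this page), and the quadratic loss is purely illustrative:

```java
// A minimal usage sketch, not taken from this page: package paths and
// DenseVector are assumptions about the surrounding JSAT library.
import jsat.linear.DenseVector;
import jsat.linear.Vec;
import jsat.math.optimization.stochastic.AdaDelta;

public class AdaDeltaExample
{
    public static void main(String[] args)
    {
        Vec w = new DenseVector(3);           // weights start at zero
        AdaDelta updater = new AdaDelta(0.95);
        updater.setup(w.length());            // dimension must match w

        // Minimize f(w) = sum_i (w_i - c_i)^2 for c = (0.5, 1.0, 1.5),
        // whose gradient component i is 2 * (w_i - c_i).
        for (int iter = 0; iter < 500; iter++)
        {
            Vec grad = new DenseVector(new double[] {
                2 * (w.get(0) - 0.5),
                2 * (w.get(1) - 1.0),
                2 * (w.get(2) - 1.5)
            });
            updater.update(w, grad, 1.0);     // mutates w in place
        }
        System.out.println(w);                // should be near (0.5, 1.0, 1.5)
    }
}
```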
| Constructor | Description |
| --- | --- |
| `AdaDelta()` | Creates a new AdaDelta updater using a decay rate of 0.95 |
| `AdaDelta(AdaDelta toCopy)` | Copy constructor |
| `AdaDelta(double rho)` | Creates a new AdaDelta updater |
| Modifier and Type | Method and Description |
| --- | --- |
| `AdaDelta` | `clone()` |
| `double` | `getRho()` |
| `void` | `setRho(double rho)` Sets the decay rate used by AdaDelta. |
| `void` | `setup(int d)` Sets up this updater to update a weight vector of dimension `d` by a gradient of the same dimension. |
| `void` | `update(Vec x, Vec grad, double eta)` Updates the weight vector `x` such that `x = x - η·f(grad)`, where `f(grad)` is some function on the gradient that effectively returns a new vector. |
| `double` | `update(Vec x, Vec grad, double eta, double bias, double biasGrad)` Updates the weight vector `x` such that `x = x - η·f(grad)`, where `f(grad)` is some function on the gradient that effectively returns a new vector. |
public AdaDelta()

Creates a new AdaDelta updater using a decay rate of 0.95.

public AdaDelta(double rho)

Creates a new AdaDelta updater.

Parameters:
rho - the decay rate to use

public AdaDelta(AdaDelta toCopy)

Copy constructor.

Parameters:
toCopy - the object to copy

public void setRho(double rho)

Sets the decay rate used by AdaDelta.

Parameters:
rho - the decay rate in (0, 1) to use

public double getRho()

Returns the decay rate in use.
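For orientation (not stated on this page): in Zeiler's original ADADELTA formulation, the decay rate ρ governs the exponential moving averages of squared gradients and squared updates that stand in for a global learning rate, roughly:

```latex
E[g^2]_t = \rho\, E[g^2]_{t-1} + (1-\rho)\, g_t^2,
\qquad
\Delta x_t = -\frac{\sqrt{E[\Delta x^2]_{t-1} + \epsilon}}{\sqrt{E[g^2]_t + \epsilon}}\; g_t
```

A ρ near 1 averages over a long window of past gradients; smaller values adapt faster but are noisier.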
public void update(Vec x, Vec grad, double eta)

Description copied from interface: GradientUpdater

Updates the weight vector x such that x = x - η·f(grad), where f(grad) is
some function on the gradient that effectively returns a new vector. It is
not necessary for the internal implementation to ever explicitly form any of
these objects, so long as x is mutated to have the correct result.

Specified by:
update in interface GradientUpdater

Parameters:
x - the vector to mutate such that it has been updated by the gradient
grad - the gradient to update the weight vector x from
eta - the learning rate to apply

public double update(Vec x, Vec grad, double eta, double bias, double biasGrad)

Description copied from interface: GradientUpdater

Updates the weight vector x such that x = x - η·f(grad), where f(grad) is
some function on the gradient that effectively returns a new vector. It is
not necessary for the internal implementation to ever explicitly form any of
these objects, so long as x is mutated to have the correct result.

Specified by:
update in interface GradientUpdater

Parameters:
x - the vector to mutate such that it has been updated by the gradient
grad - the gradient to update the weight vector x from
eta - the learning rate to apply
bias - the bias term of the vector
biasGrad - the gradient for the bias term

Returns:
the value to subtract from the bias term, such that bias = bias - returnValue
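Unlike the weight vector, the bias term is a primitive double and cannot be mutated through the parameter, so this overload returns the step and the caller applies it. A short sketch of the calling pattern (placeholder values; same DenseVector assumption as the earlier example):

```java
// Sketch of the bias-aware overload; gradient values are placeholders.
Vec w = new DenseVector(3);
double bias = 0.0;
AdaDelta updater = new AdaDelta(0.95);
updater.setup(w.length());

Vec grad = new DenseVector(new double[] { 0.1, -0.2, 0.3 }); // placeholder gradient
double biasGrad = 0.05;                                      // placeholder bias gradient

// update() mutates w in place and returns the step for the bias term;
// the caller applies it, per the contract bias = bias - returnValue.
double step = updater.update(w, grad, 1.0, bias, biasGrad);
bias = bias - step;
```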
public AdaDelta clone()

Specified by:
clone in interface GradientUpdater

Overrides:
clone in class Object
public void setup(int d)

Description copied from interface: GradientUpdater

Sets up this updater to update a weight vector of dimension d by a gradient
of the same dimension.

Specified by:
setup in interface GradientUpdater

Parameters:
d - the dimension of the weight vector that will be updated