public class MahalanobisDistance extends TrainableDistanceMetric
Constructor and Description |
---|
MahalanobisDistance() |
Modifier and Type | Method and Description |
---|---|
MahalanobisDistance | clone() |
double | dist(int a, int b, List<? extends Vec> vecs, List<Double> cache) Computes the distance between 2 vectors in the original list of vectors. |
double | dist(int a, Vec b, List<? extends Vec> vecs, List<Double> cache) Computes the distance between one vector in the original list of vectors and another vector not from the original list. |
double | dist(int a, Vec b, List<Double> qi, List<? extends Vec> vecs, List<Double> cache) Computes the distance between one vector in the original list of vectors and another vector not from the original list, but for which information has been generated by DistanceMetric.getQueryInfo(jsat.linear.Vec). |
double | dist(Vec a, Vec b) Computes the distance between 2 vectors. |
List<Double> | getAccelerationCache(List<? extends Vec> vecs) Returns a cache of double values associated with the given list of vectors in the given order. |
List<Double> | getAccelerationCache(List<? extends Vec> vecs, ExecutorService threadpool) Returns a cache of double values associated with the given list of vectors in the given order. |
List<Double> | getQueryInfo(Vec q) Pre-computes query information that would have been generated if the query were a member of the original list of vectors when calling DistanceMetric.getAccelerationCache(java.util.List). |
boolean | isIndiscemible() Returns true if this distance metric obeys the rule that, for any x and y ∈ S, d(x, y) = 0 if and only if x = y. |
boolean | isReTrain() Returns true if this metric will indicate a need to be retrained once it has been trained once. |
boolean | isSubadditive() Returns true if this distance metric obeys the rule that, for any x, y, and z ∈ S, d(x, z) ≤ d(x, y) + d(y, z). |
boolean | isSymmetric() Returns true if this distance metric obeys the rule that, for any x and y ∈ S, d(x, y) = d(y, x). |
double | metricBound() All metrics must return values greater than or equal to 0. |
boolean | needsTraining() Returns true if the metric needs to be trained. |
void | setReTrain(boolean reTrain) It may be desirable to have the metric trained only once, and use the same parameters for all other training sessions of the learning algorithm using the metric. |
boolean | supportsAcceleration() Indicates if this distance metric supports building an acceleration cache using the DistanceMetric.getAccelerationCache(java.util.List) and associated distance methods. |
boolean | supportsClassificationTraining() Some metrics might be special purpose, and not trainable for all types of data sets or tasks. |
boolean | supportsRegressionTraining() Some metrics might be special purpose, and not trainable for all types of data sets or tasks. |
String | toString() Returns a descriptive name of the Distance Metric in use. |
void | train(ClassificationDataSet dataSet) Trains this metric on the given classification problem data set. |
void | train(ClassificationDataSet dataSet, ExecutorService threadpool) Trains this metric on the given classification problem data set. |
void | train(DataSet dataSet) Trains this metric on the given data set. |
void | train(DataSet dataSet, ExecutorService threadpool) Trains this metric on the given data set. |
<V extends Vec> void | train(List<V> dataSet) Trains this metric on the given data set. |
<V extends Vec> void | train(List<V> dataSet, ExecutorService threadpool) Trains this metric on the given data set. |
void | train(RegressionDataSet dataSet) Trains this metric on the given regression problem data set. |
void | train(RegressionDataSet dataSet, ExecutorService threadpool) Trains this metric on the given regression problem data set. |
trainIfNeeded, trainIfNeeded, trainIfNeeded, trainIfNeeded
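For orientation, the quantity a Mahalanobis metric is generally understood to compute is d(x, y) = sqrt((x - y)^T S^-1 (x - y)), where S is a covariance matrix. The sketch below illustrates that formula on plain arrays. It does not use JSAT's Vec or Matrix types, and the class name MahalanobisSketch and the caller-supplied inverse matrix sInv are assumptions made for illustration, not part of this API.

```java
// Hypothetical, JSAT-independent sketch of the textbook Mahalanobis formula.
final class MahalanobisSketch {
    /**
     * Returns sqrt((x - y)^T * sInv * (x - y)), where sInv is assumed to be
     * the inverse of a covariance matrix supplied by the caller.
     */
    static double dist(double[] x, double[] y, double[][] sInv) {
        int n = x.length;
        double[] diff = new double[n];
        for (int i = 0; i < n; i++) {
            diff[i] = x[i] - y[i];
        }
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double row = 0.0;
            for (int j = 0; j < n; j++) {
                row += sInv[i][j] * diff[j]; // i-th entry of sInv * diff
            }
            sum += diff[i] * row;            // accumulate diff^T * (sInv * diff)
        }
        return Math.sqrt(sum);
    }
}
```

With the identity matrix as sInv, this reduces to the ordinary Euclidean distance, which is a convenient sanity check.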
public boolean isReTrain()
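Returns true if this metric will indicate a need to be retrained once it has been trained once. If true, needsTraining() will always return true. false means the metric will not indicate a need to be retrained once it has been trained once.
public void setReTrain(boolean reTrain)
It may be desirable to have the metric trained only once, and use the same parameters for all other training sessions of the learning algorithm using the metric. If true, needsTraining() will always return true. false means the metric will not indicate a need to be retrained once it has been trained once.
reTrain - true to make the metric always request retraining, false so it will not.
public <V extends Vec> void train(List<V> dataSet)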
TrainableDistanceMetric
train
in class TrainableDistanceMetric
V - the type of vectors in the list
dataSet - the data set to train on
public <V extends Vec> void train(List<V> dataSet, ExecutorService threadpool)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
V - the type of vectors in the list
dataSet - the data set to train on
threadpool - the source of threads for parallel training
public void train(DataSet dataSet)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
public void train(DataSet dataSet, ExecutorService threadpool)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
threadpool - the source of threads for parallel training
public void train(ClassificationDataSet dataSet)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
public void train(ClassificationDataSet dataSet, ExecutorService threadpool)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
threadpool - the source of threads for parallel training
public boolean supportsClassificationTraining()
TrainableDistanceMetric
supportsClassificationTraining
in class TrainableDistanceMetric
public void train(RegressionDataSet dataSet)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
public void train(RegressionDataSet dataSet, ExecutorService threadpool)
TrainableDistanceMetric
train
in class TrainableDistanceMetric
dataSet - the data set to train on
threadpool - the source of threads for parallel training
public boolean supportsRegressionTraining()
TrainableDistanceMetric
supportsRegressionTraining
in class TrainableDistanceMetric
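The train overloads above only say that the metric is trained on a data set. For a Mahalanobis-style metric, one plausible reading is that training estimates a covariance matrix from the data. That is an assumption, since this page does not describe the implementation. The sketch below computes a sample covariance matrix on plain arrays, and the class CovarianceSketch is hypothetical, not a JSAT type.

```java
import java.util.List;

// Hypothetical sketch: sample covariance of a list of equal-length vectors.
// Assumes at least two vectors; JSAT's actual training procedure may differ.
final class CovarianceSketch {
    static double[][] covariance(List<double[]> data) {
        int n = data.size();
        int d = data.get(0).length;

        // Per-dimension mean.
        double[] mean = new double[d];
        for (double[] v : data) {
            for (int i = 0; i < d; i++) {
                mean[i] += v[i] / n;
            }
        }

        // Unbiased estimate: sum of outer products of deviations, divided by n - 1.
        double[][] cov = new double[d][d];
        for (double[] v : data) {
            for (int i = 0; i < d; i++) {
                for (int j = 0; j < d; j++) {
                    cov[i][j] += (v[i] - mean[i]) * (v[j] - mean[j]) / (n - 1);
                }
            }
        }
        return cov;
    }
}
```

Inverting the resulting matrix, which the distance computation would need, is left out to keep the sketch short.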
public boolean needsTraining()
TrainableDistanceMetric
needsTraining
in class TrainableDistanceMetric
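The interaction between isReTrain(), setReTrain(boolean) and needsTraining() can be sketched with two flags. The class RetrainFlagSketch and its initial reTrain value are assumptions chosen for illustration. Only the rule "reTrain true means needsTraining() always returns true" comes from the descriptions above.

```java
// Hypothetical sketch of the retrain flag semantics; not the JSAT implementation.
final class RetrainFlagSketch {
    private boolean reTrain = true;  // initial value is an arbitrary choice for this sketch
    private boolean trained = false;

    boolean isReTrain() {
        return reTrain;
    }

    void setReTrain(boolean reTrain) {
        this.reTrain = reTrain;
    }

    /** Stand-in for any of the train(...) overloads. */
    void train() {
        trained = true;
    }

    /** Always true while reTrain is set; otherwise true only until the first training. */
    boolean needsTraining() {
        return reTrain || !trained;
    }
}
```

With reTrain set to false, a learning algorithm that consults needsTraining() before each session would train the metric once and then reuse the same parameters.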
public double dist(Vec a, Vec b)
DistanceMetric
a - the first vector
b - the second vector
public boolean isSymmetric()
DistanceMetric
public boolean isSubadditive()
DistanceMetric
public boolean isIndiscemible()
DistanceMetric
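The three properties reported by isSymmetric(), isSubadditive() and isIndiscemible() can be spot-checked numerically on a finite set of points. The sketch below does this for any distance function over plain arrays. The class MetricPropertySketch, its method names and the tolerance parameter are assumptions for illustration, and passing such a check on a sample does not prove the property in general.

```java
import java.util.Arrays;
import java.util.function.ToDoubleBiFunction;

// Hypothetical helper: checks metric properties on a finite sample of points.
final class MetricPropertySketch {
    /** d(x, y) = d(y, x) for every pair in pts, up to tol. */
    static boolean isSymmetricOn(double[][] pts,
                                 ToDoubleBiFunction<double[], double[]> d, double tol) {
        for (double[] x : pts)
            for (double[] y : pts)
                if (Math.abs(d.applyAsDouble(x, y) - d.applyAsDouble(y, x)) > tol) return false;
        return true;
    }

    /** d(x, z) <= d(x, y) + d(y, z) for every triple in pts, up to tol. */
    static boolean isSubadditiveOn(double[][] pts,
                                   ToDoubleBiFunction<double[], double[]> d, double tol) {
        for (double[] x : pts)
            for (double[] y : pts)
                for (double[] z : pts)
                    if (d.applyAsDouble(x, z) > d.applyAsDouble(x, y) + d.applyAsDouble(y, z) + tol)
                        return false;
        return true;
    }

    /** d(x, y) = 0 if and only if x = y, for every pair in pts, up to tol. */
    static boolean isIndiscernibleOn(double[][] pts,
                                     ToDoubleBiFunction<double[], double[]> d, double tol) {
        for (double[] x : pts)
            for (double[] y : pts)
                if ((Math.abs(d.applyAsDouble(x, y)) <= tol) != Arrays.equals(x, y)) return false;
        return true;
    }

    /** Plain Euclidean distance, used here only as a known-good example metric. */
    static double euclidean(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }
}
```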
public double metricBound()
DistanceMetric
Double.POSITIVE_INFINITY is a valid return value.
public String toString()
DistanceMetric
toString
in interface DistanceMetric
toString
in class Object
public MahalanobisDistance clone()
clone
in interface DistanceMetric
clone
in class TrainableDistanceMetric
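The acceleration methods of this class (supportsAcceleration(), getAccelerationCache(...), getQueryInfo(...) and the index-based dist(...) overloads) follow a precompute-then-reuse pattern. This page does not say which values JSAT caches for the Mahalanobis metric, so the sketch below illustrates the pattern with plain Euclidean distance and cached squared norms instead. The class AccelerationCacheSketch and its use of double[] in place of Vec are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the cache pattern using Euclidean distance:
// ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * (a . b), with the squared norms cached.
final class AccelerationCacheSketch {
    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Direct distance, used as the fallback when no cache is available. */
    static double dist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    /** One cached value per vector, in list order: its squared norm. */
    static List<Double> getAccelerationCache(List<double[]> vecs) {
        List<Double> cache = new ArrayList<>(vecs.size());
        for (double[] v : vecs) cache.add(dot(v, v));
        return cache;
    }

    /** The same per-vector information, computed for a query outside the list. */
    static List<Double> getQueryInfo(double[] q) {
        List<Double> qi = new ArrayList<>(1);
        qi.add(dot(q, q));
        return qi;
    }

    /** Distance between two indexed vectors; falls back to dist(a, b) if cache is null. */
    static double dist(int a, int b, List<double[]> vecs, List<Double> cache) {
        if (cache == null) return dist(vecs.get(a), vecs.get(b));
        double sq = cache.get(a) + cache.get(b) - 2 * dot(vecs.get(a), vecs.get(b));
        return Math.sqrt(Math.max(0.0, sq)); // clamp tiny negative rounding errors
    }

    /** Distance between an indexed vector and an outside query with precomputed info. */
    static double dist(int a, double[] b, List<Double> qi,
                       List<double[]> vecs, List<Double> cache) {
        if (cache == null) return dist(vecs.get(a), b);
        double sq = cache.get(a) + qi.get(0) - 2 * dot(vecs.get(a), b);
        return Math.sqrt(Math.max(0.0, sq));
    }
}
```

The cache pays off when many distances are computed against the same list, because each squared norm is computed once rather than once per comparison.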
public boolean supportsAcceleration()
DistanceMetric
Indicates if this distance metric supports building an acceleration cache using the DistanceMetric.getAccelerationCache(java.util.List) and associated distance methods. By default this method will return false. If true, then a cache can be obtained from this distance metric and used in conjunction with DistanceMetric.dist(int, jsat.linear.Vec, java.util.List, java.util.List) and DistanceMetric.dist(int, int, java.util.List, java.util.List) to perform distance computations.
true if cache acceleration is supported for this metric, false otherwise.
public List<Double> getAccelerationCache(List<? extends Vec> vecs)
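DistanceMetric
Returns a cache of double values associated with the given list of vectors in the given order. If acceleration is not supported, null will be returned.
vecs - the list of vectors to build an acceleration cache for
public double dist(int a, int b, List<? extends Vec> vecs, List<Double> cache)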
DistanceMetric
Computes the distance between 2 vectors in the original list of vectors. If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
a - the index of the first vector
b - the index of the second vector
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
public double dist(int a, Vec b, List<? extends Vec> vecs, List<Double> cache)
DistanceMetric
Computes the distance between one vector in the original list of vectors and another vector not from the original list. If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
a - the index of the vector in the cache
b - the other vector
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
public List<Double> getQueryInfo(Vec q)
DistanceMetric
Pre-computes query information that would have been generated if the query were a member of the original list of vectors when calling DistanceMetric.getAccelerationCache(java.util.List). This can then be used if a large number of distance computations are going to be done against points in the original set for a point that is outside the original space. If acceleration is not supported, null will be returned.
q - the query point to generate cache information for
public List<Double> getAccelerationCache(List<? extends Vec> vecs, ExecutorService threadpool)
DistanceMetric
Returns a cache of double values associated with the given list of vectors in the given order. If acceleration is not supported, null will be returned.
vecs - the list of vectors to build an acceleration cache for
threadpool - source of threads for parallel computation of result. This may be null, which means the single threaded version may be used.
public double dist(int a, Vec b, List<Double> qi, List<? extends Vec> vecs, List<Double> cache)
DistanceMetric
Computes the distance between one vector in the original list of vectors and another vector not from the original list, but for which information has been generated by DistanceMetric.getQueryInfo(jsat.linear.Vec). If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
a - the index of the vector in the cache
b - the other vector
qi - the query information about b
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
Copyright © 2017. All rights reserved.