public class PearsonDistance extends Object implements DistanceMetric
Constructor and Description |
---|
PearsonDistance()
Creates a new standard Pearson Distance that does not ignore zero values
and treats anti-correlated vectors as far apart.
|
PearsonDistance(boolean bothNonZero,
boolean absoluteDistance)
Creates a new Pearson Distance object
|
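The effect of the absoluteDistance flag is easiest to see on two perfectly anti-correlated vectors. The sketch below does not call JSAT. It recomputes a plain Pearson correlation on double[] arrays and assumes, purely for illustration, that a distance is derived as 1 - r, or 1 - |r| when the absolute correlation is used. The class name PearsonFlagsSketch, the helper toDistance, and the 1 - r conversion are all assumptions, not this class's documented formula.

```java
public class PearsonFlagsSketch {
    // Plain Pearson correlation over all indexes (zeros are ordinary values).
    public static double pearson(double[] a, double[] b) {
        int n = a.length;
        double sumA = 0, sumB = 0;
        for (int i = 0; i < n; i++) {
            sumA += a[i];
            sumB += b[i];
        }
        double meanA = sumA / n, meanB = sumB / n;
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Hypothetical conversion from correlation to distance (an assumption,
    // not taken from the library).
    public static double toDistance(double r, boolean absoluteDistance) {
        return 1.0 - (absoluteDistance ? Math.abs(r) : r);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {4, 3, 2, 1}; // perfectly anti-correlated with x
        double r = pearson(x, y);
        System.out.println("r = " + r);
        System.out.println("signed distance   = " + toDistance(r, false));
        System.out.println("absolute distance = " + toDistance(r, true));
    }
}
```

Under this assumed conversion the signed variant places the two vectors at the maximum distance, while the absolute variant treats them as identical.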
Modifier and Type | Method and Description |
---|---|
PearsonDistance |
clone() |
static double |
correlation(Vec a,
Vec b,
boolean bothNonZero)
Computes the Pearson correlation between two vectors.
|
double |
dist(int a,
int b,
List<? extends Vec> vecs,
List<Double> cache)
Computes the distance between 2 vectors in the original list of vectors.
|
double |
dist(int a,
Vec b,
List<? extends Vec> vecs,
List<Double> cache)
Computes the distance between one vector in the original list of vectors
with that of another vector not from the original list.
|
double |
dist(int a,
Vec b,
List<Double> qi,
List<? extends Vec> vecs,
List<Double> cache)
Computes the distance between one vector in the original list of vectors
and another vector not from the original list, for which query
information was generated by
DistanceMetric.getQueryInfo(jsat.linear.Vec) . |
double |
dist(Vec a,
Vec b)
Computes the distance between 2 vectors.
|
List<Double> |
getAccelerationCache(List<? extends Vec> vecs)
Returns a cache of double values associated with the given list of
vectors in the given order.
|
List<Double> |
getAccelerationCache(List<? extends Vec> vecs,
ExecutorService threadpool)
Returns a cache of double values associated with the given list of
vectors in the given order.
|
List<Double> |
getQueryInfo(Vec q)
Precomputes query information that would have been generated if the query
was a member of the original list of vectors when calling
DistanceMetric.getAccelerationCache(java.util.List) . |
boolean |
isIndiscemible()
Returns true if this distance metric obeys the rule that, for any x and y ∈ S
d(x, y) = 0 if and only if x = y |
boolean |
isSubadditive()
Returns true if this distance metric obeys the rule that, for any x, y, and z ∈ S
d(x, z) ≤ d(x, y) + d(y, z) |
boolean |
isSymmetric()
Returns true if this distance metric obeys the rule that, for any x and y ∈ S
d(x, y) = d(y, x) |
double |
metricBound()
All metrics must return values greater than or equal to 0.
|
boolean |
supportsAcceleration()
Indicates if this distance metric supports building an acceleration cache
using the
DistanceMetric.getAccelerationCache(java.util.List) and associated
distance methods. |
Methods inherited from class Object: equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface DistanceMetric: toString
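The bothNonZero option of correlation(Vec, Vec, boolean) can be sketched on plain arrays. The code below is one reading of the documented behaviour, not the library's implementation: with bothNonZero set, each mean is taken over that vector's non-zero values, indexes where either value is zero are skipped, and 0 is returned when no index has both values non-zero. The class name PearsonCorrelationSketch and the use of double[] instead of Vec are assumptions made to keep the example self-contained.

```java
public class PearsonCorrelationSketch {
    // Mean of a vector; when skipZeros is true only non-zero entries count.
    public static double mean(double[] v, boolean skipZeros) {
        double sum = 0;
        int count = 0;
        for (double x : v) {
            if (skipZeros && x == 0) continue;
            sum += x;
            count++;
        }
        return count == 0 ? 0 : sum / count;
    }

    // Pearson correlation; bothNonZero skips indexes where either value is zero.
    // Returns NaN if a vector has zero variance over the indexes used.
    public static double correlation(double[] a, double[] b, boolean bothNonZero) {
        double meanA = mean(a, bothNonZero);
        double meanB = mean(b, bothNonZero);
        double cov = 0, varA = 0, varB = 0;
        boolean overlap = false;
        for (int i = 0; i < a.length; i++) {
            if (bothNonZero && (a[i] == 0 || b[i] == 0)) continue;
            overlap = true;
            double da = a[i] - meanA, db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        if (bothNonZero && !overlap) return 0; // no shared non-zero index
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        // Zero stands for "no vote" in these hypothetical rating vectors.
        double[] u = {5, 3, 0, 1};
        double[] v = {4, 0, 2, 1};
        System.out.println("standard    = " + correlation(u, v, false));
        System.out.println("bothNonZero = " + correlation(u, v, true));
    }
}
```

Treating zeros as missing is typically useful for sparse data such as ratings, where a zero means the value was never recorded rather than that it was low.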
public PearsonDistance()
public PearsonDistance(boolean bothNonZero, boolean absoluteDistance)
Parameters:
bothNonZero - true if zero values should be treated as "missing" or "no vote" and will not contribute. This will not change the mean value used. false produces the standard Pearson value.
absoluteDistance - true to use the absolute correlation, meaning correlated and anti-correlated values will have the same distance.
public double dist(Vec a, Vec b)
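Description copied from interface: DistanceMetric
Computes the distance between 2 vectors.
Specified by:
dist in interface DistanceMetric
Parameters:
a - the first vector
b - the second vector
public boolean isSymmetric()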
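Description copied from interface: DistanceMetric
Returns true if this distance metric obeys the rule that, for any x and y ∈ S, d(x, y) = d(y, x)
Specified by:
isSymmetric in interface DistanceMetric
public boolean isSubadditive()
Description copied from interface: DistanceMetric
Returns true if this distance metric obeys the rule that, for any x, y, and z ∈ S, d(x, z) ≤ d(x, y) + d(y, z)
Specified by:
isSubadditive in interface DistanceMetric
public boolean isIndiscemible()
Description copied from interface: DistanceMetric
Returns true if this distance metric obeys the rule that, for any x and y ∈ S, d(x, y) = 0 if and only if x = y
Specified by:
isIndiscemible in interface DistanceMetric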
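The symmetry and indiscernibility rules can be probed numerically. The sketch below again works on double[] arrays and assumes a hypothetical distance of 1 - r; it makes no claim about what this class's isSymmetric() or isIndiscemible() actually return. It relies only on two properties of Pearson correlation itself: swapping the arguments does not change r, and rescaling a vector by a positive constant leaves r at 1. The class name MetricPropertySketch is an assumption.

```java
public class MetricPropertySketch {
    // Plain Pearson correlation over all indexes.
    public static double pearson(double[] a, double[] b) {
        int n = a.length;
        double sumA = 0, sumB = 0;
        for (int i = 0; i < n; i++) {
            sumA += a[i];
            sumB += b[i];
        }
        double meanA = sumA / n, meanB = sumB / n;
        double cov = 0, varA = 0, varB = 0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA, db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Hypothetical distance derived from the correlation (assumption).
    public static double dist(double[] a, double[] b) {
        return 1.0 - pearson(a, b);
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 4};
        double[] y = {2, 4, 8}; // y = 2 * x, so x != y
        double[] z = {3, 1, 2};
        // Symmetry: d(x, z) and d(z, x) agree up to rounding.
        System.out.println("|d(x,z) - d(z,x)| = " + Math.abs(dist(x, z) - dist(z, x)));
        // Indiscernibility fails for this assumed distance: d(x, y) is
        // (numerically) zero although x and y are different vectors.
        System.out.println("d(x, y) = " + dist(x, y));
    }
}
```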
public double metricBound()
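Description copied from interface: DistanceMetric
All metrics must return values greater than or equal to 0. Double.POSITIVE_INFINITY is a valid return value.
Specified by:
metricBound in interface DistanceMetric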
public PearsonDistance clone()
Specified by:
clone in interface DistanceMetric
Overrides:
clone in class Object
public static double correlation(Vec a, Vec b, boolean bothNonZero)
Computes the Pearson correlation between two vectors. If bothNonZero is true, and the vectors have no overlapping non-zero values, 0 will be returned.
Parameters:
a - the first vector
b - the second vector
bothNonZero - false is the normal Pearson correlation. true will make the computation ignore all indexes where one of the values is zero; the mean will be taken from all non-zero values in each vector.
public boolean supportsAcceleration()
Description copied from interface: DistanceMetric
Indicates if this distance metric supports building an acceleration cache using the DistanceMetric.getAccelerationCache(java.util.List) and associated distance methods. By default this method will return false. If true, then a cache can be obtained from this distance metric and used in conjunction with DistanceMetric.dist(int, jsat.linear.Vec, java.util.List, java.util.List) and DistanceMetric.dist(int, int, java.util.List, java.util.List) to perform distance computations.
Specified by:
supportsAcceleration in interface DistanceMetric
Returns:
true if cache acceleration is supported for this metric, false otherwise.
public List<Double> getAccelerationCache(List<? extends Vec> vecs)
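Description copied from interface: DistanceMetric
Returns a cache of double values associated with the given list of vectors in the given order. If acceleration is not supported, null will be returned.
Specified by:
getAccelerationCache in interface DistanceMetric
Parameters:
vecs - the list of vectors to build an acceleration cache for
public double dist(int a, int b, List<? extends Vec> vecs, List<Double> cache)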
Description copied from interface: DistanceMetric
Computes the distance between 2 vectors in the original list of vectors. If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
Specified by:
dist in interface DistanceMetric
Parameters:
a - the index of the first vector
b - the index of the second vector
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
public double dist(int a, Vec b, List<? extends Vec> vecs, List<Double> cache)
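Description copied from interface: DistanceMetric
Computes the distance between one vector in the original list of vectors and another vector not from the original list. If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
Specified by:
dist in interface DistanceMetric
Parameters:
a - the index of the vector in the cache
b - the other vector
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
public List<Double> getQueryInfo(Vec q)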
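Description copied from interface: DistanceMetric
Precomputes query information that would have been generated if the query was a member of the original list of vectors when calling DistanceMetric.getAccelerationCache(java.util.List). This can then be used if a large number of distance computations are going to be done against points in the original set for a point that is outside the original space. If acceleration is not supported, null will be returned.
Specified by:
getQueryInfo in interface DistanceMetric
Parameters:
q - the query point to generate cache information for
public List<Double> getAccelerationCache(List<? extends Vec> vecs, ExecutorService threadpool)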
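Description copied from interface: DistanceMetric
Returns a cache of double values associated with the given list of vectors in the given order. If acceleration is not supported, null will be returned.
Specified by:
getAccelerationCache in interface DistanceMetric
Parameters:
vecs - the list of vectors to build an acceleration cache for
threadpool - source of threads for parallel computation of the result. This may be null, which means the single-threaded version may be used.
public double dist(int a, Vec b, List<Double> qi, List<? extends Vec> vecs, List<Double> cache)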
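Description copied from interface: DistanceMetric
Computes the distance between one vector in the original list of vectors and another vector not from the original list, for which query information was generated by DistanceMetric.getQueryInfo(jsat.linear.Vec). If the cache is null, then DistanceMetric.dist(jsat.linear.Vec, jsat.linear.Vec) will be called directly.
Specified by:
dist in interface DistanceMetric
Parameters:
a - the index of the vector in the cache
b - the other vector
qi - the query information about b
vecs - the list of vectors used to build the cache
cache - the cache associated with the given list of vectors
Copyright © 2017. All rights reserved.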