public class E2LSH<V extends Vec> extends Object
An LSH scheme for the L1 and L2 distance metrics. This is
essentially a vector collection that can only perform a radius search for a
pre-defined radius. In addition, the results are only approximate - not all
of the correct points may be returned, and it is possible no points will be
returned when the truth is that some data points do exist.
Searches are performed with the searchR(jsat.linear.Vec, boolean)
methods. While the set of points returned is approximate, the distance values
are exact. This is because no approximate distance is available, so the
distances must be computed to remove violators.
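The exact-distance check described above can be sketched in plain Java. This is a minimal illustration, not JSAT code: the class `RadiusFilterSketch`, its methods, and the use of raw `double[]` vectors are all hypothetical, and the candidate list stands in for whatever the hash tables return.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: none of these names come from JSAT. */
class RadiusFilterSketch {

    /** Exact Euclidean (L2) distance between two equal-length vectors. */
    static double l2(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Drops candidates whose exact distance to the query exceeds the limit
     * and returns the exact distances of the survivors, in candidate order.
     */
    static List<Double> keepWithin(List<double[]> candidates, double[] query, double limit) {
        List<Double> kept = new ArrayList<>();
        for (double[] c : candidates) {
            double dist = l2(c, query); // computed exactly, never estimated
            if (dist <= limit) {
                kept.add(dist);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<double[]> candidates = new ArrayList<>();
        candidates.add(new double[] {0.0, 1.0});
        candidates.add(new double[] {3.0, 4.0}); // distance 5 from the origin: a violator
        candidates.add(new double[] {1.0, 0.0});
        // prints [1.0, 1.0]
        System.out.println(keepWithin(candidates, new double[] {0.0, 0.0}, 2.0));
    }
}
```

The filter can only remove false positives. A true neighbor that never collided with the query in any hash table is not in the candidate list, which is why the result set remains approximate even though every reported distance is exact.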
Constructor and Description |
---|
E2LSH(List<V> vecs, double radius, double eps, int w, int k, double delta, DistanceMetric dm) Creates a new LSH scheme for a given distance metric. |
E2LSH(List<V> vecs, double radius, double eps, int w, int k, double delta, DistanceMetric dm, List<Double> distCache) Creates a new LSH scheme for a given distance metric. |
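To make the roles of `w` and `k` more concrete, here is a sketch of the p-stable hash family commonly used for this kind of LSH: each function projects the vector onto a random direction, adds a random offset, and divides by the bucket width `w`; `k` such values are joined into one key. This is an assumption about the general technique, not a copy of this class's internals. The class `PStableHashSketch` and its members are hypothetical, and Gaussian projections are shown, which suit the L2 case (the L1 case would use Cauchy-distributed projections instead).

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: a generic p-stable hash for L2, not JSAT's implementation. */
class PStableHashSketch {
    private final double[][] a; // k random projection directions
    private final double[] b;   // k random offsets drawn from [0, w)
    private final double w;     // bucket width (the "projection radius")

    PStableHashSketch(int dim, int k, double w, long seed) {
        Random rnd = new Random(seed);
        this.w = w;
        this.a = new double[k][dim];
        this.b = new double[k];
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < dim; j++) {
                a[i][j] = rnd.nextGaussian(); // Gaussian is 2-stable, matching L2
            }
            b[i] = rnd.nextDouble() * w;
        }
    }

    /** Returns the k bucket indices floor((a_i . v + b_i) / w) for vector v. */
    int[] hash(double[] v) {
        int[] h = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            double dot = 0.0;
            for (int j = 0; j < v.length; j++) {
                dot += a[i][j] * v[j];
            }
            h[i] = (int) Math.floor((dot + b[i]) / w);
        }
        return h;
    }

    public static void main(String[] args) {
        PStableHashSketch hasher = new PStableHashSketch(3, 4, 4.0, 42L);
        // The k indices together form the key for one hash table.
        System.out.println(Arrays.toString(hasher.hash(new double[] {1.0, 2.0, 3.0})));
    }
}
```

Two vectors land in the same bucket of a table only if all `k` indices match, so a larger `k` makes each table more selective, while a larger `w` makes each individual function more permissive.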
Modifier and Type | Method and Description |
---|---|
double | getC() Returns the multiplier used on the radius that controls the degree of approximation. |
int | getL() Returns how many separate hash tables have been created for this distance metric. |
double | getRadius() Returns the desired approximate radius for which to return results. |
List<? extends VecPaired<Vec,Double>> | searchR(Vec q) Performs a search for points within the set radius of the query point. |
List<? extends VecPaired<Vec,Double>> | searchR(Vec q, boolean approx) Performs a search for points within the set radius of the query point. |
public E2LSH(List<V> vecs, double radius, double eps, int w, int k, double delta, DistanceMetric dm, List<Double> distCache)
vecs - the set of vectors to place into the LSH
radius - the searchR radius for vectors
eps - the approximation error, where vectors as far as R(1+eps) are likely to be returned. Must be positive.
w - the projection radius. If given a value <= 0, a default value of 4 will be used.
k - the number of hash functions to conjoin into the final hash per vector. If a value <= 0 is given, a default value will be computed.
delta - (1-delta) will be the desired minimum probability of correctly selecting the nearest neighbor if there is only one nearest neighbor within a distance of radius. It is used to determine the number of hash tables, getL(), needed to reach the desired probability. 0.10 is a good value.
dm - the distance metric to use, must be EuclideanDistance or ManhattanDistance.
distCache - the distance acceleration cache to use. If null, and it is supported, one will not be built. This is provided to avoid redundant calculation when initializing multiple LSH tables using the same data set.
public E2LSH(List<V> vecs, double radius, double eps, int w, int k, double delta, DistanceMetric dm)
vecs - the set of vectors to place into the LSH
radius - the searchR radius for vectors
eps - the approximation error, where vectors as far as R(1+eps) are likely to be returned. Must be positive.
w - the projection radius. If given a value <= 0, a default value of 4 will be used.
k - the number of hash functions to conjoin into the final hash per vector. If a value <= 0 is given, a default value will be computed.
delta - (1-delta) will be the desired minimum probability of correctly selecting the nearest neighbor if there is only one nearest neighbor within a distance of radius. It is used to determine the number of hash tables, getL(), needed to reach the desired probability. 0.10 is a good value.
dm - the distance metric to use, must be EuclideanDistance or ManhattanDistance.
public List<? extends VecPaired<Vec,Double>> searchR(Vec q)
Performs a search for points within the set radius of the query point.
q - the query point to search near
public List<? extends VecPaired<Vec,Double>> searchR(Vec q, boolean approx)
Performs a search for points within the set radius of the query point.
q - the query point to search near
approx - whether or not to return results in the approximate query range
public double getC()
public double getRadius()
public int getL()
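The relationship between `delta` and the table count reported by getL() can be illustrated with the standard LSH bound. If a single hash function collides for a true neighbor with probability p1, a table keyed on `k` conjoined functions finds it with probability p1^k, and L independent tables all miss it with probability (1 - p1^k)^L. Requiring that miss probability to be at most delta gives the L computed below. This is a sketch of the textbook formula only: the class `TableCountSketch`, the method `tablesNeeded`, and the parameter `p1` are hypothetical, and this documentation does not state how the class computes its own value.

```java
/** Illustrative only: the textbook bound on the number of LSH tables. */
class TableCountSketch {

    /**
     * Smallest L such that (1 - p1^k)^L <= delta.
     *
     * @param p1    assumed single-function collision probability for a point
     *              at the search radius, strictly between 0 and 1
     * @param k     number of hash functions conjoined per table, at least 1
     * @param delta allowed failure probability, strictly between 0 and 1
     */
    static int tablesNeeded(double p1, int k, double delta) {
        if (p1 <= 0.0 || p1 >= 1.0 || delta <= 0.0 || delta >= 1.0 || k < 1) {
            throw new IllegalArgumentException("need 0 < p1 < 1, 0 < delta < 1, k >= 1");
        }
        double hitOneTable = Math.pow(p1, k); // all k functions must collide
        // Both logarithms are negative, so the ratio is positive.
        return (int) Math.ceil(Math.log(delta) / Math.log(1.0 - hitOneTable));
    }

    public static void main(String[] args) {
        // With p1 = 0.5, k = 1, delta = 0.1: 0.5^3 = 0.125 > 0.1 but 0.5^4 = 0.0625 <= 0.1
        System.out.println(tablesNeeded(0.5, 1, 0.1)); // prints 4
    }
}
```

The formula shows the trade-off behind the constructor arguments: raising `k` shrinks p1^k and therefore increases the number of tables needed to keep the failure probability under delta.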
Copyright © 2017. All rights reserved.