public class ElkanKMeans extends KMeans
The DistanceMetric used must support DistanceMetric.isSubadditive().
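A subadditive metric is one that obeys the triangle inequality, d(x, z) <= d(x, y) + d(y, z), and Elkan-style k-means relies on that property to skip distance computations without changing the result. The following is a minimal sketch of one such pruning rule, assuming Euclidean distance on plain double[] arrays. The class and method names (TrianglePruningSketch, dist, nearest) are hypothetical and do not come from the library; this is an illustration of the idea, not the library's implementation.

```java
/** Hypothetical sketch of triangle-inequality pruning; not the library's code. */
final class TrianglePruningSketch {

    /** Plain Euclidean distance, which satisfies the triangle inequality. */
    static double dist(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns the index of the centroid nearest to x.
     * evaluations[0] is incremented once per point-to-centroid distance computed.
     */
    static int nearest(double[] x, double[][] centroids, int[] evaluations) {
        int best = 0;
        double bestDist = dist(x, centroids[0]);
        evaluations[0]++;
        for (int j = 1; j < centroids.length; j++) {
            // Triangle inequality: if d(c_best, c_j) >= 2 * d(x, c_best),
            // then d(x, c_j) >= d(x, c_best), so c_j cannot be closer.
            // (A real implementation would cache centroid-to-centroid distances.)
            if (dist(centroids[best], centroids[j]) >= 2 * bestDist) {
                continue;
            }
            double d = dist(x, centroids[j]);
            evaluations[0]++;
            if (d < bestDist) {
                bestDist = d;
                best = j;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = {{0, 0}, {10, 0}, {0, 10}};
        int[] evaluations = {0};
        int idx = nearest(new double[]{1, 1}, centroids, evaluations);
        System.out.println("nearest=" + idx + ", distances computed=" + evaluations[0]);
    }
}
```

With a metric that is not subadditive the skip test above is no longer valid, which is why the requirement is stated for every constructor that accepts a DistanceMetric.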
DEFAULT_SEED_SELECTION, dm, MaxIterLimit, means, nearestCentroidDist, rand, saveCentroidDistance, seedSelection, storeMeans
Constructor and Description |
---|
ElkanKMeans(): Creates a new ElkanKMeans instance. |
ElkanKMeans(DistanceMetric dm): Creates a new ElkanKMeans instance. |
ElkanKMeans(DistanceMetric dm, Random rand): Creates a new ElkanKMeans instance. |
ElkanKMeans(DistanceMetric dm, Random rand, SeedSelectionMethods.SeedSelection seedSelection): Creates a new ElkanKMeans instance. |
ElkanKMeans(ElkanKMeans toCopy) |
Modifier and Type | Method and Description |
---|---|
ElkanKMeans | clone() |
protected double | cluster(DataSet dataSet, List<Double> accelCache, int k, List<Vec> means, int[] assignment, boolean exactTotal, ExecutorService threadpool, boolean returnError, Vec dataPointWeights): This is a helper method where the actual clustering is performed. |
boolean | isUseDenseSparse(): Returns whether dense-sparse acceleration will be used if available. |
void | setUseDenseSparse(boolean useDenseSparse): Sets whether or not to use DenseSparseMetric when computing. |
cluster, cluster, cluster, cluster, cluster, cluster, getDistanceMetric, getIterationLimit, getListOfLists, getMeans, getParameter, getParameters, getSeedSelection, setIterationLimit, setSeedSelection, setStoreMeans, supportsWeightedData
cluster, cluster, cluster, cluster
cluster, cluster, createClusterListFromAssignmentArray, getDatapointsFromCluster
public ElkanKMeans(DistanceMetric dm, Random rand, SeedSelectionMethods.SeedSelection seedSelection)
dm - the distance metric to use, must support DistanceMetric.isSubadditive().
rand - the random number generator to use during seed selection
seedSelection - the method of seed selection to use

public ElkanKMeans(DistanceMetric dm, Random rand)
dm - the distance metric to use, must support DistanceMetric.isSubadditive().
rand - the random number generator to use during seed selection

public ElkanKMeans(DistanceMetric dm)
dm - the distance metric to use, must support DistanceMetric.isSubadditive().

public ElkanKMeans()
EuclideanDistance will be used by default.

public ElkanKMeans(ElkanKMeans toCopy)

public void setUseDenseSparse(boolean useDenseSparse)
Sets whether or not to use DenseSparseMetric when computing. This may or may not provide a speed increase.
useDenseSparse - whether or not to compute the distance from dense mean vectors to sparse ones using acceleration

public boolean isUseDenseSparse()
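
To illustrate why a dense-to-sparse shortcut can help, here is a minimal sketch for the Euclidean case, under the assumption that the squared norm of each dense mean is cached and the sparse point is stored as parallel index and value arrays. It uses the identity ||m - x||^2 = ||m||^2 - 2 m·x + ||x||^2, so the loop touches only the non-zero entries of x. The names (DenseSparseSketch, squaredNorm, dist) are hypothetical, and this is not a description of how DenseSparseMetric is actually implemented.

```java
/** Hypothetical sketch of a dense-mean to sparse-point Euclidean distance. */
final class DenseSparseSketch {

    /** Squared norm of a dense mean; intended to be computed once and reused. */
    static double squaredNorm(double[] dense) {
        double sum = 0;
        for (double v : dense) {
            sum += v * v;
        }
        return sum;
    }

    /**
     * Euclidean distance between a dense mean and a sparse point.
     * The sparse point has value val[n] at position idx[n] and zero elsewhere.
     */
    static double dist(double[] mean, double meanSqNorm, int[] idx, double[] val) {
        double dot = 0;
        double sparseSqNorm = 0;
        // Only the non-zero entries of the sparse point are visited.
        for (int n = 0; n < idx.length; n++) {
            dot += mean[idx[n]] * val[n];
            sparseSqNorm += val[n] * val[n];
        }
        // Clamp at zero to guard against small negative values from rounding.
        return Math.sqrt(Math.max(0.0, meanSqNorm - 2 * dot + sparseSqNorm));
    }

    public static void main(String[] args) {
        double[] mean = {1, 2, 3, 4};
        double cached = squaredNorm(mean);
        // Sparse point (0, 2, 0, 1) stored as indices {1, 3} and values {2, 1}.
        System.out.println(dist(mean, cached, new int[]{1, 3}, new double[]{2, 1}));
    }
}
```

The saving depends on how sparse the data is relative to the cost of maintaining the cached norms, which is consistent with the note that the option may or may not provide a speed increase.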

protected double cluster(DataSet dataSet, List<Double> accelCache, int k, List<Vec> means, int[] assignment, boolean exactTotal, ExecutorService threadpool, boolean returnError, Vec dataPointWeights)
cluster in class KMeans
dataSet - the set of data points to perform clustering on
accelCache - acceleration cache to use, or null. If null, the k-means code will attempt to create one
k - the number of clusters
means - the initial points to use as the means. Its length is the number of means that will be searched for. These means will be altered, and should contain deep copies of the points they were drawn from. May be empty, in which case the list will be filled with some selected means
assignment - an empty temporary space to store the clustering classifications. Should be the same length as the number of data points
exactTotal - determines how the objective function (return value) will be computed. If true, extra work will be done to compute the exact distance from each data point to its cluster. If false, an upper bound approximation will be used. This also impacts the value stored in KMeans.nearestCentroidDist
threadpool - the source of threads for parallel computation. If null, single-threaded execution will occur
returnError - true if the sum of squared distances should be returned. false means any value can be returned. KMeans.saveCentroidDistance only applies if this is true
dataPointWeights - the weight value to use for each data point. If null, assume each point has equal weight.

public ElkanKMeans clone()
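
As an illustration of the value that cluster returns when returnError is true, the sketch below computes a sum of squared distances from each point to its assigned mean, treating a null weight array as equal weights. It assumes that each point's weight multiplies its squared distance, which is an assumption made for illustration and is not stated in the documentation above. The names (ObjectiveSketch, weightedSse) and the use of plain arrays in place of DataSet and Vec are hypothetical.

```java
/** Hypothetical sketch of a weighted sum-of-squared-distances objective. */
final class ObjectiveSketch {

    /**
     * Sums weight[i] * squaredDistance(points[i], means[assignment[i]]).
     * A null weights array means every point has weight 1.
     */
    static double weightedSse(double[][] points, double[][] means,
                              int[] assignment, double[] weights) {
        double total = 0;
        for (int i = 0; i < points.length; i++) {
            double[] mean = means[assignment[i]];
            double sq = 0;
            for (int d = 0; d < mean.length; d++) {
                double diff = points[i][d] - mean[d];
                sq += diff * diff;
            }
            // Assumption: the weight scales the squared distance of its point.
            double w = (weights == null) ? 1.0 : weights[i];
            total += w * sq;
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {2, 0}, {10, 0}};
        double[][] means = {{1, 0}, {10, 0}};
        int[] assignment = {0, 0, 1};
        System.out.println(weightedSse(points, means, assignment, null));
    }
}
```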
Copyright © 2017. All rights reserved.