public class MiniBatchKMeans extends KClustererBase
Constructor and Description |
---|
MiniBatchKMeans(DistanceMetric dm, int batchSize, int iterations): Creates a new Mini-Batch k-Means object that uses k-means++ for seed selection. |
MiniBatchKMeans(DistanceMetric dm, int batchSize, int iterations, SeedSelectionMethods.SeedSelection seedSelection): Creates a new Mini-Batch k-Means object. |
MiniBatchKMeans(int batchSize, int iterations): Creates a new Mini-Batch k-Means object that uses k-means++ for seed selection and uses the EuclideanDistance. |
MiniBatchKMeans(MiniBatchKMeans toCopy): Copy constructor. |
Modifier and Type | Method and Description |
---|---|
MiniBatchKMeans | clone() |
int[] | cluster(DataSet dataSet, ExecutorService threadpool, int[] designations): Performs clustering on the given data set. |
int[] | cluster(DataSet dataSet, int[] designations): Performs clustering on the given data set. |
int[] | cluster(DataSet dataSet, int clusters, ExecutorService threadpool, int[] designations) |
int[] | cluster(DataSet dataSet, int clusters, int[] designations) |
int[] | cluster(DataSet dataSet, int lowK, int highK, ExecutorService threadpool, int[] designations) |
int[] | cluster(DataSet dataSet, int lowK, int highK, int[] designations) |
int | getBatchSize(): Returns the batch size used at each iteration. |
DistanceMetric | getDistanceMetric(): Returns the distance metric used for determining the nearest cluster center. |
int | getIterations(): Returns the number of mini-batch iterations used. |
List<Vec> | getMeans(): Returns the raw list of means that were used for each class. |
SeedSelectionMethods.SeedSelection | getSeedSelection(): Returns the method of seed selection to use. |
void | setBatchSize(int batchSize): Sets the batch size to use at each iteration. |
void | setDistanceMetric(DistanceMetric dm): Sets the distance metric used for determining the nearest cluster center. |
void | setIterations(int iterations): Sets the number of mini-batch iterations to perform. |
void | setSeedSelection(SeedSelectionMethods.SeedSelection seedSelection): Sets the method of selecting the initial data points to seed the clustering algorithm. |
void | setStoreMeans(boolean storeMeans): If set to true, the computed means will be stored after clustering is completed, and can then be retrieved using getMeans(). |
Inherited methods, grouped by declaring type:
cluster, cluster, cluster, cluster
cluster, cluster, createClusterListFromAssignmentArray, getDatapointsFromCluster, supportsWeightedData
From java.lang.Object: equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
cluster, cluster, supportsWeightedData
public MiniBatchKMeans(int batchSize, int iterations)
Creates a new Mini-Batch k-Means object that uses k-means++ for seed selection and uses the EuclideanDistance.
Parameters:
batchSize - the mini-batch size
iterations - the number of mini batches to perform

public MiniBatchKMeans(DistanceMetric dm, int batchSize, int iterations)
Creates a new Mini-Batch k-Means object that uses k-means++ for seed selection.
Parameters:
dm - the distance metric to use
batchSize - the mini-batch size
iterations - the number of mini batches to perform

public MiniBatchKMeans(DistanceMetric dm, int batchSize, int iterations, SeedSelectionMethods.SeedSelection seedSelection)
Creates a new Mini-Batch k-Means object.
Parameters:
dm - the distance metric to use
batchSize - the mini-batch size
iterations - the number of mini batches to perform
seedSelection - the seed selection algorithm to initiate clustering

public MiniBatchKMeans(MiniBatchKMeans toCopy)
Copy constructor.
Parameters:
toCopy - the object to copy

public void setStoreMeans(boolean storeMeans)
If set to true, the computed means will be stored after clustering is completed, and can then be retrieved using getMeans().
Parameters:
storeMeans - true if the means should be stored for later, false to discard them once clustering is complete.

public List<Vec> getMeans()
Returns the raw list of means that were used for each class.

public void setDistanceMetric(DistanceMetric dm)
Sets the distance metric used for determining the nearest cluster center.
Parameters:
dm - the distance metric to use

public DistanceMetric getDistanceMetric()
Returns the distance metric used for determining the nearest cluster center.
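The effect of the distance metric on "nearest cluster center" can be illustrated with a minimal, self-contained sketch. It does not use the library's DistanceMetric type: the Metric interface, the NearestCenterSketch class, and the two metrics below are hypothetical stand-ins on plain double[] arrays, chosen only to show that the same point can be assigned to different centers under different metrics.

```java
final class NearestCenterSketch {
    /** Hypothetical stand-in for a distance metric (not the library's DistanceMetric). */
    interface Metric {
        double dist(double[] a, double[] b);
    }

    /** Straight-line (L2) distance. */
    static final Metric EUCLIDEAN = (a, b) -> {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    /** Sum of absolute coordinate differences (L1). */
    static final Metric MANHATTAN = (a, b) -> {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Returns the index of the center closest to x under the given metric. */
    static int nearest(double[] x, double[][] centers, Metric metric) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double d = metric.dist(x, centers[c]);
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centers = {{2.5, 2.5}, {4, 0}};
        double[] x = {0, 0};
        // Euclidean: about 3.54 vs 4.0, so center 0 wins.
        System.out.println(nearest(x, centers, EUCLIDEAN));
        // Manhattan: 5.0 vs 4.0, so center 1 wins.
        System.out.println(nearest(x, centers, MANHATTAN));
    }
}
```

Because the metric decides every assignment, changing it can change the resulting clusters even when the data and seeds are identical.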
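public void setBatchSize(int batchSize)
Sets the batch size to use at each iteration. See also the naive k-means algorithm.
Parameters:
batchSize - the number of points to use at each iteration

public int getBatchSize()
Returns the batch size used at each iteration.

public void setIterations(int iterations)
Sets the number of mini-batch iterations to perform.
Parameters:
iterations - the number of algorithm iterations to perform

public int getIterations()
Returns the number of mini-batch iterations used.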
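How the batch size and iteration count interact can be sketched with one common formulation of mini-batch k-means: each iteration samples batchSize points, assigns each to its nearest center, then moves each center toward its points using a per-center learning rate of 1/count. This is an assumption about the general technique, not a description of this class's internals. MiniBatchSketch, fit, and nearest are hypothetical names, and plain double[] arrays with squared Euclidean distance stand in for the library's types.

```java
import java.util.Arrays;
import java.util.Random;

final class MiniBatchSketch {
    /** Index of the center nearest to x under squared Euclidean distance. */
    static int nearest(double[] x, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double sum = 0;
            for (int d = 0; d < x.length; d++) {
                double diff = x[d] - centers[c][d];
                sum += diff * diff;
            }
            if (sum < bestDist) {
                bestDist = sum;
                best = c;
            }
        }
        return best;
    }

    /** Runs `iterations` mini-batch updates of `batchSize` points each; centers are modified in place. */
    static void fit(double[][] data, double[][] centers, int batchSize, int iterations, Random rng) {
        int[] counts = new int[centers.length]; // points absorbed by each center so far
        int[] batch = new int[batchSize];       // indices of the sampled points
        int[] owner = new int[batchSize];       // cached nearest center per sampled point
        for (int it = 0; it < iterations; it++) {
            // Sample with replacement and cache assignments against the current centers.
            for (int j = 0; j < batchSize; j++) {
                batch[j] = rng.nextInt(data.length);
                owner[j] = nearest(data[batch[j]], centers);
            }
            // Move each owning center toward its point; the step shrinks as the count grows.
            for (int j = 0; j < batchSize; j++) {
                int c = owner[j];
                double eta = 1.0 / ++counts[c];
                double[] x = data[batch[j]];
                for (int d = 0; d < x.length; d++) {
                    centers[c][d] = (1 - eta) * centers[c][d] + eta * x[d];
                }
            }
        }
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {1, 0}, {1, 1}, {10, 10}, {10, 11}, {11, 10}, {11, 11}};
        double[][] centers = {{0, 0}, {10, 10}}; // illustrative seeds, one per blob
        fit(data, centers, 4, 50, new Random(42));
        System.out.println(Arrays.deepToString(centers));
    }
}
```

In this sketch the total number of point updates is batchSize times iterations, so a larger batch makes each iteration more expensive while exposing every update step to more of the data.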
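public void setSeedSelection(SeedSelectionMethods.SeedSelection seedSelection)
Sets the method of selecting the initial data points to seed the clustering algorithm.
Parameters:
seedSelection - the seed selection algorithm to use

public SeedSelectionMethods.SeedSelection getSeedSelection()
Returns the method of seed selection to use.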
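Since k-means++ is the default seed selection for several constructors, a minimal sketch of the general k-means++ idea may help: the first seed is drawn uniformly, and each later seed is drawn with probability proportional to its squared distance from the nearest seed already chosen. This is a generic illustration under stated assumptions, not the library's SeedSelectionMethods code. KMeansPlusPlusSketch, chooseSeeds, and sqDist are hypothetical names, and squared Euclidean distance on double[] arrays is assumed.

```java
import java.util.Arrays;
import java.util.Random;

final class KMeansPlusPlusSketch {
    /** Squared Euclidean distance between two points. */
    static double sqDist(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the indices of k seed points chosen from data in k-means++ fashion. */
    static int[] chooseSeeds(double[][] data, int k, Random rng) {
        int n = data.length;
        int[] seeds = new int[k];
        double[] minSq = new double[n]; // squared distance to the nearest seed so far

        seeds[0] = rng.nextInt(n); // first seed: uniform at random
        for (int i = 0; i < n; i++) {
            minSq[i] = sqDist(data[i], data[seeds[0]]);
        }

        for (int s = 1; s < k; s++) {
            double total = 0;
            for (double d : minSq) {
                total += d;
            }
            // Walk the cumulative weights until the random threshold is crossed.
            double r = rng.nextDouble() * total;
            int pick = -1;
            for (int i = 0; i < n; i++) {
                if (minSq[i] <= 0) {
                    continue; // zero weight: coincides with an existing seed
                }
                pick = i; // also serves as a fallback if rounding prevents a crossing
                r -= minSq[i];
                if (r < 0) {
                    break;
                }
            }
            if (pick < 0) {
                pick = rng.nextInt(n); // every point coincides with a seed
            }
            seeds[s] = pick;
            for (int i = 0; i < n; i++) {
                minSq[i] = Math.min(minSq[i], sqDist(data[i], data[pick]));
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {0, 1}, {5, 5}, {5, 6}, {10, 0}, {10, 1}};
        System.out.println(Arrays.toString(chooseSeeds(data, 3, new Random(7))));
    }
}
```

Weighting by squared distance makes far-away points more likely to become seeds, which tends to spread the initial centers across the data.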
public int[] cluster(DataSet dataSet, int[] designations)
Description copied from interface: Clusterer
Performs clustering on the given data set.
Parameters:
dataSet - the data set to perform clustering on
designations - the array which will contain the designated values. The array will be altered and returned by the function. If null is given, a new array will be created and returned.

public int[] cluster(DataSet dataSet, ExecutorService threadpool, int[] designations)
Description copied from interface: Clusterer
Performs clustering on the given data set.
Parameters:
dataSet - the data set to perform clustering on
threadpool - a source of threads to run tasks
designations - the array which will contain the designated values. The array will be altered and returned by the function. If null is given, a new array will be created and returned.

public int[] cluster(DataSet dataSet, int clusters, ExecutorService threadpool, int[] designations)

public int[] cluster(DataSet dataSet, int clusters, int[] designations)

public int[] cluster(DataSet dataSet, int lowK, int highK, ExecutorService threadpool, int[] designations)

public int[] cluster(DataSet dataSet, int lowK, int highK, int[] designations)
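The designations contract described for the cluster methods (the array passed in is altered and returned, and a new array is created when null is given) can be sketched as follows. This is only an illustration of that calling convention: DesignationsSketch and assign are hypothetical names, the assignment step is a plain nearest-center lookup on double[] arrays, and the sketch assumes a non-null array is at least as long as the data.

```java
import java.util.Arrays;

final class DesignationsSketch {
    /**
     * Writes the index of the nearest center for each point into designations.
     * Reuses the caller's array when one is given; allocates a new one when null.
     */
    static int[] assign(double[][] data, double[][] centers, int[] designations) {
        if (designations == null) {
            designations = new int[data.length];
        }
        for (int i = 0; i < data.length; i++) {
            int best = 0;
            double bestDist = Double.POSITIVE_INFINITY;
            for (int c = 0; c < centers.length; c++) {
                double sum = 0;
                for (int d = 0; d < data[i].length; d++) {
                    double diff = data[i][d] - centers[c][d];
                    sum += diff * diff;
                }
                if (sum < bestDist) {
                    bestDist = sum;
                    best = c;
                }
            }
            designations[i] = best;
        }
        return designations;
    }

    public static void main(String[] args) {
        double[][] data = {{0}, {1}, {9}, {10}};
        double[][] centers = {{0.5}, {9.5}};

        int[] reused = new int[data.length];
        int[] result = assign(data, centers, reused);
        System.out.println(result == reused);        // true: the same array comes back
        System.out.println(Arrays.toString(result)); // [0, 0, 1, 1]

        int[] fresh = assign(data, centers, null);   // null: a new array is allocated
        System.out.println(Arrays.toString(fresh));  // [0, 0, 1, 1]
    }
}
```

Passing an existing array avoids a fresh allocation on repeated calls, while passing null is the convenient form for one-off use.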
public MiniBatchKMeans clone()
clone in interface Clusterer
clone in interface KClusterer
clone in class KClustererBase
Copyright © 2017. All rights reserved.