public class Bagging extends Object implements Classifier, Regressor, Parameterized
Bagging (bootstrap aggregating) trains several copies of a base learner, each on a bootstrap sample drawn with replacement from the training set, and combines their outputs by voting or averaging. It mainly reduces variance, so it works best with unstable, high-variance base learners and gains little from stable ones. A good example of a learner to bag is Decision Trees, because they fit this profile of strengths and weaknesses. NearestNeighbour is an example of a particularly bad method to bag.
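The sketch below shows the typical classification workflow with this class: wrap a base learner, train, then classify. It is only illustrative; the package locations (jsat.classifiers.boosting.Bagging, jsat.classifiers.trees.DecisionTree) and the surrounding example class are assumptions, not part of this class's documented API.

```java
// Minimal sketch: bagging decision trees for classification.
// Package locations below are assumptions; adjust them to your JSAT version.
import jsat.classifiers.CategoricalResults;
import jsat.classifiers.ClassificationDataSet;
import jsat.classifiers.DataPoint;
import jsat.classifiers.boosting.Bagging;
import jsat.classifiers.trees.DecisionTree;

public class BaggingClassificationExample
{
    public static int classifyFirst(ClassificationDataSet train)
    {
        // Uses DEFAULT_ROUNDS (20) base learners and DEFAULT_EXTRA_SAMPLES (0).
        Bagging bagger = new Bagging(new DecisionTree());
        bagger.trainC(train); // each learner is fit on its own bootstrap sample

        DataPoint query = train.getDataPoint(0);
        CategoricalResults result = bagger.classify(query); // combined vote of the trees
        return result.mostLikely(); // index of the most probable class
    }
}
```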
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_EXTRA_SAMPLES - The number of extra samples to take when bagging in each round, used by default in the constructor: 0 |
static int | DEFAULT_ROUNDS - The number of rounds of bagging that will be used by default in the constructor: 20 |
static boolean | DEFAULT_SIMULTANIOUS_TRAINING - The default behavior for parallel training, as specified by setSimultaniousTraining(boolean), is true |
Constructor and Description |
---|
Bagging(Classifier baseClassifier) - Creates a new Bagger for classification. |
Bagging(Classifier baseClassifier, int extraSamples, boolean simultaniousTraining) - Creates a new Bagger for classification. |
Bagging(Classifier baseClassifier, int extraSamples, boolean simultaniousTraining, int rounds, Random random) - Creates a new Bagger for classification. |
Bagging(Regressor baseRegressor) - Creates a new Bagger for regression. |
Bagging(Regressor baseRegressor, int extraSamples, boolean simultaniousTraining) - Creates a new Bagger for regression. |
Bagging(Regressor baseRegressor, int extraSamples, boolean simultaniousTraining, int rounds, Random random) - Creates a new Bagger for regression. |
Modifier and Type | Method and Description |
---|---|
CategoricalResults | classify(DataPoint data) - Performs classification on the given data point. |
Bagging | clone() |
int | getExtraSamples() |
Parameter | getParameter(String paramName) - Returns the parameter with the given name. |
List<Parameter> | getParameters() - Returns the list of parameters that can be altered for this learner. |
int | getRounds() - Returns the number of rounds of bagging that will be done, which is also the number of base learners that will be trained. |
static ClassificationDataSet | getSampledDataSet(ClassificationDataSet dataSet, int[] sampledCounts) - Creates a new data set from the given sample counts. |
static RegressionDataSet | getSampledDataSet(RegressionDataSet dataSet, int[] sampledCounts) - Creates a new data set from the given sample counts. |
static ClassificationDataSet | getWeightSampledDataSet(ClassificationDataSet dataSet, int[] sampledCounts) - Creates a new data set from the given sample counts. |
static RegressionDataSet | getWeightSampledDataSet(RegressionDataSet dataSet, int[] sampledCounts) - Creates a new data set from the given sample counts. |
double | regress(DataPoint data) |
static void | sampleWithReplacement(int[] sampleCounts, int samples, Random rand) - Performs the sampling based on the number of data points, storing the counts in an array that can then be used to construct a sampled data set. |
void | setExtraSamples(int i) - Bagging samples from the training set with replacement, and draws a sample at least as large as the training set; this sets how many extra samples are taken. |
void | setRounds(int rounds) - Sets the number of rounds of bagging that are done, meaning how many base learners are trained. |
void | setSimultaniousTraining(boolean simultaniousTraining) - Bagging produces multiple base learners; this sets whether they are trained at the same time or sequentially. |
boolean | supportsWeightedData() - Indicates whether the model knows how to train using weighted data points. |
void | train(RegressionDataSet dataSet) |
void | train(RegressionDataSet dataSet, ExecutorService threadPool) |
void | trainC(ClassificationDataSet dataSet) - Trains the classifier and constructs a model for classification using the given data set. |
void | trainC(ClassificationDataSet dataSet, ExecutorService threadPool) - Trains the classifier and constructs a model for classification using the given data set. |
Field Detail

public static final int DEFAULT_ROUNDS
The number of rounds of bagging that will be used by default in the constructor: 20

public static final int DEFAULT_EXTRA_SAMPLES
The number of extra samples to take when bagging in each round, used by default in the constructor: 0

public static final boolean DEFAULT_SIMULTANIOUS_TRAINING
The default behavior for parallel training, as specified by setSimultaniousTraining(boolean), is true

Constructor Detail

public Bagging(Classifier baseClassifier)
Creates a new Bagger for classification.
Parameters:
baseClassifier - the base learner to use.

public Bagging(Classifier baseClassifier, int extraSamples, boolean simultaniousTraining)
Creates a new Bagger for classification.
Parameters:
baseClassifier - the base learner to use.
extraSamples - how many extra samples past the training size to take
simultaniousTraining - controls whether base learners are trained sequentially or simultaneously

public Bagging(Classifier baseClassifier, int extraSamples, boolean simultaniousTraining, int rounds, Random random)
Creates a new Bagger for classification.
Parameters:
baseClassifier - the base learner to use.
extraSamples - how many extra samples past the training size to take
simultaniousTraining - controls whether base learners are trained sequentially or simultaneously
rounds - how many rounds of bagging to perform.
random - the source of randomness for sampling

public Bagging(Regressor baseRegressor)
Creates a new Bagger for regression.
Parameters:
baseRegressor - the base learner to use.

public Bagging(Regressor baseRegressor, int extraSamples, boolean simultaniousTraining)
Creates a new Bagger for regression.
Parameters:
baseRegressor - the base learner to use.
extraSamples - how many extra samples past the training size to take
simultaniousTraining - controls whether base learners are trained sequentially or simultaneously

public Bagging(Regressor baseRegressor, int extraSamples, boolean simultaniousTraining, int rounds, Random random)
Creates a new Bagger for regression.
Parameters:
baseRegressor - the base learner to use.
extraSamples - how many extra samples past the training size to take
simultaniousTraining - controls whether base learners are trained sequentially or simultaneously
rounds - how many rounds of bagging to perform.
random - the source of randomness for sampling
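As a usage sketch of the fully-specified classification constructor (the Regressor overloads mirror it), the following sets every sampling and training option explicitly. The DecisionTree base learner and the package locations are assumptions for illustration.

```java
// Sketch of the five-argument classification constructor.
// Package locations and the DecisionTree base learner are assumptions.
import java.util.Random;
import jsat.classifiers.boosting.Bagging;
import jsat.classifiers.trees.DecisionTree;

public class BaggingConstructorExample
{
    public static Bagging build()
    {
        return new Bagging(new DecisionTree(), // base learner to use
                           100,                // extraSamples: each bootstrap draw is 100 points larger than the training set
                           false,              // simultaniousTraining: train base learners sequentially
                           40,                 // rounds: 40 base learners
                           new Random(42));    // fixed seed for reproducible sampling
    }
}
```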
Method Detail

public void setExtraSamples(int i)
Bagging samples from the training set with replacement, and draws a sample at least as large as the training set; this sets how many extra samples are taken.
Parameters:
i - how many extra samples to take

public int getExtraSamples()

public void setRounds(int rounds)
Sets the number of rounds of bagging that are done, meaning how many base learners are trained.
Parameters:
rounds - the number of base learners to train
Throws:
ArithmeticException - if the number specified is not a positive value

public int getRounds()
Returns the number of rounds of bagging that will be done, which is also the number of base learners that will be trained.

public void setSimultaniousTraining(boolean simultaniousTraining)
Bagging produces multiple base learners; this sets whether they are trained at the same time or sequentially.
Parameters:
simultaniousTraining - true to train all learners at the same time, false to train them sequentially
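The setters above can also reconfigure a bagger after construction. A short sketch, assuming the same package locations and DecisionTree base learner as in the earlier sketches:

```java
// Reconfiguring a Bagging instance through its setters.
import jsat.classifiers.boosting.Bagging;   // package location assumed
import jsat.classifiers.trees.DecisionTree; // base learner assumed

public class BaggingSetterExample
{
    public static Bagging configure()
    {
        Bagging bagger = new Bagging(new DecisionTree());
        bagger.setRounds(50);                 // 50 base learners; a non-positive value throws ArithmeticException
        bagger.setExtraSamples(10);           // draw 10 samples beyond the training set size each round
        bagger.setSimultaniousTraining(true); // train the learners at the same time rather than sequentially
        return bagger;
    }
}
```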
public CategoricalResults classify(DataPoint data)
Performs classification on the given data point.
Specified by:
classify in interface Classifier
Parameters:
data - the data point to classify

public void trainC(ClassificationDataSet dataSet, ExecutorService threadPool)
Trains the classifier and constructs a model for classification using the given data set.
Specified by:
trainC in interface Classifier
Parameters:
dataSet - the data set to train on
threadPool - the source of threads to use.

public void trainC(ClassificationDataSet dataSet)
Trains the classifier and constructs a model for classification using the given data set.
Specified by:
trainC in interface Classifier
Parameters:
dataSet - the data set to train on
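The ExecutorService overload lets training draw threads from a caller-supplied pool. A minimal sketch, assuming the JSAT package locations used in the earlier sketches:

```java
// Training a bagger with a caller-supplied thread pool.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import jsat.classifiers.ClassificationDataSet;
import jsat.classifiers.boosting.Bagging;   // package location assumed
import jsat.classifiers.trees.DecisionTree; // base learner assumed

public class ParallelBaggingExample
{
    public static Bagging train(ClassificationDataSet train)
    {
        ExecutorService threadPool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try
        {
            Bagging bagger = new Bagging(new DecisionTree());
            bagger.trainC(train, threadPool); // threads from the pool are used during training
            return bagger;
        }
        finally
        {
            threadPool.shutdown(); // the caller owns the pool, so shut it down when done
        }
    }
}
```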
public static ClassificationDataSet getSampledDataSet(ClassificationDataSet dataSet, int[] sampledCounts)
Creates a new data set from the given sample counts.
Parameters:
dataSet - the data set that was sampled from
sampledCounts - the sampling values obtained from sampleWithReplacement(int[], int, java.util.Random)

public static ClassificationDataSet getWeightSampledDataSet(ClassificationDataSet dataSet, int[] sampledCounts)
Creates a new data set from the given sample counts.
Parameters:
dataSet - the data set that was sampled from
sampledCounts - the sampling values obtained from sampleWithReplacement(int[], int, java.util.Random)

public static RegressionDataSet getSampledDataSet(RegressionDataSet dataSet, int[] sampledCounts)
Creates a new data set from the given sample counts.
Parameters:
dataSet - the data set that was sampled from
sampledCounts - the sampling values obtained from sampleWithReplacement(int[], int, java.util.Random)

public static RegressionDataSet getWeightSampledDataSet(RegressionDataSet dataSet, int[] sampledCounts)
Creates a new data set from the given sample counts.
Parameters:
dataSet - the data set that was sampled from
sampledCounts - the sampling values obtained from sampleWithReplacement(int[], int, java.util.Random)

public static void sampleWithReplacement(int[] sampleCounts, int samples, Random rand)
Performs the sampling based on the number of data points, storing the counts in an array that can then be used to construct a sampled data set.
Parameters:
sampleCounts - an array to keep count of how many times each data point was sampled. The array will be filled with zeros before sampling starts
samples - the number of samples to take from the data set
rand - the source of randomness
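Together, sampleWithReplacement and getSampledDataSet reproduce the bootstrap step that bagging performs internally: fill a per-point count array, then materialize the sampled data set from it. A sketch, assuming getSampleSize() returns the number of data points and the package location used in the earlier sketches:

```java
// Building one bootstrap sample with the static helpers.
import java.util.Random;
import jsat.classifiers.ClassificationDataSet;
import jsat.classifiers.boosting.Bagging; // package location assumed

public class BootstrapSampleExample
{
    public static ClassificationDataSet bootstrap(ClassificationDataSet dataSet, Random rand)
    {
        // One count per data point; sampleWithReplacement zeroes the array before sampling.
        int[] sampledCounts = new int[dataSet.getSampleSize()]; // getSampleSize() assumed
        // Draw as many samples as there are data points (a standard bootstrap sample).
        Bagging.sampleWithReplacement(sampledCounts, dataSet.getSampleSize(), rand);
        // Materialize the bootstrap sample described by the counts.
        return Bagging.getSampledDataSet(dataSet, sampledCounts);
    }
}
```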
public boolean supportsWeightedData()
Indicates whether the model knows how to train using weighted data points.
Specified by:
supportsWeightedData in interface Classifier
Specified by:
supportsWeightedData in interface Regressor

public void train(RegressionDataSet dataSet, ExecutorService threadPool)

public void train(RegressionDataSet dataSet)

public Bagging clone()

public List<Parameter> getParameters()
Returns the list of parameters that can be altered for this learner.
Specified by:
getParameters in interface Parameterized

public Parameter getParameter(String paramName)
Returns the parameter with the given name.
Specified by:
getParameter in interface Parameterized
Parameters:
paramName - the name of the parameter to obtain