public class TfIdf extends WordWeighting
Modifier and Type | Class and Description |
---|---|
static class | TfIdf.TermFrequencyWeight |
Constructor and Description |
---|
TfIdf(): Creates a new TF-IDF document weighting scheme that uses LOG weighting for term frequency. |
TfIdf(TfIdf.TermFrequencyWeight tfWeighting): Creates a new TF-IDF document weighting scheme that uses the specified term frequency weighting. |
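
The relationship between the two constructors can be illustrated with a small self-contained sketch. It does not use the library: TfWeightSketch and TfIdfCtorSketch are hypothetical stand-ins, the RAW option is invented for contrast, and the formula attached to LOG is an assumed textbook form rather than the library's confirmed definition.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical stand-in for TfIdf.TermFrequencyWeight; only LOG is named in this documentation. */
enum TfWeightSketch {
    // Assumed extra option for contrast: use the raw count unchanged.
    RAW(count -> count),
    // Assumed textbook log scaling: 1 + ln(count) for positive counts, 0 otherwise.
    LOG(count -> count > 0 ? 1.0 + Math.log(count) : 0.0);

    private final DoubleUnaryOperator scale;

    TfWeightSketch(DoubleUnaryOperator scale) {
        this.scale = scale;
    }

    double apply(double count) {
        return scale.applyAsDouble(count);
    }
}

/** Hypothetical stand-in for TfIdf, showing only the constructor pattern. */
class TfIdfCtorSketch {
    final TfWeightSketch tfWeighting;

    // No-argument constructor: falls back to LOG term frequency weighting.
    TfIdfCtorSketch() {
        this(TfWeightSketch.LOG);
    }

    // Explicit constructor: the caller picks the term frequency weighting.
    TfIdfCtorSketch(TfWeightSketch tfWeighting) {
        this.tfWeighting = tfWeighting;
    }
}
```

Under these assumptions, new TfIdfCtorSketch() and new TfIdfCtorSketch(TfWeightSketch.LOG) behave identically, which mirrors how the no-argument constructor is described above.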
Modifier and Type | Method and Description |
---|---|
void | applyTo(Vec vec): The implementation may want to precompute some values based on the vector it is about to be applied to. |
double | indexFunc(double value, int index): An index function, meant to be applied to vectors where the value to be computed may vary based on the position of the value in the vector. |
void | setWeight(List<? extends Vec> allDocuments, List<Integer> df): Prepares the word weighting to be performed on a data set. |
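
The three methods summarized above suggest a call order: setWeight once per data set, then applyTo per document vector, which in turn evaluates indexFunc at each index. The following is a minimal sketch of that flow using plain arrays instead of Vec. TfIdfFlowSketch is a hypothetical class, the formula (1 + ln(count)) * ln(N / df) is an assumed textbook form of log-weighted TF-IDF, and each df entry is assumed to be the number of documents containing the word; none of this is confirmed as the library's internal computation.

```java
import java.util.List;

/** Hypothetical, self-contained illustration of the setWeight / applyTo / indexFunc flow. */
class TfIdfFlowSketch {
    private int totalDocuments;
    private List<Integer> df; // assumed: documents containing the word at each index

    // Step 1: record the corpus-level statistics for the data set.
    void setWeight(List<double[]> allDocuments, List<Integer> df) {
        this.totalDocuments = allDocuments.size();
        this.df = df;
    }

    // Step 3: the weight depends on both the count and the position in the vector.
    double indexFunc(double value, int index) {
        if (value <= 0 || df.get(index) <= 0) {
            return 0.0; // absent words keep a weight of zero
        }
        double tf = 1.0 + Math.log(value);                              // assumed LOG weighting
        double idf = Math.log((double) totalDocuments / df.get(index)); // assumed IDF form
        return tf * idf;
    }

    // Step 2: reweight one bag-of-words vector in place.
    void applyTo(double[] vec) {
        if (df == null) {
            throw new IllegalStateException("setWeight must be called before applyTo");
        }
        for (int i = 0; i < vec.length; i++) {
            vec[i] = indexFunc(vec[i], i);
        }
    }
}
```

In this sketch a word that appears in every document receives a weight of zero, because ln(N / df) is zero when df equals N.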
Methods inherited from class IndexFunction: f, f
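
public TfIdf()
Creates a new TF-IDF document weighting scheme that uses LOG weighting for term frequency.

public TfIdf(TfIdf.TermFrequencyWeight tfWeighting)
Creates a new TF-IDF document weighting scheme that uses the specified term frequency weighting.
Parameters:
tfWeighting - the weighting method to use for the term frequency (tf) component

public void setWeight(List<? extends Vec> allDocuments, List<Integer> df)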
Description copied from class: WordWeighting
Prepares the word weighting to be performed on a data set.
setWeight in class WordWeighting
Parameters:
allDocuments - the list of all vectors that make up the set of documents. The word vectors should be unmodified, containing the value of how many times a word appeared in the document for each index.
df - a list mapping each integer index of a word to how many times that word occurred in total

public double indexFunc(double value, int index)
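Description copied from class: IndexFunction
An index function, meant to be applied to vectors where the value to be computed may vary based on the position of the value in the vector.
indexFunc in class IndexFunction
Parameters:
value - the value at the specified index
index - the index the value is from

public void applyTo(Vec vec)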
Description copied from class: WordWeighting
The implementation may want to precompute some values based on the vector it is about to be applied to, which is then altered by invoking Vec.applyIndexFunction(jsat.math.IndexFunction). The vector should be in a bag-of-words form where each index value indicates how many times the word for that index occurred in the document represented by the vector.
applyTo in class WordWeighting
Parameters:
vec - the vector to set up for and then alter by invoking Vec.applyIndexFunction(jsat.math.IndexFunction) on

Copyright © 2017. All rights reserved.