Package | Description |
---|---|
jsat.text | |
jsat.text.wordweighting | |
Constructor and Description |
---|
BasicTextVectorCreator(Tokenizer tokenizer,
Map<String,Integer> wordIndex,
WordWeighting weighting)
Creates a new basic text vector creator
|
ClassificationHashedTextDataLoader(int dimensionSize,
Tokenizer tokenizer,
WordWeighting weighting)
Creates a new hashed text data loader for classification problems.
|
ClassificationHashedTextDataLoader(Tokenizer tokenizer,
WordWeighting weighting)
Creates a new hashed text data loader for classification problems. It
uses a relatively large default size of 2^22 for the dimension
of the space.
|
ClassificationTextDataLoader(Tokenizer tokenizer,
WordWeighting weighting)
Creates a new text data loader
|
HashedTextDataLoader(int dimensionSize,
Tokenizer tokenizer,
WordWeighting weighting) |
HashedTextDataLoader(Tokenizer tokenizer,
WordWeighting weighting) |
HashedTextVectorCreator(int dimensionSize,
Tokenizer tokenizer,
WordWeighting weighting)
Creates a new text vector creator that works with hash-trick features
|
TextDataLoader(Tokenizer tokenizer,
WordWeighting weighting)
Creates a new loader for text datasets
|
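The hashed constructors above take a `dimensionSize` instead of a word index. As a rough illustration of the hash-trick idea behind them, here is a minimal, self-contained sketch. It is not JSAT's implementation: the class name `HashedBagOfWordsSketch`, the method `hashedCounts`, the use of `String.hashCode()`, and the dense `double[]` output are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of JSAT.
class HashedBagOfWordsSketch {

    // Maps each token to an index in [0, dimensionSize) using its hash code
    // and accumulates raw counts. Distinct tokens may collide on one index.
    static double[] hashedCounts(List<String> tokens, int dimensionSize) {
        double[] vec = new double[dimensionSize];
        for (String token : tokens) {
            // floorMod keeps the index non-negative even for negative hash codes
            int index = Math.floorMod(token.hashCode(), dimensionSize);
            vec[index] += 1.0;
        }
        return vec;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "cat", "saw", "the", "dog");
        double[] vec = hashedCounts(tokens, 16);
        System.out.println(Arrays.toString(vec));
    }
}
```

No vocabulary has to be built before vectorizing, which is the main appeal of hashing. The cost is that collisions merge unrelated tokens, so a large dimension is normally chosen, and at such sizes a sparse vector is a better fit than the dense array used here.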
Modifier and Type | Class and Description |
---|---|
class |
BinaryWordPresent
Provides a simple binary representation of bag-of-words vectors: a value is
1.0 if the token is present and 0.0 if it is not.
|
class |
OkapiBM25
Implements the Okapi BM25 word weighting scheme.
|
class |
TfIdf
Applies Term Frequency Inverse Document Frequency (TF-IDF) weighting to the
word vectors.
|
class |
WordCount
Provides a simple representation of bag-of-words vectors that uses the
number of occurrences of a word in a document as that word's weight.
|
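To make the difference between these weighting schemes concrete, the sketch below computes word-count, binary-presence, and TF-IDF weights for a tokenized document. It is a standalone illustration and does not use or reproduce the JSAT classes: `WordWeightingSketch` and its methods are hypothetical names, and the TF-IDF variant shown (raw count times the natural log of N / df) is only one common formulation, which may differ from what `TfIdf` does by default. Okapi BM25 is omitted for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of JSAT.
class WordWeightingSketch {

    // Word count: weight = number of occurrences of the token in the document.
    static Map<String, Double> wordCount(List<String> doc) {
        Map<String, Double> counts = new HashMap<>();
        for (String token : doc) {
            counts.merge(token, 1.0, Double::sum);
        }
        return counts;
    }

    // Binary presence: 1.0 for each token that occurs; absent tokens are
    // simply missing from the map, which stands for an implicit 0.0.
    static Map<String, Double> binaryPresent(List<String> doc) {
        Map<String, Double> weights = new HashMap<>();
        for (String token : doc) {
            weights.put(token, 1.0);
        }
        return weights;
    }

    // TF-IDF (assumed variant): tf * ln(N / df), where N is the number of
    // documents in the corpus and df is how many of them contain the token.
    static Map<String, Double> tfIdf(List<String> doc, List<List<String>> corpus) {
        Map<String, Double> weights = wordCount(doc);
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            int df = 0;
            for (List<String> other : corpus) {
                if (other.contains(e.getKey())) {
                    df++;
                }
            }
            // Guard against df == 0 when doc itself is not part of the corpus.
            double idf = Math.log((double) corpus.size() / Math.max(df, 1));
            e.setValue(e.getValue() * idf);
        }
        return weights;
    }

    public static void main(String[] args) {
        List<String> d1 = Arrays.asList("a", "b", "a");
        List<String> d2 = Arrays.asList("b", "c");
        List<List<String>> corpus = Arrays.asList(d1, d2);
        System.out.println(wordCount(d1));
        System.out.println(binaryPresent(d1));
        System.out.println(tfIdf(d1, corpus));
    }
}
```

In this toy corpus the token `b` appears in every document, so its TF-IDF weight is zero, while `a` keeps a positive weight because it is specific to the first document. The count and binary schemes cannot make that distinction, since they look at one document at a time.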
Copyright © 2017. All rights reserved.