public class BasicTextVectorCreator extends Object implements TextVectorCreator
TextDataLoader, though it can be used directly if you know all the words you will need (and can initialize the WordWeighting) before creating this object.

Constructor and Description |
---|
BasicTextVectorCreator(Tokenizer tokenizer, Map<String,Integer> wordIndex, WordWeighting weighting) Creates a new basic text vector creator. |
Modifier and Type | Method and Description |
---|---|
Vec | newText(String text) Converts the given input text into a vector representation. |
Vec | newText(String input, StringBuilder workSpace, List<String> storageSpace) Converts the given input text into a vector representation. |
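
The relationship between the word index, the vector length, and the two `newText` overloads summarized above can be illustrated with a small self-contained sketch. It does not use the library's classes: `WordIndexSketch` is a hypothetical stand-in, a plain `double[]` replaces `Vec`, a simple lower-casing splitter replaces the `Tokenizer`, and raw term counts replace the `WordWeighting` step. Skipping words that are absent from the index is also an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for illustration only; not the library's implementation. */
final class WordIndexSketch {
    private final Map<String, Integer> wordIndex;

    WordIndexSketch(Map<String, Integer> wordIndex) {
        this.wordIndex = wordIndex;
    }

    /** Convenience overload: allocates fresh scratch space for a single call. */
    double[] newText(String text) {
        return newText(text, new StringBuilder(), new ArrayList<>());
    }

    /** Overload that reuses caller-supplied (already empty) scratch space. */
    double[] newText(String input, StringBuilder workSpace, List<String> storageSpace) {
        tokenize(input, workSpace, storageSpace);

        // The map's size is the exclusive upper bound on indices,
        // so it doubles as the length of the output vector.
        double[] vec = new double[wordIndex.size()];
        for (String token : storageSpace) {
            Integer index = wordIndex.get(token);
            if (index != null) { // assumption: unknown words are skipped
                vec[index] += 1.0; // raw count; a real weighting would be applied here
            }
        }

        // Leave the scratch space empty so the caller can pass it in again.
        workSpace.setLength(0);
        storageSpace.clear();
        return vec;
    }

    /** Assumed tokenizer: lower-cases and splits on anything that is not a letter or digit. */
    private static void tokenize(String input, StringBuilder workSpace, List<String> out) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                workSpace.append(Character.toLowerCase(c));
            } else if (workSpace.length() > 0) {
                out.add(workSpace.toString());
                workSpace.setLength(0);
            }
        }
        if (workSpace.length() > 0) { // flush the final token
            out.add(workSpace.toString());
            workSpace.setLength(0);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> wordIndex = new LinkedHashMap<>();
        wordIndex.put("the", 0);
        wordIndex.put("cat", 1);
        wordIndex.put("sat", 2);

        WordIndexSketch creator = new WordIndexSketch(wordIndex);
        double[] vec = creator.newText("The cat sat; the dog did not.");
        System.out.println(Arrays.toString(vec)); // prints [2.0, 1.0, 1.0]
    }
}
```

When many documents are converted in a loop, calling the three-argument overload with one shared `StringBuilder` and `List<String>` avoids allocating new scratch objects per document, which is presumably the motivation for that overload.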
public BasicTextVectorCreator(Tokenizer tokenizer, Map<String,Integer> wordIndex, WordWeighting weighting)
tokenizer - the tokenizer to apply to incoming strings
wordIndex - the map of each known word to its index, the size of the map indicating the maximum (exclusive) index
weighting - the weighting process to apply to each loaded document. This should have already been initialized, or be stateless.

public Vec newText(String text)
Specified by: newText in interface TextVectorCreator
text - the input string

public Vec newText(String input, StringBuilder workSpace, List<String> storageSpace)
Specified by: newText in interface TextVectorCreator
input - the input string
workSpace - an already allocated (but empty) string builder that can be used as a temporary work space.
storageSpace - an already allocated (but empty) list to place the tokens into

Copyright © 2017. All rights reserved.