public class TokenFeatureExtractor extends Object implements FeatureExtractor<CharSequence>, Serializable
A TokenFeatureExtractor produces feature vectors of token counts from character sequences.
The token feature extractors implement the Serializable
interface. A token feature extractor will actually be serializable
if the underlying tokenizer factory is serializable, either by
implementing the Serializable interface or the
Compilable interface. If it is not, attempting to serialize the
feature extractor will throw an exception.
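The serialization rule above can be illustrated with plain Java serialization. The following is a minimal sketch, not LingPipe code: Factory, PlainFactory, SerializableFactory, Extractor, and canSerialize are hypothetical stand-ins, and the Compilable route is not modeled. The documentation above only promises "an exception"; the sketch assumes the standard NotSerializableException that Java raises for a non-serializable field.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Stand-in sketch (not LingPipe classes): a Serializable wrapper is only
// serializable in practice if the object it holds is serializable too.
class SerializationSketch {

    // Hypothetical stand-in for a tokenizer factory.
    interface Factory { }

    // A factory that does NOT implement Serializable.
    static class PlainFactory implements Factory { }

    // A factory that implements Serializable.
    static class SerializableFactory implements Factory, Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Hypothetical stand-in for the extractor: declared Serializable, holds a factory.
    static class Extractor implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Factory factory;
        Extractor(Factory factory) { this.factory = factory; }
    }

    // Returns true if the object can be written with standard Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false; // a field in the object graph is not serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new Extractor(new SerializableFactory()))); // true
        System.out.println(canSerialize(new Extractor(new PlainFactory())));        // false
    }
}
```

In practice this means the serializability of the tokenizer factory should be checked before relying on serializing the extractor that wraps it.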
|Constructor and Description|
|TokenFeatureExtractor(TokenizerFactory factory): Construct a token-based feature extractor from the specified tokenizer factory.|

|Modifier and Type|Method and Description|
|Map<String,Counter>|features(CharSequence in): Return the feature vector for the specified character sequence.|
|TokenizerFactory|tokenizerFactory(): Return the tokenizer factory underlying this token feature extractor.|
|String|toString(): Returns a description of this token feature extractor including its contained tokenizer factory.|
public TokenFeatureExtractor(TokenizerFactory factory)
factory - Tokenizer factory to use for tokenization.
public TokenizerFactory tokenizerFactory()
Warning: This is the actual tokenizer factory, not a copy, so any changes made to it will affect this feature extractor.
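The warning above describes ordinary reference sharing. As a hedged illustration, the sketch below uses hypothetical stand-in classes (StopListFactory and Extractor are not LingPipe types, and the mutable stop list is an assumption made only to have something to change) to show how mutating the returned factory changes later results.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch (not LingPipe classes): an accessor that returns the
// actual factory instance lets callers change the extractor's behavior.
class SharedFactorySketch {

    // Hypothetical mutable factory: splits on whitespace and drops stop words.
    static class StopListFactory {
        final List<String> stopWords = new ArrayList<>();

        List<String> tokenize(CharSequence in) {
            List<String> tokens = new ArrayList<>();
            for (String t : in.toString().trim().split("\\s+")) {
                if (!t.isEmpty() && !stopWords.contains(t)) {
                    tokens.add(t);
                }
            }
            return tokens;
        }
    }

    // Hypothetical extractor that exposes its factory without copying it.
    static class Extractor {
        private final StopListFactory factory;

        Extractor(StopListFactory factory) { this.factory = factory; }

        // Returns the actual instance, not a copy.
        StopListFactory tokenizerFactory() { return factory; }

        List<String> tokens(CharSequence in) { return factory.tokenize(in); }
    }

    public static void main(String[] args) {
        Extractor extractor = new Extractor(new StopListFactory());
        System.out.println(extractor.tokens("the cat sat")); // [the, cat, sat]

        // Mutating the returned factory changes what the extractor produces.
        extractor.tokenizerFactory().stopWords.add("the");
        System.out.println(extractor.tokens("the cat sat")); // [cat, sat]
    }
}
```

If the factory in use is immutable, the aliasing is harmless; the warning matters only for factories with mutable state.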
public Map<String,Counter> features(CharSequence in)
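To make the shape of a token-count feature vector concrete, here is a minimal self-contained sketch. It is not the LingPipe implementation: it assumes a whitespace tokenizer in place of a TokenizerFactory and uses Integer values in place of Counter, and the class name TokenCountSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch (not LingPipe code): maps each token to the number of
// times it occurs in the input character sequence.
class TokenCountSketch {

    // Assumed tokenization: split on runs of whitespace.
    static Map<String, Integer> features(CharSequence in) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : in.toString().trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // blank input splits into a single empty string
            }
            counts.merge(token, 1, Integer::sum); // increment the token's count
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {to=2, be=2, or=1, not=1}
        System.out.println(features("to be or not to be"));
    }
}
```

Each distinct token becomes a feature name and its count becomes the feature value, which is the structure the Map<String,Counter> return type suggests.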
public String toString()
The description includes the result of the toString() method of the contained tokenizer factory.