public interface TagChunkCodec
A TagChunkCodec provides a means of coding chunkings as taggings and decoding (string) taggings back to chunkings.
Each codec provides a method
tagSet(Set) that returns the
complete set of tags used in the coding, given a set of chunk types.
Codecs also provide a variable-argument method
legalTags(String...) to determine whether a sequence of tags is legal.
For a known set of chunk types, the followers of a tag can be
constructed by iterating over the set of tags returned by
tagSet(Set) and checking whether each one may legally follow it.
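The follower construction described above can be sketched as follows. This is a minimal, self-contained illustration rather than LingPipe code: the BIO-style tag names (`O`, `B_X`, `I_X`), the `legalPair` rule, and the `FollowerSketch` class are all hypothetical stand-ins for a real codec's tag set and legality check.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class FollowerSketch {

    // Hypothetical BIO-style tag set: "O" plus "B_<type>" and
    // "I_<type>" for every chunk type.
    static Set<String> tagSet(Set<String> chunkTypes) {
        Set<String> tags = new TreeSet<>();
        tags.add("O");
        for (String type : chunkTypes) {
            tags.add("B_" + type);
            tags.add("I_" + type);
        }
        return tags;
    }

    // Hypothetical pairwise rule: "I_X" may only follow "B_X" or
    // "I_X"; every other transition is allowed.
    static boolean legalPair(String prev, String next) {
        if (!next.startsWith("I_")) {
            return true;
        }
        String type = next.substring(2);
        return prev.equals("B_" + type) || prev.equals("I_" + type);
    }

    // For each tag, collect the tags that may legally follow it by
    // iterating over the full tag set and testing each candidate.
    static Map<String, Set<String>> followers(Set<String> chunkTypes) {
        Set<String> tags = tagSet(chunkTypes);
        Map<String, Set<String>> result = new TreeMap<>();
        for (String prev : tags) {
            Set<String> next = new TreeSet<>();
            for (String candidate : tags) {
                if (legalPair(prev, candidate)) {
                    next.add(candidate);
                }
            }
            result.put(prev, next);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(followers(Set.of("PER")));
    }
}
```

With an actual codec, the two-tag test would be delegated to the codec's own legality methods instead of the hard-coded rule used here.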
To validate whether a chunking may be successfully encoded as a
tagging and then decoded to the original chunking, use the method
isEncodable(Chunking). To validate whether a string
tagging may be successfully decoded to a chunking and then
re-encoded to the original string tagging, use
isDecodable(StringTagging).
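The round-trip idea behind the encodability check can be illustrated with a small sketch. Everything here is an assumption made for illustration: the token-level `Chunk` class, the BIO-style tags, the lenient `decode`, and the `RoundTripSketch` class are hypothetical and do not reproduce any particular codec's implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

final class RoundTripSketch {

    // Hypothetical token-level chunk covering tokens [start, end).
    static final class Chunk {
        final int start;
        final int end;
        final String type;

        Chunk(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Chunk)) {
                return false;
            }
            Chunk c = (Chunk) o;
            return start == c.start && end == c.end && type.equals(c.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    // Encode chunks as one BIO-style tag per token. Overlapping
    // chunks overwrite each other, which loses information.
    static String[] encode(Set<Chunk> chunks, int numTokens) {
        String[] tags = new String[numTokens];
        Arrays.fill(tags, "O");
        for (Chunk c : chunks) {
            for (int i = c.start; i < c.end; i++) {
                tags[i] = (i == c.start ? "B_" : "I_") + c.type;
            }
        }
        return tags;
    }

    // Leniently decode tags back to chunks; a stray "I_X" that does
    // not continue an open chunk of type X simply starts a new chunk.
    static Set<Chunk> decode(String[] tags) {
        Set<Chunk> chunks = new HashSet<>();
        int start = -1;
        String type = null;
        for (int i = 0; i <= tags.length; i++) {
            String tag = i < tags.length ? tags[i] : "O"; // sentinel closes the last chunk
            boolean continues = type != null && tag.equals("I_" + type);
            if (continues) {
                continue;
            }
            if (type != null) {
                chunks.add(new Chunk(start, i, type));
            }
            if (tag.equals("O")) {
                type = null;
            } else {
                start = i;
                type = tag.substring(2);
            }
        }
        return chunks;
    }

    // A chunking is encodable if encoding then decoding reproduces it.
    static boolean isEncodable(Set<Chunk> chunks, int numTokens) {
        return decode(encode(chunks, numTokens)).equals(chunks);
    }
}
```

In this sketch, two non-overlapping chunks survive the round trip, while overlapping chunks such as tokens [0, 2) and [1, 3) do not, so `isEncodable` returns `false` for them.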
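|Modifier and Type|Method and Description|
|Iterator<Chunk>|nBestChunks(TagLattice<String> lattice, int tokenStarts, int tokenEnds, int maxResults): Returns an iterator over chunks extracted in order of highest probability, up to the specified maximum number of results.|
|Set<String>|tagSet(Set<String> chunkTypes): Returns the complete set of tags used by this codec for the specified set of chunk types.|
|Chunking|toChunking(StringTagging tagging): Returns the result of decoding the specified tagging into a chunking.|
|StringTagging|toStringTagging(Chunking chunking): Returns the string tagging that fully encodes the specified chunking.|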
Tagging<String> toTagging(Chunking chunking)
Returns the tagging that partially encodes the specified chunking.
This method may be implemented by delegating to
toStringTagging(Chunking) and returning the same value, but a direct
implementation is typically more efficient.
Parameters:
chunking - Chunking to encode.
StringTagging toStringTagging(Chunking chunking)
Parameters:
chunking - Chunking to encode.
Chunking toChunking(StringTagging tagging)
Parameters:
tagging - Tagging to decode.
Throws:
IllegalArgumentException - If the tag sequence is illegal.
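As an illustration of decoding that rejects illegal tag sequences, the sketch below decodes BIO-style tags and throws `IllegalArgumentException` when a tag cannot be interpreted. The tag scheme, the `"start-end:type"` string rendering of chunks, and the `StrictDecodeSketch` class are hypothetical; an actual codec works with `StringTagging` and `Chunking` objects and defines its own notion of legality.

```java
import java.util.ArrayList;
import java.util.List;

final class StrictDecodeSketch {

    // Decode BIO-style tags into token spans rendered as
    // "start-end:type", rejecting sequences that are not legal.
    static List<String> toChunks(String... tags) {
        List<String> chunks = new ArrayList<>();
        int start = -1;
        String type = null; // type of the currently open chunk, if any
        for (int i = 0; i < tags.length; i++) {
            String tag = tags[i];
            if (tag.startsWith("I_")) {
                // "I_X" is only legal inside an open chunk of type X.
                if (type == null || !tag.substring(2).equals(type)) {
                    throw new IllegalArgumentException(
                        "Illegal tag " + tag + " at position " + i);
                }
                continue;
            }
            if (type != null) {
                chunks.add(start + "-" + i + ":" + type); // close the open chunk
                type = null;
            }
            if (tag.startsWith("B_")) {
                start = i;
                type = tag.substring(2);
            } else if (!tag.equals("O")) {
                throw new IllegalArgumentException(
                    "Unknown tag " + tag + " at position " + i);
            }
        }
        if (type != null) {
            chunks.add(start + "-" + tags.length + ":" + type);
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(toChunks("B_PER", "I_PER", "O", "B_LOC"));
    }
}
```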
Set<String> tagSet(Set<String> chunkTypes)
Modifying the returned set will not affect the codec.
Parameters:
chunkTypes - Set of types for chunks.
boolean legalTags(String... tags)
Returns true if the specified sequence of tags is a complete legal tag sequence. The companion method
legalTagSubSequence(String...) tests whether a subsequence of tags is legal.
Parameters:
tags - Variable length array of tags.
Returns:
true if the specified sequence of tags is a complete legal tag sequence.
boolean legalTagSubSequence(String... tags)
Returns true if the specified sequence of tags is a legal subsequence of tags. See the companion method
legalTags(String...) to test whether a complete sequence is legal.
A sequence of tags is a legal subsequence if a legal sequence may be created by adding more tags to the front and/or end of the specified sequence.
An empty sequence of tags is always a legal subsequence, so the method returns
true for it. For a single input tag, the result indicates whether the tag
itself is legal. For longer sequences, the tags must all be
legal and their order must be legal.
Parameters:
tags - Sequence of tags to test.
Returns:
true if the sequence of tags is legal as a subsequence of some larger sequence.
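The difference between a complete legal sequence and a legal subsequence can be made concrete with a small sketch. It assumes a BIO-style scheme with two chunk types; the tag names, the `TYPES` set, the treatment of the empty complete sequence as legal, and the `LegalTagsSketch` class are hypothetical and are not taken from any particular codec.

```java
import java.util.Set;

final class LegalTagsSketch {

    // Hypothetical chunk types known to this sketch.
    static final Set<String> TYPES = Set.of("PER", "LOC");

    // A tag is known if it is "O", "B_<type>", or "I_<type>".
    static boolean isTag(String tag) {
        if (tag.equals("O")) {
            return true;
        }
        return (tag.startsWith("B_") || tag.startsWith("I_"))
            && TYPES.contains(tag.substring(2));
    }

    // "I_X" may only follow "B_X" or "I_X".
    static boolean legalPair(String prev, String next) {
        if (!next.startsWith("I_")) {
            return true;
        }
        String type = next.substring(2);
        return prev.equals("B_" + type) || prev.equals("I_" + type);
    }

    // Legal as a fragment: every tag is known and every adjacent
    // pair is legal. A leading "I_X" is fine because "B_X" could be
    // added in front. The empty sequence is trivially legal.
    static boolean legalTagSubSequence(String... tags) {
        for (int i = 0; i < tags.length; i++) {
            if (!isTag(tags[i])) {
                return false;
            }
            if (i > 0 && !legalPair(tags[i - 1], tags[i])) {
                return false;
            }
        }
        return true;
    }

    // Legal as a whole: a legal fragment that does not open with a
    // continuation tag, since nothing precedes the first tag.
    static boolean legalTags(String... tags) {
        return legalTagSubSequence(tags)
            && (tags.length == 0 || !tags[0].startsWith("I_"));
    }
}
```

Under these assumptions, `legalTagSubSequence("I_PER", "O")` holds because prepending `B_PER` yields a complete legal sequence, whereas `legalTags("I_PER", "O")` does not.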
boolean isEncodable(Chunking chunking)
Returns true if the specified chunking may be encoded as a tagging and then decoded back to the original chunking accurately.
Parameters:
chunking - Chunking to test.
Returns:
true if encoding then decoding produces the specified chunking.
boolean isDecodable(StringTagging tagging)
Returns true if the specified tagging may be decoded as a chunking and then encoded back to the original tagging accurately.
Parameters:
tagging - Tagging to test.
Returns:
true if decoding then encoding produces the specified tagging.
Iterator<Chunk> nBestChunks(TagLattice<String> lattice, int tokenStarts, int tokenEnds, int maxResults)
Parameters:
lattice - Lattice from which chunks are extracted.
maxResults - Maximum number of chunks to return.