java.lang.Object
  com.aliasi.corpus.Corpus<ClassificationHandler<E,Classification>>
    com.aliasi.classify.XValidatingClassificationCorpus<E>
public class XValidatingClassificationCorpus<E>
A XValidatingClassificationCorpus holds a set of
inputs and classification results to be used as a corpus with
built-in cross-validation support. Instances may be added in the
constructors, or through the implementation of a classification
handler.
Initially, the fold will be set to 0, which takes the initial
prefix of the data for testing and the rest for training. The
fold may be reset using setFold(int). This will reset the
fold to be the specified value. In this way, by iterating from
0 to numFolds()-1, a full cross-validation may be
performed.
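The fold iteration described above can be pictured with a small self-contained sketch. It does not call the LingPipe classes: FoldSketch, its method names, and in particular the boundary arithmetic are assumptions made for illustration, and the actual boundaries computed by XValidatingClassificationCorpus may differ. Each fold takes one contiguous slice for testing (fold 0 takes the initial prefix) and everything else for training.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the LingPipe implementation.
class FoldSketch {

    // Assumed boundary formula: index at which the test slice of a fold begins.
    static int testStart(int fold, int numInstances, int numFolds) {
        return (int) ((long) fold * numInstances / numFolds);
    }

    // Test cases for a fold: the contiguous slice owned by that fold.
    static <T> List<T> testCases(List<T> data, int fold, int numFolds) {
        int start = testStart(fold, data.size(), numFolds);
        int end = testStart(fold + 1, data.size(), numFolds);
        return new ArrayList<>(data.subList(start, end));
    }

    // Training cases for a fold: everything outside the test slice.
    static <T> List<T> trainCases(List<T> data, int fold, int numFolds) {
        int start = testStart(fold, data.size(), numFolds);
        int end = testStart(fold + 1, data.size(), numFolds);
        List<T> train = new ArrayList<>(data.subList(0, start));
        train.addAll(data.subList(end, data.size()));
        return train;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        int numFolds = 3;
        // Iterate from 0 to numFolds-1 to cover every instance exactly once as a test case.
        for (int fold = 0; fold < numFolds; ++fold) {
            System.out.println("fold " + fold
                + " test=" + testCases(data, fold, numFolds)
                + " train=" + trainCases(data, fold, numFolds));
        }
    }
}
```

With the real class, the same loop shape would call setFold(int) and then visitTrain and visitTest on each iteration.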
The randomization method permuteCorpus(Random) permutes the
instances of this corpus using the supplied randomizer. This may
be used to randomize the contents of each fold.
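Because inputs and classifications are held in parallel, a permutation has to move both together. The following sketch shows one way that could be done with a seeded Random. PermuteSketch and permuteParallel are hypothetical names, and the Fisher-Yates shuffle is an assumed technique, not necessarily what permuteCorpus(Random) does internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for illustration; not the LingPipe implementation.
class PermuteSketch {

    // Applies one random permutation to both lists in place, so that
    // inputs.get(i) and labels.get(i) still belong together afterwards.
    static <E, C> void permuteParallel(List<E> inputs, List<C> labels, Random random) {
        if (inputs.size() != labels.size()) {
            throw new IllegalArgumentException("Lists must be the same length.");
        }
        // Fisher-Yates shuffle, swapping the same positions in both lists.
        for (int i = inputs.size() - 1; i > 0; --i) {
            int j = random.nextInt(i + 1);
            Collections.swap(inputs, i, j);
            Collections.swap(labels, i, j);
        }
    }

    public static void main(String[] args) {
        List<String> inputs = new ArrayList<>(Arrays.asList("x0", "x1", "x2", "x3"));
        List<String> labels = new ArrayList<>(Arrays.asList("y0", "y1", "y2", "y3"));
        // A fixed seed makes the permutation, and hence the folds, reproducible.
        permuteParallel(inputs, labels, new Random(42L));
        System.out.println(inputs);
        System.out.println(labels);
    }
}
```

Passing a Random built from a fixed seed is a common choice when cross-validation runs need to be repeatable.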
Corpus.visitCorpus(Handler)
will run the specified handler over all of the data collected in
this corpus.
This class must be used with external read/write synchronization. The write operations include the constructors and the set-fold, permute-corpus, and handle methods. The read operations include the visit, num-instances, and fold-reporting methods.
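One way to supply the external read/write synchronization described above is a java.util.concurrent.locks.ReentrantReadWriteLock. The sketch below is only an illustration under assumptions: GuardedCorpusSketch is a hypothetical, simplified container, not the LingPipe class, and it guards a plain list rather than a real corpus. Write operations take the write lock and read operations take the read lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified container used only to show the locking pattern.
class GuardedCorpusSketch<E> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<E> items = new ArrayList<>();
    private final int numFolds;
    private int fold = 0;

    GuardedCorpusSketch(int numFolds) {
        if (numFolds < 1) {
            throw new IllegalArgumentException("Number of folds must be at least one.");
        }
        this.numFolds = numFolds;
    }

    // Write operation: adds an instance under the write lock.
    void handle(E item) {
        lock.writeLock().lock();
        try {
            items.add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Write operation: changes the current fold under the write lock.
    void setFold(int newFold) {
        if (newFold < 0 || newFold >= numFolds) {
            throw new IllegalArgumentException("Fold out of range: " + newFold);
        }
        lock.writeLock().lock();
        try {
            fold = newFold;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Read operation: many readers may hold the read lock at once.
    int numInstances() {
        lock.readLock().lock();
        try {
            return items.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Read operation: reports the current fold.
    int fold() {
        lock.readLock().lock();
        try {
            return fold;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With the real class, the same pattern would wrap calls to the corpus from the outside, since the class itself does no locking.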
| Constructor Summary | |
|---|---|
XValidatingClassificationCorpus(int numFolds)
Construct a cross-validating corpus with the specified number of folds that initially contains no examples. |
|
XValidatingClassificationCorpus(List<E> inputList,
List<Classification> classificationList,
int numFolds)
Construct a cross-validating corpus containing the instances specified in the parallel lists of inputs and classifications, and the specified number of folds. |
|
XValidatingClassificationCorpus(Parser<ClassificationHandler<E,Classification>> parser,
File[] dataFiles,
int numFolds)
Construct a cross-validating corpus containing the instances parsed out of the specified data files with the specified parser, using the specified number of folds. |
|
XValidatingClassificationCorpus(XValidatingClassificationCorpus<E> corpus)
Construct a deep copy of the specified corpus. |
|
| Method Summary | |
|---|---|
String[] |
categories()
Returns the categories found in the cases for this corpus sorted into ascending order. |
int |
fold()
Returns the current fold. |
void |
handle(E e,
Classification c)
Adds the specified object and corresponding classification to the corpus. |
int |
numFolds()
Returns the number of folds for this corpus. |
int |
numInstances()
Returns the number of instances for this corpus. |
void |
permuteCorpus(Random random)
Randomly permutes the corpus using the specified randomizer. |
void |
setFold(int fold)
Set the current fold to the specified value. |
String |
toString()
Returns a string representation of the size of this corpus. |
void |
visitTest(ClassificationHandler<E,Classification> handler)
Sends all of the test cases in this corpus for the current fold to the specified handler. |
void |
visitTrain(ClassificationHandler<E,Classification> handler)
Sends all of the training cases in this corpus for the current fold to the specified handler. |
| Methods inherited from class com.aliasi.corpus.Corpus |
|---|
visitCorpus, visitCorpus |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Constructor Detail |
|---|
public XValidatingClassificationCorpus(XValidatingClassificationCorpus<E> corpus)
The main use for this constructor is for cross-validation, where several copies of the same corpus may be used in parallel. Typically, a single corpus is permuted once and then copied, with each copy set to a different fold so that the folds can be handled concurrently.
The cost of the copy is the pair of parallel lists to hold the inputs and classifications. The inputs and classifications are not themselves deep-copied.
corpus - Corpus to deep copy.
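The copy semantics described above (new lists, shared elements) can be illustrated with plain java.util collections. CopySketch and copyOf are hypothetical names used only for this sketch, and the use of ArrayList is an assumption, not a statement about how the copy constructor is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of copying a list without copying its elements.
class CopySketch {

    // Allocates a new list that refers to the same element objects.
    static <T> List<T> copyOf(List<T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<StringBuilder> original = new ArrayList<>(
            Arrays.asList(new StringBuilder("first"), new StringBuilder("second")));
        List<StringBuilder> copy = copyOf(original);

        // Reordering the copy leaves the original list's order untouched.
        Collections.swap(copy, 0, 1);
        System.out.println(original.get(0)); // prints "first"

        // The elements themselves are shared, so a mutation is visible in both.
        copy.get(0).append("!");
        System.out.println(original.get(1)); // prints "second!"
    }
}
```

This is why each copy can be permuted or set to its own fold independently, while mutable inputs should not be modified by handlers if the copies are used concurrently.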
public XValidatingClassificationCorpus(List<E> inputList,
List<Classification> classificationList,
int numFolds)
The lists are copied, and the originals are not used after construction.
inputList - List of inputs to classify.
classificationList - List of classification results for inputs.
numFolds - Number of folds for cross-validation.
IllegalArgumentException - If the number of folds is not
greater than zero or if the parallel lists are not of the same
length.
public XValidatingClassificationCorpus(Parser<ClassificationHandler<E,Classification>> parser,
File[] dataFiles,
int numFolds)
throws IOException
parser - Classification parser for data files.
dataFiles - List of data files to parse.
numFolds - Number of folds.
IllegalArgumentException - If the number of folds is less
than one.
IOException - If there is an underlying I/O error reading
the file or parsing.
public XValidatingClassificationCorpus(int numFolds)
numFolds - Number of folds for cross-validation.
IllegalArgumentException - If the number of folds is
less than one.
| Method Detail |
|---|
public String[] categories()
public void handle(E e,
Classification c)
handle in interface ClassificationHandler<E,Classification>
e - Object that is classified.
c - Classification for the object.
public void permuteCorpus(Random random)
random - Randomizer to use for permutation.
public int numInstances()
public int numFolds()
public int fold()
public void setFold(int fold)
fold - New fold value.
IllegalArgumentException - If the fold is less than zero or
greater than or equal to the number of folds.
public void visitTest(ClassificationHandler<E,Classification> handler)
visitTest in class Corpus<ClassificationHandler<E,Classification>>
handler - Handler to receive the test cases.
public void visitTrain(ClassificationHandler<E,Classification> handler)
visitTrain in class Corpus<ClassificationHandler<E,Classification>>
handler - Handler to receive the training cases.
public String toString()
toString in class Object