com.aliasi.classify
Class ClassifierEvaluator<E,C extends Classification>

java.lang.Object
  extended by com.aliasi.classify.ClassifierEvaluator<E,C>
All Implemented Interfaces:
ClassificationHandler<E,Classification>, Handler

public class ClassifierEvaluator<E,C extends Classification>
extends Object
implements ClassificationHandler<E,Classification>

A ClassifierEvaluator provides an evaluation harness for classifiers. An evaluator is constructed from a classifier and a complete list of the categories returned by the classifier. Test cases are then added using the addCase(String,Object) method, which accepts a string-based reference category and an object to classify. The evaluator runs the classifier over the input object and collects results over multiple cases. Depending on the classification types returned by the classifier, various report statistics are available.

An exhaustive set of evaluation metrics for first-best classification results is accessible as a confusion matrix through the confusionMatrix() method. Confusion matrices provide dozens of statistics on classification which can be computed from first-best results; see ConfusionMatrix for more information.

Depending on the class of return results for the classifier being evaluated, the following methods are supported:

Classifier Return Class      Supported Methods
Classification               confusionMatrix()
RankedClassification         rankCount(String,int)
                             averageRankReference()
                             meanReciprocalRank()
                             averageRank(String,String)
ScoredClassification         scoredOneVersusAll(String)
                             averageScore(String,String)
                             averageScoreReference()
ConditionalClassification    averageConditionalProbability(String,String)
                             averageConditionalProbabilityReference()
JointClassification          averageLog2JointProbability(String,String)
                             averageLog2JointProbabilityReference()

If the input is a ranked classification and the reference category does not appear at any rank in the classification, results are returned as if the reference category appeared at the last possible rank in the ranked classification. This scoring heuristic applies to all four methods listed for ranked classifications in the table above. As a consequence, the results of averageRank(String,String) might not be derivable from any set of complete ranked classifications, because all unlisted categories are assumed to have the worst possible rank.
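The fallback described above can be illustrated with a small standalone sketch. This is not LingPipe code: the class and method names are hypothetical, and it assumes that "last possible rank" means rank K-1 when there are K categories.

```java
/** Standalone sketch of the last-rank fallback; not part of LingPipe. */
class LastRankFallbackSketch {

    /**
     * Returns the zero-based rank of the category in the ranking, or
     * numCategories - 1 if the category is not listed at any rank.
     */
    static int rankOrLast(String[] ranking, String category, int numCategories) {
        for (int rank = 0; rank < ranking.length; ++rank) {
            if (ranking[rank].equals(category)) {
                return rank;
            }
        }
        return numCategories - 1; // assumption: unlisted means worst possible rank
    }

    public static void main(String[] args) {
        String[] partial = { "a", "b" }; // hypothetical ranking that omits "c"
        System.out.println("rank of a: " + rankOrLast(partial, "a", 3));
        System.out.println("rank of c: " + rankOrLast(partial, "c", 3));
    }
}
```

With three categories, an unlisted category is treated as rank 2, the same rank it would have had if it were listed last.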

Access to this class must be synchronized so that reads may run concurrently but writes are exclusive. Reads are any of the statistics-gathering methods; the only writes are additions of new test cases.
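One common way to get this read/write discipline in Java is a ReentrantReadWriteLock. The sketch below guards a hypothetical stand-in object (a simple case counter) rather than a real evaluator; the class and method names are illustrative only, and other synchronization schemes would serve equally well.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for an evaluator: one write method, one read method. */
class GuardedCaseCounter {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private int numCases = 0; // stands in for the evaluator's accumulated state

    /** Write operation: holds the exclusive write lock while adding a case. */
    void addCase() {
        lock.writeLock().lock();
        try {
            ++numCases;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Read operation: any number of threads may hold the read lock at once. */
    int numCases() {
        lock.readLock().lock();
        try {
            return numCases;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedCaseCounter counter = new GuardedCaseCounter();
        counter.addCase();
        System.out.println("cases: " + counter.numCases());
    }
}
```

The same pattern would wrap calls on a real evaluator: take the write lock around case additions and the read lock around the statistics methods.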

Incomplete Rankings, Scorings and Conditionals

Some classifiers might not return a rank, score or conditional probability estimate for every category on every input. In this case, the counts for the categories that are present are still updated, but flags are set indicating that values are missing. If any ranked, scored or conditional classification lacked a rank, score or conditional probability estimate for a category, the corresponding method, missingRankings(), missingScorings() or missingConditionals(), will return true.
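The flag behavior for rankings can be pictured with the following standalone sketch. It is not the LingPipe implementation; the class is hypothetical and assumes a ranking counts as incomplete when it lists fewer entries than there are categories.

```java
/** Standalone sketch of a "missing rankings" flag; not part of LingPipe. */
class MissingRankingFlagSketch {
    private final int numCategories;
    private boolean missingRankings = false;
    private int numCases = 0;

    MissingRankingFlagSketch(int numCategories) {
        this.numCategories = numCategories;
    }

    /** Records a ranking; the case is still counted even if it is incomplete. */
    void addRanking(String[] ranking) {
        ++numCases;
        if (ranking.length < numCategories) {
            missingRankings = true; // stays set once any incomplete ranking is seen
        }
    }

    boolean missingRankings() {
        return missingRankings;
    }

    int numCases() {
        return numCases;
    }

    public static void main(String[] args) {
        MissingRankingFlagSketch sketch = new MissingRankingFlagSketch(3);
        sketch.addRanking(new String[] { "a", "b", "c" });
        sketch.addRanking(new String[] { "b", "a" }); // omits "c"
        System.out.println("missing rankings: " + sketch.missingRankings());
    }
}
```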

Since:
LingPipe2.0
Version:
3.5
Author:
Bob Carpenter

Constructor Summary
ClassifierEvaluator(Classifier<E,C> classifier, String[] categories)
          Construct a classifier evaluator for the specified classifier that records results for the specified set of categories.
 
Method Summary
 void addCase(String referenceCategory, E input)
          Adds a test case for the specified input with the specified reference category.
 void addClassification(String referenceCategory, Classification classification)
          Adds the specified classification as a response for the specified reference category.
 double averageConditionalProbability(String refCategory, String responseCategory)
          Returns the average conditional probability of the specified response category for test cases with the specified reference category.
 double averageConditionalProbabilityReference()
          Returns the average over all test cases of the conditional probability of the response that matches the reference category.
 double averageLog2JointProbability(String refCategory, String responseCategory)
          Returns the average log (base 2) joint probability of the response category for cases of the specified reference category.
 double averageLog2JointProbabilityReference()
          Returns the average over all test cases of the joint log (base 2) probability of the response that matches the reference category.
 double averageRank(String refCategory, String responseCategory)
          Returns the average rank of the specified response category for test cases with the specified reference category.
 double averageRankReference()
          Returns the average over all test samples of the rank of the response that matches the reference category.
 double averageScore(String refCategory, String responseCategory)
          Returns the average score of the specified response category for test cases with the specified reference category.
 double averageScoreReference()
          Returns the average over all test cases of the score of the response that matches the reference category.
 String[] categories()
          Returns the categories for which this evaluator stores results.
 Classifier<E,C> classifier()
          Returns the classifier for this evaluator.
 ScoredPrecisionRecallEvaluation conditionalOneVersusAll(String refCategory)
          Returns a scored precision-recall evaluation of the classification of the specified reference category versus all other categories using the conditional probability scores.
 ConfusionMatrix confusionMatrix()
          Returns the confusion matrix of first-best classification result statistics for this evaluator.
 void handle(E input, Classification classification)
          This is a convenience implementation for the classification handler interface.
 double meanReciprocalRank()
          Returns the average over all test samples of the reciprocal of one plus the rank of the reference category in the response.
 boolean missingConditionals()
          Returns true if this evaluation involved conditional classifications that did not score every category.
 boolean missingRankings()
          Returns true if this evaluation involved ranked classifications that did not rank every category.
 boolean missingScorings()
          Returns true if this evaluation involved scored classifications that did not score every category.
 int numCases()
          Returns the number of test cases which have been provided to this evaluator.
 PrecisionRecallEvaluation oneVersusAll(String refCategory)
          Returns the first-best one-versus-all precision-recall evaluation of the classification of the specified reference category versus all other categories.
 int rankCount(String referenceCategory, int rank)
          Returns the number of times that the reference category's rank was the specified rank.
 ScoredPrecisionRecallEvaluation scoredOneVersusAll(String refCategory)
          Returns a scored precision-recall evaluation of the classification of the specified reference category versus all other categories using the classification scores.
 String toString()
          Returns a string-based representation of the classification results.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ClassifierEvaluator

public ClassifierEvaluator(Classifier<E,C> classifier,
                           String[] categories)
Construct a classifier evaluator for the specified classifier that records results for the specified set of categories.

If the classifier evaluator is only going to be populated using the addClassification(String,Classification) method, then the classifier may be null.

Parameters:
classifier - Classifier to evaluate.
categories - Categories of the classifier.
Method Detail

classifier

public Classifier<E,C> classifier()
Returns the classifier for this evaluator.

Returns:
The classifier for this evaluator.

categories

public String[] categories()
Returns the categories for which this evaluator stores results.

Returns:
The categories for which this evaluator stores results.

addCase

public void addCase(String referenceCategory,
                    E input)
Adds a test case for the specified input with the specified reference category. This method runs the classifier over the specified input. It then stores the resulting classification and reference category for collective reporting.

This method simply applies the classifier specified at construction time to the specified input to produce a classification which is forwarded to addClassification(String,Classification).

Parameters:
referenceCategory - Correct category for object.
input - Object being classified.
Throws:
IllegalArgumentException - If the reference category is not a category for this evaluator.

handle

public void handle(E input,
                   Classification classification)
This is a convenience implementation for the classification handler interface. It merely delegates to addCase(String,Object) by extracting the best category from the specified classification.

Specified by:
handle in interface ClassificationHandler<E,Classification>
Parameters:
input - Object being evaluated.
classification - Reference classification of object.

numCases

public int numCases()
Returns the number of test cases which have been provided to this evaluator.

Returns:
The number of test cases which have been provided to this evaluator.

confusionMatrix

public ConfusionMatrix confusionMatrix()
Returns the confusion matrix of first-best classification result statistics for this evaluator. See ConfusionMatrix for details of the numerous available evaluation metrics provided by confusion matrices.

Returns:
The confusion matrix for the test cases evaluated so far.

missingRankings

public boolean missingRankings()
Returns true if this evaluation involved ranked classifications that did not rank every category.

Returns:
true if categories were unranked in some ranked classification.

missingScorings

public boolean missingScorings()
Returns true if this evaluation involved scored classifications that did not score every category.

Returns:
true if categories were unscored in some scored classification.

missingConditionals

public boolean missingConditionals()
Returns true if this evaluation involved conditional classifications that did not score every category.

Returns:
true if categories were missing conditional probability estimates in some conditional classification.

rankCount

public int rankCount(String referenceCategory,
                     int rank)
Returns the number of times that the reference category's rank was the specified rank.

For example, in the set of test cases and results described in the method documentation for averageRank(String,String), sample rank counts are as follows:

rankCount("a",0) = 3
rankCount("a",1) = 1
rankCount("a",2) = 0
 
rankCount("b",0) = 1
rankCount("b",1) = 0
rankCount("b",2) = 1
 
rankCount("c",0) = 1
rankCount("c",1) = 0
rankCount("c",2) = 0
These results are typically presented as a bar graph histogram per category.
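The counts above can be reproduced with a short standalone sketch over the seven test cases given in the documentation of averageRank(String,String). The class below is hypothetical and does not use LingPipe; it simply counts how often each reference category lands at a given rank.

```java
/** Standalone sketch reproducing the rank counts; not part of LingPipe. */
class RankCountSketch {
    // Reference categories and ranked responses for the seven example cases.
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[][] RANKINGS = {
        { "a", "b", "c" }, { "a", "c", "b" }, { "a", "b", "c" }, { "b", "a", "c" },
        { "b", "a", "c" }, { "a", "c", "b" }, { "c", "b", "a" }
    };

    /** Counts cases with the given reference whose reference sits at the given rank. */
    static int rankCount(String referenceCategory, int rank) {
        int count = 0;
        for (int i = 0; i < REFS.length; ++i) {
            if (REFS[i].equals(referenceCategory)
                && rank >= 0
                && rank < RANKINGS[i].length
                && RANKINGS[i][rank].equals(referenceCategory)) {
                ++count;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (String category : new String[] { "a", "b", "c" }) {
            for (int rank = 0; rank < 3; ++rank) {
                System.out.println("rankCount(" + category + "," + rank + ") = "
                                   + rankCount(category, rank));
            }
        }
    }
}
```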

Parameters:
referenceCategory - Reference category.
rank - Rank of count.
Returns:
Number of times the reference category's ranking was the specified rank.
Throws:
IllegalArgumentException - If the category is unknown.

averageRankReference

public double averageRankReference()
Returns the average over all test samples of the rank of the response that matches the reference category.

Using the example classifications shown in the method documentation of averageRank(String,String):

averageRankReference()
= (0 + 0 + 0 + 1 + 0 + 2 + 0)/7 ~ 0.43
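The arithmetic above can be checked with a standalone sketch over the same seven example cases. The class is hypothetical, is not LingPipe code, and assumes every ranking is complete.

```java
/** Standalone sketch of the average reference rank; not part of LingPipe. */
class AverageRankReferenceSketch {
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[][] RANKINGS = {
        { "a", "b", "c" }, { "a", "c", "b" }, { "a", "b", "c" }, { "b", "a", "c" },
        { "b", "a", "c" }, { "a", "c", "b" }, { "c", "b", "a" }
    };

    /** Zero-based rank of the category; complete rankings are assumed. */
    static int rankOf(String[] ranking, String category) {
        for (int rank = 0; rank < ranking.length; ++rank) {
            if (ranking[rank].equals(category)) {
                return rank;
            }
        }
        throw new IllegalArgumentException("Category not ranked: " + category);
    }

    /** Averages, over all cases, the rank at which the reference category appears. */
    static double averageRankReference() {
        int rankSum = 0;
        for (int i = 0; i < REFS.length; ++i) {
            rankSum += rankOf(RANKINGS[i], REFS[i]);
        }
        return rankSum / (double) REFS.length;
    }

    public static void main(String[] args) {
        System.out.println("averageRankReference = " + averageRankReference());
    }
}
```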

Returns:
The average rank of the reference category in all classification results.

meanReciprocalRank

public double meanReciprocalRank()
Returns the average over all test samples of the reciprocal of one plus the rank of the reference category in the response. This represents counting from one, so if the first-best answer is correct, the reciprocal rank is 1/1; if the second is correct, 1/2; if the third, 1/3; and so on. These individual reciprocals are then averaged over cases.

Using the example classifications shown in the method documentation of averageRank(String,String):

meanReciprocalRank()
= (1/1 + 1/1 + 1/1 + 1/2 + 1/1 + 1/3 + 1/1)/7 ~ 0.83
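The reciprocal-rank average above works out to exactly 5/6. A standalone sketch over the same seven example cases, using a hypothetical class that is not LingPipe code and assuming complete rankings:

```java
/** Standalone sketch of mean reciprocal rank; not part of LingPipe. */
class MeanReciprocalRankSketch {
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[][] RANKINGS = {
        { "a", "b", "c" }, { "a", "c", "b" }, { "a", "b", "c" }, { "b", "a", "c" },
        { "b", "a", "c" }, { "a", "c", "b" }, { "c", "b", "a" }
    };

    /** Zero-based rank of the category; complete rankings are assumed. */
    static int rankOf(String[] ranking, String category) {
        for (int rank = 0; rank < ranking.length; ++rank) {
            if (ranking[rank].equals(category)) {
                return rank;
            }
        }
        throw new IllegalArgumentException("Category not ranked: " + category);
    }

    /** Averages 1/(1 + rank of the reference category) over all cases. */
    static double meanReciprocalRank() {
        double sum = 0.0;
        for (int i = 0; i < REFS.length; ++i) {
            sum += 1.0 / (1 + rankOf(RANKINGS[i], REFS[i]));
        }
        return sum / REFS.length;
    }

    public static void main(String[] args) {
        System.out.println("meanReciprocalRank = " + meanReciprocalRank());
    }
}
```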

Returns:
The mean reciprocal rank of the reference category in the result ranking.

averageConditionalProbability

public double averageConditionalProbability(String refCategory,
                                            String responseCategory)
Returns the average conditional probability of the specified response category for test cases with the specified reference category. If there are no cases matching the reference category, the result is Double.NaN. If the conditional classifiers' results are properly normalized, the sum of the averages over all categories will be 1.0.
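The averaging and the normalization property can be sketched as follows. The data are invented purely for illustration, and the class is a hypothetical standalone stand-in rather than LingPipe code; each row of probabilities sums to 1.0, so the per-reference averages do too.

```java
/** Standalone sketch of per-reference conditional averages; not part of LingPipe. */
class AverageConditionalSketch {
    static final String[] CATEGORIES = { "a", "b", "c" };
    // Hypothetical test cases: reference category and P(category | input) per case.
    static final String[] REFS = { "a", "a", "b" };
    static final double[][] PROBS = {
        { 0.5, 0.25, 0.25 }, { 0.75, 0.125, 0.125 }, { 0.25, 0.5, 0.25 }
    };

    static int indexOf(String category) {
        for (int i = 0; i < CATEGORIES.length; ++i) {
            if (CATEGORIES[i].equals(category)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unknown category: " + category);
    }

    /** Average probability of the response category over cases with the reference. */
    static double averageConditionalProbability(String refCategory, String responseCategory) {
        indexOf(refCategory); // rejects unknown reference categories
        int responseIndex = indexOf(responseCategory);
        double sum = 0.0;
        int numCases = 0;
        for (int i = 0; i < REFS.length; ++i) {
            if (REFS[i].equals(refCategory)) {
                sum += PROBS[i][responseIndex];
                ++numCases;
            }
        }
        return numCases == 0 ? Double.NaN : sum / numCases;
    }

    public static void main(String[] args) {
        double total = 0.0;
        for (String response : CATEGORIES) {
            total += averageConditionalProbability("a", response);
        }
        System.out.println("sum of averages for reference a = " + total);
    }
}
```

In this sketch, category c has no test cases, so any average with c as the reference category is Double.NaN.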

Better classifiers return high values when the reference and response categories are the same and lower values when they are different. The log value would be extremely volatile given the extremely low and high conditional estimates of the language model classifiers.

Parameters:
refCategory - Reference category.
responseCategory - Response category.
Returns:
Average conditional probability of response category in cases for specified reference category.
Throws:
IllegalArgumentException - If either category is unknown.

averageLog2JointProbability

public double averageLog2JointProbability(String refCategory,
                                          String responseCategory)
Returns the average log (base 2) joint probability of the response category for cases of the specified reference category. If there are no cases matching the reference category, the result is Double.NaN.

Better classifiers return high values when the reference and response categories are the same and lower values when they are different. Unlike the conditional probability values, joint probability averages are not particularly useful because they are not normalized by input length. For the language model classifiers, the scores are normalized by length, and provide a better cross-case view.

Parameters:
refCategory - Reference category.
responseCategory - Response category.
Returns:
Average log (base 2) joint probability of response category in cases for specified reference category.
Throws:
IllegalArgumentException - If either category is unknown.

averageScoreReference

public double averageScoreReference()
Returns the average over all test cases of the score of the response that matches the reference category. Better classifiers return higher values for this average.

Whether average scores make sense across training instances depends on the classifier.

Returns:
The average score of the reference category in the response.

averageConditionalProbabilityReference

public double averageConditionalProbabilityReference()
Returns the average over all test cases of the conditional probability of the response that matches the reference category. Better classifiers return higher values for this average.

As a normalized value, the average conditional probability always has a sensible interpretation across training instances.

Returns:
The average conditional probability of the reference category in the response.

averageLog2JointProbabilityReference

public double averageLog2JointProbabilityReference()
Returns the average over all test cases of the joint log (base 2) probability of the response that matches the reference category. Better classifiers return higher values for this average.

Whether average scores make sense across training instances depends on the classifier. For the language-model based classifiers, the normalized score values are more reasonable averages.

Returns:
The average joint log probability of the reference category in the response.

averageScore

public double averageScore(String refCategory,
                           String responseCategory)
Returns the average score of the specified response category for test cases with the specified reference category. If there are no cases matching the reference category, the result is Double.NaN.

Better classifiers return high values when the reference and response categories are the same and lower values when they are different. Depending on the classifier, the scores may or may not be meaningful as an average.

Parameters:
refCategory - Reference category.
responseCategory - Response category.
Returns:
Average score of response category in test cases for specified reference category.
Throws:
IllegalArgumentException - If either category is unknown.

averageRank

public double averageRank(String refCategory,
                          String responseCategory)
Returns the average rank of the specified response category for test cases with the specified reference category. If there are no cases matching the reference category, the result is Double.NaN.

Better classifiers return lower values when the reference and response categories are the same and higher values when they are different.

For example, suppose there are three categories, a, b and c. Consider the following seven test cases, with the specified ranked results:

Test Case  Reference  Rank 0  Rank 1  Rank 2
0          a          a       b       c
1          a          a       c       b
2          a          a       b       c
3          a          b       a       c
4          b          b       a       c
5          b          a       c       b
6          c          c       b       a
for which:
averageRank("a","a") = (0 + 0 + 0 + 1)/4 = 0.25
averageRank("a","b") = (1 + 2 + 1 + 0)/4 = 1.00
averageRank("a","c") = (2 + 1 + 2 + 2)/4 = 1.75
 
averageRank("b","a") = (1 + 0)/2 = 0.50
averageRank("b","b") = (0 + 2)/2 = 1.0
averageRank("b","c") = (2 + 1)/2 = 1.5
 
averageRank("c","a") = (2)/1 = 2.0
averageRank("c","b") = (1)/1 = 1.0
averageRank("c","c") = (0)/1 = 0.0
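The averages listed above can be recomputed with a standalone sketch. The class is hypothetical and is not LingPipe code; it assumes complete rankings and, unlike the documented method, does not validate the reference category, so an unseen reference simply yields Double.NaN.

```java
/** Standalone sketch of per-reference average ranks; not part of LingPipe. */
class AverageRankSketch {
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[][] RANKINGS = {
        { "a", "b", "c" }, { "a", "c", "b" }, { "a", "b", "c" }, { "b", "a", "c" },
        { "b", "a", "c" }, { "a", "c", "b" }, { "c", "b", "a" }
    };

    /** Zero-based rank of the category; complete rankings are assumed. */
    static int rankOf(String[] ranking, String category) {
        for (int rank = 0; rank < ranking.length; ++rank) {
            if (ranking[rank].equals(category)) {
                return rank;
            }
        }
        throw new IllegalArgumentException("Category not ranked: " + category);
    }

    /** Average rank of the response category over cases with the given reference. */
    static double averageRank(String refCategory, String responseCategory) {
        int rankSum = 0;
        int numCases = 0;
        for (int i = 0; i < REFS.length; ++i) {
            if (REFS[i].equals(refCategory)) {
                rankSum += rankOf(RANKINGS[i], responseCategory);
                ++numCases;
            }
        }
        return numCases == 0 ? Double.NaN : rankSum / (double) numCases;
    }

    public static void main(String[] args) {
        String[] categories = { "a", "b", "c" };
        for (String ref : categories) {
            for (String response : categories) {
                System.out.println("averageRank(" + ref + "," + response + ") = "
                                   + averageRank(ref, response));
            }
        }
    }
}
```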

If every ranked result is complete in assigning every category to a rank, then for a fixed reference category the average ranks summed over all response categories equal 0 + 1 + ... + (K-1) = K(K-1)/2, where K is the number of categories. In the example above, K = 3 and each group of three averages sums to 3. If categories are missing from ranked results, the sums may be larger.

Note that the confusion matrix is computed using only the reference and first column of this matrix of results.
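To make this concrete, the sketch below tallies first-best counts from only the Reference and Rank 0 columns of the seven example cases. It is a hypothetical standalone class rather than LingPipe's ConfusionMatrix, and it assumes rows are indexed by reference category and columns by response category.

```java
/** Standalone sketch of first-best confusion counts; not part of LingPipe. */
class FirstBestConfusionSketch {
    static final String[] CATEGORIES = { "a", "b", "c" };
    // Reference and Rank 0 columns of the seven example cases.
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[] FIRST_BEST = { "a", "a", "a", "b", "b", "a", "c" };

    static int indexOf(String category) {
        for (int i = 0; i < CATEGORIES.length; ++i) {
            if (CATEGORIES[i].equals(category)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unknown category: " + category);
    }

    /** Returns counts[referenceIndex][responseIndex] (assumed orientation). */
    static int[][] confusionCounts() {
        int[][] counts = new int[CATEGORIES.length][CATEGORIES.length];
        for (int i = 0; i < REFS.length; ++i) {
            ++counts[indexOf(REFS[i])][indexOf(FIRST_BEST[i])];
        }
        return counts;
    }

    public static void main(String[] args) {
        for (int[] row : confusionCounts()) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```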

Parameters:
refCategory - Reference category.
responseCategory - Response category.
Returns:
Average rank of response category in test cases for specified reference category.
Throws:
IllegalArgumentException - If either category is unknown.

scoredOneVersusAll

public ScoredPrecisionRecallEvaluation scoredOneVersusAll(String refCategory)
Returns a scored precision-recall evaluation of the classification of the specified reference category versus all other categories using the classification scores.

Parameters:
refCategory - Reference category.
Returns:
The scored one-versus-all precision-recall evaluation.
Throws:
IllegalArgumentException - If the specified category is unknown.

conditionalOneVersusAll

public ScoredPrecisionRecallEvaluation conditionalOneVersusAll(String refCategory)
Returns a scored precision-recall evaluation of the classification of the specified reference category versus all other categories using the conditional probability scores. This method may only be called for evaluations that have scores.

Parameters:
refCategory - Reference category.
Returns:
The conditional one-versus-all precision-recall evaluation.
Throws:
IllegalArgumentException - If the specified category is unknown.

oneVersusAll

public PrecisionRecallEvaluation oneVersusAll(String refCategory)
Returns the first-best one-versus-all precision-recall evaluation of the classification of the specified reference category versus all other categories. This method may be called for any evaluation.
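A first-best one-versus-all evaluation reduces each case to a binary decision: did the reference match the category, and did the first-best response match it? The standalone sketch below applies this to the seven example cases from the averageRank(String,String) documentation. The class and its methods are hypothetical and do not reproduce LingPipe's PrecisionRecallEvaluation.

```java
/** Standalone sketch of first-best one-versus-all counts; not part of LingPipe. */
class OneVersusAllSketch {
    static final String[] REFS = { "a", "a", "a", "a", "b", "b", "c" };
    static final String[] FIRST_BEST = { "a", "a", "a", "b", "b", "a", "c" };

    /** Returns { truePositives, falsePositives, falseNegatives, trueNegatives }. */
    static int[] counts(String category) {
        int tp = 0, fp = 0, fn = 0, tn = 0;
        for (int i = 0; i < REFS.length; ++i) {
            boolean inReference = REFS[i].equals(category);
            boolean inResponse = FIRST_BEST[i].equals(category);
            if (inReference && inResponse) {
                ++tp;
            } else if (inResponse) {
                ++fp;
            } else if (inReference) {
                ++fn;
            } else {
                ++tn;
            }
        }
        return new int[] { tp, fp, fn, tn };
    }

    /** Precision = TP / (TP + FP); NaN if the category was never returned. */
    static double precision(int[] c) {
        return c[0] / (double) (c[0] + c[1]);
    }

    /** Recall = TP / (TP + FN); NaN if the category never occurred as a reference. */
    static double recall(int[] c) {
        return c[0] / (double) (c[0] + c[2]);
    }

    public static void main(String[] args) {
        int[] c = counts("a");
        System.out.println("precision(a) = " + precision(c) + ", recall(a) = " + recall(c));
    }
}
```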

Parameters:
refCategory - Reference category.
Returns:
The first-best one-versus-all precision-recall evaluation.
Throws:
IllegalArgumentException - If the specified category is unknown.

toString

public String toString()
Returns a string-based representation of the classification results.

Overrides:
toString in class Object
Returns:
A string-based representation of the classification results.

addClassification

public void addClassification(String referenceCategory,
                              Classification classification)
Adds the specified classification as a response for the specified reference category.

Parameters:
referenceCategory - Reference category for case.
classification - Response classification for case.