E
 the type of objects being clusteredpublic class ClusterScore<E> extends Object
ClusterScore
provides a range of cluster scoring
metrics for reference partitions versus response partitions.
Traditional evaluation measures for pairs of parititions involve
comparing the equivalence relations generated by the partitions
pointwise. A partition defines an equivalence relation in the
usual way: a pair (A,B)
is in the equivalence if there
is a cluster that contains both A
and B
.
Each element is assumed to be equal to itself. A pair
(A,B)
is a true positive if it is an equivalence in
the reference and response clustes, a false positive if it is in
the response but not the reference, and so on. The resulting
precisionrecall statistics over the relations is reported through
equivalenceEvaluation()
.
The scoring used for the Message Understanding Conferences is:
mucRecall(referencePartition,responsePartition)
= Σ_{c in referencePartition}
(size(c)  overlap(c,responsePartition))
/ Σ_{c in referencePartition}
( size(c)  1 )
where size(c)
is the number of elements in the
cluster c
, and overlap(c,responsePartition)
is the number of clusters in the response partition that intersect
the cluster c
. Precision is defined dually by
reversing the roles of reference and response, and fmeasure is defined
as usual. Further details and examples can be found in:
Marc Vilain, John Burger, John Aberdeen, Dennis Connolly, and Lynette Hirschman. 1995. A modeltheoretic coreference scoring scheme. In Proceedings fo the 6th Message Understanding Conference (MUC6). 4552. Morgan Kaufmann.
Bcubed cluster scoring was defined as an alternative to the MUC scoring metric. There are two variants Bcubed cluster precision, both of which are weighted averages of a perelement precision score:
b3Precision(A,referencePartition,responsePartition)
= cluster(responsePartition,A) INTERSECT cluster(referencePartition,A)
/ cluster(responsePartition,A)
where cluster(partition,a)
is the cluster in the
partition partition
containing the element a
;
in other words, this is A
's equivalence class and contains
the set of all elements equivalent to A
in the partition.
For the uniform cluster method, each cluster in the reference partition is weighted equally, and each element is weighted equally within a cluster:
b3ClusterPrecision(referencePartition,responsePartition)
= Σ_{a}
b3Precision(a,referencePartition,responsePartition)
/ (referencePartition * cluster(referencePartition,a))
For the uniform element method, each element a
is weighted uniformly:
b3ElementPrecision(ReferencePartition,ResponsePartition)
= Σ_{a}
b3Precision(a,referencePartition,responsePartition) / numElements
where numElements
is the total number of elements in
the partitions. For both Bcubed approaches, recall is defined
dually by switching the roles of reference and response, and the
F_{1}measure is defined in the usual way.
The Bcubed method is described in detail in:
Bagga, Amit and Breck Baldwin. 1998. Algorithms for scoring coreference chains. In Proceedings of the First International Conference on Language Resources and Evaluation Workshop on Linguistic Coreference.
As an example, consider the following two partitions:
reference = { {1, 2, 3, 4, 5}, {6, 7}, {8, 9, A, B, C} }
response = { { 1, 2, 3, 4, 5, 8, 9, A, B, C }, { 6, 7} }
which produce the following values for method calls:
Method Result equivalenceEvaluation()
PrecisionRecallEvaluation(54,0,50,40)
TP=54; FN=0; FP=50; TN=40mucPrecision()
0.9 mucRecall()
1.0 mucF()
0.947 b3ElementPrecision()
0.583 b3ElementRecall()
1.0 b3ElementF()
0.737 b3ClusterPrecision()
0.75 b3ClusterRecall()
1.0 b3ClusterF()
0.857
Note that there are additional scoring metrics within the Dendrogram
class for cophenetic correlation and dendrogramspecific
withincluster scatter.
Constructor and Description 

ClusterScore(Set<? extends Set<? extends E>> referencePartition,
Set<? extends Set<? extends E>> responsePartition)
Construct a cluster score object from the specified reference and
response partitions.

Modifier and Type  Method and Description 

double 
b3ClusterF()
Returns the F_{1} measure of the
B^{3} precision and recall metrics with equal cluster
weighting.

double 
b3ClusterPrecision()
Returns the precision as defined by B^{3} metric with
equal cluster weighting.

double 
b3ClusterRecall()
Returns the recall as defined by B^{3} metric with
equal cluster weighting.

double 
b3ElementF()
Returns the F_{1} measure of the
B^{3} precision and recall metrics with equal element
weighting.

double 
b3ElementPrecision()
Returns the precision as defined by B^{3} metric with
equal element weighting.

double 
b3ElementRecall()
Returns the recall as defined by B^{3} metric with
equal element weighting.

PrecisionRecallEvaluation 
equivalenceEvaluation()
Returns the precisionrecall evaluation corresponding to
equalities in the reference and response clusterings.

Set<Tuple<E>> 
falseNegatives()
Returns the set of false negative relations for this scoring.

Set<Tuple<E>> 
falsePositives()
Returns the set of false positive relations for this scoring.

double 
mucF()
Returns the F_{1} measure of the MUC recall
and precision.

double 
mucPrecision()
Returns the precision as defined by MUC.

double 
mucRecall()
Returns the recall as defined by MUC.

static <E> double 
scatter(Set<? extends E> cluster,
Distance<? super E> distance)
Returns the scatter for the specified cluster based on the
specified distance.

String 
toString()
Returns a string representation of the statistics for this
score.

Set<Tuple<E>> 
truePositives()
Returns the set of true positive relations for this scoring.

static <E> double 
withinClusterScatter(Set<? extends Set<? extends E>> clustering,
Distance<? super E> distance)
Returns the withincluster scatter measure for the specified
clustering with respect to the specified distance.

public ClusterScore(Set<? extends Set<? extends E>> referencePartition, Set<? extends Set<? extends E>> responsePartition)
The reference partition is taken to represent the "truth" or the "correct" answer, also known as the "gold standard". The response partition is the partition to evaluate against the reference partition.
If the specified partitions are not over the same sets or if the equivalence classes are not disjoint, an illegal argument exception is raised.
referencePartition
 Partition of reference elements.responsePartition
 Partition of response elements.IllegalArgumentException
 If the partitions are not
valid partitions over the same set of elements.public PrecisionRecallEvaluation equivalenceEvaluation()
public double mucPrecision()
public double mucRecall()
public double mucF()
public double b3ClusterPrecision()
public double b3ClusterRecall()
public double b3ClusterF()
public double b3ElementPrecision()
public double b3ElementRecall()
public double b3ElementF()
public Set<Tuple<E>> truePositives()
Tuple
. These true
positives will include both (x,y)
and
(y,x)
for a true positive relation between
x
and y
.public Set<Tuple<E>> falsePositives()
Tuple
. The false
positives will include both (x,y)
and
(y,x)
for a false positive relation between
x
and y
.public Set<Tuple<E>> falseNegatives()
Tuple
. The false
negative set will include both (x,y)
and
(y,x)
for a false negative relation between
x
and y
.public String toString()
public static <E> double withinClusterScatter(Set<? extends Set<? extends E>> clustering, Distance<? super E> distance)
scatter(Set,Distance)
for a definition of scatter.
withinClusterScatter(clusters,distance) = Σ_{cluster in clusters} scatter(cluster,distance)
As the number of clusters increases, the withincluster scatter decreases monotonically. Typically, this is used to determine how many clusters to return, by inspecting a plot of withincluster scatter against number of clusters and looking for a "knee" in the graph.
E
 the type of objects being clusteredclustering
 Clustering to evaluate.distance
 Distance against which to evaluate.public static <E> double scatter(Set<? extends E> cluster, Distance<? super E> distance)
xs[i]
for
the i
th element returned by the set's iterator,
scatter is defined by:
Note that elements are not compared to themselves. This presupposes a distance for which the distance of an element to itself is zero and which is symmetric.scatter(xs,distance) = Σ_{i} Σ_{j < i} distance(xs[i],xs[j])
E
 the type of objects being clusteredcluster
 Cluster to evaluate.distance
 Distance against which to evaluate.