public class DiscreteChooser extends Object implements Serializable
The DiscreteChooser class implements multinomial conditional logit discrete choice analysis with variables varying over alternatives. The model is characterized by a coefficient vector β, which is supplied at construction time. Choices are represented as vectors, and there may be any number of them.
Suppose the model is presented with N choices represented as vectors α[0], ..., α[N-1]. The probability of choice n, for 0 <= n < N, is

p(n | α, β) = exp(α[n] * β) / Z

where the asterisk (*) represents the vector dot product, and the normalizing constant Z is defined by summation in the usual way:

Z = Σ_{0 <= n < N} exp(α[n] * β)
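The probability computation above can be sketched in plain Java. This is a minimal hypothetical helper, not LingPipe's implementation; it assumes choice vectors and β are represented as double[] rather than the Vector interface:

```java
public class ConditionalLogit {

    // Dot product alpha * beta.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i)
            sum += a[i] * b[i];
        return sum;
    }

    // p(n|alpha,beta) = exp(alpha[n]*beta) / Z, where Z sums
    // exp(alpha[n]*beta) over all alternatives n.
    static double[] choiceProbs(double[][] alphas, double[] beta) {
        double[] probs = new double[alphas.length];
        double z = 0.0;
        for (int n = 0; n < alphas.length; ++n) {
            probs[n] = Math.exp(dot(alphas[n], beta));
            z += probs[n];
        }
        for (int n = 0; n < probs.length; ++n)
            probs[n] /= z;
        return probs;
    }

    public static void main(String[] args) {
        // Two alternatives with exponentiated scores 2 and 1,
        // so probabilities 2/3 and 1/3.
        double[][] alphas = { { 1.0, 0.0 }, { 0.0, 1.0 } };
        double[] beta = { Math.log(2.0), 0.0 };
        double[] probs = choiceProbs(alphas, beta);
        System.out.printf("%.4f %.4f%n", probs[0], probs[1]);
    }
}
```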
This model is related to logistic regression in that it is a log-linear model. Thus intercepts should not be added to choice vectors, and a regression prior with an uninformative intercept should not be used for estimation.
For any pair of choices n and m, the ratio of their probabilities is

p(n | α, β) / p(m | α, β) = (exp(α[n] * β) / Z) / (exp(α[m] * β) / Z) = exp(α[n] * β) / exp(α[m] * β).

The value thus depends only on α[n] and α[m] (and the model's coefficient vector β).
This is fine when the choices are independent, but problematic when there are dependencies between them. A standard example: a choice between items A and B may be modeled properly, but then an item B' that is very much like B is added to the mix. For instance, consider choosing between a California cabernet and a Bordeaux. Suppose you have a 2/3 probability of choosing the Bordeaux and a 1/3 probability of choosing the California cabernet. Now add a second California cabernet that is very similar to the first (as measured by the model, of course). The probabilities will then be roughly 1/2 for the Bordeaux and 1/4 for each of the California cabernets. With similar choices, the probability of each should go down; if the two cabernets were identical (perfectly correlated), the right answer would seem to be a 2/3 probability of choosing the Bordeaux and a 1/6 probability of choosing each California cabernet.
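The wine example can be checked numerically. The sketch below is a hypothetical calculation that assumes the Bordeaux's exponentiated score exp(α[n] * β) is twice each cabernet's; it shows the probabilities shifting from 2/3 and 1/3 to 1/2, 1/4, 1/4, while the Bordeaux/cabernet ratio stays fixed at 2:

```java
public class IiaDemo {
    public static void main(String[] args) {
        // Assumed exponentiated scores: Bordeaux twice each cabernet.
        double bordeaux = 2.0;
        double cab = 1.0;

        // Two choices: probabilities 2/3 and 1/3.
        double z2 = bordeaux + cab;
        System.out.printf("two choices:   %.4f %.4f%n",
                          bordeaux / z2, cab / z2);

        // Add a second, nearly identical cabernet:
        // probabilities become 1/2, 1/4, 1/4.
        double z3 = bordeaux + cab + cab;
        System.out.printf("three choices: %.4f %.4f %.4f%n",
                          bordeaux / z3, cab / z3, cab / z3);

        // The Bordeaux/cabernet ratio is unchanged (2.0 both times),
        // which is exactly the independence-of-irrelevant-alternatives
        // behavior discussed above.
        System.out.printf("ratio: %.1f vs %.1f%n",
                          (bordeaux / z2) / (cab / z2),
                          (bordeaux / z3) / (cab / z3));
    }
}
```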
If some of the coefficients are features of a chooser, the model may be used to represent the decision function of multiple choosers. In this case, all feature tying is up to the implementing class. Typically, there will be interaction features included between the chooser and the choice. Returning to the wine example, different choosers might put different weights on the degree of new oak used, acid levels, or complexity. In these cases, overall preferences may be represented by chooser-independent variables, and chooser-dependent preferences would then be interpreted relative to them.
Constructor and Description 

DiscreteChooser(Vector coefficients)
Construct a discrete chooser with the specified
coefficient vector.

Modifier and Type  Method and Description 

double[] 
choiceLogProbs(Vector[] choices)
Return an array of (natural) log choice probabilities
corresponding to the input array of choices.

double[] 
choiceProbs(Vector[] choices)
Return an array of choice probabilities corresponding to the
input array of choices.

int 
choose(Vector[] choices)
Returns the most likely choice among the choices in the
specified array of vectors.

Vector 
coefficients()
Return an unmodifiable view of the coefficients
underlying this discrete chooser.

static DiscreteChooser 
estimate(Vector[][] alternativess,
int[] choices,
RegressionPrior prior,
int priorBlockSize,
AnnealingSchedule annealingSchedule,
double minImprovement,
int minEpochs,
int maxEpochs,
Reporter reporter)
Returns a discrete choice model estimated from the specified
training data, prior, and learning parameters.

String 
toString()
Return a string-based representation of the coefficient
vector underlying this discrete chooser.

public DiscreteChooser(Vector coefficients)
Construct a discrete chooser with the specified coefficient vector.
Parameters:
coefficients - Coefficient vector.

public int choose(Vector[] choices)
Returns the most likely choice among the choices in the specified array of vectors.
Parameters:
choices - Array of alternative choices represented as vectors.
Throws:
IllegalArgumentException - If there is not at least one choice.

public double[] choiceProbs(Vector[] choices)
Return an array of choice probabilities corresponding to the input array of choices.
Parameters:
choices - Array of alternative choices represented as vectors.
Throws:
IllegalArgumentException - If there is not at least one choice.

public double[] choiceLogProbs(Vector[] choices)
Return an array of (natural) log choice probabilities corresponding to the input array of choices.
Parameters:
choices - Array of alternative choices represented as vectors.
Throws:
IllegalArgumentException - If there is not at least one choice.

public Vector coefficients()
Return an unmodifiable view of the coefficients underlying this discrete chooser.

public String toString()
Return a string-based representation of the coefficient vector underlying this discrete chooser.
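Log probabilities of the kind choiceLogProbs returns are best computed with the log-sum-exp trick, which subtracts the maximum score before exponentiating so that large dot products do not overflow. The following is a minimal standalone sketch of that technique, not LingPipe's implementation, again using double[] in place of Vector:

```java
public class LogChoiceProbs {

    // Dot product alpha * beta.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i)
            sum += a[i] * b[i];
        return sum;
    }

    // log p(n|alpha,beta) = alpha[n]*beta - log Z, with log Z computed
    // stably as max + log(sum of exp(score - max)).
    static double[] choiceLogProbs(double[][] alphas, double[] beta) {
        double[] scores = new double[alphas.length];
        double max = Double.NEGATIVE_INFINITY;
        for (int n = 0; n < alphas.length; ++n) {
            scores[n] = dot(alphas[n], beta);
            max = Math.max(max, scores[n]);
        }
        double sum = 0.0;
        for (double s : scores)
            sum += Math.exp(s - max);
        double logZ = max + Math.log(sum);
        double[] logProbs = new double[scores.length];
        for (int n = 0; n < scores.length; ++n)
            logProbs[n] = scores[n] - logZ;
        return logProbs;
    }

    public static void main(String[] args) {
        // Scores of 1000 and 999 would overflow a naive exp(score)/Z.
        double[][] alphas = { { 1000.0 }, { 999.0 } };
        double[] beta = { 1.0 };
        double[] lps = choiceLogProbs(alphas, beta);
        System.out.printf("%.4f %.4f%n", lps[0], lps[1]);
    }
}
```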
public static DiscreteChooser estimate(Vector[][] alternativess, int[] choices, RegressionPrior prior, int priorBlockSize, AnnealingSchedule annealingSchedule, double minImprovement, int minEpochs, int maxEpochs, Reporter reporter)
Training is carried out using stochastic gradient descent. The prior is applied only once every priorBlockSize examples, and again at the end of each epoch to catch up.
The reporter receives LogLevel.INFO reports on the parameters, and LogLevel.DEBUG reports on a per-epoch basis of learning rate, log likelihood, log prior, and totals.
Parameters:
alternativess - An array of vectors for each training instance.
choices - The index of the vector chosen for each training instance.
prior - The prior to apply to coefficients.
priorBlockSize - Period with which the prior is applied.
annealingSchedule - Learning rates per epoch.
minImprovement - Minimum improvement in the rolling average of log likelihood plus prior to compute another epoch.
minEpochs - Minimum number of epochs to compute.
maxEpochs - Maximum number of epochs.
reporter - Reporter to which progress reports are sent.
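The stochastic gradient step used in training can be sketched as follows. This is a simplified illustration, not LingPipe's estimate implementation: it omits the prior, the annealing schedule, and convergence checks, and uses double[] in place of Vector. For a training instance with alternatives α and observed choice c, the gradient of the log likelihood with respect to β is α[c] minus the probability-weighted average of the α[n], and each step moves β a learning-rate-sized distance in that direction:

```java
public class SgdSketch {

    // Dot product alpha * beta.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; ++i)
            sum += a[i] * b[i];
        return sum;
    }

    // Conditional logit probabilities p(n|alpha,beta).
    static double[] choiceProbs(double[][] alphas, double[] beta) {
        double[] probs = new double[alphas.length];
        double z = 0.0;
        for (int n = 0; n < alphas.length; ++n) {
            probs[n] = Math.exp(dot(alphas[n], beta));
            z += probs[n];
        }
        for (int n = 0; n < probs.length; ++n)
            probs[n] /= z;
        return probs;
    }

    // One stochastic gradient ascent step on the log likelihood:
    // beta += rate * (alpha[choice] - sum_n p(n) * alpha[n]).
    static void sgdStep(double[][] alphas, int choice,
                        double[] beta, double rate) {
        double[] probs = choiceProbs(alphas, beta);
        for (int i = 0; i < beta.length; ++i) {
            double expected = 0.0;
            for (int n = 0; n < alphas.length; ++n)
                expected += probs[n] * alphas[n][i];
            beta[i] += rate * (alphas[choice][i] - expected);
        }
    }

    public static void main(String[] args) {
        double[][] alphas = { { 1.0, 0.0 }, { 0.0, 1.0 } };
        double[] beta = { 0.0, 0.0 };
        // Observing choice 0 every time drives p(0) toward 1.
        for (int epoch = 0; epoch < 100; ++epoch)
            sgdStep(alphas, 0, beta, 0.5);
        System.out.printf("p(0) = %.3f%n", choiceProbs(alphas, beta)[0]);
    }
}
```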