leaderbot.evaluate.model_selection
- leaderbot.evaluate.model_selection(models, train=False, tie=False, report=True)
Evaluate model selection.
- Parameters:
- models : list[leaderbot.models.BaseModel]
A single model or a list of models to be evaluated.
Note
All models should be created from the same dataset so that the comparison is meaningful.
- train : bool, default=False
If True, the models will be trained. If False, it is assumed that the models are pre-trained.
- tie : bool, default=False
If False, ties in the data are not counted toward model evaluation. This option only affects the leaderbot.models.BradleyTerry model and has no effect on the other models.
- report : bool, default=True
If True, a table of the analysis is printed.
- Returns:
- metrics : dict
A dictionary containing the following keys and values:
'name': list of names of the models.
'n_param': list of number of parameters of the models.
'nll': list of negative log-likelihood values of the models.
'aic': list of Akaike information criterion of the models.
'bic': list of Bayesian information criterion of the models.
'cel_win': list of cross entropies for win outcomes.
'cel_loss': list of cross entropies for loss outcomes.
'cel_tie': list of cross entropies for tie outcomes.
'cel_all': list of cross entropies for all outcomes.
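Because every key maps to a list with one entry per model, it can be convenient to regroup the result into one record per model. The following is a minimal sketch in plain Python; the dictionary is a hand-written stand-in with illustrative values, and the helper `to_rows` is hypothetical and not part of leaderbot.

```python
# Hand-written stand-in for the returned dictionary (an assumption for
# illustration): each key maps to a list with one entry per model.
metrics = {
    'name': ['BradleyTerry', 'RaoKupper', 'Davidson'],
    'n_param': [129, 130, 130],
    'nll': [0.6554, 1.0095, 1.0100],
    'aic': [256.7, 258.0, 258.0],
    'bic': [1049.7, 1057.2, 1057.2],
}


def to_rows(metrics):
    """Transpose a dict of equal-length lists into a list of per-model dicts.

    Hypothetical helper, not a leaderbot function.
    """
    keys = list(metrics)
    return [dict(zip(keys, values)) for values in zip(*metrics.values())]


rows = to_rows(metrics)
for row in rows:
    # Each row holds all metrics of a single model.
    print(row['name'], row['n_param'], row['bic'])
```

The column-oriented layout of the original dictionary is left untouched; the row view is only a convenience for iterating over models.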
- Raises:
- RuntimeError
If train is False but at least one of the models is not pre-trained.
Examples
>>> import leaderbot as lb
>>> from leaderbot.models import BradleyTerry as BT
>>> from leaderbot.models import RaoKupper as RK
>>> from leaderbot.models import Davidson as DV

>>> # Obtain data
>>> data = lb.data.load()

>>> # Create a list of models to compare
>>> models = [
...     BT(data, k_cov=None),
...     BT(data, k_cov=0),
...     BT(data, k_cov=1),
...     RK(data, k_cov=None, k_tie=0),
...     RK(data, k_cov=0, k_tie=0),
...     RK(data, k_cov=1, k_tie=1),
...     DV(data, k_cov=None, k_tie=0),
...     DV(data, k_cov=0, k_tie=0),
...     DV(data, k_cov=0, k_tie=1)
... ]

>>> # Evaluate models
>>> metrics = lb.evaluate.model_selection(models, train=True,
...                                       report=True)
The above code outputs the following table:
+----+--------------+---------+--------+--------------------------------+---------+---------+
|    |              |         |        |              CEL               |         |         |
| id | model        | # param |    NLL |    all     win    loss     tie |     AIC |     BIC |
+----+--------------+---------+--------+--------------------------------+---------+---------+
|  1 | BradleyTerry |     129 | 0.6554 | 0.6553  0.3177  0.3376     inf |   256.7 |  1049.7 |
|  2 | BradleyTerry |     258 | 0.6552 | 0.6551  0.3180  0.3371     inf |   514.7 |  2100.8 |
|  3 | BradleyTerry |     387 | 0.6551 | 0.6550  0.3178  0.3372     inf |   772.7 |  3151.8 |
|  4 | RaoKupper    |     130 | 1.0095 | 1.0095  0.3405  0.3462  0.3227 |   258.0 |  1057.2 |
|  5 | RaoKupper    |     259 | 1.0092 | 1.0092  0.3408  0.3457  0.3228 |   516.0 |  2108.2 |
|  6 | RaoKupper    |     516 | 1.0102 | 1.0102  0.3403  0.3453  0.3245 |  1030.0 |  4202.1 |
|  7 | Davidson     |     130 | 1.0100 | 1.0100  0.3409  0.3461  0.3231 |   258.0 |  1057.2 |
|  8 | Davidson     |     259 | 1.0098 | 1.0098  0.3411  0.3455  0.3231 |   516.0 |  2108.2 |
|  9 | Davidson     |     387 | 1.0075 | 1.0075  0.3416  0.3461  0.3197 |   772.0 |  3151.1 |
+----+--------------+---------+--------+--------------------------------+---------+---------+
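One way to use such a result is to rank the models by an information criterion, where lower values are preferred. The sketch below does this in plain Python for BIC. The lists are copied by hand from the table above rather than produced by a live call, and `rank_by` is a hypothetical helper, not a leaderbot function.

```python
# Columns copied from the table above (ids 1 to 9). In practice these lists
# would come from the dictionary returned by model_selection.
metrics = {
    'name': ['BradleyTerry'] * 3 + ['RaoKupper'] * 3 + ['Davidson'] * 3,
    'n_param': [129, 258, 387, 130, 259, 516, 130, 259, 387],
    'bic': [1049.7, 2100.8, 3151.8, 1057.2, 2108.2, 4202.1,
            1057.2, 2108.2, 3151.1],
}


def rank_by(metrics, key):
    """Return 0-based model indices sorted by ascending metrics[key].

    Hypothetical helper; ties keep their original order because sorted()
    is stable.
    """
    values = metrics[key]
    return sorted(range(len(values)), key=lambda i: values[i])


order = rank_by(metrics, 'bic')
best = order[0]

# Table ids are 1-based, list indices are 0-based.
print(f"lowest BIC: id {best + 1} ({metrics['name'][best]}, "
      f"{metrics['n_param'][best]} parameters, BIC {metrics['bic'][best]})")
```

Note that the table reports inf for the tie cross entropy of the BradleyTerry models, so whether their criteria are directly comparable with those of the tie-aware models is a judgment the ranking alone does not settle.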