leaderbot.models.RaoKupper
- class leaderbot.models.RaoKupper(data, k_cov=0, k_tie=1)
Generalized Rao-Kupper model.
- Parameters:
- data : dict
A dictionary of data that is provided by leaderbot.data.load().
- k_cov : int, default=0
Determines the structure of covariance in the model based on the following values:
None: no covariance is used in the model. Together with setting k_tie=0, the original Rao-Kupper model is retrieved.
0: the covariance is assumed to be a diagonal matrix.
positive integer: the covariance is assumed to be a diagonal plus low-rank matrix, where the rank of the low-rank approximation is k_cov.
See Notes below for further details.
- k_tie : int, default=1
Determines the rank of the low-rank factor structure for modeling tie outcomes based on the following values:
0: no low-rank factor model is used. Together with setting k_cov=None, the original Rao-Kupper model is retrieved.
positive integer: a low-rank structure with rank k_tie is employed for modeling tie outcomes.
See Notes below for further details.
See also
Notes
This class implements a generalization of the Rao-Kupper model based on [1], incorporating covariance and tie factor models.
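For orientation, the classical (non-generalized) Rao-Kupper model [2] can be sketched as below. This is a minimal illustration, assuming log-scale scores and a single threshold \(\eta \geq 0\) with \(\theta = e^{\eta}\); `rao_kupper_probs` is a hypothetical helper, not part of leaderbot, and the library's internal parameterization may differ.

```python
import math


def rao_kupper_probs(score_i, score_j, eta):
    """Classical Rao-Kupper probabilities for one pair (illustrative only).

    score_i, score_j : log-scale strengths of agents i and j
    eta              : non-negative tie threshold (theta = exp(eta))
    """
    # P(i beats j) = pi_i / (pi_i + theta * pi_j), written with log-scale scores
    p_win = 1.0 / (1.0 + math.exp(score_j - score_i + eta))
    # P(j beats i) follows by swapping the roles of i and j
    p_loss = 1.0 / (1.0 + math.exp(score_i - score_j + eta))
    # The remaining probability mass is assigned to a tie
    p_tie = 1.0 - p_win - p_loss
    return p_win, p_loss, p_tie


p_win, p_loss, p_tie = rao_kupper_probs(1.0, 0.0, 0.5)
```

With `eta = 0` the tie probability vanishes and the expression reduces to the Bradley-Terry form.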
Covariance Model:
This model utilizes a covariance matrix with diagonal plus low-rank structure of the form
\[\mathbf{\Sigma} = \mathbf{D} + \mathbf{\Lambda} \mathbf{\Lambda}^{\intercal},\]
where
\(\mathbf{\Sigma}\) is an \(m \times m\) symmetric positive semi-definite covariance matrix, where \(m\) is the number of agents (competitors).
\(\mathbf{D}\) is an \(m \times m\) diagonal matrix with non-negative diagonals.
\(\mathbf{\Lambda}\) is a full-rank \(m \times k_{\mathrm{cov}}\) matrix, where \(k_{\mathrm{cov}}\) is given by the input parameter k_cov.
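The diagonal plus low-rank structure can be illustrated with a small sketch. It uses plain nested lists and made-up numbers; `diag_plus_low_rank` is a hypothetical helper written for this illustration and is not a leaderbot function.

```python
def diag_plus_low_rank(d, lam):
    """Return Sigma = D + Lambda Lambda^T as a nested list.

    d   : list of m non-negative diagonal entries of D
    lam : m x k_cov nested list holding Lambda (rows may be empty for k_cov = 0)
    """
    m = len(d)
    # Low-rank part: entry (i, j) is the inner product of rows i and j of Lambda
    sigma = [[sum(a * b for a, b in zip(lam[i], lam[j])) for j in range(m)]
             for i in range(m)]
    # Add the diagonal part D
    for i in range(m):
        sigma[i][i] += d[i]
    return sigma


d = [0.5, 1.0, 0.2]                    # m = 3 agents
lam = [[1.0], [0.5], [-1.0]]           # k_cov = 1
sigma = diag_plus_low_rank(d, lam)

# With no columns in Lambda (k_cov = 0), Sigma is just the diagonal matrix D
sigma_diag = diag_plus_low_rank(d, [[], [], []])
```

The off-diagonal entries of `sigma` come entirely from the low-rank factor, which is why the diagonal-only case carries no cross-agent covariance.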
If k_cov=None, the covariance matrix is not used in the model, retrieving the original Rao-Kupper model [2] (along with setting k_tie=0). If k_cov=0, the covariance model reduces to a diagonal matrix \(\mathbf{D}\).
Tie Model:
Modeling ties in the Rao-Kupper model introduces a threshold parameter \(\eta\). In the generalized Rao-Kupper model, the threshold is instead modeled by an additive low-rank structure of the form
\[\begin{split}\mathbf{H} = \begin{cases} \mathbf{G} \boldsymbol{\Phi}^{\intercal} + \boldsymbol{\Phi} \mathbf{G}^{\intercal}, & 0 < k_{\mathrm{tie}} \leq m, \\ \eta \mathbf{J}, & k_{\mathrm{tie}} = 0, \end{cases}\end{split}\]
where
\(\mathbf{H}\) is an \(m \times m\) symmetric matrix whose elements represent pair-specific thresholds, and \(m\) is the number of agents (competitors).
\(\mathbf{G}\) is a full-rank \(m \times k_{\mathrm{tie}}\) matrix of parameters, where \(k_{\mathrm{tie}}\) is given by the input argument k_tie.
\(\boldsymbol{\Phi}\) is an \(m \times k_{\mathrm{tie}}\) orthonormal matrix of basis functions.
\(\mathbf{J}\) is an \(m \times m\) matrix of all ones.
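The two cases of \(\mathbf{H}\) can be sketched with plain nested lists. The helpers `low_rank_thresholds` and `single_threshold` are hypothetical names, and the constant basis column used for \(\boldsymbol{\Phi}\) is an assumption chosen only because it is trivially orthonormal; the basis functions leaderbot actually uses are not specified here.

```python
import math


def low_rank_thresholds(g, phi):
    """H = G Phi^T + Phi G^T for m x k_tie nested lists g and phi."""
    m = len(g)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    return [[dot(g[i], phi[j]) + dot(phi[i], g[j]) for j in range(m)]
            for i in range(m)]


def single_threshold(eta, m):
    """H = eta * J, the k_tie = 0 case with one shared threshold."""
    return [[eta] * m for _ in range(m)]


m = 3
c = 1.0 / math.sqrt(m)
phi = [[c], [c], [c]]          # assumed basis: one unit-norm constant column
g = [[0.3], [0.1], [-0.2]]     # made-up parameters, k_tie = 1

h = low_rank_thresholds(g, phi)
h0 = single_threshold(0.2, m)
```

With this assumed basis and `k_tie = 1`, each threshold is additive in the two agents, `h[i][j] = (g[i][0] + g[j][0]) / sqrt(m)`, which makes the symmetry of \(\mathbf{H}\) easy to see.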
Setting k_tie=0 leads to a model with a single tie threshold, retrieving the original Rao-Kupper model (along with setting k_cov=None).
A similar approach that also models tie outcomes is the leaderbot.models.Davidson model.
Best Practices for Setting Parameters:
The number of model parameters and the training time both grow with \(k_{\mathrm{cov}}\) and \(k_{\mathrm{tie}}\). Depending on the dataset size, choosing too small or too large a value for these parameters can lead to under- or over-parameterization. In practice, moderate values in the range of 1 to 10 often balance model fit, test accuracy, and training time.
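To get a feel for this scaling, the following back-of-the-envelope count adds up the sizes of the matrices described in the Notes. It is an assumption-laden sketch: `rough_param_count` is a hypothetical helper, and the result is not guaranteed to match the library's own n_param attribute, which may include or exclude terms differently.

```python
def rough_param_count(m, k_cov=0, k_tie=1):
    """Rough parameter count from the matrix shapes in the Notes (illustrative).

    m     : number of agents
    k_cov : None (no covariance), 0 (diagonal), or a positive rank
    k_tie : 0 (single threshold eta) or a positive rank
    """
    n = m                              # assumed: one score per agent
    if k_cov is not None:
        n += m                         # diagonal entries of D
        n += m * k_cov                 # entries of the m x k_cov matrix Lambda
    n += m * k_tie if k_tie > 0 else 1  # entries of G, or the single eta
    return n


for k in (0, 1, 5, 10):
    print(k, rough_param_count(100, k_cov=k, k_tie=k))
```

Under these assumptions the count grows linearly in both ranks, so doubling k_cov or k_tie roughly doubles the corresponding block of parameters.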
References
[1] Siavash Ameli, Siyuan Zhuang, Ion Stoica, and Michael W. Mahoney. A Statistical Framework for Ranking LLM-Based Chatbots. The Thirteenth International Conference on Learning Representations, 2025.
[2] P. V. Rao and L. L. Kupper. Ties in Paired-Comparison Experiments: A Generalization of the Bradley-Terry Model. Journal of the American Statistical Association, 62(317), 194–204, 1967.
Examples
>>> from leaderbot.data import load
>>> from leaderbot.models import RaoKupper

>>> # Create a model
>>> data = load()
>>> model = RaoKupper(data)

>>> # Train the model
>>> model.train()

>>> # Make inference
>>> prob = model.infer()
- Attributes:
- x : np.ndarray
A 2D array of integers with the shape (n_pairs, 2), where each row consists of indices [i, j] representing a match between a pair of agents with the indices i and j.
- y : np.ndarray
A 2D array of integers with the shape (n_pairs, 3), where each row consists of three counts [n_win, n_loss, n_ties] representing the frequencies of win, loss, and ties between agents i and j given by the corresponding row of the input array x.
- agents : list
A list of the length n_agents representing the names of agents.
- n_agents : int
Number of agents.
- param : np.array, default=None
The model parameters. This array is set once the model is trained.
- n_param : int
Number of parameters.
- k_cov : int
Number of factors for matrix factorization.
- n_tie_factor : int
Number of factors used in tie parameters.
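The layout of the x and y attributes can be illustrated with a toy example. The agent names and counts below are made up, plain lists stand in for the NumPy arrays, and `empirical_frequencies` is a hypothetical helper rather than a leaderbot method.

```python
# Toy data in the documented layout (values are invented for illustration)
agents = ["A", "B", "C"]
x = [[0, 1], [0, 2], [1, 2]]             # shape (n_pairs, 2): indices [i, j]
y = [[6, 3, 1], [2, 2, 0], [1, 4, 5]]    # shape (n_pairs, 3): [n_win, n_loss, n_ties]


def empirical_frequencies(x, y, agents):
    """Turn per-pair counts into observed win/loss/tie frequencies."""
    rows = []
    for (i, j), (n_win, n_loss, n_ties) in zip(x, y):
        total = n_win + n_loss + n_ties
        rows.append((agents[i], agents[j],
                     n_win / total, n_loss / total, n_ties / total))
    return rows


for row in empirical_frequencies(x, y, agents):
    print(row)
```

Each row of y is read from the perspective of the first agent in the matching row of x, so a win for agent i is a loss for agent j.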
Methods
- loss([w, return_jac, constraint])
Total loss for all data instances.
- train([init_param, method, max_iter, tol])
Tune model parameters with maximum likelihood estimation method.
- infer([x])
Infer the probabilities of win, loss, and tie outcomes.
- predict([x])
Predict outcome between competitors.
- fisher([w, epsilon, order])
Observed Fisher information matrix.
- rank()
Rank competitors based on their scores.
- leaderboard([max_rank])
Print leaderboard of the agent matches.
- marginal_outcomes([max_rank, bg_color, ...])
Plot marginal probabilities and frequencies of win, loss, and tie.
- map_distance([ax, cmap, max_rank, method, ...])
Visualize distance between agents using manifold learning projection.
- cluster([ax, max_rank, tier_label, method, ...])
Cluster competitors to performance tiers.
- scores()
Get scores.
- plot_scores([max_rank, horizontal, ...])
Plot competitors' scores by rank.
- match_matrix([max_rank, density, source, ...])
Plot match matrices of win and tie counts of mutual matches.