leaderbot.models.RaoKupper#

class leaderbot.models.RaoKupper(data, k_cov=0, k_tie=1)#

Generalized Rao-Kupper model.

Parameters:
datadict

A dictionary of data that is provided by leaderbot.data.load().

k_covint, default=0

Determines the structure of covariance in the model based on the following values:

  • None: this means no covariance is used in the model. Together with setting k_tie=0, the original Rao-Kupper model is retrieved.

  • 0: this assumes covariance is a diagonal matrix.

  • positive integer: this assumes covariance is a diagonal plus low-rank matrix where the rank of low-rank approximation is k_cov.

See Notes below for further details.

k_tieint, default=1

Determines the rank of the low-rank factor structure for modeling tie outcomes based on the following values:

  • 0: this assumes no low-rank factor model. Together with setting k_cov=None, the original Rao-Kupper model is retrieved.

  • positive integer: this employs a low-rank structure for modeling tie outcomes with rank k_tie.

See Notes below for further details.
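
As an illustration, the following sketch constructs both the original Rao-Kupper model and a generalized variant from the same data (using the load() call shown in the Examples section below):

>>> from leaderbot.data import load
>>> from leaderbot.models import RaoKupper

>>> data = load()

>>> # Original Rao-Kupper model: no covariance, single tie threshold
>>> model_original = RaoKupper(data, k_cov=None, k_tie=0)

>>> # Generalized model: diagonal plus rank-2 covariance, rank-1 tie factor
>>> model_general = RaoKupper(data, k_cov=2, k_tie=1)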

Notes

This class implements a generalization of the Rao-Kupper model based on [1], incorporating covariance and tie factor models.

Covariance Model:

This model utilizes a covariance matrix with diagonal plus low-rank structure of the form

\[\mathbf{\Sigma} = \mathbf{D} + \mathbf{\Lambda} \mathbf{\Lambda}^{\intercal},\]

where

  • \(\mathbf{\Sigma}\) is an \(m \times m\) symmetric positive semi-definite covariance matrix where \(m\) is the number of agents (competitors).

  • \(\mathbf{D}\) is an \(m \times m\) diagonal matrix with non-negative diagonal entries.

  • \(\mathbf{\Lambda}\) is a full-rank \(m \times k_{\mathrm{cov}}\) matrix, where \(k_{\mathrm{cov}}\) is given by the input parameter k_cov.

If k_cov=None, the covariance matrix is not used in the model, retrieving the original Rao-Kupper model [2] (along with setting k_tie=0). If k_cov=0, the covariance model reduces to a diagonal matrix \(\mathbf{D}\).
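
The following is a minimal NumPy sketch, not the library's internal implementation, illustrating this diagonal plus low-rank structure for hypothetical values of \(m\) and \(k_{\mathrm{cov}}\):

>>> import numpy as np

>>> m, k_cov = 5, 2                        # hypothetical number of agents and rank
>>> rng = np.random.default_rng(0)
>>> D = np.diag(rng.uniform(size=m))       # diagonal matrix with non-negative entries
>>> Lam = rng.standard_normal((m, k_cov))  # full-rank m x k_cov factor Lambda
>>> Sigma = D + Lam @ Lam.T                # diagonal plus low-rank covariance

>>> Sigma.shape
(5, 5)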

Tie Model:

Modeling ties in the Rao-Kupper model introduces a threshold parameter \(\eta\). In the generalized Rao-Kupper model, the threshold parameter is instead modeled by an additive low-rank structure of the form

\[\begin{split}\mathbf{H} = \begin{cases} \mathbf{G} \boldsymbol{\Phi}^{\intercal} + \boldsymbol{\Phi} \mathbf{G}^{\intercal}, & 0 < k_{\mathrm{tie}} \leq m, \\ \eta \mathbf{J}, & k_{\mathrm{tie}} = 0, \end{cases}\end{split}\]

where

  • \(\mathbf{H}\) is an \(m \times m\) symmetric matrix whose elements represent pair-specific thresholds, and \(m\) is the number of agents (competitors).

  • \(\mathbf{G}\) is an \(m \times k_{\mathrm{tie}}\) matrix of parameters with full rank \(k_{\mathrm{tie}}\), where \(k_{\mathrm{tie}}\) is given by the input argument k_tie.

  • \(\boldsymbol{\Phi}\) is an \(m \times k_{\mathrm{tie}}\) orthonormal matrix of basis functions.

  • \(\mathbf{J}\) is an \(m \times m\) matrix of all ones.

Setting k_tie = 0 leads to a model with a single tie threshold, retrieving the original Rao-Kupper model (along with setting k_cov=None).

A similar approach that also models tie outcomes is the leaderbot.models.Davidson model.
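
Along the same lines, a minimal NumPy sketch (illustrative only, not the library's implementation) of the low-rank tie threshold structure for \(0 < k_{\mathrm{tie}} \leq m\):

>>> import numpy as np

>>> m, k_tie = 5, 2                        # hypothetical number of agents and rank
>>> rng = np.random.default_rng(0)
>>> G = rng.standard_normal((m, k_tie))    # m x k_tie parameter matrix G
>>> Phi, _ = np.linalg.qr(rng.standard_normal((m, k_tie)))  # orthonormal basis Phi
>>> H = G @ Phi.T + Phi @ G.T              # symmetric matrix of pair-specific thresholds

>>> H.shape
(5, 5)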

Best Practices for Setting Parameters:

The number of model parameters and training time scale with \(k_{\mathrm{cov}}\) and \(k_{\mathrm{tie}}\). Depending on the dataset size, choosing too small or too large a value for these parameters can lead to under- or over-parameterization. In practice, moderate values of \(1 \sim 10\) often balance model fit, test accuracy, and training runtime efficiency.
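
As an illustration of such a moderate setting (the specific values below are arbitrary rather than a tuned recommendation, and data is assumed to be loaded with leaderbot.data.load() as in the Examples section):

>>> model = RaoKupper(data, k_cov=5, k_tie=5)
>>> model.train()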

References

[1]

Siavash Ameli, Siyuan Zhuang, Ion Stoica, and Michael W. Mahoney. A Statistical Framework for Ranking LLM-Based Chatbots. The Thirteenth International Conference on Learning Representations, 2025.

[2]

P. V. Rao and L. L. Kupper. Ties in Paired-Comparison Experiments: A Generalization of the Bradley-Terry Model. Journal of the American Statistical Association, 62(317), 194–204, 1967.

Examples

>>> from leaderbot.data import load
>>> from leaderbot.models import RaoKupper

>>> # Create a model
>>> data = load()
>>> model = RaoKupper(data)

>>> # Train the model
>>> model.train()

>>> # Make inference
>>> prob = model.infer()
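
The trained model also exposes the ranking utilities listed under Methods; a brief continuation of the above (the assignment to ranking assumes rank() returns the ranking, as suggested by its description):

>>> # Rank competitors and print the leaderboard
>>> ranking = model.rank()
>>> model.leaderboard()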

Attributes:
xnp.ndarray

A 2D array of integers with the shape (n_pairs, 2) where each row consists of indices [i, j] representing a match between a pair of agents with the indices i and j.

ynp.ndarray

A 2D array of integers with the shape (n_pairs, 3) where each row consists of three counts [n_win, n_loss, n_ties] representing the frequencies of wins, losses, and ties between agents i and j given by the corresponding row of the input array x.

agentslist

A list of length n_agents containing the names of the agents.

n_agentsint

Number of agents.

paramnp.array, default=None

The model parameters. This array is set once the model is trained.

n_paramint

Number of parameters.

k_covint

Number of factors for matrix factorization.

k_tieint

Number of factors used in tie parameters.

Methods

loss([w, return_jac, constraint])

Total loss for all data instances.

train([init_param, method, max_iter, tol])

Tune model parameters with maximum likelihood estimation method.

infer([x])

Infer the probabilities of win, loss, and tie outcomes.

predict([x])

Predict outcome between competitors.

fisher([w, epsilon, order])

Observed Fisher information matrix.

rank()

Rank competitors based on their scores.

leaderboard([max_rank])

Print leaderboard of the agent matches.

marginal_outcomes([max_rank, bg_color, ...])

Plot marginal probabilities and frequencies of win, loss, and tie.

map_distance([ax, cmap, max_rank, method, ...])

Visualize distance between agents using manifold learning projection.

cluster([ax, max_rank, tier_label, method, ...])

Cluster competitors to performance tiers.

scores()

Get scores.

plot_scores([max_rank, horizontal, ...])

Plot competitors' scores by rank.

match_matrix([max_rank, density, source, ...])

Plot match matrices of win and tie counts of mutual matches.