leaderbot.data.load

leaderbot.data.load(filename=None, tie='tie', whitelist=None, clean=True, check_duplicacy=False)

Load data from a JSON file or URL.

Parameters:
filename : str, default=None

The name of a .json data file. It can be a path on the local machine or the URL of a file on a remote server accessible via HTTP or HTTPS. If None, a default file shipped with the package is used.

tie : {'none', 'tie', 'both'}, default='tie'

A string that determines how the third column of the output array Y is filled:

  • 'none': Y[:, 2] is filled with zeros, meaning no tie is counted.

  • 'tie': Y[:, 2] is filled with the counts of ties only, excluding ties where both are bad.

  • 'both': Y[:, 2] is filled with the sum of the counts of ties and of ties where both are bad.
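The three options can be illustrated with a minimal sketch. The helper `tie_count` and its arguments `n_tie` and `n_both_bad` are hypothetical names chosen for this example; they are not part of the leaderbot API, and the package's internal logic may differ.

```python
def tie_count(n_tie, n_both_bad, tie='tie'):
    """Return the tie count stored in the third column of Y for one pair.

    Hypothetical helper for illustration only (not part of leaderbot).
    """
    if tie == 'none':
        # No tie is counted
        return 0
    if tie == 'tie':
        # Only plain ties are counted
        return n_tie
    if tie == 'both':
        # Plain ties plus ties where both are bad
        return n_tie + n_both_bad
    raise ValueError("tie should be 'none', 'tie', or 'both'.")


# One pair with 4 plain ties and 3 ties where both are bad
for option in ('none', 'tie', 'both'):
    print(option, tie_count(4, 3, tie=option))
```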

whitelist : list or str, default=None

A list of agent names to be selected from the full set of agent names in the data. Alternatively, a .json filename can be provided, which should contain a list of names to be used.
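A whitelist file of this kind could be prepared as in the sketch below, which uses only the standard library. The file name `my-whitelist.json` and the temporary directory are assumptions made for this example.

```python
import json
import os
import tempfile

names = ["chatgpt-4o-latest", "gpt-4o-2024-05-13"]

# Write the names as a JSON list to a file (hypothetical location)
path = os.path.join(tempfile.mkdtemp(), 'my-whitelist.json')
with open(path, 'w') as f:
    json.dump(names, f)

# Reading the file back gives the same list of names
with open(path) as f:
    loaded = json.load(f)

# The file name could then be passed as: data = load(whitelist=path)
```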

clean : bool, default=True

If True, the pairs with zero win, loss, and tie counts are deleted from the list of data.
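The effect of this option can be sketched on a small hand-made data set. The values below are hypothetical, and the filtering shown is only an illustration of the described behavior, not the package's implementation.

```python
# Hypothetical data: X holds index pairs, Y holds (n_win, n_loss, n_ties)
X = [(0, 1), (0, 2), (1, 2)]
Y = [(5, 3, 1), (0, 0, 0), (2, 0, 4)]

# Keep only the pairs that have at least one nonzero count
kept = [(x, y) for x, y in zip(X, Y) if any(y)]
X_clean = [x for x, _ in kept]
Y_clean = [y for _, y in kept]

print(X_clean)
print(Y_clean)
```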

check_duplicacy : bool, default=False

If True, all pairs in the data are checked for duplicates.

Note

Performing this check is time consuming.
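A check of this kind might look like the following sketch. It assumes that two pairs are duplicates when they contain the same two indices regardless of order; the source does not state the exact criterion, so this is an assumption, and the data is hypothetical.

```python
import warnings
from collections import Counter

# Hypothetical list of index pairs
X = [(0, 1), (0, 2), (1, 0)]

# Assumption: order within a pair is ignored, so (1, 0) duplicates (0, 1)
counts = Counter(tuple(sorted(pair)) for pair in X)
duplicates = [pair for pair, n in counts.items() if n > 1]

if duplicates:
    warnings.warn(f'Duplicate pairs found: {duplicates}')
```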

Returns:
data : DataType

A dictionary containing the following key/values:

  • 'X':

    A list of tuples of two indices (i, j), each representing a match between the pair of agents with indices i and j.

  • 'Y':

    A list of tuples of three integers (n_win, n_loss, n_ties) representing the frequencies of win, loss, and ties between agents i and j given by the corresponding tuple in X.

  • 'models': A list of the names of the agents in the matches.
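The sketch below shows one way to read such a dictionary. The dictionary is built by hand with hypothetical agent names and counts, and the code assumes that the indices in 'X' refer to positions in the 'models' list.

```python
# Hypothetical dictionary with the same keys as the output described above
data = {
    'X': [(0, 1), (0, 2)],
    'Y': [(5, 3, 1), (2, 6, 0)],
    'models': ['agent-a', 'agent-b', 'agent-c'],
}

# Pair each match in X with its counts in Y and look up the agent names
lines = []
for (i, j), (n_win, n_loss, n_ties) in zip(data['X'], data['Y']):
    lines.append(f"{data['models'][i]} vs {data['models'][j]}: "
                 f"{n_win} wins, {n_loss} losses, {n_ties} ties")

print('\n'.join(lines))
```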

Raises:
If check_duplicacy is True:
Warning

If duplicates are found in the data.

Examples

>>> from leaderbot.data import load

>>> # Load default data provided by the package
>>> data = load()

>>> # Load from a file
>>> filename = '/scratch/user/my-data.json'
>>> data = load(filename)

>>> # Load default data, but only select a custom whitelist of names
>>> whitelist = [
...     "chatgpt-4o-latest",
...     "gemini-1.5-pro-exp-0801",
...     "gpt-4o-2024-05-13",
...     "gpt-4o-mini-2024-07-18",
... ]
>>> data = load(whitelist=whitelist)

>>> # Use a sample whitelist provided by the package
>>> from leaderbot.data import sample_whitelist
>>> data = load(whitelist=sample_whitelist)