probcalc package

Module contents

This is the top-level probcalc package, which contains all the subpackages and submodules of the project.

Here’s a table of user-friendly aliases and the backend classes they refer to:

Alias	Class name
P	`probcalc.distribution_classes.ProbabilityCalculator`
B	`probcalc.distributions.BinomialDistribution`
Po	`probcalc.distributions.PoissonDistribution`
N	`probcalc.distributions.NormalDistribution`
Geo	`probcalc.distributions.GeometricDistribution`

Submodules

probcalc.distribution_classes module

A simple module to contain superclasses to be used by distributions.

exception probcalc.distribution_classes.NonsenseError

Bases: Exception

A simple error representing mathematical nonsense.

This could be a probability that doesn’t make sense, or getting more successes than trials, etc.

class probcalc.distribution_classes._Bounds

Bases: object

This is a simple little class to hold bounds for a Distribution object.

__init__()

Create a _Bounds object with default bounds.

These default bounds are (None, False), meaning everything up to but not including the natural bounds of the distribution. We don’t include them, because evaluating probability at something like infinity might not always make sense.

lower: tuple[int | None, bool]

The lower of the two bounds.

The first element of the tuple is the value of the bound itself. None means the natural bound of the distribution. This can be 0, negative infinity, or something else depending on the distribution.

The second element of the tuple is whether the value of the bound should be included in probability calculations or not.

upper: tuple[int | None, bool]

The upper of the two bounds.

The first element of the tuple is the value of the bound itself. None means the natural bound of the distribution. This can be the maximum number of trials, infinity, or something else depending on the distribution.

The second element of the tuple is whether the value of the bound should be included in probability calculations or not.

__repr__() → str: Return a simple repr of the object, containing the value of the lower and upper bounds.

__eq__(other)

Check equality.

This dunder method has been implemented purely to allow distributions to throw errors when users attempt to combine inequality and equality logic operators. To check against that, though, we need to be able to check _Bounds equality.

class probcalc.distribution_classes.Distribution

Bases: ABC

This is an abstract superclass representing an arbitrary probability distribution.

It implements logical comparison dunder methods and calculate(), which allow it to be used easily in calculate_probability().

__init__(*, accepts_floats: bool)

Create a Distribution object with natural bounds and one flag.

Parameters: accepts_floats (bool) – Whether this distribution should accept floats

_accepts_floats: bool

This attribute is a flag for whether this distribution accepts floats, or only accepts ints.

If it accepts floats, then it is continuous; if it doesn’t, then it’s discrete.

Note

All logical comparison dunder methods implemented here check against this flag and return NotImplemented if the user tries to compare a discrete distribution with a float.

_negate_probability: bool: This attribute is a flag set by __ne__() and used by calculate() for the != operator.

reset() → None: Reset the bounds of the distribution to the defaults, and reset the _negate_probability flag.

abstract __repr__() → str: Return a simple repr of the distribution, normally the syntax used to construct it.

__eq__(other)

Set the upper and lower bounds to other, if possible.

This method checks the bounds against the defaults to see if the user has previously compared this distribution with an inequality operator. If they have, then we raise an error.

Raises: NonsenseError – If the user has tried to mix inequality and equality comparison

__ne__(other)

Set the upper and lower bounds to other, if possible, and set the _negate_probability flag.

See __eq__().

Raises: NonsenseError – If the user has tried to mix inequality and equality comparison

__lt__(other): Set the upper bound and don’t include this value.

__le__(other): Set the upper bound and include this value.

__gt__(other): Set the lower bound and don’t include this value.

__ge__(other): Set the lower bound and include this value.

calculate(*, strict: bool = True) → float

Return the probability of a random variable from this distribution taking on a value within its bounds.

Warning

If strict is False, errors are ignored instead of raised, which can lead to undefined behaviour. Beware.

Warning

This method should only really be used in non-interactive code such as scripts, because it can easily result in undefined behaviour when the Distribution object is mutated between calls, which is often done with logical comparison operators.

If you want a good way to calculate probability interactively, see calculate_probability().

Parameters: strict (bool) – Whether to raise errors or just ignore them
Returns float: The calculated probability

abstract pmf(value: int, *, strict: bool = True) → float

Evaluate the PMF (probability mass function) of this distribution.

This is the probability that a random variable distributed by this distribution takes on the given value.

Parameters

value (int) – The value to find the probability of
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The calculated probability

Raises

NonsenseError – If the value doesn’t make sense in the context of the distribution

abstract cdf(value: int, *, strict: bool = True) → float

Evaluate the CDF (cumulative distribution function) of this distribution.

This is the probability that a random variable distributed by this distribution takes on a value less than or equal to the given value.

Parameters

value (int) – The value to find the probability for
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The calculated probability

Raises

NonsenseError – If the value doesn’t make sense in the context of the distribution

class probcalc.distribution_classes.ProbabilityCalculator

Bases: object

This class only exists to give the probability calculator a nice repr.

__init__() → None: Create the object with a non-public _sig_figs attribute.

set_sig_figs(x: int) → None

Set the number of significant figures used in the result of calculations.

Raises: ValueError – If x is not a positive integer

__repr__() → str: Return a very simple repr of the calculator.

__call__(distribution: Distribution, /) → float

Return the probability of a random variable from this distribution taking on a value within its bounds.

This function is just a convenient wrapper around Distribution.calculate().

Note

This function calls Distribution.reset(), but Distribution.calculate() doesn’t on its own. Calling calculate() directly multiple times with different inputs can result in undefined behaviour. Use this wrapper for all interactive use.

This function gets exported as P by __init__.py, which lets the user do things like:

Example

>>> from probcalc import P, B
>>> X = B(20, 0.5)
>>> P(X > 6)
0.9423408508
>>> P(4 < X <= 12)
0.8625030518

Parameters: distribution (Distribution) – The probability distribution that we’re using to calculate the value
Returns float: The calculated probability
Raises: NonsenseError – If the bounds of the distribution are invalid
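
The two figures in the example can be cross-checked without probcalc. The sketch below uses only the standard library and relies on the fact that, with p = 0.5 and 20 trials, every outcome has probability 1 / 2^20. The helper name `count` is an illustrative assumption.

```python
from math import comb

n = 20
total = 2 ** n  # number of equally likely outcomes when p = 0.5


def count(lo, hi):
    """Count the outcomes with between lo and hi successes, inclusive."""
    return sum(comb(n, k) for k in range(lo, hi + 1))


p_gt_6 = count(7, 20) / total          # P(X > 6)
p_4_lt_x_le_12 = count(5, 12) / total  # P(4 < X <= 12)

print(round(p_gt_6, 10))          # 0.9423408508
print(round(p_4_lt_x_le_12, 10))  # 0.8625030518
```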

probcalc.distributions module

This module contains classes for various probability distributions, and a convenience function.

class probcalc.distributions.BinomialDistribution

Bases: Distribution

This is a binomial distribution, used to model multiple independent, binary trials.

__init__(number_of_trials: int, probability: float): Construct a binomial distribution from a given number of trials and probability of success for each trial.

__repr__() → str: Return a nice repr of the distribution.

_check_nonsense(successes: int, *, strict: bool) → Literal[None, -1]

Check if the given number of successes is nonsense.

Parameters

successes (int) – The number of successes to check
strict (bool) – Whether to throw errors or just return -1

Returns

None on success, -1 on fail

Return type

Literal[None, -1]

Raises

NonsenseError – If the number of successes is outside the valid range
NonsenseError – If the number of successes is not an integer
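
The strict flag described above can be pictured as follows. This is a minimal sketch under assumptions: `check_successes` is a hypothetical free function, the valid range is taken to be 0 to the number of trials, and `NonsenseError` is redefined locally so that the example is self-contained.

```python
class NonsenseError(Exception):
    """Local stand-in for probcalc.distribution_classes.NonsenseError."""


def check_successes(successes, number_of_trials, *, strict):
    """Return None if the input is sensible; otherwise raise or return -1."""
    if not isinstance(successes, int):
        problem = "the number of successes must be an integer"
    elif not 0 <= successes <= number_of_trials:
        problem = "the number of successes must be between 0 and the number of trials"
    else:
        return None
    if strict:
        raise NonsenseError(problem)
    return -1  # non-strict mode: signal failure without raising


print(check_successes(3, 10, strict=True))    # None
print(check_successes(11, 10, strict=False))  # -1
```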

pmf(successes: int, *, strict: bool = True) → float

Return the probability that we get a given number of successes.

This method uses the formula \(\binom{n}{r} p^r q^{n - r}\) where \(n\) is the number of trials, \(r\) is the number of successes, \(p\) is the probability of each success, and \(q = 1 - p\).

Parameters

successes (int) – The number of successes to find the probability of
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting exactly this many successes

Raises

NonsenseError – If the number of successes is outside the valid range
NonsenseError – If the number of successes is not an integer
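
As a worked instance of the formula \(\binom{n}{r} p^r q^{n - r}\), the sketch below evaluates it with the standard library. The function name `binomial_pmf` is an assumption for illustration, and input validation is omitted.

```python
from math import comb


def binomial_pmf(n, p, r):
    """Probability of exactly r successes in n trials, each succeeding with probability p."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)


# Ten heads in twenty fair coin flips: C(20, 10) / 2**20.
print(binomial_pmf(20, 0.5, 10))

# The probabilities over every possible number of successes sum to 1,
# up to floating-point rounding.
print(sum(binomial_pmf(20, 0.3, r) for r in range(21)))
```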

cdf(successes: int, *, strict: bool = True) → float

Return the probability that we get less than or equal to the given number of successes.

This method just sums pmf() from 0 to the given number of successes.

Parameters

successes (int) – The number of successes to find the probability for
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting less than or equal to this many successes

Raises

NonsenseError – If the number of successes is outside the valid range
NonsenseError – If the number of successes is not an integer

calculate(*, strict: bool = True) → float

Check for nonsense in an edge case.

This method overrides Distribution.calculate(). See that method for documentation.

class probcalc.distributions.PoissonDistribution

Bases: Distribution

This is a Poisson distribution, used to model independent events that happen at a constant average rate.

__init__(rate: float): Construct a Poisson distribution with the given average rate of event occurrence.

__repr__() → str: Return a nice repr of the distribution.

static _check_nonsense(number: int, *, strict: bool = True) → Literal[None, -1]

Check if the given number of event occurrences is nonsense.

Parameters

number (int) – The number of occurrences to check
strict (bool) – Whether to throw errors or just return -1

Returns

None on success, -1 on fail

Return type

Literal[None, -1]

Raises

NonsenseError – If the number is negative
NonsenseError – If the number is not an integer

pmf(number: int, *, strict: bool = True) → float

Return the probability that we get a given number of occurrences.

This method uses the formula \(\frac{e^{-\lambda} \lambda^x}{x!}\), where \(x\) is the number of occurrences and \(\lambda\) is the rate of the distribution.

Parameters

number (int) – The number of occurrences to find the probability of
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting exactly this many occurrences

Raises

NonsenseError – If the number of occurrences is negative
NonsenseError – If the number of occurrences is not an integer

cdf(number: int, *, strict: bool = True) → float

Return the probability that we get less than or equal to the given number of occurrences.

This method just sums pmf() from 0 to the given number of occurrences.

Parameters

number (int) – The number of occurrences to find the probability for
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting less than or equal to this many occurrences

Raises

NonsenseError – If the number of occurrences is negative
NonsenseError – If the number of occurrences is not an integer
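
The Poisson formula and the summation used for the CDF can be sketched with the standard library. The function names are assumptions for illustration, validation is omitted, and this naive version is only suitable for moderate values of x (converting a very large factorial to a float raises OverflowError).

```python
from math import exp, factorial


def poisson_pmf(rate, x):
    """e^(-rate) * rate^x / x!, the probability of exactly x occurrences."""
    return exp(-rate) * rate**x / factorial(x)


def poisson_cdf(rate, x):
    """Sum of the PMF from 0 to x inclusive."""
    return sum(poisson_pmf(rate, k) for k in range(x + 1))


# With rate 2, P(X <= 1) = e^-2 * (1 + 2) = 3 * e^-2.
print(poisson_cdf(2.0, 1))
print(3 * exp(-2))
```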

class probcalc.distributions.NormalDistribution

Bases: Distribution

A normal distribution with mean and standard deviation.

__init__(mean: float, std_dev: float): Create a normal distribution with given mean and standard deviation.

Note

We use standard deviation, not variance.

__repr__() → str: Return a nice repr of the distribution.

__lt__(other)

Call probcalc.distribution_classes.Distribution.__le__().

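This is because normal distributions don’t distinguish between strict and non-strict inequality: the distribution is continuous, so the probability of any single exact value is 0.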

__ge__(other)

Call probcalc.distribution_classes.Distribution.__gt__().

This is because normal distributions don’t distinguish between strict and non-strict inequality.

pmf(value: float, *, strict: bool = True) → float

Return the probability of getting the given value from this normal distribution.

Parameters

value (float) – The value to find the probability of
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting exactly this value

cdf(value: float, *, strict: bool = True) → float

Return the probability that we get a value less than or equal to the given value.

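This method uses the formula \(\frac{1}{2}\left[1+\text{erf}\left(\frac{x}{\sqrt{2}}\right)\right]\), where \(x\) is the value standardised as \((\text{value} - \mu) / \sigma\), with mean \(\mu\) and standard deviation \(\sigma\).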

Parameters

value (float) – The value to find the probability for
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting less than or equal to this value
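
The error-function formula can be tried out with `math.erf`. The sketch below assumes the value is standardised by the mean and standard deviation before the formula is applied. The function name `normal_cdf` is illustrative only.

```python
from math import erf, sqrt


def normal_cdf(value, mean, std_dev):
    """P(X <= value) for a normal distribution, via the error function."""
    z = (value - mean) / std_dev  # standardise to mean 0, std dev 1
    return 0.5 * (1 + erf(z / sqrt(2)))


print(normal_cdf(10, 10, 2))  # 0.5, since half the mass lies below the mean
print(normal_cdf(1.96, 0, 1))  # roughly 0.975
```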

class probcalc.distributions.GeometricDistribution

Bases: Distribution

This is a geometric distribution, used to model situations where you want to know about the first success.

__init__(probability: float) → None: Construct a geometric distribution with the given probability of success.

__repr__() → str: Return a nice repr of the distribution.

_check_nonsense(trials: int, *, strict: bool) → Literal[None, -1]

Check if the given number of trials is nonsense.

Parameters

trials (int) – The number of trials to check
strict (bool) – Whether to throw errors or just return -1

Returns

None on success, -1 on fail

Return type

Literal[None, -1]

Raises

NonsenseError – If the number of trials is outside the valid range
NonsenseError – If the number of trials is not an integer

pmf(trial: int, *, strict: bool = True) → float

Return the probability that the first success happens on the given trial.

This method uses the formula \(p(1 - p)^{x - 1}\) where \(x\) is the number of the trial, and \(p\) is the probability of success.

Parameters

trial (int) – The number of the trial to find the probability of
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting the first success on this trial

Raises

NonsenseError – If the trial number is outside the valid range
NonsenseError – If the trial number is not an integer

cdf(trials: int, *, strict: bool = True) → float

Return the probability that the first success occurs at or sooner than the given number of trials.

Parameters

trials (int) – The number of trials to find the probability for
strict (bool) – Whether to throw errors for invalid input, or return 0

Returns float

The probability of getting the first success on or before this trial

Raises

NonsenseError – If the number of trials is outside the valid range
NonsenseError – If the number of trials is not an integer
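
The geometric formulas can be checked numerically. In the sketch below, `geometric_pmf` follows \(p(1 - p)^{x - 1}\) as given above, while `geometric_cdf` uses the standard closed form \(1 - (1 - p)^x\). Whether probcalc uses the closed form or sums the PMF is not stated here, so treat that choice as an assumption. Both function names are illustrative.

```python
def geometric_pmf(p, x):
    """Probability that the first success happens on trial x (x >= 1)."""
    return p * (1 - p) ** (x - 1)


def geometric_cdf(p, x):
    """Probability that the first success happens on or before trial x."""
    return 1 - (1 - p) ** x


# Summing the PMF over trials 1..x agrees with the closed form.
print(sum(geometric_pmf(0.5, k) for k in range(1, 4)))  # 0.875
print(geometric_cdf(0.5, 3))                            # 0.875
```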

probcalc.utility module

A simple utility module to just provide helper functions for the maths.

probcalc.utility.factorial(n: int) → int: Return the factorial of n.

probcalc.utility.factorial_fraction(n: int) → float: Return 1 / factorial(n), but without overflowing.
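
One way to compute 1 / n! without building a huge integer is to divide step by step, as sketched below. This is an assumption about how such a helper could work, not a description of probcalc's code. The motivation is that converting a factorial to a float fails for n above 170, because 171! exceeds the largest representable float.

```python
def factorial_fraction(n):
    """Return 1 / n! as a float by repeated division."""
    result = 1.0
    for k in range(2, n + 1):
        result /= k  # stays a float throughout; underflows towards 0.0
    return result


print(factorial_fraction(5))     # close to 1 / 120
print(factorial_fraction(1000))  # 0.0: the true value is too small for a float
```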

probcalc.utility.choose(n: int, r: int) → int

Return the number of ways to choose r items from n elements.

This is often written as \(\binom{n}{r}\) or \(^nC_r\).

Parameters

n (int) – The number of items to choose from
r (int) – The number of items to be chosen

Returns int

The number of ways to choose r from n

Raises

ValueError – If r > n
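
For reference, \(\binom{n}{r} = \frac{n!}{r!\,(n - r)!}\). The sketch below computes it with exact integer arithmetic and mirrors the documented ValueError for r > n. It is an illustrative reimplementation, not probcalc's own code.

```python
from math import factorial


def choose(n, r):
    """Number of ways to choose r items from n, using exact integers."""
    if r > n:
        raise ValueError("cannot choose more items than are available")
    # Floor division is exact here because the quotient is always an integer.
    return factorial(n) // (factorial(r) * factorial(n - r))


print(choose(5, 2))    # 10
print(choose(20, 10))  # 184756
```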

probcalc.utility.round_sig_fig(n: float, sig_fig: int) → float

Round n to a given number of significant figures.

Example

>>> round_sig_fig(0.123456789, 3)
0.123
>>> round_sig_fig(0.123456789, 6)
0.123457
>>> round_sig_fig(0.123456789, 9)
0.123456789
>>> round_sig_fig(0.0000123456789, 3)
1.23e-05

Parameters

n (float) – The number to round
sig_fig (int) – The number of significant figures to round to

Returns float

The rounded number
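
One common way to round to significant figures is to find the position of the leading digit with log10 and pass a matching number of decimal places to the built-in round(). The sketch below takes that approach and reproduces the examples above. The name `round_sig_fig_sketch` marks it as illustrative, and the handling of 0 is an assumption.

```python
from math import floor, log10


def round_sig_fig_sketch(n, sig_fig):
    """Round n to sig_fig significant figures."""
    if n == 0:
        return 0.0  # log10(0) is undefined, so handle zero separately
    magnitude = floor(log10(abs(n)))  # exponent of the leading digit
    return round(n, sig_fig - 1 - magnitude)


print(round_sig_fig_sketch(0.123456789, 3))      # 0.123
print(round_sig_fig_sketch(0.0000123456789, 3))  # 1.23e-05
```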