distributions

2022-11-07

Random numbers and distributions

Upstream URL

github.com/Lisp-Stat/distributions

Author

Steven Nunez <steve@symbolics.tech>

License

MS-PL

README

Distributions

The Distributions package provides a collection of probability distributions and related functions.

Table of Contents

  1. About the Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Resources
  6. Contributing
  7. License
  8. Contact

About the Project

DISTRIBUTIONS is a library for (1) generating random draws from various commonly used distributions, and (2) calculating statistical functions, such as density, distribution and quantiles for these distributions.

In the implementation and the interface, our primary considerations are:

  1. Correctness. Above everything, all calculations should be correct. Correctness shall not be sacrificed for speed or implementational simplicity. Consequently, everything should be unit-tested all the time.

  2. Simple and unified interface. Random variables are instances which can be used for calculations and random draws. The naming convention for building blocks is (draw|cdf|pdf|quantile|...)-(standard-)?distribution-name(possible-suffix)?, eg pdf-standard-normal or draw-standard-gamma1.

  3. Speed and exposed building blocks on demand. You can obtain the generator function for random draws as a closure using the accessor "generator" from an rv. In addition, the package exports independent building blocks such as draw-standard-normal, which can be inlined into your code if necessary.
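
The two access paths described in item 3 can be sketched as follows. This is a minimal illustration, not a definitive usage guide: it assumes the system is loaded through Quicklisp, that the closure returned by generator can be called with no arguments, and that draw-standard-normal can be called with no arguments. The variable names *rv* and *gen* are hypothetical.

```lisp
;; Assumes Quicklisp is installed and the DISTRIBUTIONS system is available.
(ql:quickload :distributions)

;; A random variable instance (standard normal when called without arguments).
(defparameter *rv* (distributions:r-normal))

;; Obtain the generator closure from the rv.
;; Assumption: the closure takes no arguments and returns one draw per call.
(defparameter *gen* (distributions:generator *rv*))

;; Collect three draws by calling the closure repeatedly.
(loop repeat 3 collect (funcall *gen*))

;; The exported building block draws directly, without an rv instance.
(distributions:draw-standard-normal)
```

Holding on to the closure avoids repeated generic dispatch on the instance inside a tight loop, which is the situation the "speed on demand" item is aimed at.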

Implementation note: Subclasses are allowed to calculate intermediate values (eg to speed up computation) any time, eg right after the initialization of the instance, or on demand. The consequences of changing the slots of RV classes are UNDEFINED, but probably quite nasty. Don't do it. Note: lazy slots are currently not used, and will be reintroduced in the future after profiling/benchmarking.

Getting Started

To get a local copy up and running, follow these steps:

Prerequisites

An ANSI Common Lisp implementation. Developed and tested with SBCL.

Installation

Lisp-Stat is composed of several systems that are designed to be independently useful. You can, for example, use distributions on its own in any project that needs to manipulate statistical distributions.

Getting the source

To make the system accessible to ASDF (a build facility, similar to make in the C world), clone the repository in a directory ASDF knows about. By default the common-lisp directory in your home directory is known. Create this if it doesn't already exist and then:

  1. Clone the repository
    cd ~/common-lisp && \
    git clone https://github.com/Lisp-Stat/distributions.git
  2. Reset the ASDF source-registry to find the new system (from the REPL)
    (asdf:clear-source-registry)
  3. Load the system
    (ql:quickload :distributions)

This will download all of the dependencies for you.

Getting dependencies

To get the third-party systems that Lisp-Stat depends on, you can use a dependency manager such as Quicklisp or CLPM. Once installed, get the dependencies with either of:

(clpm-client:sync :sources "clpi") ;sources may vary
(ql:quickload :distributions)

You need to do this only once. After obtaining the dependencies, you can load the system with ASDF: (asdf:load-system :distributions). If you have installed the slime ASDF extensions, you can invoke this with a comma (',') from the slime REPL in emacs.

Usage

Create a standard normal distribution:

(defparameter *rv-normal* (distributions:r-normal))

and take a few draws from it:

LS-USER> (distributions:draw *rv-normal*)
1.037208743704438d0
LS-USER> (distributions:draw *rv-normal*)
-0.2847287516046668d0
LS-USER> (distributions:draw *rv-normal*)
-0.6793466378900889d0
LS-USER> (distributions:draw *rv-normal*)
1.5040711441992598d0
LS-USER>
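
Beyond draws, the same instance can be passed to the statistical functions mentioned in About the Project. The following is a minimal sketch, assuming that the generic functions pdf, cdf, and quantile take the random variable as the first argument and the point (or probability) as the second. Return values are not shown, since their printed form depends on the implementation.

```lisp
;; Assumes Quicklisp is installed and the DISTRIBUTIONS system is available.
(ql:quickload :distributions)

(defparameter *rv-normal* (distributions:r-normal))

;; Density at 0; mathematically 1/sqrt(2*pi) for a standard normal.
(distributions:pdf *rv-normal* 0d0)

;; Distribution function at 0; mathematically 0.5 by symmetry.
(distributions:cdf *rv-normal* 0d0)

;; Quantile at probability 0.975; mathematically about 1.96.
(distributions:quantile *rv-normal* 0.975d0)
```

The Coverage table lists which of these functions are implemented for each distribution, so the same calls will not work for every random variable.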

For more examples, please refer to the Documentation.

Roadmap

  1. Sketch the interface.
  2. Extend basic functionality (see Coverage below)
  3. Keep extending the library based on user demand.
  4. Optimize things on demand, see where the bottlenecks are.

Specific planned improvements, roughly in order of priority:

  • more serious testing. I like the approach in Cook (2006): we should transform empirical quantiles to z-statistics and calculate the p-value using chi-square tests

  • (mm rv x) and similar methods for multivariate normal (and maybe T)

See the open issues for a list of proposed features (and known issues).

Coverage

Distribution    PDF  CDF  Quantile  Draw  Fit
Bernoulli       N/A  N/A  N/A       Yes   No
Beta            Yes  Yes  Yes       Yes   Yes
Binomial        No   No   No        Yes   No
Chi-Square      No   No   No        No    No
Discrete        Yes  Yes  No        Yes   No
Exponential     Yes  Yes  Yes       Yes   No
Gamma           Yes  Yes  Yes       Yes   No
Geometric       No   No   No        Yes   No
Inverse-Gamma   Yes  No   No        Yes   No
Log-Normal      Yes  Yes  Yes       Yes   No
Normal          Yes  Yes  Yes       Yes   No
Poisson         No   No   No        Yes   No
Rayleigh        No   Yes  No        Yes   No
Student t       No   No   No        Yes   No
Uniform         Yes  Yes  Yes       Yes   No

Resources

This system is part of the Lisp-Stat project; that should be your first stop for information. Also see the resources and community page for more information.

Contributing

Always try to implement state-of-the-art generation and calculation methods. If you need something, read up on the literature: the field has developed a lot in recent decades, and most older books present obsolete methods. Good starting points are Gentle (2005) and Press et al (2007), though you should use the latter with care. Don't copy its algorithms without reading a few recent articles, because they are not always the best ones (the authors admit this, but claim that some algorithms are there for pedagogical purposes).

Always document the references in the docstring, and include the full citation in doc/references.bib (BibTeX format).

Do at least basic optimization with declarations (eg until SBCL no longer gives any notes; notes about return values are OK). Benchmarks are always welcome, and should be documented.

Document doubts and suggestions for improvements. Use !! and ??; more marks mean higher priority.

Please see CONTRIBUTING.md for details on the code of conduct, and the process for submitting pull requests.

License

Distributed under the MS-PL License. See LICENSE for more information.

Contact

Project Link: https://github.com/lisp-stat/distributions

Dependencies (9)

  • alexandria
  • anaphora
  • array-operations
  • cephes.cl
  • fiveam
  • float-features
  • let-plus
  • numerical-utilities
  • special-functions

Dependents (2)

  • GitHub
  • Quicklisp