# Cl-Online-Learning

A collection of online learning algorithms for linear classification, written in pure Common Lisp.

## Implemented algorithms

### Binary classifier

- Perceptron
- AROW (Crammer, Koby, Alex Kulesza, and Mark Dredze. "Adaptive regularization of weight vectors." Advances in neural information processing systems. 2009.)
- SCW-I (Soft Confidence Weighted) (Wang, Jialei, Peilin Zhao, and Steven C. Hoi. "Exact Soft Confidence-Weighted Learning." Proceedings of the 29th International Conference on Machine Learning (ICML-12). 2012.)
- Logistic Regression with SGD or ADAM optimizer (Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." ICLR 2015)

### Multiclass classifier

- one-vs-rest (requires K binary classifiers)
- one-vs-one (requires K*(K-1)/2 binary classifiers)
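
As a rough illustration of how these counts grow with the number of classes K, here is a minimal sketch. The function names `one-vs-rest-count` and `one-vs-one-count` are made up for this example and are not part of cl-online-learning.

```lisp
;; Hypothetical helpers, not part of the library: the number of binary
;; classifiers each scheme has to train for K classes.
(defun one-vs-rest-count (k)
  "One classifier per class: class i versus all the others."
  k)

(defun one-vs-one-count (k)
  "One classifier per unordered pair of classes."
  (/ (* k (1- k)) 2))

;; Print both counts for a few values of K.
(loop for k in '(3 10 20)
      do (format t "K=~2d  one-vs-rest: ~3d  one-vs-one: ~3d~%"
                 k (one-vs-rest-count k) (one-vs-one-count k)))
```

For 3 classes both schemes need 3 classifiers, but for 20 classes one-vs-one needs 190 while one-vs-rest needs 20.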

## Installation

cl-online-learning is available from Quicklisp.

```
(ql:quickload :cl-online-learning)
```

To install from the GitHub repository, clone it into Quicklisp's local-projects directory:

```
cd ~/quicklisp/local-projects/
git clone https://github.com/masatoi/cl-online-learning.git
```

To install with Roswell:

```
ros install masatoi/cl-online-learning
```

## Usage

### Prepare dataset

A data point is a pair of a class label (+1 or -1) and an input vector.

A dataset is represented as a sequence of data points. The READ-DATA function builds a dataset from a file in the sparse format used by LIBSVM (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/). It takes the number of features in the dataset as its second argument.

```
(defparameter a1a-dim 123)
(defparameter a1a
  (clol.utils:read-data
   (merge-pathnames #P"t/dataset/a1a" (asdf:system-source-directory :cl-online-learning))
   a1a-dim))
```
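
To show what the LIBSVM sparse format looks like, here is a minimal sketch that turns one line such as `-1 3:1 5:0.5` into a pair of a label and a dense input vector. All function names below are hypothetical and this is not how READ-DATA is implemented; in particular, the label is kept as an integer here, which may differ from the library's own representation.

```lisp
;; Hypothetical sketch, not the library's READ-DATA.
(defun split-on-spaces (string)
  "Split STRING on spaces and return the non-empty tokens."
  (let ((tokens '())
        (start 0)
        (len (length string)))
    (loop while (< start len)
          do (let ((end (or (position #\Space string :start start) len)))
               (when (> end start)
                 (push (subseq string start end) tokens))
               (setf start (1+ end))))
    (nreverse tokens)))

(defun parse-number (token)
  "Read TOKEN as a real number and return it as a double-float."
  (let ((*read-eval* nil)
        (*read-default-float-format* 'double-float))
    (let ((number (read-from-string token)))
      (check-type number real)
      (coerce number 'double-float))))

(defun parse-libsvm-line (line dim)
  "Return (label . dense-vector) for one line of LIBSVM sparse format."
  (let* ((tokens (split-on-spaces line))
         (label (parse-integer (first tokens)))
         (input (make-array dim :element-type 'double-float
                                :initial-element 0d0)))
    (dolist (token (rest tokens))
      (let* ((colon (position #\: token))
             (index (parse-integer token :end colon))
             (value (parse-number (subseq token (1+ colon)))))
        ;; LIBSVM feature indices start at 1; Lisp arrays start at 0.
        (setf (aref input (1- index)) value)))
    (cons label input)))

;; Features 3 and 5 are set; every other element stays 0.
(parse-libsvm-line "-1 3:1 5:0.5" 6)
```

The dimension has to be given up front because a line lists only its non-zero features, so the largest index on a single line says nothing about the size of the whole feature space.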

### Define learner

A learner object is a struct, so it is created with the constructor for its type. The constructor takes the input dimension followed by the parameters of the algorithm.

`(defparameter arow-learner (clol:make-arow a1a-dim 10d0))`

### Update and Train

To update the model destructively with one data point, use an update function corresponding to the model type.

```
(clol:arow-update arow-learner
                  (cdar a1a)  ; input
                  (caar a1a)) ; label
```
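
To give an idea of what a destructive single-point update does, here is a minimal sketch using the perceptron rule, the simplest of the algorithms listed above. The `toy-` names are hypothetical and this is not the library's implementation; AROW and SCW-I use more elaborate update rules.

```lisp
;; Hypothetical toy learner, not the library's perceptron.
(defstruct toy-perceptron
  weights       ; double-float array, one weight per feature
  (bias 0d0))

(defun new-toy-perceptron (dim)
  "Create a toy perceptron with DIM zero weights."
  (make-toy-perceptron
   :weights (make-array dim :element-type 'double-float
                            :initial-element 0d0)))

(defun toy-score (learner input)
  "Signed score: dot product of the weights and INPUT, plus the bias."
  (+ (toy-perceptron-bias learner)
     (loop for w across (toy-perceptron-weights learner)
           for x across input
           sum (* w x))))

(defun toy-predict (learner input)
  "Return 1d0 for a positive score and -1d0 otherwise."
  (if (plusp (toy-score learner input)) 1d0 -1d0))

(defun toy-update (learner input label)
  "Perceptron rule: on a mistake, move the weights toward LABEL * INPUT.
Modifies LEARNER in place and returns it."
  (when (<= (* label (toy-score learner input)) 0)
    (let ((weights (toy-perceptron-weights learner)))
      (dotimes (i (length weights))
        (incf (aref weights i) (* label (aref input i)))))
    (incf (toy-perceptron-bias learner) label))
  learner)
```

As with the update call above, the learner is changed in place, so the same object can be fed one data point after another as the data arrives.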

The TRAIN function trains the learner on an entire dataset.

`(clol:train arow-learner a1a)`

It may be necessary to call this function several times before learning converges. A convergence test has not been implemented yet.
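
Until a convergence test exists, one possible workaround is to repeat training until some score stops changing. The sketch below is only an assumption about how that could be done: `train-until-stable` and its keyword parameters are hypothetical names, and the training and scoring steps are passed in as closures so the helper does not depend on any particular learner.

```lisp
;; Hypothetical helper, not part of the library.
(defun train-until-stable (train-fn score-fn
                           &key (max-epochs 50) (tolerance 1d-4))
  "Call TRAIN-FN once per epoch until the value of SCORE-FN changes by less
than TOLERANCE between epochs, or until MAX-EPOCHS is reached.
Returns the number of epochs run and the last score."
  (let ((previous (funcall score-fn)))
    (loop for epoch from 1 to max-epochs
          do (funcall train-fn)
             (let ((current (funcall score-fn)))
               (when (< (abs (- current previous)) tolerance)
                 (return (values epoch current)))
               (setf previous current))
          finally (return (values max-epochs previous)))))

;; Possible use with the learner and dataset defined above, scoring by the
;; fraction of correctly classified training points:
;;
;; (train-until-stable
;;  (lambda () (clol:train arow-learner a1a))
;;  (lambda ()
;;    (/ (count-if (lambda (datum)
;;                   (= (car datum)
;;                      (clol:arow-predict arow-learner (cdr datum))))
;;                 a1a)
;;       (length a1a))))
```

Scoring on the training data is the simplest choice but can hide overfitting; a held-out set is a safer score if one is available.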

### Predict and Test

```
(clol:arow-predict arow-learner (cdar a1a))
; => -1.0d0
(clol:test arow-learner a1a)
; Accuracy: 84.85981%, Correct: 1362, Total: 1605
```
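
The accuracy line reports the share of data points whose predicted label matches the true one (1362 of 1605 above). A minimal sketch of that computation, assuming data points are `(label . input)` pairs with labels of +1 or -1, could look like this. The name `accuracy` is hypothetical and this is not the library's TEST function.

```lisp
;; Hypothetical helper, not the library's TEST.
(defun accuracy (predict-fn dataset)
  "Return three values: the percentage of (label . input) pairs in DATASET
whose label has the same sign as (funcall PREDICT-FN input), the number of
correct predictions, and the total number of data points."
  (let ((correct (count-if (lambda (datum)
                             (plusp (* (car datum)
                                       (funcall predict-fn (cdr datum)))))
                           dataset))
        (total (length dataset)))
    (values (* 100d0 (/ correct total)) correct total)))

;; Possible use with the learner defined above:
;; (accuracy (lambda (input) (clol:arow-predict arow-learner input)) a1a)
```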

### Multiclass classification

For multiclass data, the label of a data point is an integer giving the index of its class. The READ-DATA function with the MULTICLASS-P keyword option builds such a dataset.

```
(defparameter iris-dim 4)
; A dataset in which the same label appears consecutively needs to be shuffled
(defparameter iris
  (clol.utils:shuffle-vector
   (coerce (clol.utils:read-data
            (merge-pathnames #P"t/dataset/iris.scale"
                             (asdf:system-source-directory :cl-online-learning))
            iris-dim :multiclass-p t)
           'simple-vector)))
(defparameter iris-train (subseq iris 0 100))
(defparameter iris-test (subseq iris 100))
```

ONE-VS-REST and ONE-VS-ONE perform multiclass classification by combining multiple binary classifiers. In many cases ONE-VS-ONE is more accurate, but it requires more computational resources as the number of classes increases.

```
;; Define model
(defparameter arow-1vs1
  (clol:make-one-vs-one iris-dim      ; Input data dimension
                        3             ; Number of classes
                        'arow 0.1d0)) ; Binary classifier type and its parameters
;; Train and test model
(clol:train arow-1vs1 iris-train)
(clol:test arow-1vs1 iris-test)
; Accuracy: 98.0%, Correct: 49, Total: 50
```
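
How the outputs of the binary classifiers are combined into one class index can be sketched as follows. This shows the usual textbook schemes (highest score for one-vs-rest, majority vote for one-vs-one) under the assumption that each binary classifier returns a signed score. The function names are hypothetical, and the library's own combination and tie-breaking rules may differ.

```lisp
;; Hypothetical sketch of the two combination schemes.
(defun argmax (vector)
  "Index of the largest element of VECTOR (the first one on ties)."
  (let ((best 0))
    (loop for i from 1 below (length vector)
          when (> (aref vector i) (aref vector best))
            do (setf best i))
    best))

(defun one-vs-rest-predict (scores)
  "SCORES holds one signed score per class (class i versus the rest).
Return the index of the class with the highest score."
  (argmax scores))

(defun one-vs-one-predict (n-classes pair-score)
  "PAIR-SCORE is a function of two class indices I < J that returns a
positive score when class I wins and a non-positive score when class J wins.
Return the index of the class that wins the most pairwise contests."
  (let ((votes (make-array n-classes :initial-element 0)))
    (loop for i from 0 below n-classes
          do (loop for j from (1+ i) below n-classes
                   do (if (plusp (funcall pair-score i j))
                          (incf (aref votes i))
                          (incf (aref votes j)))))
    (argmax votes)))
```

One-vs-one evaluates every pair of classes at prediction time, which is where its extra cost for many classes comes from.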

### Sparse data

For sparse data (where most elements are 0), a data point is a pair of a class label and an instance of the SPARSE-VECTOR struct, and a learner with the SPARSE- prefix is used. The READ-DATA function with the SPARSE-P keyword option builds such a dataset.

For example, the news20.binary dataset has too many features (1,355,191) to handle with the dense learners. With the sparse version, however, the learner can be trained with practical computational resources.

```
(defparameter news20.binary-dim 1355191)
(defparameter news20.binary (clol.utils:read-data "/path/to/news20.binary" news20.binary-dim :sparse-p t))
(defparameter news20.binary.arow (clol:make-sparse-arow news20.binary-dim 10d0))
(time (loop repeat 20 do (clol:train news20.binary.arow news20.binary)))
;; Evaluation took:
;; 1.527 seconds of real time
;; 1.526852 seconds of total run time (1.526852 user, 0.000000 system)
;; 100.00% CPU
;; 5,176,917,149 processor cycles
;; 11,436,032 bytes consed
(clol:test news20.binary.arow news20.binary)
;; Accuracy: 99.74495%, Correct: 19945, Total: 19996
```
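
The reason a sparse representation helps can be sketched with a dot product that visits only the non-zero elements. The struct below is a hypothetical stand-in named `toy-sparse-vector`; its slot names are assumptions for this example and need not match the library's SPARSE-VECTOR struct.

```lisp
;; Hypothetical stand-in for a sparse vector, not the library's struct.
(defstruct toy-sparse-vector
  indices   ; positions of the non-zero elements (0-based)
  vals)     ; the corresponding non-zero values

(defun sparse-dot (weights sparse)
  "Dot product of a dense WEIGHTS array with a sparse vector.
The cost depends on the number of non-zero elements, not on the dimension."
  (loop for i across (toy-sparse-vector-indices sparse)
        for v across (toy-sparse-vector-vals sparse)
        sum (* (aref weights i) v)))

;; A 10-dimensional input with only two non-zero elements.
(let ((weights (make-array 10 :element-type 'double-float
                              :initial-element 0.5d0))
      (input (make-toy-sparse-vector :indices (vector 2 7)
                                     :vals (vector 1d0 4d0))))
  (sparse-dot weights input))
```

With over a million features and only a small fraction of them set per document, touching only the stored elements is what keeps each update cheap.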

The sparse learners can also be used as the binary classifiers in multiclass classification.

```
(defparameter news20-dim 62060)
(defparameter news20-train (clol.utils:read-data "/path/to/news20.scale" news20-dim :sparse-p t :multiclass-p t))
(defparameter news20-test (clol.utils:read-data "/path/to/news20.t.scale" news20-dim :sparse-p t :multiclass-p t))
(defparameter news20-arow (clol:make-one-vs-rest news20-dim 20 'sparse-arow 10d0))
(loop repeat 12 do (clol:train news20-arow news20-train))
(clol:test news20-arow news20-test)
;; Accuracy: 86.90208%, Correct: 3470, Total: 3993
```

## Author

Satoshi Imai (satoshi.imai@gmail.com)

## License

This software is released under the MIT License, see LICENSE.txt.
