cl-sam

2015-06-08

Toolkit for manipulation of DNA sequence alignment data stored in the Sequence Alignment/Map (SAM) format.

Upstream URL

github.com/keithj/cl-sam

Author

Keith James

License

GPL v3

README

Introduction

cl-sam is a Common Lisp toolkit for manipulation of DNA sequence alignment data stored in the Sequence Alignment/Map (SAM) format (http://samtools.sourceforge.net). cl-sam is a de novo implementation of the SAM spec in Common Lisp, using zlib via a C foreign function interface.

While cl-sam is slower than the C and Java implementations for some (but not all) operations, performance is good enough for real work. Moreover, it offers the advantage of rapid development at all levels in the SAM/BAM stack (zlib, bgzf, BAM and SAM), which scripting language bindings of samtools do not afford.

cl-sam is able to create BAM records de novo and may be used to create a BAM file from scratch or edit a BAM stream. SAM/BAM header manipulation functions are included so that new headers may be created and headers from different BAM files may be systematically merged without redundancy or avoidable conflicts. Where conflicts are unavoidable, error conditions are raised to alert the user.

The operations supported in this version are:

           SAM    BAM
  Read     No     Yes
  Write    Yes    Yes

Sorting operations are available using an external merge sort that is extensible by user-supplied sorting predicates.

Installation

cl-sam uses ASDF for system definition. Copy or symlink cl-sam.asd (and optionally cl-sam-test.asd) to a directory listed in your asdf:*central-registry* and load cl-sam with the asdf:load-system function:

  (asdf:load-system :cl-sam)

Tests

To run the unit and regression tests you need to have LIFT installed. Run the tests with the asdf:operate function:

  (asdf:operate 'asdf:test-op :cl-sam)

or with the equivalent deoxybyte-systems:test-system function:

  (deoxybyte-systems:test-system :cl-sam)

Documentation

See the Lisp docstrings, particularly the package docstrings, for an overview. Provided that CLDOC is installed, HTML documentation may be generated at the REPL with the command:

  (dxs:document-system :cl-sam)

The components of cl-sam are divided by file as follows:

  bam.lisp          High-level BAM record data reading functions.
  bam-reader.lisp   High-level BAM file reading functions.
  bam-writer.lisp   High-level BAM file writing functions.
  bam-index.lisp    Structures representing a samtools BAM index.
  sam.lisp          High-level SAM record data reading functions.
  sam-reader.lisp   High-level SAM file reading functions.
  sam-writer.lisp   High-level SAM file writing functions.
  bgzf-reader.lisp  Low-level BGZF seek and read functions.
  bgzf-stream.lisp  Low-level BGZF Gray stream implementation. An
                    alternative to the bgzf-reader functions, but
                    roughly 2x slower.

The test suite contains examples of use.

Dependencies

  deoxybyte-systems    git://github.com/keithj/deoxybyte-systems.git
  deoxybyte-utilities  git://github.com/keithj/deoxybyte-utilities.git
  deoxybyte-io         git://github.com/keithj/deoxybyte-io.git
  deoxybyte-gzip       git://github.com/keithj/deoxybyte-gzip.git
  CFFI                 http://common-lisp.net/project/cffi/

Optional dependencies

  LIFT                 http://common-lisp.net/project/lift/
  CLDOC                http://common-lisp.net/project/cldoc/
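
As a rough sketch of the installation step described in the README, the registry can also be set up from the REPL or a Lisp init file instead of copying or symlinking the .asd files. The directory /path/to/cl-sam/ is a hypothetical checkout location, not something the README specifies, and the dependencies listed above are assumed to be findable by ASDF in the same way.

```lisp
;; Make sure ASDF itself is available (most implementations bundle it).
(require :asdf)

;; Hypothetical location of the cl-sam checkout; adjust to your system.
;; The trailing slash matters: ASDF expects a directory pathname here.
(pushnew #p"/path/to/cl-sam/" asdf:*central-registry* :test #'equal)

;; Load the system, as in the README.
(asdf:load-system :cl-sam)
```

Using pushnew with an equal test keeps the registry free of duplicate entries if the form is evaluated more than once, for example from an init file that is reloaded.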

Dependencies (5)

  • deoxybyte-gzip
  • deoxybyte-io
  • deoxybyte-systems
  • deoxybyte-unix
  • lift

Dependents (0)
