cl-gene-searcher
2011-10-01
A simple interface to a SQLite database for querying information for genes, and DGV Tracks.
Upstream URL
Author
David Thole
Maintainer
David Thole
License
LGPL
cl-gene-searcher is a very simple library to allow searching for information available through UCSC, in a local form. The current database utilized is sqlite, with eventual support for utilizing the API interface for UCSC.
Installation
1) git clone the repository
2) Symlink the asd file to your central asdf systems directory
3) Download hg18.sqlite (hg19 coming soon) - Location: (still working on this)
Sample Usage:
(require :cl-gene-searcher)
You can create your own package that uses cl-gene-searcher, or just (in-package :cl-gene-searcher)
You will need to point the library to your hg18 database:
(setDatabasePath "/Users/dthole/programming/cl-gene-searcher/data/hg18.sqlite")
To search by a gene name (or multiple gene names):
CL-GENE-SEARCHER> (loop for x in (query-gene-by-name "RPS6KA2" "FAM20C") do (format t "name: ~a~%" (name x)))
name: RPS6KA2
name: RPS6KA2
name: FAM20C
NIL
To look up genes based off some genomic location:
CL-GENE-SEARCHER> (query-gene-by-range :chr "chr1" :start 1 :stop 5000)
(#<GENES {100381DFE1}> #<GENES {100381E451}> #<GENES {100381E8C1}>
#<GENES {100381ED31}> #<GENES {100381F1A1}> #<GENES {100381F611}>
#<GENES {100381FA81}> #<GENES {100381FEF1}> #<GENES {1003820361}>
#<GENES {10038207D1}> #<GENES {1003820C41}> #<GENES {10038210B1}>
#<GENES {1003821521}> #<GENES {1003821991}> #<GENES {1003821E01}>)
And to get their useful information:
CL-GENE-SEARCHER> (loop for x in (query-gene-by-range :chr "chr1" :start 1 :stop 5000) do (format t "chr: ~a | start: ~a | end: ~a | name: ~a~%" (chr x) (start_region x) (stop_region x) (name x)))
chr: chr1 | start: 1115 | end: 4121 | name: uc001aaa.2
chr: chr1 | start: 1115 | end: 4272 | name: uc009vip.1
chr: chr1 | start: 4268 | end: 6628 | name: uc009vis.1
chr: chr1 | start: 4268 | end: 9622 | name: uc009vit.1
chr: chr1 | start: 4268 | end: 9622 | name: uc009viu.1
chr: chr1 | start: 4268 | end: 9622 | name: uc001aae.2
chr: chr1 | start: 4268 | end: 14764 | name: uc001aab.2
chr: chr1 | start: 4268 | end: 19221 | name: uc009viq.1
chr: chr1 | start: 4268 | end: 19221 | name: uc009vir.1
chr: chr1 | start: 4268 | end: 19221 | name: uc001aac.2
chr: chr1 | start: 4269 | end: 19221 | name: uc009viv.1
chr: chr1 | start: 4269 | end: 19221 | name: uc009viw.1
chr: chr1 | start: 4832 | end: 19672 | name: uc001aaf.1
chr: chr1 | start: 4224 | end: 7502 | name: LOC100288778
chr: chr1 | start: 4224 | end: 19233 | name: WASH7P
NIL
If you wanted just the refFlat version of this information:
CL-GENE-SEARCHER> (loop for x in (query-refflat-gene-by-range :chr "chr1" :start 1 :stop 5000) do (format t "chr: ~a | start: ~a | end: ~a | name: ~a~%" (chr x) (start_region x) (stop_region x) (name x)))
chr: chr1 | start: 4224 | end: 7502 | name: LOC100288778
chr: chr1 | start: 4224 | end: 19233 | name: WASH7P
NIL