Our laboratory develops free, open-source software for genomics research. We strive to develop intuitive and well-documented tools and are always open to feedback and user requests. If you find these tools useful, we ask that you cite them in your research and report any issues that you uncover.

bedtools

Bedtools is a flexible toolset for genome arithmetic. It supports a wide range of common data manipulation tasks for genomic intervals (e.g., genes, ChIP-seq peaks) in many commonly used genomics file formats (BED, BAM, GFF, VCF).
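To give a feel for what "genome arithmetic" means, here is a minimal pure-Python sketch of one such operation, intersecting two sets of intervals. It is only an illustration and not how Bedtools is implemented: the `intersect` function, the `Interval` tuple, and the sample coordinates are hypothetical, and the quadratic loop would be far too slow for real datasets.

```python
from typing import List, Tuple

# Hypothetical interval type: (chrom, start, end), 0-based half-open as in BED.
Interval = Tuple[str, int, int]


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b."""
    out = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue  # intervals on different chromosomes never overlap
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # non-empty overlap
                out.append((chrom_a, start, end))
    return out


# Made-up example data: two "genes" and three "peaks".
genes = [("chr1", 100, 200), ("chr1", 500, 600)]
peaks = [("chr1", 150, 250), ("chr2", 100, 200), ("chr1", 590, 700)]

overlaps = intersect(genes, peaks)
print(overlaps)  # [('chr1', 150, 200), ('chr1', 590, 600)]
```

Each output interval is the region shared by one gene and one peak on the same chromosome.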

pybedtools

Pybedtools is a Python wrapper (and much more) for Bedtools and extends these "genome algebra" programs by offering feature-level manipulations from within Python. Pybedtools is maintained by Ryan Dale.

gemini

Gemini is a powerful framework for exploring genetic variation in the context of the wealth of existing annotations that are available for the human genome. By integrating diverse annotations with genetic variation in the now-standard VCF format, researchers have a single system for prioritizing variants in studies of human disease.

hydra

Hydra detects all classes of structural variation (SV) using paired-end sequence alignments from modern DNA sequencing technologies. Unlike other existing tools, Hydra detects SVs arising from repetitive DNA and can make improved SV predictions by incorporating sequence alignments from many (hundreds or more) samples.

lumpy

Lumpy is a new probabilistic framework that we have developed to integrate multiple structural variation signals such as discordant paired-end alignments and split-read alignments. While it is clear that integrating all SV signals is important for sensitive discovery, most existing tools (including our own Hydra) only exploit one signal. Lumpy integrates multiple signals in order to improve sensitivity and breakpoint resolution. This is especially important for cancer genome analysis, where tumor heterogeneity causes potentially important rearrangements to occur with fewer supporting alignments in the sampled DNA.

scurgen

Scurgen is a new visualization tool we are developing for exploring genomic data with space-filling curves. Nothing is better at pattern detection than the human eye. Scurgen attempts to present genomic data in such a way that patterns are easily detected among multiple datasets. It works nicely with BAM, BED, VCF, BEDGRAPH, and GFF formats.
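For readers unfamiliar with space-filling curves, the sketch below maps positions along a one-dimensional sequence onto a two-dimensional grid using a Hilbert curve, so that positions close together on the sequence stay close together on the grid. The Hilbert curve is used here as one well-known example and is an assumption; the description above does not say which curve or mapping Scurgen uses. The function name `d2xy` and the grid size are likewise illustrative.

```python
def d2xy(n: int, d: int) -> tuple:
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the current sub-square.
                x = s - 1 - x
                y = s - 1 - y
            # Swap x and y to rotate the sub-square.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Hypothetical use: 16 consecutive genomic bins laid out on a 4 x 4 grid.
coords = [d2xy(4, d) for d in range(16)]
print(coords[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Consecutive bins always land in neighboring grid cells, which is the property that lets local patterns along the genome show up as visible patches in the image.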