We combine computational and genomic techniques to explore genome biology and the genetic basis of traits.

News

We are grateful to have been awarded funding from the Chan-Zuckerberg Initiative to further develop our bedtools and Go Get Data (GGD) projects as part of their Essential Open Source Software for Science program. Details about our efforts can be found here.
Tom Sasani's manuscript on germline mutation in large human pedigrees is out in eLife. Using sequencing data from 33 large, three-generation CEPH families, Tom found significant variability in parental age effects on de novo mutation (DNM) counts across families. He also discovered that nearly 10% of DNMs that would typically be attributed to the germline are, in fact, post-zygotic. Read the manuscript and check out our open science repository of the code and data used in this study. Lastly, Tom gives an insightful interview in the eLife Podcast (min 7:11) if you are interested in learning more.
Our manuscript identifying constrained coding regions in the human genome was just published in Nature Genetics, and it made the cover! We encourage you to read our blog post describing the motivation and key results. In addition, there is a nice writeup of our work and a short video below providing a high level overview.
Our research into the genetic basis of rare human diseases is featured in a recent Scope Radio interview: "Backed by Computer Power, Scientists Are Finding the Causes of Mysterious Diseases"
We are always looking to add motivated, talented graduate students and postdoctoral scientists to our team. If interested, please email Aaron Quinlan articulating your research experience and career goals.

Research

The research in our laboratory is focused on the application of computational methods to develop a deeper understanding of genetic variation in diverse contexts. Modern experimental methods allow us to examine entire genomes with exquisite detail. Perhaps not surprisingly, staggering complexity is revealed as we look more closely at how genetic variation (both inherited and somatic) contributes to phenotypes. Modern genomic technologies necessitate efficient approaches for exploring, manipulating and comparing large genomic datasets. We develop such methods so that we and others may apply them to experiments investigating the impact of genetic variation on human disease, evolution, and somatic differentiation. Genome research is difficult - we strive to develop computational means that make it easier.

  • Rare disease genetics

    We develop and apply new software for identifying causal genetic variants in studies of rare familial disease. The University of Utah has a long history of expertise in this area and we work closely with many clinical collaborators to solve rare disease. Our GEMINI software is central to these efforts, and our laboratory collaborates with other members of the Utah Center for Genetic Discovery to study familial disease among the large pedigrees in the Utah Genome Project.

  • Structural variation

    Human chromosomes harbor hundreds of structural differences including deletions, insertions, duplications, inversions, and translocations. Collectively, these differences are known as "structural variation" (or, "SV"). Any two humans differ by thousands of structural variants which vary greatly in size and phenotypic consequence. However, we are just beginning to understand the contribution of SV to evolution, development, and complex disease. Our laboratory continues to develop new methods such as LUMPY for detecting and understanding structural variation using modern DNA sequencing techniques.

  • Cancer genomics and genome evolution

    Massively parallel DNA sequencing has yielded detailed maps of clonal variation in human cancer, through an inference of clonal substructure by analysis of variant allele frequencies in bulk tumor cell populations and direct sequencing of single cells. Dynamic changes in clonal structure over time and under the selective pressure of treatment have been extensively studied in hematologic malignancies, but are less well characterized in solid cancers. Our understanding of the dynamics of clonal change and its role in therapeutic response and the emergence of resistance is in its infancy. However, deeper insight is accessible via significant advances in sequencing and new algorithms. We are developing new methods to identify genomic changes that are responsible for clonal evolution, chemoresistance, and relapse.

  • Algorithm and Software Development

    Broadly speaking, the research in our laboratory marries genetics with genomics technologies, computer science, and machine learning techniques to develop new strategies for gaining insight into genome biology. We try to tackle challenging problems with practical importance to understanding genome variation in the context of human disease. We actively maintain a broad range of widely used tools for genome research including: BEDTOOLS, GEMINI, LUMPY, VCFANNO, PEDDY, and GQT.

Software

We strive to develop innovative, well-tested, and well-documented tools for genome research.

somalier

Extract informative sites, evaluate relatedness, and perform quality-control on BAM, CRAM, BCF, VCF, and GVCF. somalier makes checking any number of samples for identity easy directly from the alignments.


slivar

slivar is a set of command-line tools that enables rapid querying and filtering of VCF files. It facilitates operations on trios and groups and allows arbitrary expressions using simple javascript.


ggd

Search and install genomic data packages. Build and check new ggd data packages. ggd provides easy access to processed genomic data. It removes the difficulties and complexities of finding and processing the data sets and annotations germane to your experiments and/or analyses. You can quickly and easily search and install data packages using ggd. ggd also offers tools to easily create and contribute data packages to ggd.


d4-format

The D4 Quantitative Data Format. We sought to improve on existing formats such as BigWig and compressed BED files by creating the Dense Depth Data Dump (D4) format and tool suite. The D4 format is adaptive in that it profiles a random sample of aligned sequence depth from the input BAM or CRAM file to determine an optimal encoding that minimizes file size, while also enabling fast data access. We show that D4 uses less disk space for both RNA-Seq and whole-genome sequencing and offers 3- to 440-fold speed improvements over existing formats for random access, aggregation and summarization for scalable downstream analyses that would be otherwise intractable.
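The idea of profiling a sample of depths to pick an encoding can be illustrated with a small sketch. This is not d4tools code: the cost model (a dense table with `k` bits per base plus a sparse side table for larger values), the function name `choose_bit_width`, and the `overflow_bits` constant are all assumptions made for illustration.

```python
import random

def choose_bit_width(depths, sample_size=1000, overflow_bits=80, max_bits=16, seed=0):
    """Pick the per-base bit width that minimizes an estimated encoded size.

    Hypothetical cost model: every position costs `k` bits in a dense table,
    and each sampled depth >= 2**k costs an extra `overflow_bits` in a sparse
    side table. Neither constant comes from the D4 specification.
    """
    rng = random.Random(seed)
    sample = depths if len(depths) <= sample_size else rng.sample(depths, sample_size)
    best_k, best_cost = 0, float("inf")
    for k in range(max_bits + 1):
        # count sampled depths that would not fit in k bits
        overflow = sum(1 for d in sample if d >= (1 << k))
        cost = k * len(sample) + overflow_bits * overflow
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k
```

Under this toy model, a file where almost every base has zero depth (as is common for RNA-Seq) would favor a very small `k`, while a uniformly covered 30X genome would favor about 5 bits.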


smoove-nf

Nextflow implementation of the smoove workflow, integrating several other tools meant to facilitate variant calling and quality control of discovered variants.


seqcover

seqcover is a tool for viewing and evaluating depth-of-coverage with the following aims...

  • show a global view where it's easy to see problematic samples and genes
  • offer an interactive gene-wise view to explore coverage characteristics of individual samples within each gene
  • not require a server (single html page)
  • be responsive for up to 20 samples * 200 genes and be useful for a single sample
  • highlight outlier samples based on any number of (summarized) background samples


samplot

samplot is a command line tool for rapid, multi-sample structural variant visualization. samplot takes SV coordinates and bam files and produces high-quality images that highlight any alignment and depth signals that substantiate the SV.


oncogemini

OncoGEMINI is an adaptation of GEMINI intended for the improved identification of biologically and clinically relevant tumor variants from multi-sample and longitudinal tumor sequencing data. Using a GEMINI-compatible database (generated from an annotated VCF file), OncoGEMINI is able to filter tumor variants based on included genomic annotations and various allele frequency signatures.


mosdepth

Fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing. mosdepth can output...

  • per-base depth about 2x as fast as samtools depth--about 25 minutes of CPU time for a 30X genome.
  • mean per-window depth given a window size--as would be used for CNV calling.
  • the mean per-region given a BED file of regions.
  • the mean or median per-region cumulative coverage histogram given a window size
  • a distribution of proportion of bases covered at or above a given threshold for each chromosome and genome-wide.
  • quantized output that merges adjacent bases as long as they fall in the same coverage bins e.g. (10-20)
  • threshold output to indicate how many bases in each region are covered at the given thresholds.
  • A summary of mean depths per chromosome and within specified regions per chromosome.
  • a d4 file (better than bigwig).
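As an illustration of the quantized output described in the list above, the following sketch merges adjacent bases that fall in the same coverage bin. It is a plain-Python rendering of the idea, not mosdepth's implementation; the function name `quantize` and the choice of cutoffs are assumptions.

```python
from bisect import bisect_right

def quantize(depths, cutoffs):
    """Merge adjacent bases whose depth falls in the same bin.

    `cutoffs` are sorted lower bounds, e.g. [0, 1, 10, 20] gives the bins
    [0,1), [1,10), [10,20), [20,inf). Returns half-open
    (start, end, bin_index) runs over the positions in `depths`.
    """
    runs = []
    start, current = 0, None
    for pos, depth in enumerate(depths):
        b = bisect_right(cutoffs, depth) - 1  # index of the bin holding this depth
        if b != current:
            if current is not None:
                runs.append((start, pos, current))
            start, current = pos, b
    if current is not None:
        runs.append((start, len(depths), current))
    return runs
```

For example, `quantize([0, 0, 5, 7, 12, 25, 30], [0, 1, 10, 20])` collapses seven bases into four runs, one per bin.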


jigv

igv.js server and automatic configuration to view bam/cram/vcf/bed. igv.js requires that the files are hosted on a server, like apache or nginx and it requires writing html and javascript. In a single binary, jigv provides a server and some default configuration, javascript, and HTML enabling a simple entrypoint...

jigv --open-browser --region chr1:34566-34999 *.bam /path/to/some.cram my.vcf.gz


idplot

Designed to accelerate SARS-CoV-2 research, idplot allows one to quickly compare similar sequences (*.fasta) to a reference (.fasta) with options to inspect recombination and similarity within an interactive report.


freebayes-nf

A simplified version of freebayes-parallel written in Nextflow to handle job distribution on HPC resources. Intervals can be supplied by the user or created automatically to optimize compute utilization.


covviz

A many-sample coverage browser. The aim of covviz is to highlight regions of significant and sustained deviation of coverage depth from the majority of samples.


duphold

Uphold your DUP and DEL calls. SV callers like lumpy look at split-reads and pair distances to find structural variants. This tool is a fast way to add depth information to those calls. This can be used as additional information for filtering variants; for example we will be skeptical of deletion calls that do not have lower than average coverage compared to regions with similar gc-content.
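A minimal sketch of the filtering idea, assuming depth has already been summarized per window: compare the median depth inside a call to the median depth of background windows with similar GC content. The function name, the `(gc_fraction, median_depth)` window layout, and `gc_tolerance` are hypothetical and are not taken from duphold itself.

```python
from statistics import median

def gc_matched_fold_change(event_depths, event_gc, windows, gc_tolerance=0.05):
    """Ratio of median depth inside an SV call to the median depth of
    background windows with similar GC content.

    `windows` is a list of (gc_fraction, median_depth) pairs; this layout
    and `gc_tolerance` are illustrative choices, not duphold's internals.
    Returns None when no comparable background is available.
    """
    matched = [d for gc, d in windows if abs(gc - event_gc) <= gc_tolerance]
    if not matched:
        return None
    background = median(matched)
    if background == 0:
        return None
    return median(event_depths) / background
```

A deletion call whose ratio stays close to 1.0 has no depth support and would be a candidate for filtering; a heterozygous deletion would be expected to land near 0.5.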


smoove

smoove simplifies and speeds calling and genotyping SVs for short reads. It also improves specificity by removing many spurious alignment signals that are indicative of low-level noise and often contribute to spurious calls.


STRling

STRling (pronounced like "sterling") is a method to detect large STR expansions from short-read sequencing data. It is capable of detecting novel STR expansions, that is expansions where there is no STR in the reference genome at that position (or a different repeat unit from what is in the reference). It can also detect STR expansions that are annotated in the reference genome. STRling uses kmer counting to recover mis-mapped STR reads. It then uses soft-clipped reads to precisely discover the position of the STR expansion in the reference genome.
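The kmer-counting step can be sketched as follows: count the k-mers in a read, treat rotations of the same repeat unit as one key, and report how much of the read the most common unit explains. This is a simplified illustration under our own assumptions (it ignores reverse complements and sequencing errors) and is not STRling's code; both function names are hypothetical.

```python
from collections import Counter

def canonical_rotation(kmer):
    """Smallest rotation of a k-mer, so 'GCA', 'CAG' and 'AGC' share one key."""
    return min(kmer[i:] + kmer[:i] for i in range(len(kmer)))

def dominant_repeat(read, k):
    """Return (repeat_unit, fraction of k-mers in the read that match it)."""
    n = len(read) - k + 1
    if n <= 0:
        return None, 0.0
    counts = Counter(canonical_rotation(read[i:i + k]) for i in range(n))
    unit, count = counts.most_common(1)[0]
    return unit, count / n
```

A read with a high fraction for some unit is likely to be STR sequence, wherever the aligner happened to place it.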


indexcov

Crazy fast genome coverage estimates! The BAM and CRAM formats provide a supplementary linear index that facilitates rapid access to sequence alignments in arbitrary genomic regions. Comparing consecutive entries in a BAM or CRAM index allows one to infer the number of alignment records per genomic region for use as an effective proxy of sequence depth in each genomic region. Based on these properties, we have developed indexcov, an efficient estimator of whole-genome sequencing coverage to rapidly identify samples with aberrant coverage profiles, reveal large scale chromosomal anomalies, recognize potential batch effects, and infer the sex of a sample.
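The index comparison can be sketched in a few lines, assuming the linear index has already been parsed into one virtual file offset per 16,384-base tile. The function name and the median scaling are our assumptions for illustration; this is not the indexcov source.

```python
from statistics import median

def scaled_tile_depth(virtual_offsets):
    """Estimate relative depth per 16,384-base tile from a linear index.

    Each BAM virtual offset packs the compressed block offset in its upper
    48 bits, so the gap between consecutive tiles approximates how many
    bytes of alignments fall in that tile. Dividing by the median puts
    typical tiles near 1.0.
    """
    block_offsets = [v >> 16 for v in virtual_offsets]
    sizes = [b - a for a, b in zip(block_offsets, block_offsets[1:])]
    nonzero = [s for s in sizes if s > 0]
    if not nonzero:
        return []
    mid = median(nonzero)
    return [s / mid for s in sizes]
```

Because only the index is read, a sketch like this never touches the alignments themselves, which is where the speed comes from.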


bedtools

Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks. The most widely-used tools enable genome arithmetic. That is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
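To make "set theory on the genome" concrete, here is a small sketch of merge and intersect on half-open intervals from a single chromosome. It mirrors the concepts only; bedtools itself handles multiple chromosomes, strands, and file formats, and unlike this sketch it does not merge its inputs before intersecting.

```python
def merge(intervals):
    """Merge overlapping or book-ended half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def intersect(a, b):
    """Return the segments shared by two interval lists."""
    a, b = merge(a), merge(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            out.append((start, end))
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out
```

Chaining the two functions, as in `intersect(merge(x), y)`, is the in-memory analogue of piping one bedtools command into another.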


lumpy

LUMPY is a novel and general probabilistic SV discovery framework that naturally integrates multiple SV detection signals, including those generated from read alignments or prior evidence, and that can readily adapt to any additional source of evidence that may become available with future technological advances.


gemini

GEMINI (GEnome MINIng) is a flexible framework for exploring genetic variation in the context of the wealth of genome annotations available for the human genome. By placing genetic variants, sample phenotypes and genotypes, as well as genome annotations into an integrated database framework, GEMINI provides a simple, flexible, and powerful system for exploring genetic variation for rare disease and population genetics.


gqt

Genotype Query Tools (GQT) is command line software and a C API for indexing and querying large-scale genotype data sets like those produced by 1000 Genomes, the UK100K, and forthcoming datasets involving millions of genomes. GQT represents genotypes as compressed bitmap indices, which reduce computational burden of variant queries based on sample genotypes, phenotypes, and relationships by orders of magnitude over standard "variant-centric" indexing strategies. This index can significantly expand the capabilities of population-scale analyses by providing interactive-speed queries to data sets with millions of individuals.
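The bitmap idea can be illustrated with Python integers standing in for bit vectors: each sample gets one bitmap per genotype state, and a query across samples becomes a bitwise AND. This sketch omits the compression that GQT applies, and the function names and data layout are hypothetical rather than part of the GQT API.

```python
def build_index(genotypes):
    """genotypes[sample][variant] is 0, 1 or 2 (alternate allele count).

    Returns index[sample][state] as an int whose bit v is set when the
    sample has that genotype state at variant v.
    """
    index = {}
    for sample, calls in genotypes.items():
        bitmaps = [0, 0, 0]
        for v, g in enumerate(calls):
            bitmaps[g] |= 1 << v
        index[sample] = bitmaps
    return index

def variants_where_all(index, samples, state, n_variants):
    """Variant indices at which every listed sample has the given state."""
    hits = (1 << n_variants) - 1  # start with every variant selected
    for s in samples:
        hits &= index[s][state]
    return [v for v in range(n_variants) if (hits >> v) & 1]
```

The work per sample is one AND over the whole variant axis, which is why organizing the data by individual rather than by variant pays off for sample-driven queries.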


poretools

Poretools is a flexible toolkit for exploring datasets generated by nanopore sequencing devices from MinION for the purposes of quality control and downstream analysis. Poretools operates directly on the native FAST5 (an application of the HDF5 standard) file format produced by ONT and provides a wealth of format conversion utilities and data exploration and visualization tools.


speedseq

SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 h on a low-cost server and alleviates a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers performance competitive with or superior to current methods for detecting germline and somatic single-nucleotide variants, structural variants, insertions and deletions, and it includes novel functionality for streamlined interpretation.


vcfanno

vcfanno annotates a VCF with any number of sorted and tabixed input BED, BAM, and VCF files in parallel. It does this by finding overlaps as it streams over the data and applying user-defined operations on the overlapping annotations.
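A minimal sketch of streaming overlap with a user-defined operation, assuming both inputs are sorted by start position on one chromosome. The function name `annotate` and the tuple layouts are assumptions for illustration; vcfanno itself is configured through files and runs in parallel.

```python
def annotate(queries, annotations, op):
    """Apply `op` to the values of annotations overlapping each query.

    Both inputs are sorted by start on one chromosome. `queries` holds
    half-open (start, end) pairs; `annotations` holds (start, end, value).
    """
    results = []
    active = []  # annotations that may still overlap upcoming queries
    i = 0
    for q_start, q_end in queries:
        # pull in annotations that start before this query ends
        while i < len(annotations) and annotations[i][0] < q_end:
            active.append(annotations[i])
            i += 1
        # drop annotations that end at or before this query's start
        active = [a for a in active if a[1] > q_start]
        values = [a[2] for a in active if a[0] < q_end]
        results.append((q_start, q_end, op(values) if values else None))
    return results
```

Passing `max`, `min`, `len`, or a custom function as `op` corresponds to choosing how overlapping annotations are summarized for each variant.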


peddy

peddy is a Python library for querying, QC'ing, and manipulating pedigree files.


Giggle

Giggle is Google for genomic features and intervals. That is, a scalable, multi-file index for fast queries of genomic intervals.


Publications

Preprints

Extensive recombination-driven coronavirus diversification expands the pool of potential pandemic pathogens

Stephen A. Goldstein, Joe Brown, Brent S Pedersen, Aaron R. Quinlan, Nels C. Elde

https://doi.org/10.1101/2021.02.03.429646

Unfazed: parent-of-origin detection for large and small de novo variants

Jonathan R Belyeu, Thomas A Sasani, Brent S Pedersen, Aaron R Quinlan

https://doi.org/10.1101/2021.02.03.429658

Poxviruses capture host genes by LINE-1 retrotransposition

Sarah M. Fixsen, Kelsey R. Cone, Stephen A. Goldstein, Thomas A. Sasani, Aaron R. Quinlan, Stefan Rothenburg, Nels C. Elde.

https://doi.org/10.1101/2020.10.26.356402

Efficient storage and analysis of quantitative genomics data with the Dense Depth Data Dump (D4) format and d4tools

Hao Hou, Brent Pedersen, Aaron Quinlan.

https://doi.org/10.1101/2020.10.23.352567

CaBagE: a Cas9-based Background Elimination strategy for targeted, long-read DNA sequencing

Amelia Wallace, Thomas A. Sasani, Jordan Swanier, Brooke L. Gates, Jeff Greenland, Brent S. Pedersen, K-T Varley, Aaron R. Quinlan.

https://doi.org/10.1101/2020.10.13.337253

Effective variant filtering and expected candidate variant yield in studies of rare human disease

Brent S. Pedersen, Joseph Brown, Harriet Dashnow, Amelia D. Wallace, Matt Velinder, Tatiana Tvrdik, Rong Mao, D. Hunter Best, Pinar Bayrak-Toydemir, Aaron R. Quinlan.

https://doi.org/10.1101/2020.08.13.249532

OncoGEMINI: Software for Investigating Tumor Variants From Multiple Biopsies With Integrated Cancer Annotations

Thomas J. Nicholas, Michael J. Cormier, Xiaomeng Huang, Yi Qiao, Gabor T. Marth, Aaron R. Quinlan.

https://doi.org/10.1101/2020.03.10.979591

Go Get Data (GGD): simple, reproducible access to scientific data

Michael J. Cormier, Jonathan R. Belyeu, Brent S. Pedersen, Joseph Brown, Johannes Koster, Aaron R. Quinlan.

https://doi.org/10.1101/2020.09.10.291377

De novo structural mutation rates and gamete-of-origin biases revealed through genome sequencing of 2,396 families

Jonathan R. Belyeu, Harrison Brand, Harold Wang, Xuefang Zhao, Brent S. Pedersen, Julie Feusier, Meenal Gupta, Thomas J. Nicholas, Lisa Baird, Bernie Devlin, Stephan J. Sanders, Lynn B. Jorde, Michael E. Talkowski, Aaron R. Quinlan.

https://doi.org/10.1101/2020.10.06.329011

Samplot: A Platform for Structural Variant Visual Validation and Automated Filtering

Jonathan R. Belyeu, Murad Chowdhury, Joseph Brown, Brent S. Pedersen, Michael J. Cormier, Aaron R. Quinlan, Ryan M. Layer.

https://doi.org/10.1101/2020.09.23.310110

2020

Somalier: rapid relatedness estimation for cancer and germline studies using efficient genome sketches

Brent S. Pedersen, Preetida J. Bhetariya, Joe Brown, Stephanie N. Kravitz, Gabor Marth, Randy L. Jensen, Mary P. Bronner, Hunter R. Underhill, Aaron R. Quinlan.

Genome Medicine, https://doi.org/10.1186/s13073-020-00761-2

Germline mutation rates in young adults predict longevity and reproductive lifespan

Richard M. Cawthon, Huong D. Meeks, Thomas A. Sasani, Ken R. Smith, Richard A. Kerber, Elizabeth O’Brien, Lisa Baird, Melissa M. Dixon, Andreas P. Peiffer, Mark F. Leppert, Aaron R. Quinlan, Lynn B. Jorde.

Scientific Reports, https://doi.org/10.1038/s41598-020-66867-0

XPRESSyourself: Enhancing, standardizing, and automating ribosome profiling computational analyses yields improved insight into data

Jordan A. Berg, Jonathan R. Belyeu, Jeffrey T. Morgan, Yeyun Ouyang, Alex J. Bott, Aaron R. Quinlan, Jason Gertz, Jared Rutter.

PLoS computational biology, https://doi.org/10.1371/journal.pcbi.1007625

2019

Large, three-generation CEPH families reveal post-zygotic mosaicism and variability in germline mutation accumulation

Thomas A. Sasani, Brent S. Pedersen, Ziyue Gao, Lisa Baird, Molly Przeworski, Lynn B. Jorde, Aaron R. Quinlan.

eLife, https://elifesciences.org/articles/46922

Duphold: scalable, depth-based annotation and curation of high-confidence structural variant calls.

Brent S. Pedersen, Aaron R. Quinlan.

GigaScience, https://doi.org/10.1093/gigascience/giz040

Overlooked roles of DNA damage and maternal age in generating human germline mutations.

Ziyue Gao, Priya Moorjani, Thomas A. Sasani, Brent S. Pedersen, Aaron R. Quinlan, Lynn B. Jorde, Guy Amster, Molly Przeworski.

PNAS, https://doi.org/10.1073/pnas.1901259116

Coexpression patterns define epigenetic regulators associated with neurological dysfunction.

Leandros Boukas, James M. Havrilla, Peter F. Hickey, Aaron R. Quinlan, Hans T. Bjornsson, Kasper D. Hansen.

Genome Research, https://doi.org/10.1101/gr.239442.118

2018

A map of constrained coding regions in the human genome.

James M. Havrilla, Brent S. Pedersen, Ryan M. Layer, Aaron R. Quinlan

Nature Genetics, https://doi.org/10.1038/s41588-018-0294-6

Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder.

An JY, Lin K, Zhu L, Werling DM, Dong S, Brand H, Wang HZ, Zhao X, Schwartz GB, Collins RL, Currall BB, Dastmalchi C, Dea J, Duhn C, Gilson MC, Klei L, Liang L, Markenscoff-Papadimitriou E, Pochareddy S, Ahituv N, Buxbaum JD, Coon H, Daly MJ, Kim YS, Marth GT, Neale BM, Quinlan AR, Rubenstein JL, Sestan N, State MW, Willsey AJ, Talkowski ME, Devlin B, Roeder K, Sanders SJ.

Science, doi: 10.1126/science.aat6576

Long read sequencing reveals poxvirus evolution through rapid homogenization of gene arrays.

Sasani TA, Cone KR, Quinlan AR, Elde NC.

eLife, doi: 10.7554/eLife.35453

Whole-genome analysis for effective clinical diagnosis and gene discovery in early infantile epileptic encephalopathy.

Ostrander BEP, Butterfield RJ, Pedersen BS, Farrell AJ, Layer RM, Ward A, Miller C, DiSera T, Filloux FM, Candee MS, Newcomb T, Bonkowsky JL, Marth GT, Quinlan AR

npj Genomic Medicine, doi: 10.1038/s41525-018-0061-8

Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.

Simovski B, Kanduri C, Gundersen S, Titov D, Domanska D, Bock C, Bossini-Castillo L, Chikina M, Favorov A, Layer RM, Mironov AA, Quinlan AR, Sheffield NC, Trynka G, Sandve GK.

Nucleic Acids Research, doi: 10.1093/nar/gky474

SV-plaudit: A cloud-based framework for manually curating thousands of structural variants.

Belyeu JR, Nicholas TJ, Pedersen BS, Sasani TA, Havrilla JM, Kravitz SN, Conway ME, Lohman BK, Quinlan AR, Layer RM.

Gigascience, doi: 10.1093/gigascience/giy064

hts-nim: scripting high-performance genomic analyses.

Pedersen BS, Quinlan AR

Bioinformatics, doi: 10.1093/bioinformatics/bty358

An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder.

Donna M Werling, Harrison Brand, Joon-Yong An, Matthew R Stone, Joseph T Glessner, Lingxue Zhu, Ryan L Collins, Shan Dong, Ryan M Layer, Eiriene-Chloe Markenscoff-Papadimitriou, Andrew Farrell, Grace B Schwartz, Benjamin B Currall, Jeanselle Dea, Clif Duhn, Carolyn Erdman, Michael Gilson, Robert E Handsaker, Seva Kashin, Lambertus Klei, Jeffrey D Mandell, Tomasz J Nowakowski, Yuwen Liu, Sirisha Pochareddy, Louw Smith, Michael F Walker, Harold Z Wang, Mathew J Waterman, Xin He, Arnold R Kriegstein, John L Rubenstein, Nenad Sestan, Steven A McCarroll, Ben M Neale, Hilary Coon, A. Jeremy Willsey, Joseph D Buxbaum, Mark J Daly, Matthew W State, Aaron Quinlan, Gabor T Marth, Kathryn Roeder, Bernie Devlin, Michael E Talkowski, Stephan J Sanders

Nature Genetics, DOI: 10.1038/s41588-018-0107-y

Nanopore sequencing and assembly of a human genome with ultra-long reads

Miten Jain, Sergey Koren, Josh Quick, Arthur C Rand, Thomas A Sasani, John R Tyson, Andrew D Beggs, Alexander T Dilthey, Ian T Fiddes, Sunir Malla, Hannah Marriott, Karen H Miga, Tom Nieto, Justin O'Grady, Hugh E Olsen, Brent S Pedersen, Arang Rhie, Hollian Richardson, Aaron Quinlan, Terrance P Snutch, Louise Tee, Benedict Paten, Adam M. Phillippy, Jared T Simpson, Nicholas James Loman, Matthew Loose

Nature Biotechnology, DOI: 10.1038/nbt.4060

GIGGLE: a search engine for large-scale integrated genome analysis

Ryan M. Layer, Brent S. Pedersen, Tonya DiSera, Gabor T. Marth, Jason Gertz, Aaron R. Quinlan

Nature Methods, doi: 10.1038/nmeth.4556

mosdepth: quick coverage calculation for genomes and exomes

Brent S. Pedersen and Aaron Quinlan

Bioinformatics, https://doi.org/10.1093/bioinformatics/btx699

2017

Indexcov: fast coverage quality control for whole-genome sequencing.

Brent S. Pedersen, Ryan L Collins, Michael E Talkowski, Aaron Quinlan

GigaScience, https://doi.org/10.1093/gigascience/gix090

Settling the score: variant prioritization and Mendelian disease.

Karen Eilbeck*, Aaron Quinlan*, Mark Yandell

Nature Reviews Genetics doi:10.1038/nrg.2017.52

Combating subclonal evolution of resistant cancer phenotypes.

Andrea Bild, Samuel Brady, Jasmine McQuerry, Yi Qiao, Stephen Piccolo, Gajendra Shrestha, Ryan Layer, Brent Pedersen, David Jenkins, Ryan Miller, Amanda Esch, Sara Selitsky, Joel Parker, Layla Anderson, Chakravarthy Reddy, Jonathan Boltax, Dean Li, Philip Moos, Joe Gray, Laura Heiser, W. Evan Johnson, Saundra Buys, Adam Cohen, Quinlan AR, Gabor Marth, Theresa Werner, Brian Dalley, and Rachel Factor

Nature Communications, doi:10.1038/s41467-017-01174-3

Identification of ATIC as a novel target for chemoradiosensitization.

Xiangfei Liu, Uma Devi Paila, Sharon N. Teraoka, Jocyndra A. Wright, Xin Huang, Quinlan AR, Richard A. Gatti and Patrick Concannon

International Journal of Radiation Oncology, doi:10.1016/j.ijrobp.2017.08.033

Who’s Who? Detecting and Resolving Sample Anomalies in Human DNA Sequencing Studies with Peddy.

Pedersen BS, Quinlan AR†

AJHG doi: 10.1016/j.ajhg.2017.01.017

cyvcf2: fast, flexible variant analysis with Python.

Pedersen BS, Quinlan AR†

Bioinformatics doi: 10.1093/bioinformatics/btx057

2016

Vcfanno: fast, flexible annotation of genetic variants.

Pedersen BS, Layer RM, Quinlan AR†

Genome Biol. doi: 10.1186/s13059-016-0973-5

Targeted Deep Sequencing in Multiple-Affected Sibships of European Ancestry Identifies Rare Deleterious Variants in PTPN22 that Confer Risk for Type 1 Diabetes.

Ge Y, Onengut-Gumuscu S, Quinlan AR, Mackey AJ, Wright JA, Buckner JH, Habib T, Rich SS, Concannon P.

Diabetes. pii: db150322

2015

Efficient genotype compression and analysis of large genetic-variation data sets.

Layer RM, Kindlon N, Karczewski K, Exome Aggregation Consortium, Quinlan AR†

Nature Methods. doi:10.1038/nmeth.3654

A parallel algorithm for N-way interval set intersection.

Layer RM, Quinlan AR†

IEEE Proceedings.

Speedseq: Ultra-fast personal genome analysis and interpretation.

Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP, Marth GT, Quinlan AR, Hall IM.

Nature Methods. doi:10.1038/nmeth.3505

Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers.

Onengut-Gumuscu S, Chen WM, Burren O, Cooper NJ, Quinlan AR, et al.

Nature Genetics. doi:10.1038/ng.3245

Population-based structural variation discovery with Hydra-Multi.

Lindberg MR, Hall IM, Quinlan AR†, et al.

Bioinformatics. doi:10.1093/bioinformatics/btu771

Extending reference assembly models.

Church DM, Schneider VA, Steinberg KM, Schatz MC, Quinlan AR, Chin CS, Kitts PA, Aken B, Marth GT, Hoffman MM, Herrero J, Mendoza ML, Durbin R, Flicek P.

Genome Biology. doi:10.1186/s13059-015-0587-3.

2014

Genetics of Systemic Lupus Erythematosus: Immune Responses and End Organ Resistance to Damage.

Dai C, Deng Y, Quinlan AR, Gaskin F, Tsao B, Fu SM.

Current Opinion in Immunology. doi:10.1016/j.coi.2014.10.004

A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer.

Quick J, Quinlan AR, Loman N.

GigaScience. doi: 10.1186/2047-217X-3-22

PORETOOLS: a toolkit for working with nanopore sequencing data from Oxford Nanopore

Loman N, Quinlan AR†.

Bioinformatics. doi:10.1093/bioinformatics/btu555

SubcloneSeeker: a computational framework for reconstructing tumor clone structure for cancer variant interpretation and prioritization.

Qiao Y, Quinlan AR, Jazaeri A, Verhaak R, Wheeler D, Marth G.

Genome Biology. doi:10.1186/s13059-014-0443-x

BEDTools: the Swiss-army tool for genome interval arithmetic.

Quinlan AR†

Current Protocols in Bioinformatics. doi: 10.1002/0471250953.bi1112s47

LUMPY: A probabilistic framework for sensitive detection of chromosomal rearrangements.

Layer RM, Quinlan AR†, Hall IM.

Genome Biology. doi:10.1186/gb-2014-15-6-r84

Homozygous mutation of MTPAP causes cellular radiosensitivity and persistent DNA double strand breaks.

Martin N, Nakamura K, Paila U, Woo J, Brown C, Wright J, Teraoka S, Haghayegh S, McCurdy D, Schneider M, Hu H, Quinlan AR, Gatti R, and Concannon P.

Cell Death Dis. doi: 10.1038/cddis.2014.99

A Novel IFITM5 Mutation in Severe Osteogenesis Imperfecta Decreases PEDF Secretion by Osteoblasts.

Farber CR, Reich A, Barnes AM, Becerra P, Rauch F, Cabral WA, Bae A, Quinlan AR, Glorieux FH, Clemens TL, and Marini JC.

J Bone Miner Res. doi: 10.1002/jbmr.2173

Pathogenic variants for Mendelian and complex traits in exomes of 6,517 European and African Americans: implications for the return of incidental results.

Tabor HK, Auer PL, Jamal SM, Chong JX, Yu JH, Gordon AS, Graubert TA, O’Donnell CJ, Rich SS, Nickerson DA; NHLBI Exome Sequencing Project, Bamshad MJ.

Am J Hum Genet. doi: 10.1016/j.ajhg.2014.07.006

Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

Lange LA, Hu Y, Zhang H, NHLBI Grand Opportunity Exome Sequencing Project, et al.

Am J Hum Genet. doi: 10.1016/j.ajhg.2014.01.010

Quantifying rare, deleterious variation in 12 human cytochrome P450 drug-metabolism genes in a large-scale exome dataset.

Gordon AS, Tabor HK, Johnson AD, Snively BM, NHLBI GO Exome Sequencing Project, et al.

Hum Mol Genet, doi: 10.1093/hmg/ddt588

2013

GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations.

Paila U, Chapman BA, Kirchner R, Quinlan AR†.

PLoS Comput Biol. doi:10.1371/journal.pcbi.1003153

Joint linkage and association analysis with exome sequence data implicates SLC25A40 in hypertriglyceridemia.

Rosenthal EA, Ranchalis J, Crosslin DR, Burt A, Brunzell JD, Motulsky AG, Nickerson DA; NHLBI GO Exome Sequencing Project, Wijsman EM, Jarvik GP.

Am J Hum Genet., doi: 10.1016/j.ajhg.2013.10.019

Recurrent gain-of-function mutation in PRKG1 causes thoracic aortic aneurysms and acute aortic dissections.

Guo DC, Regalado E, NHLBI Grand Opportunity Exome Sequencing Project, et al.

Am J Hum Genet., doi: 10.1016/j.ajhg.2013.06.019

Fine-scale patterns of population stratification confound rare variant association tests.

O’Connor TD, Kiezun A, Bamshad M, Rich SS, Smith JD, Turner E; NHLBI GO Exome Sequencing Project; ESP Population Genetics, Statistical Analysis Working Group, Leal SM, Akey JM.

PLoS One. doi:10.1371/journal.pone.0065834

Common and rare von Willebrand factor (VWF) coding variants, VWF levels, and factor VIII levels in African Americans: the NHLBI Exome Sequencing Project.

Johnsen JM, Auer PL, Morrison AC, Jiao S, Wei P, Haessler J, Fox K, McGee SR, Smith JD, Carlson CS, Smith N, Boerwinkle E, Kooperberg C, Nickerson DA, Rich SS, Green D, Peters U, Cushman M, Reiner AP; NHLBI Exome Sequencing Project.

Blood. doi:10.1182/blood-2013-02-485094

Exome sequencing and genome-wide linkage analysis in 17 families illustrate the complex contribution of TTN truncating variants to dilated cardiomyopathy.

Norton N, Li D, Rampersaud E, Morales A, Martin ER, Zuchner S, Guo S, Gonzalez M, Hedges DJ, Robertson PD, Krumm N, Nickerson DA, Hershberger RE; National Heart, Lung, and Blood Institute GO Exome Sequencing Project and the Exome Sequencing Project Family Studies Project Team.

Circ Cardiovasc Genet. doi:10.1161/CIRCGENETICS.111.000062

Breakpoint profiling of 64 cancer genomes reveals numerous complex rearrangements spawned by homology-independent mechanisms.

Malhotra A, Lindberg M, Leibowitz M, Clark R, Faust G, Layer R, Quinlan AR†, and Hall IM†.

Genome Research, doi:10.1101/gr.143677.112

Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants

Fu W, O’Connor TD, Jun G, Kang HM, Abecasis G, Leal SM, Gabriel S, Rieder MJ, Altshuler D, Shendure J, Nickerson DA, Bamshad MJ; NHLBI Exome Sequencing Project, Akey JM.

Nature. doi:10.1038/nature11690

Binary Interval Search (BITS): A Scalable Algorithm for Counting Interval Intersections.

Layer R, Robins G, Skadron K, Quinlan AR†

Bioinformatics. doi: 10.1093/bioinformatics/bts652

TGFB2 mutations cause familial thoracic aortic aneurysms and dissections associated with mild systemic features of Marfan syndrome.

Boileau C, Guo DC, Hanna N, Regalado ES, D, NHLBI GO Exome Sequencing Project, et al.

Nature Genetics. doi:10.1038/ng.2348

Exome sequencing of extreme phenotypes identifies DCTN4 as a modifier of chronic Pseudomonas aeruginosa infection in cystic fibrosis.

Emond MJ, Louie T, Emerson J, Zhao W, NHLBI Exome Sequencing Project; Lung GO, Gibson RL, Bamshad MJ.

Nature Genetics. doi:10.1038/ng.2344

2007-2012

Copy number variation detection and genotyping from exome sequence data.

Krumm N, Sudmant PH, Ko A, O’Roak BJ, NHLBI Exome Sequencing Project, Quinlan AR, Nickerson DA, Eichler EE.

Genome Research. doi: 10.1101/gr.138115.112

Characterizing complex structural variation in germline and somatic genomes.

Quinlan AR and Hall IM.

Trends in Genetics. doi: 10.1016/j.tig.2011.10.002

Detection and interpretation of genomic structural variation in mammals.

Quinlan AR and Hall IM.

Methods in Molecular Biology

Genome sequencing of mouse induced pluripotent stem cells reveals retroelement stability and infrequent DNA rearrangement during reprogramming.

Quinlan AR, Boland MJ, Leibowitz ML, Shumilina S, Pehrson SM, Baldwin KK, Hall IM.

Cell Stem Cell. doi: 10.1016/j.stem.2011.07.018

Evidence for two independent associations with type 1 diabetes at the 12q13 locus.

Keene KL, Quinlan AR, Hou X, Hall IM, Mychaleckyj, Onengut-Gumuscu S, Concannon P.

Genes and Immunity. doi: 10.1038/gene.2011.56

Pybedtools: a flexible Python library for manipulating genomic datasets and annotations.

Dale R, Pedersen B, Quinlan AR†.

Bioinformatics. doi: 10.1093/bioinformatics/btr539

BamTools: a C++ API and toolkit for analyzing and managing BAM files.

Barnett D, Garrison E, Quinlan AR, Stromberg M, Marth G.

Bioinformatics. doi: 10.1093/bioinformatics/btr174

A map of human genome variation from population-scale sequencing.

1000 Genomes Project Consortium.

Nature. doi: 10.1038/nature09534

BEDTools: A flexible framework for comparing genomic features.

Quinlan AR and Hall IM.

Bioinformatics. doi: 10.1093/bioinformatics/btq033

Genome-wide mapping and assembly of structural variant breakpoints in the mouse genome.

Quinlan AR, Clark RA, Sokolova S, Leibowitz ML, Zhang Y, Hurles ME, Mell JC, Hall IM.

Genome Research. doi: 10.1101/gr.102970.109

Population Genomic Inferences from Sparse High-Throughput Sequencing of Two Populations of Drosophila melanogaster.

Sackton TB, Kulathinal RJ, Bergman CM, Quinlan AR, Dopman E, Marth GT, Hartl DL, Clark AG.

Genome Biol Evol. doi: 10.1093/gbe/evp048

Rapid whole-genome mutational profiling using next-generation sequencing technologies.

Smith D, Quinlan AR, Peckham HR, et al.

Genome Research. doi: 10.1101/gr.077776.108

Whole Genome Sequencing and SNP Discovery for C. elegans using massively parallel sequencing-by-synthesis.

Hillier LW, Marth GT, Quinlan AR, et al.

Nature Methods.

PyroBayes: Accurate quality scores for 454 Life Science pyrosequences.

Quinlan AR, Stewart D, Stromberg M, Marth GT.

Nature Methods. doi:10.1038/nmeth.1172

Primer-site SNPs mask mutations.

Quinlan AR and Marth GT.

Nature Methods. doi:10.1038/nmeth0307-192

The Lab

We are a hardworking group of geneticists and computational biologists at the University of Utah in the Departments of Human Genetics and Biomedical Informatics. We're committed to developing and applying cutting-edge methods to understand genome biology and the genetic basis of disease. Working in our lab is a unique opportunity to learn and apply large-scale genomics methodologies and to advance our understanding of human diseases. Please contact us if you are passionate about working in this area. To learn more about Salt Lake City and the incredible research and quality of life in Utah, please visit the "Why Utah" resource.

Aaron Quinlan

Principal Investigator

Brent Pedersen

Senior Programmer

Tom Nicholas

Sr. Research Scientist

Meenal Gupta

Sr. Research Scientist

Hao Hou

Programmer / Staff Scientist

Peter McHale

Sr. Analyst/Programmer

Joe Brown

Data Engineer / Staff Scientist

Amelia Wallace

Postdoctoral Research Associate

Harriet Dashnow

Postdoctoral Scientist

Stephanie Kravitz

Graduate Student

Jonathan Belyeu

Graduate Student

Michael Cormier

Graduate Student

Simone Longo

Graduate Student

John Chamberlin

Graduate Student

Jason Kunisaki

Graduate Student

Alumni

Tom Sasani

Graduate Student

Brian Lohman

Postdoctoral Research Associate

Ryan Layer

Assistant Professor at CU-Boulder

Jim Havrilla

Graduate Student

Uma Paila

Postdoctoral Scientist

Neil Kindlon

Programmer

John Kubinski

Undergraduate Researcher

Phanwadee Sinthong

Undergraduate Researcher

Nathan Wilkinson

Undergraduate Student

Teaching

Salt Lake Learners of Biostats (SLLOBS)

SLLOBS - Lecture 03 - Data Frames and crude RNA-seq analysis

SLLOBS - Lecture 05 - Data Visualization with ggplot2

SLLOBS - Lecture 08 - Intro to Probability with Coin Tosses

SLLOBS - Lecture 11 - Bayes's Rule, variance, prob. distributions, expectation, covariance

SLLOBS - Lecture 12 - Poisson distributions in biology

SLLOBS - Lecture 13 - Gaussian Processes and QQ Plots

SLLOBS - Lecture 14 - t-statistics, t-distribution, t-tests, and p-values

SLLOBS - Lecture 16 - Intro to regression and model interpretation.

Applied Computational Genomics

Tutorials

Lectures

Lab.blog

Funding

We are very grateful to receive generous funding for our research from the National Human Genome Research Institute, the National Cancer Institute, USTAR, the Simons Foundation, and the Margolis Foundation.

Contact Us

Aaron Quinlan is an Associate Professor in the Department of Human Genetics and the Department of Biomedical Informatics at the University of Utah. Our lab is located on the 7th floor of The Eccles Institute for Human Genetics at The University of Utah. We are a part of the Utah Center for Genetic Discovery.

Eccles Institute for Human Genetics
15 North 2030 East, Room 7160B

aquinlan at genetics dot utah dot edu