1000 Genomes Project LRS Sequencing Consortium

The 1000 Genomes Project LRS Sequencing Consortium (1KGP-LRS) is building on the landmark work done by the 1000 Genomes Project (1KGP), which began in 2008 as a collaborative initiative to establish a database of normal human genetic variation by sequencing the genomes of over a thousand healthy individuals from diverse ancestries. In the end, the 1KGP study sequenced over 3,000 genomes using short-read sequencing, and it continues to provide invaluable insights into human genetic diversity.

The 1KGP-LRS Consortium kicked off on Thursday, June 30, 2022. Funding permitting, we hope to perform long-read sequencing of all 1KGP samples, which are available as DNA or cell lines from the NHGRI's Sample Repository for Human Genetic Research housed at the Coriell Institute for Medical Research. To obtain high-coverage, high-quality long-read assemblies, we are isolating high molecular weight DNA directly from cultures of 1KGP cell lines obtained from Coriell.

The goal of the 1KGP-LRS Consortium is to identify a broader spectrum of genomic variation than is possible using short-read sequencing so we may further improve our understanding of human genetic disease. This dataset is already enabling us to better understand normal patterns of human structural variation, identify variation in difficult-to-map regions of the genome, and study repeat expansions and methylation patterns.

The People

We are a collaborative group of researchers from around the world interested in leveraging long-read sequencing to better understand the normal patterns of structural variation, methylation, and repeat expansion in the population so we can more effectively identify missing disease-causing variation in individuals. The project is led by Danny Miller and Evan Eichler at the University of Washington, and cell culture and DNA extraction are performed in the Miller and Eichler labs. Sequencing is performed at the University of Washington, the New York Genome Center, and Stanford University. Individuals and institutions contributing to this work are listed below. If you would like to be included here, please let us know.

Zachary Anderson
University of Washington

Anna O Basile
New York Genome Center

Wayne E Clarke
New York Genome Center

André Corvelo
New York Genome Center

Nikhita Damaraju
University of Washington

Harriet Dashnow
University of Utah / University of Colorado School of Medicine

Wouter De Coster
University of Antwerp

Evan E Eichler
University of Washington

Erik Garrison
University of Tennessee Health Science Center

Sophia B Gibson
University of Washington

Joy Goffena
University of Washington

Claudia Gonzaga-Jauregui
Universidad Nacional Autónoma de México

Sara Goodwin
Cold Spring Harbor Laboratory

Andrea Guarracino
University of Tennessee Health Science Center

Jonas A Gustafson
University of Washington

Adrienne Helland
New York Genome Center

Kendra Hoekzema
University of Washington

Miten Jain
Northeastern University

Tanner D Jensen
Stanford University

Mikhail Kolmogorov
National Cancer Institute, NIH

Qiuhui Li
Johns Hopkins University

Matthew Loose
University of Nottingham

W Richard McCombie
Cold Spring Harbor Laboratory

Richard N McLaughlin Jr
Pacific Northwest Research Institute / University of Washington

Angela L Miller
University of Washington

Danny E Miller
University of Washington

Stephen B Montgomery
Stanford University

Rajeeva Lochan Musunuri
New York Genome Center

Nathan D Olson
National Institute of Standards and Technology

Cate R Paschal
Seattle Children's Hospital

Karynne E Patterson
University of Washington

Catherine E Reeves
New York Genome Center

Mahler Revsine
Johns Hopkins University

Phillip A Richmond
Alamya Health

Esther Robb
Stanford University

Michael C Schatz
Johns Hopkins University

Fritz J Sedlazeck
Baylor College of Medicine / Rice University

Maisha Sinha
University of Washington

Anthony A Snead
New York University

Sophie HR Storz
University of Washington

David Twesigomwe
University of the Witwatersrand

Rachel A Ungar
Stanford University

Sydney A Ward
University of Washington

Lei Yang
Pacific Northwest Research Institute

Christina Zakarian
University of Washington

Miranda PG Zalusky
University of Washington

Michael C Zody
New York Genome Center

Justin M Zook
National Institute of Standards and Technology

University of Washington Center for Rare Disease Research (UW-CRDR)

Genomics Research to Elucidate the Genetics of Rare Diseases (GREgoR) Consortium

The Technology

Long-read sequencing is being performed on both the Oxford Nanopore and PacBio platforms. Nanopore technology detects changes in current as single-stranded DNA or RNA molecules pass through a protein pore. The first 100 samples were sequenced on the R9.4.1 pore, and subsequent samples were sequenced with the higher-accuracy R10 chemistry. PacBio sequencing identifies DNA bases through real-time detection of fluorescently labeled nucleotides. Both platforms support direct detection of epigenetic modifications, including DNA methylation.

Data from both platforms will be integrated through a joint analysis pipeline that harmonizes variant calls, resolves discrepancies between platforms, and consolidates epigenetic signatures to produce a comprehensive genomic profile.

The Data

The 1KGP-LRS Consortium is committed to publicly releasing data as they are generated, after basecalling and standard QC. Raw Nanopore sequencing data, processed data, and summary data can be found here. We are now beginning to generate PacBio data as well and will release them in batches as they become available.

Analysis of the first 100 genomes sequenced for this project is published in Genome Research (PMID 39358015). The initial 100 samples:

  • Represent all 5 superpopulations and 19 subpopulations
  • Have yielded an average sequence read N50 of 54 kbp and 37x depth of coverage
  • Have yielded ~24,500 high-confidence structural variants per genome
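
For readers unfamiliar with the metrics quoted above, the sketch below shows one common way a read N50 and a mean depth of coverage can be computed from a list of read lengths. It is illustrative only and is not the consortium's QC pipeline: the function names `read_n50` and `mean_depth`, the toy read lengths, and the toy genome size are all hypothetical, and none of the numbers come from project data.

```python
def read_n50(read_lengths):
    """Return the read N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")
    total_bases = sum(read_lengths)
    running = 0
    # Walk from the longest read down, accumulating bases until half is reached.
    for length in sorted(read_lengths, reverse=True):
        running += length
        if 2 * running >= total_bases:
            return length


def mean_depth(read_lengths, genome_size):
    """Return mean depth of coverage: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Toy values for illustration only (hypothetical, not project data).
toy_reads = [80_000, 60_000, 54_000, 30_000, 20_000, 10_000]
toy_genome_size = 100_000

print(read_n50(toy_reads))                     # N50 of the toy reads
print(mean_depth(toy_reads, toy_genome_size))  # mean depth on the toy genome
```

In this toy set the six reads total 254,000 bases, and the two longest reads already cover more than half of them, so the N50 is the length of the second-longest read. Real pipelines usually compute these values from aligned reads, so reported depth can differ from this simple ratio.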

Join Us

The 1KGP-LRS Consortium is open to all. If you are interested in joining the consortium and being added to the Slack group, please contact Danny Miller.

Contact Us

Danny Erwin Miller, MD, PhD

For questions or inquiries,
email: [email protected]

Physical Address

Center for Developmental Biology and Regenerative Medicine
1900 Ninth Ave.
Seattle, WA 98101