While I was at CHI’s Molecular Medicine Tri-Con in San Francisco last week (Feb 23rd), I had a chance to sit in on a discussion table at the end of the day. The topic at Table 6 was diagnostic applications of next-generation sequencing (NGS). About 16 people discussed the pros and cons of targeted resequencing versus whole genome sequencing. Karl Voelkerding, M.D. (Assoc. Professor, Pathology, Univ. of Utah; Medical Director, Advanced Technology and Bioinformatics, ARUP Laboratories), moderated the discussion. Karl said that NGS is being applied to multi-gene panels, exomes and whole genomes in clinical research and diagnostics. Each approach has different costs and a different complexity of data analysis and interpretation.
NGS for Multi-gene Panels v. Whole Genome
Karl started off by talking about multi-gene panels and NGS. He briefly talked about using multi-gene panels for Marfan syndrome. He said that the challenge involves sample preparation and noted that Fluidigm has a workable solution for this.
He asked the group, “What’s being seen in Europe?” A person from Europe said that he has seen targeted NGS vs. whole genome NGS used by a fee-for-service company in Europe. A person from Genomic Health said that “if cost is not an issue, it’s OK to use whole genome. But otherwise it’s better to use targeted resequencing.” Karl said that at his lab, it takes over a year to do a CE-based multi-gene sequence [vs. NGS].
Others at the table asked about costs. The person from RainDance said that they have an in-solution capture method that could reduce costs. Karl said that even there, there are non-trivial labor costs. He said that “some commercial companies do use robotic liquid handlers to reduce cost.”
Scenarios, Approaches, Costs
He said that this area is a moving target. Amplified approaches in multi-gene panels increase specificity for up to ten genes. Beyond ten genes, it takes many months of CE sequencing work, and researchers need to develop a special workflow for this type of CE sequencing. Karl said, “An elusive goal is to make sequencing work like PCR.” They are not there yet.
One person asked about simplifying the data content in a database by classifying some data as benign. Karl said that academics update their data sporadically, using a grad student or even an undergrad student, and that this approach gives inconsistent data quality. He said that some commercial databases are updated on a more regular schedule.
He said that you need to ask the question “Are the genes associated with pathology? Some genes are benign, some others are linked to disease. We need to know, over time, what data items get classified as a changed data set.”
Some companies do targeted resequencing as a business and make IP from the database content. The database tells what is benign or what is something else.
A consultant said, “It would be interesting to see what in the database is predictive.” Karl said, “Extract the DNA, do PCR, do CE-seq, and analyze.”
The consultant also asked, “What if you do NGS, then find genes, then pass data on to CE-seq to verify for Dx accuracy?” Karl said, “Some research core labs do exome sequencing for genome sequencing. NHGRI is good with that approach.” He does 30x coverage at his lab.
Another person asked, “What is the control level for false positives?”
Karl said that, downstream, it depends on the technologies used, such as mass spec v. NGS v. CE sequencing v. PCR. Karl mentioned that the American College of Cardiology considered testing for hypertrophic cardiomyopathy (HCM) and asked, “Should we do multi-gene testing?” They test by using echocardiograms. Karl gave the statistics for worldwide incidence.
So, on the exome v. whole genome question, Karl asked, “When can you use gDNA for Illumina?” The workflow is to shear the DNA by sonication, run the Agilent Bioanalyzer 2100 to get total DNA, and do qPCR to get the fragment library, which can go to the SOLiD or to the Illumina cluster [for HiSeq 2000].
The sequencing workflow is:
- Day 1 do gDNA
- Day 2 do qPCR, then transfer to cBot
- Day 3 run the HiSeq 2000 at 2×100 for 8 days
- Then run SeqTest, run QSeqTest, then output in Qfile format
Karl said it takes 105 days from start to end.
He said that, if you do exome sequencing, you need to do a purification step at the beginning, which adds 3-4 days to the workflow, but the exome sequencing is at a lower cost. Karl said that his lab is hooked up to the Univ. of Utah’s cluster computer and can do a data alignment in one day. The cluster computer at the Univ. of Utah is also HIPAA compliant for privacy.
So cost drives exome sequencing. Karl said, “When doing exome sequencing you are doing a lot less sequencing, but you do more sample preparation. You sequence on 2 lanes v. on 8 lanes [on Illumina].”
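As a rough illustration of that lane arithmetic, the sketch below assumes an eight-lane flow cell, with a whole genome taking all 8 lanes and an exome taking 2, as Karl described. The constant and function names are made up for this example and do not come from any Illumina software.

```python
# Lane counts as quoted in the discussion; names here are illustrative only.
LANES_PER_FLOW_CELL = 8   # whole genome: all 8 lanes
LANES_PER_EXOME = 2       # exome: 2 lanes


def samples_per_flow_cell(lanes_per_sample, lanes_available=LANES_PER_FLOW_CELL):
    """Return how many samples fit on one flow cell at the given lane usage."""
    return lanes_available // lanes_per_sample


exomes_per_run = samples_per_flow_cell(LANES_PER_EXOME)
genomes_per_run = samples_per_flow_cell(LANES_PER_FLOW_CELL)

print(f"Exomes per flow cell: {exomes_per_run}")
print(f"Whole genomes per flow cell: {genomes_per_run}")
```

Under these assumptions, one run that holds a single whole genome could instead hold four exomes, which is the "a lot less sequencing" side of the trade-off. The extra sample preparation time is not modeled here.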
Some List Prices
Karl gave some cost numbers.
- For whole genome sequencing, it costs $10K with all reagents, including those for library preparation.
- For exome sequencing, it costs $1,200-$1,300 at 200X to 900X coverage.
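To put the two list prices side by side, here is a minimal sketch that uses only the figures Karl quoted ($10K per whole genome, $1,200 to $1,300 per exome). The variable and function names are hypothetical, and labor, analysis and validation costs are not included.

```python
# List prices as quoted in the discussion (US dollars, reagents only).
WHOLE_GENOME_COST = 10_000
EXOME_COST_RANGE = (1_200, 1_300)


def cost_ratio(genome_cost, exome_cost):
    """Return how many times more a whole genome costs than one exome."""
    return genome_cost / exome_cost


low_exome, high_exome = EXOME_COST_RANGE
# The smallest ratio comes from the highest exome price, and vice versa.
ratios = (
    cost_ratio(WHOLE_GENOME_COST, high_exome),
    cost_ratio(WHOLE_GENOME_COST, low_exome),
)
# Whole exomes that fit in the budget of one whole genome.
exomes_per_genome_budget = (
    WHOLE_GENOME_COST // high_exome,
    WHOLE_GENOME_COST // low_exome,
)

print(f"Whole genome is {ratios[0]:.1f}x to {ratios[1]:.1f}x the exome price")
print(f"Exomes per genome budget: {exomes_per_genome_budget[0]} to {exomes_per_genome_budget[1]}")
```

On these numbers, a whole genome costs roughly eight times as much as an exome.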
So one answer for supporting multi-gene sequencing is to use exome sequencing of all genes in a panel. For example, the Broad can sequence 2000 exomes per week; they streamlined a special workflow for this. Anyway, at the end of the day, you need to do downstream validation.
Consent Approaches that Should be Considered
A woman asked, “But in the clinical environment, what if you find other genetic information? Do you not tell the clinician?”
Karl said that “the key is informed consent.” He said, “ARUP is developing a tiered consent process; it’s mostly used for pediatrics now. So they set out looking for one genetic area, but what if they find something else? They set the age level for consent at 14.”
Karl gave an example about the rare disease area at the NIH. The NIH does exome sequencing. Their success rate is 20% to identify a suspicious gene. “So why just 20% with de novo mutations?” He said that they are using exome sequencing and they just use a small population. He mentioned a paper in Nature Genetics involving a group in the Netherlands that saw a lot of power in doing NGS of a child as an alternative to laborious CE sequencing.
Karl said that the items not covered in the consented area are marked off. He said that this is usually done in laboratory medicine. When it comes to a recessive gene, the answer is often guided by family history. Therefore “consent with tiering” is the way to be able to manage what diagnostic information is delivered to clinicians. Karl wrapped up the discussion by saying that “NGS is pushing the envelope!”
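The marking-off step Karl described could be sketched roughly as follows. This is only an illustration under my own assumptions: the `Finding` class, the `split_by_consent` function and the example findings are hypothetical, not ARUP's actual process. The sketch simply separates findings in consented genes from everything else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One variant-level result; fields are illustrative, not a real schema."""
    gene: str
    classification: str  # e.g. "pathogenic" or "benign"


def split_by_consent(findings, consented_genes):
    """Split findings into those covered by consent and those marked off."""
    reportable = [f for f in findings if f.gene in consented_genes]
    marked_off = [f for f in findings if f.gene not in consented_genes]
    return reportable, marked_off


# Hypothetical case: consent covers an HCM gene panel only.
consented = {"MYH7", "MYBPC3"}
findings = [
    Finding("MYH7", "pathogenic"),
    Finding("BRCA1", "pathogenic"),  # incidental finding outside the consent
    Finding("MYBPC3", "benign"),
]

reportable, marked_off = split_by_consent(findings, consented)
print("Reported to clinician:", [f.gene for f in reportable])
print("Marked off:", [f.gene for f in marked_off])
```

A tiered scheme would presumably hold several such gene sets, one per consent tier, with the patient's or guardian's choices deciding which sets are combined before the split.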