by Sunder Parameswaran, VP Products at Falcon Computing
Posted on August 8, 2018
Genome sequencing has been one of the top three initiatives for bioinformaticians in the last few years. Whole-genome sequencing is the most comprehensive method for analyzing the human genome. Genomic information has been instrumental in identifying inherited disorders, characterizing the mutations that drive cancer progression and tracking disease outbreaks. Increasingly, genome sequencing has become part of drug discovery, precision medicine and large population studies.
However, the industry faces significant challenges. The three challenges we hear most often from clinicians, researchers & bioinformaticians are:
The cost of sequencing has dropped rapidly over the last decade. That drop has been accelerated by several private genome sequencing providers competing in the marketplace to offer the cheapest sequencing services. However, this has led to:
A. Inconsistencies in what is included in the “sequencing cost of the whole genome” reported by each of them. For example, an institute such as the National Human Genome Research Institute factors in a wide range of activities and components, such as: reagents, consumables, DNA-sequencing instruments, certain computer equipment, other equipment, laboratory pipeline development, laboratory information management systems, initial data processing, submission of data to public databases, project management, utilities, other indirect costs, labor, and administration. Other service providers and research institutions, however, include only a subset of these.
B. More importantly, several sequencing analysis providers have been offering proprietary pipelines that optimize specifically for cost or performance but lead to accuracy issues. The Broad Institute of MIT & Harvard has been working to create industry-standard sequencing analysis through its Genome Analysis Toolkit (GATK). The toolkit offers a wide variety of tools with a primary focus on variant discovery & genotyping.
When you choose your secondary analysis solution, ensure that your vendor provides industry-standard pipelines rather than proprietary versions of the pipelines.
Even as the cost of sequencing has dropped in the last decade, the cost of the computing needed for data analysis has not kept pace. However, recent advances in pairing hardware such as FPGAs with CPUs are making it possible to reduce the time needed to complete a whole-genome sequencing (WGS) or whole-exome sequencing (WES) analysis. Public clouds such as Amazon Web Services, Huawei Cloud and AliCloud now offer FPGA-based services, while vendors such as Intel & Xilinx offer FPGA boards for on-premise use.
This has enabled vendors to offer accelerated pipelines that reduce analysis time and optimize the cost of analysis on both public & private clouds.
Ensure that your vendor offers a single platform for your analysis across clouds and different hardware in order to future-proof the consistency of your results.
The latest genomic sequencers generate more than 2 TB of data daily. As the sequencing data goes through the various stages of analysis, it generates even larger amounts of intermediate data. This presents bioinformaticians with challenges such as data compression, data storage & archival. In addition, they face the question of how to efficiently migrate genomics data between on-premise systems & public clouds.
Ensure that your vendor helps you manage your data across public clouds & on-premise private clouds.
Falcon Computing provides a preconfigured solution for bioinformaticians, scientists and researchers to lower the cost of, and accelerate, their secondary analysis of DNA & RNA sequencing. The Falcon Accelerated Genomics Pipelines (FAGP) are accelerated versions of the GATK Best Practices pipelines & include accelerated BWA alignment tools, samtools, Picard, and GATK.
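For readers less familiar with the stages named above, here is a minimal sketch of how the open-source tools are commonly chained in a simplified germline workflow. It is not Falcon's implementation or command-line interface. The helper name pipeline_commands, the file names and the read-group values are hypothetical, and base quality score recalibration is left out for brevity. The function only builds the command lines; it does not run them.

```python
import shlex


def pipeline_commands(ref, fq1, fq2, sample, threads=8):
    """Build shell command lines for a simplified alignment-to-variants run.

    Assumes the reference FASTA has already been indexed for bwa, samtools
    and GATK. All output file names are illustrative only.
    """
    # bwa expects the read group with literal "\t" separators.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    vcf = f"{sample}.vcf.gz"

    # Stage 1: align paired-end reads with BWA-MEM, then coordinate-sort.
    align = shlex.join(
        ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, fq1, fq2]
    )
    sort = shlex.join(
        ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, "-"]
    )

    return [
        f"{align} | {sort}",
        # Stage 2: mark duplicate reads (the Picard tool bundled with GATK4).
        shlex.join(
            ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam,
             "-M", f"{sample}.dup_metrics.txt"]
        ),
        # Stage 3: index the BAM so the variant caller can seek within it.
        shlex.join(["samtools", "index", dedup_bam]),
        # Stage 4: call germline variants.
        shlex.join(
            ["gatk", "HaplotypeCaller", "-R", ref, "-I", dedup_bam, "-O", vcf]
        ),
    ]


if __name__ == "__main__":
    for command in pipeline_commands(
        "ref.fa", "reads_1.fq.gz", "reads_2.fq.gz", "NA12878"
    ):
        print(command)
```

Each stage reads the previous stage's output from disk, which is one reason the intermediate data grows so quickly and why the alignment and variant-calling steps are the usual targets for acceleration.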
Falcon Computing enables data scientists and software developers working in genomics, machine learning, finance and computer vision to simplify their path to heterogeneous accelerated computing. With a simple push-button flow from C/C++ to a highly optimized FPGA implementation for both public and private clouds, users can achieve performance, energy-efficiency and productivity gains. Falcon Computing was founded in 2014 by Dr. Jason Cong, serial entrepreneur and Director of the Center for Domain Specific Computing (CDSC) at UCLA.