Date of Award
2016
Document Type
Open Access Dissertation
Department
Computer Science and Engineering
Sub-Department
College of Engineering and Computing
First Advisor
Jason D. Bakos
Abstract
Genomic databases are exhibiting a growth rate that is outpacing Moore's Law, which has made database search algorithms a popular application for use on emerging processor technologies. NCBI BLAST is the standard tool for performing searches against these databases, which operates by transforming each database query into a filter that is subsequently applied to the database. This requires a database scan for every query, fundamentally limiting its performance by I/O bandwidth. In this dissertation we present a functionally-equivalent variation on the NCBI BLAST algorithm that maps more suitably to an FPGA implementation. This variation of the algorithm attempts to reduce the I/O requirement by leveraging FPGA-specific capabilities, such as high pattern matching throughput and explicit on-chip memory structure and allocation. Our algorithm transforms the database—not the query—into a filter that is stored as a hierarchical arrangement of three tables, the first two of which are stored on-chip and the third off-chip. Our results show that it is possible to achieve speedups of up to 8x based on the relative reduction in I/O of our approach versus that of NCBI BLAST, with a minimal impact on sensitivity. More importantly, the performance relative to NCBI BLAST improves with larger databases and query workload sizes.
Rights
© 2016, Jordan Bradshaw
Recommended Citation
Bradshaw, J.(2016). Regular Expression Synthesis for BLAST Two-Hit Filtering. (Doctoral dissertation). Retrieved from https://scholarcommons.sc.edu/etd/3815