FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods

oleh: Timothy Becker, Wan-Ping Lee, Joseph Leone, Qihui Zhu, Chengsheng Zhang, Silvia Liu, Jack Sargent, Kritika Shanker, Adam Mil-homens, Eliza Cerveira, Mallory Ryan, Jane Cha, Fabio C. P. Navarro, Timur Galeev, Mark Gerstein, Ryan E. Mills, Dong-Guk Shin, Charles Lee, Ankit Malhotra

Format: Article
Diterbitkan: BMC 2018-03-01

Deskripsi

Abstract Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE.