Recognition of protein/gene names from text using an ensemble of classifiers

by: Zhou GuoDong, Shen Dan, Zhang Jie, Su Jian, Tan SoonHeng

Format: Article
Published: BMC 2005-05-01

Description

<p>Abstract</p> <p>This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined using a simple majority voting strategy. In addition, we incorporate three post-processing modules into the system to further improve performance: an abbreviation resolution module, a protein/gene name refinement module and a simple dictionary matching module. Evaluation shows that our system achieves the best performance among 10 systems, with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).</p>
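
The majority voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say whether votes are cast per token or per recognized name, how ties are broken, or which tag set is used, so everything below is an assumption for illustration: the function name `majority_vote`, the per-token B/I/O labels, and the rule of falling back to the first classifier when no label wins a strict majority are all hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several taggers by simple majority.

    Hypothetical helper: the paper's actual voting granularity and
    tie-breaking rule are not given in the abstract.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all taggers must label the same number of tokens")

    combined = []
    for labels in zip(*predictions):
        top_label, top_count = Counter(labels).most_common(1)[0]
        # Assumption: without a strict majority, trust the first tagger.
        if top_count * 2 <= len(labels):
            top_label = labels[0]
        combined.append(top_label)
    return combined


# Illustrative outputs of one SVM and two HMM taggers on four tokens.
svm_tags = ["B", "I", "O", "O"]
hmm1_tags = ["B", "O", "O", "B"]
hmm2_tags = ["B", "I", "O", "I"]

print(majority_vote([svm_tags, hmm1_tags, hmm2_tags]))
```

With three voters, a label needs at least two votes to win; the fourth token above receives three different labels, so the sketch keeps the first tagger's label. A real system would likely need a more careful rule so that the combined sequence stays a valid tag sequence.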