Prediction of High-Risk Types of Human Papillomaviruses Using Statistical Model of Protein “Sequence Space”

Date

2015-03-21

Publisher

Hindawi Publishing Corporation

Abstract

Discriminating high-risk types of human papillomaviruses (HPVs) plays an important role in the diagnosis and treatment of cervical cancer. Several computational methods based on protein sequence and structure information have recently been proposed, but information about related proteins has not been used until now. In this paper, we propose using the protein "sequence space" to capture this information and use it to predict high-risk types of HPVs. The proposed method was tested on 68 samples with known HPV types and 4 samples without, and was further compared with the available approaches. The proposed method achieved the best performance among all evaluated methods, with an accuracy of 95.59% and an F1-score of 90.91%, which indicates that the protein "sequence space" could potentially be used to improve the prediction of high-risk types of HPVs.
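
For readers unfamiliar with the two reported metrics, the following minimal sketch shows how accuracy and F1-score are computed from confusion-matrix counts. The counts used here (`tp`, `fp`, `fn`, `tn`) are hypothetical: they were chosen only because they sum to 68 samples and reproduce the reported percentages, and they are not taken from the paper, which may report a different breakdown.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts (an assumption, not data from the paper):
# 68 samples in total, "high-risk" treated as the positive class.
tp, fp, fn, tn = 15, 1, 2, 50

print(f"accuracy = {accuracy(tp, fp, fn, tn):.2%}")
print(f"F1-score = {f1_score(tp, fp, fn):.2%}")
```

Note that the F1-score depends only on the sum `fp + fn`, so any split of three errors between false positives and false negatives would give the same two figures.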

Keywords

Papillomaviruses, Proteins, Statistics--Models, Sequence spaces

Sponsorship

This work is supported by the National Natural Science Foundation of China (61370015, 61170316, and 61272312), research grants from the Zhejiang Provincial Natural Science Foundation of China (LY14F020046), the Medicine and Health Foundation of Zhejiang Province (2011-2011RCA012), and the 521 Talent Cultivation Plan of Zhejiang Sci-Tech University.

Rights

CC BY 3.0 (Attribution), ©2015 The Authors.
