Predicting Corynebacterium glutamicum promoters based on novel feature descriptor and feature selection technique

Li, HongFei; Zhang, Jingyu; Zhao, Yuming*; Yang, Wen*
Science Citation Index Expanded
Northeast Forestry University; Harbin Medical University

Abstract

The promoter is an important noncoding DNA regulatory element that binds RNA polymerase to activate the expression of downstream genes. In industry, artificial arginine is mainly synthesized by Corynebacterium glutamicum, and replication of specific promoter regions can increase arginine production. It is therefore necessary to locate promoters in C. glutamicum accurately. In wet-lab experiments, promoter identification depends on sigma factors and DNA splicing technology, which makes it laborious. To identify promoters in C. glutamicum quickly and conveniently, we developed a method based on a novel feature representation and feature selection. The method describes DNA sequences through statistical parameters of multiple physicochemical properties and filters redundant features by combining analysis of variance with hierarchical clustering. Its prediction accuracy reaches 91.6%; its sensitivity of 91.9% shows that it identifies promoters effectively, and its specificity of 91.2% shows that it identifies non-promoters accurately. In addition, the model correctly identifies 181 promoters and 174 non-promoters among 400 independent samples, which demonstrates that the prediction model is highly robust.
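The three reported measures follow from a confusion matrix, and the independent-test counts can be checked the same way. The sketch below is only illustrative: the function name `classification_metrics` is hypothetical, and the 200/200 split of the 400 independent samples is an assumption, since the abstract does not state the class sizes. Only the overall accuracy, (181 + 174) / 400, is independent of that assumption.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all samples classified correctly
        "sensitivity": tp / (tp + fn),   # fraction of true promoters recovered
        "specificity": tn / (tn + fp),   # fraction of true non-promoters recovered
    }


# Counts stated in the abstract for the independent test set.
tp, tn = 181, 174

# ASSUMPTION (not stated in the abstract): 200 promoters and 200 non-promoters.
n_promoters, n_non_promoters = 200, 200
fn = n_promoters - tp        # promoters missed by the model
fp = n_non_promoters - tn    # non-promoters predicted as promoters

metrics = classification_metrics(tp, tn, fp, fn)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under the assumed balanced split, the independent-test accuracy is 88.75%, with sensitivity 90.5% and specificity 87.0%; a different split would change the latter two values but not the accuracy.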

Keywords

promoter; Corynebacterium glutamicum; physicochemical properties; analysis of variance; hierarchical clustering; feature selection; random forest