Abstract

N6-methyladenine (m6A) is a crucial epigenetic modification involved in the control of various DNA processes. Genome-wide m6A analysis by wet-lab experiments is fundamental but time-consuming. Computational tools, especially those based on machine learning, are therefore urgently needed as complementary methods. A new protocol, iRicem6A-CNN, was developed for identifying m6A sites in the rice genome. It uses dinucleotide one-hot encoding to generate input tensors for a convolutional neural network, and achieved accuracies of 93.82% in five-fold cross-validation and 96.19% in independent testing, outperforming other available predictors. The experimental results demonstrate that iRicem6A-CNN, based on 2-mer one-hot encoding, not only achieves high performance but is also more stable and robust than models using 1-mer one-hot encoding.
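
To make the encoding step concrete, the following is a minimal sketch of dinucleotide (2-mer) one-hot encoding, not the authors' implementation. The abstract does not specify the window length, the stride, or the ordering of the 16 dinucleotides, so this sketch assumes overlapping 2-mers with stride 1, lexicographic ordering (AA, AC, ..., TT), and all-zero rows for 2-mers containing ambiguous bases. The function name `dinucleotide_one_hot` is hypothetical.

```python
from itertools import product

BASES = "ACGT"
# Assumed ordering: the 16 dinucleotides in lexicographic order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(pair) for pair in product(BASES, repeat=2)]
INDEX = {dinuc: i for i, dinuc in enumerate(DINUCLEOTIDES)}


def dinucleotide_one_hot(seq):
    """Encode a DNA sequence as a (len(seq) - 1) x 16 matrix of overlapping 2-mers.

    Hypothetical helper: each row has a single 1 at the column of its
    dinucleotide. 2-mers with a base outside ACGT (e.g. N) give an
    all-zero row, which is an assumption of this sketch.
    """
    seq = seq.upper()
    rows = []
    for i in range(len(seq) - 1):
        row = [0] * len(DINUCLEOTIDES)
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows


# "GATC" yields three overlapping 2-mers: GA, AT, TC.
encoded = dinucleotide_one_hot("GATC")
```

A sequence of length L becomes an (L - 1) x 16 matrix, compared with L x 4 for 1-mer one-hot encoding. A matrix of this shape can be passed to a convolutional network as an input tensor.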

  • Affiliations
    University of Electronic Science and Technology of China; Changsha University; Hainan Normal University

Full text