Experimental Investigation of Frequency Chaos Game Representation for in Silico and Accurate Classification of Viral Pathogens from Genomic Sequences

  • Emmanuel Adetiba
  • Joke A. Badejo
  • Surendra Thakur
  • Victor O. Matthews
  • Marion O. Adebiyi
  • Ezekiel F. Adebiyi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10208)

Abstract

This paper presents an experimental investigation to determine the efficacy and the appropriate order of Frequency Chaos Game Representation (FCGR) for accurate in silico classification of pathogenic viruses. For this study, we curated genomic sequences of selected viral pathogens from the Virus Pathogen Database and Analysis Resource (ViPR) corpus. The viral genomes were encoded using first- to seventh-order FCGRs to produce training and testing genomic features. Four different kernels of the naïve Bayes classifier were then trained and tested on the generated FCGR features. The third- and fourth-order FCGRs returned the highest average classification accuracy, 98%. However, weighing memory utilization and computational efficiency against classification accuracy, the third-order FCGR is deemed suitable for accurate classification of viral pathogens from genome sequences. This provides a promising foundation for developing a genomics-based diagnostic toolkit that could be used to promptly address the global incidence of epidemics caused by pathogenic viruses.
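
As a rough illustration of the encoding step described in the abstract, a k-th order FCGR can be computed by counting every k-mer in a genome and storing its relative frequency in a 2^k × 2^k grid. The sketch below is not the authors' implementation: the function names `fcgr` and `fcgr_features`, the corner assignment for A, C, G and T, the normalization by the number of counted k-mers, and the skipping of ambiguous bases are all assumptions made for the example.

```python
# Minimal FCGR sketch in pure Python (illustrative assumptions throughout).

# Assumed corner layout as (x_bit, y_bit); published CGR layouts differ.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def fcgr(sequence, k):
    """Return a 2^k x 2^k matrix of k-mer relative frequencies."""
    size = 2 ** k
    counts = [[0] * size for _ in range(size)]
    total = 0
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # assumption: skip windows containing ambiguous bases (e.g. N)
        x = y = 0
        # In a chaos game walk the most recent base selects the coarsest
        # quadrant, so the k-mer is read from its last base to its first.
        for base in reversed(kmer):
            bx, by = CORNERS[base]
            x = (x << 1) | bx
            y = (y << 1) | by
        counts[y][x] += 1
        total += 1
    if total == 0:
        return [[0.0] * size for _ in range(size)]
    return [[c / total for c in row] for row in counts]


def fcgr_features(sequence, k):
    """Flatten the FCGR matrix row by row into a vector of length 4^k."""
    return [value for row in fcgr(sequence, k) for value in row]


if __name__ == "__main__":
    toy_genome = "ACGTTGCAACGTAGCTAGCTAGGATCC"  # made-up sequence, not real data
    for order in range(1, 8):
        print(order, len(fcgr_features(toy_genome, order)))
```

The feature vector has 4^k entries, so it grows from 64 values at the third order to 256 at the fourth and 16,384 at the seventh. That growth is the memory and computation cost the abstract weighs against accuracy when it prefers the third order.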

Keywords

Classification, FCGR, Genome, GSP, Naïve Bayes, Pathogens, Sequences, Virus

Notes

Acknowledgement

The publication of this study is supported and funded by the Covenant University Centre for Research, Innovation and Development (CUCRID), Covenant University, Canaanland, Ota, Ogun State, Nigeria.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Emmanuel Adetiba (1, 4)
  • Joke A. Badejo (1)
  • Surendra Thakur (3)
  • Victor O. Matthews (1)
  • Marion O. Adebiyi (2, 4)
  • Ezekiel F. Adebiyi (2, 4)

  1. Department of Electrical and Information Engineering, College of Engineering, Covenant University, Ota, Nigeria
  2. Department of Computer and Information Science, College of Science and Technology, Covenant University, Ota, Nigeria
  3. KZN e-Skills CoLab, Durban University of Technology, Durban, South Africa
  4. Covenant University Bioinformatics Research (CUBRe), Ota, Nigeria
