Title: Data Analytics in Bioinformatics
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Software
ISBN: 9781119785606
3
A Critical Review on the Application of Artificial Neural Network in Bioinformatics
Vrs Jhalia 1 * and Tripti Swarnkar 2
1 Department of Computer Science and Engineering, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, India
2 Department of Computer Application, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar, India
Abstract
Proper diagnosis of a disease requires in-depth analysis and classification. Over the last two decades, the exponential growth of biological data has emerged as an opportunity for many researchers. “Bioinformatics” is the integration of biology and computer science to develop methods and software tools for understanding these large biological datasets. Analyzing such chunks of data manually is difficult, and here computer science plays a pivotal role in extracting hidden patterns. Among all Artificial Intelligence (AI) techniques, the Artificial Neural Network (ANN) is considered the most efficient model for automated pattern recognition. An ANN is a computational model inspired by networks of biological neurons organized in layers: a dataset is fed to the model, the model learns features from the input, and it predicts the output. Performance is compared on the basis of classification accuracy. In this chapter, we review the classification accuracy of the ANN model against other existing classification models to gain insights from which hypotheses can be drawn.
Keywords: Machine learning (ML), artificial neural network (ANN), prediction model, disease classification
3.1 Introduction
We wear smart watches that measure heart rate and carry smartphones that scan fingerprints and retinas. All of this is possible because of research in bioinformatics. Bioinformatics brings biological themes into computing. It is essentially the integration of biology with computer science, mathematics and statistics. It analyzes biological data such as genes, proteins, genomes and cells, and deals with the computational management of ecological systems, genetic engineering, medical information and drug development. It can be used for prediction, modeling, visualization and design.
Bioinformatics uses computers to build mathematical models, applying different bioinformatics tools to acquire, manage and visualize biological data [1]. To infer relationships between the components of a complex biological system, it develops software tools and methods for understanding and analyzing clinical and biological data. This requires collecting and interpreting large chunks of biological data in order to gain information about biological processes. In this setting, computational biology is the actual program used to manipulate the data, and bioinformatics puts those data to use.
Bioinformatics is of great significance in many areas of biology. In genetics, it enables gene prediction, the understanding of gene flow, gene sequencing, and the interpretation of genomes and their observed mutations. In image and signal processing, bioinformatics techniques are applied in experimental molecular biology to mine useful information from large amounts of raw data. In text mining, bioinformatics tools help to study and organize the biological data in the biological literature, which has led to the expansion of many biological and gene ontologies. Bioinformatics has a significant impact on the study of protein and gene expression and regulation. In systems biology, bioinformatics tools are used to analyze and catalogue biological processes and networks. It also helps in understanding the evolutionary aspects of molecular biology by analyzing, comparing and interpreting genetic and genomic data [2]. Bioinformatics helps to predict the 3D structure of macromolecules such as RNA, DNA and proteins, and it assists in the analysis and simulation of the biomolecular interactions that are an important part of structural biology [3].
3.1.1 Different Areas of Application of Bioinformatics
Figure 3.1 shows the fields in which bioinformatics has a wide range of applications and describes the functions of bioinformatics in each field.
Figure 3.1 Areas of research of bioinformatics [4].
3.1.2 Bioinformatics in Real World