Recognizable similarities of several images generated by DNA-Walk for four sample genomes of the four groups of coronavirus. https://doi.org/10.1371/journal.pone.0267106.g002

Source publication
Article
The classification of biological sequences is an open issue for a variety of data sets, such as viral and metagenomics sequences. Therefore, many studies utilize neural network tools, which are well-known methods in this field, and focus on designing customized network structures. However, few works focus on more effective factors, such as input enc...

Context in source publication

Context 1
... this manner, taking advantage of the input signature, it avoids a complex CNN structure to perform classification. This property is illustrated in Fig 2, which shows several sample images from several corona categories. ...

Citations

... The problem of how to transform 2D data into 1D data, or whether to do so at all, is of interest in many different areas, such as in the prediction of molecular properties on the basis of molecular structures [5] or in the case of automated methods for detecting viral subtypes using genomic data [6]. The choice often depends on the selected transformers and classifiers. ...
Article
For robust classification, selecting a proper classifier is of primary importance. However, selecting the best classifier depends on the problem, as some classifiers work better at some tasks than at others. Despite the many results collected in the literature, the support vector machine (SVM) remains the leading adopted solution in many domains, thanks to its ease of use. In this paper, we propose a new method based on convolutional neural networks (CNNs) as an alternative to SVM. CNNs are specialized in processing data in a grid-like topology that usually represents images. To enable CNNs to work on different data types, we investigate reshaping one-dimensional vector representations into two-dimensional matrices and compare different approaches for feeding standard CNNs using two-dimensional feature vector representations. We evaluate the different techniques, proposing a heterogeneous ensemble based on three classifiers: an SVM, a model based on random subspace of rotation boosting (RB), and a CNN. The robustness of our approach is tested across a set of benchmark datasets that represent a wide range of medical classification tasks. The proposed ensembles provide promising performance on all datasets.
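The abstract above mentions reshaping one-dimensional feature vectors into two-dimensional matrices so that a standard CNN can consume them. The following is only a minimal sketch of one naive way to do this (row-major filling with zero padding up to the next square size); it is not the method of the cited paper, and the function name `reshape_to_square` and the `pad_value` parameter are hypothetical.

```python
import math


def reshape_to_square(features, pad_value=0.0):
    """Reshape a 1D feature vector into a square 2D matrix (row-major).

    Assumption: the vector is padded with pad_value up to the next
    perfect square; the cited work may use different reshaping schemes.
    """
    n = len(features)
    side = math.isqrt(n)          # largest integer with side*side <= n
    if side * side < n:
        side += 1                 # round up so every feature fits
    padded = list(features) + [pad_value] * (side * side - n)
    # Cut the padded vector into consecutive rows of length `side`.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Five features need a 3x3 grid; the last four cells are padding.
matrix = reshape_to_square([1, 2, 3, 4, 5])
```

Row-major filling keeps neighbouring features adjacent only within a row, which is one reason alternative arrangements are worth comparing.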
... To emphasize the importance of optical implementation of the forward inference in CNNs, optical processing of large biological data sequences is explored as follows. As discussed in [45], classification of virus sequences (e.g., Coronaviruses, Dengue, HIV, Hepatitis B and C, and Influenza A), metagenomics data, and metabarcoding data can be performed by CNNs taking advantage of an appropriate image-based encoding method. It should be noted that a single training procedure is carried out for each biological dataset, while many test procedures are required to classify the input [44]. ...
Article
Convolutional neural networks (CNNs) are at the heart of several machine learning applications, but they suffer from computational complexity due to their large number of parameters and operations. Recently, all-optical implementation of CNNs has attracted much attention; however, the recently proposed optical architectures for CNNs cannot fully utilize the tremendous capabilities of optical processing, due to the required electro-optical conversions between successive layers. To implement an all-optical multi-layer CNN, it is essential to optically implement all required operations, namely convolution, summation of channels’ output for each convolutional kernel feeding the nonlinear unit, nonlinear activation function, and finally, pooling operations. Considering the lack of multi-layer photonic CNN implementations, in this paper we explore a fully optical design for implementing successive convolutional layers in an optical CNN. As a proof of concept, and without loss of generality, we considered two successive optical layers in the proposed network, named 2L-OPCNN, for comparative studies against its electrical counterpart and a single optical layer CNN. Our simulation results confirm nearly the same accuracies for classifying images of the Kaggle Cats and Dogs challenge, CIFAR-10, and MNIST datasets, compared to the electrical counterpart, as well as improved accuracies compared to the single optical layer CNN.
Article
The classification of different organisms into subtypes is one of the most important tools of organism studies, and the classification of viruses in particular has been the focus of many studies due to its use in virology and epidemiology. Many methods have been proposed to classify viruses, some of which are designed for a specific family of organisms and some of which are more general. Still, especially for certain categories such as Influenza and HIV, classification faces performance challenges as well as processing and memory bottlenecks. To address this, we designed an automated classifier, called PC-mer, that is based on k-mers and the physicochemical characteristics of nucleotides. It reduces the number of features about 2k times compared to the alternative methods based on k-mer and, compared to integer and one-hot encoding methods, it keeps the number of features constant despite the growth of the sequence length. It also increases the training speed by an average of 17.93 times. This improvement in processing complexity is achieved while PC-mer also improves the classification performance for a variety of virus families.
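To make the point about feature counts concrete, here is a minimal sketch of plain k-mer counting, the baseline the abstract compares against. It is not the PC-mer encoding itself, and the function name `kmer_counts` is hypothetical. It shows that the feature vector has 4^k entries regardless of sequence length, unlike integer or one-hot encodings, whose size grows with the sequence.

```python
from itertools import product


def kmer_counts(sequence, k):
    """Count overlapping k-mers over the alphabet A, C, G, T.

    Returns a dict with one entry per possible k-mer (4**k entries),
    so the feature count does not depend on the sequence length.
    Windows containing other symbols (e.g. N) are skipped; that is an
    assumption of this sketch, not a rule taken from the cited paper.
    """
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    return counts


# A 6-base sequence and k = 2 still yield 4**2 = 16 features.
counts = kmer_counts("ACGTAC", 2)
```

Because the dictionary is built from all possible k-mers up front, two genomes of very different lengths map to vectors of the same size and can be fed to the same classifier.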