Figure 1.
Left: an example showing the interaction between features and attention masks. Right: example images illustrating that different features have different corresponding attention masks in our network. The sky mask diminishes low-level background blue-color features, while the balloon instance mask highlights high-level balloon bottom-part features.

Source publication
Article
Full-text available
In this work, we propose the "Residual Attention Network", a convolutional neural network with an attention mechanism that can be incorporated into state-of-the-art feed-forward network architectures in an end-to-end training fashion. Our Residual Attention Network is built by stacking Attention Modules which generate attention-aware features. The attention-awar...
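
For readers who think in code, a minimal PyTorch sketch of one Attention Module follows: a trunk branch computes features, a mask branch produces a soft attention map in (0, 1), and the two combine via the paper's (1 + mask) * trunk rule. The specific layer sizes and the shallow encoder-decoder here are illustrative stand-ins for the paper's deeper hourglass mask branch.

```python
import torch
import torch.nn as nn

class AttentionModule(nn.Module):
    """One attention module: a trunk branch learns features, a mask branch
    learns a soft attention map over them (values in (0, 1))."""

    def __init__(self, channels: int):
        super().__init__()
        # Trunk branch: ordinary residual-style feature processing.
        self.trunk = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Mask branch: a (much simplified) encoder-decoder that outputs a
        # per-pixel, per-channel attention map squashed to (0, 1) by a sigmoid.
        self.mask = nn.Sequential(
            nn.MaxPool2d(2),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.trunk(x)
        m = self.mask(x)
        # Attention residual learning: (1 + M) * T keeps trunk features
        # alive even where the mask is near zero.
        return (1.0 + m) * t


x = torch.randn(1, 64, 32, 32)
print(AttentionModule(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```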

Contexts in source publication

Context 1
... from more discriminative feature representation brought by the attention mechanism, our model also exhibits the following appealing properties: (1) Increasing the number of Attention Modules leads to consistent performance improvement, as different types of attention are captured extensively. Fig. 1 shows an example of different types of attention for a hot air balloon image. ...
Context 2
... Attention Module, each trunk branch has its own mask branch to learn attention that is specialized for its features. As shown in Fig. 1, in hot air balloon images, blue color features from the bottom layer have a corresponding sky mask to eliminate the background, while part features from the top layer are refined by the balloon instance mask. Besides, the incremental nature of the stacked network structure can gradually refine attention for complex images. ...
Context 3
... Attention Modules can gradually refine the feature maps. As shown in Fig. 1, features become much clearer as the depth increases. ...
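
The gradual refinement described in these contexts is formalized in the source paper as attention residual learning: the mask modulates rather than replaces the trunk features. A compact statement of the rule, with symbols as defined in the paper:

```latex
% Attention residual learning: i indexes spatial positions, c channels.
% M_{i,c}(x) \in [0,1] is the mask-branch output; F_{i,c}(x) is the trunk output.
H_{i,c}(x) = \bigl(1 + M_{i,c}(x)\bigr) \cdot F_{i,c}(x)
```

Because M stays in [0, 1], the identity term preserves trunk features even where the mask is near zero, which is what allows many modules to be stacked without degrading the signal.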

Similar publications

Article
Full-text available
This paper establishes a recognition model to identify remote sensing images using deep learning, the popular method of recent years. Based on Google Inc.'s Inception-v3 convolutional neural network recognition model, this paper adopts the concept of transfer learning to achieve accurate automatic classification of remote sensing image scen...
Article
Full-text available
Multi-sensor images can provide supplementary information, usually leading to better performance in classification tasks. However, general deep neural network-based multi-sensor classification methods learn each sensor image separately, followed by stacked concatenation for feature fusion. This approach requires a large time cost for network training...
Conference Paper
Full-text available
Skin diseases are very common, and nowadays remedies are easy to obtain. But sometimes properly diagnosing these diseases can be very troublesome due to the hard-to-discriminate nature of the symptoms they exhibit. Deep Neural Networks, since their recent advent, have started outperforming other algorithms in almost every sector. One of the probl...

Citations

... The recent success of neural networks O'shea and Nash [2015], Deng et al. [2009] has greatly stimulated research in the fields of pattern recognition Wang et al. [2017], Stahlberg [2020] and data mining He et al. [2017]. In particular, in many applications Cui et al. [2019], Fout et al. [2017], data are generated from non-Euclidean domains. ...
Preprint
A massive number of applications involve data with underlying relationships embedded in non-Euclidean space. Graph neural networks (GNNs) are utilized to extract features by capturing the dependencies within graphs. Despite groundbreaking performance, we argue that multi-layer perceptrons (MLPs) and fixed activation functions impede feature extraction due to information loss. Inspired by Kolmogorov-Arnold Networks (KANs), we make the first attempt to build GNNs with KANs. We discard MLPs and activation functions and instead use KANs for feature extraction. Experiments demonstrate the effectiveness of GraphKAN, emphasizing the potential of KANs as a powerful tool. Code is available at https://github.com/Ryanfzhang/GraphKan.
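
The abstract's core move, replacing the MLP transform in message passing with a KAN layer, can be sketched as a GNN layer with a pluggable feature transform. `KANLayer` below is a hypothetical stand-in for the module in the linked repository; only the MLP variant is actually constructed here.

```python
import torch
import torch.nn as nn

class GraphLayer(nn.Module):
    """Message passing with a pluggable feature transform.

    GraphKAN's idea (per the abstract) is to swap the usual Linear +
    fixed activation for a KAN layer; `transform` is whatever module
    plays that role.
    """

    def __init__(self, transform: nn.Module):
        super().__init__()
        self.transform = transform

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Aggregate neighbor features (adj is assumed to be a normalized
        # adjacency matrix with self-loops), then transform them.
        return self.transform(adj @ x)


# Conventional GNN layer: linear map plus a fixed ReLU activation.
mlp_layer = GraphLayer(nn.Sequential(nn.Linear(16, 16), nn.ReLU()))
# GraphKAN variant would instead plug in a KAN layer from the authors'
# repository, e.g. GraphLayer(KANLayer(16, 16))  # hypothetical import

x = torch.randn(5, 16)          # 5 nodes, 16 features each
adj = torch.eye(5)              # identity adjacency for a smoke test
print(mlp_layer(x, adj).shape)  # torch.Size([5, 16])
```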
... Inspired by SENet, several works (Yang et al., 2020; Qin et al., 2021; Zhang et al., 2018) improved the squeeze or excitation module in various ways to enhance the modeling capability of the network. Besides, some works (Woo et al., 2018; Fu et al., 2019; Liu et al., 2020; Wang et al., 2017) utilized both channel attention and spatial attention to capture feature dependencies in both domains. Vision Transformer. ...
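
As a concrete illustration of the channel-plus-spatial designs cited above, here is a minimal CBAM-style sketch in PyTorch: channel attention reweights feature maps from globally pooled descriptors, then spatial attention reweights pixels from channel-pooled maps. The reduction ratio and kernel size are common defaults, not values taken from any one of the cited papers.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze global context per channel, then reweight channels."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average-pooled descriptor
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max-pooled descriptor
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Pool across channels, then learn a per-pixel attention map."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # channel-average map
        mx = x.amax(dim=1, keepdim=True)     # channel-max map
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

x = torch.randn(1, 32, 28, 28)
y = SpatialAttention()(ChannelAttention(32)(x))  # channel first, as in CBAM
print(y.shape)  # torch.Size([1, 32, 28, 28])
```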
Article
Full-text available
Ultra-high resolution image segmentation has raised increasing interest in recent years due to its realistic applications. In this paper, we innovate on the widely used high-resolution image segmentation pipeline, in which an ultra-high resolution image is partitioned into regular patches for local segmentation and the local results are then merged into a high-resolution semantic mask. In particular, we introduce a novel locality-aware context fusion based segmentation model to process local patches, where the relevance between a local patch and its various contexts is jointly and complementarily utilized to handle semantic regions with large variations. Additionally, we present an alternating local enhancement module that restricts the negative impact of redundant information introduced from the contexts and is thus endowed with the ability to correct the locality-aware features to produce refined results. Furthermore, in comprehensive experiments, we demonstrate that our model outperforms other state-of-the-art methods on public benchmarks and verify the effectiveness of the proposed modules. Our code will be released at: https://github.com/liqiokkk/FCtL.
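
The partition-then-merge pipeline that the paper builds on can be sketched in a few lines. This is only the plain baseline (regular tiling, no locality-aware context fusion), with `model` standing in for any local segmentation network and image sizes assumed divisible by the patch size.

```python
import torch

def segment_ultra_high_res(image: torch.Tensor, model, patch: int = 512):
    """Split an ultra-high resolution image into regular patches, segment
    each locally, and merge the local logits into one full-size mask.

    image: (C, H, W) tensor with H and W divisible by `patch`.
    model: callable mapping (1, C, patch, patch) -> (1, K, patch, patch).
    """
    _, h, w = image.shape
    # Probe once to learn the number of classes K.
    k = model(image[:, :patch, :patch].unsqueeze(0)).shape[1]
    logits = torch.zeros(k, h, w)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            tile = image[:, top:top + patch, left:left + patch]
            logits[:, top:top + patch, left:left + patch] = \
                model(tile.unsqueeze(0))[0]
    return logits.argmax(dim=0)  # (H, W) semantic mask

# Example with a trivial "model": 3-channel input, 5-class logits via 1x1 conv.
net = torch.nn.Conv2d(3, 5, kernel_size=1)
mask = segment_ultra_high_res(torch.randn(3, 1024, 1024), net, patch=512)
print(mask.shape)  # torch.Size([1024, 1024])
```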
... Several studies in the literature have explored integrating attention modules into deep learning models. For example, Wang et al. [35] developed a residual attention network that stacks several attention modules. Different attention modules have been proposed in the literature, such as the convolutional block attention module (CBAM). ...
Article
Full-text available
Large fingerprint databases can make the automated search process tedious and time-consuming. Fingerprint pattern classification is a significant step toward reducing the identification system's complexity in terms of time and speed. Although several fingerprint algorithms have been developed for classification tasks, further improvements in performance and efficiency are still required. Most fingerprint algorithms use deep learning techniques. However, some deep learning techniques can be resource-intensive and computationally expensive, while others disregard the spatial relationships between the features used in classifying fingerprint patterns. This study proposes using lightweight deep learning models (i.e., MobileNet and EfficientNet-B0) integrated with attention modules to classify fingerprint patterns. The two lightweight models are modified, yielding the MobileNet+ and EfficientNet-B0+ models. Lightweight deep learning models can help achieve optimal performance while reducing computational complexity, and the attention modules focus on distinctive features for classification. Our proposed approach integrates four attention modules for fingerprint pattern classification into the two lightweight deep learning models, MobileNet+ and EfficientNet-B0+. To evaluate our approach, we use two publicly available fingerprint datasets: the NIST Special Database 301 dataset and the LivDet dataset. The evaluation results show that the EfficientNet-B0+ model achieves the highest classification accuracy of 97% with only 854,086 training parameters. In conclusion, we consider the training parameters small enough for the EfficientNet-B0+ model to be deployed on low-resource devices.
... Consequently, existing solutions fail to meet the requirements of detecting and recognizing AHSI defects in real-world natural scenes. 2) All of these methods rely on CNNs [17], [18] and their variants [19], [20] for defect detection in OCS components. Unfortunately, CNNs excel in local cognitive abilities but lack global perception capabilities [21], [22], [23], [24], which can result in low recall and precision in recognizing AHSI defects. ...
... To overcome the challenges discussed above, we attempt to explore new techniques and methods to address these bottlenecks. Recently, the fusion of CNNs and transformers [25], [26], [22], [27] has made rapid progress and has been successfully applied to various downstream tasks such as image classification [17], [18], object detection [28], [29], [19], and semantic segmentation [30], [31]. ...
... Efficient CNN blocks (ECBs) and efficient transformer blocks (ETBs) work in parallel to improve efficiency during feature extraction. Inspired by ResNet's strategy of increasing the number of blocks in Stage 3 [17], we fix L_1, L_2, and L_4 (the block numbers in Stages 1, 2, and 4) and determine the optimal parameters N_x and L_3 through numerous experiments on the Image-22K dataset [45]. The experiments provide evidence that CTBM-DAHD achieves optimal performance when N_x = 4 and L_3 = 4. ...
Article
Full-text available
The sectional insulator with arcing horns is indispensable for electrified railways’ overhead contact system (OCS). When these arcing horns are damaged, broken, or detached, arcing can occur due to unstable contact between the pantograph and catenary, potentially causing burns and damage to OCS components or even complete system failure. Unfortunately, no specialized technique or method for detecting arcing horn defects exists. To address this critical issue, this paper proposes a novel CTBM-DAHD network that utilizes CNN-Transformer bridge mode to recognize arcing horn defects in realistic application scenarios, such as rainy, foggy, sunny, and night-time conditions. Notably, the network can accurately detect obscured and long-range minor arcing horn defects, significantly improving recall and precision of defect recognition. The experimental findings demonstrate that, compared to the state-of-the-art networks, the CTBM-DAHD network achieves outstanding performance while maintaining lower computational costs and a reduced number of weight parameters. It surpasses the best CNN-Transformer bridge fusion network by 3.5% and outperforms the dual-branch vision transformer by 1.5%. Moreover, the CTBM-DAHD network has been successfully deployed on over 500 high-speed trains in China, detecting more than 1,000 arcing horn defects. These results affirm its effectiveness in recognizing arcing horn defects in complex and natural environments.
... In recent years, great effort has been devoted to networks with attention mechanisms. Wang et al. proposed a residual attention network based on an attention module that utilizes an encoder-decoder structure [34]. The network is proven to perform well and to be robust to noisy data through refining the feature maps. ...
Article
Full-text available
Three-dimensional convolutional neural networks (3D-CNNs) and fully connected long short-term memory networks (FC-LSTMs) have been demonstrated to be powerful non-intrusive approaches for fall detection. However, the feature extraction of 3D-CNN-based methods requires a large-scale dataset. Meanwhile, the deployment of FC-LSTM, which expands the input into one dimension, leads to the loss of spatial information. To this end, a novel model combining a lightweight 3D-CNN and convolutional long short-term memory (ConvLSTM) networks is proposed in this paper. In this model, a lightweight 3D convolutional neural network with five layers is presented to avoid over-fitting. To further explore discriminative features, channel- and spatial-wise attention modules are adopted in each layer to improve detection performance. In addition, ConvLSTM is presented to extract the long-term spatial-temporal features of 3D tensors. Finally, we verify our model through extensive experiments with comprehensive comparisons on HMDB5, UCF11, URFD, and MCFD. Experimental results on these public benchmarks demonstrate that our method outperforms current state-of-the-art single-stream networks with 62.55 ± 7.99% on HMDB5, 97.28 ± 0.36% on UCF11, 98.06 ± 0.32% on URFD, and 94.84 ± 4.64% on MCFD.
... Thus, various attention-based approaches have been proposed to address these tasks. In classification tasks, attention-based modules enable models to capture discriminative features for target classes [36]. Meanwhile, in segmentation tasks, attention mechanism modules yield promising results by enhancing attention on challenging pixels such as boundaries, facilitating the understanding of contextual information [37,38]. ...
Article
Full-text available
Currently, deep learning-based methods have achieved success in glaucoma detection. However, most models focus on OCT images captured by a single scan pattern within a given region, running a high risk of omitting valuable features in the remaining regions or scan patterns. Therefore, we propose a multi-region and multi-scan-pattern fusion model to address this issue. Our proposed model exploits comprehensive OCT images from three fundus anatomical regions (macular, middle, and optic nerve head regions) captured by four scan patterns (radial, volume, single-line, and circular scan patterns). Moreover, to enhance the efficacy of integrating features across various scan patterns within a region and across multiple regional features, we employ an attention multi-scan fusion module and an attention multi-region fusion module that automatically assign contributions to distinct scan-pattern and region features, adapting to the characteristics of different samples. To alleviate the absence of available datasets, we have collected a specific dataset (MRMSG-OCT) comprising OCT images captured by four scan patterns from three regions. The experimental results and visualized feature maps both demonstrate that our proposed model achieves superior performance compared with single-scan-pattern models and single-region-based models. Moreover, compared with the average fusion strategy, our proposed fusion modules yield superior performance, particularly reversing the performance degradation observed in some models relying on fixed weights, validating the efficacy of the proposed dynamic region scores adapted to different samples. Furthermore, the derived region contribution scores enhance the interpretability of the model and offer an overview of its decision-making process, assisting ophthalmologists in prioritizing regions with heightened scores and increasing efficiency in clinical practice.
... The RAN (Wang et al., April 2017) is formed by arranging various attention layers one after another. These attention layers are further divided into two branches. ...
Article
Certain unwanted crime events can be prevented, or even eliminated before their execution, through automatic identification of abnormal human behavior. Automatic prediction of abnormal human behavior is, however, a difficult task. Some automated models have been implemented and have provided promising results. Manual intervention was the predominant approach in earlier times, yet it brings numerous errors, consumes more time, and is more costly. Hence, an automated model is suggested for identifying the activities. While scholars focus on machine and deep learning, such classifiers may extract hand-crafted features but fail to yield an appropriate solution for recognizing the activities. Since the data consist of video frames, object detection is highly challenging: ineffective feature vectors and inadequate scale measures of the learning model pave the way for performance degradation. This issue can be resolved by including an attention mechanism in the deep learning model for both monitoring and classification purposes. The recommended Human Abnormal Behavior Recognition and Tracking (HABRT) model performs the following operations: collection of video, categorization of the behavior in the video as normal or abnormal, monitoring, extraction of objects, and classification of the abnormality. The input video frames are initially gathered from publicly available databases. Using these frames, abnormal behavior classification is performed by a Multiscale Dilated assisted Residual Attention Network (MD-RAN). For further enhancement, the hyper-parameters of the MD-RAN are optimally selected by a novel Modified Random Parameter-based Chimp Optimization Algorithm (MRP-ChOA). Once the abnormal frames are obtained, activity tracking is achieved by an Adaptively Modified You Only Look Once V3 network (AM-YOLO V3). This model encompasses multiple layers, and the number of layers utilized is determined optimally using MRP-ChOA. Consequently, the objects are extracted from the abnormal frames with the help of AM-YOLO V3. Finally, the abnormalities are classified using the same MD-RAN. The performance is then analyzed and validated with diverse metrics and compared with other algorithms. On dataset 1, the accuracy attains its maximum, exceeding DTCN, CNN-RNN, and ResAttenConvLSTM by 3.14%, 2.308%, and 13.7%, respectively. These findings reveal that the model has the potential to deliver strong results for abnormal behavior recognition and tracking.
... • Well-suited for difficult and/or time-consuming tasks that would require humans to "hand-craft" predictive features from the data [182][183][184][185][186]; example: predicting gene expression from DNA sequence features [187]. • Training may require specialized hardware to handle parallel computations (i.e., GPUs [188]) ...
Article
Applications of machine learning in the biomedical sciences are growing rapidly. This growth has been spurred by diverse cross-institutional and interdisciplinary collaborations, public availability of large datasets, an increase in the accessibility of analytic routines, and the availability of powerful computing resources. With this increased access and exposure to machine learning comes a responsibility for education and a deeper understanding of its bases and bounds, borne equally by data scientists seeking to ply their analytic wares in medical research and by biomedical scientists seeking to harness such methods to glean knowledge from data. This article provides an accessible and critical review of machine learning for a biomedically informed audience, as well as its applications in psychiatry. The review covers definitions and expositions of commonly used machine learning methods, and historical trends of their use in psychiatry. We also provide a set of standards, namely Guidelines for REporting Machine Learning Investigations in Neuropsychiatry (GREMLIN), for designing and reporting studies that use machine learning as a primary data-analysis approach. Lastly, we propose the establishment of the Machine Learning in Psychiatry (MLPsych) Consortium, enumerate its objectives, and identify areas of opportunity for future applications of machine learning in biological psychiatry. This review serves as a cautiously optimistic primer on machine learning for those on the precipice as they prepare to dive into the field, either as methodological practitioners or well-informed consumers.
... In [38], a CNN is utilized to construct a deep Q-network, which evaluates user QoS demand data and energy consumption data to facilitate further network resource optimization. Moreover, recent research has begun to focus on introducing the attention mechanism into CNNs to further enhance model performance when processing large amounts of data [39]; this has been explored in various scenarios including image caption generation [40], [41], machine translation [42], [43], and speech recognition [44], [45]. [47] introduces the attention mechanism to residual convolutional neural networks, yielding the residual attention network, in which attention is achieved by stacking attention modules that generate attention-aware features. [46] achieves the attention mechanism in CNNs by adding a squeeze-and-excitation (SE) block, which explicitly models inter-dependencies between different channels to adaptively learn channel-wise features and proves to deliver a great performance enhancement in learning multi-channel data features. ...
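
For reference, a minimal sketch of the SE block described above, following the squeeze-excite-rescale pattern; the reduction ratio of 16 is the commonly used default rather than a value from the citing paper.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: squeeze spatial dims to a channel descriptor,
    excite it through a bottleneck MLP, and rescale the input channels by
    the resulting weights."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)  # squeeze + excite
        return x * w                                       # channel-wise rescale

x = torch.randn(2, 64, 14, 14)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 14, 14])
```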
Article
Full-text available
Network performance evaluation is crucial in ensuring the effective operation of 5G wireless networks, offering valuable insights into evaluating network status and user experience. However, the complexity of network conditions, characterized by high dynamics and diverse user requirements across various vertical applications, presents a significant challenge for generating accurate and detailed evaluation results using existing algorithms. To provide a feasible solution for this issue, an artificial intelligence-enabled 5G network performance evaluation scheme for private 5G networks is proposed. First, the network performance evaluation at different granularities is modeled with the deployment of network performance evaluation introduced. Furthermore, an intelligent network performance evaluation architecture based on residual networks with the attention mechanism is introduced, which can generate evaluation scores based on key performance indicators of reliability, accessibility, utilization, integrity, mobility and retainability. Additionally, the corresponding training strategies for the intelligent model, catering to different evaluation granularity, are thoroughly designed. Finally, to validate the effectiveness of the proposed scheme, comprehensive experiments are conducted using practical 5G network operation system data. The experimental results demonstrate the scheme’s ability to achieve highly accurate evaluations with fine spatial granularity. These findings establish the feasibility and efficacy of the proposed artificial intelligence-enabled scheme in enhancing 5G network performance evaluation.
... ii) Residual approximation: Deep residual networks (ResNets) are a classic structure for residual approximation [44], built from stacked residual blocks. Furthermore, the residual attention network stacks multiple attention modules whose features change adaptively as the layers become deeper; thanks to residual approximation, it can be stacked to hundreds of layers [45]. ...
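
In its basic form, the residual approximation referenced here is the ResNet identity-skip formulation, where the stacked layers learn only the residual:

```latex
% ResNet residual block: the weight layers learn the residual mapping F;
% the identity shortcut carries x through unchanged.
y = x + \mathcal{F}(x, \{W_i\})
```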
... It mainly contains two schemes: channel-wise attention and spatial-wise attention. The former is equivalent to performing convolution [178], while the latter is responsible for encoding local information in features [45]. Actually, some studies have integrated them into one framework [179], [180]. ...
Article
Full-text available
The dynamic neural network (DNN), in contrast to the static counterpart, offers numerous advantages, such as improved accuracy, efficiency, and interpretability. These benefits stem from the network’s flexible structures and parameters, making it highly attractive and applicable across various domains. As the broad learning system (BLS) continues to evolve, DNNs have expanded beyond deep learning (DL), orienting a more comprehensive range of domains. Therefore, this comprehensive review article focuses on two prominent areas where DNN structures have rapidly developed: 1) DL and 2) broad learning. This article provides an in-depth exploration of the techniques related to dynamic construction and inference. Furthermore, it discusses the applications of DNNs in diverse domains while also addressing open issues and highlighting promising research directions. By offering a comprehensive understanding of DNNs, this article serves as a valuable resource for researchers, guiding them toward future investigations.