Figure 6 - available via license: Creative Commons Attribution 3.0 Unported
Content may be subject to copyright.
The Comparison of Different Step Sizes. Therefore, the general gradient descent algorithm is as follows: initialise the parameters of the objective function, the recursion terminal distance ε, and the step size α. Then calculate the gradient, i.e. the partial derivative ∂J(θ)/∂θ of the loss function of the objective function. 

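The procedure quoted above can be sketched in a few lines of Python (a minimal illustration, assuming a simple one-dimensional quadratic loss as a stand-in for the paper's objective function):

```python
# Minimal gradient descent sketch. The loss J(theta) = (theta - 3)^2 and
# its gradient are illustrative stand-ins, not the source paper's objective.
def gradient_descent(grad, theta, alpha=0.1, eps=1e-8, max_iter=10_000):
    """Repeat theta <- theta - alpha * grad(theta) until the update
    distance falls below the recursion terminal distance eps."""
    for _ in range(max_iter):
        step = alpha * grad(theta)
        theta -= step
        if abs(step) < eps:  # descent distance below the terminal distance
            break
    return theta

# J(theta) = (theta - 3)^2 has gradient 2 * (theta - 3) and minimum at 3.
theta_star = gradient_descent(lambda t: 2 * (t - 3), theta=0.0)
```

With step size α = 0.1 the iterate contracts toward the minimiser geometrically; the terminal distance ε plays the role of the stopping criterion described in the excerpt.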
Source publication
Article
Full-text available
With the development of integrated circuits and computer science, people care more about solving practical issues via information technologies. Along with that, a new subject called Artificial Intelligence (AI) has emerged. One popular research interest within AI is recognition algorithms. In this paper, one of the most common algorithms,...

Context in source publication

Context 1
... it cannot detect an optimal point in this convergence space. Figure 6 shows the comparison of the loss function's convergence behaviour for different step sizes. If the descent distance is less than the terminal distance ε, stop the algorithm; otherwise, continue the process. ...
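The effect Figure 6 is describing can be reproduced with a toy experiment (illustrative only; the loss J(θ) = θ², the particular step sizes, and the iteration count are assumptions, not values from the paper):

```python
# Compare loss convergence for three step sizes on J(theta) = theta**2.
def run(alpha, steps=50, theta=1.0):
    losses = []
    for _ in range(steps):
        theta -= alpha * 2 * theta  # gradient of theta**2 is 2 * theta
        losses.append(theta ** 2)
    return losses

small, moderate, large = run(0.01), run(0.1), run(1.1)
# A very small step size converges slowly, a moderate one converges
# quickly, and an overly large one makes the loss diverge.
```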

Similar publications

Article
Full-text available
3D convolution neural networks (CNNs) have shown excellent predictive performance on tasks such as action recognition from videos. Since 3D CNNs have unique characteristics and extremely high compute/memory-overheads, executing them on accelerators designed for 2D CNNs provides sub-optimal performance. To overcome these challenges, researchers have...

Citations

... The output was subjected to the activation function, and the correctness of the generated value was calculated. The error (loss) values between the prediction y_p and the target value y_t were calculated by the loss function (Cui, 2018). The mean square error (MSE) loss function (Muthukumar et al., 2021) was utilized to measure the loss values in this study. ...
... The optimizer accordingly updates the weight and bias values (the learning stage) using the loss function. In this research, the gradient descent (GD) optimization algorithm was employed, as represented in Eq. (7) (Cui, 2018). Other optimization algorithms have been developed based on the GD algorithm, including Adagrad (Duchi et al., 2011), Adam (Kingma and Ba, 2014), and AdaBelief (Zhuang et al., 2020). ...
... Backpropagation adjusts network weights based on output errors by applying the chain rule to compute the gradient of the loss function with respect to each weight, effectively propagating errors backward through the network [9]. Gradient descent, an iterative optimization algorithm, then minimizes the loss function by adjusting these weights based on the computed gradients [10]. Together, these mechanisms enable continuous improvement of the model's predictions by minimizing the error between the predicted and actual outcomes. ...
... Backpropagation calculates the gradients, providing the direction in which the weights need to be modified to minimize the loss. Subsequently, gradient descent utilizes these gradients to update the weights over iterations or epochs until the desired level of accuracy is reached [10]. ...
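The interplay these citations describe, backpropagation computing gradients via the chain rule and gradient descent applying them, can be sketched for a single sigmoid neuron with MSE loss (the toy data, learning rate, and epoch count are illustrative assumptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, alpha = 0.0, 0.0, 0.5
data = [(0.0, 0.0), (1.0, 1.0)]  # pairs (input x, target y_t)

for _ in range(2000):  # epochs
    for x, y_t in data:
        y_p = sigmoid(w * x + b)  # forward pass (prediction)
        # backpropagation via the chain rule for L = (y_p - y_t)^2 / 2:
        # dL/dw = (y_p - y_t) * y_p * (1 - y_p) * x, and dL/db drops the x.
        delta = (y_p - y_t) * y_p * (1 - y_p)
        w -= alpha * delta * x  # gradient descent update
        b -= alpha * delta
```

Over the epochs the squared error shrinks as the weight and bias move against their gradients, which is exactly the continuous improvement loop described above.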
Article
Full-text available
Background: This research delves into deep learning technologies within the realm of medical imaging, with a specific focus on the detection of anomalies in medical pathology, emphasizing breast cancer. It underscores the critical importance of segmentation techniques in identifying diseases and addresses the challenge of scarce labelled data in Whole Slide Images. Additionally, the paper provides a review cataloguing 61 deep learning architectures identified during the study. Objectives: The aim of this study is to present and assess a novel quantitative approach utilizing specific deep learning architectures, namely the Feature Pyramid Network and the Linknet model, both of which integrate a ResNet34 encoder to enhance performance. The paper also examines the efficiency of a semi-supervised training regimen using a dual-model architecture, consisting of ‘Teacher’ and ‘Student’ models, in addressing the issue of limited labelled datasets. Methods: Employing a semi-supervised training methodology, this research enables the ‘Student’ model to learn from the ‘Teacher’ model’s outputs. The study methodically evaluates the models’ stability, accuracy, and segmentation capabilities, employing metrics such as the Dice Coefficient and the Jaccard Index for comprehensive assessment. Results: The investigation reveals that the Linknet model performs well, achieving an accuracy of 94% in the detection of breast cancer tissues using a 21-seed parameter for the initialization of model weights. It further excels in generating annotations for the ‘Student’ model, which then achieves 91% accuracy with minimal computational demands. Conversely, the Feature Pyramid Network model demonstrates a slightly lower accuracy of 93% in the ‘Teacher’ model but exhibits improved and more consistent results in the ‘Student’ model, reaching 95% accuracy with a 42-seed parameter.
Conclusions: This study underscores the efficacy and potential of the Feature Pyramid Network and Linknet models in the domain of medical image analysis, particularly in the detection of breast cancer, and suggests their broader applicability to various medical segmentation tasks related to other pathological disorders. Furthermore, the research enhances the understanding of the pivotal role that deep learning technologies play in advancing diagnostic methods within the field of medical imaging.
... 3. Weight sharing: The convolutional layer in CNNs uses a weight-sharing mechanism [36], which means that the same set of weights is used as the convolutional kernel slides across the entire image [37]. This weight sharing can greatly reduce the number of model parameters and improve the training efficiency of the model. ...
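The weight-sharing idea is easy to see in one dimension: a single small kernel (one shared set of weights) is applied at every position of the input, so the parameter count is the kernel size rather than the input size (a minimal sketch; the signal and kernel values are arbitrary):

```python
# 1-D convolution (valid mode): the same kernel weights are reused at
# every input position, which is exactly the weight-sharing mechanism.
def conv1d(signal, kernel):
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

out = conv1d([1, 2, 3, 4, 5], [1, 0, -1])  # 3 shared weights, 3 outputs
```

Here three shared weights cover a length-5 input and would cover an input of any length, producing [-2, -2, -2] for this constant-slope signal.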
Article
Full-text available
Gesture recognition is an important and inevitable technology in modern times; its emergence and improvement greatly improve the convenience of people's lives and also enrich them. It has a wide range of applications in various fields. In daily life, it enables human-computer interaction and the use of smart homes. In medicine, it can help patients recover and assist doctors in carrying out experiments. In entertainment, it allows users to interact with games in an immersive manner. This paper chooses three technologies in which deep learning plays a prominent role in gesture recognition, namely CNNs, LSTM, and transfer learning based on deep learning. Each has its own advantages and disadvantages. Because their underlying principles differ, different techniques play different roles: CNNs can carry out feature extraction, LSTM can handle long time series, and transfer learning can transfer what is learned from one task to another. Different practical technologies are selected according to different application scenarios, with improvements made in real time in practical applications. Gesture recognition based on deep learning has the advantages of good accuracy, robustness, and real-time implementation, but it also bears the disadvantages of large economic and time costs and high hardware requirements. Despite some challenges, researchers continue to optimize and improve the technology, and it is believed that in the future gesture recognition technology will become more mature and valuable.
... The training processes for all receivers show convergence, with the final loss being less than 0.1. Stochastic gradient descent with momentum (SGDM) was used to search for the optimal learnable parameters and biases that minimize the loss function [47]. The initial learning rate of SGDM was set to 1 × 10⁻⁵, and after every 400 samples the learning rate dropped by a factor of 0.5, as summarized in Table 3. ...
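A step-decay schedule of the kind quoted above (initial rate 1 × 10⁻⁵, halved every 400 samples) can be written as a small helper; the function below is an illustrative sketch, not the cited authors' code:

```python
def step_decay_lr(sample_idx, initial_lr=1e-5, drop=0.5, every=400):
    """Return the learning rate after `sample_idx` samples, multiplying
    it by `drop` once per completed block of `every` samples."""
    return initial_lr * (drop ** (sample_idx // every))
```

So samples 0-399 train at 1e-5, samples 400-799 at 5e-6, and so on.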
Article
Full-text available
Global navigation satellite systems (GNSSs) applied to intelligent transport systems in urban areas suffer from multipath and non-line-of-sight (NLOS) effects due to the signal reflections from high-rise buildings, which seriously degrade the accuracy and reliability of vehicles in real-time applications. Accordingly, the integration between GNSS and inertial navigation systems (INSs) could be utilized to improve positioning performance. However, the fixed GNSS solution uncertainty of the conventional integration method cannot determine the fluctuating GNSS reliability in fast-changing urban environments. This weakness becomes solvable using a deep learning model for sensing the ambient environment intelligently, and it can be further mitigated using factor graph optimization (FGO), which is capable of generating robust solutions based on historical data. This paper mainly develops the adaptive GNSS/INS loosely coupled system on FGO, along with the fixed-gain Kalman filter (KF) and adaptive KF (AKF) being taken as comparisons. The adaptation is aided by a convolutional neural network (CNN), and the feasibility is verified using data from different grades of receivers. Compared with the integration using fixed-gain KF, the proposed adaptive FGO (AFGO) maintains the 100% positioning availability and reduces the overall 2D positioning error by up to 70% in the aspects of both root mean square error (RMSE) and standard deviation (STD).
... However, training with deep networks is complex since it can lead to the vanishing gradient issue [Alzubaidi, 2021]. Convolutional neural networks (CNNs) have been applied successfully in image classification and pattern recognition [Cui, 2018, Zhou et al., 2015, Hien et al., 2021, Hieu et al., 2020a]. Deep neural networks, on the other hand, require a large amount of data to avoid overfitting. ...
Article
Full-text available
The natural ecosystem incorporates thousands of plant species, and distinguishing them is normally manual, complicated, and time-consuming. Since the task requires a large amount of expertise, identifying forest plant species relies on the work of a team of botanical experts. The emergence of Machine Learning, especially Deep Learning, has opened up a new approach to plant classification. However, the application of plant classification based on deep learning models remains limited. This paper proposed a model, named PlantKViT, combining the Vision Transformer architecture and the KNN algorithm to identify forest plants. The proposed model provides high efficiency and convenience for adding new plant species. The study experimented with Resnet-152, ConvNeXt networks, and the PlantKViT model to classify forest plants. Training and evaluation were implemented on the DanangForestPlant dataset, containing 10,527 images and 489 species of forest plants. The accuracy of the proposed PlantKViT model reached 93%, significantly improved compared to the ConvNeXt model at 89% and the Resnet-152 model at only 76%. The authors also successfully developed a website and two applications, called 'plant id' and 'Danangplant', on the iOS and Android platforms respectively. The PlantKViT model shows potential in forest plant identification not only on the conducted dataset but also worldwide. Future work should gear toward extending the dataset and enhancing the accuracy and performance of forest plant identification.
... This is because the training of DNNs becomes a nonsmooth convex optimization problem; although any local minimum of such a problem is also a global minimum, the nonsmoothness means traditional optimization algorithms such as stochastic gradient descent (SGD) and its variants, including Nesterov accelerated gradient (NAG) and Adam [2,3], may struggle to find the optimal parameters in such cases. ...
... Choosing an appropriate learning rate is a crucial aspect of any optimization algorithm, as it determines the step size at which the model parameters are updated during training. Fine-tuning the learning rate for deep CNNs can be tedious and may not always result in optimal convergence [3,16,18]. To address this issue, we propose a novel approach for controlling the learning rate during training. ...
Article
Full-text available
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder that affects social interaction and communication. Early detection of ASD can significantly improve outcomes for individuals with the disorder, and there has been increasing interest in using machine learning techniques to aid in the diagnosis of ASD. One promising approach is the use of deep learning techniques, particularly convolutional neural networks (CNNs), to classify facial images as indicative of ASD or not. However, choosing a learning rate for optimizing the performance of these deep CNNs can be tedious and may not always result in optimal convergence. In this paper, we propose a novel approach called the control subgradient algorithm (CSA) for tackling ASD diagnosis based on facial images using deep CNNs. CSA is a variation of the subgradient method in which the learning rate is updated by a control step in each iteration of each epoch. We apply CSA to the popular DensNet-121 CNN model and evaluate its performance on a publicly available facial ASD dataset. Our results show that CSA is faster than the baseline method and improves the classification accuracy and loss compared to the baseline. We also demonstrate the effectiveness of using CSA with L1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$L_1$$\end{document}-regularization to further improve the performance of our deep CNN model.
... Deep neural networks, as an advanced machine learning method, have gradually been applied in object detection, text detection, and image classification (Banan et al., 2020; Est et al., 2015; Jafari et al., 2017; Zhu et al., 2016). Many researchers have begun to use deep neural networks for LCZ classification and have greatly improved the classification accuracy (Liu and Shi, 2020; Qiu et al., 2020; Rosentreter et al., 2020). As a representative algorithm of deep neural networks, CNN combines the back-propagation mechanism and gradient descent optimization (Cui, 2018; Mizutani, 1994). Back-propagation provides opportunities for feedback to enhance reliability, and gradient descent is used in the self-training process. ...
Article
Local Climate Zone (LCZ) is a significant classification system of urban form and function, which can specifically reflect 3-dimensional urban information. However, previous studies of LCZ lack class expansion, comparison of the classification accuracy of different CNNs, and post-processing methods for the results. Therefore, using very-high-resolution (VHR) images (2.2 m resolution) to expand the LCZ classes, we compare three different biclassified convolutional neural networks (CNNs), namely MobileNet-Segnet (MS), MobileNet-Unet (MU) and MobileNet-Pspnet (MP), and select the optimal CNN to classify Fujian Delta images. Then, we combine “Vote-Filter-Overlay” methods to remove misidentified patches and smooth boundaries for the biclassified LCZ maps. The study results show that: (1) The 2.2 m resolution VHR images can expand the LCZ classes from 17 to 20. (2) Different CNNs have diverse sensitivity to each LCZ; the more distinctive the texture characteristics of an LCZ, the higher its identification rate. Among the three CNNs, MP is the best model for LCZ (2,4,8,9,10,B,C,D,G) and MU is the optimal model for LCZ (1,3,5,6,11,A,F,H,I). (3) The “Vote-Filter-Overlay” method can remove misidentified patches and noise and make the LCZ map more in line with actual urban forms and functions. (4) The Fujian Delta urban areas form a continuous urban belt along the southeastern coast, while the villages and forests are distributed in the northwest. Many small patches of vegetation and water, which can serve as potential urban ecological corridors, were found in the urban core area. The Fujian Delta urban areas are dominated by compact LCZ (1–3), and Xiamen has the highest proportion of LCZ (1), especially on Xiamen island. The results of this study will provide a reference for LCZ classification and basic data for urban morphology.
... The most commonly used algorithm is gradient descent or one of its variants, such as stochastic gradient descent or mini-batch gradient descent [CJRR22, Cui18, LL18, SHL21]. On each iteration, the gradient descent method updates the parameters in the negative direction of the gradient of the objective function with respect to the parameters, where the gradient gives the direction of steepest ascent. ...
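The mini-batch variant mentioned in this excerpt averages the gradient over a small batch before each update; here is a minimal sketch for one-parameter linear regression (the data, batch size, and learning rate are illustrative assumptions, not from the cited works):

```python
import random

random.seed(0)
data = [(x, 2.0 * x) for x in range(1, 11)]  # y = 2x, so the true weight is 2

w, alpha, batch_size = 0.0, 0.001, 4
for _ in range(200):  # epochs
    random.shuffle(data)  # stochastic: visit mini-batches in random order
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        # mean gradient of the squared error (w*x - y)^2 over the batch
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= alpha * grad  # step in the negative gradient direction
```

Batch size 1 recovers plain stochastic gradient descent, and a batch spanning the whole dataset recovers full-batch gradient descent.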
Thesis
Full-text available
Currently, artificial neural networks are the most popular approach to machine learning problems such as high-dimensional multivariate regression. Methods using sums of separable functions, which grew out of tensor decompositions, are designed to represent functions in high dimensions and can be applied to high-dimensional multivariate regression. Here we compare the ability of these two methods to approximate function spaces in order to assess their relative expressive power. We show that a general neural network result can be translated into sums of separable functions if the activation function satisfies certain smoothness conditions. Conversely, we show that it is possible to approximate any sums-of-separable-functions result with neural networks, using the approximation of products of functions by deep neural networks. We identify general approximation schemes in both the single-layer and deep-layer settings that apply to both methods for approximating certain function classes. In particular, we show that sums of separable functions give the same error rates as neural networks for function classes such as Barron’s functions and band-limited functions. Inspired by deep neural networks, we also introduce deep-layer sums of separable functions, which show results similar to those of deep neural networks for functions with compositional structure.
... All of the aforementioned techniques have limitations, ranging from poor interpretation and recognition of the object of interest to structural design issues. To solve these issues and to enhance convergence, gradient descent training algorithms were proposed [18,19]. Gradient descent algorithms tend to overcome most of the shortcomings of their predecessors by converging quickly and efficiently to local minima [20]. ...
Article
Full-text available
Tracking moving objects is one of the most promising yet the most challenging research areas pertaining to computer vision, pattern recognition and image processing. The challenges associated with object tracking range from problems pertaining to camera axis orientations to object occlusion. In addition, variations in remote scene environments add to the difficulties related to object tracking. All the mentioned challenges and problems pertaining to object tracking make the procedure computationally complex and time-consuming. In this paper, a stochastic gradient based optimization technique has been used in conjunction with particle filters for object tracking. First, the object that needs to be tracked is detected using the Maximum Average Correlation Height (MACH) filter. The object of interest is detected based on the presence of a correlation peak and average similarity measure. The results of object detection are fed to the tracking routine. The gradient descent technique is employed for object tracking and is used to optimize the particle filters. The gradient descent technique allows particles to converge quickly, allowing less time for the object to be tracked. The results of the proposed algorithm are compared with similar state-of-the-art tracking algorithms on five datasets that include both artificial moving objects and humans to show that the gradient-based tracking algorithm provides better results, both in terms of accuracy and speed.
... The testing data sets are represented as “Set 5” (i.e., 5 images) [19] and “Set 14” (14 images) [21]. This choice of training and testing data was made to allow a fair comparison with the works where they have been utilized [14]. To assess the efficiency of the implemented model, we use different metrics to measure the degree of similarity between the original images and the noisy/blurred input images. ...
... Overall framework of creating low resolution. 3) Training a network: network training can be defined as a procedure that finds the kernels in the convolution layers and the weights that minimize the differences between the output predictions and the ground-truth labels on the training data set. Backpropagation is the approach typically utilized for training NNs, in which the gradient descent optimization algorithm and the loss function play essential roles [14]. Fig. 4 shows the main steps of the training process in the CNN model. For the training strategy, the original colour image is first converted into a grey-scale image by extracting the luminance component in the YCbCr colour space. ...
Article
Image restoration is a branch of image processing that involves a mathematical deterioration and restoration model to restore an original image from a degraded one. This research aims to restore blurred images that have been corrupted by a known or unknown degradation function. Image restoration approaches can be classified into two groups based on knowledge of the degradation features: blind and non-blind techniques. In our research, we adopt a blind algorithm. A deep learning method (SR) has been proposed for single-image super-resolution. This approach can directly learn an end-to-end mapping between low-resolution and high-resolution images. The mapping is expressed by a deep convolutional neural network (CNN). The proposed restoration system must deal with the challenge that the degraded images have an unknown blur kernel, in order to deblur them as an estimation of the original images with a minimum rate of error.