Figure - uploaded by Yusuf Baydilli
Weight initialization methods.

Source publication
Conference Paper
Full-text available
In this study, we analyzed the hyper-parameters that are frequently used in deep learning methods on a generated DNN. On the Fashion-MNIST dataset, the tests performed over a low number of epochs gave us the chance to interpret the evolution of the model through to the end. At the end of the study, we reached a success rate of about 90 percent on the test...

Context in source publication

Context 1
... initialization is a significant factor in reducing the time spent on learning [8], [9]. When we examined the accuracy and loss values obtained from the tests performed, although no significant difference occurred, we decided to continue the benchmark process with He Normal [10], in which the most improvement was observed (Table 2). ...
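A minimal sketch of what applying He Normal initialization could look like, using PyTorch's kaiming_normal_ (PyTorch's name for He Normal). The layer sizes and activation choice are illustrative assumptions, not the architecture of the source DNN.

```python
# Minimal sketch: applying He Normal (Kaiming normal) initialization to a small
# fully connected network in PyTorch. Layer sizes are illustrative assumptions,
# not the architecture used in the source study.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),              # 28x28 Fashion-MNIST image -> 784-dim vector
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 10),        # 10 clothing classes
)

def he_normal_init(module: nn.Module) -> None:
    """Apply He Normal initialization to every linear layer."""
    if isinstance(module, nn.Linear):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        nn.init.zeros_(module.bias)

model.apply(he_normal_init)
```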

Citations

... Therefore, for each data set, it is important to investigate the sensitivity of the parameters in the CNN model to make predictions with high accuracy. To develop any deep learning model, the optimal values of a set of hyperparameters must be decided, such as activation functions, batch size, and learning rate, among others, in order to fine-tune each of these layers [13][14][15][16]. ...
... In the studies, the mini-batch size should be determined as a value that can fit into the GPU memory and should be less than the size of the training dataset. It is usually taken as a power of two (such as 8, 16, or 32). ...
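A short sketch of sweeping the mini-batch size over powers of two with a PyTorch DataLoader, as described in the excerpt above. The toy dataset and the candidate sizes are assumptions for illustration, not values taken from the cited study.

```python
# Minimal sketch: sweeping the mini-batch size over powers of two with a
# PyTorch DataLoader. Dataset and candidate sizes are illustrative assumptions.
import torch
from torch.utils.data import TensorDataset, DataLoader

# Toy stand-in for a training set (1000 samples of 784 features, 10 classes).
train_set = TensorDataset(torch.randn(1000, 784), torch.randint(0, 10, (1000,)))

for batch_size in (8, 16, 32, 64, 128):          # powers of two, < len(train_set)
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    # ... train for a few epochs here and record accuracy/loss per batch size ...
    print(f"batch_size={batch_size}, batches per epoch={len(loader)}")
```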
... Optimization of the network can therefore proceed, and we can begin to identify domains that improve the model's generalizability, accuracy, learning rate, and the network's ability to learn the crucial features of the dataset [32]-[36]. ...
Preprint
Full-text available
In this paper, we present AnalogVNN, a simulation framework built on PyTorch which can simulate the effects of optoelectronic noise, limited precision, and signal normalization present in photonic neural network accelerators. We use this framework to train and optimize linear and convolutional neural networks with up to 9 layers and ~1.7 million parameters, while gaining insights into how normalization, activation function, reduced precision, and noise influence accuracy in analog photonic neural networks. By following the same layer structure design present in PyTorch, the AnalogVNN framework allows users to convert most digital neural network models to their analog counterparts with just a few lines of code, taking full advantage of the open-source optimization, deep learning, and GPU acceleration libraries available through PyTorch.
... They are the most mysterious hyperparameters that increase the number of parameters to be trained. Since NHL and NNPHL are the quantities that affect the number of parameters to be trained (weights, biases), the most optimal model, i.e., "the most complex model with the least parameters," must be established in terms of both time and learning [10, 11]. Increasing or decreasing the number of hidden layers or the number of hidden nodes might improve the accuracy or might not. This depends on the complexity of the problem, where inappropriate selection of these hyperparameters causes underfitting or overfitting problems [12]. ...
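The excerpt relates the number of hidden layers (NHL) and the number of nodes per hidden layer (NNPHL) to the number of trainable weights and biases. A small sketch of that relationship, counting parameters of a fully connected network; the input and output sizes below are assumptions for illustration.

```python
# Minimal sketch: counting trainable parameters (weights + biases) of a fully
# connected network as a function of the hidden-layer configuration.
def count_parameters(input_size: int, hidden_sizes: list[int], output_size: int) -> int:
    sizes = [input_size, *hidden_sizes, output_size]
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out   # weight matrix + bias vector
    return total

# More hidden layers / more nodes per layer -> more parameters to train.
print(count_parameters(784, [128], 10))        # 101,770
print(count_parameters(784, [256, 128], 10))   # 235,146
```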
... where f(x_i) is the objective function, whose range is supposed to be positive in this situation. The updated speed formula for each member can be created via equation (10) by computing X_cg, the best global solution at the whole swarm level P_gs, and the best local solution detected by every individual P_il, as follows: ...
... The idea behind that is to make such a CoG-particle (X_cg) drive the convergence of swarm individuals towards the optimal global solution and help them move away from the local solution. It may possibly be clarified by explaining the function of the second and third parts in (10). ...
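The excerpts refer to a modified speed-update equation (10) built around the centre-of-gravity particle X_cg, but the equation itself is not reproduced here. The sketch below therefore shows only the standard PSO velocity/position update that such a modification extends; the coefficients and symbols are assumptions, not the paper's PSOCoG formulation.

```python
# Minimal sketch of the *standard* PSO velocity/position update. The CoG term
# of equation (10) is not reproduced in the excerpts and is not shown here;
# the inertia and acceleration coefficients are assumed values.
import numpy as np

rng = np.random.default_rng(0)
w, c1, c2 = 0.7, 1.5, 1.5          # inertia and acceleration coefficients (assumed)

def update_particle(x, v, p_il, p_gs):
    """One standard PSO step: p_il = particle's best, p_gs = swarm's best."""
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + c1 * r1 * (p_il - x) + c2 * r2 * (p_gs - x)
    return x + v_new, v_new
```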
Article
Full-text available
Since the declaration of COVID-19 as a pandemic, the world stock markets have suffered huge losses, prompting investors to limit or avoid these losses. The stock market was one of the businesses that were affected the most. At the same time, artificial neural networks (ANNs) have already been used for the prediction of closing prices in stock markets. However, a standalone ANN has several limitations, resulting in lower accuracy of the prediction results. Such limitations are resolved using hybrid models. Therefore, a combination of artificial intelligence networks and particle swarm optimization for efficient stock market prediction was reported in the literature. This method predicted the closing prices of the shares traded on the stock market, allowing for the largest profit with the minimum risk. Nevertheless, the results were not that satisfactory. In order to achieve prediction with a high degree of accuracy in a short time, a new improved method called PSOCoG has been proposed in this paper. To design the neural network so as to minimize processing time and search time and maximize the accuracy of prediction, it is necessary to identify hyperparameter values with precision. PSOCoG has been employed to select the best hyperparameters in order to construct the best neural network. The created network was able to predict the closing price with high accuracy, and the proposed model ANN-PSOCoG showed that it could predict closing price values with an infinitesimal error, outperforming existing models in terms of error ratio and processing time. Using the S&P 500 dataset, ANN-PSOCoG outperformed ANN-SPSO in terms of prediction accuracy by approximately 13%, SPSOCOG by approximately 17%, SPSO by approximately 20%, and ANN by approximately 25%. Using the DJIA dataset, ANN-PSOCoG outperformed ANN-SPSO in terms of prediction accuracy by approximately 18%, SPSOCOG by approximately 24%, SPSO by approximately 33%, and ANN by approximately 42%. In addition, the proposed model was evaluated under the effect of COVID-19. The results proved the ability of the proposed model to predict the closing price with high accuracy, where the values of MAPE, MAE, and RE were very small for the S&P 500, GOLD, NASDAQ-100, and CANUSD datasets.
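The abstract reports MAPE, MAE, and RE on the closing-price predictions. A small sketch of how these error metrics are commonly computed; the paper's exact relative-error definition is not quoted, so the form used below is an assumption, and the price arrays are toy values.

```python
# Minimal sketch: common definitions of the error metrics named in the abstract.
# The paper's exact relative-error (RE) formula is not quoted, so the per-sample
# form used here is an assumption; the price arrays are toy values.
import numpy as np

actual = np.array([3800.0, 3815.5, 3790.2])      # toy closing prices
predicted = np.array([3795.4, 3820.1, 3788.0])

mae = np.mean(np.abs(actual - predicted))                         # mean absolute error
mape = np.mean(np.abs((actual - predicted) / actual)) * 100.0     # in percent
re = np.abs(actual - predicted) / np.abs(actual)                  # per-sample relative error

print(f"MAE={mae:.3f}  MAPE={mape:.3f}%  mean RE={re.mean():.5f}")
```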
Conference Paper
Convolutional neural networks are composed of many hidden layers, and each layer has its own properties and may require some parameters. Such models are trained on weights derived from the training dataset. These weights determine how the input will have an impact on the output. The efficiency of the training model can be increased by fine-tuning the parameters of each layer. Manually tuned parameters, known as hyperparameters, require a heuristic approach for defining their values. In this paper, the hyperparameters, namely learning rate, filter size, number of filters, number of layers, and validation frequency, have been studied. This paper also examines the effect of the size of the dataset and the number of layers. It is observed that the learning rate and the number of layers have more impact on the accuracy than the validation frequency.
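A minimal sketch of exposing the hyperparameters named in this abstract (learning rate, filter size, number of filters, number of layers) as arguments when building a small CNN in PyTorch. The specific values and architecture are illustrative assumptions, not the setup used in the cited paper.

```python
# Minimal sketch: exposing the studied hyperparameters (learning rate, filter
# size, number of filters, number of convolutional layers) as arguments of a
# small PyTorch CNN. Values and architecture are illustrative assumptions.
import torch
import torch.nn as nn

def build_cnn(num_layers=3, num_filters=16, filter_size=3, num_classes=10):
    layers, in_channels = [], 1                    # grayscale input assumed
    for _ in range(num_layers):
        layers += [nn.Conv2d(in_channels, num_filters, filter_size, padding="same"),
                   nn.ReLU()]
        in_channels = num_filters
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(num_filters, num_classes)]
    return nn.Sequential(*layers)

model = build_cnn(num_layers=4, num_filters=32, filter_size=5)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)   # learning rate under study
```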