MULTI-STAGE MOMENT-BASED OPTIMIZATION: ANALYSIS AND APPLICATION OF THE ADAM ALGORITHM
Abstract:
In the era of deep learning and large-scale artificial intelligence systems, efficient optimization algorithms have become increasingly important. Neural networks, particularly those with deep and complex architectures, rely heavily on gradient-based iterative methods that update model parameters by minimizing a loss function. Among these methods, the Adam (Adaptive Moment Estimation) algorithm has become a widely adopted choice owing to its adaptive per-parameter learning rates and robust convergence behavior. Introduced by D. Kingma and J. Ba in 2015, Adam combines the advantages of momentum-based Stochastic Gradient Descent (SGD) and RMSprop, addressing several limitations of traditional approaches such as fixed learning rates, slow convergence, oscillatory updates, and sensitivity to noisy gradients [1][2].
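For concreteness, the update rule summarized above can be sketched in a few lines of Python (a minimal illustration of the published Adam update from Kingma and Ba; the function name adam_step and the toy quadratic objective below are illustrative choices, not taken from the article):

import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # First moment: exponential moving average of gradients (momentum-like term).
    m = beta1 * m + (1 - beta1) * grad
    # Second moment: exponential moving average of squared gradients (RMSprop-like term).
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter adaptive step: large second moments shrink the effective step size.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta**2, whose gradient is 2*theta.
theta, m, v = np.array(5.0), 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
print(theta)  # converges close to 0

The bias-corrected moment estimates are what distinguish Adam from a plain combination of momentum and RMSprop, since they prevent the early updates from being biased toward zero.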
References:
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Bengio, Y. (2012). Practical recommendations for gradient-based training of deep architectures. In Neural Networks: Tricks of the Trade (pp. 437–478). Springer.
Zeiler, M. D. (2012). ADADELTA: An adaptive learning rate method. arXiv preprint arXiv:1212.5701.
Tieleman, T., & Hinton, G. (2012). Lecture 6.5 – RMSProp: Divide the gradient by a running average of its recent magnitude. COURSERA: Neural Networks for Machine Learning.
Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12, 2121–2159.
LeCun, Y., Bottou, L., Orr, G. B., & Müller, K. R. (2012). Efficient backprop. In Neural Networks: Tricks of the Trade (pp. 9–48). Springer.
Zhang, M. R., Lucas, J., Hinton, G., & Ba, J. (2019). Lookahead optimizer: k steps forward, 1 step back. In Advances in Neural Information Processing Systems, 32.
Bock, C., & Gumbsch, P. (2020). Optimization of deep neural networks: Recent advances and applications. Journal of Computational Science, 45, 101182.
Li, J., Li, X., & Hoi, S. C. (2018). Learning to optimize: A primer and a benchmark.
Reddi, S. J., Kale, S., & Kumar, S. (2018). On the convergence of Adam and beyond. In International Conference on Learning Representations (ICLR).
