Standard

Harvard

APA

Vancouver

Author

BibTeX

@article{8ec5fe4ae43e42989cb09e8f08efd5b5,
title = "Analysis of a Two-Step Gradient Method with Two Momentum Parameters for Strongly Convex Unconstrained Optimization",
abstract = "The paper is devoted to the theoretical and numerical analysis of the two-step method, constructed as a modification of Polyak{\textquoteright}s heavy ball method with the inclusion of an additional momentum parameter. For the quadratic case, the convergence conditions are obtained with the use of the first Lyapunov method. For the non-quadratic case, sufficiently smooth strongly convex functions are obtained, and these conditions guarantee local convergence.An approach to finding optimal parameter values based on the solution of a constrained optimization problem is proposed. The effect of an additional parameter on the convergence rate is analyzed. With the use of an ordinary differential equation, equivalent to the method, the damping effect of this parameter on the oscillations, which is typical for the non-monotonic convergence of the heavy ball method, is demonstrated. In different numerical examples for non-quadratic convex and non-convex test functions and machine learning problems (regularized smoothed elastic net regression, logistic regression, and recurrent neural network training), the positive influence of an additional parameter value on the convergence process is demonstrated.",
keywords = "convex optimization, gradient descent, heavy ball method",
author = "Кривовичев, {Герасим Владимирович} and Сергеева, {Валентина Юрьевна}",
year = "2024",
month = mar,
day = "18",
doi = "10.3390/a17030126",
language = "English",
volume = "17",
journal = "Algorithms",
issn = "1999-4893",
publisher = "MDPI AG",
number = "3",

}

RIS

TY - JOUR

T1 - Analysis of a Two-Step Gradient Method with Two Momentum Parameters for Strongly Convex Unconstrained Optimization

AU - Кривовичев, Герасим Владимирович

AU - Сергеева, Валентина Юрьевна

PY - 2024/3/18

Y1 - 2024/3/18

N2 - The paper is devoted to the theoretical and numerical analysis of the two-step method, constructed as a modification of Polyak’s heavy ball method with the inclusion of an additional momentum parameter. For the quadratic case, the convergence conditions are obtained with the use of the first Lyapunov method. For the non-quadratic case, sufficiently smooth strongly convex functions are obtained, and these conditions guarantee local convergence.An approach to finding optimal parameter values based on the solution of a constrained optimization problem is proposed. The effect of an additional parameter on the convergence rate is analyzed. With the use of an ordinary differential equation, equivalent to the method, the damping effect of this parameter on the oscillations, which is typical for the non-monotonic convergence of the heavy ball method, is demonstrated. In different numerical examples for non-quadratic convex and non-convex test functions and machine learning problems (regularized smoothed elastic net regression, logistic regression, and recurrent neural network training), the positive influence of an additional parameter value on the convergence process is demonstrated.

AB - The paper is devoted to the theoretical and numerical analysis of the two-step method, constructed as a modification of Polyak’s heavy ball method with the inclusion of an additional momentum parameter. For the quadratic case, the convergence conditions are obtained with the use of the first Lyapunov method. For the non-quadratic case, sufficiently smooth strongly convex functions are obtained, and these conditions guarantee local convergence.An approach to finding optimal parameter values based on the solution of a constrained optimization problem is proposed. The effect of an additional parameter on the convergence rate is analyzed. With the use of an ordinary differential equation, equivalent to the method, the damping effect of this parameter on the oscillations, which is typical for the non-monotonic convergence of the heavy ball method, is demonstrated. In different numerical examples for non-quadratic convex and non-convex test functions and machine learning problems (regularized smoothed elastic net regression, logistic regression, and recurrent neural network training), the positive influence of an additional parameter value on the convergence process is demonstrated.

KW - convex optimization

KW - gradient descent

KW - heavy ball method

UR - https://www.mendeley.com/catalogue/e2c44057-5a9e-357a-a111-aade3a86dc1a/

U2 - 10.3390/a17030126

DO - 10.3390/a17030126

M3 - Article

VL - 17

JO - Algorithms

JF - Algorithms

SN - 1999-4893

IS - 3

M1 - 126

ER -

ID: 118253583