Math Problem Statement

To find the optimal step length for an iteration of steepest descent or Newton's method, we need to find the minimizer of the univariate function φ(α) = f(x_k + αp_k). Show that the optimal step length for a strongly convex quadratic function is given by

α = − (∇f_k^T p_k) / (p_k^T Q p_k)   (3.55)

where f is strongly convex and quadratic, i.e. f(x) = 1/2 x^T Q x − b^T x, with Q symmetric and positive definite.

Solution

To find the optimal step length α for an iteration of steepest descent or Newton's method, we need to minimize the univariate function φ(α), defined as:

φ(α) = f(x_k + αp_k)

where f(x) is a strongly convex quadratic function. The general form of a strongly convex quadratic function is:

f(x) = 1/2 x^T Q x − b^T x

where Q is a symmetric positive definite matrix and b is a vector.
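As a quick numerical sketch, this definition can be coded directly. The matrix Q, vector b, and dimensions below are made-up example values, not from the text; any symmetric positive definite Q works.

```python
import numpy as np

# Assumed example data: a small symmetric positive definite Q and a vector b.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def f(x):
    """Strongly convex quadratic: f(x) = 1/2 x^T Q x - b^T x."""
    return 0.5 * x @ Q @ x - b @ x

def grad_f(x):
    """Gradient of f: grad f(x) = Q x - b."""
    return Q @ x - b
```

Positive definiteness of Q guarantees a unique minimizer x*, which satisfies ∇f(x*) = Qx* − b = 0.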

Step 1: Define φ(α)

We start by expressing φ(α):

φ(α) = f(x_k + αp_k)

Substitute the quadratic form of f(x) into this expression:

φ(α) = 1/2 (x_k + αp_k)^T Q (x_k + αp_k) − b^T (x_k + αp_k)

Step 2: Expand φ(α)

Expand the quadratic and linear terms:

φ(α) = 1/2 [x_k^T Q x_k + 2α x_k^T Q p_k + α^2 p_k^T Q p_k] − [b^T x_k + α b^T p_k]

Rearrange the terms:

φ(α) = 1/2 x_k^T Q x_k − b^T x_k + α x_k^T Q p_k − α b^T p_k + 1/2 α^2 p_k^T Q p_k

Since ∇f(x_k) = Qx_k − b and Q is symmetric, we have x_k^T Q p_k − b^T p_k = (Qx_k − b)^T p_k = p_k^T ∇f(x_k). Noting also that 1/2 x_k^T Q x_k − b^T x_k = f(x_k), we can rewrite the expression:

φ(α) = f(x_k) + α p_k^T ∇f(x_k) + 1/2 α^2 p_k^T Q p_k
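This expansion can be checked numerically. In the sketch below, Q, b, and the iterate x_k are assumed example values, and p_k is taken to be the steepest-descent direction (any direction would do): the expanded quadratic in α agrees with f(x_k + αp_k).

```python
import numpy as np

# Assumed example data (any symmetric positive definite Q works).
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

f = lambda x: 0.5 * x @ Q @ x - b @ x   # f(x) = 1/2 x^T Q x - b^T x
grad = lambda x: Q @ x - b              # grad f(x) = Q x - b

x_k = np.array([2.0, -1.0])             # arbitrary current iterate
p_k = -grad(x_k)                        # steepest-descent direction

def phi(alpha):
    """phi(alpha) = f(x_k + alpha p_k), evaluated directly."""
    return f(x_k + alpha * p_k)

def phi_expanded(alpha):
    """Expanded form: f(x_k) + alpha p_k^T grad f(x_k) + 1/2 alpha^2 p_k^T Q p_k."""
    return f(x_k) + alpha * (p_k @ grad(x_k)) + 0.5 * alpha**2 * (p_k @ Q @ p_k)
```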

Step 3: Minimize φ(α)

To find the optimal step length, we minimize φ(α) with respect to α. Since Q is positive definite, p_k^T Q p_k > 0, so φ is a strictly convex quadratic in α and its unique minimizer is its stationary point. We take the derivative of φ(α) with respect to α:

d/dα φ(α) = p_k^T ∇f(x_k) + α p_k^T Q p_k

Set this derivative equal to zero to find the minimizer:

p_k^T ∇f(x_k) + α p_k^T Q p_k = 0

Solve for α:

α = − (p_k^T ∇f(x_k)) / (p_k^T Q p_k)
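A minimal check of this formula, using the same kind of assumed example data as above (Q, b, and x_k are illustrative values): computing α this way makes the derivative φ′(α) vanish.

```python
import numpy as np

# Assumed example data.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda x: Q @ x - b

x_k = np.array([2.0, -1.0])
p_k = -grad(x_k)                       # steepest-descent direction

# Optimal step length: alpha = -(p_k^T grad f(x_k)) / (p_k^T Q p_k)
alpha = -(p_k @ grad(x_k)) / (p_k @ Q @ p_k)

# phi'(alpha) = p_k^T grad f(x_k) + alpha p_k^T Q p_k should be zero here.
phi_prime = p_k @ grad(x_k) + alpha * p_k @ Q @ p_k
```

Because p_k is a descent direction (p_k^T ∇f(x_k) < 0) and p_k^T Q p_k > 0, the resulting α is positive.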

Conclusion

Thus, the optimal step length for a strongly convex quadratic function is:

α = − (p_k^T ∇f(x_k)) / (p_k^T Q p_k)

This step length exactly minimizes φ(α) = f(x_k + αp_k) along the search direction p_k.
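Plugging this step length into steepest descent gives an exact line-search method. In the sketch below (Q, b, and the starting point are assumed example values), the iterates converge to the true minimizer x*, which satisfies Qx* = b.

```python
import numpy as np

# Assumed example data; the starting point is arbitrary.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.array([10.0, -10.0])
for _ in range(200):
    g = Q @ x - b                        # grad f(x)
    if np.linalg.norm(g) < 1e-12:
        break                            # gradient ~ 0: at the minimizer
    p = -g                               # steepest-descent direction
    alpha = -(p @ g) / (p @ Q @ p)       # exact line search, eq. (3.55)
    x = x + alpha * p

x_star = np.linalg.solve(Q, b)           # true minimizer solves Q x = b
```

Because Q here is well conditioned, only a few dozen iterations are needed; for badly conditioned Q, steepest descent with exact line search converges much more slowly.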

Would you like further details on any part of this explanation?


Here are some related questions to expand your understanding:

  1. How does the steepest descent method differ from Newton's method in terms of convergence rate?
  2. What role does the matrix Q play in the steepest descent algorithm for quadratic functions?
  3. Why is it important that Q is positive definite in this context?
  4. How does the optimal step length formula change if f(x) is not quadratic but still strongly convex?
  5. Can the method be generalized to non-convex functions? If so, how would the step length be determined?

Tip: Always check that the matrix Q is positive definite when applying methods that assume strong convexity, as this guarantees the function has a unique minimizer.


Math Problem Analysis

Mathematical Concepts

Optimization
Steepest Descent Method
Newton's Method
Strong Convexity
Quadratic Functions

Formulas

φ(α) = f(x_k + αp_k)
f(x) = 1/2 x^T Q x - b^T x
α = - (p_k^T ∇f(x_k)) / (p_k^T Q p_k)

Theorems

First-order optimality condition
Strong convexity property

Suitable Grade Level

Undergraduate-level Mathematics or Optimization