Math Problem Statement

To find the optimal step length for an iteration of steepest descent or Newton's method, we need to find the minimizer of the univariate function φ(α) = f(x_k + αp_k). Show that the optimal step length for a strongly convex quadratic function is given by

α = − (∇f_k^T p_k) / (p_k^T Q p_k)   (3.55)

where f is strongly convex and quadratic, i.e. f(x) = 1/2 x^T Q x − b^T x, with Q symmetric and positive definite.

Solution

To find the optimal step length α for an iteration of steepest descent or Newton's method, we need to minimize the univariate function φ(α), defined as:

φ(α) = f(x_k + αp_k)

where f(x) is a strongly convex quadratic function. The general form of a strongly convex quadratic function is:

f(x) = 1/2 x^T Q x − b^T x

where Q is a symmetric positive definite matrix and b is a vector.
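As a quick numerical sketch, this definition can be coded directly. The matrix Q, vector b, and dimensions below are made-up example values, not from the text; any symmetric positive definite Q works.

```python
import numpy as np

# Assumed example data: a small symmetric positive definite Q and a vector b.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def f(x):
    """Strongly convex quadratic: f(x) = 1/2 x^T Q x - b^T x."""
    return 0.5 * x @ Q @ x - b @ x

def grad_f(x):
    """Gradient of f: grad f(x) = Q x - b."""
    return Q @ x - b
```

Positive definiteness of Q guarantees a unique minimizer x*, which satisfies ∇f(x*) = Qx* − b = 0.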

Step 1: Define φ(α)

We start by expressing φ(α):

φ(α) = f(x_k + αp_k)

Substitute the quadratic form of f(x) into this expression:

φ(α) = 1/2 (x_k + αp_k)^T Q (x_k + αp_k) − b^T (x_k + αp_k)

Step 2: Expand φ(α)

Expand the quadratic and linear terms:

φ(α) = 1/2 [x_k^T Q x_k + 2α x_k^T Q p_k + α^2 p_k^T Q p_k] − [b^T x_k + α b^T p_k]

Rearrange the terms:

φ(α) = 1/2 x_k^T Q x_k − b^T x_k + α x_k^T Q p_k − α b^T p_k + 1/2 α^2 p_k^T Q p_k

Since ∇f(x_k) = Qx_k − b and Q is symmetric, we have x_k^T Q p_k − b^T p_k = (Qx_k − b)^T p_k = p_k^T ∇f(x_k). Noting also that 1/2 x_k^T Q x_k − b^T x_k = f(x_k), we can rewrite the expression:

φ(α) = f(x_k) + α p_k^T ∇f(x_k) + 1/2 α^2 p_k^T Q p_k
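This expansion can be checked numerically. In the sketch below, Q, b, and the iterate x_k are assumed example values, and p_k is taken to be the steepest-descent direction (any direction would do): the expanded quadratic in α agrees with f(x_k + αp_k).

```python
import numpy as np

# Assumed example data (any symmetric positive definite Q works).
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

f = lambda x: 0.5 * x @ Q @ x - b @ x   # f(x) = 1/2 x^T Q x - b^T x
grad = lambda x: Q @ x - b              # grad f(x) = Q x - b

x_k = np.array([2.0, -1.0])             # arbitrary current iterate
p_k = -grad(x_k)                        # steepest-descent direction

def phi(alpha):
    """phi(alpha) = f(x_k + alpha p_k), evaluated directly."""
    return f(x_k + alpha * p_k)

def phi_expanded(alpha):
    """Expanded form: f(x_k) + alpha p_k^T grad f(x_k) + 1/2 alpha^2 p_k^T Q p_k."""
    return f(x_k) + alpha * (p_k @ grad(x_k)) + 0.5 * alpha**2 * (p_k @ Q @ p_k)
```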

Step 3: Minimize φ(α)

To find the optimal step length, we minimize φ(α) with respect to α. Since Q is positive definite, p_k^T Q p_k > 0, so φ is a strictly convex quadratic in α and its unique minimizer is its stationary point. We take the derivative of φ(α) with respect to α:

d/dα φ(α) = p_k^T ∇f(x_k) + α p_k^T Q p_k

Set this derivative equal to zero to find the minimizer:

p_k^T ∇f(x_k) + α p_k^T Q p_k = 0

Solve for α:

α = − (p_k^T ∇f(x_k)) / (p_k^T Q p_k)
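A minimal check of this formula, using the same kind of assumed example data as above (Q, b, and x_k are illustrative values): computing α this way makes the derivative φ′(α) vanish.

```python
import numpy as np

# Assumed example data.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
grad = lambda x: Q @ x - b

x_k = np.array([2.0, -1.0])
p_k = -grad(x_k)                       # steepest-descent direction

# Optimal step length: alpha = -(p_k^T grad f(x_k)) / (p_k^T Q p_k)
alpha = -(p_k @ grad(x_k)) / (p_k @ Q @ p_k)

# phi'(alpha) = p_k^T grad f(x_k) + alpha p_k^T Q p_k should be zero here.
phi_prime = p_k @ grad(x_k) + alpha * p_k @ Q @ p_k
```

Because p_k is a descent direction (p_k^T ∇f(x_k) < 0) and p_k^T Q p_k > 0, the resulting α is positive.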

Conclusion

Thus, the optimal step length for a strongly convex quadratic function is:

α = − (p_k^T ∇f(x_k)) / (p_k^T Q p_k)

This step length exactly minimizes φ(α) = f(x_k + αp_k) along the search direction p_k.
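Plugging this step length into steepest descent gives an exact line-search method. In the sketch below (Q, b, and the starting point are assumed example values), the iterates converge to the true minimizer x*, which satisfies Qx* = b.

```python
import numpy as np

# Assumed example data; the starting point is arbitrary.
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x = np.array([10.0, -10.0])
for _ in range(200):
    g = Q @ x - b                        # grad f(x)
    if np.linalg.norm(g) < 1e-12:
        break                            # gradient ~ 0: at the minimizer
    p = -g                               # steepest-descent direction
    alpha = -(p @ g) / (p @ Q @ p)       # exact line search, eq. (3.55)
    x = x + alpha * p

x_star = np.linalg.solve(Q, b)           # true minimizer solves Q x = b
```

Because Q here is well conditioned, only a few dozen iterations are needed; for badly conditioned Q, steepest descent with exact line search converges much more slowly.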

Would you like further details on any part of this explanation?


Here are some related questions to expand your understanding:

  1. How does the steepest descent method differ from Newton's method in terms of convergence rate?
  2. What role does the matrix Q play in the steepest descent algorithm for quadratic functions?
  3. Why is it important that Q is positive definite in this context?
  4. How does the optimal step length formula change if f(x) is not quadratic but still strongly convex?
  5. Can the method be generalized to non-convex functions? If so, how would the step length be determined?

Tip: Always check that the matrix Q is positive definite when applying methods that assume strong convexity, as this guarantees the function has a unique minimizer.


Math Problem Analysis

Mathematical Concepts

Optimization
Steepest Descent Method
Newton's Method
Strong Convexity
Quadratic Functions

Formulas

φ(α) = f(x_k + αp_k)
f(x) = 1/2 x^T Q x - b^T x
α = - (p_k^T ∇f(x_k)) / (p_k^T Q p_k)

Theorems

First-order optimality condition
Strong convexity property

Suitable Grade Level

Undergraduate-level Mathematics or Optimization