A Modified Fletcher-Reeves-Type Method for Nonsmooth Convex Minimization

Conjugate gradient methods are efficient for smooth optimization problems, yet conjugate gradient based methods for solving possibly nondifferentiable convex minimization problems are rare. In this paper, by making full use of the inherent properties of the Moreau-Yosida regularization and the descent property of a modified conjugate gradient method, we propose a modified Fletcher-Reeves-type method for nonsmooth convex minimization. Owing to its low storage requirement, it can be applied to large-scale nonsmooth convex minimization problems. The algorithm is globally convergent under mild conditions.


Introduction
Let f : R^n → R be a possibly nondifferentiable convex function and consider the unconstrained optimization problem

min_{x ∈ R^n} f(x).   (1)

Associated with problem (1) is the problem

min_{x ∈ R^n} F(x),   (2)

where F is the Moreau-Yosida regularization of f,

F(x) = min_{z ∈ R^n} { f(z) + (1/(2λ)) ∥z − x∥² },

λ is a positive parameter and ∥ · ∥ denotes the Euclidean norm.
The function F has some good properties: problems (1) and (2) are equivalent in the sense that the solution sets of the two problems coincide [9]; F is a differentiable convex function, even though the function f may be nondifferentiable; moreover, F has a Lipschitz continuous gradient [9]. These features motivate us to solve problem (1) through the Moreau-Yosida regularization, particularly when f is nondifferentiable.
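These properties can be seen concretely for the scalar function f(x) = |x|, whose Moreau-Yosida regularization has a closed form (the Huber function). The following sketch is our own illustration, not part of the paper's algorithm; the helper names are ours.

```python
import math

# Illustrative sketch (not from the paper): the Moreau-Yosida regularization of
# f(x) = |x| in one dimension. The minimizer p(x) of
#   theta(z) = |z| + (z - x)^2 / (2*lam)
# is the soft-thresholding operator, and F turns out to be the smooth Huber function.

def prox_abs(x, lam):
    """p(x) for f = |.|: soft-thresholding with threshold lam."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def moreau_abs(x, lam):
    """F(x) = |p(x)| + (p(x) - x)^2 / (2*lam)."""
    p = prox_abs(x, lam)
    return abs(p) + (p - x) ** 2 / (2.0 * lam)

def grad_moreau(x, lam):
    """g(x) = (x - p(x)) / lam; globally Lipschitz with modulus 1/lam."""
    return (x - prox_abs(x, lam)) / lam

# F is differentiable at x = 0 (where |x| is not), g(0) = 0, and the minimizers
# of f and F coincide at x = 0, as stated above.
```

For |x| ≥ λ one gets F(x) = |x| − λ/2, and for |x| < λ one gets F(x) = x²/(2λ): the kink of |x| at the origin is smoothed out while the minimizer is preserved.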
Conjugate gradient methods are popular methods for smooth unconstrained optimization problems. They are particularly efficient for large-scale problems due to their simplicity and low storage [8]. To the best of our knowledge, few conjugate gradient based methods exist for nonsmooth convex minimization, which motivates this paper. We propose a conjugate gradient based method for minimizing the Moreau-Yosida regularization F, with a line search on an approximate value of the function F instead of its exact value. The line search rule is different from those in [7,16]. In this paper, we focus on the MFR method, a descent conjugate gradient method proposed by Zhang, Zhou and Li [20] for unconstrained optimization. Under mild conditions, we prove the global convergence of the method. Note that we do not require the strong convexity assumption on the objective function f, as [16] and [17] required. Recently, Yuan, Wei and Li [18] proposed a modified Polak-Ribière-Polyak conjugate gradient algorithm for nonsmooth convex programs. That paper and this one share a common feature: both propose algorithms for problem (1) by means of the Moreau-Yosida regularization, with search directions satisfying the sufficient descent property, but they use different line search techniques.
The paper is organized as follows. In Section 2, we derive the algorithm. Section 3 is devoted to proving its global convergence. The last section contains some concluding remarks.

QIONG LI

Throughout this paper, ⟨·, ·⟩ denotes the inner product of two vectors, and g(x) denotes the gradient of F(x).

Derivation of MFR Type Algorithm
In this section, we first recall some basic results from convex analysis which are useful in the subsequent discussions. For a fixed x ∈ R^n, let θ : R^n → R be the function

θ(z) = f(z) + (1/(2λ)) ∥z − x∥².

Since (1/(2λ)) ∥z − x∥² is strongly convex and f(z) is convex, by the definition of a strongly convex function we know that θ(z) is strongly convex. Hence

p(x) = argmin_{z ∈ R^n} θ(z)

is well defined and unique. Then F(x) can be expressed by

F(x) = f(p(x)) + (1/(2λ)) ∥p(x) − x∥².

Some features of F(x) can be seen in [9].

Properties
1. The function F is finite-valued, convex and everywhere differentiable, with gradient

g(x) = (x − p(x))/λ.

Moreover, the gradient mapping g : R^n → R^n is globally Lipschitz continuous with modulus 1/λ, i.e.,

∥g(x) − g(y)∥ ≤ (1/λ) ∥x − y∥  for all x, y ∈ R^n.

2. x is an optimal solution to (1) if and only if g(x) = 0, namely p(x) = x.
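Property 1 can be checked numerically on a simple instance. In the sketch below (our own illustration, with f = |·| so that p(x) has a closed soft-thresholding form), the closed-form gradient g(x) = (x − p(x))/λ is compared against a central finite difference of F.

```python
import math

# Our illustration: verify g(x) = (x - p(x)) / lam numerically for f(x) = |x|,
# where p(x) is soft-thresholding and F is evaluated at the minimizer p(x).

def prox_abs(x, lam):
    return math.copysign(max(abs(x) - lam, 0.0), x)

def F(x, lam):
    p = prox_abs(x, lam)
    return abs(p) + (p - x) ** 2 / (2.0 * lam)

def g(x, lam):
    return (x - prox_abs(x, lam)) / lam

lam, h = 0.5, 1e-6
errors = []
for x in (-2.0, -0.3, 0.0, 0.25, 3.0):
    fd = (F(x + h, lam) - F(x - h, lam)) / (2.0 * h)   # central difference
    errors.append(abs(fd - g(x, lam)))
# Optimality characterization (item 2): at the minimizer x = 0 of f,
# p(0) = 0 and g(0) = 0.
```

The sample points are kept away from |x| = λ, where F switches between its quadratic and linear pieces, so the finite difference is accurate.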
It is obvious that F(x) and g(x) can be obtained from p(x). However, p(x) is difficult or even impossible to compute exactly. Fortunately, for each x ∈ R^n and any ε > 0, there exists a vector p^a(x, ε) ∈ R^n, where the superscript a means approximation, such that

f(p^a(x, ε)) + (1/(2λ)) ∥p^a(x, ε) − x∥² ≤ F(x) + ε.

Hence, we can use p^a(x, ε) to define approximations of F(x) and g(x) by

F^a(x, ε) = f(p^a(x, ε)) + (1/(2λ)) ∥p^a(x, ε) − x∥²

and

g^a(x, ε) = (x − p^a(x, ε))/λ.

Implementable algorithms that are designed to find p^a(x, ε) are introduced in [1,5,6]. The following proposition, deriving from Fukushima and Qi [7], shows that with p^a(x, ε) we can compute approximations F^a(x, ε) and g^a(x, ε) to F(x) and g(x), respectively, with any desired accuracy.

Stat., Optim. Inf. Comput. Vol. 2, September 2014.
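For an f with no closed-form proximal point, p^a(x, ε) must come from an inner solver as in [1,5,6]. The toy sketch below (our own stand-in, not one of the cited schemes) uses a ternary search on the strongly convex function θ for f = |·|, then checks the defining inequality F^a(x, ε) ≤ F(x) + ε against the known closed form.

```python
import math

# Our toy stand-in for an inner solver computing p^a(x, eps): ternary search on
# theta(z) = |z| + (z - x)^2 / (2*lam), which is strongly convex, hence unimodal.
# Real implementable schemes are those cited in [1,5,6]; this is only a sketch.

def prox_approx(x, lam, tol=1e-12):
    theta = lambda z: abs(z) + (z - x) ** 2 / (2.0 * lam)
    lo, hi = min(0.0, x), max(0.0, x)  # for f = |.| the minimizer lies between 0 and x
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if theta(m1) <= theta(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def F_approx(x, lam, eps):
    p = prox_approx(x, lam)
    return abs(p) + (p - x) ** 2 / (2.0 * lam)    # F^a(x, eps)

def g_approx(x, lam, eps):
    return (x - prox_approx(x, lam)) / lam        # g^a(x, eps)
```

Here the inner solve is simply run to high precision, so the ε-accuracy requirement holds trivially; a practical scheme would instead stop the inner iteration as soon as the tolerance ε is guaranteed.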
Algorithm 1 (MFR type algorithm)

Step 0. Given x_0 ∈ R^n, λ > 0, ε_0 > 0, ρ ∈ (0, 1) and δ > 0, set k := 0.

Step 1. Compute p^a(x_k, ε_k). Compute the search direction

d_k = −g^a(x_k, ε_k)  if k = 0;
d_k = −θ_k g^a(x_k, ε_k) + β_k d_{k−1}  if k ≥ 1,   (8)

where

β_k = ∥g^a(x_k, ε_k)∥² / ∥g^a(x_{k−1}, ε_{k−1})∥²,
θ_k = ⟨d_{k−1}, g^a(x_k, ε_k) − g^a(x_{k−1}, ε_{k−1})⟩ / ∥g^a(x_{k−1}, ε_{k−1})∥².

Step 2. Choose a scalar ε_{k+1} such that 0 < ε_{k+1} < ε_k. Let i_k be the smallest nonnegative integer i such that

F^a(x_k + ρ^i d_k, ε_{k+1}) ≤ F^a(x_k, ε_k) − δ ρ^{2i} ∥d_k∥².   (9)

Set α_k := ρ^{i_k}, x_{k+1} := x_k + α_k d_k, k := k + 1, and go to Step 1.
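To make the algorithm concrete, here is a runnable sketch (our own illustration) on the separable function f(x) = Σ_i |x_i|, for which the proximal point is exact soft-thresholding, so the approximation error ε_k plays no role; the decrease test in the backtracking loop is the Armijo-type condition of Step 2, and the values of ρ and δ are illustrative choices, not values prescribed by the paper.

```python
import math

# Runnable sketch of the MFR-type method on f(x) = sum_i |x_i|.
# Here p(x) is exact (soft-thresholding), so F^a = F and g^a = g; rho and delta
# are illustrative parameter choices.

def soft(x, lam):
    return [math.copysign(max(abs(t) - lam, 0.0), t) for t in x]

def F(x, lam):
    p = soft(x, lam)
    return (sum(abs(t) for t in p)
            + sum((pi - xi) ** 2 for pi, xi in zip(p, x)) / (2.0 * lam))

def grad(x, lam):
    return [(xi - pi) / lam for pi, xi in zip(soft(x, lam), x)]

def mfr(x0, lam=1.0, rho=0.5, delta=0.1, iters=50):
    x = list(x0)
    g = grad(x, lam)
    d = [-t for t in g]                       # d_0 = -g_0
    hist = [F(x, lam)]
    for _ in range(iters):
        dn2 = sum(t * t for t in d)
        if dn2 == 0.0:
            break                             # stationary point: g = 0
        alpha = 1.0
        for _ in range(60):                   # Armijo-type backtracking (Step 2)
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            if F(trial, lam) <= hist[-1] - delta * alpha ** 2 * dn2:
                break
            alpha *= rho
        x = trial
        g_new = grad(x, lam)
        gn2 = sum(t * t for t in g)
        beta = sum(t * t for t in g_new) / gn2            # FR parameter beta_k
        theta = sum(di * (a - b) for di, a, b in zip(d, g_new, g)) / gn2
        d = [-theta * a + beta * di for a, di in zip(g_new, d)]   # direction (8)
        g = g_new
        hist.append(F(x, lam))
    return x, hist
```

Starting from x_0 = (3, −2), the recorded values of F decrease toward the minimum value 0 attained at the origin, the common minimizer of f and F.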

Proposition 2
The search direction (8) in Algorithm 1 satisfies

⟨g^a(x_k, ε_k), d_k⟩ = −∥g^a(x_k, ε_k)∥²  for all k.

Proof. We omit the details.

The following proposition ensures that, at each iteration k of the algorithm, α_k is well defined and can be determined finitely in Step 2.
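Proposition 2 can be sanity-checked numerically. In the sketch below (our own), g_prev and d_prev are chosen so that the identity holds at step k−1, which is the inductive hypothesis behind the proposition.

```python
# Our numeric sanity check of Proposition 2: if <g_{k-1}, d_{k-1}> = -||g_{k-1}||^2,
# then the MFR direction (8) satisfies <g_k, d_k> = -||g_k||^2 as well.

def mfr_direction(g, g_prev, d_prev):
    gn2_prev = sum(t * t for t in g_prev)
    beta = sum(t * t for t in g) / gn2_prev                       # beta_k (FR)
    theta = sum(dp * (a - b) for dp, a, b in zip(d_prev, g, g_prev)) / gn2_prev
    return [-theta * a + beta * dp for a, dp in zip(g, d_prev)]

g_prev = [1.0, 2.0]
d_prev = [-1.0, -2.0]     # <g_prev, d_prev> = -5 = -||g_prev||^2
g_k = [3.0, -1.0]
d_k = mfr_direction(g_k, g_prev, d_prev)
```

Plugging in gives d_k = (−4.4, −3.2) and ⟨g_k, d_k⟩ = −∥g_k∥² = −10, up to rounding.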

Global Convergence of MFR Type Algorithm
In this section, we prove the global convergence of Algorithm 1 under the following assumption.

Assumption 1. The level set Ω = {x ∈ R^n : F(x) ≤ F(x_0)} is bounded, and f is bounded from below.

We note that this assumption is a weaker condition than the strong convexity of f as required in [16,17], which can be verified by the fact that the property of strong convexity of f is transmitted to the Moreau-Yosida regularization F: if f is strongly convex, then F is strongly convex [11] (Theorem 2.2), so we can deduce that the strong convexity of f implies the boundedness of Ω. It is clear that the sequence {x_k} generated by Algorithm 1 is contained in Ω. Combining this assumption with the Lipschitz continuity of the gradient g, we have that there exists a constant γ > 0 such that

∥g(x)∥ ≤ γ  for all x ∈ Ω.

Combining this bound with Proposition 1, we obtain that there exists a constant γ_1 > 0 such that

∥g^a(x_k, ε_k)∥ ≤ γ_1  for all k.

Lemma 1
Let {x_k} and {d_k} be generated by Algorithm 1. If the sequence {ε_k} of strictly decreasing positive numbers satisfies the condition

Σ_{k=0}^∞ ε_k < ∞,

then the whole sequence {F^a(x_k, ε_k)} is convergent, and

Σ_{k=0}^∞ α_k² ∥d_k∥² < ∞.

Proof. By the line search rule (9), it holds that

F^a(x_{k+1}, ε_{k+1}) ≤ F^a(x_k, ε_k) − δ α_k² ∥d_k∥²,

and hence

δ α_k² ∥d_k∥² ≤ F^a(x_k, ε_k) − F^a(x_{k+1}, ε_{k+1}),

so the sequence {F^a(x_k, ε_k)} is nonincreasing. On the other hand, f is bounded from below by assumption, and hence F is also bounded from below. Since

F(x_k) ≤ F^a(x_k, ε_k) ≤ F(x_k) + ε_k,

the sequence {F^a(x_k, ε_k)} is bounded from below. Therefore the sequence {F^a(x_k, ε_k)} has at least one accumulation point. In fact, it can be shown, in a way similar to the first part of the proof of Theorem 4.1 in [7] and using Σ ε_k < ∞, that the whole sequence {F^a(x_k, ε_k)} is convergent. Applying the first inequality above recursively, we have

δ Σ_{i=0}^k α_i² ∥d_i∥² ≤ F^a(x_0, ε_0) − F^a(x_{k+1}, ε_{k+1}).

Since the whole sequence {F^a(x_k, ε_k)} is convergent, by taking the limit as k → ∞ we obtain Σ_{k=0}^∞ α_k² ∥d_k∥² < ∞.

Lemma 2
Let {x_k} and {d_k} be generated by Algorithm 1. Then there exists a constant c > 0 such that

α_k ≥ c ∥g^a(x_k, ε_k)∥² / ∥d_k∥²  for all k.

Proof. Now we prove this bound by considering the following two cases.
Case 1. α_k = 1. We get from Proposition 2 that

∥g^a(x_k, ε_k)∥² = −⟨g^a(x_k, ε_k), d_k⟩ ≤ ∥g^a(x_k, ε_k)∥ ∥d_k∥,

so that ∥g^a(x_k, ε_k)∥ ≤ ∥d_k∥, and hence α_k = 1 ≥ ∥g^a(x_k, ε_k)∥² / ∥d_k∥².
Case 2.
α_k < 1. By the line search step, i.e., Step 2 of Algorithm 1, ρ^{−1}α_k does not satisfy inequality (9). This means

F^a(x_k + ρ^{−1}α_k d_k, ε_{k+1}) > F^a(x_k, ε_k) − δ (ρ^{−1}α_k)² ∥d_k∥².

By the mean-value theorem, there is a t_k ∈ (0, 1) such that

F(x_k + ρ^{−1}α_k d_k) − F(x_k) = ρ^{−1}α_k ⟨g(x_k + t_k ρ^{−1}α_k d_k), d_k⟩
= ρ^{−1}α_k ⟨g(x_k), d_k⟩ + ρ^{−1}α_k ⟨g(x_k + t_k ρ^{−1}α_k d_k) − g(x_k), d_k⟩
≤ ρ^{−1}α_k ⟨g(x_k), d_k⟩ + (1/λ)(ρ^{−1}α_k)² ∥d_k∥²,

where the last inequality uses the Lipschitz continuity of g. Substituting this into the violated inequality above, using the bounds relating F^a and g^a to F and g up to ε_k, and invoking the sufficient descent property of Proposition 2, we obtain

α_k ≥ c ∥g^a(x_k, ε_k)∥² / ∥d_k∥²

for some constant c > 0 independent of k. Hence the desired bound holds.

Theorem 1
Assume the conditions of Lemma 1 hold. We have

lim inf_{k→∞} ∥g(x_k)∥ = 0.

Proof. For the sake of contradiction, we suppose that the conclusion is not true. Then there exists a constant ϵ > 0 such that

∥g(x_k)∥ ≥ ϵ  for all k.

From the bound relating g^a to g, and since ε_k → 0, there exists a constant ϵ* > 0 such that

∥g^a(x_k, ε_k)∥ ≥ ϵ*  for all k.

We get from (8) that

∥d_k∥² = β_k² ∥d_{k−1}∥² − 2 θ_k ⟨g^a(x_k, ε_k), d_k⟩ − θ_k² ∥g^a(x_k, ε_k)∥².

Dividing both sides of this equality by ⟨g^a(x_k, ε_k), d_k⟩² = ∥g^a(x_k, ε_k)∥⁴, we get from Proposition 2 and the definition of β_k that

∥d_k∥² / ∥g^a(x_k, ε_k)∥⁴
= ∥d_{k−1}∥² / ∥g^a(x_{k−1}, ε_{k−1})∥⁴ − (θ_k − 1)² / ∥g^a(x_k, ε_k)∥² + 1 / ∥g^a(x_k, ε_k)∥²
≤ ∥d_{k−1}∥² / ∥g^a(x_{k−1}, ε_{k−1})∥⁴ + 1 / ∥g^a(x_k, ε_k)∥²
≤ Σ_{i=0}^k 1 / ∥g^a(x_i, ε_i)∥² ≤ (k + 1) / (ϵ*)².

The last inequality, together with the lower bound on α_k established above, implies that

Σ_{k=0}^∞ α_k² ∥d_k∥² ≥ Σ_{k=0}^∞ c² ∥g^a(x_k, ε_k)∥⁴ / ∥d_k∥² ≥ Σ_{k=0}^∞ c² (ϵ*)⁴ / ((k + 1) γ_1²) = ∞,

which contradicts Lemma 1. The proof is then complete.
Conclusions
In this paper, via the Moreau-Yosida regularization, and by introducing a new line search on the approximation to the Moreau-Yosida regularization, we extend the MFR method for smooth unconstrained optimization to nonsmooth convex minimization. Global convergence is established under mild conditions.