The Study of Coordinate-Wise Decomposition Descent Method for Optimization Problems

The aim of this paper is to consider a general non-stationary optimization problem whose objective function need not be smooth and for which only approximation sequences, rather than exact values of the functions, are known. We apply a two-step technique in which approximate solutions of a sequence of generalized mixed variational inequality problems (GMVIPs) are inserted into the iterations of a selective coordinate-wise decomposition descent method. Convergence is established under coercivity-type assumptions.


Introduction
This paper presents a systematic approach to non-stationary optimization problems via a class of non-stationary variational inequalities. The theoretical results are delivered in the general framework of abstract inequalities in an abstract space. The main feature of such general non-stationary optimization problems is that they are governed by a target function that need not be smooth and for which only approximation sequences, rather than exact values, are known.
Variational inequalities appear in a variety of mathematical, physical and mechanical problems, for example, unilateral contact problems in nonlinear elasticity, problems describing adhesive and friction effects, nonconvex semipermeability problems, nonlinear optimization problems, masonry structures, and delamination problems in multi-layered composites. Variational inequalities were introduced by G. Stampacchia [1] in 1968 as the variational formulation of important classes of unilateral, boundary-value problems and inequality problems in mathematical mechanics. The notion of an optimization problem generalizes that of a variational inequality to the case where the function involved is nonconvex, nonsmooth and nonstationary. Such problems cover optimization problems for partial differential equations with nonmonotone, possibly multivalued and nonconvex nonlinearities. In the last few years, many kinds of optimization problems and variational inequalities have been investigated (see [2]), and the study of optimization and variational inequalities has emerged as a new and interesting branch of mathematics.
Various models in the applied sciences can be conveniently formulated as optimization problems involving certain constraints or parameters. These constraints or parameters may be known or unknown, but they often characterize physical properties of the underlying model. In this context, the direct problem consists in solving the optimization problem. In recent years, the field of optimization and variational inequalities has emerged as one of the most vibrant and developing branches of applied and industrial mathematics because of its wide applications; see [3,4]. Rather recently, a decomposition approach was suggested for variational inequalities with binding constraints in [5]. Within this approach, the initial problem is treated as a two-level one using a share allocation procedure, leading to a set-valued variational inequality as a master problem. Further, a decomposable dual regularization (penalty) method that deals with a single-valued approximation of the master problem for each fixed share allocation was suggested. However, the problem of identifying constraints or parameters in optimization and variational inequalities is still an untreated topic in the literature, which is the motivation of the present work.
The general optimization problem consists of finding the minimal value of some goal function p over the corresponding feasible set X. For brevity, we write this problem as

min p(x) subject to x ∈ X; (1.1)

its solution set is denoted by X* and the optimal value of the function by p*. We denote by R^s the real s-dimensional Euclidean space; all elements of such spaces are column vectors represented by boldface, e.g. x. For any vectors x and y of R^s, we denote by ⟨x, y⟩ their scalar product, i.e.

⟨x, y⟩ = x^T y = Σ_{i=1}^{s} x_i y_i,

and by ‖x‖ the Euclidean norm of x, i.e. ‖x‖ = √⟨x, x⟩.

For brevity, we define M = {1, …, n}; |A| will denote the cardinality of a finite set A. As usual, R will denote the set of real numbers and R̄ = R ∪ {+∞}.

Let us consider a partition of the N-dimensional space, i.e.

R^N = R^{N_1} × ⋯ × R^{N_n}, where N = N_1 + ⋯ + N_n. (1.2)

This means that any point x ∈ R^N can be written as x = (x_1, …, x_n), where x_i ∈ R^{N_i} for i ∈ M. The simplest case, where N_i = 1 for all i ∈ M and n = N, corresponds to the scalar coordinate partition.
Assume that f_1, f_2 : R^N → R are two functions which are continuous on X, so that f = f_1 + f_2 is a continuous function on X. Partially decomposable optimization problems play a significant role in various data-type applications; see, e.g., [6-9]. In these problems, the cost function and the feasible set are specialized as follows:

p(x) = f(x) + h(x), (1.3)

h(x) = Σ_{i=1}^{n} h_i(x_i), (1.4)

X = X_1 × ⋯ × X_n, (1.5)

where h_i : R^{N_i} → R is a convex function and X_i is a convex set in R^{N_i} for i = 1, …, n. Note that the function f : R^N → R is not supposed to be convex in general. That is, we have to solve a non-convex non-differentiable optimization problem, which appears too difficult to solve with the usual subgradient-type methods. One can develop efficient coordinate-wise descent decomposition methods for finding stationary points of problem (1.1), (1.3)-(1.5) for a smooth f; see, e.g., [7,10-12]. The stationary points can then be defined as solutions of the following generalized mixed variational inequality problem (GMVIP): find a point x* ∈ X such that

⟨f′(x*), y − x*⟩ + h(y) − h(x*) ≥ 0 for all y ∈ X. (1.6)

In this paper, we intend to suggest a coordinate-wise decomposition descent method for the following problem: find a point x* ∈ X together with g* ∈ G(x*) and q* ∈ Q(x*) such that

⟨g* + q*, y − x*⟩ + h(y) − h(x*) ≥ 0 for all y ∈ X. (1.7)

Taking G(x) = ∂f_1(x) and Q(x) = ∂f_2(x) to be the usual subdifferentials at x, each solution of (1.7) exactly solves (1.1), (1.3)-(1.5). Next, we suppose that only sequences of approximations are known instead of the exact values of G, Q and h.
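To make the decomposable structure (1.3)-(1.5) concrete, here is a minimal numerical sketch. The partition, the data, and the particular choices of f and h_i below are our own illustrations, not taken from the paper:

```python
# Illustrative instance of problem (1.1), (1.3)-(1.5); all concrete
# choices (block sizes, f, h_i) are assumptions made for this sketch.
import numpy as np

# Partition of R^4 into n = 2 blocks of sizes N_1 = N_2 = 2, cf. (1.2).
blocks = [slice(0, 2), slice(2, 4)]

def f(x):
    # Smooth but non-convex part f = f_1 + f_2.
    return 0.25 * np.sum(x**4) - np.sum(x**2)

def h(x, lam=0.1):
    # Separable convex non-smooth part, here h_i = lam * l1-norm on each block.
    return lam * np.sum(np.abs(x))

def p(x):
    # Goal function p = f + h of the composite problem (1.3).
    return f(x) + h(x)

x = np.array([1.0, -0.5, 0.0, 2.0])
# Separability (1.4): h(x) equals the sum of its blockwise values.
parts = sum(h(x[b]) for b in blocks)
assert abs(h(x) - parts) < 1e-12
```

The point of the sketch is only that p splits into a smooth non-convex term plus a blockwise-separable convex term, which is exactly the structure the coordinate-wise method exploits.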

Preliminaries
Let us consider a partially decomposable optimization problem of the form

min ϕ(x) = µ(x) + η(x) subject to x ∈ X, (2.1)

where the function µ : R^N → R is smooth on X, but not necessarily convex. This problem will serve as an approximation of the basic problem (1.1), (1.3)-(1.5). We use the same partition (1.2) of the space R^N and fix the assumption on the feasible set.

(A1) The feasible set satisfies (1.5), where the X_i are non-empty, convex and closed sets in R^{N_i}, i ∈ M.

We suppose that

η(x) = Σ_{i∈M} η_i(x_i), (2.2)

where η_i : R^{N_i} → R is convex and has the non-empty subdifferential ∂η_i(x_i) at each point x_i ∈ X_i, for i ∈ M. Then each function η_i is lower semicontinuous, hence the function η is also lower semicontinuous, and ∂η(x) = ∂η_1(x_1) × ⋯ × ∂η_n(x_n). Therefore, problem (2.1) with (2.2) can be rewritten as

min ϕ(x) = µ(x) + Σ_{i∈M} η_i(x_i) subject to x ∈ X = X_1 × ⋯ × X_n. (2.3)

Given a point x ∈ X, we say that a vector d is feasible for x if x + αd ∈ X for some α > 0. From the above assumptions, it follows that the function ϕ is directionally differentiable at each point x ∈ X; that is, its directional derivative with respect to any feasible vector d is defined by the formula

ϕ′(x; d) = ⟨µ′(x), d⟩ + Σ_{i∈M} η_i′(x_i; d_i);

see, e.g., [14].
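The directional-derivative formula for ϕ = µ + η can be checked numerically against a finite difference. The sketch below uses our own illustrative choices µ(x) = Σ(cos x_i + x_i²/2) and η_i(t) = λ|t| (whose directional derivative has the well-known ℓ1 form):

```python
# Numerical check of phi'(x; d) = <mu'(x), d> + eta'(x; d).
# The test functions and data are our own assumptions for this sketch.
import numpy as np

lam = 0.5

def mu(x):          # smooth, not necessarily convex
    return np.sum(np.cos(x)) + 0.5 * np.sum(x**2)

def mu_grad(x):     # mu'(x)
    return -np.sin(x) + x

def eta(x):
    return lam * np.sum(np.abs(x))

def eta_dir(x, d):  # directional derivative of the l1 term
    return lam * np.sum(np.where(x != 0, np.sign(x) * d, np.abs(d)))

def phi_dir(x, d):  # phi'(x; d) = <mu'(x), d> + eta'(x; d)
    return mu_grad(x) @ d + eta_dir(x, d)

x = np.array([1.0, 0.0, -2.0])   # note the zero coordinate: eta is non-smooth there
d = np.array([0.3, -1.0, 0.7])
t = 1e-7
fd = (mu(x + t*d) + eta(x + t*d) - mu(x) - eta(x)) / t
assert abs(fd - phi_dir(x, d)) < 1e-5
```

The zero coordinate of x exercises the non-smooth case, where the derivative picks up |d_i| rather than sign(x_i)·d_i.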
Recall that a function f : R^s → R is said to be coercive on a set X if f(x) → +∞ whenever ‖x‖ → ∞, x ∈ X. Each solution of problem (2.3) is a solution of the following GMVIP: find a point x* ∈ X such that

⟨µ′(x*), y − x*⟩ + η(y) − η(x*) ≥ 0 for all y ∈ X. (2.4)

Now, we denote by X^0 the solution set of GMVIP (2.4) and call it the set of stationary points of problem (2.3); cf. (1.6).

Fix α > 0. For each point x ∈ X we can define y(x) = (y_1(x), …, y_n(x)) as the solution of an auxiliary GMVIP (2.5). This GMVIP gives a necessary and sufficient optimality condition for the optimization problem (2.6)-(2.7). From the above assumptions, each function Φ_i(x, ·) is strongly convex; hence problem (2.6)-(2.7) (or (2.5)) has the unique solution y(x), thus defining the single-valued mapping x → y(x). Observe that all the components of y(x) can be found independently, i.e., (2.6)-(2.7) is equivalent to the n independent optimization problems

min_{y_i ∈ X_i} Φ_i(x, y_i) (2.8)

for i = 1, …, n, and y_i(x) just solves the i-th problem (2.8).
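The blockwise subproblems (2.8) admit closed-form solutions for simple η_i. The sketch below assumes the illustrative choices η_i = λ‖·‖₁ and X_i = R^{N_i}, so that y_i(x) reduces to a soft-thresholding step on a quadratic model with parameter α; these concrete formulas are our assumptions, not the paper's:

```python
# Blockwise subproblems (2.8) and the accuracy values Delta_i(x) = ||x_i - y_i(x)||
# under the illustrative choices eta_i = lam*||.||_1 and X_i = R^{N_i}.
import numpy as np

lam, alpha = 0.5, 2.0
blocks = [slice(0, 2), slice(2, 4)]

def mu_grad(x):                      # gradient of a smooth non-convex mu
    return x**3 - x                  # for mu(x) = sum(x**4/4 - x**2/2)

def soft(v, tau):                    # proximal map of tau*||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def y_block(x, i):
    # Minimizes <mu'_i(x), y_i - x_i> + (alpha/2)||y_i - x_i||^2 + lam*||y_i||_1,
    # one admissible form of Phi_i(x, .): a soft-thresholded gradient step.
    b = blocks[i]
    return soft(x[b] - mu_grad(x)[b] / alpha, lam / alpha)

def delta(x):
    # max_i Delta_i(x): an accuracy measure for GMVIP (2.4).
    return max(np.linalg.norm(x[b] - y_block(x, i))
               for i, b in enumerate(blocks))

x = np.array([1.5, -0.2, 0.0, 1.0])
acc = delta(x)                       # positive: x is not stationary
```

At a stationary point (here the origin, where µ′ vanishes and 0 minimizes λ|·|), every y_i(x) coincides with x_i, so delta returns zero.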

Set

Δ(x) = max_{i∈M} Δ_i(x), where Δ_i(x) = ‖x_i − y_i(x)‖.

From Lemma 2.2, we see that the value Δ(x) can serve as an accuracy measure for GMVIP (2.4).
Lemma 2.3 ([15]). Take any point x ∈ X and an index i ∈ M. If Δ_i(x) > 0, then the vector d whose i-th component equals y_i(x) − x_i and whose other components are zero is a feasible descent direction for ϕ at x.

Denote by Z_+ the set of non-negative integers. Following [15], we describe the basic algorithm for GMVIP (2.4) as follows.

Algorithm 2.1 (Decomposition Descent Scheme (DDS)). Input: a point x^0 ∈ X and a number δ > 0. Output: a point z. At the κ-th iteration, κ ∈ Z_+, we have a point x^κ ∈ X.

Step 1: If there is an index i ∈ M such that Δ_i(x^κ) ≥ δ, set i_κ = i and go to Step 3; otherwise go to Step 2.

Step 2: Set z = x^κ and stop.

Step 3: Apply the Armijo line-search rule (2.9) along the direction d^κ whose i_κ-th component equals y_{i_κ}(x^κ) − x^κ_{i_κ} and whose other components are zero, set x^{κ+1} = x^κ + λ_κ d^κ with the resulting stepsize λ_κ, take κ = κ + 1, and go to Step 1.
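A runnable sketch of Algorithm 2.1 (DDS) is given below. The concrete µ, η, the block partition, and the Armijo parameters are our own illustrative assumptions; the line-search test uses the standard proximal decrease bound ϕ′(x; d) ≤ −(α/2)‖d‖², which is one admissible form of rule (2.9):

```python
# Sketch of the Decomposition Descent Scheme (DDS) on an illustrative
# non-convex composite problem; all concrete choices are assumptions.
import numpy as np

lam, alpha = 0.5, 2.0          # l1 weight and proximal parameter
beta, theta = 0.1, 0.5         # Armijo slope fraction and backtracking factor
blocks = [slice(0, 2), slice(2, 4)]

def mu(x):      return np.sum(0.25 * x**4 - 0.5 * x**2)   # smooth, non-convex
def mu_grad(x): return x**3 - x
def phi(x):     return mu(x) + lam * np.sum(np.abs(x))    # phi = mu + eta

def soft(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dds(x0, delta_tol, max_iter=1000):
    x = x0.copy()
    for _ in range(max_iter):
        g = mu_grad(x)
        ys = [soft(x[b] - g[b] / alpha, lam / alpha) for b in blocks]
        resid = [np.linalg.norm(x[b] - y) for b, y in zip(blocks, ys)]
        i = int(np.argmax(resid))
        if resid[i] < delta_tol:                 # Step 2: stop
            return x
        d = np.zeros_like(x)                     # Step 1: direction on block i
        d[blocks[i]] = ys[i] - x[blocks[i]]
        step = 1.0                               # Step 3: Armijo backtracking
        while phi(x + step * d) > phi(x) - beta * step * 0.5 * alpha * (d @ d):
            step *= theta
        x = x + step * d
    return x

x_start = np.array([2.0, -1.5, 0.3, 0.9])
z = dds(x_start, delta_tol=1e-6)
```

For this particular objective, the only stationary point is the origin, so the iterates settle there and the residual test terminates the loop.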
Lemma 2.4. The line search procedure in Step 3 is always finite.

Proof. If the line search procedure at some iteration were infinite, then, taking the corresponding limit in (2.9), we would obtain a contradiction with the descent property of Lemma 2.3. □

Proposition 2.1. The number of iterations in Algorithm (DDS) is finite.

Proof. By construction, the values ϕ(x^κ) decrease monotonically; hence the sequence {x^κ} is bounded and has limit points. Assume, for contradiction, that the sequence {x^κ} is infinite. Since the set M is finite, there is an index i_κ = i which is repeated infinitely often. Take the corresponding subsequence {κ_s}; then, without loss of generality, we can assume that the subsequence {x^{κ_s}} converges to a point x̄. Using the mean value theorem (see, e.g., [14]), we obtain a representation of the corresponding difference of the values of ϕ for some g^{κ_s} + q^{κ_s} = µ′(x^{κ_s} + λ_{κ_s} θ^{ξ_{κ_s}} d^{κ_s}), t^{κ_s} ∈ ∂η(x^{κ_s} + λ_{κ_s} θ^{ξ_{κ_s}} d^{κ_s}), and ξ_{κ_s} ∈ (0, 1). Taking the limit as s → ∞, we obtain the analogous relation for some t̄ ∈ ∂η(x̄). On the other hand, applying Lemma 2.3 together with the construction of the method yields the opposite strict inequality, which is a contradiction. □

Main Results
We now intend to describe a general iterative method for the non-stationary (or limit) generalized mixed variational inequality problem (GMVIP) (1.7). First, we introduce the approximation assumptions.
(A2) There exists a sequence of continuous mappings G_ℓ, Q_ℓ : X → R^N, which are gradients of smooth functions, such that the relations {u_ℓ} → ū, u_ℓ ∈ X imply {G_ℓ(u_ℓ)} → ḡ ∈ G(ū) and {Q_ℓ(u_ℓ)} → q̄ ∈ Q(ū).

(A3) For each i = 1, …, n, there exists a sequence of convex functions h_{ℓ,i} : R^{N_i} → R such that each of them is subdifferentiable on X_i and the relations {u_ℓ} → ū, u_ℓ ∈ X_i imply {h_{ℓ,i}(u_ℓ)} → h_i(ū).

Under condition (A2), the limit set-valued mappings G and Q are approximated at any point by the sequences of gradients {G_ℓ} and {Q_ℓ}. In fact, if G and Q are the Clarke subdifferentials of locally Lipschitz functions with f_1 + f_2 = f, they can always be approximated by a sequence of gradients within condition (A2); see [16,17]. Also, observe that if there is a subsequence y_s ∈ X with {y_s} → ȳ, then (A2) implies {G_{ℓ_s}(y_s)} → ḡ ∈ G(ȳ) and {Q_{ℓ_s}(y_s)} → q̄ ∈ Q(ȳ), and the same is true for (A3). At the same time, the non-differentiability of the functions f_1 + f_2 = f or h is not obligatory; the main property is the existence of the approximation sequences indicated in (A2) and (A3).
We replace GMVIP (1.7) with a sequence of GMVIPs: find a point z^ℓ ∈ X such that

⟨G_ℓ(z^ℓ) + Q_ℓ(z^ℓ), y − z^ℓ⟩ + h_ℓ(y) − h_ℓ(z^ℓ) ≥ 0 for all y ∈ X, (3.1)

where we use the partition of G_ℓ and Q_ℓ which corresponds to that of the space R^N, i.e., G_ℓ(x) = (G_{ℓ,1}(x), …, G_{ℓ,n}(x)) and Q_ℓ(x) = (Q_{ℓ,1}(x), …, Q_{ℓ,n}(x)), where G_{ℓ,i}, Q_{ℓ,i} : X → R^{N_i}. Similarly, we set h_ℓ(x) = Σ_{i=1}^{n} h_{ℓ,i}(x_i). Since the feasible set X may be unbounded, we introduce coercivity conditions.
(C1) For each ℓ, the function f_ℓ(x) + h_ℓ(x) is coercive on X.

(C2) There exist a number σ > 0 and a point v ∈ X providing a uniform coercivity property along any sequences {u_ℓ} and {d_ℓ} satisfying the corresponding conditions.

Clearly, (C1) gives a customary coercivity condition for each function f_ℓ(x) + h_ℓ(x), which provides existence of solutions of each particular problem (3.1). Obviously, (C1) holds if X is bounded. At the same time, (C2) gives a similar coercivity condition for the whole sequence of these problems approximating the limit GMVIP (1.7); it also holds if X is bounded. In the unbounded case, (C2) is weaker than the usual coercivity condition imposed directly on the limit problem. Therefore, we conclude that conditions (C1) and (C2) are not restrictive. The whole decomposition descent method for the non-stationary GMVIP (1.7) has a two-level iteration scheme, where each stage of the upper level invokes Algorithm (DDS) with different parameters.
At the ℓ-th stage, ℓ = 1, 2, …, we have a point z^{ℓ−1} ∈ X and a number δ_ℓ > 0. We apply Algorithm (DDS) with x^0 = z^{ℓ−1} and δ = δ_ℓ to the ℓ-th problem (3.1) and obtain the point z^ℓ as its output; we call the resulting two-level scheme Method (DNS). Now, we establish the main convergence result.

Theorem 3.1. Suppose that assumptions (A1)-(A3) and conditions (C1)-(C2) are fulfilled. Then: (i) each problem (3.1) has a solution; (ii) the number of iterations at each stage of Method (DNS) is finite; (iii) the sequence {z^ℓ} generated by Method (DNS) has limit points, and all these limit points are solutions of GMVIP (1.7); (iv) if f is convex, then all the limit points of {z^ℓ} belong to X*.
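The two-level scheme can be sketched as follows. We take the illustrative limit problem min 0.5‖x‖² + 0.5‖x‖₁ (whose solution is the origin), model the data inaccuracy of assumption (A2) by the perturbed gradients G_ℓ(x) = x + ε_ℓ with ε_ℓ → 0, and run a fixed-step scalar-coordinate DDS pass at each stage with tolerance δ_ℓ → 0; every concrete choice here is our own assumption:

```python
# Sketch of the two-level Method (DNS): outer stages tighten the tolerance
# delta_ell while the perturbed data G_ell approach the exact gradient (A2).
import numpy as np

lam, alpha = 0.5, 2.0

def soft(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dds_stage(x0, grad_ell, delta_ell, max_iter=10000):
    # Lower level: fixed-step scalar-coordinate DDS run; stops once the
    # accuracy measure max_i |x_i - y_i(x)| drops below delta_ell.
    x = x0.copy()
    for _ in range(max_iter):
        y = soft(x - grad_ell(x) / alpha, lam / alpha)    # all y_i(x) at once
        i = int(np.argmax(np.abs(x - y)))
        if abs(x[i] - y[i]) < delta_ell:
            return x
        x[i] = y[i]                                       # update one coordinate
    return x

def dns(x0, stages=10):
    z = x0.copy()
    for ell in range(1, stages + 1):
        delta_ell = 2.0 ** (-ell)                         # tolerance -> 0
        eps_ell = 2.0 ** (-ell)                           # data error -> 0
        grad_ell = lambda x, e=eps_ell: x + e             # G_ell(x) = mu'(x) + e
        z = dds_stage(z, grad_ell, delta_ell)
    return z

z = dns(np.array([3.0, -2.0, 0.7]))
```

Because the perturbation and the tolerance vanish together, the stage outputs z^ℓ approach the solution of the limit problem, illustrating assertion (iii).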
Proof. We see that (C1) implies that each problem (3.1) has a solution: since the cost function is coercive, its level sets on X are bounded; hence the corresponding optimization problem has a solution, and so does GMVIP (3.1). Therefore, assertion (i) is true. Next, from Proposition 2.1, assertion (ii) is also true.
By (ii), the sequence {z^ℓ} is well-defined and satisfies the estimate provided by (2.5); besides, the stopping rule in Algorithm (DDS) gives Δ(z^ℓ) < δ_ℓ. Now, we proceed to show that {z^ℓ} is bounded. Conversely, assume that {‖z^ℓ‖} → +∞ and apply (3.2) with y = v. Here and below, for brevity, we set g^ℓ = G_ℓ(z^ℓ), q^ℓ = Q_ℓ(z^ℓ), z̃^ℓ = y^ℓ(z^ℓ), and d^ℓ = α(y^ℓ(z^ℓ) − z^ℓ).
Take a suitable subsequence {ℓ_s}; then, from (C2), we have a contradiction. Therefore, the sequence {z^ℓ} is bounded and has limit points. Let z̄ be an arbitrary limit point of {z^ℓ}, i.e., {z^{ℓ_s}} → z̄ for some subsequence {ℓ_s}. Since z^{ℓ_s} ∈ X, we have z̄ ∈ X. It follows from (A2) that lim_{s→∞} g^{ℓ_s} = ḡ ∈ G(z̄) and lim_{s→∞} q^{ℓ_s} = q̄ ∈ Q(z̄).
Next, if f is convex, then so is the goal function p of problem (1.1), (1.3)-(1.5); hence each limit point of {z^ℓ} belongs to X*, which gives assertion (iv), and, in particular, GMVIP (1.7) has a solution. □

Examples
We can take the exact one-dimensional minimization rule instead of the current Armijo line search (2.9) in Algorithm (DDS); then the assertions of Theorem 3.1 remain true. Next, if the function µ (respectively, each function f_ℓ) is convex, we can replace (2.9) with a simpler predefined stepsize rule. Moreover, if the gradient of the function µ is Lipschitz continuous, we have an explicit lower bound for the stepsize and can utilize the fixed-stepsize version of Algorithm (DDS), which leads to a further reduction of computational expenses. We now give two instances to illustrate possible applications.
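The exact one-dimensional minimization rule is particularly convenient for the regularized least-squares problem (4.1) with the ℓ1 term, where each scalar subproblem has a closed-form soft-thresholding solution. The sketch below uses synthetic data of our own choosing; it is an illustration of the rule, not the paper's computational procedure:

```python
# Cyclic exact coordinate minimization for min_x ||Ax - b||^2 + eps*||x||_1,
# cf. (4.1); the data A, b and the parameter eps are our own test choices.
import numpy as np

def soft(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_cd(A, b, eps, sweeps=200):
    n = A.shape[1]
    x = np.zeros(n)
    col_sq = np.sum(A**2, axis=0)            # ||A_i||^2 for each column
    for _ in range(sweeps):
        for i in range(n):
            r = b - A @ x + A[:, i] * x[i]   # residual with coordinate i removed
            # Exact minimizer of the one-dimensional subproblem in x_i:
            # minimize ||A_i*t - r||^2 + eps*|t|  =>  soft-thresholding.
            x[i] = soft(A[:, i] @ r, eps / 2.0) / col_sq[i]
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
x_true = np.array([1.0, 0.0, 0.0, -2.0, 0.0])
b = A @ x_true                               # noiseless data for this sketch
x_hat = lasso_cd(A, b, eps=0.1)
```

With a small ε and noiseless data, the iterates approach the sparse generating vector, and the non-smooth term keeps the inactive coordinates at exactly zero.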
The first instance is the linear inverse problem, which arises very often in signal and image processing; see [18]. The problem consists of solving a linear system of equations Ax = b, where A is an m×n matrix and b is a vector in R^m, whose exact values may be unknown or admit some noise perturbations. If A^T A is ill-conditioned, the customary approach based on the least-squares minimization problem min_x ‖Ax − b‖² may give very inexact approximations. To enhance its properties, one can utilize a family of regularized problems of the form

min_x ‖Ax − b‖² + εh(x), (4.1)

where h(x) = ‖x‖² or h(x) = ‖x‖₁ = Σ_{i=1}^{n} |x_i|, and ε > 0 is a parameter. Note that the non-smooth regularization term additionally yields sparse solutions with a rather small number of non-zero components; see, e.g., [6,19].

The second instance is the basic machine learning problem called the linear support vector machine. It consists in finding the optimal partition of the feature space R^n by using some given training sequence x^i, i = 1, …, k, where each point x^i has a binary label y_i ∈ {−1, +1} indicating its class. We have to find a separating hyperplane; usually, its parameters are found as a solution of the optimization problem

min_{w ∈ R^n} (1/p)‖w‖_p^p + C Σ_{i=1}^{k} L(⟨w, x^i⟩; y_i), (4.2)

where L is a loss function and C > 0 is a penalty parameter. The usual choice is L(z; y) = max{0, 1 − yz}, and p is either 1 or 2. We observe that the data of the observation points x^i can be inexact or even non-stationary.

Next, taking p = 2, we can rewrite this problem as

min_{w,ξ} 0.5‖w‖² + C Σ_{i=1}^{k} ξ_i, subject to y_i⟨w, x^i⟩ ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, …, k.

Its dual has the quadratic programming format:

max_α Σ_{s=1}^{k} α_s − 0.5 Σ_{s=1}^{k} Σ_{t=1}^{k} (α_s y_s)(α_t y_t)⟨x^s, x^t⟩ subject to α_s ∈ [0, C], s = 1, …, k. (4.3)

We see that all these problems fall into the format (1.1), (1.3)-(1.5), and each of them has a solution.


Conclusions
We discussed a new class of selective coordinate-wise descent splitting methods for non-stationary decomposable composite optimization problems and proved their convergence for problems involving non-smooth set-valued mappings, where the coordinate variations change together with the tolerance parameters corresponding to the sequence of GMVIPs.