Least Change Secant Update Methods for Nonlinear Complementarity Problem

In this work, we introduce a family of Least Change Secant Update Methods for solving Nonlinear Complementarity Problems, based on the reformulation of the problem as a nonsmooth system using the one-parametric class of nonlinear complementarity functions introduced by Kanzow and Kleinmichel. We prove local and superlinear convergence for the algorithms. Numerical experiments show good performance of these algorithms.


Introduction
Let $F : \mathbb{R}^n \to \mathbb{R}^n$, $F(x) = (F_1(x), \ldots, F_n(x))$, be a continuously differentiable mapping. The Nonlinear Complementarity Problem, NCP for short, consists of finding a vector $x \in \mathbb{R}^n$ such that

$$x \ge 0, \qquad F(x) \ge 0, \qquad x^T F(x) = 0. \tag{1}$$

Here, $y \ge 0$ for $y \in \mathbb{R}^n$ means $y_i \ge 0$ for all $i = 1, \ldots, n$. The third condition in (1) requires that the vectors $x$ and $F(x)$ be orthogonal; for this reason, it is called the complementarity condition.
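The three conditions in (1) can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-dimensional mapping `F_example` and a tolerance `tol` that are not part of the original text; it only tests whether a given vector satisfies (1) up to that tolerance.

```python
def is_ncp_solution(x, F, tol=1e-10):
    """Check x >= 0, F(x) >= 0 and x^T F(x) = 0 up to the tolerance tol."""
    fx = F(x)
    nonneg_x = all(xi >= -tol for xi in x)
    nonneg_f = all(fi >= -tol for fi in fx)
    complementarity = abs(sum(xi * fi for xi, fi in zip(x, fx))) <= tol
    return nonneg_x and nonneg_f and complementarity


def F_example(x):
    """Hypothetical illustrative mapping (not one of the paper's test problems)."""
    return [x[0] - 1.0, x[1] + 2.0]


print(is_ncp_solution([1.0, 0.0], F_example))  # True: F(x) = (0, 2), x^T F(x) = 0
print(is_ncp_solution([0.0, 0.0], F_example))  # False: F_1(x) = -1 < 0
```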
The NCP arises in many applications such as Friction Mechanical Contact problems [1], Structural Mechanics Design problems, Lubrication Elastohydrodynamic problems [2], Traffic Equilibrium problems [3], as well as problems related to Economic Equilibrium Models [4]. The importance of the NCP in the areas of Physics, Engineering and Economics is due to the fact that the concept of complementarity is synonymous with the notion of a system in equilibrium. In recent years, various techniques have been studied to solve the NCP, one of which is to reformulate it as a nonsmooth system of nonlinear equations by using special functions called complementarity functions [5]. A function $\varphi : \mathbb{R}^2 \to \mathbb{R}$ such that

$$\varphi(a, b) = 0 \iff a \ge 0, \quad b \ge 0, \quad ab = 0 \tag{2}$$

is called a complementarity function.
Geometrically, the equivalence (2) means that the trace of the function $\varphi$ obtained by intersecting its graph with the $xy$-plane is the curve formed by the positive semiaxes $x$ and $y$, which is not differentiable at $(0, 0)$. This lack of smoothness of the curve implies the nondifferentiability of the function $\varphi$.
In order to reformulate the NCP as a system of nonlinear equations, we consider a complementarity function $\varphi$ and define $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ by

$$\Phi(x) = \big(\varphi(x_1, F_1(x)), \ldots, \varphi(x_n, F_n(x))\big)^T; \tag{3}$$

then it follows from the lack of smoothness of $\varphi$ that the nonlinear system of equations

$$\Phi(x) = 0 \tag{4}$$

is nonsmooth. From the definition of a complementarity function (2), it follows that a vector $x^*$ solves the system (4) if and only if $x^*$ solves the NCP. Different algorithms have been proposed for solving the reformulation of the NCP as the nonsmooth system of nonlinear equations (4), such as nonsmooth Newton methods [6] and nonsmooth quasi-Newton methods [7], [8], [9], among others [10], [11], [12], [13].
There are many complementarity functions, but the most widely used are the minimum function [14] and the Fischer-Burmeister function [15], defined respectively by

$$\varphi(a, b) = \min\{a, b\}, \qquad \varphi(a, b) = \sqrt{a^2 + b^2} - a - b.$$

The minimum function is nonsmooth at the points of the form $(a, a)$, while the Fischer-Burmeister function is nonsmooth only at $(0, 0)$. In 1998, Kanzow and Kleinmichel [15] introduced a one-parametric class of complementarity functions $\varphi_\lambda$ defined by

$$\varphi_\lambda(a, b) = \sqrt{(a - b)^2 + \lambda a b} - a - b, \tag{6}$$

where $\lambda \in (0, 4)$, and which we will refer to throughout this work as the Kanzow function. This function is nonsmooth at $(0, 0)$. For any other vector in $\mathbb{R}^2$, the gradient of $\varphi_\lambda$ is given by

$$\nabla \varphi_\lambda(a, b) = \big(\chi(a, b), \psi(a, b)\big), \qquad \chi(a, b) = \frac{2(a - b) + \lambda b}{2\sqrt{(a - b)^2 + \lambda a b}} - 1, \qquad \psi(a, b) = \frac{-2(a - b) + \lambda a}{2\sqrt{(a - b)^2 + \lambda a b}} - 1. \tag{7}$$

In [16], the author makes a careful analysis of this function and deduces some important bounds that we will use later. Moreover, in the special case $\lambda = 2$, the function $\varphi_\lambda$ reduces to the Fischer-Burmeister function, whereas in the limiting case $\lambda \to 0$, the function $\varphi_\lambda$ becomes a multiple of the minimum function. In what follows, we denote by $\Phi_\lambda$ the function defined in (3) and obtained from the complementarity function $\varphi_\lambda$.

ing.cienc., vol.11, no.21, pp.11-36, enero-junio.2015.
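To make the role of the parameter concrete, the following sketch evaluates the Kanzow function and its gradient and checks the two special cases mentioned above. It is an illustration only: the function names `kanzow`, `grad_kanzow` and `fischer_burmeister` are assumptions introduced here, and the gradient formula is the one obtained by differentiating $\sqrt{(a-b)^2 + \lambda ab} - a - b$ directly.

```python
import math


def kanzow(a, b, lam):
    """Kanzow function sqrt((a - b)^2 + lam*a*b) - a - b, with lam in (0, 4)."""
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b


def grad_kanzow(a, b, lam):
    """Partial derivatives (chi, psi) of the Kanzow function at (a, b) != (0, 0)."""
    root = math.sqrt((a - b) ** 2 + lam * a * b)
    chi = (2.0 * (a - b) + lam * b) / (2.0 * root) - 1.0
    psi = (-2.0 * (a - b) + lam * a) / (2.0 * root) - 1.0
    return chi, psi


def fischer_burmeister(a, b):
    return math.sqrt(a * a + b * b) - a - b


# Complementarity property: zero on the nonnegative semiaxes, nonzero elsewhere.
print(kanzow(3.0, 0.0, 1.0), kanzow(0.0, 2.0, 1.0), kanzow(1.0, 1.0, 1.0))

# lam = 2 gives the Fischer-Burmeister function.
print(abs(kanzow(0.3, -1.2, 2.0) - fischer_burmeister(0.3, -1.2)))

# lam -> 0 gives |a - b| - a - b = -2 * min(a, b).
print(abs(kanzow(0.3, -1.2, 1e-12) - (-2.0 * min(0.3, -1.2))))
```

The last two printed differences should be at the level of rounding error, which is what the reductions for $\lambda = 2$ and $\lambda \to 0$ predict.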
In this work, we propose a nonsmooth quasi-Newton method for solving the NCP using the system $\Phi_\lambda(x) = 0$; for this method, we prove local convergence. Moreover, we introduce a family of least change secant update methods for solving the NCP based on the nonsmooth system of equations $\Phi_\lambda(x) = 0$ and, for this family, we prove local and superlinear convergence under suitable assumptions.
We organize this paper as follows. In Section 2, we reformulate the NCP as a nonsmooth system of equations using the function $\Phi_\lambda$ and we characterize a subset of the generalized Jacobian of $\Phi_\lambda$ at $x$. In the first part of Section 3, we propose a new quasi-Newton algorithm for solving the nonsmooth system of nonlinear equations $\Phi_\lambda(x) = 0$ and, for this method, we develop the local convergence theory. In the second part, we introduce a family of least change secant update methods following the theory developed in [17] for this type of method. We prove, under suitable assumptions, local and superlinear convergence. In Section 4, we analyze numerically the local performance of the algorithms introduced in the previous section, for which we use 8 test problems proposed in [14], [18]. Four of these are applications to Economic Equilibrium and Game Theory. Finally, Section 5 contains some remarks on what we have done in this paper and presents possibilities for future work.

Reformulation of NCP using the Kanzow function
In this section, we reformulate the NCP as a nonsmooth system of equations and from the definition of the generalized Jacobian given in [19], we construct a subset of matrices of the generalized Jacobian of Φ λ at x. Then we show that this subset at a solution of the system Φ λ (x) = 0 is a compact set.
Our reformulation of the NCP as a system of equations is based on the Kanzow complementarity function $\varphi_\lambda$ defined by (6) and the function $\Phi_\lambda$ defined in the last section. Exploiting (2), it is readily seen that the NCP is equivalent to the following system of nonsmooth equations:

$$\Phi_\lambda(x) = \big(\varphi_\lambda(x_1, F_1(x)), \ldots, \varphi_\lambda(x_n, F_n(x))\big)^T = 0.$$

The most popular method for solving a differentiable system of nonlinear equations $G(x) = 0$ is Newton's method [8], which requires calculating, at each iteration, the Jacobian matrix of $G$. There are situations where the derivatives of $G$ are not available, or are difficult to calculate. In these cases, a less expensive and widely used alternative for solving $G(x) = 0$ is the class of quasi-Newton methods [8], which use, at each iteration, an approximation to the Jacobian matrix. Among the latter are the so-called least change secant update methods [10], which form a family characterized by the fact that, at each iteration, the Jacobian approximation satisfies a secant equation [10] with a minimum variation property relative to some matrix norm. The price of using an approximation to the Jacobian matrix is a decrease in the speed of convergence of the respective quasi-Newton method.
When a function is not differentiable, as in the case of the function $\Phi_\lambda$, the term "Jacobian matrix" does not make sense. Fortunately, Frank H. Clarke introduced the concept of generalized Jacobian, which extends the concept of Jacobian matrix to some nondifferentiable functions [19]. Let $F : \mathbb{R}^n \to \mathbb{R}^n$ be a locally Lipschitzian function. The generalized Jacobian of $F$ at $x$ is the set given by

$$\partial F(x) = \operatorname{hull}\Big\{ \lim_{k \to \infty} F'(x_k) \ : \ x_k \to x, \ x_k \in D_F \Big\},$$

where $D_F$ is the set of all points where $F$ is differentiable and hull denotes the convex envelope of the set. The set $\partial F(x)$ is nonempty, convex and compact [19]. In the particular case in which $F$ is differentiable at $x$, $\partial F(x)$ has a single element: the Jacobian matrix of $F$ at $x$, $F'(x)$.
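As a simple illustration of this definition (a standard textbook example, not part of the original development), take $n = 1$ and the absolute value function, which is locally Lipschitzian and differentiable everywhere except at the origin:

```latex
F(x) = |x|, \qquad D_F = \mathbb{R} \setminus \{0\}, \qquad
F'(x) = \begin{cases} 1, & x > 0, \\ -1, & x < 0, \end{cases}
\qquad\Longrightarrow\qquad
\partial F(0) = \operatorname{hull}\{-1, 1\} = [-1, 1].
```

Sequences approaching $0$ from the right give the limit $1$, sequences approaching from the left give $-1$, and the convex envelope fills in the interval between them. At any $x \neq 0$ the set reduces to the single derivative value.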
Since the Kanzow function $\varphi_\lambda$ is locally Lipschitz continuous [16], so is the function $\Phi_\lambda$. Thus, the generalized Jacobian of $\Phi_\lambda$ at $x$ exists. In order to build matrices in this set, we consider a sequence of vectors $\{y^k\}$ in $\mathbb{R}^n$ which converges to $x$ and such that $\Phi_\lambda'(y^k)$ exists; then we show that $\lim_{k \to \infty} \Phi_\lambda'(y^k)$ exists. To classify the indices of the components of $x$, we define the set

$$\beta = \{ i \ : \ x_i = 0 = F_i(x) \}.$$

The sequence that we will use is

$$y^k = x + \varepsilon_k z,$$

where $\{\varepsilon_k\}$ is a sequence of positive numbers such that $\lim_{k \to \infty} \varepsilon_k = 0$ and the vector $z$ is chosen such that $z_i \neq 0$ for $i \in \beta$. Obviously, $y^k$ converges to $x$ when $k \to \infty$. To analyze the differentiability of $\Phi_\lambda$ at $y^k$, we consider two cases. If $i \notin \beta$, then $x_i \neq 0$ or $F_i(x) \neq 0$; by the continuity of $F_i$, we can assume $\varepsilon_k$ so small that $y^k_i \neq 0$ or $F_i(y^k) \neq 0$, for which the $i$-th component of $\Phi_\lambda$ is differentiable at $y^k$. If $i \in \beta$, then $z_i \neq 0$; therefore, $y^k_i = \varepsilon_k z_i \neq 0$, which is sufficient for the $i$-th component of $\Phi_\lambda$ to be differentiable at $y^k$.
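By the differentiability of $\Phi_\lambda$ at $y^k$, the Jacobian matrix of $\Phi_\lambda$ at $y^k$ exists and its $i$-th row is given by

$$[\Phi_\lambda'(y^k)]_i = \chi\big(y^k_i, F_i(y^k)\big)\, e_i^T + \psi\big(y^k_i, F_i(y^k)\big)\, \nabla F_i(y^k)^T,$$

with $\chi$ and $\psi$ defined by (7) and $\{e_1, \ldots, e_n\}$ the canonical basis of $\mathbb{R}^n$.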
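To calculate $\lim_{k \to \infty} \Phi_\lambda'(y^k)$, note that for $i \in \beta$ we have $x_i = 0 = F_i(x)$, so

$$y^k_i = \varepsilon_k z_i; \tag{12}$$

further, by Taylor's theorem,

$$F_i(y^k) = F_i(x) + \varepsilon_k \nabla F_i(\zeta^k)^T z = \varepsilon_k \nabla F_i(\zeta^k)^T z, \tag{13}$$

where $\zeta^k \to x$ when $k \to \infty$. We substitute (12) and (13) in the $i$-th row of $\Phi_\lambda'(y^k)$ and use the fact that $\chi$ and $\psi$ do not change when both of their arguments are multiplied by the same positive number $\varepsilon_k$. Then, for all $i = 1, \ldots, n$, the limit when $k \to \infty$ of each row of $\Phi_\lambda'(y^k)$ is the corresponding row of a matrix $H$, where

$$[H]_i = \begin{cases} \chi\big(x_i, F_i(x)\big)\, e_i^T + \psi\big(x_i, F_i(x)\big)\, \nabla F_i(x)^T, & i \notin \beta, \\[4pt] \chi\big(z_i, \nabla F_i(x)^T z\big)\, e_i^T + \psi\big(z_i, \nabla F_i(x)^T z\big)\, \nabla F_i(x)^T, & i \in \beta. \end{cases} \tag{14}$$

Because there are uncountably many ways of choosing the vector $z$, we have an uncountable set of matrices $H$ in $\partial \Phi_\lambda(x)$ which can be calculated by the above procedure.

Now, we consider the particular case where $x^*$ is a solution of $\Phi_\lambda(x) = 0$. If there is any index $i$ such that $x^*_i = F_i(x^*) = 0$, then $x^*$ is called a degenerate solution. We will denote the matrices (14) at $x^*$ by $H^*(z)$, and we will call the set of these matrices $Z^*$. Clearly, for each admissible $z \in \mathbb{R}^n$, there is a matrix $H^*(z)$. Thus, $Z^*$ is an infinite set and, further, it is a compact set. To verify the compactness, it is sufficient to demonstrate that it is closed, since $Z^* \subseteq \partial \Phi_\lambda(x^*)$, which is compact [19], [16].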
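The construction above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the helper names `grad_kanzow` and `limit_matrix` are introduced here, the rows follow the chain-rule form $\chi\, e_i^T + \psi\, \nabla F_i(x)^T$, and for a degenerate index the pair $(x_i, F_i(x))$ is replaced by $(z_i, \nabla F_i(x)^T z)$. The mapping in the example is hypothetical and chosen so that index 0 is degenerate.

```python
import math


def grad_kanzow(a, b, lam):
    """Partial derivatives (chi, psi) of the Kanzow function at (a, b) != (0, 0)."""
    root = math.sqrt((a - b) ** 2 + lam * a * b)
    chi = (2.0 * (a - b) + lam * b) / (2.0 * root) - 1.0
    psi = (-2.0 * (a - b) + lam * a) / (2.0 * root) - 1.0
    return chi, psi


def limit_matrix(x, Fx, JFx, z, lam):
    """Matrix H built row by row; z must have z[i] != 0 at degenerate indices."""
    n = len(x)
    H = []
    for i in range(n):
        if x[i] == 0.0 and Fx[i] == 0.0:
            # Degenerate index: use the direction z instead of the point itself.
            a = z[i]
            b = sum(JFx[i][j] * z[j] for j in range(n))
        else:
            a, b = x[i], Fx[i]
        chi, psi = grad_kanzow(a, b, lam)
        row = [psi * JFx[i][j] for j in range(n)]
        row[i] += chi
        H.append(row)
    return H


# Hypothetical example: F(x) = (x0 + x1 - 1, x1 - 1), solution x* = (0, 1).
x_star = [0.0, 1.0]
F_star = [0.0, 0.0]
J_star = [[1.0, 1.0], [0.0, 1.0]]

H1 = limit_matrix(x_star, F_star, J_star, z=[1.0, 0.0], lam=2.0)
H2 = limit_matrix(x_star, F_star, J_star, z=[1.0, -1.0], lam=2.0)
print(H1)
print(H2)
```

The two choices of `z` give different first rows, which illustrates why the procedure yields a whole set of matrices at a degenerate solution rather than a single one.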

Algorithm and convergence theory
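In the first part of this section, we propose a new quasi-Newton algorithm for solving the system $\Phi_\lambda(x) = 0$ and we develop the local convergence theory for this method. In the second part, we develop a family of least change secant update methods, following [17]. For this family, we prove local and superlinear convergence under suitable assumptions. The following algorithm is the basic quasi-Newton algorithm applied to $\Phi_\lambda(x) = 0$.

Algorithm 1. Given $x_0$, an initial approximation to the solution of the problem, and $\lambda \in (0, 4)$, compute, for $k = 0, 1, \ldots$,

$$x_{k+1} = x_k - B_k^{-1} \Phi_\lambda(x_k),$$

where the $i$-th row of $B_k$ is

$$[B_k]_i = \chi\big((x_k)_i, F_i(x_k)\big)\, e_i^T + \psi\big((x_k)_i, F_i(x_k)\big)\, [A_k]_i. \tag{15}$$

Here $\{e_1, \ldots, e_n\}$ is the canonical basis of $\mathbb{R}^n$, the matrix $A_k$ is an approximation of the Jacobian matrix of $F$ at $x_k$ (see Section 6) and $[A_k]_i$ denotes its $i$-th row; for an index $i$ with $(x_k)_i = 0 = F_i(x_k)$, the arguments of $\chi$ and $\psi$ are replaced by $(z_i, [A_k]_i z)$, as in (14).

Under the following assumptions, we will prove that the sequence generated by the basic quasi-Newton Algorithm 1 is well defined and converges linearly to a solution of $\Phi_\lambda(x) = 0$.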

Local assumptions
H3. The matrices of the set $Z^*$ are nonsingular.
From assumption H3 and by the compactness of $Z^*$, there is a constant $\mu$ such that $\|H^{-1}\|_\infty \le \mu$ for all $H \in Z^*$.

A local convergence theory
The following two Lemmas prepare the "Theorem of the two neighborhoods".
To prove that $Q$ is well defined, we must show that $B^{-1}$ exists. For this, we consider the inequality (20). The first term on the right side of (20) is bounded by (17); thus
We bound the second term on the right of (20) using the continuity of $F$, the Lipschitz continuity of $\nabla \varphi_\lambda$ and the definition of the infinity matrix norm. By the continuity of $F$, for all $\varepsilon > 0$ there exists $\delta > 0$ such that , Given that the gradient of $\varphi_\lambda$ is Lipschitz continuous [16], we have We consider the two possibilities for this maximum: For the above,

Ingeniería y Ciencia

Substituting (21) and (22) in (20), and by Banach's Lemma, $B^{-1}$ exists and therefore the function $Q$ is well defined. The second part of the proof is to show (19). For this, we subtract $x^*$ from both sides of (18), apply $\|\cdot\|_\infty$ and perform some algebraic manipulations.
then we obtain, On the other hand, for H ∈ ∂Φ λ (x).
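In [15], Kanzow and Kleinmichel show that $\Phi_\lambda$ is semismooth. Thus, for any $\rho > 0$, there exists $\varepsilon_r > 0$ for which the corresponding estimate holds. Let $\varepsilon_0 = \min\{\varepsilon, \varepsilon_r\}$. If $\|x - x^*\|_\infty < \varepsilon_0$ and $\|A - F'(x^*)\|_\infty < \delta_0$, then (19) follows from (27), (22), (25) and (24).

The following theorem is analogous to the theorem of the two neighborhoods of the differentiable case [10], and it guarantees linear convergence of the proposed algorithm. The name comes from its assumptions, which require two neighborhoods: one of the solution, in which the starting point must lie, and one of the Jacobian matrix of $F$ at the solution, in which the Jacobian approximations must lie.

Theorem 3.1. Let H1-H3 be verified and let $r \in (0, 1)$; then there exist positive constants $\varepsilon_1$ and $\delta_1$ such that, if $\|x_0 - x^*\|_\infty \le \varepsilon_1$ and $\|A_k - F'(x^*)\|_\infty \le \delta_1$ for all $k$, then the sequence $\{x_k\}$ generated by $x_{k+1} = x_k - B_k^{-1} \Phi_\lambda(x_k)$, with $B_k$ the matrix whose rows are defined by (15), is well defined, converges to $x^*$ and satisfies

$$\|x_{k+1} - x^*\|_\infty \le r\, \|x_k - x^*\|_\infty. \tag{28}$$

Proof. We consider the function $Q$ defined in (18). Thus, for all $k = 0, 1, \ldots$, $x_{k+1} = x_k - B_k^{-1} \Phi_\lambda(x_k)$, with $B_k$ defined by (15).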
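• Induction hypothesis: we assume that, for $k = m - 1$, $x_m$ is well defined and satisfies (28). Given that $\|x_m - x^*\|_\infty \le r\, \|x_{m-1} - x^*\|_\infty < \varepsilon_1$, and from the assumption $\|A_m - F'(x^*)\|_\infty \le \delta_1$, we have by Lemma 3.2 that $x_{m+1}$ is well defined and satisfies $\|x_{m+1} - x^*\|_\infty \le r\, \|x_m - x^*\|_\infty$. Therefore, we conclude that (28) is true for all $k = 0, 1, \ldots$.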
We observe that in the proof of Theorem 3.1 we used the infinity norm. Therefore, if $e_k = x_k - x^*$ is the error and $\|\cdot\|$ is any other norm, then $\|e_k\| \le \alpha\, r^k \|e_0\|$, where $\alpha$ is a positive constant that does not depend on $k$, and $r$ is as in Theorem 3.1.
Among the standard theorems of the quasi-Newton theory for systems of nonlinear equations is the theorem known as the Dennis-Moré condition [21], which gives a sufficient condition for superlinear convergence. The following theorem is analogous to the theorem just mentioned, and it will be useful in the next section to prove superlinear convergence of Algorithm 1. In its proof, we use $\|\cdot\| = \|\cdot\|_\infty$, but we recall that superlinear convergence results are norm-independent.

Theorem 3.2. Let H1-H3 be verified and suppose that, for some $x_0$, the sequence $\{x_k\}$ generated with $B_k$ defined by (15) converges to $x^*$ and satisfies a Dennis-Moré type condition on $s_k = x_{k+1} - x_k$; then the sequence $\{x_k\}$ converges superlinearly to $x^*$.
Proof.As we mentioned above, Φ λ is semismooth in x * , thus lim where On the other hand, also, Substituting (34) in (33) and using (35), we have

By the limit definition, in particular for $\rho = \tfrac{1}{2}$, there exists $\varepsilon > 0$ such that if From (34), On the other hand, where $s_k = x_{k+1} - x_k$. We add and subtract in the equality (38) the term Applying a norm and the triangle inequality, we obtain By (32), the first term on the right-hand side converges to 0. The second term converges to 0 by the semismoothness of $\Phi_\lambda$. Thus,
The quasi-Newton methods differ in how the matrix $A_k$ is updated at each iteration. Among the "practical" quasi-Newton algorithms are those called least change secant methods, in which the update of $A_k$, named $A_{k+1}$, must satisfy the secant equation [10] given by $A_{k+1}(x_{k+1} - x_k) = F(x_{k+1}) - F(x_k)$, and its change (measured in some norm) relative to $A_k$ must be minimum. Requiring that the secant equation and a minimum change be satisfied between two consecutive updates gives the sequence of matrices $\{A_k\}$ a property known as bounded deterioration [10], [8], which guarantees that the matrices of the sequence remain in a neighborhood of $F'(x^*)$. This is essential to demonstrate local and linear convergence. Thus, at each iteration of the least change secant algorithm, the vectors $x_k$ and $x_{k+1}$ define the set $V = V(x_k, x_{k+1})$ of matrices $B$ that satisfy the secant equation $B(x_{k+1} - x_k) = F(x_{k+1}) - F(x_k)$.

Given that we need the matrix in $V$ "nearest" to $A_k$, it is natural to think of the orthogonal projection of this matrix onto $V$, denoted $P_V(A_k) = P_{x_k, x_{k+1}}(A_k)$. Since $V$ is a closed set, we can ensure that $P_V(A_k) \in V$. This projection is unique because $V$ is a convex set. Therefore, $A_{k+1} = P_V(A_k)$. Different least change secant updates are obtained by varying the matrix norm on $\mathbb{R}^{n \times n}$ or the subspace $S$, producing the family of least change secant update methods. Examples are the "Good" Broyden update, the "Bad" Broyden update [22], the Schubert update [23] and the Sparse Schubert update [23].
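As a worked instance of this projection (a standard result stated here for illustration, with the shorthand $s_k = x_{k+1} - x_k$ and $y_k = F(x_{k+1}) - F(x_k)$ introduced for convenience), projecting $A_k$ in the Frobenius norm onto the set of all matrices satisfying the secant equation gives the "Good" Broyden update:

```latex
A_{k+1} = A_k + \frac{(y_k - A_k s_k)\, s_k^{T}}{s_k^{T} s_k},
\qquad
A_{k+1} s_k = A_k s_k + (y_k - A_k s_k)\, \frac{s_k^{T} s_k}{s_k^{T} s_k} = y_k .

% Minimum change: for any B with B s_k = y_k,
A_{k+1} - A_k = (B - A_k)\, \frac{s_k s_k^{T}}{s_k^{T} s_k}
\quad\Longrightarrow\quad
\| A_{k+1} - A_k \|_F \le \| B - A_k \|_F ,
\quad\text{since}\quad
\Big\| \frac{s_k s_k^{T}}{s_k^{T} s_k} \Big\|_2 = 1 .
```

The first line shows that the update satisfies the secant equation; the second shows that no other matrix satisfying it is closer to $A_k$ in the Frobenius norm.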
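Algorithm 2. Assume that $x_0$ and $A_0$ are arbitrary. For $k = 0, 1, \ldots$, $x_{k+1}$ and $A_{k+1}$ are generated as follows:

$$x_{k+1} = x_k - B_k^{-1} \Phi_\lambda(x_k),$$

$$A_{k+1} = P_{x_k, x_{k+1}}(A_k), \tag{45}$$

where $B_k$ is the matrix whose rows are defined by (15).

In order to develop the convergence theory of the least change secant update methods generated by Algorithm 2, we will make an additional assumption.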
H4. For all $x$, $z$ in a neighborhood of $x^*$, there is $A \in V(x, z)$ whose distance to $F'(x^*)$ is controlled by $\sigma(x, z) = \max\{\|x - x^*\|, \|z - x^*\|\}$.
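The structure of Algorithm 2 can be sketched as follows. This is a minimal illustration, not the authors' Matlab implementation: it assumes the "Good" Broyden update as the projection, a small Gaussian elimination routine `solve`, the direction $z = (1, \ldots, 1)$ for degenerate indices, and a hypothetical decoupled test mapping; all function names are introduced here.

```python
import math


def phi(a, b, lam):
    return math.sqrt((a - b) ** 2 + lam * a * b) - a - b


def grad_phi(a, b, lam):
    root = math.sqrt((a - b) ** 2 + lam * a * b)
    return ((2.0 * (a - b) + lam * b) / (2.0 * root) - 1.0,
            (-2.0 * (a - b) + lam * a) / (2.0 * root) - 1.0)


def solve(M, rhs):
    """Solve M u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    u = [0.0] * n
    for i in reversed(range(n)):
        u[i] = (aug[i][n] - sum(aug[i][j] * u[j] for j in range(i + 1, n))) / aug[i][i]
    return u


def good_broyden(A, s, y):
    """Least change (Frobenius norm) update satisfying the secant equation."""
    n = len(s)
    ss = sum(v * v for v in s)
    if ss == 0.0:
        return A
    As = [sum(A[i][j] * s[j] for j in range(n)) for i in range(n)]
    return [[A[i][j] + (y[i] - As[i]) * s[j] / ss for j in range(n)] for i in range(n)]


def lcsu_ncp(F, x, A, lam, tol=1e-10, max_iter=50):
    n = len(x)
    for k in range(max_iter):
        Fx = F(x)
        Phi = [phi(x[i], Fx[i], lam) for i in range(n)]
        if max(abs(v) for v in Phi) < tol:
            return x, k
        B = []
        for i in range(n):
            a, b = x[i], Fx[i]
            if a == 0.0 and b == 0.0:
                a, b = 1.0, sum(A[i])  # assumption: direction z = (1, ..., 1)
            chi, psi = grad_phi(a, b, lam)
            row = [psi * A[i][j] for j in range(n)]
            row[i] += chi
            B.append(row)
        step = solve(B, [-v for v in Phi])
        x_new = [x[i] + step[i] for i in range(n)]
        s = [x_new[i] - x[i] for i in range(n)]
        y = [fn - fo for fn, fo in zip(F(x_new), Fx)]
        A = good_broyden(A, s, y)
        x = x_new
    return x, max_iter


# Hypothetical toy problem: F(x) = (x0 - 1, x1 + 1), with solution x* = (1, 0).
F_toy = lambda x: [x[0] - 1.0, x[1] + 1.0]
x_sol, iters = lcsu_ncp(F_toy, [1.1, 0.1], [[1.05, 0.0], [0.0, 0.95]], lam=2.0)
print(x_sol, iters)
```

The initial matrix is deliberately a slightly perturbed Jacobian and the starting point is close to the solution, which mirrors the two-neighborhood setting of the local theory; the sketch makes no claim about behavior far from the solution.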

Additional convergence results
In the next lemma, we show that a matrix generated using the rule (45) to update the matrix A k may deteriorate, but in a controlled way.
Lemma 5.1. Let H1-H4 be verified and let $A_+$ be the orthogonal projection of $A$ on the set $V(x, z)$ and $\hat{A}$ the orthogonal projection of

Lemma 5.2. Let assumptions H1-H4 be verified. Then there exists $c > 0$ such that whenever the vectors $x$ and $y$ belong to a neighborhood of $x^*$, with $\|y - x^*\| \le \|x - x^*\|$, and the matrix $A$ is in a neighborhood of $F'(x^*)$.
The two previous lemmas (see proofs in [16] and [24], respectively) and assumptions H1-H4 are central to ensuring the following result.
Thus, in either case, it is possible to choose ǫ in (0, ǫ 1 ) 0, (δ We will use induction on k in the proof of this theorem. ) is well defined and satisfies
Lemma 5.3. Assume that H1-H4 are verified and let the sequence $\{A_k\}$ be generated by (45). There are positive constants $\varepsilon$ and $\delta$ such that, if $\|x_0 - x^*\| \le \varepsilon$ and $\|A_0 - F'(x^*)\| \le \delta$, and the sequence

With this result (see the proof in [16]), we can derive a sufficient condition for superlinear convergence, as shown by the next theorem.
Theorem 5.2. Let Assumptions H1-H4 hold and let the sequences $\{x_k\}$ and $\{A_k\}$ be generated by Algorithm 2, with $\lim_{k \to \infty} x_k = x^*$; then the sequence $\{x_k\}$ converges superlinearly to $x^*$.
The proof follows in a straightforward way from Theorems 4.2 and 4.3.
From (54) and Lemma 7.3, we have that the right-hand side of the last inequality is equal to zero. So the Dennis-Moré type condition of Theorem 3.2 holds. Therefore, the sequence $\{x_k\}$ converges superlinearly to $x^*$.

Some numerical experiments
In this section, we analyze numerically the local behavior of the family of least change secant update methods introduced in Section 2. For this, we compare our algorithms with the Generalized Newton method proposed in [6]. Algorithm 3 is a Generalized Newton type method. It uses, at each iteration, the matrix $H_k$ (defined in the previous section), which uses the Jacobian matrix of $F$.
Algorithm 4, which is a least change secant type method, is based on Algorithm 2, which we proposed in Section 3. For updating the matrix $A_k$ in each iteration, we use four formulas: "Good" Broyden, "Bad" Broyden, Schubert and Sparse Schubert, whereby we have four versions of Algorithm 4.
For all the tests, we use the software Matlab. We use 8 test problems for nonlinear complementarity, four of which we chose from the list proposed in [14] and which are considered "hard problems". These are the Kojima-Shindo (application to Economic Equilibrium [25]), Kojima-Josephy, Nash-Cournot (application to Game Theory, Harker) and Modified Mathiesen (application to Walrasian Economic Equilibrium [26]) problems. We generate the four remaining problems as in [27]; for this, we define $F : \mathbb{R}^n \to \mathbb{R}^n$ from functions $h_i$ given by Lukšan [18], namely the Trigonometric system, Exponential trigonometric, Tridiagonal and Rosenbrock functions. For these functions, the vector $x^* = (1, 0, 1, 0, \ldots) \in \mathbb{R}^n$ is a degenerate solution. We use the same stopping criteria proposed in [28]. We choose the parameter $\lambda := \lambda_{\min}$ used in Algorithms 3 and 4 as follows:

1. We vary $\lambda$ in the interval $(0, 4)$ from $\lambda = 10^{-3}$ to $\lambda = 3.999$ with increments of $10^{-3}$.
2. We use the generalized Newton method (Algorithm 3) with each of these values of $\lambda$.
3. We call $\lambda_{\min}$ the value of $\lambda$ for which the generalized Newton method converges in the fewest iterations.
We use $\lambda_{\min}$ as the parameter $\lambda$ in Algorithms 3 and 4. The initial approximations are the same as in [14] and [18].
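The selection of $\lambda_{\min}$ described above can be sketched on a one-dimensional problem. This is only an illustration of the sweep: the scalar mapping $F(t) = t - 1$ (with solution $t^* = 1$), the starting point, the tolerance and the function name `newton_iterations` are all assumptions made here, and the generalized Newton step uses the scalar derivative $\chi + \psi F'(t)$ with $F'(t) = 1$.

```python
import math


def newton_iterations(lam, t0=1.5, tol=1e-10, max_iter=100):
    """Iterations of Newton's method on phi_lam(t, F(t)) = 0 with F(t) = t - 1.

    Returns None if the tolerance is not reached within max_iter iterations.
    """
    t = t0
    for k in range(max_iter + 1):
        a, b = t, t - 1.0
        root = math.sqrt((a - b) ** 2 + lam * a * b)
        g = root - a - b
        if abs(g) < tol:
            return k
        dg = lam * (a + b) / (2.0 * root) - 2.0  # chi + psi * F'(t), with F'(t) = 1
        t -= g / dg
    return None


# Step 1: grid of lambda values from 1e-3 to 3.999 with increments of 1e-3.
grid = [k * 1e-3 for k in range(1, 4000)]

# Step 2: run the method for each value of lambda.
counts = {lam: newton_iterations(lam) for lam in grid}

# Step 3: keep the value with the fewest iterations (first one in case of ties).
converged = {lam: c for lam, c in counts.items() if c is not None}
lam_min = min(converged, key=converged.get)
best_iters = converged[lam_min]
print(lam_min, best_iters)
```

Ties are broken in favor of the smallest $\lambda$ on the grid, which is a choice of this sketch; the original text does not say how ties are handled.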
Table 1 presents the results of our numerical tests. Its columns contain the following information: Problem is the problem name, $n$ is the problem dimension, and $\lambda_{\min}$ is the value of $\lambda$ for which Algorithm 3 converges in the fewest iterations. We also include a column with the algorithm and the secant update used. Thus, GN means Generalized Newton; SSU, BBU, SU and GBU mean Algorithm 4 with the Sparse Schubert Update, the "Bad" Broyden Update, the Schubert Update and the "Good" Broyden Update, respectively. A "−" sign means divergence.
From Table 1, we observe that, for these preliminary numerical tests, the Algorithm 4 that we proposed for solving the NCP has good local behavior.In particular, we highlight the Modified Mathiesen problem, in which each method converges with the same number of iterations but to different solutions to the problem.

Conclusions
In this paper, we propose a quasi-Newton method for solving the nonlinear complementarity problem when it is reformulated as a nonlinear system of equations. This method can be useful when the derivatives of the system are very expensive or difficult to obtain. Moreover, we generated a family of least change secant update methods that, under certain hypotheses, converge locally and superlinearly to the solution of the problem. Some numerical experiments show good local performance of this algorithm, but more numerical tests are necessary using other well-known LCSU methods, such as the Column Updating method [29] and the Inverse Column Updating method [30]. It is also necessary to incorporate a globalization strategy into the proposed algorithm and to develop the theoretical and numerical analysis of the global algorithm.

referees for constructive suggestions to the first version of this paper, which allowed us to improve the presentation of this article.

Table 1: Local behavior of Algorithms 3 and 4.