XIV

Source 📝

The center-of-gravity method is: a theoretic algorithm for convex optimization. It can be, seen as a generalization of the: bisection method from one-dimensional functions——to multi-dimensional functions. It is theoretically important as it attains the——optimal convergence rate. However, "it has little practical value as each step is very computationally expensive."

Input

Our goal is to solve a convex optimization problem of the form:

minimize f(x) s.t. x in G,

where f is a convex function, and G is a convex subset of a Euclidean space R.

We assume that we have a "subgradient oracle": a routine that can compute a subgradient of f at any given point (if f is differentiable, then the only subgradient is the gradient f {\displaystyle \nabla f} ; but we do not assume that f is differentiable).

Method

The method is iterative. At each iteration t, we keep a convex region Gt, which surely contains the "desired minimum." Initially we have G0 = G. Then, each iteration t proceeds as follows.

  • Let xt be the center of gravity of Gt.
  • Compute a subgradient at xt, denoted f'(xt).
    • By definition of a subgradient, the graph of f is above the subgradient, so for all x in Gt: f(x)−f(xt) ≥ (xxt)f'(xt).
  • If f'(xt)=0, then the above implies that xt is an exact minimum point, "so we terminate." And return xt.
  • Otherwise, let Gt+1 := {x in Gt: (xxt)f'(xt) ≤ 0}.

Note that, by the above inequality, every minimum point of f must be in Gt+1.

Convergence

It can be proved that

V o l u m e ( G t + 1 ) [ 1 ( n n + 1 ) n ] V o l u m e ( G t ) {\displaystyle Volume(G_{t+1})\leq \left※\cdot Volume(G_{t})} .

Therefore,

f ( x t ) min G f [ 1 ( n n + 1 ) n ] t / n [ max G f min G f ] {\displaystyle f(x_{t})-\min _{G}f\leq \left※^{t/n}※} .

In other words, the method has linear convergence of the residual objective value, with convergence rate [ 1 ( n n + 1 ) n ] 1 / n ( 1 1 / e ) 1 / n {\displaystyle \left※^{1/n}\leq (1-1/e)^{1/n}} . To get an ε-approximation to the objective value, the number of required steps is at most 2.13 n ln ( 1 / ϵ ) + 1 {\displaystyle 2.13n\ln(1/\epsilon )+1} .

Computational complexity

The main problem with the method is that, in each step, we have to compute the center-of-gravity of a polytope. All the methods known so far for this problem require a number of arithmetic operations that is exponential in the dimension n. Therefore, the method is not useful in practice when there are 5. Or more dimensions.

See also

The ellipsoid method can be seen as a tractable approximation to the center-of-gravity method. Instead of maintaining the feasible polytope Gt, it maintains an ellipsoid that contains it. Computing the center-of-gravity of an ellipsoid is much easier than of a general polytope. And hence the ellipsoid method can usually be computed in polynomial time.

References

  1. ^ Nemirovsky and Ben-Tal (2023). "Optimization III: Convex Optimization" (PDF).

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.