Jump to content

Random coordinate descent

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by 94.174.211.147 (talk) at 21:02, 20 February 2012. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Randomized (Block) Coordinate Descent Method is an optimization algorithm popularized by Nesterov (2010) and Richtárik and Takáč (2011)[1]. The first analysis of this method, when applied to the problem of minimizing a smooth convex function, was done by Nesterov (2010)[2]. In Nesterov's analysis, the method needs to be applied to a quadratic perturbation of the original function with an unknown scaling factor. Richtárik and Takáč (2011) give iteration complexity bounds which do not require this, i.e., the method is applied to the objective function directly.


Algorithm

Imagine that one wants to solve problem where is a Smooth function. Nesterov showed that if the gradient of is coordinate-wise Lipschitz continuous, with constants (i.e. ) then the following algorithm converges to the optimal value:

Algorithm Random Coordinate Descent Method
  Input:  //starting point
  Output: 
  set x=x_0
  for k=1,... do
     choose random coordinate 
     update  
  endfor;
  • "←" denotes assignment. For instance, "largestitem" means that the value of largest changes to the value of item.
  • "return" terminates the algorithm and outputs the following value.

Convergence rate

Since the iterates of this algorithm are random vectors, a complexity result would give a bound on the number of iterations needed for the method to output an approximate solution with high probability. It was shown in [3] that if , where , is an optimal solution (), is our confidence level and is our target accuracy, then .

Example on particular function

The following Figure shows how develops during iterations, in principle. The problem is

Add caption here

Extension to Block Coordinate Setting

Blocking coordinate directions into Block coordinate directions

One can naturaly extend this algorithm not only just to coordinates, but to blocks of coordinates. Assume that we have space . This space has 5 coorinate directions, concretely in which Random Coordinate Descent Method can move. Hovewer, one can group some coordinate directions into blocks and we can have istead of those 5 coordinate directions 3 block coordinate directions (see image).

See also


References

  1. ^ Richtárik, Peter; Takáč, Martin (2011), Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function {{citation}}: line feed character in |title= at position 60 (help)
  2. ^ Nesterov, Yurii (2010), "Efficiency of coordinate descent methods on huge-scale optimization problems", CORE Discussion Paper, no. \#2010/2
  3. ^ Richtárik, Peter; Takáč, Martin (2011), Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function {{citation}}: line feed character in |title= at position 60 (help)