MM algorithm
The MM algorithm is an iterative optimization method that exploits the convexity of a function in order to find its maxima or minima. "MM" stands for "majorize-minimization" or "minorize-maximization", depending on whether minimization or maximization is desired. MM itself is not an algorithm, but a description of how to construct an optimization algorithm.
The EM algorithm can be treated as a special case of the MM algorithm. However, the EM algorithm usually involves complex conditional expectations and extensive analytical work, whereas the MM algorithm focuses mainly on convexity and inequalities, making it comparatively easy to understand and apply in most cases.
History
The original idea of the MM algorithm can be traced back at least to 1970, when Ortega and Rheinboldt were conducting studies related to line search methods.[1] The same idea kept reappearing under different guises in different areas until 2000, when Hunter and Lange put it into a general framework and coined the name "MM" for the first time.[2] Recent studies have shown that it can be used in a wide range of contexts, including mathematics, statistics, machine learning, and engineering.
How it works
The MM algorithm works by finding a surrogate function that minorizes or majorizes the objective function. Optimizing the surrogate function drives the objective function upward or downward until a local optimum is reached.
Take the minorize-maximization version for example.
Let $f(\theta)$ be the objective concave function to be maximized. At step $m$ of the algorithm, $m = 0, 1, \ldots$, the constructed function $g(\theta \mid \theta_m)$ will be called the minorized version of the objective function (the surrogate function) at $\theta_m$ if

$g(\theta \mid \theta_m) \le f(\theta)$ for all $\theta$, and

$g(\theta_m \mid \theta_m) = f(\theta_m)$.

Then we maximize $g(\theta \mid \theta_m)$ instead of $f(\theta)$, and let

$\theta_{m+1} = \arg\max_{\theta} g(\theta \mid \theta_m)$.

The above iterative method guarantees that $f(\theta_m)$ converges to a local optimum or a saddle point as $m$ goes to infinity, because by construction

$f(\theta_{m+1}) \ge g(\theta_{m+1} \mid \theta_m) \ge g(\theta_m \mid \theta_m) = f(\theta_m)$.
The progression of the iterates $\theta_m$ and of the surrogate functions relative to the objective function is shown in the figure.
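To make the iteration concrete, here is a minimal minorize-maximization sketch in Python (an illustrative example, not taken from the article; the toy objective and data are assumptions). The objective $f(\theta) = -\lVert A\theta - b \rVert^2$ is concave with an $L$-Lipschitz gradient, so a quadratic minorizer is available in closed form, and maximizing it reproduces a gradient-ascent step.

```python
import numpy as np

# Minorize-maximization sketch: maximize the concave objective
#   f(theta) = -||A theta - b||^2,
# whose gradient is L-Lipschitz with L = 2 * lambda_max(A^T A). The quadratic
#   g(theta | theta_m) = f(theta_m) + grad_f(theta_m)^T (theta - theta_m)
#                        - (L/2) * ||theta - theta_m||^2
# satisfies g <= f everywhere and g(theta_m | theta_m) = f(theta_m), so it is
# a valid minorizer; its exact maximizer is theta_m + grad_f(theta_m) / L.

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3))   # assumed toy data
b = rng.standard_normal(20)

def f(theta):
    r = A @ theta - b
    return -r @ r

def grad_f(theta):
    return -2.0 * A.T @ (A @ theta - b)

L = 2.0 * np.linalg.eigvalsh(A.T @ A).max()    # Lipschitz constant of grad f

theta = np.zeros(3)
for m in range(500):
    theta_next = theta + grad_f(theta) / L     # exact argmax of the surrogate
    assert f(theta_next) >= f(theta) - 1e-9    # ascent: f(theta_{m+1}) >= f(theta_m)
    theta = theta_next

print(theta)                                   # approaches the least-squares fit
print(np.linalg.lstsq(A, b, rcond=None)[0])
```

Any surrogate satisfying the two conditions above yields the same ascent guarantee; the quadratic choice is simply the easiest to maximize in closed form.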

Ways to construct surrogate functions
In principle, any inequality can be used to construct the desired majorized or minorized version of the objective function, but there are several typical choices (a worked sketch follows the list):
- Jensen's inequality
- Convexity inequality
- Cauchy–Schwarz inequality
- Inequality of arithmetic and geometric means
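As an illustration of the last construction (a sketch under assumed toy data; the geometric-median example itself is not from the article): the inequality of arithmetic and geometric means majorizes each Euclidean norm by a quadratic, and minimizing the resulting surrogate yields the classic Weiszfeld iteration for the geometric median.

```python
import numpy as np

# Majorize-minimization via the AM-GM inequality. For the objective
#   f(theta) = sum_i ||theta - x_i||   (the geometric-median problem),
# AM-GM gives  ||theta - x_i|| <= ||theta - x_i||^2 / (2 v_i) + v_i / 2
# with v_i = ||theta_m - x_i||, and equality at theta = theta_m. Summing over
# i gives a quadratic majorizer whose minimizer is a weighted average of the
# points; iterating this is known as Weiszfeld's algorithm.

rng = np.random.default_rng(1)
x = rng.standard_normal((50, 2))    # assumed data points x_i

def f(theta):
    return np.linalg.norm(x - theta, axis=1).sum()

theta = x.mean(axis=0)              # start from the centroid
for m in range(100):
    v = np.linalg.norm(x - theta, axis=1)      # v_i = ||theta_m - x_i||
    w = 1.0 / np.maximum(v, 1e-12)             # guard against a zero distance
    theta_next = (w[:, None] * x).sum(axis=0) / w.sum()  # argmin of surrogate
    assert f(theta_next) <= f(theta) + 1e-9    # descent: f(theta_{m+1}) <= f(theta_m)
    theta = theta_next

print(theta, f(theta))
```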
References
- ^ Ortega, J.M.; Rheinboldt, W.C. (1970). Iterative Solutions of Nonlinear Equations in Several Variables. New York: Academic Press. pp. 253–255.
- ^ Hunter, D.R.; Lange, K. (2000). "Quantile Regression via an MM Algorithm". Journal of Computational and Graphical Statistics. 9: 60–77.