LogSumExp
The LogSumExp (LSE) function is a smooth approximation to the maximum function, mainly used by machine learning algorithms.[1] It is defined as the logarithm of the sum of the exponentials of the arguments:

$$\mathrm{LSE}(x_1, \dots, x_n) = \log\left(\exp(x_1) + \cdots + \exp(x_n)\right).$$
The LogSumExp function domain is $\mathbb{R}^n$, the real coordinate space, and its range is $\mathbb{R}$, the real line. The larger the values of the $x_i$ or their deviation from one another, the better the approximation of the maximum becomes. The LogSumExp function is convex, and is strictly monotonically increasing everywhere in its domain[2] (but it is not strictly convex everywhere[3]).
On the other hand, LSE can be well approximated by $\max\{x_1, \dots, x_n\}$, owing to the following tight bounds:

$$\max\{x_1, \dots, x_n\} \le \mathrm{LSE}(x_1, \dots, x_n) \le \max\{x_1, \dots, x_n\} + \log(n).$$
The lower bound is approached when all but one of the arguments tend to negative infinity (and is an equality when $n = 1$), and the upper bound is attained when all the arguments are equal.
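As a rough illustration, the sketch below (plain Python; the helper name lse_naive is made up for this example) evaluates the definition directly and checks it against the two bounds above.

```python
import math

def lse_naive(xs):
    """Direct evaluation of log(sum(exp(x_i))); fine for moderate inputs."""
    return math.log(sum(math.exp(x) for x in xs))

xs = [1.0, 2.0, 3.0]
m = max(xs)
print(m, lse_naive(xs), m + math.log(len(xs)))
# 3.0 <= 3.4076... <= 3.0 + log(3) = 4.0986...
```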
log-sum-exp trick for log-domain calculations
The LSE function is often encountered when the usual arithmetic computations are performed in log-domain or log-scale.
Just as a multiplication in linear scale becomes a simple addition in log scale, an addition in linear scale becomes the LSE in the log domain.
A common purpose of using log-domain computations is to increase accuracy and avoid underflow and overflow problems when very small or very large numbers are represented directly (i.e. in a linear domain) using limited-precision floating-point numbers.
Unfortunately, using LSE directly in this case can again cause overflow/underflow problems. Therefore, the following equivalent must be used instead (especially when the accuracy of the above 'max' approximation is not sufficient):

$$\mathrm{LSE}(x_1, \dots, x_n) = x^* + \log\left(\exp(x_1 - x^*) + \cdots + \exp(x_n - x^*)\right),$$

where $x^* = \max\{x_1, \dots, x_n\}$.

Many math libraries such as IT++ provide a default routine for LSE and use this formula internally.
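A minimal sketch of this trick in Python (the function name logsumexp is illustrative, not a reference to any particular library routine): subtracting $x^*$ first keeps every exponentiated term at most 1, so the sum cannot overflow.

```python
import math

def logsumexp(xs):
    """Log-sum-exp trick: shift by the maximum before exponentiating."""
    x_star = max(xs)
    if math.isinf(x_star):  # all arguments are -inf (or one is +inf)
        return x_star
    return x_star + math.log(sum(math.exp(x - x_star) for x in xs))

# The naive evaluation overflows here, the shifted form does not:
xs = [1000.0, 1000.5, 999.0]
print(logsumexp(xs))  # about 1001.104
# math.log(sum(math.exp(x) for x in xs)) would raise OverflowError
```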
A strictly convex log-sum-exp type function
LSE is convex but not strictly convex. A strictly convex log-sum-exp type function[4] can be defined by adding an extra argument set to zero:

$$\mathrm{LSE}_0^+(x_1, \dots, x_n) := \mathrm{LSE}(0, x_1, \dots, x_n).$$
This function is a proper Bregman generator (strictly convex and differentiable). It is encountered in machine learning, for example as the cumulant of the multinomial/binomial family.
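As a small illustration (assuming the definition of $\mathrm{LSE}_0^+$ above; the name lse0 is made up for this sketch), the extra zero argument means the one-dimensional case reduces to the softplus function $\log(1 + \exp(x))$:

```python
import math

def lse0(xs):
    """LSE of the arguments augmented with an extra 0, using the max-shift trick."""
    x_star = max(0.0, max(xs))
    return x_star + math.log(
        math.exp(-x_star) + sum(math.exp(x - x_star) for x in xs)
    )

# With a single argument this is log(1 + exp(x)), the softplus function
# (the cumulant of the binomial/Bernoulli family):
print(lse0([2.0]))                 # about 2.1269
print(math.log1p(math.exp(2.0)))   # same value
```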
References
- ^ Nielsen, Frank; Sun, Ke (2016). "Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities". Entropy. 18: 442. arXiv:1606.05850. Bibcode:2016Entrp..18..442N. doi:10.3390/e18120442.
- ^ El Ghaoui, Laurent (2015). Optimization Models and Applications.
- ^ "convex analysis - About the strictly convexity of log-sum-exp function - Mathematics Stack Exchange". stackexchange.com.
- ^ Nielsen, Frank; Hadjeres, Gaetan (2018). "Monte Carlo Information Geometry: The dually flat case". arXiv:1803.07225. Bibcode:2018arXiv180307225N.