Talk:Universal approximation theorem

Mathematics Start‑class Low‑priority

	This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Start	This article has been rated as Start-class on Wikipedia's content assessment scale.
Low	This article has been rated as Low-priority on the project's priority scale.

Deep variants

The two deep variants appear to be somewhat dubious. The citation is just a conference proceeding, and it omits the proofs. If an expert knows of any more reliable sources, that would be ideal. - Preceding unsigned comment added by Pabnau (talk • contribs) 01:47, 21 April 2019 (UTC)

In the last paragraph of the introduction, the result of n+1 width on continuous convex functions is stated as an "improvement" over the result of n+4 width on Lebesgue-integrable functions. Without additional knowledge of those papers, it's not clear to me why this should be considered an "improvement", since continuous convex functions are a much more restrictive class, and the network width difference is not asymptotically significant. - anonymous commenter

Hanin and Sellke's work is an improvement over Lu et al.'s because their result applies to general continuous functions, not just convex ones; furthermore, they work in the uniform topology rather than the L1 topology; and their networks are narrower. This does still all apply only to the ReLU activation function. There has also been another recent paper for general activation functions, but I'm an author on that so I don't know if it's acceptable for me to go around editing Wikipedia pages to mention it... (see also my discussion in 'Out of date', below.) 82.14.199.121 (talk) 13:40, 7 October 2019 (UTC)

Vague wording

The line "All Lebesgue integrable functions except for a zero measure set cannot be approximated by width-n ReLU networks" is confusing. What did the Wikipedia editor mean to say? It's clearly not true that "All Lebesgue integrable functions cannot be approximated by width-n ReLU networks", because if you take a given width-n ReLU network, it defines a Lebesgue integrable function.

Maybe the original editor meant to say "Not all Lebesgue integrable functions can be approximated by width-n ReLU networks (approximation up to a set of zero measure)", or "There exists a Lebesgue integrable function that cannot be approximated by any width-n ReLU network (even up to a set of zero measure)." Lavaka (talk) 22:14, 16 August 2019 (UTC)

Out of date

This page is about twenty years out of date. A simpler, more general form of the universal approximation theorem has been known since 1999 (http://www2.math.technion.ac.il/~pinkus/papers/acta.pdf). As mentioned in the discussion above, there's also some confusion about the deep variants of this theorem. I am happy to bring this page up to date, as long as that's not breaking any Wikipedia rules about discussing one's own work? (As I have a paper on this topic myself; see also my comment in 'Deep variants', above.) 82.14.199.121 (talk) 13:40, 7 October 2019 (UTC)