
Talk:Universal approximation theorem

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by PatrickKidger (talk | contribs) at 08:52, 1 July 2020. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.
WikiProject Mathematics (Start-class, Low-priority)
This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia.
This article has been rated as Start-class on Wikipedia's content assessment scale.
This article has been rated as Low-priority on the project's priority scale.

Deep variants

The two deep variants appear to be somewhat dubious. The citation is just a conference proceeding, and it omits the proofs. If an expert knows of any more reliable sources, adding them would be ideal. (Preceding unsigned comment added by Pabnau (talk | contribs) 01:47, 21 April 2019 (UTC))


In the last paragraph of the introduction, the result of n+1 width on continuous convex functions is stated as an "improvement" over the result of n+4 width on Lebesgue-integrable functions. Without additional knowledge of those papers, it's not clear to me why this should be considered an "improvement", since continuous convex functions are a much more restrictive class, and the network width difference is not asymptotically significant. - anonymous commenter


Hanin and Sellke's work is an improvement over Lu et al.'s because their result applies to general continuous functions, not just convex ones; furthermore, they work in the uniform topology rather than the L1 topology; furthermore, their networks are narrower. All of this is still specific to the ReLU activation function. There has also been another recent paper for general activation functions, but I'm an author on that, so I don't know if it's acceptable for me to go around editing Wikipedia pages to mention it... (See also my discussion in 'Out of date', below.) 82.14.199.121 (talk) 13:40, 7 October 2019 (UTC)


Vague wording

The line "All Lebesgue integrable functions except for a zero measure set cannot be approximated by width-n ReLU networks" is confusing. What did the Wikipedia editor mean to say? It's clearly not true that "All Lebesgue integrable functions cannot be approximated by width-n ReLU networks", because a given width-n ReLU network itself defines a Lebesgue integrable function.

Maybe the original editor meant to say "Not all Lebesgue integrable functions can be approximated by width-n ReLU networks (approximation up to a set of zero measure)", or "There exists a Lebesgue integrable function that cannot be approximated by any width-n ReLU network (even up to a set of zero measure)." Lavaka (talk) 22:14, 16 August 2019 (UTC)


Out of date

This page is about twenty years out of date. A simpler, more general form of the universal approximation theorem has been known since 1999 (http://www2.math.technion.ac.il/~pinkus/papers/acta.pdf). As mentioned in the discussion above, there's also some confusion about the deep variants of this theorem. I am happy to bring this page up to date, as long as that's not breaking any Wikipedia rules about discussing one's own work. (I have a paper on this topic myself; see also my comment in 'Deep variants', above.) 82.14.199.121 (talk) 13:40, 7 October 2019 (UTC)

I am not a seasoned Wikipedian, so take my upvote cautiously. But I am 100% in favor of this. I hope it is not a stale offer by now. Wikipedia allows paid editing of pages as long as the user discloses that they are paid, and there is usually a clear COI there. People just check their edits to ensure they are not being unfair. The Wikipedia:Plain_and_simple_conflict_of_interest_guide recommends that people in your situation make suggestions on the talk page, if I am reading it correctly. But I think it would also allow you to make changes as long as they are reviewed and your specific COI is declared. I suggest copying the page to your sandbox and working on it there, then asking someone to review it and give either suggestions or an OK for merging it in. This could be viewed as making a lot of really specific suggestions, which are the best kind. I would love to help with that if you are still interested. Themumblingprophet (talk) 12:54, 1 May 2020 (UTC)
The offer stands! (I'm the previously anonymous user from above.) I'll try to put something together, and then hopefully run it past you. My first time editing a Wikipedia article; very exciting. PatrickKidger (talk) 18:01, 10 June 2020 (UTC)
First of all, welcome to Wikipedia and thank you for your contribution! I certainly agree that the article needed an update as it lacked novel developments. On the other hand, I am not entirely happy with the changes, for several reasons:
  • In my opinion, you should avoid adding your own works to WP, as you may very likely be biased with respect to their importance. This is especially true if your work is not yet published (even if it is accepted to some conference). Even if you think that it is elegant and simplifies the results, you should let others decide. If it is really that good, then it will eventually be added to WP. WP is not for advertising your work.
  • For similar reasons, we prefer secondary sources. For example, if a scientific book or university textbook surveys your theorem, then it is an indicator of its importance. It looks quite strange that you gave the same weight to your new result and to the classical well-cited formulation of the theorem (contained in dozens of scientific books). Moreover, you have simply removed other results, which is also not elegant.
  • I am a bit disappointed that you have deleted the proof of the original version of the theorem. In my opinion, it was very instructive and would be nice to have. The aim of WP is not to present the newest results (because there are tons of such results and it is hard to decide their importance), but to show standard, widespread formulations for a general audience. For example, the article on Hoeffding's inequality starts with the Bernoulli case, even though there are much more general formulations of the result.
Therefore, I am planning to put back the original formulation of the theorem with its proof, and move your contribution to another section. Cheers, KœrteFa {ταλκ} 10:59, 30 June 2020 (UTC)
PS: You have also removed the comment that the proofs are usually not constructive, which is a very important point and should be mentioned. KœrteFa {ταλκ} 11:05, 30 June 2020 (UTC)
@KœrteFa -- I appreciate your concern.
  • I completely appreciate that one should avoid adding one's own work - this is why I waited between the preprint on arXiv (last year) and peer-reviewed publication (this year) to make any changes involving my work. The paper is not just accepted to some conference; it is both peer reviewed and published.
  • I certainly agree that in general it is instructive to present simple widespread formulations for a general audience! This was the primary purpose of my edit - the previous versions all make additional assumptions; current versions by contrast are much more straightforward. In particular I do not think the current state of having so many versions of the theorem proliferating in the article helps this matter. On the topic of sketchproofs, I agree that these can be instructive, and perhaps a sketchproof of Pinkus' version in particular is worth adding. I do not see the interest in including the sketchproof of Cybenko's result, which is far harder to follow than Pinkus' elegant proof.
  • I removed the comment on non-constructivity because it is false.
For the above reasons I would like to revert your edits, but I'd prefer to have a discussion about it here first. PatrickKidger (talk) 08:51, 1 July 2020 (UTC)