Swish function

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by EternalNomad (talk | contribs) at 02:15, 1 May 2020 (Start). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The swish function is a mathematical function defined as follows:

swish(x) = x · sigmoid(βx) = x / (1 + e^(−βx)),

where β is a constant or trainable parameter; with β = 1, the function reduces to swish(x) = x / (1 + e^(−x)).
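The definition of swish, x · sigmoid(βx), can be sketched directly in Python (the function names here are illustrative, not from any particular library):

```python
import math

def sigmoid(x):
    """Standard logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def swish(x, beta=1.0):
    """Swish activation: x * sigmoid(beta * x).

    With beta = 1 this is the commonly used form of swish.
    """
    return x * sigmoid(beta * x)

# swish(0) = 0, since the x factor vanishes at the origin
print(swish(0.0))
# For large positive x, swish(x) approaches x (sigmoid saturates at 1)
print(swish(10.0))
```

Unlike ReLU, swish is smooth everywhere and non-monotonic: it dips slightly below zero for small negative inputs before approaching zero.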

Applications

In 2017, after performing analysis on ImageNet data, researchers at Google reported that using the function as an activation function in artificial neural networks improved performance compared to the ReLU and sigmoid functions.[1] It is believed that one reason for the improvement is that the swish function helps alleviate the vanishing gradient problem during backpropagation.[2]

References

  1. ^ "Searching for Activation Functions" (PDF). 27 October 2017.
  2. ^ Swish as Neural Networks Activation Function