Top-p sampling

From Wikipedia, the free encyclopedia
Top-p sampling, also called nucleus sampling, is a technique for language model decoding introduced by Ari Holtzman and colleagues in 2019.[1] Naively choosing the highest-probability token at each step of auto-regressive decoding (greedy decoding) is known to produce text that is repetitive and otherwise unnatural. Top-p sampling avoids this by setting a threshold p and restricting sampling to the smallest set of most probable tokens whose cumulative probability reaches p.
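The procedure can be sketched in a few lines of Python; the function name, the use of NumPy, and the default threshold below are illustrative assumptions rather than part of the original description:

    import numpy as np

    def top_p_sample(probs, p=0.9, rng=None):
        rng = rng or np.random.default_rng()
        # Sort token probabilities from most to least probable.
        order = np.argsort(probs)[::-1]
        cumulative = np.cumsum(probs[order])
        # Keep the smallest prefix whose cumulative probability reaches p (the "nucleus").
        cutoff = np.searchsorted(cumulative, p) + 1
        nucleus = order[:cutoff]
        # Renormalize within the nucleus and sample one token index.
        nucleus_probs = probs[nucleus] / probs[nucleus].sum()
        return rng.choice(nucleus, p=nucleus_probs)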

Top-k sampling is similar, except that the sample is drawn from the k highest-probability tokens regardless of their cumulative probability. The advantage of top-p sampling is that it avoids the difficult problem of choosing an optimal value of k, which can vary depending on the shape of the output distribution and on the particular task and dataset.[2]
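A top-k counterpart, under the same illustrative assumptions as the sketch above, simply truncates to a fixed number of tokens:

    def top_k_sample(probs, k=50, rng=None):
        rng = rng or np.random.default_rng()
        # Keep the k highest-probability tokens, regardless of their cumulative mass.
        top = np.argsort(probs)[::-1][:k]
        top_probs = probs[top] / probs[top].sum()
        return rng.choice(top, p=top_probs)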

The top-p sampling technique is used in popular large language model applications such as ChatGPT and is implemented in language modeling frameworks such as Hugging Face Transformers and Cohere.[3]
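As a usage sketch, the Hugging Face Transformers library exposes top-p sampling through the top_p argument of its generate() method; the model name, prompt, and parameter values below are illustrative choices, not recommendations from the cited sources:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")   # illustrative model choice
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The curious case of", return_tensors="pt")
    # do_sample=True enables stochastic decoding; top_p restricts it to the nucleus.
    outputs = model.generate(**inputs, do_sample=True, top_p=0.9, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))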

  1. ^ Holtzman, Ari; Buys, Jan; Du, Li; Forbes, Maxwell; Choi, Yejin (22 April 2019). "The Curious Case of Neural Text Degeneration". arXiv:1904.09751. Retrieved 23 August 2023.
  2. ^ McCaffrey, James D. "Nucleus Sampling for Natural Language Processing". Retrieved 23 August 2023.
  3. ^ von Platen, Patrick. "How to generate text: using different decoding methods for language generation with Transformers". Hugging Face. Retrieved 23 August 2023.