1.58-bit large language model

From Wikipedia, the free encyclopedia

A 1.58-bit large language model (1.58-bit LLM) is a version of a large language model whose weights are restricted to only three values: −1, 0, and +1. Because a ternary weight carries log₂ 3 ≈ 1.58 bits of information, this gives the design its name. The restriction allows the model to replace costly multiplications with additions and to reduce the memory needed to store the weights. Since the end-task performance and perplexity of 1.58-bit LLMs are close to those of their "full-precision" (16-bit FP16 or BF16) counterparts, this design can reach the same artificial intelligence goals with much lower hardware requirements, latency, and training effort.[1]
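The multiplication-free computation can be illustrated with a minimal sketch (not taken from the cited paper; Python with NumPy is assumed here purely for illustration): because every weight is −1, 0, or +1, a matrix-vector product reduces to adding the activations at the +1 positions and subtracting those at the −1 positions.

    import numpy as np

    # Illustrative sketch only: a matrix-vector product where every weight is
    # in {-1, 0, +1}, so no weight-activation multiplications are needed,
    # only additions and subtractions.
    def ternary_matvec(W, x):
        """W: 2-D array with entries in {-1, 0, +1}; x: 1-D activation vector."""
        y = np.zeros(W.shape[0], dtype=x.dtype)
        for i in range(W.shape[0]):
            # +1 weights add the activation, -1 weights subtract it, 0 weights skip it.
            y[i] = x[W[i] == 1].sum() - x[W[i] == -1].sum()
        return y

    # Example: a 2x4 ternary weight matrix applied to a 4-dimensional input.
    W = np.array([[1, 0, -1, 1],
                  [0, -1, 1, 0]])
    x = np.array([0.5, -1.2, 2.0, 0.3])
    print(ternary_matvec(W, x))  # [-1.2  3.2]
    print(W @ x)                 # same result via an ordinary matrix product

The same result is obtained with a conventional floating-point matrix product, but the ternary form needs no multiplier hardware for the weights, which is the source of the efficiency gains described above.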

References

  1. ^ Ma et al. 2024, p. 1.

Sources

  • Ma, Shuming; Wang, Hongyu; Ma, Lingxiao; Wang, Lei; Wang, Wenhui; Huang, Shaohan; Dong, Li; Wang, Ruiping; Xue, Jilong; Wei, Furu (2024-02-27). "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". arXiv:2402.17764.