1.58-bit large language model
A 1.58-bit large language model (1.58-bit LLM) is a version of a transformer large language model whose weights are restricted to three values: -1, 0, and +1. This restriction, in principle, allows the model to replace costly multiplications with additions and reduces its memory footprint. Since the end-task performance and perplexity of 1.58-bit LLMs, at least at smaller model sizes (up to 3–4 billion parameters), are close to those of their "full-precision" (16-bit FP16 or BF16) counterparts, the design allows the same artificial intelligence goals to be reached with much lower hardware requirements, latency, and training effort.[1][2]
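As an illustration of how ternary weights remove multiplications, the following Python sketch computes a matrix-vector product where each weight contributes by addition, subtraction, or being skipped. This is a toy example, not the implementation from the cited papers; the function name `ternary_matvec` and the example values are hypothetical, and a real deployment would rely on specialized kernels or hardware rather than a Python loop.

```python
import numpy as np

def ternary_matvec(W, x):
    """Multiplication-free matrix-vector product for ternary weights.

    W is an integer matrix with entries in {-1, 0, +1}; x is a float
    activation vector. Because each weight is -1, 0, or +1, every
    product W[i, j] * x[j] reduces to a subtraction, nothing, or an
    addition, so no floating-point multiplications are needed.
    """
    out = np.zeros(W.shape[0], dtype=x.dtype)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            w = W[i, j]
            if w == 1:
                out[i] += x[j]   # +1 weight: add the activation
            elif w == -1:
                out[i] -= x[j]   # -1 weight: subtract the activation
            # 0 weight: contributes nothing and is skipped
    return out

# Toy example: a 2x3 ternary weight matrix applied to an activation vector.
W = np.array([[1, 0, -1],
              [-1, 1, 0]])
x = np.array([0.5, -2.0, 3.0])
print(ternary_matvec(W, x))  # [-2.5 -2.5]
```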
The name comes from the fact that a single trit, the ternary equivalent of a bit that can take the values {-1, 0, +1}, carries log₂ 3 ≈ 1.58 bits of information. 1.58-bit LLMs are also called 1-bit LLMs.[1][3]
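A minimal worked computation of that figure, using the standard information-theoretic identity (not taken from the cited papers):

```latex
% Information content of one trit: a symbol with three
% equally likely values carries log2(3) bits.
\log_2 3 = \frac{\ln 3}{\ln 2} \approx \frac{1.0986}{0.6931} \approx 1.585
```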
References
- ^ a b Ma et al. 2024, p. 1.
- ^ Friha et al. 2024, p. 5822.
- ^ Morales 2025.
Sources
- Ma, Shuming; Wang, Hongyu; Ma, Lingxiao; Wang, Lei; Wang, Wenhui; Huang, Shaohan; Dong, Li; Wang, Ruiping; Xue, Jilong; Wei, Furu (2024-02-27). "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". arXiv:2402.17764.
- Ma, Shuming; Wang, Hongyu; Huang, Shaohan; Zhang, Xingxing; Hu, Ying; Song, Ting; Xia, Yan; Wei, Furu (2025). "BitNet b1.58 2B4T Technical Report". arXiv:2504.12285. Retrieved 2025-04-22.
- Friha, Othmane; Amine Ferrag, Mohamed; Kantarci, Burak; Cakmak, Burak; Ozgun, Arda; Ghoualmi-Zine, Nassira (2024). "LLM-Based Edge Intelligence: A Comprehensive Survey on Architectures, Applications, Security and Trustworthiness". IEEE Open Journal of the Communications Society. 5: 5799–5856. doi:10.1109/OJCOMS.2024.3456549. ISSN 2644-125X.
- Morales, Jowi (2025-04-17). "Microsoft researchers build 1-bit AI LLM with 2B parameters". Tom's Hardware. Retrieved 2025-04-21.