Llama (language model)

LLaMA (Large Language Model Meta AI) is a large language model (LLM) released by Meta AI in February 2023. Models were trained at a range of sizes, from 7 billion to 65 billion parameters. LLaMA's developers reported that the 13-billion-parameter model outperformed the much larger GPT-3 (175 billion parameters) on most NLP benchmarks, and that the largest model was competitive with state-of-the-art models such as PaLM and Chinchilla.^[1] Whereas the most powerful LLMs had generally been accessible only through limited APIs (if at all), Meta released LLaMA's model weights to the research community under a noncommercial license.^[2] Within a week of LLaMA's release, its weights were leaked to the public via BitTorrent.^[3]

Release and leak

LLaMA was announced on February 24, 2023, via a blog post and a paper describing the model's training, architecture, and performance.^[1]^[2] The inference code used to run the model was publicly released under the open-source GPLv3 license.^[4] Access to the model's weights was managed through an application process, with access to be granted "on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world".^[2]

On March 3, 2023, a torrent containing LLaMA's weights was uploaded. A link to the torrent was shared on the 4chan imageboard and subsequently spread through online AI communities.^[3] Reactions to the leak varied. Some speculated that the model would be used for malicious purposes, such as more sophisticated spam. Others celebrated the model's accessibility, as well as the fact that smaller versions of the model can be run relatively cheaply, suggesting that this would encourage further research.^[3] Multiple commentators, such as Simon Willison, compared LLaMA to Stable Diffusion, a text-to-image model that, unlike the comparably sophisticated models that preceded it, was openly distributed, leading to a rapid proliferation of associated tools, techniques, and software.^[3]^[5]
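As a rough illustration of why the smaller versions are cheaper to run, the memory needed just to hold a model's weights can be approximated as the parameter count multiplied by the bytes used per parameter. The sketch below applies that arithmetic to the smallest and largest sizes mentioned above. The storage widths (16-bit and 4-bit) and all function and variable names are assumptions chosen for illustration; they do not come from the cited sources, and real memory use also includes activations and other overhead.

```python
# Back-of-the-envelope weight-memory estimate: parameters x bytes per parameter.
# Byte widths are illustrative assumptions, not figures from the cited sources.

GIB = 1024 ** 3  # bytes in one gibibyte


def weight_memory_gib(num_params: int, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


# Smallest and largest LLaMA sizes named in the article.
MODEL_SIZES = {"7B": 7_000_000_000, "65B": 65_000_000_000}

# Hypothetical storage formats: 16-bit floats and 4-bit quantized weights.
BYTES_PER_PARAM = {"16-bit": 2.0, "4-bit": 0.5}


def estimate_all() -> dict:
    """Return {(model, format): GiB} for every combination above."""
    return {
        (name, fmt): round(weight_memory_gib(n, width), 1)
        for name, n in MODEL_SIZES.items()
        for fmt, width in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for (name, fmt), gib in estimate_all().items():
        print(f"{name} at {fmt}: about {gib} GiB for weights")
```

Under these assumptions the 7-billion-parameter model needs roughly a tenth of the weight memory of the 65-billion-parameter model at the same precision, which is the sense in which the smaller models are cheaper to run.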

References

1. Touvron, Hugo; et al. (2023). "LLaMA: Open and Efficient Foundation Language Models". arXiv:2302.13971.
2. "Introducing LLaMA: A foundational, 65-billion-parameter large language model". Meta AI. 24 February 2023.
3. Vincent, James (8 March 2023). "Meta's powerful AI language model has leaked online — what happens now?". The Verge.
4. "llama". GitHub. Retrieved 16 March 2023.
5. Willison, Simon (11 March 2023). "Large language models are having their Stable Diffusion moment". Simon Willison's Weblog.
