Stable Diffusion

Type	text-to-image model, latent variable model, diffusion model, and deep learning model
Initial release	22 August 2022
Stable release	3.5 (23 October 2024)
License	Stability AI Community License; CreativeML Open RAIL-M
Named after	diffusion models
Technical details
Operating system	Any that supports CUDA kernels
Written in	Python
Team
Developer(s)	CompVis group LMU Munich; Runway; Stability AI
More information
Website	stability.ai… (in English)
Subreddit ID	stablediffusion

Stable Diffusion is a text-to-image deep learning model released in 2022. It is used primarily to generate detailed images conditioned on text descriptions, although it can also be applied to other tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt.^[3]

Stable Diffusion is a latent diffusion model, a kind of deep generative neural network developed by the CompVis group at LMU Munich.^[4] The model was released through a collaboration between Stability AI, CompVis LMU, and Runway, with support from EleutherAI and LAION.^[5]^[1]^[6] In October 2022, Stability AI raised US$101 million in a funding round led by Lightspeed Venture Partners and Coatue.^[7]
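
As a rough illustration of what "diffusion" means here, the following minimal sketch applies the standard forward noising step of a diffusion model to a tiny list of numbers standing in for a latent. It is not Stable Diffusion's code: the function names, the four-value latent, and the linear schedule values are assumptions chosen for illustration, and the learned components (the image encoder and the denoising network) are omitted entirely.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances, one per diffusion step (illustrative values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_t): the share of signal variance left after t steps."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(latent, t, alpha_bars, rng):
    """Sample z_t = sqrt(a_bar_t) * z_0 + sqrt(1 - a_bar_t) * eps, with eps ~ N(0, 1)."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal_scale * z + noise_scale * rng.gauss(0.0, 1.0) for z in latent]


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

# Hypothetical 4-value "latent"; a real model works on much larger tensors.
z0 = [1.0, -0.5, 0.25, 0.0]
z_early = add_noise(z0, 10, alpha_bars, rng)   # still close to z0
z_late = add_noise(z0, 999, alpha_bars, rng)   # almost pure Gaussian noise

print("signal kept at step 10: ", alpha_bars[10])
print("signal kept at step 999:", alpha_bars[999])
```

A diffusion model is trained to reverse this process step by step. In a latent diffusion model, the noising and denoising operate on a compressed representation of the image rather than on its pixels.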

Stable Diffusion's code and model weights are public, and the model can run on most consumer hardware equipped with a modest GPU. This marked a departure from earlier proprietary text-to-image models such as DALL-E and Midjourney, which were accessible only through cloud services.^[8]

References

[1] "Stable Diffusion Repository on GitHub". CompVis - Machine Vision and Learning Research Group, LMU Munich, 17 September 2022. Retrieved 17 September 2022.
[2] RunwayML. "stable-diffusion-v1-5". Hugging Face.
[3] "Diffuse The Rest - a Hugging Face Space by huggingface". huggingface.co. Archived from the original on 5 September 2022. Retrieved 5 September 2022.
[4] Rombach, Robin; Blattmann, Andreas; Lorenz, Dominik; Esser, Patrick; Ommer, Björn. "High-Resolution Image Synthesis with Latent Diffusion Models". arXiv:2112.10752 [cs], 13 April 2022.
[5] "Stable Diffusion Launch Announcement". Stability.Ai. Archived from the original on 5 September 2022. Retrieved 6 September 2022.
[6] "Revolutionizing image generation by AI: Turning text into images". LMU Munich. Retrieved 17 September 2022.
[7] Wiggers, Kyle. "Stability AI, the startup behind Stable Diffusion, raises $101M" (in English). TechCrunch. Retrieved 17 October 2022.
[8] "The new killer app: Creating AI art will absolutely crush your PC". PCWorld. Archived from the original on 31 August 2022. Retrieved 31 August 2022.

External links

Stable Diffusion demo
