Machine unlearning

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by DancingPhilosopher at 10:25, 10 December 2024. It may differ significantly from the current revision.

Machine unlearning is a branch of machine learning focused on removing specific undesired elements from trained models, such as private data, outdated information, copyrighted material, harmful content, dangerous abilities, or misinformation, without needing to rebuild the models from the ground up.

History

Early research efforts were largely motivated by Article 17 of the GDPR, the European Union's privacy regulation. The article codifies the "right to be forgotten" (RTBF), which was introduced in 2014. RTBF was not designed with machine learning in mind. In 2014, policymakers could not foresee how tightly deep learning would entangle data and computation, which makes erasing data from a trained model challenging. This challenge later spurred research into "data deletion" and "machine unlearning".

Following the deployment of large language models, unlearning is driven by more than just user privacy. The focus has shifted from small networks trained on face images to large models whose training data also included harmful content that needs to be "erased" or forgotten.
