The SWiP project uses language, data and knowledge technologies to promote language equality among all of South Africa's official languages. The hegemonic status of English (and, to a lesser extent, Afrikaans) has resulted in English being the language of learning and teaching,[1] which downplays an African epistemology;[2] as a result, local African languages are commonly under-resourced.[3] The acronym SWiP stands for the three main partners in a national collaboration: SADiLaR, the free encyclopedia Wikipedia, and PanSALB. They work alongside local speech and language communities within academia to address language equality using digital technologies, especially Wikipedia.[4]
Wikipedia is a common source of language data for natural language processing (NLP).[7] Low-resource languages have limited corpora (speech data, annotated text and other forms of linguistic data) for large language models (LLMs) to draw on. The SWiP project has introduced several alternative ways of collecting and compiling corpora of suitable text for low-resource languages, and has rolled these out on a national scale. These corpora can be used to create corpus-based dictionaries or to support semi-automatic translation.[8]
This collaborative project is also intended to promote, preserve, and digitise South Africa's indigenous languages and cultural knowledge by enhancing their presence on digital platforms such as Wikipedia.[9] By partnering with cultural and linguistic organisations, the project was designed to close the digital gap and ensure that local languages and cultural narratives are preserved and shared online.[6]
Phase 1 of the SWiP Project was launched on 20 September 2023 at UNISA, with His Royal Majesty Enock Makhosoke II Mabhena, the King of amaNdebele, in attendance.[5] The launch was followed by a number of events, listed below, and Phase 1 was successfully completed. Phase 2 of the project began in November 2024 and continues through 2025 at venues such as Nelson Mandela University, the University of Mpumalanga and the University of Limpopo.[10]
An early success of the project was the integration of isiNdebele into Wikipedia. Initially represented by only 11 articles in the Wikipedia Incubator, the language grew rapidly to over 140 articles within a year (currently at 164),[11] marking its transition to Wikipedia's main platform.[6]
The project has conducted extensive training sessions, engaging over 300 participants from various South African universities. Trainers introduced academics to Wikipedia, taught them article authorship skills (adding content, citations, and photographs), and had them practise translation using the Wikipedia translation tool.
These sessions led to the creation of hundreds of new articles, thousands of edits, and significant contributions of written content, references, and multimedia. The initiatives have fostered digital literacy and community engagement while significantly enhancing Wikipedia's indigenous language content.[9][12]
Expanded Digital Content – Hundreds of new Wikipedia articles have been created in indigenous languages.
Preserved Cultural Narratives – The project has ensured that cultural stories, languages, and traditions are accessible to global audiences.
Empowered Communities – Through training sessions and collaborative workshops, over 300 participants have become active digital content creators.
Enhanced Visibility – The newly created content has collectively amassed millions of views, signifying a broad digital reach.
SWiP Project in progress at the University of Limpopo