Bioinformatics Open Source Conference
Bioinformatics Open Source Conference | |
---|---|
Frequency | Annually |
Location(s) | Madison, United States (2022) |
Years active | 24 |
Previous event | BOSC 2022 |
Next event | BOSC 2023 |
Attendance | ~100[1] |
Organised by | Nomi L. Harris, Karsten Hokamp (2021 chairs)[2] |
Member | Open Bioinformatics Foundation |
Website | www |
The Bioinformatics Open Source Conference (BOSC) is an academic conference on open-source programming and other open science practices in bioinformatics, organised by the Open Bioinformatics Foundation. The conference has been held annually since 2000 and is run as a two-day meeting, either within the Intelligent Systems for Molecular Biology (ISMB) conference or as a joint conference with the Galaxy community.
Program
The conference is held as a single track consisting of presentations, poster sessions and two keynote talks by people of influence in open-source bioinformatics.[1]
Since 2010, an informal two-day "CollaborationFest" (formerly Codefest) has been held directly preceding the conference.[3][4]
History
National Institutes of Health Associate Director for Data Science Philip Bourne and C. Titus Brown gave keynote talks at BOSC 2014.[5]
BOSC 2016 was held in Orlando, Florida, on July 8–9, before the main ISMB conference.[6]
In 2018 and 2020, BOSC partnered with Galaxy to organize two joint conferences called GCCBOSC and Bioinformatics Community Conference (BCC) respectively.[7] The event in 2018 was held in Portland, Oregon.[8] The BCC in 2020 took place online with two time schedules for eastern/western time zones.[9]
Since 2021, BOSC has been taking place within the ISMB conferences again. In 2023, BOSC took place in Lyon, France, between July 24 and 28 as part of the ISMB/ECCB conference.
Conference Highlights
BOSC 2024
BOSC 2024 was part of the 2024 Intelligent Systems for Molecular Biology (ISMB) conference, which took place in Montreal, Canada. The 2024 event also marked the 25th anniversary of BOSC.
The conference was held in a hybrid setting, with around 200 people attending in person and many others viewing the presentations online.
The conference covered a wide variety of topics, with the main theme focusing on approaches to using artificial intelligence (AI) and machine learning (ML) in bioinformatics.
Event Highlights
The conference featured two keynote speakers.
One of them, Dr. Mélanie Courtot, gave a presentation titled "The Data Shows We Need Better Data" on day one of the conference. In her talk, she discussed some of the resources available for obtaining high-quality free data and open-source software for conducting research. She also introduced the TRUE principles for preparing data for AI tools. TRUE is an acronym standing for Tracked, Reasonable, Understandable, and Ethical.
Dr. Courtot explained that tracked data for AI means that it should be known how the data was obtained, there should be evidence to support the claims of the data, and the authors who released the data should be properly credited. The final part of this principle is that the data should be computationally manageable.
The Reasonable component of the principle states that the data should be organized in a logical way so that new inferences and conclusions can be made from it.
The Understandable part dictates that the data should be processable by open-source AI models. The models she mentioned in her presentation included LLaMA and Mistral.
Finally, the Ethical principle emphasized that available data should promote diversity, equity, and inclusion, while maintaining the privacy of those the data may be linked to.
The second keynote speaker, Andrew Su, presented on day two with a talk titled "Open Data, Knowledge Graphs, and Large Language Models". He discussed how, despite the usefulness of large language models (LLMs) for retrieving data or answering specific questions, they are not always accurate and the responses they generate still need to be verified.
A solution he presented was Retrieval-Augmented Generation (RAG). He explained this as a way to improve the accuracy of answers provided by LLMs by keeping the information they query well-organized.
His presentation also covered tools that can be used to test the accuracy and rate the quality of answers obtained from LLMs.
Timeline
Day 1
In addition to the keynotes, a total of 36 talks and 23 posters were selected for presentation at the conference. One of the sessions on day one was Data Analysis. Its presentations covered open-source approaches to analyzing biomedical data, different types of freely available data, and some of the research that has been done using these open-source tools and data. Presentations in this session included:
- "Gemma: Curation, Re-analysis and Dissemination of 18,000 Gene Expression Studies" by Paul Pavlidis
- "ROC Picker: Propagating Statistical and Systematic Uncertainties in Biological Analysis" by Jeffery Roskes
- "Antimicrobial Resistance Prediction of Non-Tuberculosis Mycobacteria from Whole Genome Sequence Data" by Idowu Olawoye
The next session of day one was the Open Data Session, which included presentations about some of the databases, data portals, and platforms that are being used by researchers around the world. Some of the presentations in this session were:
- "Creating an Open-source Data Platform" by Mitchell Shiell
- "Going Viral: The Development of the VirusSeq Data Portal" by Justin Richardsson
- "intermine.bio2rdf.org: A QLever SPARQL Endpoint for InterMine Databases" by Francois Belleau
The next session was Visualization, which included presentations about new additions to older databases. Presentations in this session included:
- "Connecting Integrated Genome Browser to a Huge Genome Database Using Its Own API Solves One Problem and Creates Another" by Ann Loraine
- "Collaborating Our Way to Optimal Integration Between Tripal 4 and JBrowse 2" by Carolyn T. Caron
- "An Integrated Environment for Browsing 3-D Protein Structures and Multiple Sequence Alignment in JBrowse 2" by Colin Diesh
The last session for day one was Developer Tools and Libraries, displaying some of the open-source tools used for analyzing data. Some of the presentations in this session included:
- "Codefair: Make Biomedical Research FAIR Without Breaking a Sweat" by Bhavesh Patel
- "An Open-source Ecosystem for Scalable and Computationally Efficient Nanopore Data Processing" by Avishai Weissberg
- "Tattaki: Enhancing the Robustness of Bioinformatics Workflows with Simple, Tolerant File Format Detection" by Masaki Fuki
Day 2
The first session of day two was “Standards and Frameworks for Open Science”. This session focused on how to create consistent, reusable, and long-lasting software. Presentations in this session included:
- "Enhancing Reproducibility in Immunogenetics: Leveraging Containerization Technology for Bioinformatics Workflows" by Rayo Suseno
- "Breaking the silo: composable bioinformatics through cross-disciplinary open standards" by Nezar Abdennur
- "For long-term sustainable software in bioinformatics: a manifesto" by Luis Pedro Coelho
The next session was called “Open Approaches to AI/ML”, which was about how to use machine learning to solve biological problems. Presentations in this session included:
- "Gene Set Summarization Using Large Language Models" by Marcin Joachimiak
- "FAIR, modular and reproducible image-based ML workflows for biologists: a template and case study from imageomics" by Hilmar Lapp
- "Trust and Transparency in Reporting Machine Learning: The DOME-GigaScience Press Trial" by Chris Armit
Open Panel Discussion
The events of day two concluded with an open panel discussion titled “Open Source AI/ML: A Game Changer for Bioinformatics?”. The researchers on the panel were Lawrence Hunter, Thomas Hervé Mboa Nkoudou, Mélanie Courtot, and Andrew Su. The moderator of the panel was Monica Munoz-Torres. The discussion revolved around the potential gains and pitfalls of using AI and ML methods to conduct bioinformatics research.
Once each of the panelists had explained their positions, the discussion was opened to the audience. After a long discussion, the stances of the panelists were split: half thought the use of AI and ML in bioinformatics had been an important improvement for the field, while the other half were still wary of its potential harms.[10][11][12][13]
BOSC 2023
The 2023 Bioinformatics Open Source Conference (BOSC 2023) was held on July 24–25, 2023, drawing over 2,100 in-person attendees and approximately 900 online viewers. About 200 participants actively engaged in the event's sessions and activities.[14]
Keynote Presentations
The keynote speakers were Sara El-Gebali and Joseph M. Yracheta.
- El-Gebali presented “A New Odyssey: Pioneering the Future of Scientific Progress Through Open Collaboration”. Her talk explored navigating the realm of science through diverse alliances and institutions, with a focus on promoting open science through collaboration.
- Yracheta gave a talk titled “The Dissonance between Scientific Altruism & Capitalist Extraction: The Zero Trust and Federated Data Sovereignty Solution”, offering insights from the American Indian perspective in the United States. He critiqued the lack of clarity and transparency in current Open Data policies, arguing they tend to prioritize funding and researcher data rights over individual privacy.
Open and Ethical Data Sharing Panel

In addition to the keynotes, BOSC 2023 hosted a panel on Open and Ethical Data Sharing, featuring keynote speakers El-Gebali and Yracheta along with Verena Ras and Bastian Greshake Tzovaras. The panel addressed the absence of a formal ethical code for bioinformaticians and emphasized the need for stronger advocacy in ethical data sharing practices.
Topical Sessions and Posters

BOSC also featured a topical session comprising 53 talks, with 49 presenters displaying posters. Topics included, but were not limited to:
- Open Science and Reproducible Research
- Open Biomedical Data
- Citizen/Participatory Science
- Standards and Interoperability
- Data Science, Workflows, Data Access and Visualization
- Open Approaches to Translational Bioinformatics
- Developer Tools and Libraries
- Inclusion, Outreach and Training
BOSC 2022
BOSC 2022 marked the first hybrid Bioinformatics Open Source Conference, offering both virtual and in-person attendance in Madison, Wisconsin. Approximately 1,000 participants attended in person, with an additional 800 joining virtually. The conference featured a panel discussion titled 'Building and Sustaining Inclusive Open Science Communities,' along with 28 talks and 46 posters covering various topics in bioinformatics. BOSC 2022 also included joint keynotes with the Education and Bio-Ontologies Communities of Special Interest (COSIs). Jason Williams presented 'Riding the Bicycle: Including All Scientists on a Path to Excellence,' and Melissa Haendel delivered 'The Open Data Highway: Turbo-Boosting Translational Traffic with Ontologies.' [16]

Past conferences
As of January 2024, 24 BOSC conferences have been held around the world. Of those, 20 were purely in-person conferences, 2 were purely remote due to the COVID-19 pandemic, and one was organized as a hybrid meeting.[17]
Year | Conference partner | Location | Keynote speakers |
---|---|---|---|
2023 | ISMB | Lyon, France | Joseph M. Yracheta, Sara El-Gebali |
2022 | ISMB | Hybrid: Madison, WI and online | Jason Williams, Melissa Haendel |
2021 | ISMB | Online (would have been Lyon) | Christie Bahlai, Lara Mangravite, Thomas Hervé Mboa Nkoudou |
2020 | GCC | Online (would have been Toronto) | Lincoln Stein, Abigail Cabunoc Mayes |
2019 | ISMB | Basel, Switzerland | Nicola Mulder |
2018 | GCC | Portland, OR | Fernando Pérez, Tracy Teal |
2017 | ISMB | Prague, Czech Republic | Mad Price Ball, Nick Loman |
2016 | ISMB | Orlando, FL | Jennifer Gardy, Steven Salzberg |
2015 | ISMB | Dublin, Ireland | Ewan Birney, Holly Bik |
2014 | ISMB | Boston, MA | Philip Bourne, Titus Brown |
2013 | ISMB | Berlin, Germany | Sean Eddy, Cameron Neylon |
2012 | ISMB | Long Beach, CA | Jonathan Eisen, Carole Goble |
2011 | ISMB | Vienna, Austria | Lawrence Hunter, Matt Wood |
2010 | ISMB | Boston, MA | Guy Coates, Ross Gardler |
2009 | ISMB | Stockholm, Sweden | Robert Hanmer, Alan Ruttenberg |
2008 | ISMB | Toronto, Canada | Julian Lombardi |
2007 | ISMB | Vienna, Austria | Carole Goble |
2006 | ISMB | Fortaleza, Brazil | Amos Bairoch, Alberto M.R. Davila |
2005 | ISMB | Detroit, MI | Hilmar Lapp |
2004 | ISMB | Glasgow, Scotland | Wolfgang Huber |
2003 | ISMB | Brisbane, Australia | - |
2002 | ISMB | Edmonton, Canada | Ewan Birney, Michael Eisen, Winston Hide |
2001 | ISMB | Copenhagen, Denmark | Steven Brenner |
2000 | ISMB | San Diego, CA | Tim O'Reilly, Lincoln Stein |
References
- ^ a b Harris, N. L.; Cock, P.; Chapman, B.; Goecks, J.; Hotz, H.-R.; Lapp, H. (July 14, 2014). "The Bioinformatics Open Source Conference (BOSC) 2013". Bioinformatics. 31 (2): 299–300. doi:10.1093/bioinformatics/btu413. PMC 4287938. PMID 25024288.
- ^ "BOSC 2021 – Open Bioinformatics Foundation". Retrieved November 22, 2022.
- ^ "Codefest - Open Bioinformatics Foundation". www.open-bio.org. Retrieved July 20, 2014.
- ^ Möller, Steffen; Afgan, Enis; Banck, Michael; Cock, Peter J. A.; Kalas, Matus; Kajan, Laszlo; Prins, Pjotr; Quinn, Jacqueline; Sallou, Olivier; Strozzi, Francesco; Seemann, Torsten; Tille, Andreas; Valls Guimera, Roman; Katayama, Toshiaki; Chapman, Brad (October 14, 2013). "Sprints, Hackathons and Codefests as community gluons in computational biology". EMBnet.journal. 19 (B): 40. doi:10.14806/ej.19.B.726.
- ^ "BOSC 2014 Schedule - Open Bioinformatics Foundations". www.open-bio.org. Retrieved July 20, 2014.
- ^ "BOSC 2016 – Open Bioinformatics Foundation". Open Bio.
- ^ "About BOSC - Open Bioinformatics Foundation". Retrieved November 22, 2022.
- ^ "GCCBOSC 2018 - Open Bioinformatics Foundation". Retrieved November 22, 2022.
- ^ "Bioinformatics Community Conference". Retrieved November 22, 2022.
- ^ "BOSC 2024". Open Bioinformatics Foundation. Retrieved April 21, 2025.
- ^ Courtot, Mélanie (July 15, 2024). "BOSC keynote". Courtot Lab Genome Informatics. Retrieved April 21, 2025.
- ^ "BOSC 2024 Schedule". Open Bioinformatics Foundation. Retrieved April 21, 2025.
- ^ Harris, Nomi L.; Hokamp, Karsten; Maia, Jessica; Ménager, Hervé; Munoz-Torres, Monica C.; Sawant, Swapnil; Unni, Deepak; Williams, Jason (September 27, 2024). "25 Years of BOSC, the Bioinformatics Open Source Conference [version 1; peer review: not peer reviewed]". F1000Research. 13: 1100. doi:10.12688/f1000research.156426.1. Retrieved April 21, 2025.
- ^ Harris, N. L.; Fields, C. J.; Hokamp, K.; Just, J.; Khetani, R.; Maia, J.; Ménager, H.; Munoz-Torres, M. C.; Unni, D.; Williams, J. (2023). "BOSC 2023, the 24th annual Bioinformatics Open Source Conference". F1000Research. 12: 1568. doi:10.12688/f1000research.143015.1. PMC 10704065. PMID 38076297.
- ^ "BOSC 2023: Bioinformatics Open Source Conference". BOSC 2023. Open Bioinformatics Foundation. Retrieved April 21, 2025.
- ^ Harris, Nomi (2022). "BOSC 2022: the first hybrid and 23rd annual Bioinformatics Open Source Conference". F1000Research. 11. U.S. National Library of Medicine: 1034. doi:10.12688/f1000research.125043.1. PMC 9468630. PMID 36128559.
- ^ "OBF » About BOSC » About BOSC". Retrieved November 23, 2022.