User:Clements.UWLib/sandbox

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Clements.UWLib (talk | contribs) at 16:53, 17 October 2023. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Wikibase and the German National Library (Jens Ohlig and Elena Aleynikova, Wikimedia Deutschland)

  • Slides
  • Jens Ohlig from Wikimedia Deutschland works in software development, on Wikidata and Wikibase (the software underlying Wikidata). Will talk about the project with the GND.
  • But first an introduction to Wikibase!
    • Software that runs Wikidata, developed for Wikidata but available free & open source.
    • Stores structured data.
    • Features: data model supports multilingual usage (support for over 300 languages); like Wikidata, properties can have multiple contradictory values (cited to different sources); exports in a number of formats, SPARQL query functionality built in.
    • Machine friendly, building blocks for the semantic web
  • Would like more Wikibase installations outside the Wikidata project, because not everything is in scope for Wikidata. Would like to see domain-specific repositories.
    • Examples
      • A library of the world’s greatest jams and marmalades; supports their specific previously developed data model and ontologies.
      • FactGrid: another example, a database for historians, documents around the Order of the Illuminati.
        • Queries can reveal more about the data than historians already know
        • Can see who wrote to whom
      • Rhizome: born-digital art. Have been using Wikibase since 2015 to support digital preservation. Better suited than other software that was not as fit for their particular purpose. Flexibility of the system was an advantage as they developed their data models.
      • LinguaLibre: collection of audio snippets of spoken language.
  • GND project: integrated authority file for the German-speaking world, maintained by the German National Library. Focus on persons, places and events. Long history of collaboration with Wikimedians. Interest in opening the GND. GND4C is a project to open the GND to cultural institutions (may or may not involve Wikibase). Putting all their data into Wikibase, or Wikibases. Seeing Wikibase as something that can support authorities, a natural transition for them. German National Library offers courses to Wikimedians, who can then contribute to parts of the GND.
  • Current project: migrate the GND to Wikibase. Several workshops with engineers.
  • Current state: three Wikibase installations; more open than previously but invite only. Characterized as “semi-open”
  • Blog post on the project, explains starting point and current state.
  • Also a page on the project.
  • When can I see something? Evaluation will be done by the end of the summer. Wikibase is suitable, which is good news. Will need to do homework with user rights and roles (not a concept we have in the Wikidata project). Will be presented at WikidataCon in the fall. So stay tuned!
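
The data model features from the introduction (multilingual labels, and properties that can hold multiple contradictory values cited to different sources) can be illustrated with a minimal sketch. The class names `Item` and `Statement`, the IDs `Q1` and `P1`, and the sample values are all hypothetical, chosen for illustration; this is not Wikibase's actual API or storage format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Statement:
    """One claimed value for a property, with the source it is cited to."""
    property_id: str
    value: str
    source: str


@dataclass
class Item:
    """An entity with per-language labels and a list of statements."""
    item_id: str
    labels: Dict[str, str] = field(default_factory=dict)
    statements: List[Statement] = field(default_factory=list)

    def add_statement(self, property_id: str, value: str, source: str) -> None:
        # Append rather than overwrite, so conflicting values can coexist.
        self.statements.append(Statement(property_id, value, source))

    def values_for(self, property_id: str) -> List[Tuple[str, str]]:
        """Return every (value, source) pair recorded for a property."""
        return [(s.value, s.source) for s in self.statements
                if s.property_id == property_id]


# Hypothetical item: labels in two languages, two conflicting values for P1.
person = Item("Q1", labels={"en": "Example Person", "de": "Beispielperson"})
person.add_statement("P1", "1750", "Source A")
person.add_statement("P1", "1752", "Source B")
print(person.values_for("P1"))
```

Keeping statements in a list, rather than a one-value-per-property mapping, is what lets both sourced values survive side by side.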

Questions

  • Diego from METRO in NYC. Have been testing Wikibase. Where can I find a public road map for Wikibase development? Amazon Neptune acquired parts of Wikibase?
    • Jens: Using Blazegraph; there are some complications with that, and it may not be developed in the future. Product Manager -- this is something she worries about. The road map is in the open; he can share the information offline. GraphQL [can someone help fill in?]
    • Link: Wikidata development plan
  • Scott MacL: Can you describe the parameters of Wikibase projects (Olaf Simons, Rhizome, etc.) that might then be codified, please? And are all people in Germany in the Wikibase database for persons, events, etc.? And could this be extended to people in each of all ~200 countries potentially?
    • Jens: Wikidata is the largest installation of Wikibase at the moment; nothing else comes close. Would be surprised if anyone runs into limits of Wikibase. If there are limits, please get in touch with Jens. No, the GND doesn’t cover all persons in Germany -- it is for the use of libraries: people who have published, people who have been published about.
  • Karen Smith-Yoshimura: why three installations?
    • Jens: Currently evaluating different strategies. Will only have one in the end.
  • Chris: ARDC in Australia runs a research vocabularies service which uses PoolParty underneath it all. https://vocabs.ands.org.au/ Is there an institution that offers something similar for Wikibase?
    • Jens: You can use Wikibase for controlled vocabularies: examples from Japan, material research vocabularies. Seems like Wikibase can support this usage.
  • Steve: Are there directions somewhere for getting QuickStatements to work with the Docker image?
    • Jens: Engineers have told me these problems are now resolved so try updating. QuickStatements = “quite a beast!”
  • Harvard: Who's the right person or persons to ask if you've installed the Docker image, and everything works nicely, but then your queries stop returning results?
    • Jens: elaborate in an email please and he will connect you. (Will do!)
  • UMn: Does each installation of Wikibase offer a common set of data relationships to which the installation can make additions? What enables federation across Wikibase installations?
    • Jens: loves this question! A new Wikibase is naked, you need to define everything from scratch. What we are looking into right now is federation (which means different things to different people). Looking at the idea of reusing Wikidata items in a local Wikibase installation (so the concept of “human” for example) but then model other parts yourself. Currently in a research phase -- no code has been produced.
  • Dan Michael O. Heggø: Did you yet look into integrating with library systems? Specifically updating linked bibliographic records when concepts are modified or merged in Wikibase?
    • Jens: no, we have not. We probably lack knowledge. Libraries are an interesting field, but we want to look at other fields as well, civic data for example. Libraries close to our hearts, so collaborating and getting insights from the library community is important. No current plans for integrating but maybe we can work together to enable you to do your own integrations.
  • UMn: Could you say more about scope limitations in Wikidata and how Wikibase supports a defined scope?
    • Jens: all comes down to notability, but pretty open. Something like “Arrowhead number three” in your local museum may not be notable enough, for example, and you may need to have your own Wikibase instance.
  • Hillary: what if you have questions about notability, where should those be directed?
    • Jens: it depends. Project chat is a good place to start. Community is still young and can be formed. I can’t give an answer for what the community wants, but they are quite open.
      • Hillary: will be focussing on different communication channels in Wikidata / Wikibase.
  • John Riemer: The Program for Cooperative Cataloging is weighing how best to conduct a Wikidata pilot project. One key question revolves around the advantages/disadvantages of working in a separate instance of Wikibase versus the “production” version of Wikidata. From listening to the several examples provided in your presentation, it seems like specialized controlled vocabularies could be used within the production instance. It also seems like a particular project like Rhizome could be signaled by a data element in the production instance. Do you have comments on the disadvantages of trying to run those projects in the main instance of Wikidata?
    • Martin P: One answer is that Wikidata is not for original research: everything in Wikidata should already have been published somewhere. If you want to surface data from an original research process or a closed community, set up your own Wikibase.
    • Jens: data model needs to go through community discussion, and you may not want that for all the properties you need. Depends on the needs of your project.
  • Hilary: Will you be making changes to the user interface for GND?
    • Jens: no, they are happy with it as is. Changes to the interface with new features -- we have a UX team that thinks about this on an ongoing basis.
  • Hilary: will you be creating customized templates for validation?
    • Jens: we will have to see. Not on the roadmap but it is a FAQ!
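
On the federation idea Jens described above (reusing Wikidata items such as “human” in a local Wikibase while modeling everything else locally), a rough sketch of what such a reference might look like follows. Jens noted that no code had been produced, so this is purely illustrative: `EntityRef`, the `"local"`/`"wikidata"` labels, the base URI `https://wikibase.example.org/entity/`, and the local IDs `Q42` and `P1` are all assumptions. Only the Wikidata entity URI prefix and `Q5` (Wikidata's item for “human”) are real.

```python
from dataclasses import dataclass

# Real Wikidata entity URI prefix; Q5 is Wikidata's item for "human".
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


@dataclass(frozen=True)
class EntityRef:
    """A reference to an item, either local or hosted on Wikidata."""
    repository: str  # "local" or "wikidata" (hypothetical labels)
    entity_id: str

    def uri(self, local_base: str) -> str:
        # Remote concepts keep their Wikidata URI; local ones use the
        # base URI of the local installation.
        base = WIKIDATA_ENTITY if self.repository == "wikidata" else local_base
        return base + self.entity_id


# Hypothetical local installation, local item, and local property ID.
LOCAL_BASE = "https://wikibase.example.org/entity/"
human = EntityRef("wikidata", "Q5")
author = EntityRef("local", "Q42")

# A local statement whose value is the shared Wikidata concept.
statement = (author, "P1", human)
print(statement[0].uri(LOCAL_BASE), "->", statement[2].uri(LOCAL_BASE))
```

Tagging each reference with its repository keeps shared concepts pointing at Wikidata while locally modeled items stay under the local installation's own namespace.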

Thanks to Jens for a great presentation!