Wikidata:Project chat
Shortcuts: WD:PC, WD:CHAT, WD:?
Wikidata project chat. A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2025/05.
deidetected.com, a self-published source potentially used for harassment
This website, launched and run by the creator of the "Sweet Baby Inc detected" Steam curator, would fall under Wikipedia's definition of a self-published source. The Steam curator has been linked to the harassment campaign against Sweet Baby Inc. by reputable sources like PC Gamer, The Verge, and multiple others.
Wikidata has an item for the website, and the website has been linked via the described at URL property by User:Kirilloparma on more than one, if not every, occasion. Even within that scope, it is done in a very targeted way: the website seems to be added to Wikidata items only when the game is recommended against at deidetected.com (e.g. The First Descendant, Abathor, and Valfaris: Mecha Therion, recommended as "DEI FREE" by deidetected, do not have the property set). Based on that, its goal of harassment or POV pushing appears evident.
Does Wikidata have any guidelines that would explicitly allow or disallow this behavior or the coverage of deidetected.com at all? Daisy Blue (talk) 09:45, 14 September 2024 (UTC)
- There is no policy on WD for blacklisting websites other than for malicious cases such as spam or malware. Trade (talk) 11:59, 14 September 2024 (UTC)
- Now from having read the property description for described at URL on its talk page, which explains that it's for "reliable external resources", I'm convinced the website has no place on Wikidata, as it's not a reliable source (at least not per the guidelines of Wikipedia (WP:RSSELF)). What is the best place to initiate its removal without having to start a potential edit war? A bot would also do a more efficient job at removing it from all the pages. Daisy Blue (talk) 12:03, 14 September 2024 (UTC)
- You might have more luck if you stopped bringing up Wikipedia guidelines and used the Wikidata ones instead Trade (talk) 00:09, 15 September 2024 (UTC)
- Wikidata itself cites the Wikipedia guidelines on self-published sources (and on original research). Daisy Blue (talk) 05:04, 15 September 2024 (UTC)
- English Wikipedia policy is in many cases useful for deciding what should be done in Wikidata (e.g. which sources are reliable), but it should never be considered normative and has no more authority than the policies of any other project. GZWDer (talk) 06:37, 15 September 2024 (UTC)
This could be used to mass-undo 18 of the edits that introduced the links, but it's not progressing for me when I try. Daisy Blue (talk) 11:14, 15 September 2024 (UTC)
Seems like a low-quality, private website that doesn't add anything of value to our items. There are countless websites out there, but we generally don't add every single site via described at URL (P973) just for existing. IIRC, there were various cases in the past where users added unreliable websites to lots of items, which were then considered spam and deleted accordingly. And if the site's primary purpose is indeed purely malicious and causing harassment, there's really no point in keeping it. Best to simply put it on the spam blacklist and keep the whole culture war nonsense out of serious projects like Wikidata. Additionally, DEIDetected (Q126365310) currently has zero sources, indicating a clear lack of notability. --2A02:810B:5C0:1F84:45A2:7410:158A:615B 13:50, 15 September 2024 (UTC)
- I've already nominated that item and Sweet Baby Inc detected for deletion, citing the same reason. For the curator specifically, one could stretch point 2 of Wikidata:Notability to argue against deletion, but I'm not sure what value it would bring to the project apart from enabling harassment and being used to justify other related additions. Daisy Blue (talk) 16:06, 15 September 2024 (UTC)
- Just add this website to the spam blacklist, no one will be able to add links to this website on Wikimedia projects anymore. Midleading (talk) 17:18, 16 September 2024 (UTC)
- What's the proper venue for proposing that? Also, seeing how you have a bot, could you suggest a quick way to mass remove the remaining instances from Wikidata? I've already undone a number by hand but it's not the greatest experience. Having the knowledge may also help in the future. Daisy Blue (talk) 18:24, 16 September 2024 (UTC)
- On the home page of Meta-Wiki, click Spam blacklist, and follow instructions there.
- To clean up links to this website, I recommend External links search. A WDQS search is likely to time out. I also recommend reviewing each case manually, sometimes the item should be nominated for deletion, but tools can't do that. Midleading (talk) 01:27, 17 September 2024 (UTC)
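A minimal SPARQL sketch of that cleanup search (assuming the standard WDQS prefixes; as noted above, scanning every described at URL (P973) value may time out, so the External links search tool remains the safer option):

SELECT ?item ?url WHERE {
  ?item wdt:P973 ?url .
  FILTER(CONTAINS(STR(?url), "deidetected.com"))
}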
- Thanks. I'll remove the rest by hand then. As for the Wikimedia spam blacklist, it says that "Spam that only affects a single project should go to that project's local blacklist". I'm not sure if there have been any attempts to cite deidetected on Wikipedia or elsewhere. We can search for the live references (there are none) but not through the potential reverted edits, I don't think. Daisy Blue (talk) 07:33, 17 September 2024 (UTC)
- Well, you may request this website be banned on Wikipedia first, then you may find some users who agree with you. Midleading (talk) 08:45, 18 September 2024 (UTC)
- I believe Wikipedia has the same policy in that if it hasn't been abused (and I wouldn't know if it has been specifically on Wikipedia), then there is no reason to block it. On Wikidata, as it stands now, the additions come from one user, Kirilloparma, who pushed back on my removals here but hasn't reverted. Unless it becomes a sustained effort by multiple users, it will come down to whether Kirilloparma concedes that described at URL is for reliable sources and the website is not a reliable source. Daisy Blue (talk) 12:14, 18 September 2024 (UTC)
- For some reason Kirilloparma keeps making points on the subject on the Requests for deletions page rather than here (despite having been informed), now arguing that the short property description takes precedence over the property documentation on the talk page, which is dismissed as "outdated". Daisy Blue (talk) 09:29, 20 September 2024 (UTC)
- Wikidata has items for many websites even if those websites are worthy of criticism. Knowing that "Sweet Baby Inc detected" is linked to "DeiDetected" is useful information even if both of those sources would be completely unreliable.
- I don't see any use of links to deidetected.com within Wikidata for the purpose of harassment that would justify putting it on a blacklist. ChristianKl ❪✉❫ 13:09, 26 September 2024 (UTC)
Captions and colour of images of personalities scanned from books
There are many drawings of Czech personalities used here in Wikidata which look like e. g. File:Adolf Heyduk – Jan Vilímek – České album.jpg or File:Karel Jaromír Erben – Jan Vilímek – České album.jpg. Sometimes there may be another version of the same image of similar quality in Commons (often uploaded by me) such as File:Karel Jaromír Erben (cut).jpg, which
- is without the caption present directly in the image,
- is cropped more tightly with less space around the portrait,
- has the colour of the book paper removed.
Until now I have not had any problems with such replacements in Wikidata, although I've been doing this occasionally for years, but now I have met with disagreement from User:Skot. Because my arguments did not convince him, and because I would like to prevent any edit-warring, I would like to ask other people for their opinions.
While I admit that the pictures with captions can be useful for somebody, and so it is good to have them on Commons, I also believe that pictures without them, like File:Karel Jaromír Erben (cut).jpg, are better for Wikidata, because WD has a different tool for adding captions when they are needed: the qualifier "media legend". Images from Wikidata are also reused by other projects, which likewise use their own tools for captions, and the result is that the caption gets doubled, as has happened e.g. in s:Author:Adolf Heyduk.
Cutting off the extra space around the portrait leads to better display in infoboxes, where the portrait looks larger, while taking the same infobox space.
I also think that the yellow-brown colour of the book paper is noise contaminating the picture and worsening its contrast, and that it is not the colour of the picture but the colour of the medium from which the picture was taken. While it is absolutely possible to use such coloured pictures in Wikidata when there is no choice, if there is a possibility to choose between two alternatives, users should not be prevented from replacing the coloured version with the de-coloured one in these cases.
Any opinions on whether such replacements in Wikidata are acceptable are really appreciated. -- Jan Kameníček (talk) 20:58, 17 September 2024 (UTC)
- @Jan.Kamenicek: I totally agree with you, a cropped and clean version is obviously better (especially as the coloration is a degradation of the paper that was not originally present nor intended; but also the white space or the caption inside the image doesn't bring anything really useful for Wikidata). AFAIK, most people do the same (and for years it was done on other projects before Wikidata). I'm curious to hear Skot's arguments and reasoning. Cheers, VIGNERON (talk) 11:21, 21 September 2024 (UTC)
- @VIGNERON The question is not whether to adjust the colour of the images (we both adjust them); the question is the degree of adjustment. You can compare the original file: https://ndk.cz/uuid/uuid:8ea19fd0-cb83-11e6-ac1c-001018b5eb5c
- For me personally, my colleague's degree of colour editing (it is simply automatic contrast) goes beyond the point of losing image information, so I will not do it this way. At the same time, I have no intention of replacing his images as long as he uploads them in full resolution from the original source. The core of the problem is determining whether one of the colour versions is significantly better than the other, justifying the systematic replacement of each other's images within Wikimedia projects.
- Regarding images without captions, I feel there is broad consensus within the community, so there is no problem uploading both the version with a caption and the one without directly to Commons. Skot (talk) 12:05, 21 September 2024 (UTC)
Dict: protocol
Do we have a server for the dict: protocol, as described in this blog post and at DICT?
Curiously, if I type dict:cheese in the search bar here, I am taken to https://www.wikidata.org/wiki/Special:GoToInterwiki/dict:cheese
(and similar if I do so on en.Wikipedia, etc.*), which displays:
Leaving Wikidata
You are about to leave Wikidata to visit dict:cheese, which is a separate website.
Continue to https://www.dict.org/bin/Dict?Database=*&Form=Dict1&Strategy=*&Query=cheese
and not to a Wikidata entry (nor a Wiktionary page**). Can we get that changed?
[* doing so on fr.Wikipedia still takes me to an English definition; does it do so for people whose browsers use other languages?]
[** Also raised at wikt:Wiktionary:Grease pit/2024/September#Dict: protocol]. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:15, 18 September 2024 (UTC)
- See phab:T31229.--GZWDer (talk) 16:02, 18 September 2024 (UTC)
- It's kind of retro-sexy but also incredibly niche. Since the WMF offers servers to useful projects, you could set up a server that offers Wikidata or Wikipedia over gopher, telnet (BBS-style or non-interactive) or dict. I didn't intend to register a developer account, but for something fun like this I might change my mind. I can hardly think of a better way of procrastinating. Be warned though, I might be compelled to do the implementation in D (a language designed by Walter Bright and Andrei Alexandrescu, both C++ heavyweights) just because I want more experience with it. I reckon a single Docker instance will do, which doesn't require much formality, so this could be up and running without too much delay. Main thing would be agreeing on how the protocols should be queried. Infrastruktur (talk) 17:39, 19 September 2024 (UTC)
- Had a quick look at it. I guess dict protocol makes more sense for Wikipedia and Wiktionary than it does for Wikidata, as Wikidata doesn't have short definitions. Seems most dict servers serve a unicode text file that consists of a key-value pair of dictionary entry and its definition. If we don't expect much traffic I think an approach where we skip compiling a dictionary and merely act as a gateway transforming the first paragraph of Wikipedia articles into pure text, stripping out any templates, should be sufficient. Might also scrap the plan to use D for this and just use good old Python. Looks like lookups will also be exact, so no search suggestions or anything like that. No user authentication required either, but people might like support for encryption.
- On a related note I use bang codes in my browser bar to look up stuff. If DuckDuckGo is configured you can just type "!wd Q12345", "!wen Marco Polo" or "!mw gourd" to look up stuff quick. Lots of dictionaries supported [1]. Infrastruktur (talk) 17:45, 25 September 2024 (UTC)
- @Infrastruktur: "Wikidata doesn't have short definitions" On cheese (L4517), for example, at L:L4517#S1, I can see "milk-based food product". But then we also have cheese (L331133), so I guess we would need to query all lexemes with a label matching the desired string. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:04, 25 September 2024 (UTC)
- Forgot about lexemes. It might be easier to handle lexemes with the same name than it is to handle Wikipedia disambiguation pages, where it's not clear which of the links is a definition of the word. This protocol is all new to me so lots of things to figure out still. Infrastruktur (talk) 23:43, 25 September 2024 (UTC)
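A hedged sketch of the lexeme lookup suggested above, using the Wikibase lexeme RDF model on WDQS (the lemma "cheese" and the English-gloss filter are only illustrative choices):

SELECT ?lexeme ?lemma ?gloss WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          wikibase:lemma ?lemma ;
          ontolex:sense ?sense .
  ?sense skos:definition ?gloss .
  FILTER(STR(?lemma) = "cheese")
  FILTER(LANG(?gloss) = "en")
}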
aircraft engine (Q743004) and models (also many classes similar to aircraft engine (Q743004))
The class aircraft engine (Q743004) has problems in Wikidata. aircraft engine (Q743004) is a subclass of physical object (Q223557), which means that its instances are physical objects. But almost all the instances of aircraft engine (Q743004) are not physical objects, instead mostly being aircraft engine models like Poinsard (Q7207885).
What should be done to fix this problem? The simplest fix would be to just make aircraft engine (Q743004) no longer a subclass of physical object (Q223557) and other classes that cause similar problems, perhaps by replacing aircraft engine (Q743004) subclass of (P279) aircraft component (Q16693356) with aircraft engine (Q743004) is metaclass for (P8225) aircraft component (Q16693356). But that would leave all the labels and descriptions for aircraft engine (Q743004) as is, not corresponding to the actual intent of the class. Changing just the English label and description would be possible but would cause a difference in the meaning of the labels across languages. Adding an English value for Wikidata usage instructions (P2559) would help a bit but would not solve the mismatch. Changing all the descriptions doesn't seem immediately possible.
A variation of the first option would be to add a new class for aircraft engines, transfer all the labels and descriptions and aliases to the new class, correctly place the new class in the Wikidata ontology, give the new class appropriate label and description and aliases, and make the model instances also be subclasses of the new class.
Another option would be to make all the aircraft engine models that are currently instances of aircraft engine (Q743004) subclasses of it instead, perhaps in conjunction with making the models instances of some suitable metaclass like engine model (Q15057021). This option probably requires many more statement changes than the previous approaches. If this is done, adding an appropriate English value for Wikidata usage instructions (P2559) on aircraft engine (Q743004) seems indicated.
Given what I have seen in related classes, I expect that there are many classes that have the same problem so perhaps the best way forward is to consider the problem in general and come up with a general solution.
Does anyone have preferences between these approaches? Does anyone have a different approach to fix this problem? Does anyone know what is the best way to gather a community that could come up with a consensus decision? Peter F. Patel-Schneider (talk) 15:32, 18 September 2024 (UTC)
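For scoping the problem, a query along these lines (an illustrative sketch on WDQS, using the label service) lists everything currently stated as an instance of aircraft engine (Q743004), most of which are really engine models:

SELECT ?engine ?engineLabel WHERE {
  ?engine wdt:P31 wd:Q743004 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}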
- the standard solution would be to create a new item "aircraft engine model" as a metaclass for aircraft engine. Fgnievinski (talk) 23:50, 19 September 2024 (UTC)
- We'd save a lot of work generating a parallel *_model hierarchy if something could be an instance of (P31) of product model (Q10929058) and a subclass of (P279) (or some other property) of its functional class. Vicarage (talk) 00:17, 20 September 2024 (UTC)
- Care to exemplify Fgnievinski (talk) 02:03, 20 September 2024 (UTC)
- Peter and I are working on a RfC. Vicarage (talk) 20:51, 21 September 2024 (UTC)
- Sadly I think the first thing you need to do is make sure aircraft engine (Q743004) is used as a subclass of (P279), not the 237 times it's currently used as an instance of (P31). Once that's done, I don't see why you can't leave the subclass of (P279) aircraft component (Q16693356) alone, and work up the chain to find the point where a design idea erroneously becomes a physical thing. Naively it seems to me you can have design all the way down to a final instance of (P31) for a single object. But you know a lot more about ontology than me. Really it should be called "aircraft_engine_model", like we have weapon model (Q15142894), but I don't like the idea of doubling up by adding _model to every concept to get two parallel trees. I see we have engine model (Q15057021), so perhaps, as we do with weapons, we make the 237 aircraft engines use that, and keep looking for generic terms at the level above a specific design.
Vicarage (talk) 16:31, 19 September 2024 (UTC)
- Yes, I'm coming around to this approach of moving lots of items from instance of to subclass of aircraft engine (Q743004), except that there might be a few instances that are actual physical objects (like aircraft engines in museum collections). That's probably the largest number of actual changes, but probably the fewest changes to the Wikidata ontology. Peter F. Patel-Schneider (talk) 16:52, 19 September 2024 (UTC)
- Those actual objects should either at a pinch have P31 of physical_object directly, or for museum items, item of collection or exhibition (Q18593264). I expect we can come up with useful reasons why we care about a physical instance of a manufactured design. Vicarage (talk) 16:58, 19 September 2024 (UTC)
Enabling the CampaignEvents Extension on Wikidata
The Campaigns Product team at the Wikimedia Foundation is proposing to enable the CampaignEvents extension on Wikidata by the second week of October.
This extension is designed to make it easier for organizers to manage community events and projects on the wikis, and it makes it easier for all contributors to discover and join events and projects on the wikis. Once it's enabled on Wikidata, you will have access to features that will help with planning, organizing, and promoting events/projects on Wikidata.
These features include:
- Event Registration: A tool that helps organizers and participants manage event registration directly on the wiki.
- Event List: A simple event calendar that shows all events happening on the wiki, particularly those using the Event namespace. It will also be expanded soon to have an additional tab to discover WikiProjects on a wiki.
- Invitation Lists: A feature that helps organizers identify editors who might be interested in their events, based on their editor history.
Please note that some of these features, like Event Registration and the Invitation List, require users to have the Event Organizer right. When the extension is enabled on Wikidata, the Wikidata admins will be responsible for managing the Event Organizer right on Wikidata. This includes granting or removing the right, as well as establishing related policies and criteria, similar to how it’s done on Meta.
We invite you to help develop the criteria/policy for granting and managing this right on Wikidata. As a starting point for the discussion, we suggest the following criteria:
- No active blocks on the wiki.
- A minimum of 300 edits on Wikidata.
- Active on Wikidata for at least 6 months.
Additional criteria could include:
- The user has received a Wikimedia grant for an event.
- The user plans to organize a Wikidata event.
We would appreciate your input on two things:
- Please share your thoughts and any concerns you may have about the proposal to enable the CampaignEvents extension on Wikidata.
- Review the starting criteria listed above and suggest any changes or additions you think would be helpful.
Looking forward to your contributions - Udehb-WMF (talk) 16:00, 19 September 2024 (UTC)
- 300 edits may be too low; Wikidata edits are generally very granular, so it's easy to make a lot of them. Maybe set the minimum at 1000? ArthurPSmith (talk) 18:04, 19 September 2024 (UTC)
- I think 300 or 1000 matters little. The rights also don't give much room to mess up, so it is okay to have a low bar. From the additional criteria, I think a grant is way too restrictive, but the plan to organize is a must. Why else would the rights be needed? Ainali (talk) 18:22, 19 September 2024 (UTC)
- I think the proposed criteria are reasonable. It is really hard to judge someone by the amount of edits because of the tools we are using on Wikidata. Perhaps we want to use a trial period for granting the rights (at least for less experienced users). We could grant it temporary for one year and renew it if it is still needed. --Ameisenigel (talk) 19:35, 19 September 2024 (UTC)
- Hello! As a staff member of an affiliate, I'd suggest adding a criterion that bypasses the number of edits for staff who belong to an affiliate. In the case of Wikidata it's always useful if they know the platform before running an event, but it could be among the responsibilities of a new member of an affiliate's staff to organize an event. Other than that, the criteria seem to follow what other wikis are currently discussing or implementing. Scann (WDU) (talk) 12:48, 20 September 2024 (UTC)
- That's an interesting point that makes me question why we need an extra limit at all. Couldn't this right just be added to what autoconfirmed users can do? If someone misbehaves, it wouldn't be too much hassle to notice it and block, and the harm they can do wouldn't be any worse than being able to create items or pages in the Wikidata namespace. Ainali (talk) 13:38, 20 September 2024 (UTC)
- We could start off without an edit limit for Wikidata and see whether any problems arise that way. If problems arise we can still increase the limit later.
- If I remember right there was in the past some grant funded event that produced a few problems with bad edits. Does anyone remember more and whether the people in question would have fulfilled the limits that are proposed here? ChristianKl ❪✉❫ 20:23, 23 September 2024 (UTC)
- @ChristianKl: I think you mean Wikidata:Project chat/Archive/2023/12#Wikidata-related grant proposals and Wikidata:Administrators' noticeboard/Archive/2023/12#Recent crop of new Nigerian items. --Matěj Suchánek (talk) 07:44, 26 September 2024 (UTC)
- @Udehb-WMF: what do you think about the case Matěj linked to? Should we assume that the WMF is capable of not repeating that mistake in future grants for events? If so we wouldn't need an amount of edits of Wikidata as a limit. ChristianKl ❪✉❫ 10:47, 26 September 2024 (UTC)
- Good idea, if we can enable this extension, we may need to remove Wikidata:Account creators group?--S8321414 (talk) 12:34, 25 September 2024 (UTC)
- This is completely unrelated since the event organizer role is only for the usage of the events extension. Account creator is nearly unused on Wikidata. --Ameisenigel (talk) 15:21, 25 September 2024 (UTC)
Rewriting Blazegraph syntax to pure SPARQL 1.1
Hi All,
@Andrawaag and I have been working on extracting all SPARQL templates in Wikidata at the latest Biohackathon Japan DBCLS BioHackathon (Q109379755), so that we can automatically rewrite them to pure SPARQL 1.1. The tool we use is something developed for the SIB Swiss Institute of Bioinformatics (Q3152521) sparql examples project [2], which now has support for crawling Wikibase. We still have open issues [3] and we hope that the community can find some more. We have aimed to be gentle with crawling and encourage testing the Java code with the "--use-cached" flag so that you don't use more server resources than needed.
For example, the nice static webpages are still missing due to an issue with Jekyll. Also, the extraction of "comments" is still being worked on by @Andrawaag.
Please let us know if you like/dislike this and what you would like to see added.
Regards, Andra and Jerven (talk) 16:20, 20 September 2024 (UTC)
- An example of a rewritten query is http://jerven.eu/wikibase-sparql-examples/examples/wikidata/0a6179e690a052035e4812db5a566b87/ which was rewritten from blazegraph syntax from a query found on [a request a query archive page] Jerven (talk) 13:19, 23 September 2024 (UTC)
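To illustrate the kind of rewrite involved (a hedged sketch for this page, not one of the crawled queries): a Blazegraph named subquery such as

SELECT ?item WITH {
  SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10
} AS %people WHERE {
  INCLUDE %people
}

can be expressed in pure SPARQL 1.1 by inlining the subquery:

SELECT ?item WHERE {
  { SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 10 }
}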
- We now have a status page [4] that keeps track of how many queries we can parse with the different parsers. This shows we fail to parse 3.6% with the Blazegraph parser, 5.6% with RDF4J and 6.9% with Jena. The Blazegraph named subqueries are starting to get fixed; currently we still fail to fix examples with an INCLUDE inside a WITH clause. @AndreaWest is this useful for you? Jerven (talk) 13:16, 24 September 2024 (UTC)
- @Andrea Westerinen is this useful for your work? Jerven (talk) 13:17, 24 September 2024 (UTC)
- @Jerven: IMO it would be useful to also keep and make accessible the original versions. If the queries use eg named subqueries, they may be significantly easier for humans to read in their original form than in the modified form -- and the named subquery form may be more efficient too, if the results of the subquery are re-used more than once.
- Keeping both forms may therefore be useful, for both output and performance comparisons, as well as for archive/corpus purposes.
- With luck, any replacement chosen for Blazegraph will include a native implementation of the named subquery extension to the standard; or, failing that, your translator could be used as a preprocessor (albeit that might not be able to capture the efficiency gain of named subqueries when data is used more than once). Jheald (talk) 20:39, 24 September 2024 (UTC)
- It's maybe also worth remembering other extensions in Blazegraph, such as
bd:sample
to generate a given number of truly random sample triples, which also has the potential to be very valuable, but cannot efficiently be translated into standard SPARQL 1.1; it would be good if potential vendors could find ways to make that accessible too. -- Jheald (talk) 21:00, 24 September 2024 (UTC)
- Hi @Jheald, the Blazegraph queries are stored, just not shown in the rendering to Markdown, even if they are fixed a little bit by adding the prefixes in use. See this example [5] for how it is done for now. Do you think it would be useful to have this in the markdown?
- I hope that the vendors implement something more general than the named subqueries as implemented in Anzo and Blazegraph. Jerven (talk) 07:25, 25 September 2024 (UTC)
Automotive
Imperial cars are called Chrysler Imperials. Imperial has not been a model of Chrysler since 1954. In 1955 Imperial became a separate car brand like Lincoln is from Ford or Cadillac is from Buick. Since 1954 there has not been a Chrysler Imperial. 2600:1700:E770:7680:54D0:3A6B:2947:91BA 20:32, 20 September 2024 (UTC)
- There was a Chrysler Imperial between 1989 and 1993 (https://www.nytimes.com/1989/11/08/business/chrysler-imperial-joins-cutthroat-car-market.html https://www.macsmotorcitygarage.com/the-royal-prerogative-chryslers-very-last-imperial-1990-93/). Chrysler Imperial (Q1088705) is for the Chrysler model series, Chrysler Imperial (Q97373892) is the 1989-1993 Chrysler model, and Imperial (Q668822) is for the brand. The models have "brand: Chrysler", which links to Chrysler (Q29610); the brand only has "founded by" and "owned by" linking to the Chrysler company Stellantis North America (Q181114). This seems to be correct, at least for the time the cars were produced (although this may need changing if the Stellantis North America (Q181114) item is split). The items could probably be improved with the addition of start and end of production or sale - Imperial (Q668822) has inception (P571) and dissolved, abolished or demolished date (P576), but I'm not sure they are suitable properties for a product. Peter James (talk) 00:20, 21 September 2024 (UTC)
- I found the properties date of commercialization (P5204) and discontinued date (P2669) and added them. There was also the Chrysler Imperial Concept (Q50396854) from 2006 but that is probably not relevant as it was only a concept car. Peter James (talk) 00:50, 21 September 2024 (UTC)
Description
Hi, could you explain the easiest way to add the same description to all articles within a specific category on Wikipedia? I'm looking to do this for settlements in the municipalities of Serbia, based on the categories in the Serbian-language Wikipedia. I'm aware of QuickStatements but haven't used it before. Thank you! — Sadko (words are wind) 01:48, 21 September 2024 (UTC)
- Hello Sadko,
- with Help:QuickStatements you need to prepare a list of QIDs with the description, for example using PetScan or SPARQL.
- The format is described at:
- The statements for QuickStatements also could be prepared and modified with the help of LibreOffice, OpenOffice, Excel, etc. (three columns, which can be copy/pasted into QuickStatements)
- Also see:
- M2k~dewiki (talk) 20:00, 22 September 2024 (UTC)
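For illustration, a hedged sketch of the QuickStatements (V1) input described above: one tab-separated line per item, with Dsr setting the Serbian description (the item ID here is the Wikidata sandbox item and the description text is a placeholder, not real data):

Q4115189	Dsr	"насеље у општини X"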
Deletion of Userpage
Hello,
I unknowingly created my user page on this wiki. Later I realized that global user pages at Meta-Wiki exist. Now I am confused about where to list my user page for speedy deletion. Any help would be greatly appreciated regarding this issue. Thanks in advance! Bunnypranav (talk) 12:36, 21 September 2024 (UTC)
- Hi Bunnypranav, just add
{{Delete}}
template to your user page. Samoasambia ✎ 13:13, 21 September 2024 (UTC)
- Thanks for the clarification. I thought this template could not be used for user-nominated speedy deletions. Bunnypranav (talk) 13:15, 21 September 2024 (UTC)
Merging needed of two items

I accidentally created Q130340447 while Q5649310 already existed. Some things are not correct: it is not out of use but rebuilt for another purpose (no trains, but used as a greenhouse). Smiley.toerist (talk) 21:24, 21 September 2024 (UTC)
- I tried to merge these but they have conflicting commonswiki sitelinks (one to the Category, the other to the name). I'm not familiar with how commonswiki links should work so if somebody else can take a look I'd appreciate it, thanks. ArthurPSmith (talk) 15:51, 23 September 2024 (UTC)
- From my point of view there is no need to merge these two items, since one covers the (former) railway station and the other the botanical garden at the same place. The two objects link to each other.
- Also see
- M2k~dewiki (talk) 15:53, 23 September 2024 (UTC)
- Ok, but they have the same commons category etc. statements, while the commonswiki sitelinks differ - surely only one of those should be in effect? ArthurPSmith (talk) 19:23, 24 September 2024 (UTC)
- Regarding Commons, there is a category on Commons as well as a gallery. Commons categories should be connected to category items, gallery sitelinks should be connected to article items, and the article item and the category item can link to each other. If there is no gallery on Commons, the Commons category sitelink can also be connected to the article item. M2k~dewiki (talk) 19:31, 24 September 2024 (UTC)
- Ok, but they have the same commons category etc. statements, while the commonswiki sitelinks differ - surely only one of those should be in effect? ArthurPSmith (talk) 19:23, 24 September 2024 (UTC)
- The train images of the old station shed, File:Renfe 597-1981.jpg and File:Madrid Atocha 1981.jpg, are placed in the general item Q2842655. I see no reason to use a separate item for the old train shed in these cases. It only causes confusion. Smiley.toerist (talk) 10:23, 25 September 2024 (UTC)
- I would tend to agree. It is the same building. I have removed the statements which were confusingly linking them, and also sorted the category item. So they should be ready for merging now — Martin (MSGJ · talk) 11:57, 25 September 2024 (UTC)
Expelled because of endorsement
How do I say that a person was excluded from a political party because they endorsed another political party? How can I represent that in Wikidata using the property end cause (P1534)?
For instance, most political parties have in their charter URL (P6378) that if you endorse or are elected on another political party's ticket, then you get expelled from your political party. Johshh (talk) 23:01, 21 September 2024 (UTC)
- I'd likely just use P1534 but then add additional qualifier property statements to say when/where/how/why. There are a few "expulsion" concepts you could link the qualifiers to; there's also this one, removal from office (Q106677579), that I found, but maybe that one is too narrow for your use case and you might have to search for others, or create one called "removal by policy" or something suiting the exact case like "removal from political party". Thadguidry (talk) 05:55, 22 September 2024 (UTC)
- exclusion from a political party (Q50394689) exists for this purpose M2Ys4U (talk) 20:05, 22 September 2024 (UTC)
- I'm not sure a database like Wikidata is really suited to model such fine-grained details. I'd almost argue that certain minute intricacies are much better handled in text form on platforms like Wikipedia. Wikidata might arguably be better off not trying to recreate every little detail with our rather blunt, nuance-less, context-less database properties. --2A02:810B:5C0:1F84:CC37:BB5C:F551:763A 17:41, 22 September 2024 (UTC)
How do I replace a citation? How do I get coords displayed in decimal degrees?
First question. There are many Queensland place names which have been imported from something called the Geographic Names Server. I don't know what that is, but it is not an authoritative source for Queensland place names and the information is often wrong, and then it creeps into Wikipedia and Commons, infecting them with its misinformation. My immediate problem is Alligator Creek (Q21922742). Although I managed to change the coords and I eventually managed to delete the Geographic Names Server, I cannot see how to add a citation to a reliable source, the Queensland place names database, entry 391. So how do I do this? There is a cite template on Wikipedia for the purpose (cite QPN). How do I use it here, or how do I get something equivalent set up on Wikidata?
To illustrate the problem, see this Wikidata-generated infobox on Commons
https://commons.wikimedia.org/wiki/Category:Alligator_Creek_(creek,_City_of_Townsville)
See how the point is the middle of dry land because it is incorrect. The creek mouth is further west.
Second question. Most of the Queensland resources use decimal degree format. Although I entered my coords in the decimal degree format, they are displaying as DMS. How do I get them to display (in general or just for me) as decimal degrees so I can check what's in Wikidata without having to manually convert all the DMS each and every time.
Thanks, Kerry Kerry Raymond (talk) 02:56, 22 September 2024 (UTC)
- They are stored in decimal format; it can be seen in the Wikidata Query Service, and in diffs when coordinates are changed, added or removed (example: Special:Diff/2251695016), but I am not aware of a way to display the decimal values in a page. The coordinates form a link using decimal format, so Q8678#P625 displays as 22°54'40"S, 43°12'20"W but the link goes to https://geohack.toolforge.org/geohack.php?params=-22.91111111111111_N_-43.205555555555556_E_globe:earth&language=en (south and west are represented by negative values of N and E) - hover over the coordinates or copy the link for the decimal value. After clicking "edit" the link disappears; it returns after refreshing or reopening the page. The DMS coordinates displayed can also be inaccurate because of precision, but the correct value is still in the map and link, for example in Q9397598#P625 the value entered is 53.5592°N, 18.3625°E, which with default precision would convert to 53°33'33.1"N, 18°21'45.0"E, but precision - seen by clicking on "edit" or viewing a diff - is 0.15123359250243 (added by a bot; that value is not an option in normal editing), so the DMS displays as 53°32'N, 18°18'E. Peter James (talk) 19:37, 22 September 2024 (UTC)
- There is a long-standing Phabricator ticket for displaying coordinates as decimals. M2Ys4U (talk) 20:12, 22 September 2024 (UTC)
- It seems amazing that a simple thing like this can't be done. And I guess I wonder why, if the internal representation is decimal degrees, is the default rendering DMS? It is the 21st century and all geospatial tools and datasets I use are based on decimal degrees. Kerry Raymond (talk) 01:25, 23 September 2024 (UTC)
- @Kerry Raymond You can't easily do that on the front-end, but you can download any coordinate statement in the original (i.e. decimal) format using a SPARQL query. People are confusing the front end of Wikidata with Wikipedia, but Wikidata is not really meant for browsing. Most of the stuff here is meant to be done on the underlying data, and that's where the development of Wikidata should concentrate its focus. In other words, what you call a 'simple thing' is actually simple, but probably unimportant and a useless waste of developer's time. Vojtěch Dostál (talk) 07:11, 23 September 2024 (UTC)
- "Unimportant" and "useless waste of time". We are discussing data quality. How unimportant is that? Kerry Raymond (talk) 08:45, 23 September 2024 (UTC)
- @Kerry Raymond In my view, the way data are displayed has nothing to do with data quality. Perhaps you mean 'ability to assess data quality quickly', because you want to compare the statements in Wikidata to a source which uses a decimal format. I'd retort that such comparisons are best done by downloading coordinates from both databases, merging them in a spreadsheet and doing the comparison in a much more high-throughput manner. For that, the way data are displayed is not an issue at all. Is it maybe possible that you are trying to use the Wikipedian way of verifying stuff for a completely different Wikimedia project, where we tend to have different approaches to this? Vojtěch Dostál (talk) 09:08, 23 September 2024 (UTC)
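A minimal sketch of such a query on the Wikidata Query Service, returning the stored (decimal) coordinate for the item discussed above:

SELECT ?coord WHERE {
  wd:Q21922742 wdt:P625 ?coord .
}

The result comes back as a WKT literal of the form "Point(longitude latitude)" in decimal degrees.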
- "Unimportant" and "useless waste of time". We are discussing data quality. How unimportant is that? Kerry Raymond (talk) 08:45, 23 September 2024 (UTC)
- @Kerry Raymond You can't easily do that on the front-end, but you can download any coordinate statement in the original (i.e. decimal) format using a SPARQL query. People are confusing the front end of Wikidata with Wikipedia, but Wikidata is not really meant for browsing. Most of the stuff here is meant to be done on the underlying data, and that's where the development of Wikidata should concentrate its focus. In other words, what you call a 'simple thing' is actually simple, but probably unimportant and a useless waste of developer's time. Vojtěch Dostál (talk) 07:11, 23 September 2024 (UTC)
- It seems amazing that a simple thing like this can't be done. And I guess I wonder why, if the internal representation is decimal degrees, is the default rendering DMS? It is the 21st centuryand all geospatial tools and datasets I use are based decimal degrees. Kerry Raymond (talk) 01:25, 23 September 2024 (UTC)
- @Kerry Raymond: Regarding adding references, there is Queensland place ID (P3257) you can use. I've added it as an identifier to Alligator Creek (Q21922742), plus you can use it as a reference. Check the item to see how I did it. Unfortunately, there doesn't seem to be any way to reference the specific "Queensland place names search", as individual entries do not have their own webpage. But, it appears P3257 uses the exact same database. — Huntster (t @ c) 23:29, 22 September 2024 (UTC)
- Thanks for that. I did find that and tried to use it but WikiData refused to let me publish it, so evidently whatever I did with it was wrong in some way. I guess I try again and see if I can make it work. Kerry Raymond (talk) 01:20, 23 September 2024 (UTC)
Extra badge for Wiktionary redirects
Some wiki pages, like https://en.wikipedia.org/wiki/Brachyology, aren't normal redirects but redirects to Wiktionary. Currently, it seems that those sitelinks are often listed without a badge, similar to sitelinks that aren't redirects. Should we have a new badge for those pages? ChristianKl ❪✉❫ 17:41, 22 September 2024 (UTC)
- Yes. Ymblanter (talk) 18:48, 22 September 2024 (UTC)
- @ChristianKl: I think Wiktionary redirects are usually stored in separate items with statement instance of (P31)Wiktionary redirect (Q21278897). But sometimes they have gotten mixed up with regular sitelinks. Samoasambia ✎ 07:46, 23 September 2024 (UTC)
- Those items are not notable according to our policies, so the solution for them is to be deleted. Sometimes however they are intermixed with other sitelinks. ChristianKl ❪✉❫ 08:06, 23 September 2024 (UTC)
New ticket about making Wikidata horizontally scalable
Feel free to join the discussion about making Wikidata great and sustainable 🤩 https://phabricator.wikimedia.org/T375352 So9q (talk) 04:34, 23 September 2024 (UTC)
- It's not a ticket about making it scalable. It's a ticket about wanting it to be scalable without understanding the reasons why Wikidata isn't. SPARQL-based databases don't scale horizontally the way a lot of other databases do. ChristianKl ❪✉❫ 08:10, 23 September 2024 (UTC)
- Don't you think that's unnecessarily blunt? But well, graphs are very nice, but they don't scale into eternity either. And I'm not sure how well SQL scales beyond a single server. A layman's naive impression is that we might get two decades if we federate and get a better triplestore. But yeah, at some point, if we refuse to set hard guidelines for what we include (which I believe So9q has advocated for), we will eventually reach the point where graphs are simply no longer an option, so a fundamental change is inevitable. As the CouchDB docs say, "disks are cheap", but expanding from 3 indexes to a stupid amount also has a cost, although it certainly will scale, but it will also have lost some of its appeal. Infrastruktur (talk) 15:05, 23 September 2024 (UTC)
- So9q wrote a post claiming that he knows what the community wants without having done the work of figuring out what the community wants. In a case like that, I do think a blunt statement is warranted. I don't think people should write that way if they are just speaking about their own opinion.
- As one of the lead CouchDB developers once explained to me, CouchDB has a philosophy of not offering you features that don't scale. If you ask them "Why does CouchDB not support feature X that MongoDB supports?", the standard answer is "Because there's no way to develop the feature so that it scales to really large datasets".
- Disks are cheap and some problems are solved by having more disks. Storing data on WikiCommons for example is solved by simply having more disks and thus we could use "tabular data" more to offload some data off Wikidata. ChristianKl ❪✉❫ 17:40, 23 September 2024 (UTC)
- Thanks for pointing that out. I will gladly copyedit the statements in question. Which are you referring to?
- The issue here from my point of view is that very little discussion has happened here since 2019 about what the community wants.
- Based on the very recent discussion about import-policy I conclude that the community does not want to limit the growth.
- It wants the WMF to fix any scaling issues so we don't have to worry about technical limits or choosing to import some amount of information over another despite both being notable. So9q (talk) 09:05, 24 September 2024 (UTC)
- I think statements about what the community wants in a phabricator ticket should only be made if there's community consensus for a given position. You wrote "The Wikidata community does not want to bother or worry about technical limits". For my part, having more information about the technical limits so that we can optimize Wikidata to work better within the existing technical limits would be great.
- Ideally, we would have a system that scales perfectly. Unfortunately, that's not possible. The fact that a system like Telegram can easily run on a NoSQL database and thus scale does not imply that this is possible for a triple store that can be queried with SPARQL. If you want Wikidata to scale horizontally in a way that makes it impossible to run SPARQL queries that currently run fine, there are likely going to be people in our community who think that this isn't worth it.
- WMDE recently developed the "mul" language code to reduce the amount of unnecessary edits that get made and information that's stored in the database. That's a decision that allows us to have more data overall. ChristianKl ❪✉❫ 11:34, 24 September 2024 (UTC)
- I'm not talking about the SPARQL database per se. I know those don't scale well.
- The graph split can be viewed as a kind of manual sharding of the graph database, with the downside that it affects queries and thus the user, which is undesirable but hard to avoid in the case of Blazegraph (and perhaps any other graph database in existence). So9q (talk) 08:57, 24 September 2024 (UTC)
- Don't you think that's unnecessarily blunt? But well, graphs are very nice but they don't scale into eternity either. And I'm not sure how well SQL scales beyond single-server. A laymans naive impression is we might get two decades if we federate and get a better triplestore. But yeah, at some point, if we refuse to set hard guidelines for what we include (which I believe So9q have advocated for) we will eventually reach the point where graphs simply is no longer an option so a fundamental change is inevitable. As the CouchDB docs say "disks are cheap", but expanding from 3 indexes to a stupid amount also have a cost, although it certainly will scale, but it will also have lost some of its appeal. Infrastruktur (talk) 15:05, 23 September 2024 (UTC)
- I think User:ASarabadani_(WMF)/Growth_of_databases_of_Wikidata would be a better place to discuss things. Vicarage (talk) 15:11, 23 September 2024 (UTC)
- I disagree; the scalability issues reported on that page are a concern for the whole Wikidata community and the wider ecosystem IMO.
- Perhaps it should be moved to Meta, since a failure of the Wikidata MariaDB cluster would affect all wikis that are linked to Wikidata, which is all of them.
- The technical and community health of Wikidata concerns all wikis and thus the whole movement. So9q (talk) 08:51, 24 September 2024 (UTC)
- I followed up with two child tickets initiating a search for a replacement of the master-and-replicas MariaDB setup, which is outdated and does not scale horizontally for both read and write operations.
- Also it has issues like lack of automated failover, lack of features like sharding, self-healing nodes, etc.
- See https://phabricator.wikimedia.org/T375472 So9q (talk) 08:54, 24 September 2024 (UTC)
- I got a response from the lead MediaWiki backend operations engineer, and the ticket and subtickets I wrote were declined. See my response
- As I note in the response, the MariaDB backend is NOT scalable, and offloading all the scholarly articles to a separate Wikibase (which has not been funded or approved by the board yet; see the proposal) is NOT a viable long-term solution.
- Basically our engineers are using a 2005 database setup (a master on a single machine with a few replicas) not geared to big data at all. It's NOT best practice as of 2024, and it's not going to get any better by sticking our heads in the sand and hoping for good luck (which the lead engineer seems to want, along with a few optimizations to the table layout).
- Soon enough we will reach 100M items again, once @Egon Willighagen imports millions more chemicals or someone imports all the named streets of the USA and Russia, all bridges in Sweden, etc.
- We need the WMF board and tech team to consider ways forward NOW; according to @ASarabadani (WMF), time is running out for wikidatawiki.
- I'm considering writing a letter to the new board alerting them to this precarious situation, you are very welcome to join me, write me an email through my user page or reach out to me in telegram. So9q (talk) 10:52, 26 September 2024 (UTC)
- The database architect of the WMF seems surprisingly pessimistic when it comes to scaling a SQL database horizontally. I just replied in Phabricator to one of his comments with a possible open-source drop-in replacement for MariaDB.
- I urge the readers and users of Wikidata to ask themselves: if a community member can, in his spare time, find a solution to the problem stated by @ASarabadani (WMF) in a few minutes of browsing Wikipedia for distributed SQL database engines that are open source, why has the highly paid WMF engineering team not done anything about this since the scalability issues became common knowledge? Why are they so negative toward community members pointing to possible solutions? Why are they so unwilling to reflect on their own architecture decisions?
- What could be causing this? What has hindered a solution to be found since 2012? (they could have continuously projected the growth of Wikidata and tested their current setup with dummy data and forecasted that we would outgrow a single machine master mariadb database long ago). Why did they fail to do that?
- Imagine having a technical management and team of lead engineers who would rather try to impose growth limits on our thriving community of 23k contributors (and millions of consumers worldwide of the data every month) than do their job and make sure the backend scales according to community needs and the vision of the foundation. Is that what is going on?
- I wonder if this situation is known to the board and what consequences it is going to have. WDYT? So9q (talk) 12:04, 26 September 2024 (UTC)
- The fact that ASarabadani wrote the post, suggests to me that he's considering ways forward. Writing a letter to the WMF board suggesting that he isn't considering the problems because he closed your tickets, seem like unnecessary drama.
- Basically, you claim that you have a better idea of the kind of work that would be needed to change the present code base to software like MySQL Cluster than ASarabadani does. I find it highly unlikely that this is true. If you write a letter to the board, I would expect that you are not going to convince them that you understand the MediaWiki code base and what would be required to change it to be horziontally scalable better than ASarabadani just because you read a few articles on Wikipedia about distributed SQL database engines.
- Writing software new software COBOL is not "best practice". That doesn't mean that banks aren't still running on a lot of COBOL code. Changing legacy system is not easy.
- The scalability bottleneck that Wikidata had to deal with in 2019 was about the number of edits that Wikidata is able to process per minute. It was not about the size of the SQL database. Focusing engineering resources on the SQL database would not have helped resolve the bottleneck we had at that time.
- When optimizing a system, it's important to understand the bottlenecks that exist and focus on solving them. You make suggestions without having tried to understand the existing bottlenecks. ChristianKl ❪✉❫ 12:37, 26 September 2024 (UTC)
- "The scalability bottleneck that Wikidata had to deal with in 2019 was about the number of edits that Wikidata is able to process per minute. It was not about the size of the SQL database. Focusing engineering resources on the SQL database would not have helped resolve the bottleneck we had at that time."
- Are you sure? A single-server master plus replicas helps scale read operations but not write operations. Moving to a distributed SQL database scales both write and read operations. So9q (talk) 13:08, 26 September 2024 (UTC)
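For illustration only, here is a minimal Python sketch of the read/write split being discussed: reads can be spread over any number of replicas, while every write still funnels through the single primary. This is a generic pattern, not MediaWiki's actual database layer; the hostnames, credentials and the PyMySQL dependency are all assumptions.

```python
# Generic sketch of a primary + replicas setup (NOT the WMF configuration).
import random
import pymysql  # assumption: PyMySQL is installed (pip install pymysql)

PRIMARY = dict(host="db-primary.example", user="wiki", password="secret", database="wikidatawiki")
REPLICAS = [
    dict(host="db-replica1.example", user="wiki", password="secret", database="wikidatawiki"),
    dict(host="db-replica2.example", user="wiki", password="secret", database="wikidatawiki"),
]

def read(sql, args=()):
    # Reads are spread across the replicas: adding replicas adds read capacity.
    conn = pymysql.connect(**random.choice(REPLICAS))
    try:
        with conn.cursor() as cur:
            cur.execute(sql, args)
            return cur.fetchall()
    finally:
        conn.close()

def write(sql, args=()):
    # Every write still has to go through the single primary, so write throughput
    # (and the working set it must keep in RAM) is bounded by that one machine.
    conn = pymysql.connect(**PRIMARY)
    try:
        with conn.cursor() as cur:
            cur.execute(sql, args)
        conn.commit()
    finally:
        conn.close()
```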
- Changing the SQL database can only help scale the write and read operations when the bottleneck is the SQL database in the first place. When the bottleneck is instead the performance of the triple store, it doesn't help you at all. ChristianKl ❪✉❫ 13:18, 26 September 2024 (UTC)
- "Writing new software in COBOL is not 'best practice'. That doesn't mean that banks aren't still running on a lot of COBOL code. Changing legacy systems is not easy."
- I agree, but this situation is very different. I'm NOT talking about rewriting any code. The MediaWiki software is separated from the database, and how the database distributes queries, handles sharding, etc. does not affect the code in any way, AFAIK. That is why it is a drop-in solution that could be tested out in a weekend by anyone who wants to. All you need is two networked machines, a good internet connection and a bit of Linux command-line know-how to load the data from the dumps and set up a Wikidata clone on a distributed database. So9q (talk) 13:12, 26 September 2024 (UTC)
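As a rough sketch of the "load the data from the dumps" step: the public Wikidata JSON dump can be streamed entity by entity and fed into whatever candidate backend one wants to benchmark. The dump location is the real public one; the local file name and everything about the target database are assumptions left open here.

```python
# Rough sketch only: stream the public Wikidata JSON dump entity by entity.
# The dump itself comes from https://dumps.wikimedia.org/wikidatawiki/entities/
import gzip
import json

DUMP_PATH = "latest-all.json.gz"  # assumed to have been downloaded already

def iter_entities(path):
    """The dump is one huge JSON array with one entity object per line."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip().rstrip(",")
            if not line or line in ("[", "]"):
                continue
            yield json.loads(line)

if __name__ == "__main__":
    for i, entity in enumerate(iter_entities(DUMP_PATH)):
        # A real test would INSERT the entity (or its flattened statements)
        # into the distributed database under evaluation at this point.
        print(entity["id"], len(entity.get("claims", {})), "statement groups")
        if i >= 2:
            break
```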
- AFAIK doesn't bring you very far when you don't know what you are talking about. If you ask ChatGPT, which also doesn't understand all the roadblocks, it's able to give you a bunch of reasons why changing to MySQL Cluster would require a lot of work, such as limits on transaction size. ASarabadani is going to know a lot of other reasons why it's hard to simply switch databases. ChristianKl ❪✉❫ 13:25, 26 September 2024 (UTC)
- "When optimizing a system, it's important to understand the bottlenecks that exist and focus on solving them. You make suggestions without having tried to understand the existing bottlenecks."
- Are you sure? If I understood @ASarabadani (WMF)'s information correctly, the core problem is that the sheer size of the wikidatawiki tables makes it hard for the master and the replicas to keep in RAM all the information needed to serve MediaWiki in a timely manner. Buying larger servers is not a solution because of the project's growth rate. Distributing the load over multiple servers is the go-to industry solution for big-data projects, which Wikidata seems to have become. So9q (talk) 13:17, 26 September 2024 (UTC)
- While ASarabadani used to work on Wikidata (and at WMDE), he's now at the WMF as chief database architect for MediaWiki.
- As such, the bottlenecks that Wikidata faces outside of MediaWiki currently aren't his job. That does not mean that Wikidata does not have other bottlenecks that come from the triple store. If you look at the evaluation documents for choosing a new triple store for Wikidata, you find that the number of triples those triple stores can hold is unfortunately limited.
- While there are technical solutions, requiring a lot of work, that might allow MediaWiki to be horizontally scalable, implementing them would not free the Wikidata community from having to worry about our triple count. You don't get 100x growth out of the available triple-store technology. ChristianKl ❪✉❫ 13:47, 26 September 2024 (UTC)
- Wikidata will never be horizontally scalable. Asking who the presidents of the United States (POTUS) are and asking who the male humans are has no semantic difference: the two queries have exactly the same shape. If there were as many POTUS as there are male humans, Wikidata would not be able to answer either question. Midleading (talk) 09:41, 25 September 2024 (UTC)
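To make that point concrete, here is a small Python sketch (an illustration added for clarity, not part of the original post) that sends two structurally identical SPARQL queries to the public Wikidata Query Service. One matches a few dozen items, the other millions, and in practice only the first can be answered comfortably within the service's roughly 60-second timeout.

```python
# Two queries of identical shape, wildly different result sizes.
import requests

ENDPOINT = "https://query.wikidata.org/sparql"
HEADERS = {"User-Agent": "project-chat-example/0.1 (illustrative script)"}

QUERIES = {
    # position held (P39) = President of the United States (Q11696)
    "POTUS": "SELECT ?item WHERE { ?item wdt:P39 wd:Q11696 . }",
    # instance of (P31) = human (Q5), sex or gender (P21) = male (Q6581097)
    "male humans": "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 ; wdt:P21 wd:Q6581097 . }",
}

for label, query in QUERIES.items():
    r = requests.get(ENDPOINT, params={"query": query, "format": "json"},
                     headers=HEADERS, timeout=70)
    if r.ok:
        print(label, "->", len(r.json()["results"]["bindings"]), "results")
    else:
        # The second query will typically hit the server-side timeout.
        print(label, "-> failed with HTTP", r.status_code)
```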
- Let's make one thing clear. Wikidata is a MediaWiki-run wiki. MediaWiki supports nothing but a relational (SQL) database. Such databases are known not to be horizontally scalable. Therefore, Wikidata simply cannot be completely horizontally scalable. I can't imagine the amount of work needed to implement support for a (hybrid) NoSQL storage.
- Note that this has actually nothing to do with the Wikidata Query Service split. These are, unfortunately, two different problems, which do have a common cause: Wikidata is becoming unsustainably large. This is the only thing we can do something about right now. --Matěj Suchánek (talk) 15:38, 26 September 2024 (UTC)
- There are many things that could be done. Currently, the knowledge about how various knowledge modeling decisions affect performance isn't readily available. Gathering that knowledge, writing it up and then bringing it up in relevant decisions would be helpful.
- Initiatives like "mul" can free up capacity that we can use better otherwise. ChristianKl ❪✉❫ 22:38, 26 September 2024 (UTC)
Refs:
Question about Thesaurus links
Hi!
I'm new here on Wikidata, so excuse me if I'm making a faux pas...
Can someone address the question I asked here?
Thanks.
ValJor (talk) 20:21, 23 September 2024 (UTC)
Wikidata weekly summary
There is no status update this week? Ayack (talk) 09:43, 24 September 2024 (UTC)
- The link was posted on X/Twitter by the Wikidata account last night: https://www.wikidata.org/wiki/Wikidata:Status_updates/2024_09_23 Piecesofuk (talk) 09:54, 24 September 2024 (UTC)
- Thank you, but usually I "receive" it on my user talk page, and it's also posted here, on the project chat. Ayack (talk) 10:25, 24 September 2024 (UTC)
- @Mohammed Abdulai (WMDE) FYI. Ayack (talk) 17:47, 24 September 2024 (UTC)
- Very weird. I'm certain that I did push the send button. Thanks for notifying me, Ayack; I'll look into it. -Mohammed Abdulai (WMDE) (talk) 19:17, 24 September 2024 (UTC)
Problem bot edit on Sick Boi (Q7507561)
Sick Boi (Q7507561) was created to link to the Wikipedia article https://web.archive.org/web/20121001051141/https://en.wikipedia.org/wiki/Sick_Boi which was an album/mixtape by the artist Donald Glover aka Childish Gambino (note that most of the statements refer to this original item)
However, at some point this Wikipedia link was turned into a redirect, see https://web.archive.org/web/20210716053015/https://en.wikipedia.org/wiki/Sick_Boi which redirected to https://web.archive.org/web/20210716053015/https://en.wikipedia.org/wiki/Childish_Gambino_discography#Mixtapes
Then at some point this redirect seems to have been redirected to https://en.wikipedia.org/wiki/Ren_Gill#Discography which eventually turned into the article https://en.wikipedia.org/wiki/Sick_Boi (there's a new redirect for the Childish Gambino album at https://en.wikipedia.org/wiki/Sick_Boi_(mixtape) )
- I guess it was this edit https://www.wikidata.org/w/index.php?title=Q7507561&oldid=1780864042 by @MsynBot that updated the redirect to Ren's album.
- Is there an automatic fix to get this back to how it was? Or could I just restore the item to just before this edit and add a separate item for Ren's album? (I noticed that there are labels in dozens of languages that would be lost if I did this.)
- Can this sort of bot edit be prevented from happening in the future? Piecesofuk (talk) 09:49, 24 September 2024 (UTC)
- The bot just adds the badges to make it easier on Wikidata to see what's a sitelink and what isn't. If people over at Wikipedia change what a page is about, there's little the bot can do about that.
- Hey man im josh (talk • contribs • logs) was the first person to change the label away from the identity of the item.
- Restoring it to the state before Josh's edit and then creating a new item for Ren's album would be the way to go. ChristianKl ❪✉❫ 10:13, 24 September 2024 (UTC)
- Thanks for the reply. I've restored the item and created a new one for Ren's album. Piecesofuk (talk) 15:00, 24 September 2024 (UTC)
Misspelled wikidata item
So this was my first time creating a Wikidata item, but I mistakenly put _ instead of spaces in the label. See Mukhtar_Ali_(footballer,_born_1962). Could anyone please let me know how to sort this out? Regards JayFT047 (talk) 23:42, 24 September 2024 (UTC)
- @JayFT047:
Done, there's an edit button in the upper right corner which enables editing labels, descriptions and aliases. Samoasambia ✎ 06:25, 25 September 2024 (UTC)
datatype of P5143
Hi all. I have proposed changing the datatype of amateur radio callsign (P5143) from external ID to string, and I'd like to have a consensus about it before making any changes. If you are interested, go ahead and join the discussion on the property talk page. Thanks. Samoasambia ✎ 08:26, 25 September 2024 (UTC)
- A good approach is to ping all the people who commented on the creation of the property when proposing to change it. ChristianKl ❪✉❫ 12:45, 25 September 2024 (UTC)
- Thanks for the suggestion. I did that now. Samoasambia ✎ 15:23, 25 September 2024 (UTC)
Q1477321 and Q28496595
(Cross-posted from Wikidata:Administrators' noticeboard#Q1477321_and_Q28496595)
The strange edit histories of these two items came to my attention through a reported issue at the Name Suggestion Index GitHub. At one point, Q1477321 referenced what is now the subject of Q28496595, and vice versa. Their original subjects (the entity now represented by Q1477321) appear to be identical (compare the first revisions of Q1477321 and Q28496595), meaning Q28496595 now refers to a different entity than it originally did. Since a lot of the messiness happened years ago, should these pages be left as is, or should Q28496595 still be merged into Q1477321, with the subject of Q28496595 getting a new QID? BrownCat1023 (talk) 12:55, 25 September 2024 (UTC)
FBI file numbers
I'd like to add an FBI file number to a Wikidata item (e.g. 100-HQ-34789, 92-NY-1456, etc.). However, many FBI files were destroyed or are still classified, so I can't link the file number to an external copy of the file in every case. I can provide a reference for each file number, though.
- Is there an existing property, such as "described by source" or "inventory number", that could be used for these numbers? If so, would it be best to create a new item for each FBI file? (One possible shape along these lines is sketched below.)
- If not, would this be appropriate for a new property (something like “Federal Bureau of Investigation File Number”), even if the file numbers won’t link to an external database or site?
Thanks! Nvss132 (talk) 10:40, 26 September 2024 (UTC)
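Purely as a hedged illustration of the first option above (reusing existing properties, not a definitive recommendation or an answer from the community), a pywikibot sketch of that modelling might look like the following: one item per FBI file, linked from the person via described by source (P1343), with the file number stored on the file item as inventory number (P217). The item IDs are placeholders, and running this would require a configured, logged-in pywikibot installation.

```python
# Hedged sketch of one possible modelling; Q-IDs below are placeholders.
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

person = pywikibot.ItemPage(repo, "Q100000001")    # placeholder: the person the file is about
fbi_file = pywikibot.ItemPage(repo, "Q100000002")  # placeholder: a new item for the file itself

# person --described by source (P1343)--> FBI file item
described_by = pywikibot.Claim(repo, "P1343")
described_by.setTarget(fbi_file)
person.addClaim(described_by, summary="link person to the item for their FBI file")

# FBI file item --inventory number (P217)--> "100-HQ-34789" (example number from the question)
file_number = pywikibot.Claim(repo, "P217")
file_number.setTarget("100-HQ-34789")
fbi_file.addClaim(file_number, summary="record the FBI file number")
```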
Dereferencing misattributed Israel CBS IDs
I can't figure out how to edit the pages, and the bot which originally made the errors seems to have been dead for a couple of years now. I added a comment on the talk pages but would appreciate it if someone who knows how to do this would remove the properties.
https://www.wikidata.org/w/index.php?title=Talk:Q48195&oldid=2253153681 https://www.wikidata.org/w/index.php?title=Talk:Q121157&oldid=2253151994 Wissotsky (talk) 11:31, 26 September 2024 (UTC)
- You wouldn't have been able to edit those two items because of the protection ([6][7][8][9]), so it's possible the "edit" links don't appear. I removed them; the correct values were already on items for places in Israel. Everything with CBS IDs now seems to be somewhere in Israel or the Israeli-occupied territories. Peter James (talk) 13:52, 26 September 2024 (UTC)
RfC on object vs design class vs functional class for manufactured objects
@Peter F. Patel-Schneider and I have been in discussion over how we distinguish, for manufactured items, between a physical object, its design, and the function it performs. We propose a series of constraints on their instance and subclass properties, and a simplification of the parochial set of something_type, something_model and something_family classes. We have used military items as exemplars, but the approach would have much wider application. We would appreciate your views at Wikidata:Requests for comment/object vs design class vs functional class for manufactured objects. (talk) 14:26, 26 September 2024 (UTC)
ID property for the actual WPBSA site (snooker association)
It seems we have the WST.tv property, World Snooker Tour player ID (P4498), and the SnookerScores.net property, WPBSA SnookerScores player ID (P10857), but we do not have an ID property for wpbsa.com. It appears that wpbsa.com actually contains a significant amount of data; for example, Mark Allen on WPBSA has more data than the same player on WST. Nux (talk) 19:07, 26 September 2024 (UTC)
Wikidata MOOC For Beginners (in English) - Starting October 1, 2024!
Hi everyone,
A rerun of the Wikidata Open Online Course will kick off on October 1, 2024, and will be available for the following 5 weeks. The previous iteration of the course saw a great turnout, with positive feedback from learners, including GLAM professionals and students.
Here’s what you can expect:
Course Structure
- Chapter 1: The Wikimedia Movement and the Creation of Wikidata
- Chapter 2: Understanding Knowledge Graphs and Queries
- Chapter 3: Discovering Wikidata, Open Data, and the Semantic Web
- Chapter 4: Contributing to Wikidata, the Community, and Data Quality
- Chapter 5: Bonus Resources on Scientific Bibliography from Wikidata
Head over to Wikidata 101: An Introduction to enroll, and don’t hesitate to share it with your friends and colleagues. The course is hosted on learn.wiki, and you can sign up using the same credentials you use for Wikimedia projects.
If you have any questions, feel free to reach out to me directly.
Cheers, Mohammed Abdulai (WMDE) (talk) 19:44, 26 September 2024 (UTC)
Q3219113: error (PS: fixed)
The gender is male, as can be checked via the given URL, but I don't know how to correct this error. Can anybody help? Thank you. Punctilla (talk) 00:35, 27 September 2024 (UTC)
Edit: Oops! I just found out that I could revert the vandalism myself: error fixed. Sorry for the disturbance. Punctilla (talk) 00:44, 27 September 2024 (UTC)
- Thank you for keeping eyes open! --Matěj Suchánek (talk) 06:32, 27 September 2024 (UTC)
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. Matěj Suchánek (talk) 06:32, 27 September 2024 (UTC)
Duplicate entries due to ceb wiki?
Landau an der Isar (Q509536) and Landau an der Isar (Q32084506) seem to be the same but ceb.wiki has two articles. Magnus Manske (talk) 09:24, 27 September 2024 (UTC)