Wikipedia talk:WikiProject Data Visualization
This is the talk page for discussing WikiProject Data Visualization and anything related to its purposes and tasks.
Prior discussion of this project?
Please see the thread I've started at Wikipedia talk:WikiProject Council#Wikipedia:WikiProject Data Visualization concerning the apparent lack of a formal proposal, or indeed any obvious prior discussion of this project at all, before its creation. AndyTheGrump (talk) 10:47, 18 April 2025 (UTC)
Pattern of data visualizations by country
This is a big question, but I have been wondering what standard data visualizations apply to Wikipedia articles of many, most, or all countries.
Is there a popular data visualization which would be a welcome addition to any Wikipedia article on a country? Bluerasberry (talk) 21:36, 19 April 2025 (UTC)
- In order to visualise data, you need the data first. And beyond some very basic demographic data (if even that is available), I doubt you'd find anything standard worth visualising. AndyTheGrump (talk) 21:53, 19 April 2025 (UTC)
- @AndyTheGrump: At Wikidata there already are WikiProjects for countries. They all are trying to coordinate queries for national data in all kinds of fields. I do not think there is a strong connection between the data curation at Wikidata and data visualization here in Wikipedia. There must be something there. Economics? Tourism? Energy use? Education? Trade? Pop culture?
- Steve Ballmer, former CEO of Microsoft, has this project, USAFacts, where he presents all sorts of data visualizations at the national level for the United States: https://usafacts.org/ I am not sure where Wikidata and Wikipedia align for such things, but if we found just one visualization model which was generally interesting, then we could mass post that in many articles in many languages. Bluerasberry (talk) 17:03, 20 April 2025 (UTC)
- Regarding Wikidata, as I'm sure you are aware, it isn't WP:RS. As for mass-posting, that would appear to be grossly inappropriate, given that this is a draft English-language Wikiproject, with no mandate whatsoever to add content elsewhere.
- While I can see real merit in this draft proposal, I have some reservations about its implementation: not least of which is its potential to encourage 'looking for data to visualise', rather than serving the interests of good article creation and maintenance as a whole. In my opinion, rather than looking for new data to 'visualise', the project should instead be looking at individual articles, and asking whether data already present (appropriate, due, reliably sourced data, that is) can usefully be presented visually. If, as a result of this, we conclude that such visualisations could be used more generally, we can then consider doing so. But we need to start from the premise that it is the data that matters, and that good 'visualisation' comes later, where it is appropriate. AndyTheGrump (talk) 17:30, 20 April 2025 (UTC)
- @AndyTheGrump: Right, Wikidata does not fit WP:RS, but Wikipedia infoboxes also typically do not meet RS, and we have not reconciled that over the past ~25 years. Infoboxes for biographies contain birthdays, locations, marriages and other key data which often do not pass RS, and for cities everyone expects population, geography, GDP, and other primary source content which also fails RS. I think the way to address this is to face it directly, state the obvious that there are some expected data values that either we must include or recognize as conspicuously absent, and that we often fill our boxes with content from databases or primary sources.
- We have consensus based on practice and precedent that non-RS data sources go in infoboxes, and going further, it is easy to imagine equivalents of infoboxes for subarticles or subsections. Check the right side of the article Economy of the United States - the infobox is full of primary content, and I count more than 20 data visualizations which themselves are mostly arbitrary, mostly primary source content. You might react that us editors organizing mass-posting is inappropriate, but if you are against standardization, then the default alternative is keeping content like this where random arbitrarily chosen graphics are on the side with no planning and no deliberation. We have been doing wiki long enough that now it is evident that we do not have, and will never have, the editorial labor pool to manually deliberate meaningful up-to-date data tables and visualizations custom for every article where they are needed. Also, I feel it is an inarguable fact that the quality of non-RS compliant datasets is going up quickly. While I am afraid that data can be co-opted for propaganda and by ill-will actors, Wikipedia could legitimately be a place to do some quality control and standardization to say which data sets are reliable, and which data visualizations ought to be standard as general reference information.
- I get your intent when you say "it is the data that matters and visualization comes later", but as a matter of practicality, I think that we might be able to identify the data which matters by identifying which visualizations are super popular, noncontroversial, and widely discussed. I am not sure where this begins - maybe Our World in Data? maybe USAFacts? Maybe what people are doing on Wikidata? Maybe by looking at what visualizations are popular in Wikipedia articles.
- I do not want to be transgressive in lowering Wikipedia's quality standards, but I do think it might be timely to admit that we already lack sourcing for some key data that we already include. Evaluating certain kinds of data visualizations as "standard" could help us streamline our fact-checking process for some data sources for visualizations. I would like your support in finding some way to mint some kind of visualization at scale from some primary data source, which we regard as trusted despite failing RS. Bluerasberry (talk) 16:53, 21 April 2025 (UTC)
- I have absolutely no idea where you got the bizarre idea that data in infoboxes doesn't need to comply with WP:RS, but it is complete and utter nonsense. I suggest that rather than wasting your time posting TLDR misinformation, you take the time to familiarise yourself with core policy. AndyTheGrump (talk) 20:14, 21 April 2025 (UTC)