Wikidata:Requests for comment/Allow for Wikidata items to be created that only link to a single Wikimedia Commons category (Wikidata notability discussion)
An editor has requested the community to provide input on "Allow for Wikidata items to be created that only link to a single Wikimedia Commons category (Wikidata notability discussion)" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- no consensus to lower notability standards to allow a Wikidata item for any Commonswiki category --Pasleim (talk) 15:52, 11 May 2021 (UTC)[reply]
For a while the notability standards of Wikidata have excluded items which currently only have a Wikimedia Commons (hereafter "Commonswiki") category but not any other form of page on any other Wikimedia website. As Wikimedia Commons is now busy with the rollout of the Structured Data on Wikimedia Commons programme, which will be heavily reliant on Wikidata, I propose that the notability standards be lowered to allow a Wikidata item for each and every Commonswiki category. This will help the interconnectivity between Wikidata and Commonswiki. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:12, 12 February 2019 (UTC)
Note for !voters: The statement about notability standards is incorrect. Wikidata has three options for an item to be notable: "at least one valid sitelink" or "clearly identifiable conceptual or material entity" or "some structural need". Items that only fall under the category "at least one valid sitelink" (so not under "clearly identifiable conceptual or material entity" or "some structural need") have the Commonswiki exception. Multichill (talk) 22:25, 12 February 2019 (UTC)[reply]
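Read as a rule, the note above could be sketched roughly as follows. This is only an illustration: the `Item` class, its field names and the `commonswiki` site code are assumptions made for the example, not part of any Wikidata software, and the actual policy places further conditions on what counts as a valid sitelink that are ignored here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Item:
    # Hypothetical model: each sitelink is a (site, is_category_page) pair,
    # e.g. ("commonswiki", True) for a Commons category.
    sitelinks: List[Tuple[str, bool]] = field(default_factory=list)
    # Criterion 2: clearly identifiable conceptual or material entity.
    identifiable_with_sources: bool = False
    # Criterion 3: the item fulfils some structural need.
    structural_need: bool = False


def has_qualifying_sitelink(item: Item) -> bool:
    """Criterion 1, simplified: a Commons category sitelink does not count on its own."""
    return any(
        not (site == "commonswiki" and is_category)
        for site, is_category in item.sitelinks
    )


def is_notable(item: Item) -> bool:
    """An item is notable if it meets at least one of the three criteria."""
    return (
        has_qualifying_sitelink(item)
        or item.identifiable_with_sources
        or item.structural_need
    )
```

Under this reading, an item whose only sitelink is a Commons category is not notable through the first criterion, but it remains notable if either of the other two criteria applies.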
Votes
Support
- -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:12, 12 February 2019 (UTC), as proposer.[reply]
- GPSLeo (talk) 19:41, 12 February 2019 (UTC) There are many categories whose topic should have a Wikidata item. But we should not create items for the categories themselves when we want to replace them with structured data.
- Mike Peel (talk) This is overdue, and brings the standards in line with other projects (e.g., every Wikipedia category has an entry).
- Absolutely, per nom, GPSLeo, and Mike Peel. — Jeff G. ツ please ping or talk to me 20:54, 12 February 2019 (UTC)[reply]
- I didn't have an opinion, so I just checked my own creations in Wikidata. One of them: Q43965477. Some Wikidata items might be a trigger for a Wikipedia article. Or might help to find a date of birth. Vysotsky (talk) 23:18, 12 February 2019 (UTC)[reply]
- Jheald (talk) 00:43, 13 February 2019 (UTC) With SDC if an item X doesn't exist, so that you can't write eg "depicts X" on a file, then the file is not going to be discoverable by an SDC search based on the properties of X. If you want the files to be discoverable, then you have to have the items. Secondly, I think there is a huge amount of information locked up in the parent/child relationships in the Commons category structure that we haven't tapped, which (eg for places and many other classes of things) is often far more detailed and better organised than on other wikis. Systematically creating items for categories with statements to describe them would unlock this, and open up these relationships to full-scale analysis -- and the systematic description would also make the category tree far easier to navigate and apply (eg how to find the right categories for a file, and find out what they have actually been called). I think the worries about promotional activity expressed below are over-done -- does it really matter? At least the company or individual would be providing free images of themselves. The worries about BLP privacy and/or denigration are more concerning; but is this not something that could pretty much equally happen already? Are we that good at catching it even as things stand? Particularly when the inclusion criteria are already very low in so many areas -- eg authorship of any scientific paper, inclusion in pretty much any database. I think what is needed here is more rigorous referencing requirements for particular statements on BLPs, rather than blocking Commons items tout court and crippling SDC. Jheald (talk) 00:43, 13 February 2019 (UTC)[reply]
- (Though such items arguably already meet the "structural need" qualification.) Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:59, 13 February 2019 (UTC)[reply]
- Also these entries are needed for Structured Data on Commons to work properly. Regards, Yann (talk) 16:27, 4 May 2019 (UTC)[reply]
- I don't see anything wrong with doing this. --Tinker Bell ★ ♥ 04:09, 1 December 2019 (UTC)
- If these are category items (as opposed to topic items) then no sources are required and it's the same as with any other category item. If these are topic items then this restriction doesn't prevent them, though #2 and #3 would apply. The only statements category items generally have apart from category's main topic (P301) are category contains (P4224) and Commons category (P373) (up for deletion). If there's a problem with having the category, it's no more difficult to get it deleted on Commons than on any other project. Lucywood (talk) 15:24, 7 May 2020 (UTC)
- NMaia (talk) 00:44, 27 August 2020 (UTC)[reply]
- --Oursana (talk) 16:23, 20 January 2021 (UTC)[reply]
Oppose
- Sjoerd de Bruin (talk) 19:35, 12 February 2019 (UTC) Loophole for self-promotion and other promotional things.[reply]
- MisterSynergy (talk) 19:56, 12 February 2019 (UTC); I support the SDC project and generally welcome more Commons content here, but I cannot support the wording of the proposal as it lowers the notability threshold to basically zero requirements. The current notability criteria are designed in a way that we have (very likely) external sources available for the vast majority of items within a few clicks, either via identifiers, statement references, or Wikipedia articles. The problem with Commons categories is that they are in almost all cases unsourced, as categories are auxiliary content at Commons (media files are core content of that project). It would have several alarming implications if this proposal was accepted:[reply]
- Right now, Wikimedians typically do not “measure” notability, which is ensured by the external sources one has to provide when adding content here (except at Commons as a media file repository, and few very small Wikimedia projects). Things become “notable” because reputable third-parties decide to cover them in their works. This is why we can reject a lot of purely promotional content here which exploits our reputation as a non-commercial, fairly independent organization.
- Content that is not properly identified against external sources is very difficult to verify (which makes our project vulnerable to hoaxes, doxing, defamatory content, unacceptable privacy violations and other forms of abuse). In fact, in many cases the lack of external references makes it practically impossible for most editors to work with such items.
- If we lower the notability criteria to zero requirements, we might face scaling issues in terms of community workforce and technical equipment.
- According to the notability criteria, if an item is a clearly identifiable entity that can be described using sources, then it can have an item on Wikidata. In my opinion all categories from Commons deserve an item if and only if they fulfill that criterion. Otherwise "everything goes", there would not be standards at all, because in Commons nobody checks if a category is notable enough.--Micru (talk) 20:52, 12 February 2019 (UTC)
- Trying to make sure I understand: does that "only if" mean a higher standard for Commons notability than we have for other wikis, since it would not allow items to be introduced for structural reasons? This is likely to occur for some categories quite high in the hierarchy, most obviously metacategories. - Jmabel (talk) 21:40, 12 February 2019 (UTC)[reply]
- No, it does not mean a higher standard, it means some standard. Other wikis have some standard, and we respect that.--Micru (talk) 15:22, 14 February 2019 (UTC)[reply]
- Per Sjoerd de Bruin --A.Savin (talk) 21:17, 12 February 2019 (UTC)[reply]
- Per Sjoerd. Trying to solve a non-existent problem and creating a notability loophole in the process. Reading the original proposal I realise it doesn't correctly describe the current situation. The sentence "For a while the notability standards of Wikidata have excluded items which currently only have a Wikimedia Commons (hereafter "Commonswiki") category but not any other form of page on any other Wikimedia website." is incorrect. As you can read on Wikidata:Notability we have three options for an item to be notable: "at least one valid sitelink" or "clearly identifiable conceptual or material entity" or "some structural need". Items that fall in the second category ("clearly identifiable conceptual or material entity") do not suddenly become not notable when someone adds a sitelink to Commons. Multichill (talk) 22:18, 12 February 2019 (UTC)
- Per Multichill, the first criterion is not the only one. Also the rollout of SDC will gradually make all categories on Commons obsolete.--Jklamo (talk) 09:32, 13 February 2019 (UTC)
- I don't think it is the best way to go. When an item is needed, other already existing criteria like "structural need" seem to apply. Some categories should probably not get a Wikidata item. That includes maintenance categories, but also "composite" categories like Commons:Category:Paintings Madonna and Child in Russia that would actually be better expressed as several separate statements (best solution may be a semi-supervised migration process so that we can move that info to Wikidata and get rid of the categories on Commons at the same time).--Zolo (talk) 14:49, 13 February 2019 (UTC)
- Categories like c:Category:Nude or partially nude women facing right and looking left should never ever get a Wikidata item and the aim of Structured data on Commons is precisely to get rid of them by replacing them by individual statements. Categories that deserve an item are already covered by other notability criteria. --Nono314 (talk) 07:25, 14 February 2019 (UTC)[reply]
- Categories can't be sourced. Snipre (talk) 13:10, 14 February 2019 (UTC)[reply]
- Per Nono314. I am really looking forward to getting rid of all those combined subcategories. --Hannolans (talk) 19:25, 15 February 2019 (UTC)
- As demonstrated above, creating Wikidata entries for Commons' intersection categories would be wrong on several levels. Conversely, none of the supporters has demonstrated why this change in policy would actually be needed. Is there any example of a category that should have a Wikidata item, but cannot have it because of the current policy? Wikidata:Notability point 2 and 3 should cover pretty much anything that is notable but doesn't have a Wikipedia article yet. I'm not opposed to allowing this for some sensible subset of Commons categories, though. --El Grafo (talk) 11:06, 18 February 2019 (UTC)
- Intersection categories can be created for anything, specific or vague. The role of structured data is to get rid of the need for them. -Geraki (talk) 15:49, 18 February 2019 (UTC)[reply]
- Per Sjoerd, MisterSynergy and Multichill. --Sannita - not just another it.wiki sysop 22:02, 18 February 2019 (UTC)[reply]
- I thought the whole point of Structured Data on Commons was to allow us to finally move away from categories. Being able to automatically generate cool infoboxes for categories is a waste of time and resources, IMO. The sooner categories die, the better. Also, this change isn't even needed per Multichill. Kaldari (talk) 06:02, 3 March 2019 (UTC)[reply]
- I just can't reconcile allowing Wikidata entries for every Commons category with my concept of 'notability'. I personally have four Commons categories and I'm the most unremarkable, non-notable person you could meet. We really need to have some notability criteria more stringent than "topic has a Commons category". --RexxS (talk) 00:13, 10 March 2019 (UTC)[reply]
Neutral
- I agree with the general direction here, but believe that there are some Commons categories that don't deserve a wikidata item. Some examples:
- Probably most intersection categories unless we have a specific structural need for them. For example, if we already have a place and a date, we don't need a separate item for the conjunction of place & date.
- Maintenance categories (e.g. commons:Category:Images from the Alaska-Yukon-Pacific Exposition Collection to check) and personal categories (e.g. commons:Category:Images from DoD uploaded by Fæ)
- More open to question: people of only marginal notability for whom we have a category only because we happen to have multiple pictures: e.g. someone whose only reason they are on Commons as a subject at all is their attendance at a notable conference.
- Perhaps similarly to Jmabel, I don't think an isolated Commons category (ie. without links to other projects) should automatically be excluded from Wikidata, but I don't think it should be the basis for inclusion in Wikidata either. Plenty of very notable subjects only have Commons categories, most notably the vast majority of cultural heritage monuments promoted by Wiki Loves Monuments, and many many pieces of art. On the other hand, if Commons has any requirements of notability before a category can be created, they are extremely limited. There are many categories on Commons for subjects that could never justify a wikipedia stub or category (Commons:Category:Purple-colored people and Commons:Category:Children wearing glasses come to mind). That doesn't mean they shouldn't exist on Commons (where they help greatly with organization) but I don't see the point in adding them to Wikidata. - Themightyquill (talk) 12:18, 13 February 2019 (UTC)[reply]
- @Themightyquill: How is one to be able to document what a category like "Children wearing glasses" means, in terms of Wikidata items and Wikidata statements, to be machine-interpretable and machine-queryable, if it does not have a wikidata item? Jheald (talk) 13:01, 13 February 2019 (UTC)[reply]
- @Jheald: Good question. I don't suppose it would be possible - so if that's the agreed purpose, wikidata items would have to exist for every commons category. - Themightyquill (talk) 13:09, 13 February 2019 (UTC)[reply]
- +1 from the IPs, some folks on Commons care about their categories, it's not only auxiliary content, but they are also years behind in their Category for Discussion procedures. Many categories such as c:Category:Nude or partially nude people on benches make no sense outside of Commons (if at all), let alone tons of c:Category:3675 (number). –84.46.53.230 17:19, 13 February 2019 (UTC)[reply]
- I don't think a Wikidata item should be created for every category on Commons. In many cases, Commons categories can be linked to a Wikidata main item. I don't think creating category items for things like intersection categories that don't exist in other projects is useful (the intersection could be expressed with a template on the Commons category without bothering with a Wikidata item). I think this is also the case for category items created for other projects: the discrimination against Commons should be replaced with discrimination against useless category items. However, there's still a problem with Commons categories that can't be linked to main items because a gallery is already sitelinked. It should be possible to create category items in those cases. Ghouston (talk) 01:49, 14 February 2019 (UTC)
Discussion
The main rationale behind this is that Wikimedia Commons shall soon work with a new type of software called "depicts", which allows users to name every subject in a photograph; these file depicts will be linked to Wikidata items. According to Pigsonthewing this is already an unwritten standard and we had this discussion related to it. In fact this proposal can greatly improve how Structured Data on Wikimedia Commons works for the benefit of both Wikidatans and Wikimedian Commonists. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:12, 12 February 2019 (UTC)
Note that the Structured Data on Wikimedia Commons is the structural need for the inclusion of these items, as Wikidata exists as the central repository of all 900+ (nine-hundred plus) Wikimedia websites, excluding the structure of the largest Wikimedia website just seems like an odd move, I can understand the historical need for the exclusion of these pages, but today there is very much a great need for their inclusion. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:30, 12 February 2019 (UTC)[reply]
- @Donald Trung: I'm not sufficiently familiar with Wikidata to say anything too sensible right now. But two questions come to mind. First, aren't categories supposed to become obsolete at some point? Second, how would Wikidata use a category like c:Category:2015 fires in France or c:Category:Politicians of Belgium by year? User talk:Alexis Jazz please ping me if you reply 19:33, 12 February 2019 (UTC)
- Alexis Jazz First of all, categories are supposed to become obsolete because of the file depicts which are coming this month or the next (don't quote me on that!), but file depicts work in a way that they can basically only link to a Wikidata Q-item page. I noted at the discussion page of the structured data team (Commons:Commons talk:Structured data) how some very common organisations such as Bristol (shop) and Ziengs Schoenen didn't have any Wikipedia articles dedicated to them, but they are both very common shops in the Netherlands and would probably be depicted quite often in photographs. If these subjects have a structural need which categories can provide but file depicts can't, then there is no reason for someone to actually use the new method of organisation over the old one. As for the second question, I'm aware that a hierarchical structure also exists in Wikidata, so the new items created from them could probably also be mentioned as "sub-items" in the same way that a Human is a sub-item of Mammal, which is an Animal, etc. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:47, 12 February 2019 (UTC)
- @Donald Trung: but Bristol (shop) could have easily had a Wikidata item without a Wikipedia article. Wikidata guidelines don't require a Wikipedia article. Alexis Jazz please ping me if you reply 20:10, 12 February 2019 (UTC)[reply]
- Alexis Jazz but the current Wikidata notability standards read "It contains at least one valid sitelink to a page on Wikipedia, Wikivoyage, Wikisource, Wikiquote, Wikinews, Wikibooks, Wikidata, Wikispecies, Wikiversity, or Wikimedia Commons." followed by "In addition, sitelinks on category items to category pages on Wikimedia Commons are allowed if and only if they are linked with category pages on other Wikimedia sites.[8]", the practice of creating Wikidata items such as Ziengs isn't allowed according to that page, in practice users do create such pages. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 20:15, 12 February 2019 (UTC)[reply]
- For what it's worth Ziengs (Q60887513) isn't a category item -- it's a main-type item -- so strictly speaking (at least by the letter of the policy, as you currently quote it) that second clause doesn't apply. Nevertheless, moves towards routine creation of such items do seem to have made some of the community somewhat unhappy. Jheald (talk) 00:01, 13 February 2019 (UTC)
- @Donald Trung: The notability standards read "it meets at least one of the criteria below". The first is to contain a valid sitelink, so forget about that. The third is "It fulfills some structural need" which usually doesn't apply. (I sometimes create items for non-notable people so I can list them as husband/wife on the item of someone who is notable.) But the second criterion is what I was referring to: "It refers to an instance of a clearly identifiable conceptual or material entity. The entity must be notable, in the sense that it can be described using serious and publicly available references". This easily applies to Bristol (shop). Alexis Jazz please ping me if you reply 01:33, 13 February 2019 (UTC)
@Sjoerddebruin: regarding self-promotion and other non-notable topics, yes I do have to admit that Wikimedia Commons is a breeding ground for self-promotion and "spam". However, one can also make the argument that one can discover an act of self-promotion made on Wikimedia Commons through Wikidata, as it would place more eyes on the subject. I remember Jheald in the project chat addressing this issue last month in the conversation above this one (Wikidata crashes when I use "desktop mode" so I can't link to the specific version of the page). The benefits of the structural need outweigh the inclusion of trivial topics such as a category dedicated to a small charity event in Bumfarm, Ohio. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:53, 12 February 2019 (UTC)
MisterSynergy Regarding the comment "Right now, Wikimedians typically do not “measure” notability, which is ensured by the external sources one has to provide when adding content here (except at Commons as a media file repository, and few very small Wikimedia projects). Things become “notable” because reputable third-parties decide to cover them in their works. This is why we can reject a lot of purely promotional content here which exploits our reputation as a non-commercial, fairly independent organization.", while the notability standards would essentially be lowered to zero (0), content on Wikimedia Commons would first have to exist in order for it to have a category; however, the core reason Wikidata exists is to help structure other Wikimedia websites. And regarding "Content that is not properly identified against external sources is very difficult to verify (which makes our project vulnerable to hoaxes, doxing, defamatory content, unacceptable privacy violations and other forms of abuse). In fact, in many cases the lack of external references makes it practically impossible for most editors to work with such items.": unverified content other than category names wouldn't be allowed. Typically random people don't have their own Wikimedia Commons categories (though they can, as the Nipponese Dog Calvero even has his own Wikimedia Commons category), but hoaxes are quickly nominated for deletion as they fall outside of the scope of Wikimedia Commons unless the reporting on the hoax is notable (no different than that of a Wikipedia article), and for doxing there is oversight. But could you please name a scenario where such abuses could occur? As this request only concerns the creation of items based on existing Wikimedia Commons categories, otherwise unsourced material isn't welcome on Wikidata, right? -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 20:10, 12 February 2019 (UTC)
- Sure, we do want to support Wikimedia projects with data, but we also have our well-grounded principles which we cannot just give up (external identification is key!). The potential damage of this proposed policy change is far too serious in my opinion, and we should insist on the existence of serious external sources. Abuse scenarios that I see are:
- I do not like some other person, so I upload content about them to Wikimedia Commons and create a category there which does not appear too suspicious (categorize e.g. by name, gender, nationality, and occupation). I do not need to provide sources, so nobody can verify the details anyways. As this would be a ticket to have a Wikidata item due to the existence of the Commonscat, I could create this item and add defamatory content (also unsourced) in it. Chances are very high that nobody would notice.
- (technically very similar) I deliberately want to have a Wikidata item about myself or my small business, so I upload some content at Commons (and a category) which warrants a Wikidata item. In the Wikidata item, I could perfectly promote myself.
- There are lots of people who eagerly look for new paths which get them into Wikidata, and I am afraid they would not need much time to figure out what to do. As an admin who is particularly active in the deletion business here at Wikidata I unfortunately also have to deal with more hoax content than one would naïvely expect at Wikidata.
- Please also mind that there is no individual verification of item notability being conducted. Right now, as soon as items formally meet the notability criteria, very likely nobody would complain about them. In other words: if problematic content is not going to be identified on Commons side, it will very likely go unseen into Wikidata. Unfortunately, Commons has severe problems with identification of problematic content as well (they care a lot about licensing issues, but that’s it basically). —MisterSynergy (talk) 20:50, 12 February 2019 (UTC)[reply]
- A simple solution, of course only to be reached by good consultation, would be to have special reviewers on Commons decide whether a solitary Category (i.e. not linked to an existing WD-item) is eligible for Wikidata, either by request or default. At the same time, Wikidata could 'lock' all WD-items created on the basis of only such a Commons Category to prevent any addition of [visible and searchable] data, unless that data is a wikilink (at which moment the item should automatically be 'unlocked'). That way Wikidata can safely be the hub that Commons needs at the moment, because potential misuse would be, to say the least, minimized.
- That said, I would like to point out that Wikipedia categories are, strictly speaking, also not based on external sources: categorization is in the hands of Wikipedia users and, while not a controversial activity in itself, not infrequently a subject of debate. It is also not impossible to create fake, hoax or promotional categories on Wikipedia. But this all is rather irrelevant compared to a related issue, being the fact that Wikipedia Categories (can) have their own Wikidata item. It should be of the greatest importance that Wikidata centralizes all information on a subject.
- To me it is incomprehensible that, for instance, en:Psychology cannot be linked to c:Category:Psychology, only because Q1983760 dictates the Commons category to en:Category:Psychology and Q9418 only allows the Wikipedia article to be linked to c:Psychology. This is nothing but a fundamental mistake in the Wikidata architecture, which in my opinion should be solved as soon as possible: Wikidata should connect data, not split up data. (Unnecessary to add that the Commons wiki pages are not externally sourced either, and one wonders why those pages actually even exist: is it some sort of relic from hazy days?)
- So: the solution might be simple, the problem lies elsewhere. Jürgen Eissink (talk) 23:31, 12 February 2019 (UTC).[reply]
- I do not understand why there is such a fear of self promoting Content. Commons has project scope too and self promoting images get deleted every day. --GPSLeo (talk) 12:47, 13 February 2019 (UTC)[reply]
- I don't think that the costs outweigh the benefits; "the Spampocalypse" of having Wikimedia Commons categories would probably be something that the people over at Wikimedia Commons itself could handle. The benefits would be having the structural needs met of the largest overhaul to the way content is both organised and searchable on Wikimedia Commons since the project launched. People should ask themselves what is more important: structuring every file on Wikimedia Commons, or keeping out some information about a few businesses and/or museums and/or charities? Because the whole scope of Wikidata is providing a basis for the structural needs of other Wikimedia websites, and excluding the largest of these doesn't seem like a sensible thing to do. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 17:56, 15 February 2019 (UTC)
- To me this discussion feels like it's about potential needs of Commonswiki without anybody who identifies themselves as an experienced Commonswiki editor expressing a need. It seems to me that "2015 fires in France" doesn't seem like a good target for "depicts". Targets for "depicts" should be clearly identifiable material or conceptual entities and commons categories are not required for the notability of those. ChristianKl ❪✉❫ 12:08, 14 March 2019 (UTC)
- @ChristianKl: There are two issues here. Firstly how readily should "clearly identifiable material or conceptual entities" get items here. Because without the items they can't be the targets for "depicts"; but on the other hand some worry about non-notability. Also, as a practical issue, "clearly identifiable entities" are much easier to comprehensively identify than "notable clearly identifiable entities"; so in practical terms, imposing a notability threshold makes it much more difficult (or even impossible) to do the kind of bot work that Mike Peel had started on to try to ensure that all categories representing notable clearly identifiable entities have items.
- The second issue is compound categories. Here the issue is not so much to have items to be targets for properties, but rather to have items to have somewhere to be able to record and easily retrieve what the categories mean, in terms of Q-items. This is something that really matters, for two reasons: (i) because in many cases the categorisation is the best metadata we have for images, so if we want to translate that information into structured data statements, we need to know and record what those categories mean; (ii) because given structured data statements, knowing what the categories mean would be a lot of help towards being more able to auto-categorise images. So for those reasons, we need somewhere to be able to record what compound categories mean, in terms of properties and Q-numbers. Jheald (talk) 23:57, 14 March 2019 (UTC)
- Comment Probably phab:T54971 related. --Liuxinyu970226 (talk) 01:56, 14 May 2019 (UTC)[reply]
Excludes[edit]
I think we should exclude some category types of the general notability:
- All hidden categories
- All categories by date/year like commons:Category:2018_in_rail_transport_in_Berlin or commons:Category:Events in Berlin in the 2010s
- All meta-categories
- All categories by region like commons:Category:Grus grus in Germany
--GPSLeo (talk) 20:12, 12 February 2019 (UTC)[reply]
- @GPSLeo, Jmabel: On the subject of compound categories, like the ones you have both raised, IMO these categories are some of those where it would be particularly helpful to be able to describe what they represent in a structured way, i.e. using statements. This would (i) allow that information to be presented in a multilingual way via a Wikidata infobox on the category, also giving links back to the parent concepts; (ii) allow membership of the category to be interpreted, to allow SDC statements to be proposed for files within it; (iii) in the other direction help make auto-categorisation more possible, to allow files to more consistently be placed in such categories.
- This is the power released by making it possible to describe such categories using statements.
- But such statements need somewhere to live, to be readily editable and queryable --> so somehow it would be useful for there to be an item for them somewhere, and wikidata looks to be the only game in town. (Added to which, it would be crazy to store structured specifications for some categories in one place, but other categories not in the same place). Jheald (talk) 23:05, 12 February 2019 (UTC)[reply]
- I think with structured data there is no need for those categories. The categories can be replaced by queries. If I want to get any photo with rail transport in Berlin as topic and taken in 2018, I just have to write a query. I can also search for every photo taken in June 2007 of an event in India, or for every photo of a Grus grus in Berlin-Mitte. All of these could have a category on Commons now, but they do not have to. --GPSLeo (talk) 09:13, 13 February 2019 (UTC)[reply]
- Potentially yes, though I'm not sure whether this is part of what belongs in Wikidata vs. in a Wikibase that is part of Commons. Along those lines, please see commons:Commons talk:Structured_data/Archive_2018#Categories, properties, & directed acyclic graphs that I started last year; I'm afraid it got promptly derailed into a discussion and eventually a near-argument with someone who seemed mostly confused, but I still think my original proposal merited some discussion. - Jmabel (talk) 23:34, 12 February 2019 (UTC)[reply]
- @Jmabel: Current wikibase development presumes that items of any particular type -- eg Q-items, L-items, M-items -- will each only be drawn from a single source, ie that M-items will exclusively live on the Commons wikibase, Q-items will exclusively live on Wikidata. Jheald (talk) 00:06, 13 February 2019 (UTC) Will respond tomorrow re your interesting 2018 post; though, at first take, some of its suppositions may be questionable. Jheald (talk) 00:09, 13 February 2019 (UTC)[reply]
- Then, yes, if we are to do anything like what I proposed we'd need these items on Commons. - Jmabel (talk) 00:21, 13 February 2019 (UTC)[reply]
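To make the two views in this thread concrete, here is a minimal Python sketch, under stated assumptions, of a compound category described as a set of statements and then evaluated as a query over files. The property names (`depicts`, `country`), the sample file records, and the helper `files_matching` are all hypothetical placeholders for illustration; they are not real Wikidata property IDs, real SDC data, or an existing API.

```python
# Hypothetical sketch: a compound category expressed as property/value
# constraints, then evaluated as a query over per-file statements.
# All names and data below are illustrative placeholders.

# What the category "Grus grus in Germany" means, written as statements.
category_meaning = {"depicts": "Grus grus", "country": "Germany"}

# Per-file structured statements (made-up sample data).
files = {
    "Crane_at_lake.jpg": {"depicts": "Grus grus", "country": "Germany"},
    "Crane_in_flight.jpg": {"depicts": "Grus grus", "country": "Sweden"},
    "Berlin_tram.jpg": {"depicts": "tram", "country": "Germany"},
}


def files_matching(constraints, file_statements):
    """Return the sorted names of files whose statements satisfy every constraint."""
    return sorted(
        name
        for name, statements in file_statements.items()
        if all(statements.get(prop) == value for prop, value in constraints.items())
    )


matching = files_matching(category_meaning, files)
print(matching)
```

Read in one direction, such a mapping behaves like a stored query that could stand in for the category; read in the other, it could be used to propose statements for files already in the category. Where the mapping itself should be stored is the open question in the thread above.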
Timeliness ?[edit]
I pinged User:SandraF (WMF) over at the SDC talk page on Commons (discussion) when the question of Wikidata items for people with Commonscats came up last month. Her whole response is worth reading, but a couple of things struck me as particularly worth noting:
- The decisionmaking around this topic is fully up to the community... As a staff member, I want to make a point of not wanting to impose an opinion on this at all.
- [This is] up to the community to reach consensus about! Deployment is around the corner, so the community can try this quite soon. Seeing the technology in front of one's eyes will certainly clarify things and cause more people to have strong opinions about this.
This seems to be somewhat the SDC team's modus operandi -- not to map things out too tightly ahead of time (nor worry too much about possible issues ahead of time), but just to concentrate on getting some functionality to a point where it can be released, then see how the community runs with it.
But she may have a point, that the time when the benefits of having items (or limitations of not having them) will come most sharply into focus may not be for a few months yet, ie when arbitrary statements and more sophisticated searching of Commons files becomes possible.
Nevertheless, it may be worth collating some reasons as to why it would be particularly useful to be able to make Wikidata items now, ahead of that time. I'll start with a few that come to my mind, but anybody else feel free to jump in and add to the list below (or challenge any of the items on it) Jheald (talk) 23:54, 12 February 2019 (UTC)[reply]
- Issues that make this relevant now
- Huge backlog of categories still to work through: only 2 million out of 7.2 million currently have infoboxes
- Matching from existing Wikidata items to Commons categories has gone about as far as it can
- Systematic approach makes it possible to whittle down eg all categories for individuals that don't have Wikidata items -- there may be some quite important people there that ought to have items on any basis, that Wikidata really ought to have as soon as possible, but with the current piecemeal approach they might never be found. Similarly for places, important buildings, etc.
- The more comprehensively we can have items for categories in place before SDC goes live, the more it will be possible to take a systematic approach rather than a piecemeal approach to creating SDC statements when it does go live
- ... more ?
Conflicts of interest[edit]
Anyone opposing this proposal (or opposing a liberal approach to creating items for subjects with a Commons category in general); and who is the subject of a Commons category, yet opposes there being an item about themselves, has a clear conflict of interest, and should declare that in their !vote. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:06, 13 February 2019 (UTC)[reply]
CC0[edit]
It turns out 'captions', and probably future 'depicts' as well, are now suddenly considered to be added with CC0 license. This might, if not: should, be reason for many to reject the functionality of depicts. See this discussion at Commons' Village Pump. Jürgen Eissink (talk) 22:06, 13 February 2019 (UTC).[reply]
- Jürgen Eissink Structured Data on Wikimedia Commons (SDC) features were always meant to be licensed Creative Commons 0 (Zero), this has been planned years in advance in order to make it more compatible with Wikidata. Why should depicts be rejected on Wikimedia Commons because of that anyways? It doesn't change the license of any photograph or other medium and every edit made on Wikidata is also licensed CC-0 so this should be more motivation to adopt a structure to better co-ordinate between the two (2) projects, not less. And naturally the CC-0 license applies to the requested changes here as everything here is CC-0. -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 20:11, 15 February 2019 (UTC)[reply]
- CC0 has not been chosen for SDC "to make it more compatible with Wikidata", of course. The statements to be deployed ("depicts" and so on) are factual data which, according to the prevailing view, are not copyright-able at all. Thus one cannot claim any rights for such data by adding a CC-by-sa licence. For captions, however, one could argue that they pass the threshold of originality (in some countries, probably not in all), which would technically allow them to become CC-by-sa licenced. Proper re-use would be utterly complicated then, so I think CC0 is by far the better option for captions compared to CC-by-sa. —MisterSynergy (talk) 20:45, 15 February 2019 (UTC)[reply]
- Summary, CC0 is a good thing, PoC vintage 2006: All mw: help pages (only "stealing" good m: help pages made me angry), and this very dry RfC page needs a nice image with a 100% bogus CC‑BY‑SA‑4.0 license instead of PD ineligible. –84.46.53.251 12:32, 17 February 2019 (UTC)[reply]
Random sampling[edit]
It's always good to look at some data when making a decision. So I've updated commons:User:Mike Peel/Commons categories without Wikidata sitelink with 100 commons categories that aren't yet linked to from Wikidata. I can re-run that to have a bigger sample if anyone wants me to. You can draw your own judgement from the sample, but if anyone can add sitelinks from existing Wikidata items for 10 of these then I'll give them a barnstar. Thanks. Mike Peel (talk) 23:36, 20 February 2019 (UTC)[reply]
Batch 1[edit]
- Matched: conchology (Q862072) -> c:Category:Conchology; but displaced c:Category:Illustrations of Mollusca Jheald (talk) 00:09, 21 February 2019 (UTC)[reply]
- c:Category:Diocese of Fossano is a cat redirect. Redirect target c:Category:Roman Catholic Diocese of Fossano is sitelinked to Roman Catholic Diocese of Fossano (Q868918). Jheald (talk) 00:15, 21 February 2019 (UTC)[reply]
- c:Category:Drumborg, Victoria is a cat redirect. Redirect target c:Category:Drumborg is sitelinked to Drumborg (Q30881983). Jheald (talk) 00:23, 21 February 2019 (UTC)[reply]
- Maximiliano Bustos (Q3302735) -> c:Category:Maximiliano Bustos (already had P373). Jheald (talk) 00:28, 21 February 2019 (UTC)[reply]
@Jheald: Category redirects don't count, I'm re-running the sample to exclude those. But you get 2 points so far. Thanks. Mike Peel (talk) 00:31, 21 February 2019 (UTC)[reply]
- George Hawley Hallowell (Q22002253) -> c:Category:George Hawley Hallowell Jheald (talk) 00:38, 21 February 2019 (UTC)[reply]
Batch 2[edit]
Batch 3[edit]
@Mike Peel: To see what sort of 'simple' categories currently don't have sitelinks, it might be worth getting a batch of categories excluding the words "of", "by", and "in" as these tend to be compound categories. Jheald (talk) 00:41, 21 February 2019 (UTC)[reply]
- OK, done: [3] Thanks. Mike Peel (talk) 00:54, 21 February 2019 (UTC)[reply]
- Lethacotyle (Q15256617) -> c:Category:Lethacotyle
- puerperal infection (Q1419347) -> c:Category:Postpartum infections
- Haverwerf (Q2787283) -> c:Category:Haverwerf (Mechelen)
- Henriette Widerberg (Q4990467) -> c:Category:Henriette Widerberg
- Ramón Escovar Salom (Q6099588) -> c:Category:Ramón Escovar Salom (had P373)
- Andris Siliņš (Q54321872) -> c:Category:Andris Siliņš
- Stentor muelleri (Q19605763) -> c:Category:Stentor muelleri
- Carbamoyl dehydratase HypE (Q24774790) -> c:Category:Hydrogenase expression or formation protein HypE (I think; but haven't added this. Images perhaps ought to be refined to category reflecting species & new category made for protein family. But this may be one for the GeneWiki crew.)
- Praça da República (Q10353435) -> c:Category:Praça da República (Elvas) (item formerly had no statements on it, just a sitelink to pt.wiki)
- Joseph Guiter (Q11928433) -> c:Category:Joseph Guiter
- Čechtická (Q45558862) -> c:Category:Čechtická (Prague)
- c:Category:Model Railway category redirect
- c:Category:Good Morning category redirect
- c:Category:Ghum railway station,Darjeeling category redirect
- c:Category:Ōmiya Park Soccer Stadium category redirect
- c:Category:Bohinj_tunnel category redirect
Batch 4[edit]
commons:User:Mike Peel/Commons categories without Wikidata sitelink 3 - now without redirects (previous runs had a bug in the code), again without 'of', 'by' or 'in' in the category names. Thanks. Mike Peel (talk) 13:38, 21 February 2019 (UTC)[reply]
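The filtering described for these batches (dropping category redirects, and dropping names that contain "of", "by" or "in") might look roughly like the sketch below. This is not the code actually used to generate the sample pages; the function name `is_simple_category`, the sample inputs, and the choice to match whole words rather than substrings are all assumptions made for illustration.

```python
import re

# Words that tend to mark compound categories, per the discussion above.
COMPOUND_WORDS = {"of", "by", "in"}


def is_simple_category(title, redirect_titles):
    """True if the title is not a redirect and contains none of the marker words."""
    if title in redirect_titles:
        return False
    # Drop the "Category:" prefix, then split on spaces and underscores.
    name = title.split(":", 1)[-1].lower()
    words = re.split(r"[\s_]+", name)
    return not COMPOUND_WORDS.intersection(words)


# Made-up inputs; a real run would read titles and redirects from Commons.
candidates = [
    "Category:Lethacotyle",
    "Category:Grus grus in Germany",
    "Category:Diocese of Fossano",
    "Category:Bohinj_tunnel",
]
redirects = {"Category:Bohinj_tunnel"}

simple = [t for t in candidates if is_simple_category(t, redirects)]
print(simple)
```

Whole-word matching avoids discarding names such as "Berlin" merely because they contain the letters "in"; a substring match would be stricter and would shrink the sample considerably.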
Criteria[edit]
For Structured Data on Commons to work properly, we need a Wikidata entry for
- Every person who has a non-hidden category (contributors' categories are usually hidden).
- Every work of art needs a category for depict statements.
- Every identified species needs a category for depict statements.
- Every public (city hall, post-office, etc.) or named building (churches, etc.). This excludes non-notable private buildings, except some special cases.
- Every named street or road.
Criteria for objects, vehicles, and natural places need to be defined. Regards, Yann (talk) 16:34, 4 May 2019 (UTC)[reply]
Previous discussions (for reference)[edit]
Could someone please link all the past (archived) discussions related to this subject here? -- Donald Trung/徵國單 (討論 🀄) (方孔錢 💴) 19:12, 12 February 2019 (UTC)[reply]
- Wikidata:Project chat/Archive/2019/01#Creating new items for humans based on Commons categories
- Wikidata:Project chat/Archive/2019/01#Wikidata:Notability and Wikimedia Commons