Wikidata:Project chat/Archive/2018/03

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Babel

Hi! Is there any obligation to declare language skills in userpages? I do not mean if that's recommendable or advisable. I mean mandatory. strakhov (talk) 13:04, 1 March 2018 (UTC)

No, there is no obligation to do it.

However, it is useful for talking with other users, and it allows some functions, scripts and gadgets to display your "preferred" languages (as many as there are) when you work: for example, easily accessing the labels of items in all the languages you work with... languages that would otherwise be hidden by default. :) --Hsarrazin (talk) 13:17, 1 March 2018 (UTC)

Thanks! :) strakhov (talk) 13:47, 1 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 11:30, 2 March 2018 (UTC)

Is quickstatements slow?

Artix Kreiger (talk) 16:01, 1 March 2018 (UTC)

Resolved Never mind. Artix Kreiger (talk) 17:53, 1 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 11:30, 2 March 2018 (UTC)

Zinnia elegans

Zinnia violacea (Q205087) and Zinnia elegans (Q15245321) - different species? Just a bad label? Botanists, advance! --Magnus Manske (talk) 09:59, 2 March 2018 (UTC)

WikiProject Taxonomy has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. --Liuxinyu970226 (talk) 11:32, 2 March 2018 (UTC)

As far as I can see, Zinnia violacea Cav. (and also Crassina elegans) is a synonym of Zinnia elegans, which is the correct name; please see species:Zinnia_elegans and the source: Germplasm Resources Information Network (GRIN) from the U.S. Department of Agriculture Agricultural Research Service. Accessed on 08-Apr-12.

Dan Koehl (talk) 11:45, 2 March 2018 (UTC)

I agree with Dan. It all seems pretty straightforward when reading the references. Zinnia elegans Jacq. (i.e. Q15245321) is the currently valid taxon name, while the other involved taxon names are synonyms by Cavanilles and Kuntze. –Tommy Kronkvist (talk), 12:03, 2 March 2018 (UTC).

I've made some adjustments. --Succu (talk) 15:59, 2 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 10:46, 3 March 2018 (UTC)

kulturnoe-nasledie.ru ID (P1483)

identifier for an item in a Russian cultural heritage register, Russian Cultural Heritage ID

is a dead URL.

http://nasledie-archive.ru/

from

50.254.21.213 04:31, 1 March 2018 (UTC)

It is not clear what you want but we have a full copy of the old website (possibly without some passports).--Ymblanter (talk) 06:40, 1 March 2018 (UTC)

The website of the property is gone. "The subject of a property should not be changed. If the archive hasn't just been renamed, but replaced by another, similar one, there should be a new property for it."

https://www.wikidata.org/wiki/Topic:U8h91drjod1hq4ms

"This site contains materials of the now closed portal kulturnoe-nasledie.ru (a saved copy of the main page in the Internet Archive) created by the FGUP GIVTS of the Ministry of Culture of Russia in 2008 and finally closed in March 2017.

"He was replaced by a new site http://mkrf.ru/ais-egrkn - The Unified State Register of Cultural Heritage Objects (monuments of history and culture) of the peoples of the Russian Federation, but"

i am just trying to help

50.254.21.213 13:44, 1 March 2018 (UTC)

I still do not understand what your point is.--Ymblanter (talk) 13:47, 1 March 2018 (UTC)

I thought the guy said you cannot just change the name and URL; the archive is gone.

50.254.21.213 13:53, 1 March 2018 (UTC)

Well, if the identifiers are still used, we cannot change anything, I guess. Maybe the URL should disappear since it is clearly not functional any more.--Ymblanter (talk) 13:56, 1 March 2018 (UTC)

I added the former formatter urls with deprecated rank. Maybe archive-urls could be added too.
--- Jura 14:16, 1 March 2018 (UTC)

http://opendata.mkrf.ru/opendata/7705851331-egrkn/#{"version":"5a9691e4cc8aed9a73674939"} 50.254.21.213 14:34, 1 March 2018 (UTC)

Just trying to help here, but if www.kulturnoe-nasledie.ru has been turned off, why do we keep the same label and "also known as"? And if the identifier is now wrong, doesn't that affect all Russian cultural heritage register pages (the property)? Shouldn't the identifier then be http://mkrf.ru/ais-egrkn , or http://opendata.mkrf.ru/ , or http://opendata.mkrf.ru/opendata/7705851331-egrkn/#

50.254.21.213 21:37, 1 March 2018 (UTC)

I think the best in this case is to ask for the creation of a new property for the new register. --Fralambert (talk) 13:42, 3 March 2018 (UTC)

Maybe it's worth mentioning that a property isn't deleted just because a website no longer exists. If the same identifier works at a slightly different website (or archive), a new formatter url can be added. Former formatter urls should use deprecated rank. archive URL (P1065) can point to an archive of a website. If there is a new, different website, it's better to create a new property for that, not change property values and relabel the properties.
--- Jura 13:58, 3 March 2018 (UTC)
https://ru.wikivoyage.org/wiki/Обсуждение:Культурное_наследие_России/Новый_реестр

50.254.21.213 03:46, 4 March 2018 (UTC)

Please use Wikidata:Форум to discuss this in Russian.
--- Jura 07:08, 4 March 2018 (UTC)

Jura, I do not understand: who fixes things, rather than just talking about them? 50.254.21.213 13:55, 4 March 2018 (UTC)

I'm not sure if anything actually needs to be done. What changes do you think are needed (now that you read the explanations given by others and myself)?
--- Jura 14:22, 4 March 2018 (UTC)

You delete and replace the identifier. 50.254.21.213 15:25, 4 March 2018 (UTC)

This section was archived on a request by: I don't think much can be added. A discussion in Russian might work better.
--- Jura 07:08, 4 March 2018 (UTC)

As someone responsible for the Russian part of the Monuments database, I agree that this discussion can be closed. If the anonymous user thinks that a new identifier is required, they can request a new Wikidata property. The current 10-digit identifier, P1483, is perfectly valid, linked to the existing database, and used throughout several Wikimedia projects. It should stay as it is. --Alexander (talk) 17:06, 4 March 2018 (UTC)

That was the point: there was no link, just a dead website. 50.254.21.213 18:20, 4 March 2018 (UTC)

Help with enabling wikidata support for a widely used enwiki template

Hey there. I've been adding wikidata support to en:Template:Infobox power station at snail's pace over the past years. And as I move further, the wikidata coding is becoming a little too complicated for me. Is anyone willing to work with me so as to help convert the template to support wikidata?

The template supports all types of power stations. So wind farm articles only transclude parameters pertaining to wind farms, whereas nuclear plant articles transclude parameters pertaining to nuclear power stations, and so on. Certain parameters are obviously easy to add (e.g. name, country). The issue comes up with more complicated parameters, such as nameplate capacity in megawatts and parameters that may need the creation of new Wikidata properties (e.g. nuclear plant cooling towers).

Looking forward to getting this done once and for all. :-) Reh man 15:15, 24 February 2018 (UTC)

@Mike Peel, RexxS: who are good at this. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:30, 24 February 2018 (UTC)

I would actually wait for the time being, since it is likely (or at least not unthinkable) that an RfC would result in a total prohibition of Wikidata in the English Wikipedia.--Ymblanter (talk) 20:09, 24 February 2018 (UTC)

I wouldn’t. If the data has quality good enough for Wikipedia, advance it to actual use. From the “short description” episode of the past months I wouldn’t say that there is a robust opposition against Wikidata use in English Wikipedia. Btw. WMDE is currently working on a solution that enables Wikidata editing directly from Wikipedia infoboxes in the Visual Editor, see File:Client editing prototype walkthrough.webm; this might be worth to consider when Wikidata is included in an infobox, although it probably needs a couple of months until this functionality arrives… —MisterSynergy (talk) 20:25, 24 February 2018 (UTC)

There are already a number of infoboxes in en.Wikipedia taking some, or all, of their data from Wikidata. Indeed, these seem to be most successful for technical subjects, such as the one under discussion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:58, 24 February 2018 (UTC)

Well, the outcome of the Wikidata citation RfC was near-unanimous, with most users not actually listening to any arguments. As long as the vandalism problem has not been solved, vandal Wikidata edits will be reflected in the template, and it will be pretty easy for the anti-Wikidata brigade to argue that all edits must be rolled back. This is what happened with the World Heritage infobox, and I cannot say that the concerns are completely unjustified.--Ymblanter (talk) 21:01, 24 February 2018 (UTC)

I'm happy to help with any technical issues that arise, but I'm currently not actively working on wikidata-enabled infoboxes on enwp until we have an RfC about them, given the WHS case and ongoing reverts by a couple of editors. Thanks. Mike Peel (talk) 20:13, 25 February 2018 (UTC)

Thank you for the replies everyone. Eagerly looking forward to that RFC. From the looks of it, it seems quite obvious that Wikidata integration will prevail as most of the issues are being resolved. The benefits far outweigh the issues in my opinion. Regardless, to ensure a smooth process, I will wait for the RFC to complete before making any significant changes. Yours truly, Reh man 05:56, 26 February 2018 (UTC)

While the RFC is being worked on, is someone willing to help me with "structuring" about one or two Wikidata items for each type of power station (e.g. nuclear, wind, coal)? Basically, these 'example' items should have all the applicable parameters of the infobox covered. We can start with the most powerful (i.e. most popular) of each type of power station, so that editors can look at how to fill similar items.

And while we do this, we should add opt-in Wikidata support for the newly created parameters (i.e. if no details are filled in on wiki, then the Wikidata info is fetched; if that is also missing, then the parameter is not displayed in the article). Parts of the template have already had this arrangement for some time now, so there is no issue with this arrangement at WP:Energy.
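The opt-in order described above (local value first, then Wikidata, otherwise hide the row) can be sketched roughly as follows. This is only an illustration in Python, not the template's actual code, which would live in wikitext or Lua; the function name `resolve_parameter` and the sample values are hypothetical.

```python
from typing import Optional


def resolve_parameter(local_value: Optional[str],
                      wikidata_value: Optional[str]) -> Optional[str]:
    """Pick the value an infobox row would show.

    Order of preference (as described in the post above):
    1. the value filled in locally on the wiki,
    2. the value fetched from Wikidata,
    3. nothing at all (None), meaning the row is not displayed.
    Blank or whitespace-only strings are treated as "not filled in";
    that treatment is an assumption of this sketch.
    """
    for candidate in (local_value, wikidata_value):
        if candidate is not None and candidate.strip():
            return candidate.strip()
    return None


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(resolve_parameter("500 MW", "480 MW"))
    print(resolve_parameter("", "480 MW"))
    print(resolve_parameter(None, None))
```

Keeping the local value first means an article can always override what Wikidata says, which matches the "opt-in" wording.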

I'd like to ping Mike, but I'm sure his hands are full with the RFC work, which at this point is much more important IMHO. Let me know if anyone's willing to work together on this. Thank you in advance! Reh man 03:11, 28 February 2018 (UTC)

@Rehman: You might want to start a WikiProject on power stations - there seem to be one or two related ones under Category:Science WikiProjects. ArthurPSmith (talk) 19:35, 28 February 2018 (UTC)

That's a good idea, ArthurPSmith. I've created one as a start, in order to support future projects. While work on building that further continues, I'm still looking for anyone willing to help with the above task :-) Reh man 09:51, 1 March 2018 (UTC)

Constraint reports will be integrated for all logged-in users

Hello all,

The Wikidata team continues working on supporting the community to improve data quality, by providing tools such as the checkConstraints gadget. Thanks to all the feedback you provided on this tool, we are ready to take it to the next step: integrate this feature in the default interface of Wikidata for all users with an account.

In order to make sure that our systems can handle the additional requests, we will enable the feature by groups of accounts, depending on the first letter of the usernames:

March 1st: usernames starting with Z
March 8th: Y, X, W
March 15th: V, U, T
March 22nd: S, R, Q, P, O, N
March 29th: M, L, K, J, I, H, G, F, E
April 5th: all other usernames (including non-ASCII characters)
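
For anyone wanting to check which batch an account falls into, the schedule above can be expressed as a small lookup. This is a rough sketch, not anything the Wikidata team published: the year 2018 is taken from the post's timestamp, the March 22nd group is taken to run from S down to N, and upper-casing the first character is an assumption of this sketch. The function name `rollout_date` is hypothetical.

```python
from datetime import date

# (deployment date, first letters of usernames), following the schedule above.
ROLLOUT = [
    (date(2018, 3, 1), "Z"),
    (date(2018, 3, 8), "YXW"),
    (date(2018, 3, 15), "VUT"),
    (date(2018, 3, 22), "SRQPON"),   # assumed to be S through N
    (date(2018, 3, 29), "MLKJIHGFE"),
]
# "All other usernames (including non-ASCII characters)".
FALLBACK = date(2018, 4, 5)

LETTER_TO_DATE = {letter: day for day, letters in ROLLOUT for letter in letters}


def rollout_date(username: str) -> date:
    """Return the date on which the feature is enabled for this username."""
    if not username:
        raise ValueError("username must not be empty")
    first = username[0].upper()
    return LETTER_TO_DATE.get(first, FALLBACK)


if __name__ == "__main__":
    for name in ("Zorro", "Ymblanter", "Succu", "Andy", "Ярослав"):
        print(name, rollout_date(name).isoformat())
```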

This progressive deployment is also the occasion for the community to check the constraint definitions and make sure that they are ready to be exposed to everyone.

If you want to hide the constraint reports, you can do it by adding this line to your common.css:

/* hide constraint report indicators */
.wikibase-snakview-indicators .wikibase-snakview-indicator.wbqc-reports-button {
    display: none;
}

If you encounter any issue after the deployment is done, please let us know! You can also create a task on Phabricator with the tag Wikibase-Quality-Constraints.

Links to previous discussions about constraint checks since April 2017: 1, 2, 3, 4, 5, 6, 7.

Thanks, Lea Lacroix (WMDE) (talk) 15:10, 26 February 2018 (UTC)

@Lea Lacroix (WMDE), Lydia Pintscher (WMDE): Can you explain why we make it hard to turn off the reports, by making someone edit their common.css file? We have a perfectly good gadget system where we can turn on options by default, and turn them off if we so choose. I would encourage the developers to utilise the gadgets not require edit of css files for such an option, that is akin to wiki-barbaric and contra to approach of usability. [Before people attack me, please note that I have this set on for me and have had so for months.] — billinghurst sDrewth 11:01, 27 February 2018 (UTC)

My hope is that very very few people will want to turn it off. If this turns out to be false I'm happy to provide a better way. --Lydia Pintscher (WMDE) (talk) 11:17, 27 February 2018 (UTC)

I understand "hope", however, this is not the WMF wiki way to manage your hopes, this is basically enforcing such. Few are going to see the above mention and even know, so that becomes the regulating factor. This is the developers deciding what is better for user, this is not consensus, this is not about usability. Far better to demonstrate usability with a default gadget where people do not turn it off, that gives you better metrics. — billinghurst sDrewth 20:16, 27 February 2018 (UTC)

billinghurst, sorry. That's not how I mean this of course. My perspective: I have a very limited number of developers that I can ask to write code for the benefit of Wikidata. I want to make the best use of their time and only write code if I know it is needed. (Now obviously this is a small thing but there are a ton of small things every single day that pile up to costing a lot of time.) --Lydia Pintscher (WMDE) (talk) 09:07, 1 March 2018 (UTC)

Collaboration with Wikivoyage

As announced in the weekly summary above, Wikivoyage now links to Wikidata items.

Wikivoyage listings have many interesting properties such as address, coordinates, WiFi, accessibility, URL, opening hours, phone, fax, email, etc. All properties are validated and extracted to CSV here every 2 weeks, waiting for you harvesters! :-)

English Wikivoyage: 14803 listings with a Wikidata identifier property
French Wikivoyage: 13200 listings with a Wikidata identifier property
Russian Wikivoyage: 3808 listings with a Wikidata identifier property
German Wikivoyage: 2030 listings with a Wikidata identifier property
Spanish Wikivoyage: 156 listings with a Wikidata identifier property

Number of listings with a Wikidata property (across all languages, divided in categories):

See: 16443
Do: 3730
Buy: 607
Eat: 305
Drink: 161
Sleep: 443
Vicinity: 270
Diplomatic representation: 260
Unclassified: 9300

Only 5% of the listings have a Wikidata property, which means for the hundreds of thousands of others that either the link has yet to be established, or the item is not Wikidata-worthy, or the Wikidata item is yet to be created. Almost everything in "See" is Wikidata-worthy, not so much for the other categories. Cheers! Syced (talk) 04:05, 27 February 2018 (UTC)

How many Wikivoyage listings have a Wikipedia article link, but no Wikidata item included? Is Wikidata:Bot_requests/Archive/2017/7#Adding items to Wikivoyage listings still needed?
--- Jura 04:20, 27 February 2018 (UTC)
- In English Wikivoyage, just over 1,600. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:37, 27 February 2018 (UTC)
- In Russian Wikivoyage, we do not include Wikipedia links into listings, only Wikidata links, and therefore the process can not be automated.--Ymblanter (talk) 15:23, 27 February 2018 (UTC)
  - 1600 is the number of articles. One article can have 100 listings.
    --- Jura 16:58, 27 February 2018 (UTC)
- In French Wikivoyage, 5877 listings have a wikipedia= but no wikidata= so a bot would still be very welcome indeed. The other language editions have far fewer, according to my SQL queries. Details: git clone https://github.com/baturin/wikivoyage-listings then run tools/statistics.sh then run: sqlite3 listings/fr.sqlitedb 'SELECT COUNT(id) FROM wikivoyage_listings WHERE wikipedia != "" AND wikidata == "";' Syced (talk) 07:14, 1 March 2018 (UTC)
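
The same count can be scripted, which may be handy for repeating it across language editions. Below is a minimal sketch using Python's standard `sqlite3` module. The table and column names (`wikivoyage_listings`, `id`, `wikipedia`, `wikidata`) come from the query above; everything else, including the in-memory demo table and its rows, is invented for illustration and is not the real schema or data.

```python
import sqlite3

# Same condition as the command-line query above, written with
# standard SQL string literals.
QUERY = """
SELECT COUNT(id)
FROM wikivoyage_listings
WHERE wikipedia != '' AND wikidata = ''
"""


def count_missing_wikidata(conn: sqlite3.Connection) -> int:
    """Count listings that have a Wikipedia link but an empty Wikidata field."""
    return conn.execute(QUERY).fetchone()[0]


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table with made-up rows (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wikivoyage_listings "
        "(id INTEGER PRIMARY KEY, wikipedia TEXT, wikidata TEXT)"
    )
    conn.executemany(
        "INSERT INTO wikivoyage_listings (id, wikipedia, wikidata) VALUES (?, ?, ?)",
        [
            (1, "Article A", "Q100"),  # has both links
            (2, "Article B", ""),      # Wikipedia only: counted
            (3, "", ""),               # neither link
            (4, "Article C", ""),      # Wikipedia only: counted
        ],
    )
    return conn


if __name__ == "__main__":
    print(count_missing_wikidata(demo_connection()))
```

To run it against a real export, replace `demo_connection()` with `sqlite3.connect("listings/fr.sqlitedb")`. Note that the comparison with an empty string does not match NULL values, so rows stored as NULL would need a separate check.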

Help needed to import Directory of Open Access Journals data

Hi all

I'm starting a dataset import for the Directory of Open Access Journals. It is quite a complex dataset and I need help thinking about how to import it. I'm not sure all of it is suitable for Wikidata, I think that perhaps new properties will need to be added and possibly imported as multiple mix n' match catalogues to match the journals, publishers etc.

It would be amazing to have this data in Wikidata that could be used and visualised in many different ways and would probably benefit the work of others who are importing items for sources.

I started a discussion at the Wikidata Import Hub which provides a link to the dataset, a table of the database fields for matching to Wikidata properties etc.

All help very much appreciated

Thanks

John Cummings (talk) 09:41, 1 March 2018 (UTC)

Nigerien Air Base 201

En-wiki has an article about the U.S.-operated Nigerien Air Base 201 ( en:Nigerien Air Base 201 ). (I have translated the article to another language.) I can not see that the article has "Wikidata on the menu". Should anything be done? 176.11.84.113 09:52, 1 March 2018 (UTC)

A bot seems to have partially fixed the problem [1]. 176.11.84.113 09:56, 1 March 2018 (UTC)

Google Knowledge Graph identifier

Should the formatter of Google Knowledge Graph ID (P2671) be changed? Eg https://g.co/kgs/pj5VKe is the new Google KG url of Suez Crisis (Q49101). I entered it as "s/pj5VKe" and the link works, but it seems unlikely to me that is the real new format for Google KG IDs. Does anyone know? --Vladimir Alexiev (talk) 14:42, 1 March 2018 (UTC)

Plants of the World Online database

Plants of the World online (at [2]) looks to be an important database in future for plants. So could it be set up please? An example of where it is needed is Malva acerifolia (Q47519412) – see discussion page for the exact reference. Peter coxhead (talk) 11:06, 30 January 2018 (UTC)

That should not be a problem, although your example is testimony of a wrong attitude. - Brya (talk) 11:45, 30 January 2018 (UTC)

Started the ball rolling. - Brya (talk) 12:06, 30 January 2018 (UTC)

These values are already stored in IPNI plant ID (P961), with the PotW link as a third-party formatter URL (P3303). For the above example, the P961 value is 561509-1 which gives a PotW URL of http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:561509-1 Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:20, 30 January 2018 (UTC)
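
To make the mechanism concrete: a formatter URL on Wikidata carries a `$1` placeholder that is replaced by the identifier value. The sketch below shows that substitution in Python. The formatter string is inferred from the example URL in this post rather than copied from the property page, and the function name `format_identifier_url` is hypothetical.

```python
# Formatter inferred from the example URL above; "$1" marks where the
# identifier value goes.
POWO_FORMATTER = (
    "http://www.plantsoftheworldonline.org/taxon/urn:lsid:ipni.org:names:$1"
)


def format_identifier_url(formatter_url: str, identifier: str) -> str:
    """Substitute an identifier into a formatter URL with a $1 placeholder.

    No URL-encoding is applied here; that is a simplification of this sketch.
    """
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter_url.replace("$1", identifier)


if __name__ == "__main__":
    # P961 value from the example in this thread.
    print(format_identifier_url(POWO_FORMATTER, "561509-1"))
```

The substitution itself says nothing about whether the target site actually has a page for a given identifier.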

@Pigsonthewing: Andy, please see my last comment at Wikidata:Property proposal/Plants of the World online. My only interest is to get Plants of the World online included in appropriate Wikidata items so that can be made to show up in articles when {{:en:Taxonbar}} is added. Please help to get this done in whatever way is appropriate. Peter coxhead (talk) 22:14, 2 February 2018 (UTC)

Before seeing your comment here, I had just written over on en.Wikipedia: "Taxonbar can be made to display a link to the PotW site, using values from Wikidata property P961". You don't need any change to Wikidata for that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:51, 3 February 2018 (UTC)

That is an approach that could be adopted, provided one does not mind that it works for only part of the cases. - Brya (talk) 03:40, 10 February 2018 (UTC)

Please provide an example of a case where it does not work. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:36, 10 February 2018 (UTC)

This was already done at Wikidata:Property proposal/Plants of the World online. --Succu (talk) 19:17, 10 February 2018 (UTC)

No such example is provided on that page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:59, 10 February 2018 (UTC)

It is. And tons of plants not covered by IPNI. --Succu (talk) 22:02, 10 February 2018 (UTC)

Please give an example - here - of some of the "tons" of plants not covered by IPNI, which use IDs matching the definitions in the property proposal. And, if I'm wrong, prove it: give the former example here, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:59, 11 February 2018 (UTC)

Succu provided urn:lsid:ipni.org:names:128853-1 as an example missing from POWO; Peter coxhead provided urn:lsid:ipni.org:names:503872-1. ArthurPSmith (talk) 16:00, 12 February 2018 (UTC)

Those are examples in the wrong direction; the model proposed (giving a valid POTW URL as a reference, to indicate that a page exists on POTW) would clearly not be used in such cases. They are not "plants not covered by IPNI". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:01, 12 February 2018 (UTC)

No such examples, then. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:16, 17 February 2018 (UTC)

Wrong direction: Your intention was to spare us a property because this property could be "remodeled" via third-party formatter URL (P3303). This is untrue. --Succu (talk) 21:58, 17 February 2018 (UTC)

False. And I asked you to "Please provide an example of a case where it does not work". Still no such examples. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:18, 18 February 2018 (UTC)

Probably only a faux pas (Q1398885) at your side, Mr. Mabbett? --Succu (talk) 22:21, 20 February 2018 (UTC)

Wider issue

I have held off from creating this property because of the heated debate (which is apparently still on). I think we should just have one general discussion about these cases. When an identifier is shared by multiple databases, but these databases do not have the same coverage of that identifier, what do we do?

Either we create two separate properties, holding the same values but with different formatter URLs (and different coverage obviously)
Or we find another way to indicate that an identifier is available in one of the databases (Andy suggested to use references like this).

This is a fairly general problem that was raised in other proposals (such as Wikidata:Property proposal/Google Arts & Culture entity ID) so it would be worth settling it once and for all… Should we have an RFC or something like that? Or is it overkill because the consensus for one solution or another is already clear somehow? − Pintoch (talk) 19:04, 24 February 2018 (UTC)

A similar issue arises at Wikidata:Properties for deletion#eFlora properties, where we currently have two properties, and potentially twenty or more, for a single set of identifier values. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:07, 24 February 2018 (UTC)

@Pintoch: We have properties for small datasets (some hundreds of usages). We have a lot of (nearly) unused properties. Creating another property shouldn't be that problematic. The usage recommendations of third-party formatter URL (P3303) are a bit fuzzy. If another database uses the same set of identifiers, all is fine with me. But what about sub-/supersets? Assuming that third-party formatter URL (P3303) is intended to be used as an alternative reference, this won't work. Supersets will return 404 errors (POWO). The same is true if we want a direct link to a describing site (eFloras) to make it available as a reference. Mr. Mabbett, as the property proposer of third-party formatter URL (P3303), could probably help to sort this out. --Succu (talk) 23:41, 24 February 2018 (UTC)

Gladly Succu. Which aspect of P3303, a property created with zero objections and which multiple editors have used without issue, albeit never as a reference, are you having difficulty understanding? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:10, 25 February 2018 (UTC)

To frame it with Pintoch's words: „When an identifier is shared by multiple databases, but these databases do not have the same coverage of that identifier, what do we do?“ You didn't answer that question. Why do you think using third-party formatter URL (P3303) is the best solution at hand? --Succu (talk) 21:43, 27 February 2018 (UTC)

You say I didn't answer Pintoch's question, but he already includes my answer, in his post, immediately beneath it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:41, 27 February 2018 (UTC)

Avoiding a direct answer is not really helpful to solve problems. Why do you think using third-party formatter URL (P3303) is the best solution at hand? The idea referred to in your answer to me makes no use of third-party formatter URL (P3303). Do you think a third-party formatter URL (P3303) that constructs URLs pointing to nothing is OK (404 error)? If yes, then why? --Succu (talk) 22:03, 28 February 2018 (UTC)

The http specification, RFC 7231, tells us that "The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource". That is - in the appropriate circumstance - quite useful information. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:33, 28 February 2018 (UTC)

Again you are avoiding a direct answer: „A 404 status code does not indicate whether this lack of representation is temporary or permanent“. Is the usage of third-party formatter URL (P3303) intended to provide a permanent URL? --Succu (talk) 22:49, 28 February 2018 (UTC)

You had your chance to comment on P3303 when it was proposed. If you have concerns now which you failed to express then, you can use Wikidata:Properties for deletion. Speaking of "avoiding a direct answer", I note that you have yet to give examples - here - of some of the "tons" of plants which you apparently believe to exist, which are not covered by IPNI, but which use IDs matching the definitions in the property proposal; and which I asked you for above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:25, 1 March 2018 (UTC)

This "Wider issue" is not about deleting a useful property. I never suggested this. „The domain of POWO is much broader than that of IPNI (fungi and algae are not the subject of IPNI)“ --Succu (talk) 22:47, 1 March 2018 (UTC)

Again you are avoiding a direct answer; and again you have not given the examples requested; you have not even given evidence of the example you claim to have given elsewhere. The claim that the model I proposed - and which Pintoch quotes as a possible solution for "the wider issue" discussed here - does not work is false; you cannot substantiate it. As I said to you above: "if I'm wrong, prove it". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:31, 1 March 2018 (UTC)

Is this an answer of a parrot? In the case of POWO applying elementary logic should be enough. All this has little to do with the concerns raised by Pintoch. --Succu (talk) 21:14, 2 March 2018 (UTC)

town clerk

Are Public Notary (Q1047879) and municipal clerk (Q883211) the same thing for a merge? It seems that it is the same local government official in different languages. Where I live the town clerk is licensed to be a notary public. --RAN (talk) 16:23, 22 February 2018 (UTC)

What about notary (Q189010)? I'd merge this one with Public Notary (Q1047879), at least on first sight. Grüße vom Sänger ♫ ^(talk) 17:47, 22 February 2018 (UTC)

In the Netherlands, a notary is an actual academic degree and a salaried position. In the USA any secretarial position can become a notary through passing a test (like for a driver's license). I don't think you can compare these across country borders and the items should probably be per jurisdiction. Jane023 (talk) 17:54, 22 February 2018 (UTC)

For notary (Q189010), the English Wikipedia article it's linked to is an umbrella term that covers both the US-style notaries public with limited training and authority, as well as notaries in other countries with training comparable to attorneys. The item Public Notary (Q1047879) has no English Wikipedia article linked to it, and the English description, "clerk of the competent local government", and the also-known-as "town clerk" should not be applied to "notary". Jc3s5h (talk) 18:04, 22 February 2018 (UTC)

Certainly some merging seems to be in order; there are many items found when searching for "notary" or "notary public", and some of these seem to be suitable for merging. But municipal clerk (Q883211) should not be merged with any of the notary or notary public items. In some jurisdictions, notaries have vastly different education and qualifications than a municipal clerk, and it is unlikely the same person would fill both roles. In some places, like where I live, all town or city clerks are notaries public, but the vast majority of notaries public are not town or city clerks. Jc3s5h (talk) 17:56, 22 February 2018 (UTC)

Well I see the Dutch wiki article for "Notaris" now links to notary (Q189010) and I can see at a glance that the interwiki to the English article is incorrect. This probably is true for a multitude of professions that have been carried over in different ways by different countries over time. Sorry I have no time to look into this and help clean up though. Jane023 (talk) 18:05, 22 February 2018 (UTC)

It seems that notary public (Q15479268) (a licensed position) and notary (Q189010) (an historical position) are very similar and perhaps the Wikipedia articles should be merged. It seems that Public Notary (Q1047879) and municipal clerk (Q883211) are the same. The links to Wikipedia articles need to be sorted out, the problem is the wording "public". We use it to mean "civil position that serves the public" as in "notary public" and we use it as "public office" meaning an "appointed or elected political position". I changed "clerk" and "notary" to "municipal clerk" and "municipal notary" to distinguish the political offices. I think they can be merged, there is little overlap in the languages links to the Wikipedias and the ones that overlap are meant for "notary public", the civil position. --RAN (talk) 19:22, 22 February 2018 (UTC)

This is the wrong place to discuss merging Wikipedia articles, no matter which of the several Wikipedias for the various languages is being referred to. The Wikipedias write whatever articles they want to, and Wikidata links to them as best it can.

In English, there are three notary-related articles that cover large parts of the world: notary (Q189010) is any kind of notary who deals with legal papers. notary (Q189010) is an umbrella term for two kinds of notaries, notary public (Q15479268), the type of notary prevalent in most of the US and much of Canada, and civil law notary (Q23838068) who are prevalent in continental Europe and countries that derive their legal traditions from continental Europe. None of these terms are historical terms; they all apply to notaries active today.

All three of these articles discuss notaries who are installed and recognized by the government. The term "public" refers to the fact that all these notaries are awarded their positions by the government and the government accords extra recognition to their acts, beyond the acts of an ordinary private person. Sometimes the presence or absence of the word "public" is used as a shorthand to distinguish American-style notaries from continental-style notaries, but they are all recognized by their governments. A good contrast in the US would be a notary of the Roman Catholic Church (en:Notary (canon law) in English Wikipedia or Notary (Catholic canon law) (Q25345637) in Wikidata). Such a notary's acts would only be recognized by the Church and the government would not give any special recognition to the acts of a Church notary.

"Municipal clerk" is a pretty good term, but "municipal notary" is not. In most of the US notaries are appointed by the state (e.g. California), not by a city or town.

By the way, I am a notary public in US State of Vermont, and was appointed by the assistant judges of my county. Jc3s5h (talk) 21:18, 22 February 2018 (UTC), corrected 20:42, 25 February 2018 (UTC).

I have reverted a terrible English translation from Public Notary (Q1047879) which led to a whole reinterpretation of the item. Please always refer to the original sitelinks when making a judgment about possible merges. Maybe now the statements make sense. I am not sure if the Hungarian sitelink belongs to Public Notary (Q1047879), and I have no idea about vi, but the rest of the sitelinks seems alright. Anyway, it is a good idea to check if some sitelinks need to be moved to another item rather than merging two items. Andreasm ^{háblame / just talk to me} 03:18, 26 February 2018 (UTC)

@Andreasmperu: Maybe "notary services" as per Yandex translate of Slovene: https://translate.yandex.com/?lang=sl-en&text=Notariat? --Liuxinyu970226 (talk) 11:39, 2 March 2018 (UTC)

In the US there are for-profit services, mostly web-based, who maintain databases of notaries public around the country, and will help a person or financial institution who needs a notary public find one. Most of the notaries public in these databases are willing, for a fee, to travel to the location of the person requesting the notary public. These services could be labeled "notary services", and I guess they are different from what the item mentioned by User:Liuxinyu970226 is referring to (but it's hard for me to tell since I only read English). Jc3s5h (talk) 11:55, 2 March 2018 (UTC)

Looking up properties using Search

I know there were improvements in the pipeline for the search boxes, both the one at the top-right of the page with the incremental suggester, and the main text search function. (Which, curiously, still give different suggestions -- the incremental suggester is usually better). Can anyone give an update on the progress of these? Are they now implemented, or are further adjustments still coming?

In particular, I note that currently when I key "Property:named as" into the search, not only does subject named as (P1810) not appear at the top of the list, it does not in fact appear in the list returned at all!

(Instead I think "named" gets stemmed to "name", and then various hits come back containing the word "name" -- indeed those seem to come back higher than hits containing the word "named" itself).
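The stemming effect described above can be illustrated with a toy sketch. It assumes a very naive suffix-stripping stemmer and a length-normalised overlap score; neither is the actual Wikidata search implementation, and the names `naive_stem`, `overlap_score`, `score_with_exact_bonus` and the three sample labels are hypothetical.

```python
def naive_stem(word: str) -> str:
    """Crude illustrative stemmer: strips a plural 's', then 'ed' or a final 'e'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    if len(word) > 4 and word.endswith("ed"):
        word = word[:-2]
    elif len(word) > 3 and word.endswith("e"):
        word = word[:-1]
    return word


def overlap_score(query: str, label: str) -> float:
    """Share of the label's stemmed tokens that also occur in the stemmed query."""
    query_stems = {naive_stem(token) for token in query.split()}
    label_stems = [naive_stem(token) for token in label.split()]
    if not label_stems:
        return 0.0
    return len(query_stems & set(label_stems)) / len(label_stems)


def score_with_exact_bonus(query: str, label: str) -> float:
    """Same score, plus a bonus when the unstemmed query appears verbatim in the label."""
    bonus = 1.0 if query.lower() in label.lower() else 0.0
    return overlap_score(query, label) + bonus


# Hypothetical candidate labels, for illustration only.
LABELS = ["name", "family name", "subject named as"]


def rank(query: str, scorer) -> list:
    """Order the sample labels from best to worst under the given scorer."""
    return sorted(LABELS, key=lambda label: scorer(query, label), reverse=True)
```

Under these assumptions "named" and "name" collapse to the same stem, so a short label consisting only of "name" can outrank the exact phrase unless the scorer explicitly rewards a verbatim match.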

Similarly, searching for "Property:stated as", the property object named as (P1932) only appears at #3 on the list returned, despite being an exact match for the words keyed in.

Pinging User:Smalyshev (WMF) -- where are we currently at on this? Are there modifications you're still looking at? Jheald (talk) 16:47, 2 March 2018 (UTC)

"(Which, curiously, still give different suggestions -- the incremental suggester is usually better)." Thank you; this is because the completion suggester is driven by the new code, while the fulltext search still uses the old one. Watch https://gerrit.wikimedia.org/r/c/380895/; that improvement is coming to fulltext too. Not sure why your property search didn't work; I'll look into it. Smalyshev (WMF) (talk) 22:13, 2 March 2018 (UTC)

Architect versus notable work

If I am giving the architect (P84) for an item, could it then be possible to have notable work (P800) filled in for that architect? – The preceding unsigned comment was added by Pmt (talk • contribs) at 16:10, 27 February 2018 (UTC).

If it's a notable one, sure.
--- Jura 17:01, 27 February 2018 (UTC)

If the notable work is not in our database, you have to create it. --RAN (talk) 18:02, 27 February 2018 (UTC)

@Richard Arthur Norton (1958- ): It was indeed meant that both the work and the architect are in the database, and that you are working in the item for the specific notable work or architect. Breg Pmt (talk) 19:58, 27 February 2018 (UTC)

@Jura1: and @Richard Arthur Norton (1958- ): Sorry for being unclear; what I was thinking about was having this happen automatically. As an example: suppose I am creating a new item about a building and adding the architect (P84) for that building, and the architect is notable and already has a Wikidata item ("Q"). Instead of my then opening the item for the architect and adding the newly created building item there, why can't the program do it automatically? – The preceding unsigned comment was added by ? (talk • contribs).

How would it know that the building is notable for that architect?
--- Jura 06:59, 1 March 2018 (UTC)

@Jura1: I was thinking of a work that already exists as an item in Wikidata, and as such is a notable work for the architect who created it. Breg Pmt (talk) 07:20, 1 March 2018 (UTC)

Some architects built hundreds of buildings, some happen to be notable for them, others not. Some have items, others still need to be created. I don't think we can assume that even if there is just one, it's necessarily notable. Besides, the full list can always be queried.
--- Jura 07:23, 1 March 2018 (UTC)

@Jura1: OK. As mentioned above, I was thinking about items existing in Wikidata that either have an architect or are buildings with no architect given. But since you bring it up: do you mean that, in general, works by an architect, author or designer who has their own item on Wikidata are not necessarily notable for that creator? Who then decides which works are notable for that creator? For instance, for William Shakespeare, how many of his works here at Wikidata are not notable? Can you provide me with a list? For Sigurd Hoel (Q138650), is Meeting at the Milestone (Q6807901) notable? Breg Pmt (talk) 08:39, 1 March 2018 (UTC)

@Pmt: IMO:

If the architect is a "clearly identifiable conceptual or material entity", it's notable for Wikidata (not. 2).
If the building can be attributed to the architect using reliable sources, it's notable for Wikidata, as "it's (somehow, to some extent) a clearly identifiable conceptual or material entity" (not. 2.) and "it fulfills some structural need": databasing this architect's production (not. 3).
If the building is a "clearly identifiable conceptual or material entity", it's notable for Wikidata too (not. 2), regardless of whether the architect is known.
Is the building notable enough for being displayed through notable work (P800) in architect's item? Don't know, don't care, as already said "the full list can always be queried".

A few weeks ago I created a pretty simple template in es.wikipedia ({{wikidata arquitecto}} "example", external links), and it works nicely without using P800. strakhov (talk) 16:31, 3 March 2018 (UTC)

Is it possible to provide contextual statement ranking when adding statements?

Hi

Sorry for the weird title. Basically, I'm adding information by hand for a group of women, and when I add sex or gender (P21), the first option in the list when I type 'Female' is female organism (Q43445), not female (Q6581072). I know the first one is wrong, but it is offered as the first option every time. Is it possible/realistic to provide more contextual suggestions?

Thanks

--John Cummings (talk) 14:19, 2 March 2018 (UTC)

Hello John Cummings,

to avoid this specific problem with P21, I use and recommend User:Magnus Manske/wikidata useful.js, which lets you add the statement with a single click ;)

more generally, I would agree that it would be very useful (for adding names, countries, languages, occupations), because there are a lot of them that have the same label, which are not at all the same ;) --Hsarrazin (talk) 14:28, 2 March 2018 (UTC)

Yeah we're looking into that - specifically User:Smalyshev (WMF). Input is being collected at Wikidata:Suggester ranking input. --Lydia Pintscher (WMDE) (talk) 14:33, 2 March 2018 (UTC)

@Lydia Pintscher (WMDE):, 👍 , --John Cummings (talk) 10:41, 3 March 2018 (UTC)

Inherently ambiguous birth dates

I want to flag instances of humans that have an "inherently ambiguous birth date". Some people were born at home, without birth certificates, in poor rural areas in the 1800s. For some people up to the 1600s, records do not exist and we can only give the year they were born. Some people have given different years in different documents. I have found multiple people, especially celebrities, who keep making themselves younger as they age in their official documents. Unless these people who give multiple birthdates have a birth certificate online, they are one category of people that have an "inherently ambiguous birth date". I have been adding hundreds of full birthdays where only the year was known, based on the WWI and WWII draft or passport applications for people in the USA. Sometimes the value is already in Wikipedia and Wikidata has not been updated. I run this: tinyurl.com/o26zc83. However, after searching and finding nothing, I want to create a field to let myself know a search has been done and nothing has been found, so I do not keep looking when I run the program again a month later. One day in the future, when more records are online, someone can search all the humans that have an "inherently ambiguous birth date" and look again. This would be a "Wikidata-specific criterion". Can someone suggest a scheme? Does anyone else see this as useful? --RAN (talk) 21:24, 3 March 2018 (UTC)

@Richard Arthur Norton (1958- ): Maybe add a qualifier of sourcing circumstances (P1480) = presumably (Q18122778) (or a new QID for 'ambiguous')? Thanks. Mike Peel (talk) 00:10, 4 March 2018 (UTC)

@Richard Arthur Norton (1958- ):

if I understand correctly, what you want is a means of not checking each date again, because you want to know which ones have already been checked?

for this purpose, on VIAF ID (P214) and Bibliothèque nationale de France ID (P268) which did not exist when I checked, I use retrieved (P813) as qualifier. Could this help you ? --Hsarrazin (talk) 08:46, 5 March 2018 (UTC)

Yes, both schemes are good ideas. I will see which one works best and then modify my SPARQL search to look for this qualifier, so I do not search the same people over and over.
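A minimal sketch of how such a search could skip people who have already been checked, assuming the qualifier approach suggested above (sourcing circumstances (P1480) or retrieved (P813) placed on the date of birth (P569) statement). The function name `build_unchecked_birthdate_query` is hypothetical, the query has not been run against the query service, and the original tinyurl query is not reproduced here.

```python
import re


def build_unchecked_birthdate_query(qualifier_pid: str = "P1480", limit: int = 100) -> str:
    """Build a SPARQL query for humans whose birth date has year precision only
    and whose birth-date statement does not yet carry the given qualifier."""
    # Guard against malformed property IDs ending up in the query text.
    if not re.fullmatch(r"P[1-9]\d*", qualifier_pid):
        raise ValueError(f"not a property ID: {qualifier_pid!r}")
    return f"""
SELECT ?person ?birth WHERE {{
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          p:P569 ?statement .      # date of birth statement
  ?statement psv:P569 ?value .
  ?value wikibase:timeValue ?birth ;
         wikibase:timePrecision 9 .  # 9 = year precision
  # Skip statements already marked as checked via the qualifier.
  FILTER NOT EXISTS {{ ?statement pq:{qualifier_pid} [] }}
}}
LIMIT {limit}
""".strip()
```

Calling `build_unchecked_birthdate_query("P813", 50)` would produce the variant that treats a retrieved (P813) qualifier as the "already checked" marker.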

Bracteates

bracteate (Q848960) currently covers migration period/ancient pendants and medieval coins that were based on these pendants. These probably need to be two separate items. There are two AAT identifiers and the dates (when we add them) will be different. Some of the linked Wikipedia articles cover both, some only one or the other. Does anyone with expertise want to take a stab at disentangling these items? - PKM (talk) 04:29, 4 March 2018 (UTC)

@PKM: I can have a look. Breg Pmt (talk) 18:23, 5 March 2018 (UTC)

New « season » property and « part of »

Just stumbled on the property proposal (status: under discussion; description, data type and examples all missing) through this impressive diff: https://www.wikidata.org/w/index.php?title=Property_talk:P361&diff=643090919&oldid=634204131&diffmode=source and saw that the argument for creating it is that mass constraint violations were being created. I think the actual issue is that we require « part of » to be an inverse of « has part », and require the inverse claim to be explicitly present. But « part of » is also transitive, so if we require explicit claims we would also need to add « has part » transitively on all the bigger wholes, which seems rather impracticable. Why do we require such statements to have explicit inverses, by the way? « part of » is transitive, but we actually require, as in several other cases, linking only to the smallest whole the item is part of, and not adding a « part of » statement if there is already a « part of* » path from the smaller item to the bigger one. This is a tension between constraint semantics and inference semantics that is uncomfortable to resolve in the current state.
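To make the tension described above concrete, here is a small sketch under stated assumptions: the item names are invented, only direct « part of » claims are stored, and the two helper functions (`part_of_closure`, `explicit_has_part`) are hypothetical names, not anything Wikidata or its constraint system provides.

```python
from collections import defaultdict

# Direct « part of » claims only: each item points at the smallest whole containing it.
PART_OF = {
    "episode 1": ["season 1"],
    "season 1": ["series"],
    "series": ["franchise"],
}


def part_of_closure(item, direct=PART_OF):
    """All wholes reachable along a « part of* » path (inference semantics)."""
    seen = []
    stack = list(direct.get(item, []))
    while stack:
        whole = stack.pop()
        if whole not in seen:
            seen.append(whole)
            stack.extend(direct.get(whole, []))
    return seen


def explicit_has_part(direct=PART_OF):
    """« has part » claims that would be needed if every inferred « part of »
    relation had to be mirrored by an explicit inverse (constraint semantics)."""
    has_part = defaultdict(set)
    for item in direct:
        for whole in part_of_closure(item, direct):
            has_part[whole].add(item)
    return {whole: sorted(parts) for whole, parts in has_part.items()}
```

In this toy chain, three direct claims are enough for inference, whereas mirroring every inferred relation explicitly needs six « has part » claims, and the gap grows with the depth of the hierarchy.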

Notified participants of WikiProject Reasoning. WikiProject Properties has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. There is also the option, and I think it has been used, of overloading « series » to handle season items. A season is, after all, also a sequence of episodes, and we know the object is a season of something, or another kind of subsequence. There is also the possibility of qualifying « preceded by » with « series » because, say for Star Wars, there are several ways to order the episodes: by narrative order or by broadcast order, and there may be « in between » episodes that appear later. We don’t solve this with « series » or « season ». Different orders may lead to different « series » items, which could be used in different « preceded by » statements qualified by « series : original star wars order » or « preceded by : ep 1 prequel -> series : extended star wars narrative order ». « Season » seems to be a popular solution, but honestly I don’t think it’s really useful or especially expressive, as we already know that « seasons » are sequences of episodes and that they are seasons of some longer show like a TV series; it does not solve the more complex issues, so more creative solutions still have to be invented :/ To me it only looks like a good idea.

Notified participants of WikiProject Movies @Jura1: author TomT0m / talk page 14:20, 4 March 2018 (UTC)

You wrote "we require that « part of » to be actually an inverse of « part of »". Is it a typo? Thanks! Syced (talk) 02:08, 5 March 2018 (UTC)

corrected. author TomT0m / talk page 09:15, 5 March 2018 (UTC)

Given names as child (P40)?

Noticed, that Antonio Cavalieri Ducati (Q15059980) has male given name (Q12308941) type values for child (P40). Of course, it's wrong, but maybe we can allow them (maybe a specific property)? Having names of children would be better than having simply number of children (P1971), imho. And sometimes the only information about children is their name (and not always the surname, which may be different from parents). Of course, they would be notable per WD:N, but... --Edgars2007 (talk) 07:53, 5 March 2018 (UTC)

It would just break any query that assumes P40 provides you with items about children ..
--- Jura 07:55, 5 March 2018 (UTC)
- That's where "maybe a specific property" part comes in. --Edgars2007 (talk) 08:01, 5 March 2018 (UTC)
Another approach would be
⟨ subject ⟩ child (P40) ⟨ unknown value ⟩
given name (P735) ⟨ Bob ⟩.
But if we also have an item for the other parent, this does not express the fact that they had the child together. Alternatively:
⟨ subject ⟩ child (P40) ⟨ unknown value ⟩
given name (P735) ⟨ Bob ⟩
parent (P8810) ⟨ John Smith ⟩
but… author TomT0m / talk page 09:14, 5 March 2018 (UTC)
Names of children aren't relevant if they don't have their own entity, please respect wmf:Resolution:Biographies of living people. Sjoerd de Bruin (talk) 09:30, 5 March 2018 (UTC)

I've removed them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:16, 5 March 2018 (UTC)

Q2965940 - Christine Laurent

Christine Laurent (Q2965940) was created as representing fr:Christine Laurent. It has since - and several times - been re-purposed as "P31 conflation (Q14946528) of Christine Laurent (Q45180949) + Christine Laurent (Q45180738)". Nonetheless, it currently includes numerous external IDs. This re-purposing does not seem helpful. How should it be resolved? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:02, 5 March 2018 (UTC)

It appeared that fr:Christine Laurent and pt:Christine Laurent as described didn't actually exist. You could check the identifiers and move them to the correct items. Once done, arrange for deletion.
--- Jura 11:09, 5 March 2018 (UTC)

Wikidata weekly summary #302

Here's your quick overview of what has been happening around Wikidata over the last week.

Events/Press/Blogs
- Past: Wikidata workshops in different places in the world for Open Data Day
- WikiCite presentation (video) as part of Wikimedia Foundation metrics and activities meeting - February 2018
- Wikidata: Knowledge as a Service, by Martin Poulter

Other Noteworthy Stuff

Did you know?
- Newest properties:
  - General datatypes: court, time index, 3D model, broader concept, season, number of players in region, century breaks, dialect of, produces cohesive end, isocaudomer
  - External identifiers: FFF male player ID, AFL Tables coach ID, FFF female player ID, Webumenia creator ID, AFL Tables umpire ID, MuIS person or group ID, EPHE ID, Patrons de France ID, Siprojuris ID, ESPN X Games athlete ID, ACE work ID, AICTE institute ID, Chronicling America newspaper ID, Brooklyn Museum Exhibition ID, Zenodo ID, CONABIO ID, New Georgia Encyclopedia ID, Tropicos publication ID, KMSKA work PID, Bargeton ID, Guide Nicaise ID, AlloCiné company ID, Annuaire des fondations ID
- Query examples:
- Newest WikiProjects: Energy, Motorsports
- Newest database reports: nomes de países em português

Development
- Re-enable Wikidata Recent Changes integration on Russian Wikipedia (phab:T179012)
- Investigate on the size of logging table (phab:T188635)
- Fix issues with graph visualization UI (phab:T186467)
- Work on results of security review for the deployment of Wikibase-Lexeme (phab:T186726)
- Enable constraint result caching on Wikidata (phab:T184812)

You can see all open tickets related to Wikidata here.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:05, 5 March 2018 (UTC)

Q48975747 and Q2402627 redundant

I don't know how to merge redundant entries Yehoshua Blau (Q48975747) and Yehoshua Blau (Q2402627) together. I thought that Wikidata would automatically follow the interwikis on the en.wikipedia article, but that doesn't seem to be the case... AnonMoos (talk) 13:19, 8 March 2018 (UTC)

Done

you may use the "Merge" tool (first of the gadgets) for this kind of job : just make sure that it is really the same concept (easy for people, less easy for other items) :) --Hsarrazin (talk) 15:01, 8 March 2018 (UTC)

Thanks... AnonMoos (talk)

This section was archived on a request by: Matěj Suchánek (talk) 09:29, 11 March 2018 (UTC)

Are cook (Q156839) and chef (Q3499072) the same?

Hi! To me cook (Q156839) and chef (Q3499072) looks the same. Are they, and should they be merged? //Mippzon (talk) 18:13, 10 March 2018 (UTC)

Did you even take a look at the sitelinks? Sjoerd de Bruin (talk) 18:19, 10 March 2018 (UTC)

No, cook (Q156839) is only creator (Q2500638), but chef (Q3499072) is also manager (Q2462658). - Kareyac (talk) 04:57, 11 March 2018 (UTC)

This section was archived on a request by:
--- Jura 09:36, 11 March 2018 (UTC)

Merging Google Pay and Android Pay

I suggest merging the two payment platforms: Google Pay and Android Pay since they are basically the same platform but Google Pay is the new name for Android Pay after the rebranding from Google. – The preceding unsigned comment was added by Wefk423 (talk • contribs).

For you the same question as above. Sjoerd de Bruin (talk) 19:53, 10 March 2018 (UTC)

This section was archived on a request by:
--- Jura 09:36, 11 March 2018 (UTC)

Cordillera Azul Antbird, Myrmoderus eowilsoni

Q46624807 has the English common name "Cordillera Azul Antbird" and the scientific name "Myrmoderus eowilsoni". The former commemorates Cordillera Azul National Park, Q264948; the latter E.O. Wilson, Q211029.

When I created the item, I added statements indicating the etymology of each of these names, citing sources and giving quotes ("We select the English name to draw attention to the little known but biogeographically important and biodiverse mountain range that contains the type locality of the species." and "We name Myrmoderus eowilsoni in honor of Dr. Edward Osborne Wilson to recognize his tremendous devotion to conservation and his patronage of the Rainforest Trust, which strives to protect the most imperiled species and habitats in the Neotropics and across the globe. (English)", respectively).

For some reason, User:Succu has twice ([3], [4]) removed the cited etymology from the latter of these names. (I say "for some reason", as the only explanation given was the edit summary "per chat disk".)

I have, naturally, restored it. Repeatedly removing cited data with no cogent explanation is clearly unhelpful to the project, and to our users. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:57, 6 February 2018 (UTC)

And while I was writing the above, Succu did so a third time ([5]), with the edit summary "please do not remove a valid source, thx" - despite removing data cited to a valid source in the same edit. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:59, 6 February 2018 (UTC)

And now a fourth time ([6]), with edit summary "??!!". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:06, 6 February 2018 (UTC)

Info Wikidata:Project_chat/Archive/2017/07#Editwar_at_Desmopachria_barackobamai_(Q30434384). --Succu (talk) 20:10, 6 February 2018 (UTC)

Thanks for the reminder. That's another example of you edit warring to remove cited data on the origin of a specific (both senses) name. I've duly restored it there, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:13, 6 February 2018 (UTC)

Sigh: „duly restored“. OMG. --Succu (talk) 20:17, 6 February 2018 (UTC)

...and I have been reverted there also ([7]), again with the loss of cited metadata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:19, 6 February 2018 (UTC)

Beyond „I have“ and „I was“ do you have some additional arguments to the discussion I reminded you above? --Succu (talk) 21:23, 6 February 2018 (UTC)

Redux

I've restored the above, unresolved, topic from this month's archive, because we have a similar issue to the one originally raised (i.e. not the sp. nov. matter which side-tracked it; hence now collapsed) at Draba kananaskis (Q47507633), where User:Succu persists in removing a cited qualifier of taxon name (P225) which describes the etymology of the specific name. I raised the same issue last year, but that too petered out without resolution. It is simply not tenable to store the etymology of such names at item level, because that fails when an item can have different names/ labels in different languages, or where the scientific and vernacular names have different roots (see the 'Kentish Plover' example in last year's discussion). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:27, 16 February 2018 (UTC)

What is a „cited qualifier“? BTW: The edit history of Draba kananaskis (Q47507633) is revealing. --Succu (talk) 22:03, 17 February 2018 (UTC)

Wikidata:Project_chat/Archive/2017/07#Summary?, was the résumé. --Succu (talk) 22:12, 17 February 2018 (UTC)

Do you have a credible data model, that caters for the use-cases given above, other than the one which you keep undoing in your reverts? If so, what is it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:14, 17 February 2018 (UTC)

Could you please refine (or rephrase) your use case (Q613417)? What information do you want to extract say with a SPARQL query? BTW: What is a „cited qualifier“? --Succu (talk) 20:45, 20 February 2018 (UTC)

As you can see, my arguments are laid out above. As a courtesy to our fellow editors, I see no need to repeat them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:19, 22 February 2018 (UTC)

You offered your opinion but not a use case. Just another discussion where you are either unwilling or unable to argue in a comprehensible way. What a pity for our project. --Succu (talk) 21:07, 22 February 2018 (UTC)

So, you offer no credible data model, then; just ad hominem. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:29, 23 February 2018 (UTC)

A data model about what? Without use cases it's not possible to develop suggestions. --Succu (talk) 19:22, 23 February 2018 (UTC)

This still requires a resolution. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:33, 1 March 2018 (UTC)

It still needs some input by you, Mr. Mabbett! --Succu (talk) 22:47, 3 March 2018 (UTC)

Hopefully this renewed your attention. --Succu (talk) 22:54, 6 March 2018 (UTC)

descriptions for tv series episodes

I'm just adding descriptions for tv series episodes in en, fr and de. You might add descriptions in some other languages with that query. Queryzo (talk) 13:48, 5 March 2018 (UTC)

this query is for tv seasons, which I add now for de. Queryzo (talk) 16:39, 5 March 2018 (UTC)

@Queryzo: Don’t know if this is related, but I see edits like this https://www.wikidata.org/w/index.php?title=Q2817294&curid=2698014&diff=644038140&oldid=629359946 on series seasons. It seems that the description in those cases contains less information than the label … That does not seem like a good idea, since it can shadow automatic descriptions that may sometimes be more informative. author TomT0m / talk page 11:56, 6 March 2018 (UTC)

A description should (in combination with the label) be suitable for identifying the subject, see Help:Description/de. In fact, a proper description would have been "Staffel einer Fernsehserie", but this is not very common right now. Queryzo (talk) 12:30, 6 March 2018 (UTC)

@Queryzo: Not really. You’re not identifying the subject with this kind of description or, worse, with your last suggestion; you are just giving its type, which is shared by many other objects, so it does not help identify the subject (identifying would mean expressing how it differs from other similar objects). Quote:

« A useful template for creating definitions […] is provided by what are called Aristotelian definitions, which is to say definitions of the form « S =def. a G which Ds » where ‘G’ (for: genus) is the parent term of ‘S’ (for: species) in some ontology. Here ‘D’ stands for ‘differentia’, which is to say that ‘D’ tells us what it is about certain Gs in virtue of which they are Ss. An example Aristotelian definition (from the Foundational Model of Anatomy Ontology): « cell =def. an anatomical structure which consists of cytoplasm surrounded by a plasma membrane » « plasma membrane =def. a cell part that surrounds the cytoplasm » »

from the recommendations for descriptions, which I found quite good, in another project (quoted from https://pdfs.semanticscholar.org/6ff2/f127a6c75cd3461eff16ad62a4d0b0b5a090.pdf). author TomT0m / talk page 14:01, 6 March 2018 (UTC)

There is no need to identify a subject by the description alone; e.g. there have been a lot of former Prime Ministers of the United Kingdom, but Margaret Thatcher (Q7416) is only described as "Former Prime Minister of the United Kingdom". The only reason to make a description more specific is a possible ambiguity in connection with the label! In the example above this would be the case if there were two series named "Operación Triunfo" with an eighth season. This would mean that "Operación Triunfo/Staffel 8" exists twice, so I would have to make the descriptions specific with "Staffel von Operación Triunfo (year or sth.)" and "Staffel von Operación Triunfo (the other year)". The number of the season is sufficient in the label. Queryzo (talk) 14:55, 6 March 2018 (UTC)

That’s not really what you did in the edit I quoted; you just repeated information from the label. author TomT0m / talk page 15:10, 6 March 2018 (UTC)

Gap between Wiktionary.org and Wiktionary namespace at Wikidata?

In terms of technical functionalities, what are the gaps between the two? In other terms, is there any scope left for Wiktionary.org beyond differently licensed content and/or different visual presentations?
--- Jura 05:34, 27 February 2018 (UTC)

@Jura1: Much like how Wikidata can't include encyclopedia articles, it also can't include non-structurable elements of Wiktionary. For example, the elaborate wikitext-based definitions, extensive usage notes, and all but the simplest etymologies can't be included on Wikidata. And all of Wiktionary's appendices detailing areas of language and grammar, the specialized glossaries, details on reconstructed terms, textual details of use of alternative forms, useful details in rhyme guides for working around things, etc, etc... Take a look at fr.wikt's accommodation or ripopée or en.wikt's háček, or even the pronunciation section of pecan. There's a lot there that can't reasonably go on Wikidata.

That's not to say there isn't also a lot that can go on Wikidata. Wiktionary will probably have more use for structured data than Wikipedia, but that doesn't mean that its independent scope would be minimized that much. --Yair rand (talk) 03:49, 28 February 2018 (UTC)

When you are writing "it also can't include" is this an affirmation you are making or a technical limitation?
--- Jura 07:24, 28 February 2018 (UTC)

@Jura1: The technical limitation is that neither Wikidata items nor Lexemes can contain wikitext, formatting, paragraph breaks, etc. While in theory, a central database could also have wikitext pages in a different namespace (or just use giant strings, or something), that wouldn't really fit the idea of a structured database. Free text isn't structured, it means nothing to a machine and can't be cleanly divided into meaningful component parts. --Yair rand (talk) 21:26, 5 March 2018 (UTC)

While https://fr.wiktionary.org/wiki/accommodation includes text, I don't think it is unstructured. The various elements can be added in a structured way to the suggested lexeme-type at the appropriate parts.

wikitext is already being requested as a new datatype and might eventually be available, but I don't think this minor technical limitation is much of an issue for the sample.

On a more systematic level, I think including (e.g.) usage samples within Wikibase in a structured way provides information necessary for understanding lexemes. If this information would be disconnected from lexemes and stored in an unstructured way at another site, people couldn't query it. It could be compared to storing references for any statements outside Wikidata.
--- Jura 07:00, 7 March 2018 (UTC)

Correct way to link commons category

Hello.

Looking at Lakvijaya Power Station (Q6479997), there is already Commons category (P373) linked.
But when I visited Commons:Category:Lakvijaya Power Station, there were no links shown under "In Wikipedia" at the bottom of the left panel (but there was a Wikidata item link under "Tools").
On Commons, when I added the Wikipedia links by clicking Add links under "In Wikipedia", another category link appeared on the Wikidata item (on the right panel showing other projects).

So, which is the correct way to link a Commons category on Wikidata? Since now Wikidata has two links pointing to Commons. I couldn't find anything about it in the Help:Contents. Thank you, Reh man 03:49, 28 February 2018 (UTC)

Hello. The recent discussion over it can be observed here. - Kareyac (talk) 05:15, 28 February 2018 (UTC)

Thank you for that, Kareyac. So from what I understand, the current practice mostly is linking twice from the same wikidata item... Until consensus is reached (or a technical "fix" is made), maybe someone familiar with the local policies should mention that in one of the help pages? Just so that more questions like this can be avoided... Reh man 06:04, 28 February 2018 (UTC)

On the Commons category page, I see a link to Reasonator (top right corner of the page), which does the job anyway.--Ymblanter (talk) 06:41, 28 February 2018 (UTC)

@Ymblanter: because you have added this script to your common.js at Commons... --Edgars2007 (talk) 07:19, 28 February 2018 (UTC)

@Ymblanter: +1, 99 % of people reading this page won't see this Reasonator link. And as a Wikimedian, I activated this gadget, but I would prefer a direct link to Wikidata rather than having to click twice to get there. Cdlt, VIGNERON (talk) 07:51, 28 February 2018 (UTC)

Yes, sure, but to be honest I do not see why anybody who is not a Wikimedian would be interested in a connection of a Commons category to a Wikidata item. 99.999999% of non-Wikimedians have never heard of categories or Wikidata, and most of them have never heard of Commons.--Ymblanter (talk) 08:03, 28 February 2018 (UTC)

True, but this is a bit of a vicious circle: people don't see the link so don't display the link, so people won't see the link... Plus, I don't like to guesstimate what people want or not; I've too often been surprised to learn what people are interested in. And Wikimedians and/or Wikidatians may be 0,000001% of the readers, but I guess most of them would prefer a direct link to a Reasonator link (Reasonator is intended more for non-Wikimedians, but they don't see this link :/ the navigation flow is a bit off). Cdlt, VIGNERON (talk) 08:34, 28 February 2018 (UTC)

Categories are rather visited actually, even by unregistered users. For some groups they're a lifesaver. They're also linked from rather popular non-Wikimedia websites (e.g. sbn.it in Italy). For this reason, it's important that Wikidata items on a subject (those linked to a main namespace article in Wikipedia, Wikiquote etc.) include the corresponding Commons categories in their sitelinks, so that people can easily reach Commons categories from related articles (and vice versa). --Nemo 08:45, 28 February 2018 (UTC)

For me, on Commons categories, I see a big blue (+) near the name of the category, and when I click on it, I get a small popup which lets me "Edit data". When I click on that, I go directly to the Wikidata item. I do not remember what I did to get this, but it's been there for as long as Commons categories have been on Wikidata. Is it a gadget, a script, or normal behaviour? --Hsarrazin (talk) 09:16, 28 February 2018 (UTC)

No, not "standard" behavior. Probably User:Yair rand/WikidataInfo.js. --Edgars2007 (talk) 14:29, 28 February 2018 (UTC)

Be aware also that the new commons:Template:Wikidata Infobox uses our site links to Commons. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:38, 28 February 2018 (UTC)

I've added the infobox to that category as a demo. Note that if you add a commons sitelink, then a bot will come along and add it to P373. Unfortunately the reverse isn't (yet) true. Thanks. Mike Peel (talk) 12:07, 28 February 2018 (UTC)

Can someone remind me on what is the holdup of having commons have its own box along with the other Wikiprojects? We still have to add it to others to get a backlink. Are there any other Wiki projects other than Commons that need that "other" box? RAN (talk) 17:26, 28 February 2018 (UTC)
- The most important one? ;)
  --- Jura 06:44, 1 March 2018 (UTC)

@Rehman, Kareyac, Ymblanter, Edgars2007, VIGNERON, Nemo_bis: @Hsarrazin, Yair rand, Pigsonthewing, Mike Peel, Richard Arthur Norton (1958- ):

I've opened an RfC at Wikidata_talk:Notability#RfC:_Notability_and_Commons, based on text previously discussed here in November, to try to update and clarify our guidance on notability and sitelinks for Commons categories. Jheald (talk) 23:37, 5 March 2018 (UTC)

As Commons categories are likely to become less important in the future, a good question could be what we need to do to absorb Commons categories that haven't been linked from Wikidata yet.
--- Jura 10:18, 7 March 2018 (UTC)
Done at #Things_covered_by_Commons_categories,_but_not_by_Wikidata below.
--- Jura 11:59, 7 March 2018 (UTC)

P1461

Patientplus ID (P1461): shouldn't this property be an external-identifier? I've just noticed this alongside the 'normal' properties in ubidecarenone (Q321285). Wostr (talk) 00:09, 5 March 2018 (UTC)

It wasn't converted following the comments at Identifier migration/1.
Obviously, it could be moved to the identifier section even with string datatype. At Wikidata:Contact the development team#Placement in identifier section, I asked to arrange for this. Maybe @Lydia Pintscher (WMDE): can update us on that request.
--- Jura 07:41, 5 March 2018 (UTC)

Yes. The only comment about them at Identifier migration/1 is "only 96.77% unique out of 773 uses". A quick check of the constraint report showed that roughly half of the bad matches were for disambiguation pages; I have removed them. The rest seem to be Bonnie-and-Clyde cases, a few of which require somebody with medical knowledge to resolve. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:26, 5 March 2018 (UTC)

For consistency, we should convert other similar ones as well if we convert this one.
--- Jura 07:17, 7 March 2018 (UTC)

P2306

Why does property (P2306) added to property constraint (P2302): required qualifier constraint (Q21510856) have to be split if there is more than one required qualifier? It does not make sense; I merged a few property (P2306) values into one in some properties, but after I got a notification that it should be 'single value', I reverted my edits. But I can't see any reason why it should be like this (every required qualifier in a different required qualifier constraint (Q21510856)). Wostr (talk) 16:53, 5 March 2018 (UTC)

Not sure why it has to (query wise or other), but if it's split, you will get separate reports to clean it up: Wikidata:Database reports/Constraint violations/P2118.
--- Jura 16:59, 5 March 2018 (UTC)
- But having these as separate statements may mean that one of the required qualifiers goes unnoticed (i.e. someone will check that qualifier X is required and won't notice that there are other required qualifiers). Kind of weird, this is. Wostr (talk) 18:05, 5 March 2018 (UTC)
  - display of statements on www.wikidata.org isn't necessarily ideal and I don't think its optimization should need editing statements.
    --- Jura 07:16, 7 March 2018 (UTC)

@Wostr, Jura1: Merging all required qualifiers into a single constraint statement doesn’t allow you to mark only some of them as constraint status (P2316): mandatory constraint (Q21502408). See Help talk:Property constraints portal/Mandatory qualifiers#Modeling of multiple qualifiers in constraint statements. --Lucas Werkmeister (WMDE) (talk) 10:29, 6 March 2018 (UTC)

It is still not clear to me (and I can't find it on any pages related to these properties) why I should add constraint status (P2316): mandatory constraint (Q21502408) to property constraint (P2302): required qualifier constraint (Q21510856). Both say that the listed qualifiers are mandatory, or am I missing something? Wostr (talk) 13:34, 6 March 2018 (UTC)

One is to indicate that this would be the dream situation, the qualifier just marks that dream as currently fulfilled. Sjoerd de Bruin (talk) 17:27, 7 March 2018 (UTC)
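The point about keeping one constraint statement per required qualifier can be illustrated with a rough Python sketch. It models each required qualifier constraint (Q21510856) as its own record so that constraint status (P2316) can differ per qualifier. The qualifier IDs `P_X` and `P_Y`, the dictionary layout and the function name are hypothetical, chosen only for illustration; this is not how the constraint checker is actually implemented.

```python
# Hypothetical in-memory model; P_X and P_Y are placeholder qualifier IDs.
REQUIRED_QUALIFIER = "Q21510856"   # required qualifier constraint
MANDATORY = "Q21502408"            # mandatory constraint

# One constraint statement per required qualifier, each with its own status.
constraints = [
    {"P2302": REQUIRED_QUALIFIER, "P2306": "P_X", "P2316": MANDATORY},
    {"P2302": REQUIRED_QUALIFIER, "P2306": "P_Y", "P2316": None},
]


def check_required_qualifiers(statement_qualifiers, constraints):
    """Return (missing qualifier, level) pairs for one statement."""
    violations = []
    for constraint in constraints:
        if constraint["P2302"] != REQUIRED_QUALIFIER:
            continue
        if constraint["P2306"] not in statement_qualifiers:
            level = "mandatory" if constraint["P2316"] == MANDATORY else "regular"
            violations.append((constraint["P2306"], level))
    return violations


# A statement that carries only the P_X qualifier.
print(check_required_qualifiers({"P_X"}, constraints))
```

If both qualifiers were listed in a single constraint statement, the status would apply to the statement as a whole, so the two levels above could not be told apart.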

Deriving text statistics (such as readability scores)

I am currently developing a few bots which have access to written text, but currently ignore/discard it. For items such as Request for Comments (Q212971), news article (Q5707594), treaty (Q131569), statute (Q820655) where the content of written text (minus headers, footers, page numbers, image captions, etc) can be separated easily, it would be easy to generate some text statistics including various readability test (Q2114712) and counts of word (Q8171), sentence (Q41796), quotation (Q206287), syllable (Q8188) and perhaps even phoneme (Q8183) (if the text were spoken). A library such as textstat (Python) could be used to generate readability test (Q2114712) scores, and extraction of other metadata about text is possible using other readily available software. The difficulty I have is how this information could be included within Wikidata in a way that has accurate and reliable sourcing. The source I had in mind was the value of full work available at URL (P953) (the text itself) with determination method (P459) linking to a readability test (Q2114712) or other method, as well as a specific implementation of text statistic extraction software (including version/build numbers that would be needed to fully replicate the result). Otherwise, a web service on wmflabs (or elsewhere) could accept a URL from a supported domain, extract the text content, and return a page or data with the statistics generated. The source would then be reference URL (P854) linking to the web service results page. Does anyone have feedback or ideas on whether this idea is viable and suitable for Wikidata? Dhx1 (talk) 02:08, 7 March 2018 (UTC)

Interesting idea. Currently, I don't think we have much beyond intended public (P2360) and ratings for films. If the score is reproducible by others with the same or a similar tool, I think it could be included. Maybe in a dedicated property.
--- Jura 06:28, 7 March 2018 (UTC)
- @Jura1: Properties which could be created for these statistics include Flesch–Kincaid readability tests (Q2521552), Gunning fog index (Q3798188), SMOG (Q7391268), Automated Readability Index test (Q17014397), Coleman–Liau (Q5143047), Linsear Write (Q6554793) and Dale–Chall readability formula (Q5210828) (all supported by the Python textstat library). Dhx1 (talk) 07:08, 7 March 2018 (UTC)
  - The question is if we should make one for each or a single one for any "readability" linking the items for the method and adding the score as a qualifier. The item could be about the version of the scoring used.
    --- Jura 08:12, 7 March 2018 (UTC)
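As a rough illustration of the kind of statistics discussed in this thread, here is a minimal Python sketch that computes word, sentence and syllable counts plus a Flesch reading ease score. It deliberately avoids the textstat library and uses a crude vowel-group heuristic for syllables, so its scores will generally differ from textstat's; the function names and the returned dictionary layout are assumptions made for this example. The sensitivity to the counting method is one reason the exact implementation and version would need to be recorded alongside any stored score.

```python
import re


def count_syllables(word):
    """Crude heuristic: one syllable per group of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_statistics(text):
    """Return basic counts and a Flesch reading ease score for English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text contains no words or sentences")
    syllables = sum(count_syllables(w) for w in words)
    # Flesch reading ease: 206.835 - 1.015 * (words / sentences)
    #                              - 84.6 * (syllables / words)
    score = (206.835
             - 1.015 * (len(words) / len(sentences))
             - 84.6 * (syllables / len(words)))
    return {
        "words": len(words),
        "sentences": len(sentences),
        "syllables": syllables,
        "flesch_reading_ease": score,
    }


stats = text_statistics("The cat sat. The dog ran.")
print(stats)
```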

11 days left to submit proposals for Wikimania 2018

Hello all,

As every year, the call for submissions for Wikimania 2018 is running, and the deadline for talks, workshops and posters is March 18th. We hope that the Wikidata community will be well represented in the program and in the attendees.

This year, the system of submission goes through OpenChair, therefore the submissions will not be public. You won't be able to check other Wikidata-related submissions. So if you plan to submit something, I encourage you to describe your idea on Wikidata:Wikimania 2018. On this page, you can also see the ideas of other people, and suggest your help for one of them.

Here are a few ideas that have been suggested by the development team, where we would love to support volunteers: SPARQL workshop for beginners, SPARQL workshop for confirmed users, how to model lexicographical data for your language, explaining the community processes on Wikidata, showing useful tools around Wikidata...

When you do your submission through the official process, feel free to also edit Wikidata:Wikimania 2018 and add your idea into the "proposals submitted" section. That will help the other editors to avoid duplicates.

Last but not least, I also added a section "I'm attending", to have an overview of who from the Wikidata community will attend Wikimania this year. Please register here if you already know (even if you're waiting for a scholarship result, for example). This list can also help you find more volunteers to run a workshop or participate in a discussion.

Thank you very much, Lea Lacroix (WMDE) (talk) 09:18, 7 March 2018 (UTC)

Wikinews categories

Our current practice is to connect article sitelinks to one item (for example Senegal (Q1041)) and category sitelinks to another item (for example Category:Senegal (Q6975863)). The Wikinews people keep moving the sitelinks to Wikinews categories from the category item to the article item. This messes up our data structure here on Wikidata and I don't think we have consensus on this project that we want this. We only tolerated it in the past because, without arbitrary access, they had no other way to show links to Wikipedia articles on Wikinews categories. With a template the Wikinews people can show whatever links they want on their categories without needing to move links around here. It's just a matter of copying over Commons:Template:Interwiki from Wikidata and Commons:Module:Interwiki to Wikinews and updating it to suit their needs. Let's get this sorted out. Multichill (talk) 15:55, 24 February 2018 (UTC)

@Multichill: What's the position if there otherwise is no category-item here? Are Wikinews people forced to create one to match their category, or (like Commons) are they fine to link to the article-item in such a circumstance?

Also, what is the harm in systematically linking from article-item here to a category there? What is the benefit in preventing such links? Wikinews articles are all designated instance of (P31) Wikinews article (Q17633526), so a regular item here is not going to be linked to both a category there and to a news article. If there is a story, the news article will have an item of its own. There is no chance of a collision. Why is there therefore any advantage in their not linking a subject to a regular item here?

We have to use Commons:Template:Interwiki from Wikidata and Commons:Module:Interwiki on Commons because there are sometimes gallery pages there. But there are (I think?) no equivalents of gallery pages on Wikinews. So why add this clunky indirection, when a regular sitelink would do the job just as well?

There is also a difference with Commons, in that if there is a Commons category (P373) statement on an item, then most connected Wikipedias will directly show a sitelink from their article to the Commons category. But, as far as I am aware, no equivalent mechanism is in place for Wikinews, so if there is no sitelink from the article-item, then there will be no sitelink to Wikinews at all shown on the Wikipedia item.

That to me makes it entirely understandable that Wikinews editors would seek to link from article-items to their subject categories. I don't see any particular good reason to stop them. Jheald (talk) 16:41, 24 February 2018 (UTC)

Create a category just like for Wikipedia. What I'm saying is not something new, I'm just getting rid of an exception that has grown. Exceptions are an indication the data modeling is wrong. Wikinews makes our data inconsistent. Part of the categories are like Category:Royal Air Force (Q7404780) and links keep getting moved around. If Wikipedias want to link to Wikinews they can still do it. Multichill (talk) 18:38, 24 February 2018 (UTC)

Or we could just say: if article-items are systematically a better sitelink for these pages, then go for it. For all of them. Site-wide. What is the downside?

And you didn't answer my first question: What is the position if there otherwise is no category-item here? Are Wikinews people forced to create one, or (like Commons) are they fine to link to the article-item in such a circumstance? What does that serve, other than create a redundant item that links to nothing and has no meaningful statements on it? Jheald (talk) 19:46, 24 February 2018 (UTC)

I'd thought this was settled years ago. Wikinews topic categories correspond to Wikipedia articles, just as Wikisource author pages do.

As a practical matter, is there a way to propose deletion, or merging, of spurious Wikidata items such as Q47478970? --Pi zero (talk) 14:27, 25 February 2018 (UTC)

There are three types of categories on Wikinews:

Date categories (e.g. Category:January 1, 2018 (Q46451877)), which are typical categories with category's main topic (P301): January 1, 2018 (Q45919493)
Maintenance categories (Category:Candidates for speedy deletion (Q5964)), which are used across WMF projects
Thematic categories (Zimbabwe (Q954)), which should be merged with the same categories (Category:Zimbabwe (Q6983038)) on other projects and linked with the topic via category's main topic (P301), but are merged with articles.

A little bit schizophrenic, isn't it? JAn Dudík (talk) 08:36, 27 February 2018 (UTC)

A clarification, for the benefit of third parties who might be reading this (so misinformation doesn't sit here unremarked): A Wikinews topic cat is the primary page on that project associated with its topic, just as a Wikipedia article is the primary page on its topic. To state what should be obvious, the purpose of sister links is to help readers, when looking at a page on one sister project, to find corresponding pages on other sisters, and patently that means leading readers in either direction between the Wikipedia article on Zimbabwe, the Commons category for Zimbabwe, and the Wikinews category for Zimbabwe. --Pi zero (talk) 13:13, 27 February 2018 (UTC)

The general guideline for Wikinews seems to be that its categories go with Wikipedia articles: see Wikidata:Wikinews/Development#Interproject links for people new to the question.
--- Jura 17:09, 27 February 2018 (UTC)

We have consensus that usual Wikinews category (news on the topic) is the same as the encyclopedic article on the topic in Wikipedia or a list of quotes on the topic in Wikiquote, for example. There is the word "category" only because of the Wikimedia engine. Identical entities must be linked directly with each other. The reverse spoils our data structure.
In addition, many Wikinews categories, at least in the Russian edition, deeply use information from Wikidata items for description, categorization and design. Changing these links will automatically destroy almost half the project. --sasha (krassotkin) 13:25, 3 March 2018 (UTC)

This consensus was established because arbitrary access was missing at that time. What is the difference between Wikinews categories and e.g. Wikiversity categories? Or Wikisource categories? JAn Dudík (talk) 19:14, 7 March 2018 (UTC)

The reference to Wikiversity and Wikisource is specious.

A Wikinews topic category is the focal page on the project for that topic, just as a Wikipedia article is. If Wikidata means to be helpful to sister projects, and to readers of those projects, there is no question that a Wikinews topic category is associated with the Wikipedia article; any other choice of mapping between the two projects would be actively deceptive. --Pi zero (talk) 01:45, 8 March 2018 (UTC)

Create "cmd.exe command" item or add "instance of: command", "part of:cmd.exe" to every cmd command?

I have already created "cmd.exe command" and "command.com command" items and I have started to add all their commands to them. However, some commands are available in some Windows versions and not others, so I would have to create "Windows 7 cmd.exe command", "Windows 8 cmd.exe command", "Windows 7 PowerShell command", "Windows 8 PowerShell command"... The same result can be reached by adding the properties "instance of: command" and "part of: cmd.exe | command.com | powershell" to every command item. Which of the two approaches is better?--Malore (talk) 16:24, 3 March 2018 (UTC)

WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

@Malore: Out of curiosity, do cmd.exe and command.com still evolve? I doubt there is any problem for them.

Regarding « The same result can be reached adding the properties "instance of: command" and "part of: cmd.exe" »: that is OK if you have a « Windows 8 cmd.exe » item, but if you just have « cmd.exe » it’s not as expressive.

Apart from this, it does not seem a good idea to tie PowerShell command versions to Windows versions; it would be better to tie them to the PowerShell version directly (see https://en.wikipedia.org/wiki/PowerShell#Versions ) or to « Windows Management Framework », as there are several Windows variants (server and so on) and each of these may or may not include a PowerShell version.

For a more direct answer to your question, I wrote the template {{All instances}}, which allows, from items like COMMAND.COM command (Q50320434), to list all commands (or instances of the class), whether or not they are explicitly instances of the class. For example: all instances of « command.com commands ». See the documentation of the template for more information on how it works, but in summary both of your proposed approaches would work with this template. author TomT0m / talk page 13:26, 5 March 2018 (UTC)

@Malore: Are you trying to model the abstract specification of commands interpreted by "cmd.exe" (parameters, effect, etc)? Or are you trying to model different versions of "cmd.exe" software? If modelling commands of "cmd.exe", then create a new item each time a new release of "cmd.exe" changes the parameters, effect or other behaviour of a command. If modelling software, create a new item for each version/build of "cmd.exe". Dhx1 (talk) 11:11, 5 March 2018 (UTC)

@TomT0m: My main fear was to create too many items (like "Windows 8 cmd.exe command") that turned out to be useless because the same result could be easily achieved by a more versatile query (like your template). Thank you very much for the template.--Malore (talk) 00:41, 8 March 2018 (UTC)

@Malore: My template needs the items anyway to generate the queries, whether or not they are used in « instance of » statements. The solution of using those items comes with fewer statements, however. (I’m developing the converse function in Module:class: searching by « main snak » instead of searching by class. The idea is that we find items if they have such a statement with a main snak <prop:val>, or if they are instances of a class that declares that its instances have a <prop:val> statement.) author TomT0m / talk page 07:41, 8 March 2018 (UTC)
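The idea of finding instances either explicitly or through a declared <prop:val> statement can be sketched in Python. This is only an illustration of the logic described above, not the implementation of {{All instances}} or Module:class. The item store, the labels used in place of real item IDs, and the function name are all hypothetical, and the sketch skips checking that an implied instance is also a "command".

```python
# Hypothetical item store; labels stand in for real item IDs.
# P31 = instance of, P361 = part of.
ITEMS = {
    "dir":  {"P31": ["cmd.exe command"]},                 # explicit instance
    "copy": {"P31": ["command"], "P361": ["cmd.exe"]},    # implied instance
    "ls":   {"P31": ["command"], "P361": ["bash"]},       # not a cmd.exe command
}

# What each class declares about its instances, as a (property, value) pair.
CLASS_DECLARATIONS = {
    "cmd.exe command": ("P361", "cmd.exe"),
}


def all_instances(class_label):
    """List items that are explicit or implied instances of a class."""
    prop, value = CLASS_DECLARATIONS[class_label]
    found = []
    for name, claims in ITEMS.items():
        explicit = class_label in claims.get("P31", [])
        implied = value in claims.get(prop, [])
        if explicit or implied:
            found.append(name)
    return sorted(found)


print(all_instances("cmd.exe command"))
```

Either modelling approach from the original question ends up in the result, which is why the class item is still needed even when no item links to it with « instance of ».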

Bot flag request

Hello! I have left a request for a bot flag at Wikidata:Requests for permissions/Bot, but it has gone unnoticed. Please take a look at it. --Tohaomg (talk) 14:01, 8 March 2018 (UTC)

Attaching a property to a Mix'n'Match catalogue

Now that TMDB movie ID (P4947) has been created, how do we update catalogue 1066 in Mix'n'Match to use it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:22, 14 March 2018 (UTC)

You have to ask on Magnus's talk page. --Edgars2007 (talk) 12:51, 14 March 2018 (UTC)

Now done. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:00, 14 March 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:00, 14 March 2018 (UTC)

Lift the ban of fields synthesized from other fields

Currently we ban the creation of a data field containing a url if it can be synthesized from information from another field. It would be much better if Wikidata had a field called "Worldcat url" and "CIA World Fact Book url". For instance "Worldcat url" should be synthesized automatically from the LCCN_ID into a clickable url stored directly in Wikidata. We only get a link to Worldcat if that person has an entry in Wikipedia where it is synthesized on the fly from Wikidata. Some people use Wikidata directly as a source of information. These proposals were previously dismissed.

We should have a field called Worldcat_url using LCCN to create a url that directs us to an author's Worldcat entry. Worldcat uses LCCN_ID to create a url that contains a bibliography. The link should be synthesized for each person with a Worldcat entry and it should appear in their Wikidata entry. See: the entry at Worldcat for John Howard Lindauer using Library of Congress Control Number (Q620946) to synthesize the url http://www.worldcat.org/identities/lccn-n50051493/ for an entry on John Howard Lindauer II (Q6240065). Wikipedia displays this link in the Authority Control box at the bottom of the article, but we have over 1,000 entries for people with LCCN numbers that do not appear in Wikipedia. We should be able to access their WorldCat entry directly from Wikidata. The creation of the url can be automated 100%. If the url comes back 404, we just have the bot delete it.
We should have a field called CIA_World_Fact_Book_url using the two letter country code embedded in a url. CIA World Fact Book uses the two letter country code to create a url with exhaustive facts on that country. This link should be available from each country entry directly in Wikidata. See: the entry at the CIA World Fact Book for Albania using The World Factbook (Q11191) to synthesize the url https://www.cia.gov/library/publications/resources/the-world-factbook/geos/al.html for an entry on Albania (Q222). Currently Wikipedia does not display this link.

-- RAN (talk) 17:13, 6 March 2018 (UTC)

We cater for links like http://www.worldcat.org/identities/lccn-n50051493/, using third-party formatter URL (P3303), as I have just added here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:39, 6 March 2018 (UTC)

Maybe what we need is a way to create special "derivative" properties that just rely on third-party formatter URL (P3303) attached to an existing property? third-party formatter URL (P3303) on its own seems to be insufficient - at least the ability to attach a label etc. Maybe something can be done with qualifiers on third-party formatter URL (P3303)? Wikipedia templates that use these things need some mechanism to select the right formatter url... ArthurPSmith (talk) 18:44, 6 March 2018 (UTC)

I had a proposal for that at meta:2017_Community_Wishlist_Survey/Wikidata#Create_a_new_class_of_statements_which_are_automatically_generated_based_on_a_query but it did not have much support. --Jarekt (talk) 20:46, 6 March 2018 (UTC)

@Jarekt: I would have supported it if I had known it existed; unfortunately I did not pay much attention to the wishlist this time. Did you make enough noise in here about this ;) ? author TomT0m / talk page 13:52, 7 March 2018 (UTC)

Wikidata items are not built for reading directly. Forking data would not be helpful. Use Lua or a userscript to change display where necessary. --Yair rand (talk) 19:20, 6 March 2018 (UTC)
- Yes - but, this seems to be a common complaint recently. How can we make that work better/easier? Formatter URLs are handled specially by Wikidata to provide links for IDs; can we make the 3rd party ones more functional somehow? Just doing it in Lua makes our P3303 entries worthless, you just code the URL directly. ArthurPSmith (talk) 20:15, 6 March 2018 (UTC)
  - Not sure I agree on "worthlessness" of P3303. Significantly easier to build a URL in something like SPARQL or Listeria or Reasonator or an external app if there's a formatter template, as per e.g. Property_talk:P1630#Using_from_within_WDQS.
  Also perhaps worth noting that formatter URI for RDF resource (P1921) is now used to create a fully-fledged linked-data url for external IDs, accessible from SPARQL via eg p:P1014/psn:P1014.
  
  And I don't think I agree with User:Yair rand either, that direct readability of Wikidata items is irrelevant. I suspect the take-up of relative position within image (P2677) would be a lot stronger if there was a formatter linking the value directly to an immediately visible image detail. Jheald (talk) 23:53, 6 March 2018 (UTC)
You are assuming that a typical user would know how to construct a query to get information that they do not know even exists. When I first asked about the Worldbook ID half of the responders did not know where the value was located. Are we running out of server space? I do not see any down side to this at all. It seems the no votes are against it for ideological reasons, not practical reasons. And, yes, people do use Wikidata directly because many entries do not appear in Wikipedia. Just as I use VIAF directly to identify people in the Library of Congress image collection. The practical usefulness of having the link in the entry on that person should override objections based on the fact that the information can be synthesized if the end user knows that the data exists, and where it is stored, and can construct a query using one of our tools, assuming that they know the query tool exists. --RAN (talk) 00:04, 7 March 2018 (UTC)

A middle-ground solution might be to extend the external identifier datatype to allow several formatter URLs? @Lea Lacroix (WMDE): ? author TomT0m / talk page 13:55, 7 March 2018 (UTC)

How do you imagine that to work? How would the software know when to use which formatter URL? Lea Lacroix (WMDE) (talk) 16:18, 7 March 2018 (UTC)

There could be a « main » formatter, which would be used exactly as the current one, and a list of « secondary » ones, used to generate as many « secondary » URIs. In RDF the values would be expanded only on the full value of the statement and not as a truthy one. I have no idea about the display on Wikidata pages, however. I guess for usability it’s best to display all the URIs without the user having to click. Don’t know if that would be OK for the UI :) author TomT0m / talk page 17:39, 7 March 2018 (UTC)

We can already add multiple formatter URLs; and we have third-party formatter URL (P3303). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:15, 8 March 2018 (UTC)
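For readers unfamiliar with how a formatter URL turns an identifier into a link, here is a minimal Python sketch of the substitution. The two formatter strings are assumptions inferred from the example URLs quoted earlier in this thread, not values read from any property, and the function name is made up for this example. Percent-encoding the identifier is likewise a choice made here for safety rather than a documented Wikidata behaviour.

```python
from urllib.parse import quote


def expand_formatter(formatter_url, identifier):
    """Replace the $1 placeholder in a formatter URL with an identifier."""
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    # Percent-encode the identifier so unusual characters cannot break the URL.
    return formatter_url.replace("$1", quote(identifier, safe=""))


# Assumed formatter strings, inferred from the example URLs in this thread.
WORLDCAT_FORMATTER = "http://www.worldcat.org/identities/lccn-$1/"
FACTBOOK_FORMATTER = (
    "https://www.cia.gov/library/publications/resources/"
    "the-world-factbook/geos/$1.html"
)

print(expand_formatter(WORLDCAT_FORMATTER, "n50051493"))
print(expand_formatter(FACTBOOK_FORMATTER, "al"))
```

Because the link is fully determined by the stored identifier and the formatter, nothing new needs to be stored per item; the open question in this thread is only where and how such derived links should be displayed.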

I want the url to his worldcat entry to appear in his Wikidata entry at John Howard Lindauer II (Q6240065). I still do not understand why we do not want that. RAN (talk) 20:58, 8 March 2018 (UTC)
- Part of the problem with third-party formatter URL (P3303) is that currently we don't record very systematically how that third-party link should be labelled. Sometimes I see that we give operator (P137) as a qualifier on the property item; another possibility would be to use the stem of the 3rd party URL. Having taken a decision on how to label such a link, I see no reason why a gadget could not be written to display it, with that label, immediately below the external ID statement for an item with such a link, immediately above any qualifiers. If it turned out to be a useful and popular feature (as I think it might well), then I can see it might very likely make its way into the User Interface. I agree that it does seem a shame that we store the information needed to create a 3rd party link, but don't actually display it on the item. Jheald (talk) 21:51, 8 March 2018 (UTC)
  - It seems to me that Reasonator might be a better venue for such links. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:05, 8 March 2018 (UTC)
- Why do you want that? What is the use case? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:05, 8 March 2018 (UTC)

Grant proposal for Black Lunch Table

Hello, all! I am co-founder of The Black Lunch Table project. The Black Lunch Table (BLT) is an ongoing collaboration between artists Jina Valentine (Fishantena (talk • contribs • logs)) and myself which intends to fill holes in the documentation of contemporary art history. In its 10 year existence, the BLT has taken a variety of forms relating to this most recent iteration, in the form of the Wikipedia edit-a-thon. BLT’s aim is the production of discursive sites (at literal and metaphorical lunch tables), wherein cultural producers of color engage in critical dialogue on topics directly affecting our communities. They endeavor to create spaces, online and off, mirroring the activity and creativity present in sites where Blackness and Art are performed.

We have been using Wikidata to cull our task lists and hope to improve and evolve that usage according to consensus and Wiki standards in the next few months.

I am not super familiar with Wikidata protocol but hope this is the appropriate place to ask for your feedback and endorsement on our grant application: meta:Grants:Project/BlackLunchTable/BLT 2018

Thanks so much!

--Heathart (talk) 17:52, 9 March 2018 (UTC)

i need a little help to get started with project please

sorry can I PLEASE GET A LITTLE HELP TO BE STARTED WITH PROJECT PLEASE – The preceding unsigned comment was added by M1ckm (talk • contribs) at 12:05, 16 March 2018‎ (UTC).

First of all, you should sign your posts and avoid using CAPITALS. Second, if you are asking for help, it is better to explain clearly what the problem is. --Jasc PL (talk) 12:19, 16 March 2018 (UTC)

@Jasc PL: You can use {{Unsigned2}} to add missing sigs to talk pages, and {{Welcome}} to welcome and advise new users, on their talk pages. Both should be "subst"; and I've applied both in this case. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:31, 16 March 2018 (UTC)

@Pigsonthewing: Thanks Andy, I forgot about it. --Jasc PL (talk) 15:01, 16 March 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:31, 16 March 2018 (UTC)

Things covered by Commons categories, but not by Wikidata

When trying to improve coverage of sleds, I noticed that Commons had many types of sleds Wikidata was lacking. As many can have items at Commons, I created the relevant ones at Wikidata. Obviously, we don't have an equivalent of (non existing) "Category:Green sled pulled by girl", but cutter sleigh (Q50181142) should be here.

The other day, another user improved the coverage of ships with much content from Commons.

As Commons is likely to rely more on Wikidata items to describe things, I think we should attempt to reach coverage similar to that of Commons categories. The question is in which fields we need to improve. What are your suggestions?
--- Jura 11:57, 7 March 2018 (UTC)

Aircraft are an obvious case to look at. But in general, we can do a lot of good by improving the sitelinks to Commons, and identifying cases (there are a lot of them) where commons has images of things that don't have P18 values here. Thanks. Mike Peel (talk) 13:14, 7 March 2018 (UTC)

Commons has a lot of categories which are intersections of multiple concepts (c:Category:Portrait paintings of women of Spain in national costumes is the intersection of portrait (Q134307), woman (Q467), Spain (Q29) and traditional costume (Q3172759)). Those should probably not have items unless we already have items based on other projects (see list of French artists (Q3246016)). But single-topic Commons categories could be a great source of candidates for new items. I agree with the need to improve image (P18) coverage, but another very important and massive task is to clean up constraint violations for Commons-related properties:

Wikidata:Database reports/Constraint violations/P18: out of control but now broken
Wikidata:Database reports/Constraint violations/P373: out of control
Wikidata:Database reports/Constraint violations/P935: out of control but now broken
Wikidata:Database reports/Constraint violations/P1472: under control
Wikidata:Database reports/Constraint violations/P1612: mostly under control

The Wikidata:Database reports/Constraint violations/P373 report is especially troubling, since that property is used so much. User:Ivan A. Krestinin, what should be done to get the P18 and P935 constraint violation reports working again? --Jarekt (talk) 14:50, 7 March 2018 (UTC)

I'm not sure if cleaning up P373 would help us identify categories that aren't covered yet. Maybe a cleaner version would make filtering easier, but to find an actual gap?
Going through possible images for sleds I did find a few items that we needed, but these didn't necessarily have categories. Sometimes there are simple things Wikidata lacks, e.g. tennis shoe (Q48978644). In the meantime, that one even got an identifier.
--- Jura 19:18, 7 March 2018 (UTC)

Cleanup of P373 and P18 is not going to help with sled items. However, we need to get the number of issues under control, since poor data quality affects people's experience. Links to non-existent categories or images can break tools or just be annoying. --Jarekt (talk) 19:35, 7 March 2018 (UTC)

I'm not sure if it's worth fixing P373. Eventually, we might just drop it. As for P18, maybe some development is needed to ensure that this keeps working. Some checks could also be done by various WikiProjects, e.g. Wikidata:WikiProject Q5/reports/identical P18.
--- Jura 19:53, 7 March 2018 (UTC)

Jura, if you look at Property_talk:P373, at the section that lists all the templates that use P373, you will see a few hundred templates. Since sitelinks to Commons are unpredictable as to which namespace they will link you to, P373 becomes like a sitelink, which is unfortunately stored as a string. We cannot "just drop it". Since it is the main way of connecting other Wikipedia projects to Commons, it would be great if we could fix the backlog of constraint violations, as many of them indicate real issues with the data. --Jarekt (talk) 17:54, 8 March 2018 (UTC)

I do think Jura is absolutely right that there's a huge amount to gain from systematic comparison with the Commons category structure. Because Commons deals with pictures of physical objects, the category structure there is often more detailed, more systematic, and more complete than Wikidata categories -- but (perhaps because of the previously uneasy question of sitelinks) it hasn't had nearly the same attention paid to extraction.

In many ways the opportunities are similar to some recent exploration I've been doing with thesauruses, another hierarchical resource that (in most areas) we haven't measured ourselves up against nearly as comprehensively as we could have done. Using the topics of Wikidata:WikiProject Fashion as a test area (which has done quite a lot of benchmarking), I think it is quite useful to be able to generate hierarchical listings like Wikidata:WikiProject_Fashion/Taxonomy/aat and Wikidata:WikiProject_Fashion/Taxonomy/efv to reveal how much of the hierarchy has been matched; and also, by populating broader concept (P4900) to create a local representation of external hierarchy, quite useful to see where there are parent relations in the external hierarchy that as yet are not matched by parent relationships in our own hierarchy; and, vice-versa, parent relationships in our own hierarchy that do not correspond to any parent relationship in the external hierarchy. Sometimes this just reveals different modelling decisions, or apparent missing relationships in the external hierarchy; but quite often it can reveal incorrect matchings, or questionable relationships, or missing relationships in our own hierarchy. It would seem an obvious available opportunity, to benchmark in a similar way against the Commons hierarchy. And of course, the more we can link Commons categories to objects here, then the more help that also gives us towards understanding the contents of those categories with a view to Structured Data, as well as the possibilities now to add a multilingual Wikidata-driven infobox or other templates to the Commons category.

But the question is, what are good techniques or approaches for finding Commons categories missing Wikidata items? (And, ideally, for matching them?)

In some work I did last year, working on settlements and civil parishes in the UK, I found SQL queries like https://quarry.wmflabs.org/query/17609 and https://quarry.wmflabs.org/query/17610 quite useful, that look down the category tree several levels, and then look back up one level to find out the categories that those categories several levels down are in. This helps to identify the categories in the tree that are still settlements, compared to those that are churches or some other thing of interest -- something that is quite useful if there is a particular hierarchy one is focussing on.

I was comparing (offline) the categories returned for parishes and settlements with the values of P373s for parishes and settlements on Wikidata, to see if there were ones in the Commons tree that didn't have incoming P373s, and whether there were Wikidata items that might match them, or whether new Wikidata items ought to be created.

But one can also go the other way, by including in results of the SQL query a column for whether the categories have a Wikidata sitelink. In some ways this can be more reliable, because of the potential data issues with P373s. But to make it work, it does mean that it's really helpful if as many categories as possible have sitelinks, if the appropriate target can be identified. (eg by harvesting P373s to categories with no sitelinks). I do think it would be a useful step if we could try to build up this number, perhaps as a bot job. The last time I looked, six months ago, there were about 1,400,000 Commons categories that could be identified with article-items; but only about 740,000 of those could be identified with a sitelink, either directly (540,000) or via a category item and then a category's main topic (P301). I think it would be quite useful to increase those numbers -- and could probably be done as a bot job, or a QuickStatement process over a few days.

(to be continued -- but do jump in, if you have good approaches for identifying Commons categories missing a Wikidata item). Jheald (talk) 21:33, 8 March 2018 (UTC)
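The offline comparison described above (categories found in the Commons tree versus the P373 values already on Wikidata) can be sketched roughly as follows. This is only an illustration under assumptions: the function names and the sample category titles are hypothetical, and the inputs are taken to be plain lists already exported from a Quarry query and from Wikidata.

```python
def normalise(title):
    """Treat underscores and spaces in category titles as equivalent."""
    return title.replace("_", " ").strip()


def find_unlinked_categories(commons_categories, p373_values):
    """Return Commons categories that no exported P373 value points to.

    commons_categories: category titles returned by the tree query.
    p373_values: Commons category (P373) strings harvested from Wikidata.
    Both inputs are hypothetical exports, not live API results.
    """
    linked = {normalise(value) for value in p373_values}
    candidates = {normalise(title) for title in commons_categories}
    # Anything in the tree without an incoming P373 is a candidate for
    # matching to an existing item or for a new item.
    return sorted(candidates - linked)


if __name__ == "__main__":
    tree = ["Example_village", "Example parish", "Another example village"]
    p373 = ["Example village"]
    print(find_unlinked_categories(tree, p373))
```

The result is only a list of candidates; each one would still need a manual or tool-assisted check for an existing matching item before anything is created.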

Discussion (and RfC) advertised on Commons, at c:Commons:Village_pump#Commons_categories_and_Wikidata_notability Jheald (talk) 23:48, 8 March 2018 (UTC)

Maybe another sample to mention could be lighthouses: when expanding our lists at the WikiProject, we created some 500-1000 items for lighthouses only found in Commons categories. Some of these were described in other articles at (English) Wikipedia (e.g. in articles about islands), but none of these had items at Wikidata. In the meantime, several contributors have substantially expanded these items.
--- Jura 13:30, 10 March 2018 (UTC)

How vandalism in wikidata affect local wikis

Yesterday at 01:42 someone renamed Russia (Q159) to "mainkra". It was reverted within an hour, but unfortunately the vandalized version somehow spread into ru-wiki infoboxes (probably because of mysterious cache algorithms). Right now Google shows that thousands of articles are still affected (see [8]). I understand that this might not be a top priority issue for the Wikidata community, but is there anything we can do to decrease the probability of similar incidents in the future? For instance, what is the reason why we allow anonymous contributions to highly used items like Russia (Q159)? --Ghuron (talk) 12:47, 9 March 2018 (UTC)

There is a previous anonymous vandalism [9]. — Finn Årup Nielsen (fnielsen) (talk) 14:52, 9 March 2018 (UTC)

Unfortunately, this is not only a problem of infoboxes – in mobile version of Wikipedia, description comes from WD and it is not unusual that IPs are changing descriptions in controversial topics (like Cursed soldiers (Q1320177) to bandits and murderers and then to heroes) and such descriptions are visible for many other users (funnily, I don't know how to edit description from mobile version of Wiki site, but IPs manage to do that in some way). There should be some Flagged Revisions or similar thing for IP edits. Wostr (talk) 15:03, 9 March 2018 (UTC)

Semi-protection looks to be the easiest decision. Ghuron, you can ask for it there. - Kareyac (talk) 16:00, 9 March 2018 (UTC)

Ah, already done. - Kareyac (talk) 16:04, 9 March 2018 (UTC)

I believe we need much more widespread protection like indef semi-protect for all elements with Id below 10000. Or for all elements used on 100 pages in local wikis (if usage stats are accurate) --Ghuron (talk) 16:54, 9 March 2018 (UTC)

Since vandalism on Wikidata has the potential to be multiplied when re-used, being generous with semi-protection sounds reasonable. Basing it on usage stats rather than Q IDs below a certain figure would be more rigorous. Richard Nevell (talk) 15:24, 10 March 2018 (UTC)
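To make the two selection rules being compared here concrete (items used on at least 100 pages, or items with an ID below 10000), a rough sketch follows. The function name and the usage figures are hypothetical; real usage counts would have to come from actual usage statistics, whose accuracy is questioned above.

```python
def protection_candidates(usage_counts, min_usage=100, max_qid=None):
    """Pick items that meet a usage threshold or, optionally, an ID cutoff.

    usage_counts: mapping of item ID (e.g. "Q159") to the number of
    local-wiki pages using it (hypothetical figures in this sketch).
    max_qid: if given, also select items whose numeric ID is below it.
    """
    selected = []
    for qid, pages in usage_counts.items():
        number = int(qid[1:])  # strip the leading "Q"
        by_usage = pages >= min_usage
        by_id = max_qid is not None and number < max_qid
        if by_usage or by_id:
            selected.append(qid)
    return sorted(selected, key=lambda q: int(q[1:]))


if __name__ == "__main__":
    usage = {"Q159": 50000, "Q42": 10, "Q50181142": 3}
    print(protection_candidates(usage))                 # usage rule only
    print(protection_candidates(usage, max_qid=10000))  # usage rule or ID rule
```

The ID rule also catches low-numbered items that are barely used, which is one way to read the remark that usage stats would be the more rigorous basis.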

Some time ago I put a "ticket" on the Administrators' noticeboard and also asked some questions about patrolling; it looks like no users were interested in doing this... --Jasc PL (talk) 17:03, 9 March 2018 (UTC)

We actually cannot semi-protect (or, at least, not indefinitely semi-protect) the most common items, since many contributors in the projects are not autoconfirmed on Wikidata and will not be able to add new articles if needed, and if they move articles the results will not be visible on Wikidata. The only way forward currently is to patrol vandalism on Wikidata. Btw, what you describe looks like a caching issue. Of course, if an item is vandalized on Wikidata the vandalism immediately becomes visible in infoboxes using the item, but once the vandalism is reverted it should no longer be visible, except for caching delays.--Ymblanter (talk) 22:33, 10 March 2018 (UTC)

For some reason, several Wikipedias have FlaggedRevisions enabled for every article. This has advantages as well as disadvantages, but right now these two systems (Wikidata vs Wikipedia with FR) are not consistent. What's more, vandalism that comes from Wikidata to the Wikipedias is a problem that Wikipedia users cannot easily deal with: it is visible seconds after it is added and bypasses the protection of Wikipedia's FlaggedRevisions. It is also harder to notice from a regular Wikipedia editor's point of view (enabling WD edits in the watchlist litters it with irrelevant edits in many languages, so this is rarely used, and there is no real tool for WP users to check the data that is imported from WD) and harder to revert (WD is not easy for regular WP users and most of them do not want to edit in another project). And I can agree with you that any protection is not an option here; there should be (and by 'should be' I mean 'years ago') a systemic solution that would prevent such situations, either on the WD end or on the WP end. FlaggedRevisions for WD would be a very bad solution, even if enabled for only some items, but maybe FR could be somehow modified to hold WD edits from view until reviewed on the WP end. The current situation is leading to a point at which further integration of WD and some WPs will be halted. Wostr (talk) 01:57, 11 March 2018 (UTC)

I agree this is something worth discussing, but I have no idea what the solution could look like.--Ymblanter (talk) 07:59, 11 March 2018 (UTC)

@Ymblanter: I don't think there is a clear way to "fix" that "caching issue". For items used in 10⁴-10⁵ articles, there will always be significant delays in the processing of changes. But from what you describe about semi-protection, it looks like we need to be able to protect statements/labels/descriptions independently from sitelinks. What do you think? Should we try to report this to Phabricator? --Ghuron (talk) 05:56, 11 March 2018 (UTC)

Yes, I believe if we were able to protect statements separately from the sitelinks (and ideally also from labels, since labels get vandalized way more often) it would solve the problem. I have no idea how feasible this is though.--Ymblanter (talk) 07:55, 11 March 2018 (UTC)

[10] --Ghuron (talk) 11:06, 11 March 2018 (UTC)

There were many RfCs about implementing stricter patrolling requirements but none was approved. My suggestion is to let edits go to Wikipedias as soon as they are patrolled. Matěj Suchánek (talk) 09:28, 11 March 2018 (UTC)

Changing data type from “String” to “Monolingual text” for the property P969

I propose changing the data type of the property P969 (P969) from “String” to “Monolingual text”. There are several countries with more than one official language, at least locally. For instance, besides German, Germany has other official languages in several states, such as Sorbian, Danish or Frisian. The street signs are usually bilingually labelled. In countries with non-Latin writing, the postal authorities often allow addresses to be written in Latin-script languages, mostly English or French. Therefore, we should be able to distinguish multilingual addresses and to specify the language used.

The problem was discussed previously (here and here) but nothing was changed. But we need this information to distinguish between several languages and to set the language and writing-direction attributes to html span tags.

The current state is that the language information is missing in most cases. In countries like France it is easy to assume that the address is written in French. But in other cases, like China, it is more difficult: see for instance Q28075839 (Volks Mehood Hotel), where one would assume that the address is written in (Mandarin) Chinese, but it is written in English. For readers it is easy to recognize the correct language, but not for a Lua script.

The only way for now is to add a qualifier, as for Q47429618 (Steigenberger Hotel El Tahrir Cairo). But on the one hand, there is no means of forcing anybody to add a language qualifier P407. On the other hand, handling missing/existing qualifiers is more complex than using a monolingual-text property. The longer we wait, the more extensive the correction at Wikidata will become.

The change could be done by a bot. In many cases the language can be fetched from the country property P17. In questionable cases “undefined language” and a maintenance category should be added. --RolandUnger (talk) 06:43, 11 March 2018 (UTC)
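As a rough illustration of the bot step proposed above, here is a sketch that guesses a language from the country (P17) value, falls back to an undetermined-language code, and uses the result to set the language and direction attributes on an HTML span. Everything here is an assumption made for illustration: the small lookup table, the use of "und" to stand for “undefined language”, and the function names are not part of the proposal.

```python
from html import escape

# Hypothetical default language per country item; a real bot would need a
# reviewed table and would still get multilingual countries wrong.
COUNTRY_DEFAULT_LANGUAGE = {
    "Q142": "fr",  # France
    "Q183": "de",  # Germany
    "Q79": "ar",   # Egypt
}

# Languages written right to left (a deliberately short, incomplete list).
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def guess_language(country_qid):
    """Return a default language code for the country, or "und" if unknown."""
    return COUNTRY_DEFAULT_LANGUAGE.get(country_qid, "und")


def address_span(text, language):
    """Wrap an address in a span carrying lang and dir attributes."""
    direction = "rtl" if language in RTL_LANGUAGES else "ltr"
    return '<span lang="{}" dir="{}">{}</span>'.format(
        escape(language), direction, escape(text)
    )


if __name__ == "__main__":
    language = guess_language("Q142")
    print(address_span("Example address", language))
```

The hotel address in China that is written in English shows why such a guess can be wrong, so guessed values would still need the maintenance category mentioned above.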

Comment I'd support deleting this property in favor of located on street (P669). This is not structured data Wikidata deals with. Matěj Suchánek (talk) 09:23, 11 March 2018 (UTC)

located on street (P669) is not a real substitute because an address consists not only of a street but also of a house number, a town including the area code and the country. That's why the deletion proposal was refused. --RolandUnger (talk) 12:37, 11 March 2018 (UTC)

Request for help

Hello I am trying to submit a batch for QuickStatements 2. I was wondering how to use it. For example, for these 3 commands (that work in #1)

Q14418432 Dit villaggio in Indonesia
Q14418436 Dit villaggio in Indonesia
Q14418438 Dit villaggio in Indonesia

What is the correct format to use for QuickStatements 2? Thanks. Artix Kreiger (talk) 00:35, 11 March 2018 (UTC)

The same one ("Version 1 format"). Don't forget quotes around the descriptions. Matěj Suchánek (talk) 12:53, 11 March 2018 (UTC)

It actually depends on which import option you use in the dropdown. The v1 importer indeed uses the same syntax. However, for the CSV importer it would be:

qid,Dit
Q14418432,"villaggio in Indonesia"
Q14418436,"villaggio in Indonesia"
Q14418438,"villaggio in Indonesia"

Mbch331 (talk) 13:18, 11 March 2018 (UTC)
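For anyone converting an existing batch, the relationship between the two formats shown above can be sketched as a small script. It assumes the V1 commands are tab-separated and that the whole batch sets the same column (here Dit); the function name is hypothetical and this is not part of QuickStatements itself.

```python
def v1_to_csv(v1_text):
    """Convert tab-separated V1 label/description commands to CSV text.

    Assumes every line has the form: QID <TAB> column <TAB> "value",
    and that all lines use the same column (for example Dit).
    """
    header = None
    lines = []
    for raw in v1_text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        qid, column, value = raw.split("\t")
        if header is None:
            header = column
        elif column != header:
            raise ValueError("this sketch handles one column per batch")
        # Drop the V1 quotes, then re-quote for CSV, doubling inner quotes.
        text = value.strip().strip('"').replace('"', '""')
        lines.append('{},"{}"'.format(qid.strip(), text))
    if header is None:
        raise ValueError("no commands found")
    return "\n".join(["qid,{}".format(header)] + lines)


if __name__ == "__main__":
    batch = (
        'Q14418432\tDit\t"villaggio in Indonesia"\n'
        'Q14418436\tDit\t"villaggio in Indonesia"\n'
        'Q14418438\tDit\t"villaggio in Indonesia"'
    )
    print(v1_to_csv(batch))
```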

Help:Evolving knowledge

Hi people, to answer @Noé: question above I started a draft of help page about evolving knowledge. Did not find an . Any comment on this ?

WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. WikiProject Properties has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

I have not yet added an example of deprecation, however, as the example I chose in the introduction is far from trivial. This query shows that there are not really a lot of deprecated statements; the first one I investigated is an erroneous deprecation of the fact that China and the USSR had a common border :/ This shows that the help page is probably needed.

# Items (?item) having, via ?prop, at least one statement with deprecated rank
SELECT ?item ?prop WHERE {
  ?item ?prop [ wikibase:rank wikibase:DeprecatedRank ] .
}


author TomT0m / talk page 14:05, 11 March 2018 (UTC)

That's very kind of you. I really appreciate every help page, and this one is very useful for understanding how this complexity is managed. Thanks! Noé (talk) 14:37, 11 March 2018 (UTC)

Mix n' Match, what does "The Wikidata property for this catalog; optional" mean?

Hi all

I'm working on importing a list of academic journals into Wikidata and just about to start the import in Mix n' Match.

Currently, in Mix n' Match game mode, if you choose 'New Item' it immediately creates a new item which does not have any properties, and so someone deletes it. You then either have to get the items undeleted or recreate them; either way it's a pain.

I'm hoping there's a way to make sure the new item is created with one property, on Mix n' Match import sheet there is a field that says "The Wikidata property for this catalog; optional", can someone explain what this means?

Thanks

--John Cummings (talk) 14:05, 11 March 2018 (UTC)

Hi John. The property is for the identifier used by the source catalogue. For China Biographical Database (Q13407958), the property is CBDB ID (P497); for eBird (Q5322614) it's eBird taxon ID (P3444). So if there is a Wikidata property for identifiers from the database you are importing, that might do the job. Cheers, MartinPoulter (talk) 14:28, 11 March 2018 (UTC)

Thanks @MartinPoulter:, I'm importing the list of journals from DOAJ, so I should add Directory of Open Access Journals (Q1227538)? Will this add a statement to the new items? --John Cummings (talk) 15:10, 11 March 2018 (UTC)

No User:John Cummings, you need to make a new Wikidata:Property_proposal/Authority_control, and in the template say "subject item: Directory of Open Access Journals (Q1227538)". Also mention that you got it in a MnM catalog so the approval is urgent. Ping me to review the proposal, and use the template "Ping project|Authority control" to tell people to vote for it. Given your development so far, I think it will be approved quickly.

BTW, kindly move all your developments from the Data Import page to a separate sub-page. It got very big already, and it's hard to work when it's mixed with unrelated stuff on the same page. Keep on! --Vladimir Alexiev (talk) 15:40, 11 March 2018 (UTC)

Ah, OK, thanks @Vladimir Alexiev: I started one at Wikidata:Property_proposal/Directory_of_Open_Access_Journals_ID. We are going to be creating a version 2 of the Data Import Hub in the next weeks which will have separate pages for each import, just waiting on a couple of technical things to get sorted out. --John Cummings (talk) 16:07, 11 March 2018 (UTC)

L10n help and multilingual advice needed for bot

I'm working on a task for my bot which would find items about villages and then a) remove disambiguation from labels' names, b) set all Latin-script languages' labels to the same thing (if there was no prior disagreement among the labels), and c) describe the village along the pattern of "village in <parent entity>, <grandparent entity>, <great-grandparent entity>" (see RFBOT for full documentation). With tasks (b) and (c), I run into an l10n (localization) issue. I'll start with the simpler one, task (c). Right now, my bot only sets a description in English. I'd like to support more languages, though. So I invite people to tell me how to say "village in ..." in their language, as well as what format the language uses for nested parent entities, if it isn't "a, b, c".

For (b), I was initially using all 196 exclusively-Latin-script languages supported by Wikidata, but Ymblanter observes that some languages, e.g. Crimean Tatar ([crh]), won't use identical labels to most other Latin-script languages. So that's the first thing I'm asking for help on from the Wikidata community: Can people help me pick out languages in which the native name for a village would reliably be a valid label, or where the native name could easily be turned into a valid label with some RegEx magic? (90.191.81.65 raises a valid concern, namely that some languages will use exonyms for certain villages. However, since this only sets labels when no previous one existed, I don't see that as a problem. The worst-case scenario is that we'll have an imperfect-but-not-incorrect label instead of no label at all.)

These are the 196 exclusively Latin-script languages I originally identified:

Here are the ones I've identified so far as being almost certainly safe to use (i.e. major Germanic and Romance languages):

español, English, Simple English, British English, Canadian English, português, Deutsch, français, italiano, Nederlands, svenska, dansk, norsk bokmål, norsk nynorsk

Here are the ones I've identified as being most likely safe to use:

Interlingue, interlingua, Patois, Ligurian, Norfuk / Pitkern, Ænglisc, Deitsch, føroyskt, Plattdüütsch, Frysk, Nordfriisk, asturianu, Boarisch, estremeñu, Ripoarisch, vèneto, Esperanto, furlan, lumbaart, Napulitano, Papiamentu, sicilianu, sardu, Limburgs, corsu, Alemannisch, Picard, Nedersaksies, emiliàn e rumagnòl, Österreichisches Deutsch, Mainfränkisch, Zeêuws, Gegë, Lingua Franca Nova, tarandíne, jysk, Plautdietsch, íslenska, Latina, català, kréyòl gwiyanè, română, Lëtzebuergesch, aragonés, galego, rumantsch, occitan, Afrikaans, Scots, arpetan, Piemontèis, Nouormand, Mirandés

I reckon the next-most likely candidates would be any other Indo-European languages, followed by languages from other families.

Thanks for any assistance that anyone is able to provide. — PinkAmpers&^{(Je vous invite à me parler)} 21:27, 4 March 2018 (UTC)
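For reference, the English description pattern from task (c) can be sketched as below. This covers English only, since the patterns for other languages are exactly what is being asked for here; the function name and the sample labels are hypothetical and this is not the bot's actual code.

```python
def describe_village(ancestor_labels, max_levels=3):
    """Build "village in <parent>, <grandparent>, <great-grandparent>".

    ancestor_labels: English labels of the containing entities, nearest
    first. Missing (empty) labels are skipped; with no usable label the
    description falls back to plain "village".
    """
    used = [label for label in ancestor_labels[:max_levels] if label]
    if not used:
        return "village"
    return "village in " + ", ".join(used)


if __name__ == "__main__":
    print(describe_village(["Example District", "Example Province", "Example Country"]))
```

Supporting another language would mean replacing both the "village in" prefix and the ", " separator, and, for inflected languages, possibly the form of each label.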

Fortunately, I don't see Polish here, but I would suggest caution with many of the languages mentioned above, as many of them are inflected languages (so using a bot to add village in... would not be possible without knowing the declension of each word). What's more, no data is better than an imperfect/incorrect label or description: by adding such data you can cause a mistake to spread even outside Wikimedia projects. So the worst-case scenario is really the addition of imperfect data, not the lack of it. Wostr (talk) 00:07, 5 March 2018 (UTC)
@Amire80:, Amir, maybe you have some ideas or know who might have some?--Ymblanter (talk) 13:07, 7 March 2018 (UTC)
Some thoughts:
- I suggest asking on Translators-L or maybe even Wikimedia-L.
- In general, calling it "L10n" is quite misleading, because it's a rather different task. It's more like multilingual consultation, and each language may need a special approach.
- Other languages that are likely to be unsafe: az, lv, lt. But please verify. There may be more.
- Is it actually good to replicate a lot of labels? This might be perceived as a confirmation that they are actually written identically in these languages. The fallback doesn't work perfectly at the moment (see this bug), but it's better to fix the fallback than to replicate a lot of labels. --Amir E. Aharoni (talk) 13:29, 7 March 2018 (UTC)
@Amire80: Fair point about "l10n". I meant it more in reference to the first request, so I've updated the section title to reflect that. Anyways, I'll email Translators-L when I get the chance. And to be clear, at this point I'm more planning on ruling in languages that are safe rather than ruling out ones that are unsafe. Also, could you please clarify your "confirmation" point? Are you talking about the endonym/exonym question, or a transcription question? Because, if it's the latter, I posit that languages can be divided into two categories: those where the label would always be the same as the native name (except in the rare cases that there's an exonym), and those where they would not always be. My whole goal here is to figure out which languages are in which category. — PinkAmpers&^{(Je vous invite à me parler)} 22:06, 7 March 2018 (UTC)
My "confirmation" point is pretty simple: If your bot adds a label in, say, Turkish, to Pedro II (Q1934329), somebody somewhere may think that "Pedro II, Piauí" is the correct way to write this name in Turkish. Maybe it is, and maybe it isn't; are you sure?

If you're less than 100% sure, then what is the benefit of filling this label?

And even if you are 100% sure, the question still stands: what is the benefit of filling this identical label?

I'm not a very big Wikidata expert, so it's conceivable that I'm missing something. --Amir E. Aharoni (talk) 22:23, 7 March 2018 (UTC)
@Amire80: Well, the first thing my bot would do, FWIW, is change the label to Pedro II. But anyways, I think a high degree of confidence is sufficient for these things; for many villages you'll simply never have 100% certainty that the proper English label is what it is, because there might not be any reliable English-language sources that discuss the village. The bot's just doing what I and many other editors do, which is assuming that if someone hasn't already supplied an alternative name in English, then the endonym is almost certainly valid, and that if it isn't, it's better than nothing. As to the benefit of filling in a label when it's identical: for optimal machine readability, which is a major purpose of Wikidata. If I were using a service that pulls from Wikidata, and my preferred language were set as Danish, it would simply tell me that the name of Pedro II is unknown. Yes, the developer could code a fallback, but the burden shouldn't be on them to do that. IMHO, in a perfect world, every item would have a label in every language. We'll never get there, of course, but I'd like to get us just a bit closer. — PinkAmpers&^{(Je vous invite à me parler)} 02:42, 11 March 2018 (UTC)
If a query pulls info from Wikidata, and your preferred language is set as Danish, and the Danish label is not set, it's quite correct to give "unknown" as the result. It's better than assuming that it's "Pedro II" without saying that it was just copied from another language. What's best is to say that the Danish label is unknown, but in case it's useful, there are labels in Portuguese or English. That's the right way to do fallback, and it's totally fine to put the burden for it on developers. It's far better than guessing it without mentioning the source. --Amir E. Aharoni (talk) 08:29, 11 March 2018 (UTC)
I agree with Amir: using fallback is better than giving potentially wrong data. Note that in French, London is translated as Londres, and other common places have translations too. English speakers call Deutschland “Germany”. An interesting article about “exonymy” (in French, but with English abstract). --Pols12 (talk) 01:04, 12 March 2018 (UTC)
If this is indeed the consensus here (and ideally I'd like to hear from a few more people), then I can have the bot set the native name as an alias, rather than as a label, in the Latin-script languages. Since it's just an alias, my hope would be to do it in all 196 languages. IMHO, an endonym will always be a valid alias in a language of the same script. (After all, en:Deutschland redirects to en:Germany, and fr:London is a disambiguation page where the first entry points to fr:Londres.) — PinkAmpers&^{(Je vous invite à me parler)} 03:55, 12 March 2018 (UTC)
Maybe it would be good to have a plan on maintenance of these descriptions after an initial run, e.g. what to do if the layers change or are found to be incorrect. If it uses essentially cebwiki, maybe a first step should be to cross-check that data.
--- Jura 13:36, 7 March 2018 (UTC)
@Jura1: Cross-check it with what? — PinkAmpers&^{(Je vous invite à me parler)} 22:06, 7 March 2018 (UTC)
Something reliable for a given country. There was extensive discussion here about the problems with the cebwiki stuff for locations. A possible option would be to skip those with only cebwiki/svwiki etc. links.
The maintenance of descriptions after an initial run is a separate problem.
--- Jura 09:11, 11 March 2018 (UTC)
I could ignore cebwiki and svwiki for the purposes of copying labels. Do others think that's a good idea? I'm hesitant to do it just on one person's say-so. But I'm certainly open to the idea. As to maintenance, IMHO that seems more like a good subject for a database report, whether or not this task is approved. — PinkAmpers&^{(Je vous invite à me parler)} 04:02, 12 March 2018 (UTC)
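The fallback behaviour argued for earlier in this thread (report that the preferred-language label is missing, and say which language a substitute came from) can be sketched on the consumer side as below. This is a hypothetical illustration, not how any existing tool or the Wikidata fallback mechanism is implemented; the function name and the fallback order are assumptions.

```python
def label_with_fallback(labels, preferred, fallbacks=("en",)):
    """Return (text, language) for the first available label.

    labels: mapping of language code to label for one item.
    preferred: the user's language; fallbacks: languages to try next.
    The language is returned alongside the text so a caller can show
    that the label was borrowed rather than present it as native.
    Returns (None, None) when no label is available at all.
    """
    for language in (preferred,) + tuple(fallbacks):
        if language in labels:
            return labels[language], language
    return None, None


if __name__ == "__main__":
    labels = {"pt": "Pedro II", "en": "Pedro II"}
    text, source = label_with_fallback(labels, "da", fallbacks=("pt", "en"))
    if source != "da":
        print("No Danish label; showing '{}' from '{}'".format(text, source))
```

Keeping the source language in the result is what separates this from copying the label into every language: the data stays honest about what is actually known.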

peo as an ISO language code for Old Persian (Q35225)

I've just tried to set 𐎤𐎢𐎽𐎢𐏁 as name in native language (P1559) for Cyrus the Great (Q8423), but "peo" is not being accepted as a correct language code. --eugrus (talk) 18:08, 11 March 2018 (UTC)

@Eugrus: Tracked in Phabricator as task T189427.

See the phabricator task. Expand on it how you will. Mahir256 (talk) 20:46, 11 March 2018 (UTC)

Thanks then! I've awarded my token to it. --eugrus (talk) 20:54, 11 March 2018 (UTC)

First version of Lexicographical Data will be released in April

Hello all,

After several years of discussion, and one year of development and discussion with the communities, the development team will deploy the first version of lexicographical data on Wikidata in April 2018.

A new namespace and several new datatypes will be created in order to model words and phrases in many languages. Editors will be able to describe words in Wikidata, and in the future, to query this information, and to reuse it inside and outside the Wikimedia movement.

If you’re curious to discover what these new data structures will look like, you can have a look at the data model. It suggests a technical structure, but the editors will remain free to model and organize data as they prefer, with the usual open discussions and community processes that we apply on Wikidata. The documentation will be improved step by step, with the different releases and the help of the community.

Please note that the version that will be deployed in April is a first version, that will be improved in the future, thanks to your tests, comments and suggestions. Some features may be missing, some bugs may occur. We can already tell you that the following features will be included in the first version:

Add, edit and delete Lexemes, Forms, statements, qualifiers, references
Link from an Item or a Lexeme to an Item or a Lexeme
Basic search feature
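
To make the terms in this list more concrete, here is a rough Python sketch of how Lexemes, Forms and statements could relate to each other. It is only an illustration under assumptions: the class names, field names and identifiers are invented for this example and are not the data model linked above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Statement:
    """A property-value pair; the value may point to an Item or a Lexeme."""
    property_id: str
    target_id: str


@dataclass
class Form:
    """One form of a lexeme, e.g. a plural."""
    representation: str
    statements: List[Statement] = field(default_factory=list)


@dataclass
class Lexeme:
    """A word or phrase in one language, with its forms and statements."""
    lexeme_id: str
    lemma: str
    language_item: str
    forms: List[Form] = field(default_factory=list)
    statements: List[Statement] = field(default_factory=list)

    def add_form(self, representation: str) -> Form:
        form = Form(representation)
        self.forms.append(form)
        return form


# All identifiers below are placeholders, not real Wikidata IDs.
book = Lexeme(lexeme_id="L-example-1", lemma="book", language_item="Q-example-english")
book.add_form("book")
book.add_form("books")
# A link from a Lexeme to an Item, as in the second bullet above.
book.statements.append(Statement(property_id="P-example-link", target_id="Q-example-item"))
```

Senses are deliberately left out of the sketch, since the announcement says they will not be part of the first version.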

And the following features will not be included in the first version, but are planned for the future:

RDF support (which means: the ability to query it with query.wikidata.org)
Senses will not be included in the first version, to give you all some time to get properties, processes, etc in place for Lexemes and Forms
Entity suggestion and better search features
Merge Lexemes

You can have a look at a more detailed features list. After the first deployment, we will start a discussion with all of you about which features are most important to you, so we know which ones you would like us to work on next.

Thanks to the people who already showed support and curiosity about lexicographical data on Wikidata. We hope that when it is deployed, you will test it, experiment with the languages you know, and give us some feedback to improve the tools in the future.

While waiting for the release, here’s what you can do:

Improve the list of tools with ideas for tools that could be built on top of lexicographical data
Add your ideas of cool queries you’d like to do with words and phrases in the future
Have a look at the project page and especially the talk page, where people are already asking questions and discussing how to model data and other topics
If you’re involved in a Wiktionary community, discuss with them and answer any questions they might have about Wikidata. You can also register as ambassador for your community.

Last but not least, we kindly ask you not to plan any mass import from any source for the moment. There are several reasons behind that. First, as mentioned above, the release will be a first version and we need to observe how our system reacts to manual edits before we start considering automatic ones. The system may not be ready for massive imports at the beginning. The second reason is legal. Lexicographical data in Wikidata will be released under CC0, and it is the responsibility of each editor to make sure that the data they add is compatible with CC0. For more information, you can have a look at the advice of the WMF Legal team. Finally, we strongly encourage you to discuss with the communities before considering any import from the Wiktionaries. Wiktionary editors have put a lot of effort over the years into building definitions, and we should be respectful of this work, and discuss with them to find common solutions to work on lexicographical data and enjoy the use of it together.

If you have any questions or ideas, feel free to write on Wikidata talk:Lexicographical data or contact me.

Thanks for your support and I will keep you posted about further details. Lea Lacroix (WMDE) (talk) 16:34, 7 March 2018 (UTC)

The "Structured Data for Wiktionary" project that legally can't accept any data from Wiktionary. Oh how very well done. Jheald (talk) 16:39, 7 March 2018 (UTC)

What makes you imagine that to be the case? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:55, 7 March 2018 (UTC)

(ec) @Jheald: please refrain from such snarky and wrong comments. Especially when Lea's message contains a link to the legal opinion from a Harvard lawyer saying it is possible to import data from Wiktionary (or any dictionary for that matter; there are just certain conditions to respect, but nothing new under the sun here) and as Wikidata has already imported a lot of data into Q items from Wiktionaries in the past 5 years (without any legal issue AFAIK). Cdlt, VIGNERON (talk) 17:06, 7 March 2018 (UTC)

+1 on @VIGNERON:, whom I thank for writing exactly the same thing I was about to write. --Sannita - not just another it.wiki sysop 17:17, 7 March 2018 (UTC)

It would really be helpful if the Wiktionary community were provided with an option that interests them. Which server runs what software shouldn't have such an impact.
--- Jura 17:12, 7 March 2018 (UTC)

Good news! Great to see you guys have finally announced it will be in CC0 despite the very few criticisms expressed by people that, really, just don't like Wikidata. I mean, it was already decided months ago so why postpone it for more discussion on this aspect? It will be much better to start experimenting and see the horde of lexicoenthusiasts jump out of the bushes to create thousands of new ~~entries~~ pages. Can't wait. Noé (talk) 20:41, 7 March 2018 (UTC)

I understand the temptation to be snide, but I don't think sarcasm is really what's needed here. --Yair rand (talk) 21:57, 7 March 2018 (UTC)

Great, I learned the word snide today! Thanks! I don't think it's constructive behavior either, but I am very tired of this depressing conversation and it pains me that my arguments are not heard and not taken into account. Humor is a safety valve. Noé (talk) 10:17, 8 March 2018 (UTC)

@Lea Lacroix (WMDE): It appears that the decisions surrounding this are being made by the dev team unilaterally, as there is no consensus in favor of launching here as opposed to on a Wiktionary site. If there were consensus from all the relevant communities, that would be one thing. If a decision came from consensus from just the Wiktionaries, that would also be fine, although not ideal if Wikidata was not at least informed. If there were only consensus from Wikidata, that would be very far from legitimate, but as it stands there isn't even support coming from the community that is theoretically supposed to be launching the unilateral takeover: In the previous thread, seven (eight?) Wikidata users expressed opposition to Wikidata running this here, and I didn't see one word of disagreement to that coming from anyone outside WMDE (excluding Andy's possible implied support?). You can't just say, "This is a done deal, how about this license?" and take a license decision as support for the premise.

Please allow some discussion to take place, and subject the decisions to communities' consensus. There has never been a real discussion as to whether lexeme data should be on Wikidata or on a new site. There has never been a discussion with the involved participants outside of Wikidata about the license, or the format, or the community structure. There is room for cooperation. --Yair rand (talk) 21:57, 7 March 2018 (UTC)

If you need one person who thinks that it's great to have lexemes on Wikidata, you can count me. I see three possible places to store this data: 1. On each Wiktionary, but that means the database would be duplicated n times, so it's probably not a good idea: it means a huge duplication of effort and won't help small Wiktionaries. 2. On a new wiki like data.wiktionary.org, but that means having yet another wiki with other rules; it makes linking to Wikidata items complicated and reusing Wikidata properties impossible, at least for community reasons (the data.wiktionary community would probably not want this very important part of the structure design to be in the hands of another community); and it requires building a new name for possible partners (Wikidata is already well known, e.g. in the research community)... 3. On Wikidata, which is already a structured data repository used successfully by other wikis, such as many Wikipedias and Wikisources. Wikidata already has a well-known name and an efficient community and processes. From the point of view of a Wikisource contributor, if there were a project to build a Wikimedia bibliographic database I would say "do it on Wikidata": it is already doing this and, with this choice, it would be evident that the content could be shared by Wikipedias or Wiktionaries as much as by Wikisources. Tpt (talk) 10:55, 8 March 2018 (UTC)

Super exciting news. Congratulations to the team on getting this far, and this is just the beginning! Don’t let the naysayers get you down: they said it wouldn’t-shouldn’t-couldn’t be done about Wikipedia back in the day too [oh, and no-doubt “further discussion needed” too]. Wittylama (talk) 22:38, 7 March 2018 (UTC)

@Wittylama: Wikipedia wasn't about palling up to its own sister projects and then stabbing them in the back, though, was it? Sending in some cases over 10 years work and up to half a million edits to the trash bin. *Not* the kind of behaviour a WMF project should be applauded for. Jheald (talk) 22:49, 7 March 2018 (UTC)

I find it a real shame that you - a person who most certainly knows and understands the value wikidata can (and does) bring to the Wikiverse - should choose to read this as a zero-sum game where wikidata and wiktionary are competitors. There will be “disruption” as with any change, but the aggressive turns-of-phrase like “stabbing them in the back” is unfair and unkind to people you know personally, and who respect you and your opinions, in the wikidata community. I beg you to reconsider your frustrated-opposition to this project, or at the very least to stop repeating the argument that the Wiktionary content and community are being ignored or overridden. Wittylama (talk) 23:27, 7 March 2018 (UTC)

(ec) @Wittylama: I am sorry Liam, but I think that is exactly what has happened here. I think what the project direction has done here is shameful, and as WD editors we should be *ashamed* of how our leadership is treating a sister community. It seems to me Wiktionary has been led right up the garden path with promises of "Structured Data for Wiktionary", seen the detailed structure of their site cloned in minute detail, then had the door slammed in their face with an incompatible licence. "Thanks for 10 years of hard work, now f*ck off". I'm sorry, but I just can't express intensely enough how strongly I feel that dealing with a sister community in this way is unacceptable, a real stain on our site, of which we should all be utterly utterly embarrassed. Would it have been such a hardship to require reusers to include "Powered by Wiktionary" in their credits file, and to share-alike any creative additions they made to the database? But instead we apparently prefer to either rip-off or end-of-line that community's work. Jheald (talk) 00:18, 8 March 2018 (UTC)

I don’t really think that the relationship between the foundation and community is a « leader / led » one. Mostly, the foundation is involved in technical aspects, the community in content. « then had the door slammed in their face with an incompatible licence » It’s unclear that the structure associated with language is even copyrightable … Personally I refrained from commenting on this topic, but the legal and practical implications of database copyright tend to make me uncomfortable, to the point that I tend to think choosing a license in this case is rather symbolic. I don’t think we will do as some organisations do and put wrong data in our dataset to be able to spot people who import our data, for example. In the case of structured lexical data, yes, « table » is an English noun. Language is a communication tool, so I don’t really think it makes sense to « protect » this fact … language is made to be shared, not to be claimed by any licence. author TomT0m / talk page 12:07, 8 March 2018 (UTC)

@Wittylama: Of course the communities are being overridden. Wiktionary and Wikidata don't need to be competitors, and they don't want to be competitors, but the decision by the dev team that they must be forced into being competitors by having the structured lexical data system set up without Wiktionary is creating the situation without good reason. The value that structured data can bring is important, and we should use it for lexical data, with the communities, not against them. --Yair rand (talk) 23:46, 7 March 2018 (UTC)

It’s a shame you feel that way. This project could benefit from your obvious energy and interest in the topic to help ensure it is as successful as possible, rather than using it to repeat your critique of it. This concern has been raised, read, debated, and responded-to many times before. So now, even though you feel the project is sub-optimal, I encourage you to be a force for positive change, to help try to make it succeed - since you agree that structured lexical data is important. Wittylama (talk) 00:15, 8 March 2018 (UTC)

@Wittylama: It has not been debated. I follow these discussions quite closely.

I don't think it's "sub-optimal", I think it's actively harmful to Wikimedia. After fifteen years of Wiktionary being entirely ignored by Wikimedia institutions, we finally hear of a plan for structured lexicographic data, lots of work goes into it, and then there's the plan for deployment... on Wikidata. The one thing that was literally built for a dictionary won't be on a Wiktionary site. The building of structured data will take place exclusively on Wikidata, set up as a competitor to Wiktionary. Wiktionary communities have no say as to how it's built or structured. It will be administered exclusively by Wikidata admins, under Wikidata policies. Wiktionary will not be permitted to use an open-source extension developed by a Wikimedia organization that would benefit it immensely.

The worst-case scenario would be if this becomes a general precedent of non-cooperation between Wikimedia communities. If the success of one project's mission is anti-correlated with the success of another, people will certainly not be inclined to support a sister project. I hope this does not happen. --Yair rand (talk) 02:11, 8 March 2018 (UTC)

WMF already paid a user to export content from Wiktionary. Obviously, we (at least I) hoped that it would be possible to import this into a Wikibase, but apparently this won't be possible. The time, money and effort will be lost.
--- Jura 00:37, 8 March 2018 (UTC)
As Sannita said, it seems that some lawyers agree that we actually could import content from Wiktionaries. I am still waiting for a lawyer or a reuser stating that "Wikidata is dangerous because it has extracted and imported facts from sources that are not public domain/CC0, like Wikipedia or a lot of other databases." Please correct me if I am wrong. Tpt (talk) 10:55, 8 March 2018 (UTC)
Well Tpt it certainly worries me, because I think our community probably has cumulatively extracted a lot of data from non-CC0 sources, that is identifiably from those sources or even referenced to them, and probably did have an element of original creative selection, arrangement, or judgement. I think that could well be seen as license-washing, and could well be an accident waiting to happen. There are some quite assertive data publishers out there. I think we would not do well to underestimate our risk in this area. Jheald (talk) 11:56, 8 March 2018 (UTC)

Trying to get a summary of the concerns that the vocal critics here have expressed. Is this an accurate summary of your position?

Jura, you say it's the wrong project.
Yair rand, you say it's the right project, but on the wrong wiki.
Jheald, you say that it's the right project, but with the wrong license.
Noé, you say it's moving too fast.

Wittylama (talk) 09:09, 8 March 2018 (UTC)

No. I think it's the wrong project (assuming by "project" you mean, eg, Wiktionary or Wikidata), the wrong wiki, and the license should be chosen by the right project. (Based off of their statements on the earlier thread, your summaries of Noé's and Jheald's views also seem to be incorrect, but I'll wait for them to clarify.) --Yair rand (talk) 10:03, 8 March 2018 (UTC)

(ec) As per Yair rand above. The underlying problem is the failure to get any sense of buy-in or ownership of this by Wiktionary, or even to apparently consider that important. The licensing is the sharpest manifestation of this, because it literally slams the door on their work, and says it will have no part in the new project, building an unbreachable wall between the two. But the failure to give the Wiktionary community any sense of governance over the new project is all of a piece -- not even the most cosmetic measures to present the new namespace as an extension of Wiktionary as much as an extension of Wikidata. Lea's announcement that the new licence will be CC0, before the RfC on that exact question has even closed, is just another example of the recent tone-deafness of the project direction in this area. As Yair says above: After fifteen years of Wiktionary being entirely ignored by Wikimedia institutions.. [t]he one thing that was literally built for a dictionary won't be on a Wiktionary site. The building of structured data will take place exclusively on Wikidata, set up as a competitor to Wiktionary. Wiktionary communities have no say as to how it's built or structured. It will be administered exclusively by Wikidata admins, under Wikidata policies. Wiktionary will not be permitted to use an open-source extension developed by a Wikimedia organization that would benefit it immensely. This is simply not how we treat our own. It is not acceptable for a WMF project to actively marginalise an existing Wikimedia community in this way, sabotage their licensing, and undermine their work. Jheald (talk) 11:28, 8 March 2018 (UTC)

Well, it is quite fast, mainly because m:Wikilegal/Lexicographical Data is a preliminary note with plenty of questions remaining open, but I am much more concerned that Lexicographical data in Wikidata is not a community-led project. Decisions are taken by a group of 5 to 10 people with their own agenda. I was happy to see some honesty when the name changed, because there is nothing for Wiktionary in this project. It may be built on Wiktionary data, but for third parties. It was clear in Denny's prose since the beginning. I mainly disliked the plebiscite organized about the license, with scarce information on the whole picture, only arguments in favor and no room for discussion before the vote started. It was a false debate, and the decision was set before the beginning of the vote. Note that I do not think the people behind the project are evil. I am convinced they want to do something good, but it's not enough for me. I think important projects in our communities have to be grounded in community discussions in which a consensus emerges after each position has been expressed and discussed. Here, there is no consensus, only a non-democratic leadership on an opaque project. But well, it appears they do not want Wiktionarians but rather a new community to collaborate on lexicographical data here. I am curious to see who will do that (how many of the voters, for example) and how emerging problems will be discussed. Noé (talk) 10:45, 8 March 2018 (UTC)

@Noé: One of the biggest problems when starting a project like this is that, before you actually have something « in production », it’s really hard to get community input. Few people are actually involved in the discussions of the data model and so on. So to advance you need to move on with the little input you have and take decisions … The « preliminary note » is what the team has after several years of on and off discussions. I think getting what you would call a « non preliminary note » is something that would actually not happen. At some point the devteam has to propose something, and that something has to be a product because that’s the only way to get a lot of input. Test wikis do not involve a lot of people. author TomT0m / talk page 11:26, 8 March 2018 (UTC)

@TomT0m: The preliminary note I mentioned is a legal analysis, not a technical one. It was written by a legal counsel of the Wikimedia Foundation. It is not a work by the Wikidata devteam, and it was not claimed to be funded or requested by them. It is about the licensing of lexicographical information, and there are several kinds of content that are not taken into account yet. I asked polite questions on the talk page and I think a second version of the document could be made out of them, being more specific on delicate but important matters. Noé (talk) 12:39, 8 March 2018 (UTC)

@Noé: The legal issues around databases, worldwide … a pretty complex topic, unlikely to be settled until tested in court in a country with case law … especially in a « Big Data » world. I just heard about the problems of the Gutenberg project in this article. I’m afraid these issues are really complex, and that this project, right now, has to live with legal risks just by compiling facts :/ The same risk exists for (unstructured) Wiktionary, I guess. For example, Wikidata descriptions should theoretically not be extracted from Wikipedia, and Wikidata has lived with this since the beginning. I’m not really aware of any issue, major or minor, with this. author TomT0m / talk page 18:12, 8 March 2018 (UTC)

license, or other issues?

I've only been on wikidata for about 2 1/2 years now, but all along I recall the lexicographical extension being discussed here, and then under active development for the last year or more. March 2018 is the first time I've seen any opposition to this effort. Where have you people been during the last 2 1/2 years? Is the CC-0 license the real issue, or is something else going on here? I find this pile-on against the new development very discouraging - let's at least see how it works in practice as Lea suggested. Maybe there's a better way to do it, or maybe there's no point in doing this at all, but given the effort invested so far to implement it here, I strongly feel it should be given a chance to prove itself. As to the license, anything other than CC-0 greatly limits the usefulness of a structured database - it's very hard to do attribution for example as required by CC-BY if you are building a box based on a hundred pieces of data from as many contributors. But if we really need CC-BY or something else for some parts of the data, let's deal with that when needed. Wikidata is partly CC-0 (the main namespace) and partly CC-BY-SA (this and other text name spaces) as it stands, so the license issue certainly shouldn't be a show-stopper. ArthurPSmith (talk) 13:53, 8 March 2018 (UTC)

We have been around for a while! Maybe not in Project Chat but in Wikidata:Wiktionary (when the page was named like this). You can have a look at Wikidata talk:Lexicographical data/archive to see our past discussions, and see that similar points of view were already expressed about multiple issues. Also, we met on at least two occasions, at Wikimania 2016 in Italy and at Wikiconvention francophone 2016 in Paris.

Noé (talk) 14:32, 8 March 2018 (UTC)

Thanks for the link. You (Noé) certainly commented a lot there, but it doesn't seem to have as negative a general tone as your recent remarks. What's changed? Yair rand also commented quite a bit but mostly positive. Jura had a few comments including a suggestion of a separate installation, but it wasn't followed up on. And the bulk of the discussion was long ago (late 2016 mostly) - if there was significant opposition to the whole concept why weren't you pushing the development team to redirect their efforts somewhere else this last year? ArthurPSmith (talk) 18:15, 8 March 2018 (UTC)

At first, I wasn't very enthusiastic about Wikidata because I think a multilingual project like this one gives an important advantage to people who can express their opinion in English. But, after discussing with Lydia and colleagues at Wikimania, I agreed to collaborate and I spent dozens of hours initiating conversations in French Wiktionary and on the page already mentioned. At some point in the discussion, I raised a question about the training of contributors. For me, it is important to document the issues we encounter while learning collaborative lexicography. Some Wiktionarians have already spent years documenting lexicographic problems, and I was troubled to see that the plan was to start from scratch and not take advantage of that in Wikidata. During the discussion, I realized the devteam was not looking so much at how it could improve Wiktionaries but rather at how lexicographical data can be reused by third parties. I had the feeling it was more oriented toward computational operations than human consultation and contribution. Then, I was not very pleased by the first model. I had the feeling it was still oversimplified. Finally, it was the way the plebiscite was organized that made me so negative. The situation was not clearly stated, only with positive arguments and without room to express another option. Last disappointment: Léa posted the announcement for April saying it will be in CC0, jumping to a conclusion I felt was already decided before the beginning of the vote. I wasn't fully opposed to any development; I just feel the project now is very different from how it was designed at first and may not be good for Wiktionary communities. Noé (talk) 22:06, 8 March 2018 (UTC)

@Noé: That’s an interesting point of view for sure. I’ll try to summarize the different opinions at this point, as I see them:

It’s of interest to have lexicographical data for human beings. That’s what Wiktionary does in a Semi-structured_data way. You’re saying that the Wiktionary community has accumulated and documented over time a lot of knowledge about how to structure that knowledge, and how to present it to humans.
The downside of Wiktionary is that it’s not easy to share information between language versions, and each language version maintains its own data about every language.
It’s of interest to have machine-readable lexical data. There are numerous applications, like translation assistants, of which our communities are big consumers. A central, data-centric repository for this structured data is proposed to help achieve this goal.
One of Wikidata’s goals is to offer a repository of data usable by « humans and machines alike ». This is achieved by giving humans interfaces to enter data in a way precise enough for machines to enter and consume data in a documented way, through technical documents and a programming API. It’s a « middle ground » solution, as the objects defined are generic (items, properties, statements …) and not very specific and rigid (a human with a date of birth and death and that’s it). It’s designed so that the community can easily create the new objects (items, properties, statements …) necessary to express new things. This comes at the cost of some structure (humans may have a construction date, which is typically not what a data consumer would expect). On the other hand, some things that are easy to express in natural language are harder or more tedious to express in the Wikidata model. This is a tradeoff.
There is a proposal to extend Wikidata to include structured data about lexical entities, following the approach that was initially proposed for Wikidata: generic concepts to describe lexical entities, extendable by creating as many instances of these concepts as the community needs to model them.
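
The tradeoff described in the fourth point can be illustrated with a small Python sketch. It is a toy model under assumptions, not Wikidata's implementation: the `Repository` class and all identifiers are made up for this example. Because entities and properties are generic, nothing in the store itself prevents a "construction date" from being attached to a human.

```python
from collections import defaultdict


class Repository:
    """A toy store of entities and generic (subject, property, value) statements."""

    def __init__(self):
        self.labels = {}
        self.statements = defaultdict(list)

    def add_entity(self, entity_id, label):
        self.labels[entity_id] = label

    def add_statement(self, subject_id, property_id, value):
        # Only existence is checked; nothing checks that the property
        # "makes sense" for this kind of subject.
        for entity_id in (subject_id, property_id):
            if entity_id not in self.labels:
                raise KeyError(f"unknown entity {entity_id}")
        self.statements[subject_id].append((property_id, value))

    def describe(self, subject_id):
        return [(self.labels[p], v) for p, v in self.statements[subject_id]]


# Placeholder identifiers, not real Wikidata IDs.
repo = Repository()
repo.add_entity("Q-person", "some human")
repo.add_entity("P-birth", "date of birth")
repo.add_entity("P-built", "construction date")
repo.add_statement("Q-person", "P-birth", "1900-01-01")
repo.add_statement("Q-person", "P-built", "1900-01-01")  # accepted by the store
```

A rigid schema would reject the last line. In the generic approach, catching it is left to community conventions and checks layered on top, which is the cost in structure mentioned above.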

The unknown at this point: it’s unclear how Wiktionaries will interact with this central repo. Some answers may be similar to Wikidata and other wikis: Extension:ArticlePlaceholder could be used to automatically generate term pages in a language edition that does not currently have one (yet), using the structure defined by a local community, from the central repo. This allows sharing lexical data between linguistic projects while letting them keep their specificities. There is an effort, pushed by individuals in client wiki communities, to create client-side editing, but this has not really been finalised yet. It may be unreasonable to expect that the plan for Wiktionary / Wikidata interaction on editing be settled before this is done (and Wiktionary could be included in those efforts). Is it worth waiting for a « definitive » answer to this question before starting the purely structured part of lexical data? Experience has shown that Wikidata developed and achieved things well before this, so in my opinion the answer is « no ».

You express doubts about:

The extensibility of the model to cover the full range of lexicographical data. Experience will tell whether the proposed model is community-extendable enough to express the subtleties, but to me it seems reasonable to apply to lexicographical data the same approach that Wikidata took to data: generic concepts, community extendable, maybe at the cost of some expressiveness.
The relationships between communities and projects: this is true but hardly avoidable. For example, if a structured data repo dedicated to lexicography were enabled on its own, this would not avoid a potential tension between the « text centric » lexicographers and the « data centric » ones, as exists in Wikipedias (EDIT (to push the reasoning to its end): plus with the Wikidata community, we would experience a triple-community pattern, not really an ideal solution imho). Plus, it’s unrealistic in both cases to replace a semi-structured wikt with a fully structured one in a heartbeat, so the two aspects will coexist for decades at least.
The licensing issues: It’s unclear that structured data can easily express the things in a dictionary that are for sure subject to copyright(-like) laws, that is, elaborate definitions written by lexicographers, or paragraphs about etymology and so on. This could be expressed in a structured way, but in an entirely different way, through items and properties, with the expressiveness tradeoff explained previously. Concepts like « Gloss » are not intended to replace them.

Did I forget something? Is there anything you do not agree with in this description of the state of the art? author TomT0m / talk page 09:40, 9 March 2018 (UTC)

Thanks for your well written answer. I'll follow your path, and try to answer to each dot.

Well, semi-structured data is not an apt name, because data are very structured in Wiktionaries. For French Wiktionary, a researcher made an XML version of the project and only had to correct about 60 errors (GLAWI). It's textual and queries are not optimal, but it can already be used by translators or language learners.
"share information between language versions": well, that's not easy because culturally, we do not see other languages the same way we consider our own. German speakers will not describe French the same way French is described in French (for example, wikt:de:voilà says it's an interjection, as in German, but wikt:fr:voilà says it's a verb). Some grammatical categories are common in one culture and can be used in a dictionary to describe foreign languages, but rare in another, where a different name may be used to help readers understand (for example, the optative is a grammatical category often rendered in dictionaries as subjunctive; it's very different from a linguistic point of view but more useful for readers. You may also consider that the definition of basic linguistic concepts such as adjective or verb may differ for each source language). Finally, there are traditions in the way information is conventionalized (for example, Latin cases are not displayed in the same order in France and in Germany, for no reason but tradition). Those problems may be solvable, but we had so much more work to do in describing 500k+ words in English (for English Wiktionary) and 350k+ words in French (for French Wiktionary) and adding synonyms, thesauri, examples (more than 360k for French Wiktionary!), pictures, translations in plenty of languages and more. Managing this kind of very tricky problem might have interested a dozen experts, but it was not the aim of a collaborative project to work only with experts. In order to have a community of interested people, we deliberately skipped those unsolvable issues. It's not a flaw; it was a wise choice that let the projects grow into what they are now.
"It’s of interest to have machine readable datas of lexical datas", well, yes. For linguistic investigation, I can imagine multiple uses also.
The way you describe Wikidata goals is interesting.
I am not sure I get your point here.

"it’s unclear on how Wiktionaries will interact with these central repos." and that's a matter of concern for me, because it can lead to a fork situation, with similar but not identical data in Wikidata and in Wiktionaries. Plus, as Denny mentioned several times, he imagines a new community emerging in Wikidata rather than people migrating from Wiktionaries. I am sure a bunch of people will contribute to both projects (like Vigneron, Pamputt or JBerkel) but I think it will mainly be to check for mistakes, not to enhance existing data. If someone wants to add entries in Berrichon from an old PD dictionary, will they include them in French Wiktionary or in Wikidata? In a place where attestations of use can be added, pictures included and words in the definition linked, or in a place where one has to remember sequences of numbers that mean "Noun" or "Intransitive" in order to add the exact statement? Well, we'll see.

About doubts:

Is the model strong enough? Well, we'll see. I am already concerned by "Pronunciation" being singular, because that sounds very normative. There is no 1:1 correspondence between a word and a pronunciation. That's a myth and a normative vision. But if the project here focuses on giving the several possible pronunciations for each word, there is no need for semantics and it can be cool.
Tension between Wikidatians in general, Wikidatians interested in lexicography, Wiktionarians, but also the Wikidata devteam and other people that may develop tools for one community and not the others. Without a middle ground and a safe space, it will be very hard to communicate. It may be on Meta, but for now it's mainly on Wikidata, where Wiktionarians don't feel welcome. And it's almost all in English.
Licensing issues: I agree with your phrasing. To give an example for etymology: saying that a French word comes from Latin can be very wrong. Well, they may share a more or less common history but with plenty of steps in between. It may have come from regional uses, with changes in meaning due to contact with other languages or because the object it designates changed over time. For a nice example of a complex story, you can have a look at bataclan in French Wiktionary. I think this kind of story will not be represented in Wikidata. It is not a unique case; it is very common for etymologies to have long stories. Where there is none, it's because the word wasn't studied enough.
Finally, an important doubt you skipped is how this new project will document itself. New policies and help pages have to be written, and in several languages. I fear newcomers will spend hours reinventing pages already made in Wiktionary and that English speakers' points of view will be prominent in those. I haven't seen enough tools, spaces or methodology to favor communication between people who don't speak English well enough to participate.

Well, I spent more than an hour writing that but I may have made some mistakes because it's not my native tongue, so I apologize for that. Thanks again to TomT0m for his willingness to communicate here. Noé (talk) 11:26, 9 March 2018 (UTC)

@Noé: (in no particular order, no time) I guess the choice of modelling of language has to be something different from a one-to-one correspondence with what grammarians do in one country. In Wikidata we rarely use a model « as is », and what we get is either something new, a mixture of existing solutions, or the union of different viewpoints. For example, on the « voilà » example Structured Lexicography could present both, while frwikt alone would be comparatively poorer. The community decides in the end how to model the data; we are NPOV, so no viewpoint is to be excluded a priori.

« I am already concerned by "Pronunciation" being singular » well, if it stays that way, which is always something we can discuss with the devteam, chances are that we add a property « alternate pronunciation ». Just to give you an example of how the community can extend the model.

On grammar categories, this is something I don’t really know much about so I won’t say much, but I imagine there are relationships between them, like some being a particular case of another. It’s something we can deal with in Wikidata with properties like subclass of (P279) and queries and/or programmatic arbitrary access. This seems like the kind of problem we discuss every day in Wikidata, so we’ll welcome a discussion on these :)

"Finally, an important doubt you skipped is how this new project will document itself. New policies and help pages have to be written, and in several languages. I fear newcomers will spend hours reinventing pages already made in Wiktionary and that English speakers' points of view will be prominent in those. I haven't seen enough tools, spaces or methodology to favor communication between people who don't speak English well enough to participate." Mmm, the only way you could fully avoid that risk is to align a data model with the specific usages of one linguistic version of a Wiktionary; this means one data model per linguistic version, and no documentation of the French data model in Spanish, for example. This is indeed a fundamentally different approach from the current central approach, but it’s a lot of effort for a dubious gain if, for example, there are already structured datasets for French. The root idea of central repos is to share in the hope that we can make something better than the sum of the parts. I’d like to emphasize again that in practice Wikidata is not the kingdom of the « English viewpoint »; I think « we should not refuse data if one Wikipedia uses them » is one of our principles (is it written somewhere?) as a central repo.

"If someone wants to add entries in Berrichon from an old PD dictionary, will they include them in French Wiktionary or in Wikidata" You did not take into account the idea that they might include them in enwikt, which is interesting because one viewpoint is that there are 200+ Wiktionary forks :) I guess it’s of interest to understand how to work with each other, which means strengthening the links between communities and avoiding thinking in terms of a « safe place », which implies there is a war in the first place … why should there be a war? Importing them into Wikidata is not incompatible with importing them into frwikt. It’s in the interest of other Wiktionaries to import them into Wikidata to make them available to other Wiktionaries anyway. Information that cannot be represented (easily) in a structured way may need to be expressed in the wiki pages anyway. author TomT0m / talk page 12:29, 9 March 2018 (UTC)

Thanks for your comments.

For voilà, the model limits each word to one lexical category, so it may need two entries in the database. In French Wiktionary, there is a note with five sources, written from a neutral point of view, explaining the situation. A similar example in English, with a nice explanation, could be worthwhile.

"chances are that we add a property « alternate pronunciation »" and that's a bad solution. Every pronunciation is an alternative, and that still promotes a norm.

For grammatical categories, it is not a problem of subclasses but of linguistic analysis meeting the need for popularization. We adapt the metalanguage to the readers' knowledge, because it is made for humans, not for linguists. I, as a linguist, will be perfectly fine with obscure labels and very precise descriptions. But for the readers, we adapt the description and we do it based on our own language and our own learning processes (how school taught grammar). I have no idea how to have both levels in Wikidata.

"Wikidata is not the kingdom of the « English viewpoint »" well, the entire discussion about having the Lexeme namespace in CC0 and every Wikidata:Requests for comment are English only.

For Berrichon, I was talking about French Wiktionary because the sources for this language are only in English (to my knowledge) so you need to know this language if you want to discuss the data with other people. There are no final facts for languages as a language evolves, so it will be necessary to make changes in the entries, with new sources about the language or new attestations of use. There are often conflicts, because a dictionary is not a repository of words. I think it's impossible to fix a language in a database. That's maybe the main reason I am skeptical about this whole project. Noé (talk) 14:06, 9 March 2018 (UTC)

Re: Pronunciation. In Wikidata, it is common to have a singular property take multiple values. We have a qualifier "valid in place" and I am sure that <pronunciation> could have its own set of qualifiers like "valid in place", "applies to dialect", "applies to usage" (= formal, informal). Thus multiple pronunciations could be entered, all equally correct, and qualified if appropriate. Decisions like what qualifiers are useful and when they are required or optional would be made by the community through property proposals and gathering of support. - PKM (talk) 20:15, 9 March 2018 (UTC)
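
The idea of one singular-named property holding several qualified values can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Statement` class, the `values_for` helper and the sample transcriptions are hypothetical and are not the Wikibase data model or real lexeme data; only the qualifier name "valid in place" comes from the comment above.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One value of a property, with optional qualifiers (hypothetical model)."""
    value: str
    qualifiers: dict = field(default_factory=dict)


# Hypothetical data: a single "pronunciation" property with two equally
# valid values, each qualified by where it applies.
pronunciation = [
    Statement("/təˈmɑːtəʊ/", {"valid in place": "United Kingdom"}),
    Statement("/təˈmeɪtoʊ/", {"valid in place": "United States"}),
]


def values_for(statements, qualifier, wanted):
    """Return every value whose qualifier matches the wanted setting."""
    return [s.value for s in statements if s.qualifiers.get(qualifier) == wanted]


us_values = values_for(pronunciation, "valid in place", "United States")
```

Neither value is marked as the default, so nothing in this structure promotes one pronunciation as the norm.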

Thank you for your explanation. If a property can be singular or plural, why does the label stay singular? Is it not possible to display it in the plural when there is more than one value? Noé (talk) 09:22, 10 March 2018 (UTC)

@Noé: term versus word: @Lea Lacroix (WMDE): In Wikidata we have a « 1 wiki project page = 1 item » principle. Will there be a « 1 word, 1 page » principle for Wiktionary, or a « 1 page, several terms with the same spelling » one, or a more complicated one based on arbitrary access? I suspect this because the interwiki principle is handled differently. Is this question even settled at this point, or is it still among the unknown « relationships between lexical data and Wiktionary » answers? Depending on the answer to this question, your point may be a non-issue.

"I have no idea on how to have both level in Wikidata." Several possibilities, depending on the problem: if the simplified version is deducible from the more complex one, how to deduce it could be coded programmatically in a template or in Lua code. Only the complex version has to be stored on wiktionary. If that goal is difficult to attain, use community-defined properties and qualifiers dedicated to representing the simplified version, for example. I guess it’s typically the kind of question the devteam would like the community to answer. It’s exactly for this kind of potentially non-trivial problem that the core model is simple but community-extendable; it is hard to guess the right answer without experience. It is up to the community to use the extension possibilities smartly. This gives the community the freedom to define the model, along with how the data are to be used, and relieves the devteam from defining the perfect state-of-the-art structured lexical data model, which is a topic for linguists, which they are not.
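
As a rough sketch of the first possibility (deducing the reader-facing label from the precise one), here is a small Python example. Everything in it is an assumption made for illustration: the `SIMPLIFIED` table, the tradition key "school-grammar" and the `reader_label` function are hypothetical, and a real wiki would do this in a template or Lua module rather than Python. The optative/subjunctive pair is the example given earlier in this discussion.

```python
# Hypothetical table: for each presentation tradition, which precise
# linguistic labels are shown to readers under a more familiar name.
SIMPLIFIED = {
    "school-grammar": {"optative": "subjunctive"},
}


def reader_label(precise_label: str, tradition: str) -> str:
    """Return the label shown to readers; fall back to the precise label."""
    return SIMPLIFIED.get(tradition, {}).get(precise_label, precise_label)


shown = reader_label("optative", "school-grammar")
```

Only the precise label is stored; the simplified one is computed at display time, so the two levels cannot drift apart.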

"every Wikidata:Requests for comment are English only." Well, I remember making the effort of translating some early requests for comment into French; experience proved this did not really attract a lot of input. Anyway, in the past some people posted in their own languages in international spaces, but this practice has tended to disappear over the years. If the alternative is isolationism, however, we’re stuck in a deadlock.

"I think it's impossible to fix a language in a database. That's maybe the main reason I am skeptical about this whole project." I think that no one wants to « fix » a language, so I find this point kind of hard to understand. Wikidata has mechanisms to deal with evolving knowledge, which are good enough, and can deal with conflicting information by not choosing between them. We created qualifiers like statement disputed by (P1310) to deal with controversies, and there is no requirement that any of the claims be entirely consistent with other claims. It follows the same NPOV rules that Wiktionaries follow, accepting conflicting data as long as they are legitimately sourced. This should reassure you about the goals of the project, shouldn’t it? author TomT0m / talk page 09:50, 10 March 2018 (UTC)

Hi TomT0m, thank you for your patience (and for continuing this conversation in English even though we are both French speakers).

If the choice is set, I am eager to read arguments for each possibility.

I am not sure I understood your opinion here. The different ways of describing a language are different analyses; they are not compatible. For a word, being a noun or an adjective is not a natural property, it's just the way it is analyzed in a specific system of analysis. Another analysis, similarly valid, in the same language or in another, may conclude it is a different category. But well, let's see how the model will deal with that.

It is not a choice between a unique language or isolationism. There are plenty of strategies and it seems quite bad to me not to tackle this issue. I wrote an essay about communicating when there are different languages around, if you want to think about this problem.

"Wikidata has mecanisms to deal with evolving knowledge" this is something that was not well presented to Wiktionary communities. Is there some documentation around to read about this topic? I am reassured about the way Wikidata works, not about the goals of the lexicographical project. I am not sure how it will interact with Wiktionaries and how useful it may be. Nor am I sure who will have control over it. It's great to understand it, but not having any control over it still does not inspire confidence in me. -- Noé (talk) 21:57, 10 March 2018 (UTC)

"But well, let's see how the model will deal with that." Not sure what I can add here; I’m under the impression I said a lot but failed to make myself understood. The model will deal with that by allowing the different viewpoints to be expressed, once again. Maybe several lexical entities will be necessary to express that, maybe not. The rest is up to the community.

"I am not sure neither on who will have control on it" Not sure anybody actually controls anything in our projects, so I’m not sure I understand the question :) « Community » seems the best answer, whatever that means. And most of the time it means « the volunteer who takes the time to do something ». Occasionally he finds something to help, rarely somebody who has another idea on how to do things. Which leads me to another point …

"about communicating when there is different languages around, if you want to think about this problem." The problem is not to think about the problem, it’s to come up with a practical solution that works :) It’s cool to build a language, it’s way more difficult to make people use it in an actual project. We’re actually doing what we can and what seems to work :/ author TomT0m / talk page 14:37, 11 March 2018 (UTC)

Termination of wiktionary.org ?

Funny summary, WittyLama. I wonder if you actually read the discussions.
The problem here is that the current proposal attempts to replace Wiktionary.org with an incompatible solution. A compatible solution could be on this Wikibase or another dedicated server. The technical difference between these two should be minor.
This is different from what Wikidata did before: I don't think there were any complaints that interwikis were moved out of Wikipedia. Why would there be: the same is provided differently in a better way. The same goes for the external identifier stuff. Both were peripheral to Wikipedia and the integration of the Wikidata approach in the existing Wikipedia is fairly well developed.
For Wiktionary, the way the technical development is being instrumented, practically terminates Wiktionary.org without any agreement of the relevant community or an explicit decision from WMF board. This despite that it could function here or elsewhere, in a compatible way or in an incompatible way.
If there was a discussion where one or the other point was decided before, I'd be happy to read it. I asked about it further up on this page. Apparently others ask themselves the same question, while some guy thinks I had already written about it. Funny story.
--- Jura 15:16, 8 March 2018 (UTC)
Could you explain where this idea comes from? How can a content project be terminated by data? I think it's obvious to everyone that content can't be generated by data only (or only at a very low quality), and as far as I know, no one proposed a replacement. PS: the WMF board can't close projects; the LangCom is in charge of enforcing the closing policy (a policy which is not meant for active and common projects). Cdlt, VIGNERON (talk) 16:38, 8 March 2018 (UTC)

"The problem here is that the current proposal attempts to replace Wiktionary.org" No evidence at all is offered for this frankly bizarre assertion. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:25, 8 March 2018 (UTC)

@VIGNERON: At #Gap_between_Wiktionary.org_and_Wiktionary_namespace_at_Wikidata?, the conclusion seems to be that the features cover most if not all those needed for Wiktionary. Even if Wiktionary.org might linger on, practically it will be replaced by a structured Wikibase.
--- Jura 08:34, 9 March 2018 (UTC)

As has been repeated on the discussion page and in the FAQ, the project does not aim to replace or fork Wiktionaries. A lot of content that is currently on Wiktionaries will not be in Wikidata.

"There is plenty of content and community organisation work that is being done on the Wiktionaries that cannot and should not be transferred to Wikidata. Wikidata is just a tool to support the Wiktionary and other Wikimedia projects—it is not capable of replacing them and it should not be used as such." (FAQ, collaborative work between Lydia, Denny and me, September 2016)

"Wiktionary is much more than just the structured data. And the Wiktionaries would still continue to fulfill these additional functions, and could indeed focus on them and thus become more effective." (Denny, November 2016)

"Wikidata doesn't have the intention to replace the Wiktionaries with something new, but to provide a backend database that can support the Wiktionaries, if they choose so." (Denny, November 2016)

"We are not working against each other, and we don't want to. We don't want to steal the content and the communities from a project to another. We want to work together with the expertise we have on both sides." (Léa, November 2016)

"The Wikidata development team will not force any project to use Wikidata’s data, and their editors to edit on Wikidata. It’s up to the individual projects and editors to decide which parts and what data from Wikidata will be useful for them." (FAQ)

Lea Lacroix (WMDE) (talk) 11:21, 9 March 2018 (UTC)

These are declarations of intent, but looking at #Gap_between_Wiktionary.org_and_Wiktionary_namespace_at_Wikidata?, practically, there doesn't seem to be much left. Maybe the outcome of the development was better than originally planned. If the objective was to leave behind [Wiktionary] "community organisation work that is being done on the Wiktionaries that cannot and should not be transferred to Wikidata.", this seems odd, because this is exactly what's needed most.
--- Jura 11:17, 10 March 2018 (UTC)
@Jura1: "because this exactly what's needed most" This is a weird claim to me. Needed for which purpose? Why would one of the less structured parts of a Wiktionary, natural language sentences, be most needed in a project that aims to structure things? I guess you are taking the viewpoint of a person who needs to find a definition of a word in a language he can read, a « user centric » viewpoint. If it is only this, of course pure structured data are not very relevant … structured data are mainly relevant in other usages, for example Natural Language Processing theory and applications like translation, obtaining information about a word even if the only Wiktionary that has information about that word is written in a language you don’t speak, or linking terms with the Wikidata items associated with their meaning. These do not need to import the natural language parts of Wiktionaries. author  TomT0m / talk page 12:40, 10 March 2018 (UTC)
@TomT0m: Jura's dealt below with the main part of your comment -- that the community organisation that's been built up at Wiktionary is very much something that would benefit the Lexi project by being plugged into. But I'd like to address another thing you said (edited slightly): "Why would less structured parts of a wiktionary, eg natural languages sentences, be needed in a project that aims to structure things ?". People also occasionally question the inclusion of "unstructured data" in other parts of Wikidata. Such a question always makes me want to ask the following counter-question in response: what is or would be the benefit of excluding such content, making it inaccessible from WDQS, so that it could not be returned by a query? Obviously sometimes the content is a huge blob of a thing, like a Commons file or a Wikipedia article, and we can provide the access easily enough with a link. But for a small thing, like a sentence fragment, it seems to me far better not to exclude it, but for it to be able to be present along with everything else in the structured system. Jheald (talk) 14:56, 10 March 2018 (UTC)
@Jheald: "that the community organisation that's been built up at Wiktionary is very much something that would benefit the Lexi project by being plugged into." The problem with this is that it seems (or as I see it, correct me if I’m wrong) to be an organization split between the different linguistic versions, and that, as we aim to build a central repo, something new has to be invented anyway … so new spaces have to be created. So not a definitive answer. That does not mean the invention of a new space removes anything from the previous organisation. I don’t think so, even if some of the stuff will probably have to be updated …

On free/wikitext : It seems we live pretty well with the localized text datatype. Going further like allowing localised wikitext is technically something entirely different, as it would mean using the wikitext parser to render wikilinks for example, dealing with the (lack of) spec of wikitext, using site agnostic wikilinks, disallow templates ? (seems unrealistic to allow templates as every site has its own template set) … quite complicated to do well, or doomed to be hacky and messy. Worth the trouble ? author  TomT0m / talk page 15:19, 10 March 2018 (UTC)

"community organisation work that is being done on the Wiktionaries": Wikidata has project chats in various languages. Why leave that part at wiktionary.org and provide everything else from Wikidata.org? Really odd.
--- Jura 13:24, 10 March 2018 (UTC)
Oh, sorry, got confused. I’m not really sure where you are heading, however. I’ll not try to guess. author  TomT0m / talk page 13:42, 10 March 2018 (UTC)
Looks like this got side-tracked by some confusing green text. Anyways, if there are any practical gaps we haven't seen yet, please comment at #Gap_between_Wiktionary.org_and_Wiktionary_namespace_at_Wikidata?.
--- Jura 09:33, 11 March 2018 (UTC)

A way forward?

Comment Looking at this discussion through the lens of my 30+ years in technology project management, analysis, and consulting, I think what's missing here is a Community Liaison person to work with the Wiktionary and Wikidata communities.

This person's role would be to:

Listen to Wiktionary communities and determine what they expect from a structured database.
Document core user stories and desired features for structured data from a Wiktionary perspective.
"Translate" key concepts between the Wiktionary and WD communities (does this word mean the same thing to both groups?).
Help Wiktionary people understand Wikidata core concepts and norms (e.g. "pronunciation" [singular] does not mean only one value can be recorded).
Assess how well the initial implementation of lexicographical data in WD matches the needs/wants of the Wiktionary community.
Identify suggested changes and enhancements for community and technical discussion, review, and prioritization.

The tricky part would be filling the role. I believe the ideal person would be:

Fluent or at least competent in both English and French (preferably with some familiarity with other languages)
Familiar with linguistic concepts and terminology
Experienced as a community liaison or requirements analyst
Familiar with Wikidata
Familiar with Wiktionary

I think that's a tall order to fill. Similar roles in the Structured Data on Commons project are funded through an outside grant. Do we think a role like this is a good way to move forward, and how might we implement it? - PKM (talk) 20:45, 9 March 2018 (UTC)

I think Léa was hired to fill quite a similar position. She may know Wikidata better than Wiktionary but she has already spent a lot of time and energy facilitating the exchanges. But the project aims to help Wikidata much more than Wiktionary, so her role may have changed recently. She can describe her role better than I can.

Another problem may have been that Wiktionary communities are scattered over the globe and full of very shy people. To my knowledge, the only monthly meeting dedicated to Wiktionarians is in France, and Léa plans to come to visit us on the 5th of April. The English Wiktionary community never held a dedicated meetup and a large part of them have never met any other contributor. It is worse for the Spanish version, better for Czech, Italian and Hebrew. I do not know about the Russian, Polish, Armenian or German communities. We started a Tremendous Wiktionary User Group but it is still very hard to contact people in other communities. So, in a perfect world, a Community Liaison between Wiktionaries could be great.

Noé (talk) 21:27, 9 March 2018 (UTC)

Indeed, this would be an excellent idea. What would be the steps to establish such a Community Liaison role? --Psychoslave (talk) 09:27, 11 March 2018 (UTC)

Thanks PKM for summarizing my job position ;) Well, that's not exactly reflecting reality, for two main reasons:

I'm not dedicated only to the Lexicographical data project and liaisons for Wiktionary, I'm doing community communication for the entire Wikidata project, that also includes IRL events like workshops or conferences. That's why I was not very active around Wiktionary in 2017: I was mostly busy organizing the first conference dedicated to the Wikidata community. Now I'm catching up and I hope that we can work together even better.
I didn't have any particular knowledge of Wiktionary or linguistics when I started. I've been already learning a lot, both from my colleagues and Wiktionary people, and I'm still learning about concepts and processes, and it's really fascinating.

If you want to discuss how to improve communication between Wikidata and your Wiktionary community, feel free to contact me. Lea Lacroix (WMDE) (talk) 08:40, 12 March 2018 (UTC)

Historical periods of Japan

I'd like to use the historical periods of Japan (such as Edo period (Q184963), Azuchi-Momoyama period (Q319531), Muromachi period (Q334845), Kamakura period (Q236205), and so on) to describe some GLAM data. As far as I know, these are periods of the history of one country, distinguished by political and cultural changes. It would be useful to be able to get all of them in a query to create a timeline. A few issues:

At the moment they seem to be a mix of instance of (P31) era (Q6428674) and instance of (P31) historical period (Q11514315). Should we standardise on era (Q6428674) for these items? From the English descriptions, historical period (Q11514315) seems purely to be about time, whereas era (Q6428674) is about time, plus location, plus culture. If we say that a sculpture is from the Edo period (Q184963), we mean it is from a particular region and culture, not just a period of time.
How do I show the connection between these periods and the islands/country of Japan? I could make each of them instance of (P31) a new item for "era of Japanese history" and this item could be subclass of (P279) era (Q6428674), facet of (P1269) history of Japan (Q130436), or each item could connect somehow to history of Japan (Q130436).
Should an era (Q6428674) have a start time (P580) and end time (P582), or an inception (P571) and dissolved, abolished or demolished date (P576)? At present there seems to be a mix.
Should they be follows (P155) and followed by (P156) or replaces (P1365) and replaced by (P1366)?

WikiProject Cultural heritage has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.
Whatever solution is used for Japan will probably also be useful for periods of other cultures such as Kushan period (Q50374693). MartinPoulter (talk) 21:07, 9 March 2018 (UTC)

Possibly related property proposal (would this help you?): Wikidata:Property proposal/PeriodO definition ID. ArthurPSmith (talk) 21:58, 9 March 2018 (UTC)

Since asking the question, I've taken a data-based approach to see what properties are already in use, and on that basis I've gone with start time (P580) and end time (P582), follows (P155) and followed by (P156) and each item instance of (P31) era (Q6428674); part of (P361) history of Japan (Q130436) MartinPoulter (talk) 12:37, 12 March 2018 (UTC)
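
The modelling settled on above makes a timeline query straightforward. A minimal sketch in Python follows, assuming the items really carry instance of (P31) era (Q6428674), part of (P361) history of Japan (Q130436), start time (P580) and end time (P582) as described. The function names `build_era_query` and `query_url` are made up for this example, and the query has not been run against the live service here.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_era_query(era_class: str = "Q6428674", parent: str = "Q130436") -> str:
    """Build a SPARQL query listing eras of one history, ordered by start time."""
    return f"""
SELECT ?era ?eraLabel ?start ?end WHERE {{
  ?era wdt:P31 wd:{era_class} ;
       wdt:P361 wd:{parent} ;
       wdt:P580 ?start ;
       wdt:P582 ?end .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
ORDER BY ?start
"""


def query_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


japan_eras_query = build_era_query()
japan_eras_url = query_url(japan_eras_query)
```

Passing a different `parent` item would reuse the same pattern for the periods of another culture, provided its items are modelled the same way.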

Office holders and secondary Office holders

@jheald created a query for Ottoman Sultans and their Grand Viziers. This is a pattern that happens often: presidents and their vice presidents, governors and their lieutenant governors, etc. Such information can only be queried for when this information is available. Similarly, we can in addition to this show, for a territory like Kentucky, who the senators and representatives were at a given time.

What I am looking for is maps for countries like the US, UK or India that divide the country up into its electoral parts and show how this changes over time. When a part is clicked, the office holders at that time are shown (dependent on available information). My question: how far off are we from realising this presentation? The data is something we are working on. Thanks, GerardM (talk) 10:16, 11 March 2018 (UTC)

@GerardM: Thanks for the namecheck! Glad you found the query useful. With respect to electoral maps, the most impressive thing driven by Wikidata that I've seen so far would be this map of the 2017 German federal election results, created by User:Lucas Werkmeister, first unveiled as part of Wikidata's fifth birthday celebrations.

Probably the biggest gap to fill in order to create something similar for other elections at other times in other countries would be to find/create/upload to Commons geoshape files for the relevant electoral districts at the relevant point in time. The big blocker here is that (by fiat from the WMF team) the Commons 'data' namespace is restricted to be CC0 (see mw:Help:Map_Data), whereas almost all sources of such boundary information (eg OSM, most National Mapping Agencies, etc) require at least attribution.

As a result, there are very few geoshapes on Commons (tinyurl.com/ya72d2un)(note: the URLs aren't quite right in this query, the '+' signs need to be turned into spaces for links that work); and (perhaps as worrying), for what geoshapes are there, the current arrangement gives an enormous incentive for licence fraud -- ie uploading a shape derived from a non-CC0 source, failing to identify how it has been created (something that appears not generally to be documented on data pages), and pretending that it is CC0.

I agree, that being able to directly generate something like eg the national and local election maps posted by eg Twitter User @ElectionMapsUK would indeed be very nice. Jheald (talk) 11:26, 12 March 2018 (UTC)

The information for US governors of states and territories is largely complete. The US maps are PD. Thanks, GerardM (talk) 12:11, 12 March 2018 (UTC)

importing floating point numbers with quickstatements

How to import floating point numbers with quickstatements? I enter value 211.41 to quickstatements, but 211,409999999999996589394868351519107818603515625 gets added??? --WikedKentaur (talk) 18:54, 11 March 2018 (UTC)

quickstatements-should-not-add-decimal, perhaps; known issue, no prognosis. --Tagishsimon (talk) 21:12, 11 March 2018 (UTC)

Is there some kind of community voting process through which tools like quickstatements could get funding or coding support from WMF? --WikedKentaur (talk)

I don't think WMF support can prevent arithmetic underflow (Q669129). The fix for this is ready but Magnus hasn't approved it yet. Matěj Suchánek (talk) 08:39, 12 March 2018 (UTC)
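The symptom reported in this thread is what typically happens when a decimal literal passes through a binary floating-point number before being written out in full. The following is a minimal Python sketch of that effect, not QuickStatements' actual code; `to_amount` is a hypothetical helper, introduced only for illustration, that keeps the typed text as a decimal string.

```python
from decimal import Decimal

# Converting the float 211.41 to Decimal exposes the exact binary value
# that the float really stores, which is not exactly 211.41.
exact_binary = Decimal(211.41)

# Parsing the text the user typed keeps the value exact.
as_typed = Decimal("211.41")


def to_amount(text):
    """Hypothetical helper: normalise user input without going through float."""
    return str(Decimal(text.strip().replace(",", ".")))


print(exact_binary)              # long expansion of the stored binary value
print(as_typed)                  # 211.41
print(exact_binary == as_typed)  # False
```

Keeping quantities as strings or `Decimal` values end to end is one way to avoid the extra digits; whether the pending fix mentioned above works this way is not stated in the thread.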

Migration vocabulary translation campaign

As part of the European Year of Cultural Heritage, Europeana is running a campaign focused on personal stories of migration. Part of the campaign is to create a useful and multilingual vocabulary of 61 migration-related terms that GLAMs can use in their metadata. The translation of those terms is being coordinated on Wikidata.

If anyone would like to help translate the labels, descriptions, and the label of the relevant 'subclass of' item, visit Wikidata:Europeana migration vocabulary.

This project page tracks the translation status on Wikidata for all official EU languages (plus several other languages which have requested to be included too). Special assistance is requested for:

Adding the missing 'subclass of' property to several of the more 'difficult to model' items, and
Translations into Slovene, Slovak, Lithuanian, Hungarian, Croatian, and Greek, which are especially lacking.

Sincerely, Wittylama (talk) 21:24, 11 March 2018 (UTC)

Hi Wittylama. How were the languages selected? For example, I do not see Arabic, a major language. It appears that there are only European languages, am I right? Pamputt (talk) 06:01, 12 March 2018 (UTC)

Dear Pamputt, the project tracks all 24 official European Union languages (Europeana and the Europeana Migration project are E.U. projects, after all), plus any languages that people ask me to add. At the moment those are Armenian, Ukrainian, two Norwegian languages, Esperanto, Basque, Catalan, Albanian, Macedonian, Upper Sorbian, Manx, Sami... If you would like to take part in this campaign by translating into another language, I can add it on request. Wittylama (talk) 09:44, 12 March 2018 (UTC)

Facto Post – Issue 10

The latest issue of the Facto Post newsletter is at w:User:Charles Matthews/Facto Post/Issue 10 – 12 March 2018. If you would like it delivered on enWP, please sign up at w:Wikipedia:Facto Post mailing list. Charles Matthews (talk) 12:32, 12 March 2018 (UTC)

Merge

I think that Template:Infobox Japanese clan (Q14398784) and Q21293209 represent the same template. How can I merge them?--Auric (talk) 12:26, 18 March 2018 (UTC)

Done. See WD:Merge for guidance. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:42, 18 March 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:42, 18 March 2018 (UTC)

WikiCite?

What's the status of efforts to use Wikidata with bibliographic citations?

From Module:Cite, I get the impression that it still needs attention from developers before it can be used routinely. Please see my question on this at "Module talk:Cite#WikiCite with URLs?". Thanks, DavidMCEddy (talk) 14:00, 18 March 2018 (UTC)

I have replied there. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:18, 18 March 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:18, 18 March 2018 (UTC)

Script to parse Wiktionary entries

I wrote a script that can parse Wiktionary entries and output structured data, in preparation for the introduction of lexicographical data. Maybe someone already did that, and I just reinvented the wheel, but I couldn't find it anywhere. Since senses won't be included in phase one of the lex data rollout, and since there are some copyright issues there when it comes to automated importing, I haven't included them in this script. That also means excluding split etymologies and anything nested within them, as well as translations. What the script does pull is: etymology (sometimes), pronunciation, alternative forms, derived terms, related terms, hyponyms, and anagrams. The script only works for English right now, but I don't think it would be too hard to generalize to other languages.

As an example, here's what happens when you parse the entry on the word wiki:

{'pronunciation': {'accents': {'unknown/all': {'en-pron': 'wĭʹkē'}}, 'audio': {'US': 'en-us-wiki.ogg'}, 'rhymes': ['ɪki', 'iːki'], 'homophones': ['wicky', 'weaky']}, 'derived-terms': {'from-noun': ['MediaWiki', 'Wikidata', 'WikiLeaks', 'Wikimedia', 'Wikinews', 'Wikipedia', 'Wikiportal', 'Wikiquote', 'Wikisource', 'Wikispecies', 'Wikiversity', 'Wikivoyage', 'Wiktionary', 'Wikipedist', 'Wiktionarian', 'interwiki', 'wikify', 'wikiholic', 'wikilink', 'wikiness']}, 'hyponyms': {'of-noun': ['public wiki', 'private wiki', 'protected wiki']}, 'anagrams': ['Kiwi', 'kiwi']}

I'm almost entirely self-taught when it comes to Python, so I'm sure there's tons of things in the code that I could be doing better. And I'm also sure there are features that could be added. So edits to my code are very much welcome. — PinkAmpers&^{(Je vous invite à me parler)} 03:29, 12 March 2018 (UTC)

Hi PinkAmpersand! Keep in mind that no massive data import needs to be done for the first version, which will be released in April. Pamputt (talk) 05:57, 12 March 2018 (UTC)

PinkAmpersand given the amount of data generated by Lua modules it's impractical to use XML dumps as a base. I think the best way is to work directly on HTML dumps (which should be available soon). The more interesting question is what will happen after importing data from Wiktionary. If the imported data isn't used on Wiktionary we'll effectively have a fork situation. – Jberkel (talk) 11:05, 12 March 2018 (UTC)

@Jberkel: « we'll effectively have a fork situation » That only holds if it’s not used in « any » Wiktionary. There are many of them, and if English lexicographical data is used by another language version, the situation will be far more complex than a « fork » situation :) and that’s only if you consider frwikt informations about english are not already effectively a fork of enwiki’s one. author TomT0m / talk page 11:16, 12 March 2018 (UTC)

TomT0m Well yes, this is already the status quo with parallel Wiktionaries, but they have evolved separately and weren't cloned through automatic processes (in most cases at least). Maybe there will be some "natural" convergence to one or the other version. If wikidata attracts more collaborators for lexical data (as often assumed in other discussions) the rate of change will be much faster and the Wiktionaries might adopt it very quickly (assuming quality contributions). – Jberkel (talk) 12:00, 12 March 2018 (UTC)

What? « only if you consider frwikt informations about english are not already effectively a fork of enwiki’s one »: of course they are not forks! French Wiktionary describes English language data in French, with references to the French semantic system and with French grammatical categories. English Wiktionary describes the English language in English, with references to the English semantic system and with different grammatical categories. Both also describe other languages in very different ways. They are not forks because they do not consider the vocabulary and grammar of each language in the same way. So English data imported from en.wikt and from fr.wikt will not map onto each other, and it's not only a difference of coverage. Ignoring this may lead to a misunderstanding of the Wiktionary project. Noé (talk) 13:11, 12 March 2018 (UTC)

@Noé: Pardon me, I used « fork » in a loose sense. But Wiktionaries share more than nothing imho … they have a non-empty overlap in the data they represent, they share most of their goals, and they share the same platform. (Although a degenerate case of a fork could be projects that share very little or nothing, having split at a very early stage and ended up with different models.) If that’s a problem for you, just replace « fork » by « edition », as can be read on the fr Wiktionary main page; this does not change my point much :) And to be sure, no, I did not expect Wiktionary editions to be one-to-one maps of each other. But we have already discussed this a lot and I have nothing to add. author TomT0m / talk page 15:13, 12 March 2018 (UTC)

@Jberkel: I don't quite follow. What about HTML dumps would be easier than reading template parameters? — PinkAmpers&^{(Je vous invite à me parler)} 12:33, 12 March 2018 (UTC)

PinkAmpersand Not all data is directly contained in the template parameters / XML. Some might come from a Lua module, or even Wikidata (going full 🔄). And templates are difficult to parse, especially when nested. You'll basically end up rewriting MediaWiki :) Parsoid produces annotated HTML which is in my opinion easier to work with. – Jberkel (talk) 12:41, 12 March 2018 (UTC)
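To illustrate why nested templates are awkward for a regex-based reader, here is a minimal Python sketch that splits one template call into its name and top-level parameters by tracking nesting depth. `split_template` is a hypothetical helper, not part of the script under discussion, and it deliberately ignores cases such as triple braces, `<nowiki>` sections and Lua-generated content.

```python
def split_template(wikitext):
    """Split '{{name|a|b}}' into (name, [params]), respecting nesting."""
    if not (wikitext.startswith("{{") and wikitext.endswith("}}")):
        raise ValueError("not a template call")
    inner = wikitext[2:-2]
    parts, current, depth = [], [], 0
    i = 0
    while i < len(inner):
        pair = inner[i:i + 2]
        if pair in ("{{", "[["):      # entering a nested template or link
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):    # leaving a nested template or link
            depth -= 1
            current.append(pair)
            i += 2
        elif inner[i] == "|" and depth == 0:
            parts.append("".join(current))   # only top-level pipes separate parameters
            current = []
            i += 1
        else:
            current.append(inner[i])
            i += 1
    parts.append("".join(current))
    return parts[0].strip(), [p.strip() for p in parts[1:]]


name, params = split_template("{{der3|en|wikify|{{l|en|wikilink}}|[[interwiki|Interwiki]]}}")
print(name)    # der3
print(params)  # ['en', 'wikify', '{{l|en|wikilink}}', '[[interwiki|Interwiki]]']
```

A naive `str.split("|")` on the same input would cut the nested `{{l|en|wikilink}}` call into pieces, which is the kind of problem described above.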

I suppose I'd have to see what the Parsoid output looks like to form a full opinion. I just took the most obvious approach here. If that's not the best approach, that's fine. It was a fun exercise in coding regardless. — PinkAmpers&^{(Je vous invite à me parler)} 14:16, 12 March 2018 (UTC)

I know. I just thought this might be useful for down the road. And in the mean time, it could be used in a semi-automated fashion, with a human reviewing each individual edit. — PinkAmpers&^{(Je vous invite à me parler)} 12:33, 12 March 2018 (UTC)

FYI, a lot of Wiktionary parsers already exist. I am sure not all of them work, but some of them should. Pamputt (talk) 19:27, 12 March 2018 (UTC)

Imia/Kardak (Q2119012)

I think there is a problem in Imia (Q2119012). An IP changed the labels in many languages to Imia and described it in English as Unpopulated Greek islet in southeastern Aegean. AFAIK, there is a political dispute over whether these two islands are Greek or Turkish. In most wikis this situation is reflected by the article title: Imia/Kardak (Imia – Greek name; Kardak – Turkish name). But I'm not sure whether this should be the label and we should revert the latest edits, or whether one of these names (which?) should be the label and the other an alias. Wostr (talk) 20:27, 12 March 2018 (UTC)

Notification from edit summary

Greetings,

The ability to notify other users in edit summaries will be available later this week, on 15 March 2018. Other users can be notified if a link to their user page is provided in an edit summary. Some user-made gadgets and scripts that automatically put user names in edit summaries may need to be changed to put a colon in the link, such as [[:User:Example]]. You can change how you receive these mention notifications in your preferences. This feature was highly requested in the 2017 Community Wishlist survey, and feedback is welcome.

Thanks, happy editing to you. -Keegan (WMF) (talk) 21:09, 12 March 2018 (UTC)
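For script maintainers, the colon change described in this announcement can be sketched in Python as follows. This is an illustration only: `suppress_mentions` and the regular expression are hypothetical, and they cover only the English `User:` prefix, not localized namespace names or aliases.

```python
import re

# Matches the start of a user-page link that has no leading colon yet.
_USER_LINK = re.compile(r"\[\[\s*(User:[^\]|]+)", re.IGNORECASE)


def suppress_mentions(summary):
    """Rewrite [[User:Name]] as [[:User:Name]] in an edit summary."""
    return _USER_LINK.sub(lambda m: "[[:" + m.group(1), summary)


print(suppress_mentions("Reverted edits by [[User:Example]]"))
# Reverted edits by [[:User:Example]]
```

Links that already start with a colon are left untouched, so the function can be applied more than once without changing the result.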

Conditional Sentences

Hi there,

How can I represent a conditional statement in Wikidata? E.g. if the weather is rainy then the sky is cloudy. – The preceding unsigned comment was added by Sajjad.shirazy (talk • contribs) at 11. 3. 2018, 06:31 (UTC).

You use qualifiers (though I am not sure what is the best qualifier for this particular sentence).--Ymblanter (talk) 07:53, 11 March 2018 (UTC)

A proposal: ⟨ rainy weather sky ⟩ subclass of (P279) ⟨ cloudy sky ⟩ ; ⟨ cloudy sky ⟩ subclass of (P279) ⟨ sky ⟩ ; ⟨ cloudy sky ⟩ has part ⟨ cloud ⟩. This reads « any rainy weather sky is a cloudy sky » ; any cloudy sky is a sky and a cloudy sky has clouds. author TomT0m / talk page 10:00, 11 March 2018 (UTC)
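The reading given in this proposal can be sketched as plain triples in Python, with a small function that follows "subclass of" links transitively. The item labels are illustrative stand-ins (no such Q-ids are given in the thread), and `superclasses` is a hypothetical helper rather than any Wikidata API.

```python
# Statements as (subject, property, object) triples; labels stand in for Q-ids.
STATEMENTS = [
    ("rainy weather sky", "subclass of", "cloudy sky"),
    ("cloudy sky", "subclass of", "sky"),
    ("cloudy sky", "has part", "cloud"),
]


def superclasses(item, statements=STATEMENTS):
    """Return every class reachable from `item` via 'subclass of' (P279)."""
    found, todo = set(), [item]
    while todo:
        current = todo.pop()
        for subj, prop, obj in statements:
            if subj == current and prop == "subclass of" and obj not in found:
                found.add(obj)
                todo.append(obj)
    return found


print(sorted(superclasses("rainy weather sky")))  # ['cloudy sky', 'sky']
```

The conditional "if rainy then cloudy" is thus encoded as class inclusion rather than as a relation between two sentences, which is the distinction debated in the replies.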

@TomT0m: But I think it's a relation between two sentences, and we should not reduce them to relations between concepts. Also, for other logical operators like `&`, `|`, ... we will end up with a really unreadable phrase. Wouldn't it be better to have a section for relations between sentences? – The preceding unsigned comment was added by Sajjad.shirazy (talk • contribs) at 11. 3. 2018, 07:19 (UTC).

@Sajjad.shirazy: You may be interested in Web Ontology Language (Q826165) and description logic (Q387196), as those languages allow precisely this: expressing logical sentences as « class expressions » to define our concepts. Wikidata is not really about « relations between two sentences » … You may also be interested in union of (P2737) and disjoint union of (P2738) to express analogs of the or and xor operators. author TomT0m / talk page 09:47, 13 March 2018 (UTC)

Mess with streaming media URL (P963)

WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. @Danrok, Shisma:

There's a huge mess with streaming media URL (P963). It was originally proposed for streaming links of films or audio, but the German label was set wrongly: „Downloadlink“ (download link), so German-speaking users used it for adding links to the downloads of computer software and other digital files. It's nowadays used much more for this than for streaming URLs. Additionally, there are several uses in yet other ways. Should we split the property into two or change it to a more general type? I'd prefer splitting it. -- MichaelSchoenitzer (talk) 16:11, 11 March 2018 (UTC)

@MichaelSchoenitzer: « I'd prefer splitting it » I wonder why. There does not seem to be a risk of collision. I see no harm in generalizing it; we should avoid the useless complication of multiplying properties without a good reason. Rename it « electronic link to the final product » (« final product » to avoid confusion with a source code repo, for example). author TomT0m / talk page 16:19, 11 March 2018 (UTC)

@TomT0m: My main thought was that a streaming URL and a download URL are quite distinct in the way you can use them. (Downloading the live stream of the BBC is not what you want.) And we can't distinguish them via qualifiers, since a) we don't have a well-fitting qualifier and, more importantly, b) we often already use this property as a qualifier. So users would need to guess from context whether it is a downloadable or a streamable URL. Also, I don't think that there is really no risk of collision. For example, events or lectures sometimes have a live stream while the event is running and later a download link to an edited version of the video. -- MichaelSchoenitzer (talk) 16:31, 11 March 2018 (UTC)

@MichaelSchoenitzer: OK, I get it … my misunderstanding comes from the fact that in French the « streaming » idea seems to have been lost in the translation process; it’s just translated as « media URL », so I’m afraid it could have been used for non-streaming media files as well … author TomT0m / talk page 16:40, 11 March 2018 (UTC)

This is one I proposed originally, and it should be the URL where the subject item's live video feed can be found, e.g. a TV station that is available online, could also be things like radio stations that have a live video feed for the studio, parliament session video feeds, etc. Danrok (talk) 17:10, 11 March 2018 (UTC)

If there are no other proposals, I would create a new property "download link" and rename P963 to make it clearer that it is for streaming links. -- MichaelSchoenitzer (talk) 21:14, 11 March 2018 (UTC)

I've created the new property download URL (P4945) and Succu renamed the old one. -- MichaelSchoenitzer (talk) 13:58, 12 March 2018 (UTC)

Next we'll need to run a bot updating all these usages. -- MichaelSchoenitzer (talk) 14:18, 12 March 2018 (UTC)

@MichaelSchoenitzer: that's not how we do things here. We don't create properties by ourselves (without formal consensus), even for "no-brainers". We go through the proposal process. So you should delete the property and go to WD:Property proposals. --Edgars2007 (talk) 14:47, 12 March 2018 (UTC)

In my opinion... I was very surprised that we do not simply have a property download URL. --Jasc PL (talk) 16:07, 13 March 2018 (UTC)

“autopatrolled” entries to be removed from the logging table

Hello all,

This change might impact people who run tools based on the patrolling status of edits.

Currently, MediaWiki stores the information about whether an edit has been patrolled or autopatrolled in the logging table. This table is getting very big on Wikidata (200 GB), causing significant infrastructure issues.

Therefore, we plan to make the following changes:

Stop adding new entries for autopatrolling to the logging table
Remove the old entries for autopatrolling from this table
Since the distinction between autopatrol and manual patrol was introduced in April 2016, we need to remove every patrol action (manual or not) from before that date.
Include information about autopatrolling in the recentchanges table. The field rc_patrolled is currently 0 for unpatrolled edits and 1 for patrolled edits. In the future, it will be 0 for unpatrolled, 1 for manually patrolled, and 2 for autopatrolled edits.

This means that the information about whether an edit was autopatrolled will be accessible only in the recentchanges table, for 30 days. For now, manual patrolling actions will continue to be recorded in the logging table as before, and will remain visible on Special:Log. More details can be found in the technical RFC document, see phab:T184485.

We plan to deploy these changes on April 4th. The script removing patrol actions in the database may take several weeks to run.

If you’re maintaining a tool using logging.log_action = "autopatrolled", please consider changing your code to use recentchanges.rc_patrolled = 2. If this is going to cause large issues for an important tool, please let us know.
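For tool maintainers, the new meaning of `rc_patrolled` could be handled along these lines. This is a minimal Python sketch: the three values come from the announcement above, while `PatrolStatus`, `is_autopatrolled`, `is_patrolled` and the dict-shaped row are assumptions made for illustration.

```python
from enum import IntEnum


class PatrolStatus(IntEnum):
    """Planned values of recentchanges.rc_patrolled, per the announcement."""
    UNPATROLLED = 0
    MANUALLY_PATROLLED = 1
    AUTOPATROLLED = 2


def is_autopatrolled(rc_row):
    """`rc_row` is assumed to be a dict holding one recentchanges row."""
    return PatrolStatus(int(rc_row["rc_patrolled"])) is PatrolStatus.AUTOPATROLLED


def is_patrolled(rc_row):
    """True for both manually patrolled and autopatrolled edits."""
    return PatrolStatus(int(rc_row["rc_patrolled"])) is not PatrolStatus.UNPATROLLED


print(is_autopatrolled({"rc_patrolled": 2}))  # True
print(is_patrolled({"rc_patrolled": 0}))      # False
```

Code that previously tested `rc_patrolled = 1` to mean "patrolled in any way" would need a check like `is_patrolled`, since autopatrolled edits will carry the value 2.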

If you have any technical questions, feel free to write to user:Ladsgroup. Cheers, Lea Lacroix (WMDE) (talk) 13:11, 12 March 2018 (UTC)

Note: starting from tomorrow (14.03), the column term_entity_id in the table WB_terms won't be filled anymore. Lea Lacroix (WMDE) (talk) 15:48, 13 March 2018 (UTC)

Wikidata weekly summary #303

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- New request for comments: Notability and Commons

Events
- Upcoming: WikiIndaba in Tunis, 16-18 March. There will be several Wikidata-related sessions
- Upcoming: Wikidata workshop in Paris, March 16th

Press, articles, blog posts
- Structured Data on Commons is the most important development in Wikimedia's usability, by John Lubbock
- Data on the history of Scottish witch trials added to Wikidata, by John Lubbock
- German Wikidata Workshop on "Wikidata: Potential uses and application examples for digital cultural heritage" during the conference DHd 2018
- Automatically Generating Wikipedia Info-boxes from Wikidata, by Tomás Sáez and Aidan Hogan
- Linking ImageNet WordNet Synsets with Wikidata, by Finn Årup Nielsen
- Towards a Question Answering System over the Semantic Web, by Dennis Diefenbach et al.
- Practical Linked Data Access via SPARQL: The Case of Wikidata, by Markus Krötzsch et al.

Other Noteworthy Stuff
- WDQS updater switched to Kafka
- First version of Lexicographical Data will be released in April
- Wikidata:Database reports/Constraint violations updated
- "autopatrolled" entries to be removed from the logging table
- You can have a look at the Europeana migration campaign and help with translations in your languages
- Mix'n'Match new features: Creation Candidates and Top missing entries
- WDCM Journal: gender equity in Wikidata usage

Did you know?

Development
- Significantly (on average to 1/4th) reduced the number of changes from Wikidata showing up on the watchlists and recent changes on Wikipedias and the other sister projects. This way changes that do not affect an article should no longer show up. We're still holding off roll-out to Commons, Cebuano, Waray-Waray and Armenian Wikipedia because of scalability concerns.
- Working on optimizing one of the largest database tables (wb_terms) (phab:T188279)
- Fixing a bug on how Wikidata changes are shown on Wikipedia (phab:T189320)
- Continued addressing security review issues for Wikibase-Lexeme extension (phab:T186726)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Final note from Léa: thanks to the people who participated in the feedback page! Today's Weekly Summary is already improved thanks to your suggestions. Feel free to add more comments, and feel free to edit the newsletter yourself: all small contributions are welcome :)

Lea Lacroix (WMDE) 15:46, 12 March 2018 (UTC)

Hi User:Lea Lacroix (WMDE) - what was the update for Constraint Violation reports? The ones I've been looking at (for example Wikidata:Database reports/Constraint violations/P2427) have not been updated since the end of February - I checked about a dozen and they were all at least as out of date as that one. Can we get them to run more regularly? ArthurPSmith (talk) 17:19, 12 March 2018 (UTC)

The bot operator is having issues, he's currently trying to move his code to Wikimedia Toolforge. Note that the development team isn't responsible for these reports. Sjoerd de Bruin (talk) 17:56, 12 March 2018 (UTC)

Checking https://www.wikidata.org/w/index.php?title=Wikidata:Database_reports/Constraint_violations/P2427&action=history the comment seems strange.
--- Jura 21:00, 12 March 2018 (UTC)

Not sure if you're referring to my comment, but even if it was updated on 5 March, the top of the page states "Data time stamp: 27 February 2018, 11:02 (UTC)". ArthurPSmith (talk) 17:41, 13 March 2018 (UTC)

Hybosorus illigeri / H. roei

Regular readers will know that the current three sections at the top of this page all relate to User:Succu's editing of taxonomy-related items or to taxonomy-related property proposals. There is now a fourth issue.

Noting changes to species:Hybosorus illigeri and species:Hybosorus roei, which included the addition of the text:

New Case 3768 has been submitted to ICZN on 5 March 2018 in order to save the name illigeri Reiche, 1853. Under art. 82 of the Code when a case is under consideration by the Commission, prevailing usage of names is to be maintained until the ruling of the Commission is published. Therefore the name Hybosorus illigeri must be used, being in prevailing usage.

(that case, being "under consideration", is of course - significantly - currently unresolved); and that the matter had already been the subject of a previous ICZN case (Case 3400. Hybosorus illigeri Reiche, 1853 (Insecta, Coleoptera): proposed conservation by giving it precedence over Hybosorus roei Westwood, 1845) with a finding to the opposite (OPINION 2230 (Case 3400) Hybosorus illigeri Reiche, 1853 (Insecta, Coleoptera): precedence not given over Hybosorus roei Westwood, 1845); I created Hybosorus roei (Q50355361) and marked it with said to be the same as (P460)-Hybosorus illigeri (Q1945578) and vice versa.

Note also that P460's English description is:

this item is said to be the same as that item, but the statement is disputed

Succu refuses to accept this, despite my opening a talk page discussion and pointing out his own edits ([11], [12], [13]) which show that these items are conflated. His only response was to falsely accuse me of doing "only reverts".

He has offered no justification for his edits, and has now resorted to falsely accusing me of "trolling". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:38, 7 March 2018 (UTC)

To start with a first question: Could you please link to the publication of Case 3768 in The Bulletin of Zoological Nomenclature, Mr. Mabbett? I'm not aware of such an article. Thanks --Succu (talk) 18:52, 7 March 2018 (UTC)

OK. A second question: Why do you trust the replacement of the current ICZN ruling done by an IP, Mr. Mabbett? --Succu (talk) 21:35, 7 March 2018 (UTC)

Number 3: Why didn't you ask (=Do not remove sources) User:Hybosorus to provide a link to "his Case", Mr. Mabbett? --Succu (talk) 21:59, 9 March 2018 (UTC)

So there was an import of data from another wiki and the statement was removed from Wikidata as no reference was added to Wikidata and the reference from the sourcewiki couldn't be ascertained?
--- Jura 09:17, 11 March 2018 (UTC)
- As I said above, Succu has offered no justification for his edits. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:12, 12 March 2018 (UTC)

Did you overlook my questions directed to you, Mr. Mabbett, or do you simply ignore them? --Succu (talk) 21:59, 13 March 2018 (UTC)

How to indicate what song a participant performed during a music competition such as Eurovision Song Contest 1991 (Q208523)?

Looking at Eurovision Song Contest 1991 (Q208523), there are a number of artists stated in participant (P710). Since it's a song competition and the song is the actual winner, how can we indicate what song each participant was performing? I would presume we should add some kind of sub-statement to each participant showing the song, but how? //Mippzon (talk) 12:48, 11 March 2018 (UTC)

Is there no property for award contender or entrant? Because it seems to me the simplest solution would be to ditch the "participant" section and instead have properties along the lines of Eurovision Song Contest 1991 (Q208523) → award contender → Fångad av en stormvind (Q1004668) → performer (P175) → Carola Häggkvist (Q212524). Or to have both as redundant lists. — PinkAmpers&^{(Je vous invite à me parler)} 04:12, 12 March 2018 (UTC)

That's an interesting thought! Because participant (P710) only handles people (or groups of people) taking part in an event. But what about other competitions where things like movies, songs, art etc. are competing? Does anyone else know how to model that? //Mippzon (talk) 20:03, 12 March 2018 (UTC)

I would think an award contender property could handle all of those things. And if desired it could coexist with participant. So Häggkvist could be listed both as a participant and as the performer of "Fångad av en stormvind". For something like the Oscars, participant could be limited to people who actually appeared onstage (as a presenter, as a winner accepting the award, or as a musician), while award contender → performer → X could be used for anyone who's credited for a nominated song, whether or not they performed it onstage, or even whether or not they were actually nominated for that song. For instance, St. Vincent (Q238795) would be a participant in 90th Academy Awards (Q24636843), since she appeared alongside Sufjan Stevens (Q319502) performing Mystery of Love (Q47486876), but wouldn't be listed as a performer under the award contender property, since she wasn't credited in the single (and of course, unlike in Eurovision, it's the single that's nominated, not the live performance). If desired, one could set up something like this:

participant (P710): St. Vincent (Q238795)
applies to part (P518): performance of "Mystery of Love" at the 90th Academy Awards
object has role (P3831): performing artist (Q16010345)

Meanwhile, if a singer didn't perform (nor present), they'd be listed as the performer of an award contender but not as a participant. — PinkAmpers&^{(Je vous invite à me parler)} 14:44, 13 March 2018 (UTC)

Seems someone tried to do something like what you mean, at least for Eurovision. Compare Eurovision Song Contest 1956 (Q171784) and Eurovision Song Contest 1957 (Q154846). Which one is the best? //Mippzon (talk) 21:01, 13 March 2018 (UTC)

Armor

I've been doing some subclass cleanup on armor (Q20793164) - there's still a lot to do, but now armored fighting vehicle (Q130368) is no longer a <subclass of> clothing. If anyone has expertise in this area, please jump in! - PKM (talk) 21:45, 13 March 2018 (UTC)

Template:User

I am proposing changes to Template:User at Template talk:User#daask-proposal that may affect a significant number of talk pages. Please join the discussion there. Daask (talk) 01:04, 14 March 2018 (UTC)

numeric value (P1181) of pi (Q167)

I find this quite amusing:

3±0 (stated in (P248) First Book of Kings (Q131066), section, verse, paragraph, or clause (P958) 7:23)
3.1605±0.0001 (stated in (P248) Rhind Mathematical Papyrus (Q213540))

Even if this was stated in the given references, should we not refrain from incorrect statements or mark them accordingly somehow?

Other than that: 1 Kings does not really mention the constant $\pi$ , but says: “Then he made the molten sea; it was round, ten cubits from brim to brim, and five cubits high. A line of thirty cubits would encircle it completely.” (NRSV)

3 is the value one can derive if one takes 30 as the exact circumference and 10 as the exact diameter of a circle (though, ultimately, ex falso quodlibet). When it comes to empirical measurements, it is common, however, to give approximate values without saying that they are approximate (e.g., the total height of Burj Khalifa (Q12495) is given as 829.8 meters). Taking 30 and 10 to be rounded to the nearest unit: (30 ± 0.5)/(10 ± 0.5) = 3.010025062656641604 ± 0.200501253132832080. The cubit itself was an imprecise unit (I guess). It is the forearm length from the tip of the middle finger to the bottom of the elbow, but this differs from individual to individual, so from a literal reading where we take 30 and 10 as exact values, one might still argue that the cubits are not identical to each other. Furthermore, not every round shape is a circle. And to say that a cord of some length (along the neutral axis (Q1091301)) encompassed the sea is not to say that a cord had an identical inner length had it encompassed the sea.
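The rounded-measurement figure quoted in this paragraph can be reproduced with exact rational arithmetic. The following Python sketch assumes, as the paragraph does, that both 30 and 10 are rounded to the nearest unit; `ratio_interval` is a hypothetical helper name.

```python
from fractions import Fraction


def ratio_interval(num, den, half_step=Fraction(1, 2)):
    """Midpoint and half-width of (num ± half_step) / (den ± half_step)."""
    lo = (num - half_step) / (den + half_step)   # smallest possible ratio
    hi = (num + half_step) / (den - half_step)   # largest possible ratio
    return (lo + hi) / 2, (hi - lo) / 2


mid, half_width = ratio_interval(Fraction(30), Fraction(10))
print(mid, float(mid))                # 1201/399, about 3.0100
print(half_width, float(half_width))  # 80/399, about 0.2005
```

The interval runs from 59/21 to 61/19, so it is not centred on 3; that is why the midpoint comes out as roughly 3.01 rather than exactly 3.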

In the Rhind Mathematical Papyrus, there is the problem of what the area of a circular field with a diameter of 9 khet is. Solution: Subtract 1/9 from the diameter, leaving 8 khet. 8 times 8 is 64, so the area is 64 setat. I am not sure whether this method is meant to be exact (the RMP is a mathematical manuscript, after all). $\pi =256/81$ is what you would get from $64=9^{2}\pi /4$ , yet $\pi$ is not directly mentioned in the solution, nor do I know of any RMP passage where it is, and, again, ex falso quodlibet. -- IvanP (talk) 18:39, 12 March 2018 (UTC)
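Written out as a short derivation, and assuming (as the post above does) that the scribe's rule is read as an exact formula for the area of a circle of diameter $d$, the implied value follows from equating the two area expressions:

$$\left(d-\tfrac{d}{9}\right)^{2}=\left(\tfrac{8}{9}d\right)^{2}=\frac{\pi d^{2}}{4}\quad\Longrightarrow\quad \pi =4\cdot \frac{64}{81}=\frac{256}{81}\approx 3.1605$$

For $d=9$ this is the computation $8^{2}=64=9^{2}\pi /4$ from the post, and the result agrees with the value $3.1605\pm 0.0001$ listed for the Rhind Mathematical Papyrus.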

The provided example is a perfect example of what Wikidata actually is. It does not necessarily store the truth but also past views. Note that the most accurate value is marked as preferred, while the others are either normal or deprecated. See Help:Ranking. Matěj Suchánek (talk) 19:00, 12 March 2018 (UTC)

Ah, I now noticed the deprecated rank for 3±0 and 3.1605±0.0001. My other points remain, though (when it comes to empirical measurements, it is common to give approximate values without saying that they are approximate etc.). -- IvanP (talk) 19:11, 12 March 2018 (UTC)

@Matěj Suchánek, IvanP: You've hit the nail on the head here, not about the accuracy of religious texts, but about the inability for Wikidata to model a numerical value without specifying an accuracy. If we insist on conforming to Wikidata's standard about storing accuracies, we can take IvanP's line and list the value as 3 ± 0.2, which I believe is the best solution until we can model "accuracy not specified". @Swpb: Are you aware of a qualifier that allows us to state that the accuracy is unspecified? Deryck Chan (talk) 10:59, 13 March 2018 (UTC)

“not about the accuracy of religious texts” – Note that I am not a Biblical inerrantist, nor do I identify with any religion; I just find it dubious to say that someone claims $\pi$ to be exactly three when they just stated some measures. Even if the author did not know a better approximation of $\pi$ and used the value 3 to arrive at the values, this does not mean that they meant their values to be mathematically exact. By the way, the Septuagint gives a value of 33 instead of 30, but this may be an alteration. -- IvanP (talk) 12:51, 13 March 2018 (UTC)

If the source doesn't specify an uncertainty, as in the biblical reference, the uncertainty should be left as zero, not an arbitrary value. I guess you could say uncertainty corresponds to (P2571) "unknown value". But more importantly, I'd want to see on these deprecated values:

A statement supported by (P3680) qualifier
A reason for deprecated rank (P2241): in this case, one or more of incorrect value (Q41755623), ambiguity (Q1140419), item/value with less precision and/or accuracy (Q42727519)
A qualification that the given value is implied, rather than stated explicitly, in the source (maybe determination method (P459) = calculation (Q622821)).

Swpb (talk) 13:53, 13 March 2018 (UTC)

@IvanP, swpb: Thanks. I'm concerned with making sure we don't imply that the bible said something it doesn't. Leaving the value as 3±0 will be wrong for that reason even if ±0 is the "conventional" thing to do if we don't want to specify an accuracy. Along the lines of Swpb's suggestions, I have changed the reference from stated in (P248) to inferred from (P3452)First Book of Kings (Q131066) and added determination method (P459)calculation (Q622821) and reason for deprecated rank (P2241)item/value with less precision and/or accuracy (Q42727519). Deryck Chan (talk) 17:32, 13 March 2018 (UTC)
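For readers following along, the statement described in the comment above could be pictured roughly as follows. This is a plain-Python sketch, not the Wikibase JSON format; the dictionary keys are illustrative, the uncertainty is deliberately left out because it is still being debated here, and only the property and item IDs come from this thread.

```python
# Illustrative layout only: keys such as "value" and "rank" are assumptions,
# not the field names used by the Wikidata API.
statement = {
    "value": 3,
    "rank": "deprecated",
    "qualifiers": {
        "P459": "Q622821",     # determination method = calculation
        "P2241": "Q42727519",  # reason for deprecated rank = item/value with less precision and/or accuracy
    },
    "references": [
        {"P3452": "Q131066"},  # inferred from = First Book of Kings
    ],
}

for prop, target in statement["qualifiers"].items():
    print(prop, "->", target)
```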

Thanks, didn't know about inferred from (P3452). For the biblical value, why can't it be left as "3" without precision specified at all? IvanP's "nearest whole unit" logic is reasonable if we're forced to create an uncertainty value, but it doesn't really come from the source, nor do we have a way of indicating where it did come from. Swpb (talk) 18:14, 13 March 2018 (UTC)

@swpb: Some time ago there was a bot (or script?) that went around adding +/-0 to all integer quantities that didn't specify an error bound. This was quite frustrating because it put +/-0 onto a lot of census data that were rounded to 100 or 1000. I think the way the Wikidata UI processes error bounds might have changed and stripping the +/-0.2 may work this time. Deryck Chan (talk) 10:27, 14 March 2018 (UTC)

Wikiversity

Hi,

How can Wikidata help with Wikiversity's goals? I wonder whether we can make some quizzes using the data, or whether we can store data collected by students in a structured way, to manage them easily afterwards.--Juandev (talk) 13:07, 13 March 2018 (UTC)

Hello @Juandev:, your idea of quiz made me think about three games that have been developed by volunteers on the top of Wikidata's data:

Everything is connected, where you have to place people/concepts next to each other if they are connected in some way
Guessr, a game based on Commons pictures and Wikidata geolocation, where you have to guess on a map where the pictures have been taken
Stadt, Land, Wikidata (in German), a quiz game where you have to find names of cities, countries, etc. starting with a specific letter, and your answers are checked against Wikidata

For the rest, you can have a look at Wikidata:How to use data on Wikimedia projects if that helps. Lea Lacroix (WMDE) (talk) 09:47, 14 March 2018 (UTC)

Event tomorrow: w:User:Charles Matthews/Plant editathon

For any Wikidatans near Cambridge UK: this event 1800-2000 hrs on 15 March, which I'm organising, will have a large dimension of adding data for is pollinated by (P1703) and is pollinator of (P1704). (Those properties have hardly been used, by the way.) If you are not so near – you can of course participate remotely. Charles Matthews (talk) 09:43, 14 March 2018 (UTC)

Creation of item for a US regulation document

Notified participants of WikiProject Chemistry

Hi, I want to create an item for the following US regulation for reference purposes, 2012 OSHA Hazard Communication Standard; 29 CFR Part 1910.1200 (link), but I have no idea which data model is currently used for US regulatory documents. Can someone help me work out a good way to fill in that item, or at least point me to an item that serves as a good example? Thank you Snipre (talk) 15:57, 14 March 2018 (UTC)

Cognitive sciences

They are ATM subclasses of a lot of other domains : see cognitive science (Q147638). Are we correct on this ? – The preceding unsigned comment was added by TomT0m (talk • contribs) at 14:19, March 14, 2018‎ (UTC).

The meaning of subclass always seems a little shaky to me for such things, but in this case I could agree that these are reasonable relations. ArthurPSmith (talk) 19:03, 14 March 2018 (UTC)

@ArthurPSmith: I guess a relevant question when we deal with things that may be classes is « what are the instances »? Here it’s pretty unclear. I’d say a science is a corpus of experience and theories, plus the practice of the scientists of this field. I’d pretty much know what an instance of « cognitive science experiment » or « cognitive science theory » is, but not what an instance of « cognitive science » is. I’d be more confident if we considered a science a « field of knowledge », hence a part of the human knowledge corpus (maths would be a part of knowledge, for example), the human knowledge corpus being something that piles up over time.

My best guess to link « cognitive sciences experiment » to « cognitive sciences » would then be

⟨ cognitive sciences ⟩ has part(s) of the class (P2670) ⟨ cognitive sciences experiment ⟩

author TomT0m / talk page 19:30, 14 March 2018 (UTC)

Some help requested for cleanup of female items

Hi all, we had a lot of articles made on Wikipedias during editathons last week around International Women's Day, 8 March. Now we have some cleanup to do. Here is a list of women by occupation, and if you scroll to the bottom you will see lots of occupations where there is only 1. For occupations that are clearly not occupations, click on the item and then click "What links here", find the female item and try to select an appropriate occupation. Sometimes it is just a question of moving the occupation item to a different property, like "field of work" or "position held". Even if you do just one, every little bit helps. Thanks in advance. Here is the report: Wikidata:WikiProject Women/Number of women per occupation. Jane023 (talk) 08:02, 15 March 2018 (UTC)

hello Jane023

Rather than trying to find the woman in a sometimes very long list, I suggest using this SPARQL query: tinyurl.com/yah7b3of - just copy the code and paste it in the address bar - replacing the occupation value (here vocal jazz (Q1530455)) with the QID of the 'non-occupation' item you are looking for. It will be much quicker and will even detect male items with the same problem ;) --Hsarrazin (talk) 08:17, 15 March 2018 (UTC)
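The query behind the short link is not reproduced in this archive. As a minimal sketch of what such a lookup might look like, the helper below builds a SPARQL query for the Wikidata Query Service that lists items using a given QID as their occupation (P106). The function name and the `limit` parameter are hypothetical, and the real query may well differ.

```python
def people_with_occupation(qid: str, limit: int = 100) -> str:
    """Return a SPARQL query listing items whose occupation (P106) is `qid`."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a QID: {qid!r}")
    return (
        "SELECT ?person ?personLabel WHERE {\n"
        f"  ?person wdt:P106 wd:{qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Example with the value mentioned above, vocal jazz (Q1530455).
print(people_with_occupation("Q1530455"))
```

Because the query does not filter on gender, it would return male and female items alike, in line with the remark above.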

Yes thanks! Of course we need to cleanup the male items too. I tried making a report for just the Olympics but kept getting timeouts. Generally I only select those items that I assume will have almost zero incoming links (and there are lots of those too) Jane023 (talk) 08:25, 15 March 2018 (UTC)

I added a column with a person.
--- Jura 08:40, 15 March 2018 (UTC)

That makes it even easier! Thanks. Jane023 (talk) 08:49, 15 March 2018 (UTC)

Please test pings in edit summary

1. Read this:

"You can notify users in edit summaries. They will get a ping just as if they had been mentioned on a wiki page. phab:T32750"-- meta:Tech/News/2018/10

2. Sign up at https://wikidata.beta.wmflabs.org/ using a different user name and password (not the one you use here). You may create multiple accounts if you like, just put a note on their user pages.

3. Edit a page and put a username link in edit summary. Confirm that you are receiving the notification correctly.

4. Test at different pages and in different ways.

5. Report bugs to Phabricator.

6. Share this comment with other people on other wikis, in different languages.

--Gryllida (talk) 23:51, 8 March 2018 (UTC)

Special:Diff/646435431 --Liuxinyu970226 (talk) 10:25, 9 March 2018 (UTC)

I would say like this. Pamputt (talk) 18:19, 9 March 2018 (UTC)

It seems that the test by Mahir256 did not ping me. I do not know whether it is the same for Liuxinyu970226. However it works well on beta.wmflabs.org. Pamputt (talk) 19:57, 9 March 2018 (UTC)

@Gryllida: Why do we get notifications by people who revert to the last version I edited?
--- Jura 04:14, 16 March 2018 (UTC)
Because nobody had realized that. Matěj Suchánek (talk) 12:22, 16 March 2018 (UTC)
There is already a Phab about that. As a fast fix some admin could change the MediaWiki:Revertpage to the recommended version without linking the former editor. ( Reverted edits by [[Special:Contributions/$2|$2]] ([[User talk:$2|talk]]) to last revision by $1 )Grüße vom Sänger ♫ ^(talk) 12:41, 16 March 2018 (UTC)

Your ideas of projects for the Wikimedia hackathon

Hello all,

This year, the Wikimedia hackathon will take place on 18-20 May in Barcelona. Several Wikidata volunteers and members of the development team will attend. Like last year, there will also be a documentation sprint where people can work on improving documentation.

If you are attending, which Wikidata-related projects do you plan to work on?
Do you have ideas of features or fixes that should be worked on during the hackathon?
Are there documentation pages that need to be improved? (you can also participate remotely)

Feel free to link to Phabricator tickets. Note that the hackathon is only a 3-day event and some projects need a longer time to be achieved ;)

Cheers, Lea Lacroix (WMDE) (talk) 07:37, 16 March 2018 (UTC)

Hi all, I'm not coming to Barcelona but I am part of preparing the hackathon at Wikimania (Cape Town, July). I have some ideas to do with Wikidata assisting in generating article lists and/or stubs for new articles in the Wikipedias of marginalised languages. Is it okay to ask whether the Barcelona hackathon might see the first steps towards this?

Michaelgraaf (talk) 13:40, 16 March 2018 (UTC)

Two days left to make submissions for Wikimania 2018

Hello all,

The deadline for submitting talks, workshops and posters for Wikimania 2018 is Sunday, March 18th, at 23:59 UTC.

You can have a look at the submissions that have already been proposed around Wikidata, and you still have time to discuss with proposers or submit something yourself.

Cheers, Lea Lacroix (WMDE) (talk) 18:35, 16 March 2018 (UTC)

Conflation and amalgamation of two persons

It seems to come up once in a while. Maybe a summary of a suggested approach can help: Help:Conflation of two persons. What do you think of it? What's missing?
--- Jura 11:12, 15 March 2018 (UTC)

Looks good. Some aspects which might be missing:

Conflation can be a result of a bad merger, and in some cases (whatever that means, I have no idea right now) it might be better to undo the merge instead of the described procedure.
The problem is not restricted to persons, it can happen to any type of items (e.g. human settlements). Should we generalize that help page to address this?

—MisterSynergy (talk) 07:50, 16 March 2018 (UTC)

The approach recommended may be worthwhile, if the conflation has existed for some time. It should not be applied if a second identity has only recently been conflated with a long-standing and valid item. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:45, 16 March 2018 (UTC)

agreed, just the other day I had to fix a bad merge of Ottawa University (Q7109265) and University of Ottawa (Q627969), caused by a new sitelink added a few days before to the wrong one (and then somebody merging to fix interwiki links). I think keeping a "conflation" or "disambiguation" item around should only be done if one or more of the sitelinks is clearly about both/all items. ArthurPSmith (talk) 14:08, 16 March 2018 (UTC)

Thanks for your feedback.
1. I will try to add something about undoing recent bad merges.
2. I think it would be possible to generalize this, but the advantage of limiting the description to people is that it's fairly easy to follow and apply, even though there can be discussions about it.
Items about people also tend to be generally fairly stable and include plenty of (stable) identifiers, this isn't necessarily true of other fields. Human settlements can include much potentially problematic data with coordinates of one place and elevation data of some place nearby. An improvement still needs to be found and implemented and might be hindered by following this.
--- Jura 13:51, 17 March 2018 (UTC)

Company data project - R&D and notes on company EVENTS structuring / meta data

Notified participants of WikiProject Companies

Finally found a spare moment and wanted to add a few comments to this item which rolled off: Wikidata:Project_chat/Archive/2018/01#Wikidata_data_model_"deficiency" ...ONE event with 1 preceding entity and 2 succeeding entities. Really, I wanted to add more general comments about company events, and the meta data around events "sooner than later". IMHO having a robust and working notability policy for company inclusion/exclusion should come first. Events are a very tough area, seems like a potential "black hole". My notes were longer so I put them here: User:Rjlabs/WikiData Company Data Project#Company Events Rjlabs (talk) 21:18, 16 March 2018 (UTC)

I think we should err on the side of creating more items, rather than trying to stuff too much into a single item. We have properties like replaces (P1365) and replaced by (P1366) that can track the sequence of entities in the cases of mergers and splits. If an "event" has reputable sources for its existence (eg. there's plenty of news stories etc. about the HP split) then create an item for the event also and link to the various company items. For less important events, I assume adding significant event (P793) with generic values (initial public offering (Q185142) for example) and appropriate qualifiers would be sufficient? Then the question is, what events count as "significant"? ArthurPSmith (talk) 23:48, 16 March 2018 (UTC)

Arthur, great to hear your reasoned input. Agree that "plenty of news stories / from reputable sources" is a reasonable test for inclusion. More general comment at User:Rjlabs/WikiData_Company_Data_Project#Birth_&_death,_acquisitions,_merger_and_spin_off. As for what corporate events are eligible to become entities/items/objects in themselves how about a "white list" and "black list" for corporate events? Examples:

White list

Event (creates or destroys Wikipedia or WikiData entity/establishment objects, or changes data held in an entity/establishment info box in use at Wikipedia for companies) AND there is at least one reliable secondary source covering the event
8-K reported event that is filed (or international equivalent) for any public company that has stock or bonds actively traded on an established exchange.
Similar events for private companies that have at least 500 full time equivalent participants (employees, officers, directors, managers...) provided it was carried by at least one reasonably reliable secondary news source.

Blacklist

employee promotion news events unless the person promoted has an article in Wikipedia
promotional news release for new products in the ordinary course of business unless three or more tier one news outlets cover the new product
shareholder proposals receiving or likely to receive less than 10% of the share vote
etc.

-- – The preceding unsigned comment was added by Rjlabs (talk • contribs) at 03:57, 18 March 2018‎ (UTC).

Rate speed for QuickStatements

Hello,

Is there a way to select the edit rate for QuickStatements? Hoo man asked me to slow down with the batches, but I was not able to control the speed, and I was running only one batch at a time (no parallel jobs). Even with just one batch, Wikiscan was telling me I was running at a rate of 100/min. That speed strains the servers and I don't want to repeat that. Artix Kreiger (talk) 14:58, 13 March 2018 (UTC)

The weird thing is that you've edited with edit rate of 300/min without problems. Note that both editing rates aren't very nice to the server. Sjoerd de Bruin (talk) 15:06, 13 March 2018 (UTC)

Sjoerddebruin, I know now. That's why I am asking if I can have the option to limit the rate to 60/min or even down to 30/min, as opposed to letting the software run as fast as it wants. Artix Kreiger (talk) 15:19, 13 March 2018 (UTC)

Many tools have the same issue of not being able to control the edit rate. I see the same on Commons with Cat-a-lot and other tools, but I thought QS must have some built-in throttling, because its jobs take a long time to complete. 300/min is the same as 5/sec, and that does not sound high enough to worry about. If there is evidence that that is too high, then we could file a ticket to slow it down a bit. --Jarekt (talk) 20:41, 13 March 2018 (UTC)

There is a recent proposal by the development team (phabricator:T184948) to enforce a maximum speed of 100 edits/min. --Pasleim (talk) 00:39, 16 March 2018 (UTC)

Indeed. Thanks, Pasleim. If you have any input on this please add comments on the ticket. The details are still to be decided. --Lydia Pintscher (WMDE) (talk) 11:39, 16 March 2018 (UTC)

Easiest solution is for Magnus to put a sleep between edits in Widar. I recall having this discussion before and Magnus adding a throttle, but that might have been removed silently.

Assuming this is the right source, it would be just adding a sleep of a couple of seconds to the functions that create items and a bit shorter sleep to the main switch. Multichill (talk) 15:50, 18 March 2018 (UTC)
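To illustrate the idea of a sleep between edits, here is a minimal sketch of a throttle in Python. It is not the code of Widar or QuickStatements; the class name, its parameters and the 30 edits/min figure in the usage note are assumptions for illustration only.

```python
import time


class Throttle:
    """Space out calls so that at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / max_per_minute  # seconds between two edits
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next edit may start

    def wait(self):
        """Block until the next edit is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

A tool would create one `Throttle(max_per_minute=30)` and call `wait()` immediately before each edit request, which spaces edits two seconds apart. The clock and sleep functions are injectable only so that the behaviour can be checked without actually waiting.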

wikidata-terminator link from the intro is 404

Message: "The URI you have requested, /wikidata-terminator, doesn't seem to actually exist." – The preceding unsigned comment was added by 141.20.6.104 (talk • contribs) at 14:14, 18 March 2018‎ (UTC).

@Magnus Manske: --Pasleim (talk) 16:39, 18 March 2018 (UTC)

Splitting Filmweb.pl ID (P3995)

Any opinions about splitting Filmweb.pl ID (P3995)? See also Property_talk:P3995#Splitting up this property.

Notified participants of WikiProject Movies Queryzo (talk) 08:52, 14 March 2018 (UTC)

I just proposed both new properties for films and persons. Queryzo (talk) 16:34, 19 March 2018 (UTC)

Wikidata weekly summary #304

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- Open requests for adminship: Putnik, Okkn

Events
- Upcoming: Wikidata workshop in Eindhoven, NL, March 24th

Press, articles, blog posts
- Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders by Lucie-Aimée Kaffee et al.
- Semantic labeling for quantitative data using Wikidata, by Phuc Nguyen and Hideaki Takeda
- OpenStreetMap Interview: Andy Mabbett, Wikidata and OSM - The OpenCage Geocoder blog
- How we’re using machine learning to visually enrich Wikidata, by Miriam Redi on WMF's blog

Other Noteworthy Stuff
- The property suggestions were updated last week; the previous update was in December 2017. The most noticeable effect is the higher ranking of "family name" (P734) on items about people. Input about the suggester is still welcome.
- There is an early conversation about structured licensing and copyright on Wikimedia Commons.
- George, le deuxième texte (fr), a website querying Wikidata to find French female authors, in order to bring more diversity in the literature school programs
- New, configurable download page for Mix’n’match catalogs (example)
- The Su Lab is looking for a postdoctoral researcher to work on Wikidata in the Gene Wiki team
- Help:Conflation of 2 persons

Did you know?

Development
- Looking into current Lua usage to see where we can improve the Lua functions we provide (phab:T189506)
- When there is a constraint violation in a reference, the reference is now automatically expanded to make it more visible (phab:T177970)
- Looked into issues around notifying the Wikipedias about changes happening on Wikidata (sometimes delayed due to too quick bot editing) (phab:T189772)
- Fixed some translation issues in the embedded part of the Query Service (phab:T188990)
- Fixed an issue with usernames being broken for Wikidata changes in watchlist and RC on Wikipedia (phab:T189320)
- Optimizing a heavily used database table (wb_terms) (phab:T188279)
- Polishing a lot of things for lexicographical data first deployment
- Make it possible to remove a Form (phab:T173332)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 16:06, 19 March 2018 (UTC)

Common misspelling property?

I'm looking at Velhagen & Klasing (Q2512342). This publisher used a Blackletter font that made it difficult to read their own name, eg. here. Someone not familiar with Blackletter could easily read that as "Velhagen & Klafing", and OpenLibrary has a book recorded under the misreading Belhagen & Klafing. Due to German grammar, they are sometimes found under Velhagen & Klasings. I find all of these spellings well-attested on the internet.

If Wikidata had a means to indicate common misspellings, it could be used to help clear up all these issues. Is there such a means? I can't find a "common misspelling" property. Daask (talk) 16:24, 18 March 2018 (UTC)

You could perhaps use described at URL (P973) with a qualifier of subject named as (P1810) ... there's a working example at Ephraim Belemu (Q45384419) --Tagishsimon (talk) 22:55, 18 March 2018 (UTC)

Use title (P1476), official name (P1448), name in native language (P1559), or similar (depending on the domain), marked as deprecated, with a reason for deprecated rank (P2241)="misspelling", and a citation; like this, though you need a source that shows (or reports) the bad spelling, otherwise you're just reporting "original research". Be sure to use the correct spelling with a "normal" rank, too. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:21, 18 March 2018 (UTC)

@Daask: is it not sufficient to add the alternate spellings to "Also known as" on the item? ArthurPSmith (talk) 15:59, 19 March 2018 (UTC)

Thanks everyone! I must say that I am surprised to get three different answers on Wikidata where I would expect a preoccupation with standardization. I would use Tagishsimon's suggestion in cases where a work is talking about Ma'nu VIII but calling him Ma'nu VII by mistake or using a different regnal numbering system, or where a notable work misspells the subject. I hesitate to add sources that don't add much to our understanding of the subject with described at URL (P973), since that is my understanding of its purpose. I prefer Pigsonthewing's recommendation since it clearly marks it as a misspelling. Thanks again! Daask (talk) 19:18, 19 March 2018 (UTC)

Edit items imported from other databases

Is it possible to edit items imported from other databases, or must they remain exactly as they are written in the original database? I ask because I noticed that there are both "food ingredient" and "Food Ingredients" items. The second one is imported from MeSH. Can I merge the two items? --Malore (talk) 00:03, 20 March 2018 (UTC)

@Malore: If you are pretty certain they are about the same thing, yes definitely merge them. The more links we have between different identifiers, the better. ArthurPSmith (talk) 14:26, 20 March 2018 (UTC)

Is it OK to add exact match (P2888) to ImageNet synsets?

An AI library gives me ImageNet "synset" identifiers, for instance it gives me http://imagenet.stanford.edu/synset?wnid=n03868242 ("oxcart", warning: slow), and I want to translate that to bullock cart (Q1190777). In other words, I need to translate the synset I am given to a Wikidata QID. How to do that?

Reading this withdrawn proposal, my understanding is that the mapping does not really exist yet and should be done via exact match (P2888)?

I created an example edit here: https://www.wikidata.org/w/index.php?title=Q1190777&type=revision&diff=653336713&oldid=652657044

Is it OK to do more? Or is there a better way to do it? Or even better, is the mapping already available somewhere? Thanks! Syced (talk) 10:41, 20 March 2018 (UTC)

Yes, that looks fine to me; the conclusion of that proposal seems to be to do it that way. ArthurPSmith (talk) 14:29, 20 March 2018 (UTC)
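Once such exact match (P2888) statements exist, the reverse lookup from a synset to a QID could be sketched as a query for the Wikidata Query Service. The helper below only builds the query text. Its name is hypothetical, and it assumes that the stored value is exactly the synset URL in the form quoted above.

```python
def synset_lookup_query(wnid: str) -> str:
    """Return a SPARQL query for items whose exact match (P2888) is the synset URL."""
    if not (wnid.startswith("n") and wnid[1:].isdigit()):
        raise ValueError(f"unexpected synset id: {wnid!r}")
    url = f"http://imagenet.stanford.edu/synset?wnid={wnid}"
    return f"SELECT ?item WHERE {{ ?item wdt:P2888 <{url}> . }}"


# Example with the "oxcart" synset from the question.
print(synset_lookup_query("n03868242"))
```

If the statement from the example edit is in place and uses that URL form, the query should return bullock cart (Q1190777).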

What is Wikidata template?

What is a template in Wikidata and how is it used and where? Capankajsmilyo (talk) 10:48, 20 March 2018 (UTC)

It's the same as in Wikipedia, and can be used on any talk, community, help, chat page (like this one for example). As to the actual Wikidata database, I don't think there's any use for templates. --Anvilaquarius (talk) 11:37, 20 March 2018 (UTC)

Information about the Wikimedia Foundation global survey starting soon

Hello!

My name is Edward Galvez and I work for the Wikimedia Foundation. For those of you who do not know, the Wikimedia Foundation supports you by working on the software and technology to keep the sites fast, secure, and accessible, plus providing programs and initiatives to support free knowledge globally.

In about one week, the Foundation is starting a global survey to learn about the experiences and feedback of Wikimedians. I am writing here, because I wanted to share with you a bit more about the project. The survey is called "Wikimedia Communities and Contributors survey" and is conducted annually. We will send the survey to editors across all the Wikimedia projects, as well as Wikimedia affiliates and volunteer developers. This survey is going to be our way of making sure that we can hear feedback from a significant number of users from across the projects. This research supports editors and Wikipedia’s mission. This is our second annual CE Insights survey, and we look forward to improving it every year.

You can sign up to be notified about the results of the survey, or to learn how you can help with planning the survey next year. If you have any questions or concerns about this project, please feel free to send them to my talk page on meta or email me directly at surveys@wikimedia.org in any language. To see the results of last year’s survey, and to see how your feedback helps the Wikimedia Foundation support communities, you can learn more about this project, or read about frequently asked questions. You can also share your feedback on meta. Thank you! --EGalvez (WMF) (talk) 14:05, 20 March 2018 (UTC)

@EGalvez (WMF): I tried to sign up in the page that you provided to notify the results, but it gives an error when I enter "wikidata.org". Micru (talk) 14:24, 20 March 2018 (UTC)

@EGalvez (WMF), Micru: The value used in the template should be changed to "www.wikidata.org". Mahir256 (talk) 14:29, 20 March 2018 (UTC)

@Mahir256: Thanks! Micru (talk) 15:07, 20 March 2018 (UTC)

Good list

Hi, a list that I wrote was promoted to good list, but when I went to change the status in Wikidata, I didn't see any option for that. What can I do? Mr. Fulano (talk) 20:38, 20 March 2018 (UTC)

It seems we only have "featured list". You have to ask at Wikidata:Contact_the_development_team.
--- Jura 21:06, 20 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 16:51, 26 March 2018 (UTC)

Changing datatype of the existing property

If there is a need to change datatype of a property (used in <100 items) from string to item, can it be done without creating new property and deleting the old one? Where should I ask for such change? Wostr (talk) 22:35, 22 March 2018 (UTC)

This doesn't sound like something that could be converted. So yes, deletion is needed. It also helps to say the name of the property, instead of some vague message like this. Sjoerd de Bruin (talk) 22:40, 22 March 2018 (UTC)

It wasn't meant to sound vague; it's about Property:P728 and Property:P940. Okay, I'll create relevant property proposals then, thanks. Wostr (talk) 23:26, 22 March 2018 (UTC)

Confirmation from devs --Edgars2007 (talk) 12:58, 23 March 2018 (UTC)

Thanks, Wostr (talk) 14:02, 23 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 16:51, 26 March 2018 (UTC)

Health Info

My name is Paul Bliss and I work for Highmark Health and we were wondering if there is a need to have data on all kinds of health symptoms, diagnosis and treatments. If this information already exists, then I apologize for wasting your time.

If you don't have it and would like to see what we have to offer, please let me know what the process is to move forward.

Thank you for your time.

Paul Bliss PBhighmark (talk) 14:24, 19 March 2018 (UTC)

@PBhighmark: Yes, to start, please see Wikidata:WikiProject Medicine. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:14, 20 March 2018 (UTC)

More on Invaluable.com artist IDs

Some updates on Invaluable.com person ID (P4927) at Property talk:P4927#Info from Invaluable, relevant for anyone who writes about artists, not least @Gerwoman, Jane023:. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:59, 20 March 2018 (UTC)

SPARQL questionnaire

Paul Warren, of the Knowledge Media Institute, at the UK's Open University, has created a questionnaire to help understand how people are using SPARQL, and which may interest some of you. The questionnaire, which takes around 20 minutes, is available at: https://openuniversity.onlinesurveys.ac.uk/sparql-survey The closing date for the questionnaire is 14 April 2018. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:08, 20 March 2018 (UTC)

BTW, the answer to "how many triples?" is lots, comfortably above the highest option the survey offers. (Hat tip to User:Lucas_Werkmeister_(WMDE) on twitter). Jheald (talk) 18:18, 21 March 2018 (UTC)

Property Catalog

I stumbled upon an apparently wrong use of catalog code (P528) in Tannenwald concentration camp (Q47515290), which uses catalog (P972) as a reference, and not as a qualifier. But even the catalog value European Holocaust Research Infrastructure (Q21755493) is wrong, as it's not the catalog but the organization maintaining it. Quite a few items seem to be affected; can someone help to fix these? The more general question: when do we use the catalog property, and when do we propose a specific property for the identifier? I already (successfully) proposed two properties for identifiers of protected areas, but in Germany alone there seem to be many of them, as every federal state has its own catalogs. The two I proposed can link to an external website with formatter URL (P1630), which seems to be possible only with real properties. Ahoerstemeier (talk) 16:37, 21 March 2018 (UTC)

Wikidata:Statistics

I was just looking at Wikidata:Statistics, and two things struck me.

It tells us the number of items, but not the number of statements (or triples)
The pie chart does not have a separate 'slice' for academic papers.

Thoughts? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:58, 20 March 2018 (UTC)

2) yes, using "event" (publication) for academic papers is too general... --Infovarius (talk) 14:34, 23 March 2018 (UTC)

Alternate names property

Property alternative name (P4970) has recently been approved, as a qualifier for external ID statements, to record alternative names given in thesaurus or database (in addition to qualifiers subject named as (P1810) for preferred form of name, and native label (P1705) for preferred form of name in a given language).

Before I start populating the property, I was wondering: the property was approved with data type:string. But would it make sense to allow a language to be specified, ie to make it data type:monolingual string? Or is it worth keeping as it is, eg for use for personal names that are pretty much language independent, and thesauruses that have a preferred language, and to use native label (P1705) for translations that are language-specific beyond that?

As far as I know, there aren't many thesauruses that routinely distinguish between preferred and additional terms across multiple languages, so perhaps the property is okay as it is? Jheald (talk) 18:09, 21 March 2018 (UTC)

The Europeana Fashion Thesaurus v1 (Q28890038) distinguishes between preferred and alternate labels in its 11 languages. In many cases, the alternate names have been entered in WD as aliases. - PKM (talk) 20:59, 21 March 2018 (UTC)

The databases of the Getty Vocabulary Program (Q5554720) differentiate as well. --Marsupium (talk) 18:16, 22 March 2018 (UTC)
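To make the string versus monolingual-text question concrete, here is a minimal sketch of how the two data values differ in Wikibase's JSON representation. The helper names and the example label are hypothetical and only illustrate the shape of the data; this is not a recommendation for either datatype.

```python
import json


def string_value(text):
    """Data value for a property of datatype 'string': no language attached."""
    return {"type": "string", "value": text}


def monolingual_value(text, language):
    """Data value for a property of datatype 'monolingual text':
    the text always carries a language code."""
    return {"type": "monolingualtext", "value": {"text": text, "language": language}}


# Hypothetical alternate label taken from a thesaurus entry.
plain = string_value("fire-fighting suit")
tagged = monolingual_value("fire-fighting suit", "en")

print(json.dumps(plain))
print(json.dumps(tagged))
```

With a plain string, a multilingual thesaurus such as the ones mentioned above would need some other way to say which language each alternate name belongs to; the monolingual form carries that information in the value itself.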

Looking for some way to link to external definitions of a term within a document

Hi all

Is there a general property that allows you to link to external definitions of a term within a larger document? I know there are external IDs for different external catalogues, e.g. dictionaries, but is there something general for definitions? Something with a similar function to this?

Thanks

--John Cummings (talk) 14:30, 22 March 2018 (UTC)

Described at URL?
--- Jura 15:46, 22 March 2018 (UTC)
- Thanks @Jura1:, I think I want to be more specific than just a URL (unless the link links to a specific section). Is there something to say Described in, with a qualifier for a page number? and then a link to a URL? I'm just aware a lot of stuff is in .pdf. Basically I want to link to a specific phrase rather than a 300 page .pdf.... I feel like it kind of would work for Wikiquote or Wiktionary but it doesn't quite fit with either.... --John Cummings (talk) 17:27, 22 March 2018 (UTC)
  - I'm not sure how one could link a section, but pdf-urls can include "#page=299". Obviously, you can still use qualifiers for sections, but these won't link.
    --- Jura 17:32, 22 March 2018 (UTC)
    - I sometimes address this need with a really long <stated in> or <reference URL> reference, using qualifiers section, verse, paragraph, or clause (P958) (which has alias 'entry') and quotation (P1683). See bunker gear (Q339179) <described by source> for an example. - PKM (talk) 18:42, 22 March 2018 (UTC)
      - If the pdf document is a regularly used source, I would create an item to describe it, like a "document", "dictionary", etc., with all properties (author (P50), publisher (P123), publication date (P577), language of work or name (P407), full work available at URL (P953) or URL (P2699), etc.) and then use described by source (P1343)-"item" with qualifier page(s) (P304), and then URL (P2699) with the complete url, including page precision :) --Hsarrazin (talk) 08:53, 23 March 2018 (UTC)

@Hsarrazin:, @PKM:, @Jura1:, thanks for your suggestions, I think this will work. --John Cummings (talk) 15:12, 23 March 2018 (UTC)
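The approach suggested in this thread (a source item, a page qualifier, and a URL that opens the PDF at the right page) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the example URL and the item ID `Q00000` are hypothetical, the dictionary is a simplified summary rather than the real Wikibase statement format, and the `#page=N` fragment is the PDF open parameter that common viewers honor, though not every viewer does.

```python
from urllib.parse import urlsplit, urlunsplit


def pdf_page_url(url, page):
    """Return the URL with a '#page=N' fragment so that a PDF viewer
    supporting open parameters jumps to that page."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=f"page={page}"))


def described_by_source(source_item, page, url):
    """Simplified summary of a 'described by source' (P1343) statement with
    page(s) (P304) and URL (P2699) qualifiers. Not the real API payload."""
    return {
        "property": "P1343",
        "value": source_item,
        "qualifiers": {
            "P304": str(page),
            "P2699": pdf_page_url(url, page),
        },
    }


# Hypothetical source item and document URL.
statement = described_by_source("Q00000", 299, "https://example.org/glossary.pdf")
print(statement["qualifiers"]["P2699"])
```

Keeping the page number both as a P304 qualifier and in the URL fragment means the reference stays usable even where the fragment is ignored.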

Adding the Lexeme namespace to the licensing footer text

Proposal

Hi everyone,

As you might know we’ve been working on adding support for lexicographical data over the past year. We are now getting close to a first version and I am tidying up the last pieces before we can get started collecting lexicographical data here on Wikidata and remix, query and reuse that data to learn more about the languages of this world. You can check out the demo system with the current state.

One of the remaining tasks is around licensing. Since the beginning of Wikidata all our structured data is released under CC-0. This has helped significantly with spreading our data widely and quickly and thereby helping us give more people more access to more knowledge. Our current licensing footer text however explicitly mentions the main and property namespaces as the places holding data under CC-0. Since lexicographical data is in a new namespace we need to adjust this text.

I am convinced it is in the best interest of Wikidata to extend CC-0 to all structured data namespaces. The reasons (in addition to my reasons for CC-0 in general):

We have fared very well with CC-0 so far and many partners use it as one of the main reasons they are attracted to Wikidata - both for re-use and contribution of data.
Having a mix of licenses is a potential legal minefield that can be exploited by some actors, threatening not only re-users but also our own contributors. It is a huge hassle for re-users, in particular small re-users like individual contributors, hobby developers, and small organizations, and will lead to less usage by these, and thus to less spreading of our knowledge.
It is the sound thing for data - much better explained by Luis in his blog posts (1, 2, 3, 4).
It will mean that we can not import some data from Wiktionary and other sources that is incompatible with CC-0 but that is already the case now. We have always leaned towards making re-use easier at the expense of easy importing. (See input from the legal team for more details on what kind of lexicographical data can be protected.)

So I would like to adjust the license text to say “All structured data (e.g. main, Property and Lexeme namespace) is available under the Creative Commons CC0 License; text in the other namespaces is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy.”.

Cheers --Lydia Pintscher (WMDE) (talk) 09:34, 22 February 2018 (UTC)

Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:23, 22 February 2018 (UTC)
Of course, the adjustment has my full and Strong support, since I never doubted one moment this is the correct approach. Sannita - not just another it.wiki sysop 10:35, 22 February 2018 (UTC)
Strong support Linking data with a mix of licenses is just a Gordian knot. --Andrawaag (talk) 11:50, 22 February 2018 (UTC)
Question What's the plan for textual definitions?
https://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L15#Senses doesn't show any but the substance of https://de.wiktionary.org/wiki/Leiter seems to be these.
Similarly https://wikidata-lexeme.wmflabs.org/index.php/Lexeme:L13 compared to https://en.wiktionary.org/wiki/hard#English
Would they have to remain at Wiktionary, be reduced to statements as on the prototype, added to another namespace or need to be re-written?
--- Jura 11:53, 22 February 2018 (UTC)
- You can see some of the definitions on the demo system (where it says "Führungsperson" for example). They would be CC-0 too. Just like with Wikipedia now we will build the infrastructure and collect data here that the Wiktionary editors are free to use as they deem useful for their work. Hope that helps. --Lydia Pintscher (WMDE) (talk) 07:19, 23 February 2018 (UTC)
  - Somehow I doubt there is much room for Wiktionary.org to make use of Wikidata's Wiktionary namespace to annotate its content further. Are there any plans for a structured way to enable this? Maybe a separate Wikibase(Wikidata) instance as for Commons?
    --- Jura 14:26, 24 February 2018 (UTC)
The choice of a permissive license is unfortunate but not entirely surprising, given the big corporate players that funded the creation of Wikidata. It's a departure from the early ideals of Wikimedia projects, of creating content that will be always free down the line. Now there's no point debating this, since it would make no sense to make the namespaces have incompatible licenses. The real discussion with the community should've been carried out much, much earlier. NMaia (talk) 12:13, 22 February 2018 (UTC)
This is fundamentally inappropriate, and most of all repetitive: we've been through this time and time again. Of course there's no way to convince you that no "big corporate players" were involved in the discussion and that it was a community decision, if you're convinced otherwise. Source: I was there when we discussed it. --Sannita - not just another it.wiki sysop 23:53, 22 February 2018 (UTC)
Interesting, can you provide details on when and where this "community decision" was made "offline"?
--- Jura 07:00, 23 February 2018 (UTC)
It wasn't made "offline" - and this is a final notice: please, do NOT put in my mouth words I've never spoken - it was made in the mailing list of Wikidata, while the project was still in beta. The first discussion was made in April 2012, then another in August 2012, and these are the first two discussions I can find just by casually browsing the ML archives. Check them out yourself if you don't believe me, I've got work to do, and frankly I'm tired of repeating the same things all over again. --Sannita - not just another it.wiki sysop 09:05, 23 February 2018 (UTC)
I thought this was somehow related to Wiktionary, but it's about Wikidata in general. I took your "I was there" literally.
--- Jura 20:53, 23 February 2018 (UTC)
Strong support CC-0 has been a key of Wikidata's success. Mixing it with less-free licenses will create significant hurdles for on- and off-project users. --Magnus Manske (talk) 13:25, 22 February 2018 (UTC)
Support ArthurPSmith (talk) 15:59, 22 February 2018 (UTC)
Neutral I agree that using CC-0 for all data makes sense. On the other hand, I believe that using such a licence will not allow us to import a lot of interesting stuff from the Wiktionaries. Pamputt (talk) 18:13, 22 February 2018 (UTC)
- Yeah but in the long run I believe that is the better trade-off to make. We've made the same trade-off for the data in items. --Lydia Pintscher (WMDE) (talk) 07:23, 23 February 2018 (UTC)
Support obviously. VIGNERON (talk) 19:59, 22 February 2018 (UTC)
I also have the same question as Jura, since senses seem like they'd be derived from existing Wiktionary definitions. Mahir256 (talk) 21:30, 22 February 2018 (UTC)
Support This is indeed a major development. John Samuel 23:19, 22 February 2018 (UTC)
Support It would be crazy to start mixing licenses now, good to clear this up right away. I9606 (talk) 03:14, 23 February 2018 (UTC)
Support It might make it impossible to import data from Wiktionary, but in the long term it is better for reuse. I too am very attached to keeping data open, but Wikimedia has reached a stage where the embrace, extend and extinguish (Q1335089) strategy would not work against us anymore, so better to make the data as open as possible, which means CC0. Syced (talk) 06:21, 23 February 2018 (UTC)
Oppose I don't think of any lexicographer or linguist who may accept to publish under CC0 a work they spent five to twenty years on. CC0 does not respect the time spent on collecting words and meanings, structuring the language for a dictionary, and editing. CC0 favors companies that will just use the data without any concern for spreading the knowledge; it will not reinforce free reuse but only the stealing of data. Finally, I think this decision concerns Wiktionarians and deserves a better explanation of the problem, one that includes the pros and the cons. At this point, I still consider that you are making a fork of Wiktionary in Wikidata with your own agenda. -- Noé (talk) 07:54, 23 February 2018 (UTC)
- Who is being asked to "publish under CC0 a work they spent five to twenty years on"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:31, 23 February 2018 (UTC)
- How does CC0 not respect the time spent? I think we are confusing the legal license with the provenance. If the data provider also provides the proper references and qualifiers, it is respecting the work spent, don't you think? Yes, people can use Wikidata content without pointing to this provenance, but what is its value if they can't support those claims with original sources? Also, a lot of work is made possible with public funds, so not sharing those results under a public license is quite unfair. I agree with Andy: nobody is pushing people to share their knowledge under CC0; if you don't like it you don't need to. But for those who would like to share knowledge publicly, CC0 provides the means. Different scientific resources did make the change to share the knowledge with the general public eg: example --Andrawaag (talk) 12:04, 23 February 2018 (UTC)
  Andy: Well, you're right, lexicographical data in Wiktionary could be written only by individuals and never by big imports from published sources. Good luck to start again from scratch.
  
  CC0 does not respect the time spent because it does not force reusers to mention the source of information. If references are provided, it makes no difference whether it is distributed under CC BY or CC0. Public funds = sharing with a public licence, I agree, and CC BY-SA is also a public licence, lucky us. I pointed out that I am quite sure a CC BY-SA licence may create a better environment for including complete recent works directly given by their authors. You may not agree, but no study was provided for or against this, and I think a proper analysis and outlook have to be made before such a vote. Noé (talk) 12:32, 23 February 2018 (UTC)
  Instead of rhetoric, please answer my question; "Who is being asked to 'publish under CC0 a work they spent five to twenty years on'"? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:50, 23 February 2018 (UTC)
  It was not rhetoric, maybe vague broken English by a non-native speaker, but not a manipulation through figures of style. I am not that mean. So, you cut half of my sentence. I was assuming a lexicographical project would like to have lexicographers participate (people who have already made dictionaries, or linguists who have done some lexicographical work and have already thought about dictionary issues), and I wrote that CC0 will not convince this kind of profile to share data. But I was probably wrong to assume such a goal for this project. The more I read about this project, the more I realize it is grounded not on the needs and knowledge of lexicographers or Wiktionarians, but on the needs of Wikidatians and a vague idea of linguistic and lexicographic practices and difficulties. Noé (talk) 14:19, 23 February 2018 (UTC)
  @Noé: So in your opinion, we should re-license Wikidata as WTFPL (Q152481) which is a little better than CC0 for the Public Domain software usages, but that opinion is not recommended by Free Software Foundation (Q48413) (cf. https://www.gnu.org/licenses/license-list.en.html#WTFPL). --Liuxinyu970226 (talk) 15:18, 23 February 2018 (UTC)
  I was not postulating anything for Wikidata in general, my messages were about the namespace for lexicographical data. As I understand it, WTFPL is made for software, not for data, so I don't get your point here. Noé (talk) 15:35, 23 February 2018 (UTC)
  
  Your English is clear. You said that you: "don't think of any lexicographer or linguist who may accept to publish under CC0 a work they spent five to twenty years on"; and I was asking you for evidence that anyone is being asked to do that. It now seems that you concede that no-one is. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:45, 23 February 2018 (UTC)
  @Pigsonthewing: Yes, I concede I had a wrong vision of this project's goal. I thought Wikidata is a project that hosts databases, and that Lexeme would be a storage place for several lexicographical databases. So, I thought it would be a place to include data collected by lexicographers, such as published dictionaries, published wordlists or existing online databases (for example RefLex). It appears I was mistaken and Lexeme will only host one lexicographical database. So, published documents will not be added into it. Lydia's first point was about partners, and I imagined it was an implicit mention of owners of lexicographical databases. If that was the plan, CC0 would be the wrong option in my opinion, knowing a lot of people who have already published this kind of database. As I wrote, my opinion is not a good indicator. A prior consultation directed at such database owners could help base the choice of licence on stronger arguments. Finally, this consultation appears to be unnecessary. Fine. Then, I'll keep my vote to oppose, per Yair rand's arguments. -- Noé (talk) 13:43, 26 February 2018 (UTC)
  You'll note that Yair rand wrote "I think that CC-0 is the right license for this". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:30, 27 February 2018 (UTC)
  He also wrote "The discussion is happening here, as opposed to with any of the Wiktionary communities, which is inappropriate." and I keep my oppose vote, against the procedure itself, not the choice offered here. Noé (talk) 18:27, 1 March 2018 (UTC)
  Yes, and I have explained below why he was wrong to write that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:19, 1 March 2018 (UTC)
Question If Wikidata data is under CC0 and a Wiktionary wishes to include some, is this copyfraud? Noé (talk) 08:04, 23 February 2018 (UTC)
- No that is fine. --Lydia Pintscher (WMDE) (talk) 08:09, 23 February 2018 (UTC)
  - No, it's not fine. What Wikidata community is already doing is massive copyfraud. I found a jurist specialized in free licenses and so far she confirmed that this doesn't seem legal at all. --Psychoslave (talk) 09:24, 23 February 2018 (UTC)
    - Psychoslave, you misunderstood the question (or maybe Noé didn't ask what he meant to ask). The question here is the reuse of Wikidata data outside Wikidata, so in this case the re-user is responsible; there is no way the Wikidata community can do copyfraud in this scenario. I'm guessing you are thinking of the import of data from an external source into Wikidata (here copyfraud by the Wikidata community is technically possible), but this is a different subject and one that has been raised multiple times already and even answered with some professional legal advice. Cdlt, VIGNERON (talk) 18:56, 23 February 2018 (UTC)
      - You are totally right that I was talking about data from an external source inside Wikidata, and this also has consequences for re-users. If Wikidata is creating a data bank by illegal means, then any re-user of its data set is also affected. I agree that using another license would not completely solve this problem, but using CC-by-sa would at least solve it for imports from Wiktionary. Would you be kind enough to provide links to the legal advice you have in mind? --Psychoslave (talk) 10:09, 28 February 2018 (UTC)
- Lydia, can you provide evidence for your assumption? Noé (talk) 12:32, 23 February 2018 (UTC)
  - The CC0 license states: "You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission." So using CC0 data on Wiktionary is fine. --Jarekt (talk) 17:00, 23 February 2018 (UTC)
Strong oppose To no surprise for those who followed my research on the topic, I strongly oppose this. I will give more feedback below as soon as I find time for this. --Psychoslave (talk) 09:24, 23 February 2018 (UTC)
Neutral A database of lexical information lacking definitions will be quite bland. So what will happen? Data will be imported from CC0-compatible sources or reworded, probably starting with definitions from WordNet and out-of-copyright dictionaries. Sounds like a reboot of Wiktionary in a way, this time with a more permissive license. – Jberkel (talk) 09:42, 23 February 2018 (UTC)
Support - CC0 FTW! Wittylama (talk)
Support --Jarekt (talk) 17:00, 23 February 2018 (UTC)
Comment Why is this discussion happening here? Surely the lexemes aren't going to be managed by the Wikidata community -- they are for the Wiktionary community to administer, and will be subject to that community's policies on content and every other aspect -- just as the upcoming CommonsData wikibase will be administered by Commons, not by us. That's one of the points for them being federated wikibases. This is not a decision for us to make. It is up to that community to choose how they wish to licence their work. My view therefore is we have no standing here; this is not our choice to make. This discussion is therefore not appropriate and should be closed, and/or re-started in a more appropriate forum. Jheald (talk) 18:07, 23 February 2018 (UTC)
- The data is going to be here on Wikidata in a new namespace. It is the license of the content on Wikidata. --Lydia Pintscher (WMDE) (talk) 18:12, 23 February 2018 (UTC)
  - But do we think we, the Wikidata community, are going to be the ones administering it, making day-to-day rules and guidelines for its content and organisation? Or the Wiktionary community? Far better, it seems to me, if the Wiktionary community felt that they were the owners of these items. Jheald (talk) 18:45, 23 February 2018 (UTC)
    - @Jheald: Just like Wikidata was not created with the only purpose to support Wikipedia, lexicographical data on Wikidata doesn't have the only purpose of supporting Wiktionary. Of course, Wiktionary communities will be free to experiment with the data, improve it, include it in their projects, mixing it with other content they have. Other parties can find interest in data about words: students, researchers, applications developers... we want the data to be structured, accessible and reusable for everyone without distinction. Anyone helping improving the data will take part in the ownership. Lea Lacroix (WMDE) (talk) 18:41, 25 February 2018 (UTC)
      - @Lea Lacroix (WMDE): Project divisions matter, and giving current Wikidata admins and policies authority over lexical structured data has the potential to be really divisive and problematic and cause a lot of friction between the various groups. These are very different types of communities we're dealing with, and Wiktionaries are more likely to actually know what they're doing with regards to lexical data. I've been a contributor and admin on both Wikidata and the English Wiktionary, and I'm quite confident that these communities will not get along at all if an adversarial context is built by attempting to subsume Wiktionary into Wikidata entirely from the outside (as this will likely be viewed). Wiktionarians have been building up free lexicographic content within Wikimedia for more than 15 years, know very well how to do it, and would likely make up most or all of what could be a thriving community for structured lexical data if we don't strangle it from the start by trying to do it here, a project really not built for that kind of thing. (Lexical content is different.)
        I think that CC-0 is the right license for this, but I'm still going to Oppose this change, because this community shouldn't have the authority to make this decision in the first place. --Yair rand (talk) 20:11, 25 February 2018 (UTC)
  @Lydia Pintscher (WMDE): Is there any reason for the data to be here in a new namespace? What advantages come from having it here, as opposed to a different project? This is a question that comes up in many areas, and I think there are some good questions to ask when trying to figure this out.
  - Does the new content fit well within the current project's scope and structure?
  - Will any existing specialized Wikidata gadgets be useful for this new type of content?
  - In Wikidata's primary forums, is there likely to be any overlap between topics discussed relevant to existing Wikidata content and topics relating to the new lexical content?
  - Are the project's current policies well-built for the addition?
  - Will there be any significant overlap between those watching the recent changes feed for Wikidata items and for Lexemes?
  - Are there any content-level benefits to having them on the same project?
  In my opinion, the answer to all of these questions is "no". Others may disagree, but there needs to at least be some discussion about this before rushing into assuming that we're just creating a new namespace here and leaving the decisions to the existing Wikidata project. --Yair rand (talk) 21:03, 25 February 2018 (UTC)
  - Hey Yair, thank you for your questions. There are many reasons for not creating a new project. The biggest one being that the lexicographical data and the data we have in items now should be closely connected. In addition Wikidata is _the_ central knowledge base for Wikimedia and this data is part of that - there is no central place like that for Wiktionary. This data is supposed to not only be used by Wiktionary but also a lot of other re-users, just like our current data isn't just used by Wikipedia. Additionally we have a community here who has spent the past 5 years taking care of structured data in Wikimedia and has a lot of experience in that. Starting that from scratch in a separate project wouldn't be helpful. And on top of that none of the other Wikimedia projects have their own knowledge base because we want to share the data across all of them. (Commons will be a bit different but is also not comparable to the case of lexicographical data.)
    
    You asked if any of the gadgets will be useful for the new content. Yes at the very least the merge and constraint gadgets as well as the primary sources tool are going to be useful there. (I have not checked the rest.)
    
    About overlapping discussion content: Yes I believe so. For example in the property definitions as well as everything we've learned about data quality processes over the past 5 years as well as usage of the data inside and outside Wikimedia.
    
    Are our current policies well-built for the new content? Maybe or maybe not. But that was no different when we added other new Wikimedia projects like Wikinews. We adapt where needed.
    
    Will there be any significant overlap between those watching the recent changes feed for Wikidata items and for Lexemes? I would hope so because the statements on Lexemes will link to a lot of the items.
    
    Are there any content-level benefits to having them on the same project? Yes a lot because the interconnections of the items and Lexemes through statements are a huge part of the value we are going to deliver with this. And people will for example want to run queries that include data from items and lexemes together.
    
    I hope this clarifies things a bit. --Lydia Pintscher (WMDE) (talk) 18:06, 26 February 2018 (UTC)
    @Lydia Pintscher (WMDE): The links between items and lexemes do not seem to be a valid justification for merging, assuming federated wikibases are going to function as expected. Wikidata, like most Wikimedia projects, currently details concepts and things. Wiktionary explains the means by which people communicate them. The substantial difference between explaining lexemes and ordinary things is clearly agreed upon, otherwise we wouldn't be talking about a separate namespace and format in the first place. Wikidata's policies and practices are built around things like labels, descriptions, aliases, sitelinks, badges, the ontologies we build around entities. We need to determine how to structure the links between the various concepts in the world, whether or not an item is notable. Our items are about people, places, concepts and ideas and objects, eras and events, art and books and organizations and families and ideologies. Lexical data has none of that.
    
    The Wikidata community is very good at what it does. Lexical content, structured or not, is not what it does. --Yair rand (talk) 03:48, 28 February 2018 (UTC)
  @Yair rand: You summed it up nicely. One of the technical/political shortcomings I see with Wikidata is indeed the centralisation. There is often talk about “the Wikidata community”, but in reality there'll probably be a multitude of (sub-)communities, but as you point out the power structures lie in the current userbase/adminship. It's difficult to create a sense of shared ownership in this context, especially when there is an impression that certain decisions are "forced" onto other communities. I'm still optimistic about the long-term success of the project, and wonder what could be done to remove some of the friction you mentioned. It's a rocky start, to say the least. – Jberkel (talk) 10:11, 27 February 2018 (UTC)
Oppose Hardening my position to "Oppose", per Yair rand above. This project started out as "Structured Data for Wiktionary". Somewhere along the line the aim changed, and it became "Wikidata for Lexicographic Data". I think that change was a mistake. I don't think there can healthily be two rival lexicographic projects at Wikimedia. It's just a recipe for poison, constant friction and bad blood. I think the two projects would end up tearing each other apart, with collateral damage all round, and both probably rapidly bleeding out support. So my definite view is that if this is going to be done at all, it needs to be done with the active support of Wiktionary. If this project does not have the support of Wiktionary, then it should be shut down. I hear the points that Tpt in particular makes well below. But, if the price for proper Wiktionary integration and Wiktionary community support is, for all free-text values to be CC-By-SA, as suggested by Deryck Chan below, at least until the Wiktionary community decides otherwise, then that is a price I think we have to be prepared to pay. Or we should pull the plug and shut the whole project down. Jheald (talk) 13:03, 27 February 2018 (UTC)
Support --Pymouss (talk) 20:27, 23 February 2018 (UTC)
Comment It would be good to know about why the alternative approach (with the same model) was rejected. Please see my question/comment at: Wikidata_talk:Lexicographical_data#Separate_installation_for_Wiktionary_?.
--- Jura 20:53, 23 February 2018 (UTC)
- Please see my reply to Yair rand above. I hope that covers it. --Lydia Pintscher (WMDE) (talk) 18:08, 26 February 2018 (UTC)
Oppose. This is exactly what we feared all along over at Wiktionary: that Wikidata would start handling lexicographical data without even bothering to consult the people who already create and manage lexicographical data on Wikimedia every day. Given the licensing situation and the glaring lack of communication, two parallel projects are going to work on the same problems, but separately. It should concern everyone here that out of the only usernames I recognise as active in any of the Wiktionaries, none of them have voted Support. Metaknowledge (talk) 22:07, 23 February 2018 (UTC)
- The question here is about Wikidata's licence terms, and only that. Other than that a more liberal licence gives Wiktionary greater freedom to reuse material from Wikidata, that decision has no bearing on Wiktionary. There appears to have been plenty of prior consultation on the wider issues; not least by Léa (in English), (and in French) over the last year and a half. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:50, 23 February 2018 (UTC)
Weak oppose. I would prefer that structured data be CC-0 and free-text data be CC-By-SA in general. The Lexeme namespace is likely to contain a lot of free text (definitions and example sentences) which will fit better with CC-By-SA than CC-0, though I agree that the linkages between different Lexemes and Items should stay in CC-0 to avoid database rights disputes. Deryck Chan (talk) 11:51, 24 February 2018 (UTC)
- @Deryck Chan: Do you have a source for "Lexeme namespace is likely to contain a lot of free text"? On the contrary, from what I've seen, there is almost no free text (and meanwhile, there is already a lot of free text on the Q items). Cdlt, VIGNERON (talk) 09:09, 26 February 2018 (UTC)
  - @VIGNERON: m:Wikilegal/Lexicographical Data, which you have contributed to, implies that the Lexeme namespace will likely carry a lot of copyrightable "Definitions with room for creativity", "Pragmatic information", and "Encyclopedic information and example sentences". The existing example Lexemes in the WMFLabs site don't have any free text, but it is hard to imagine that definitions, pragmatic information, and example sentences won't make their way to the Lexeme namespace quickly. I agree that there is already a lot of free text on the Q items, but you may remember the days when Wikidata items only had labels and sitelinks. Lexemes will likely contain more free text than Q items after
    - @Deryck Chan: I don't see this implication in this text; on the contrary, since "Wikilegal/Lexicographical Data" says that creative texts are not facts, I understand it as implying that there won't be creative definitions in Wikidata. Plus, I don't really see the need for definitions. Cdlt, VIGNERON (talk) 13:42, 26 February 2018 (UTC)
      - @VIGNERON: If you can't access the definitions from SPARQL, then what's the point? What useful queries are you going to be able to run? Jheald (talk) 14:07, 26 February 2018 (UTC)
        @Jheald: I don't understand. For me, the definition of a word is like the description of a concept, and it's quite rare to query descriptions in SPARQL; there are plenty of other properties you can query. Cdlt, VIGNERON (talk) 14:31, 26 February 2018 (UTC)
        @Jheald, VIGNERON: Argh I'm confused now. @Lydia Pintscher (WMDE): Are we planning to store free-text definitions, pragmatic information, and example sentences in the Lexeme namespace? Deryck Chan (talk) 14:36, 26 February 2018 (UTC)
        The senses will have free text glosses, which we expect to be as short as descriptions for items. The community will be free to create properties to hold free text, just as they currently do. Example sentences are a likely candidate. --Lydia Pintscher (WMDE) (talk) 18:20, 26 February 2018 (UTC)
        Also, even beyond the free text, some of the purely lexeme-valued fields in Lexeme L13, such as etymology or rhyme, probably show enough judgment or expressive choice that a bulk import of those would also be against the licensing of Wiktionary. Jheald (talk) 13:09, 27 February 2018 (UTC)

Oppose Because it's not interesting for people. What you are creating doesn't reflect the reality of languages, of linguistic studies, or of community needs... Lyokoï (talk) 18:38, 24 February 2018 (UTC)
- What are you opposing? Like others above, you seem to be answering a different question to the one asked. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 24 February 2018 (UTC)
Support using CC-0 for lexicographical Wikidata data (and only importing CC-0-eligible data). Jc86035 (talk) 10:16, 25 February 2018 (UTC)
Support --Pasleim (talk) 19:52, 25 February 2018 (UTC)
Strong oppose for the same reasons as Noé's. I couldn't say it better. Delarouvraie (talk) 08:31, 26 February 2018 (UTC)
Question Lydia's first point is about "many partners" liking CC0. Who are, or will be, the partners for lexicographical data? How were they consulted to find out whether they hold the same position as previous partners on other kinds of structured data? Noé (talk) 13:49, 26 February 2018 (UTC)
- I obviously don't know everyone who will be contributing and using the lexicographical data we will have. (I don't even know everyone using our current data.) I talked to a number of researchers and people building applications (for translation and language learning for example) from big and small projects and companies. They all said CC-0 is the right choice. For the contribution side I have talked to considerably fewer. There it seems to be ok outside Wikimedia as well. As I said above we have always leaned towards making re-use easier if we have to make the choice. --Lydia Pintscher (WMDE) (talk) 18:36, 26 February 2018 (UTC)
Support as per my arguments five years ago. --Denny (talk) 16:00, 26 February 2018 (UTC)
Note: Some of the reasons why an attribution-required condition is harmful for open data are enumerated in point 4 of 'The 5 Things Open Data Publishers are Doing to Keep their Data Closed'. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:50, 26 February 2018 (UTC)
Strong support I believe that having CC0 for lexemes data is very important for the success of the project. A few reasons why:
- Having a license like CC BY-SA would introduce unclear legal implications (e.g. is copyright applicable in this case? If not, people could claim that using CC BY-SA is "copyfraud") and inequalities between re-users related, among other things, to their access to legal support and to whether database rights exist in their jurisdictions. Reference: Luis' blog posts (1, 2, 3, 4).
  - To my mind, Wikidata is right now in an extremely unclear legal state. Merely asserting that the database is under CC-0 doesn't make it so. Without a proper data traceability facility, it's impossible to prove this claim true. On the other hand, massive imports of data from other data banks, like Wikipedia, are well known, and such imports are illegal in Europe and the United States without the permission of the data repository's owners, as far as I understand. Note that this point is not really about CC-0, as switching to any other license would not make this problem magically disappear. --Psychoslave (talk) 09:56, 28 February 2018 (UTC)
- Let's assume that we have a set of facts that would require attribution. People reusing the data would then have to provide proper attribution, and that is hard. For example, what is currently done for OpenStreetMap is just displaying "copyright OSM contributors". But that is definitely not nice (not much nicer than having "source: Wikimedia Commons" next to a Commons image), and I am not sure that a "copyright Wikidata contributors" notice is going to please the "lexicographer or linguist who may accept to publish under CC0 a work they spent five to twenty years on" that Noé described. Moreover, if we consider that proper attribution should be kept for CC BY-SA compatible data imported into the Lexeme namespace, it means that the Query Service and other tools that allow retrieving Wikidata content should be able to return the proper attribution along with query results. I believe that for this we would want the query service to return the minimal set of "sources" to credit for the result set. Indeed, the same query result could be derived in multiple ways from different facts in Wikidata, each possibly having different sources. It looks like an optimization problem, and so it seems likely that building an efficient system to provide such sources is going to be very hard (it's just my feeling; I have not studied this problem in detail. If you are interested in this topic, search "sparql provenance semirings" in your favorite search engine).
  - To my mind, it's not about having your name written everywhere; that is just territory-pissing vanity in the noosphere. This is about data traceability. Also, this presentation of the problem seems to mix two different topics: 1. keeping a record of provenance and respecting the licenses of sources, 2. the ability to generate reports about data provenance even for mashed-up data sets. I agree that the latter is a hard problem, but the former is technically extremely easy; the only difficulty there pertains to the management of import policy. --Psychoslave (talk) 09:56, 28 February 2018 (UTC)
- As Lydia said, a more restrictive license is going to hit mostly small reusers: big companies have lawyers to take care of all the legal implications, and if they chose not to reuse Wikidata content they could do so without hurting their revenues. For example, Google has its Knowledge Graph and Microsoft its Satori database, both created before Wikidata, and I do not see any reason to think they (or similar companies) could not do the same with lexicographic data if they needed it. Tpt (talk) 20:53, 26 February 2018 (UTC)
  - Sure, big companies have the money for lawyers, but how does CC-0 help in any way here? Knowledge Graph uses, inter alia, Wikidata input, and Google was even a major funder of the launch of Wikidata; they also hired Denny, who was the project leader (I am not sure what his official role is today), and before that they bought Freebase, which was under CC-by-sa. There is no doubt that big companies are happy with the current Wikidata policy. I have not been shown any evidence of small companies or individuals who would be angry to see a demand for more equity regarding the freedom of derivative works. --Psychoslave (talk) 09:56, 28 February 2018 (UTC)
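The "minimal set of sources to credit" problem raised in Tpt's comment above can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not how the Wikidata Query Service works: the result names, source names and the `minimal_sources` helper are all hypothetical, and the search is a brute-force enumeration chosen for clarity rather than efficiency.

```python
from itertools import combinations

# Hypothetical example data: each query result can be derived in several
# alternative ways, and each derivation needs a set of sources to be credited.
derivations = {
    "result_1": [{"source_A"}, {"source_B", "source_C"}],
    "result_2": [{"source_B"}, {"source_A", "source_D"}],
    "result_3": [{"source_A", "source_B"}, {"source_D"}],
}


def minimal_sources(derivations):
    """Return a smallest set of sources that fully covers at least one
    derivation of every result (brute force, exponential in the worst case)."""
    all_sources = sorted(
        set().union(*(d for alts in derivations.values() for d in alts))
    )
    # Try candidate sets in order of increasing size, so the first hit is minimal.
    for size in range(len(all_sources) + 1):
        for candidate in combinations(all_sources, size):
            chosen = set(candidate)
            # Every result needs one derivation whose sources are all chosen.
            if all(any(d <= chosen for d in alts) for alts in derivations.values()):
                return chosen
    return None


print(minimal_sources(derivations))
```

With this toy data no single source covers all three results, so the smallest credit list has two entries. The exhaustive search over subsets is what makes the general problem look like an optimization problem, as the comment suggests.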
Support Toni 001 (talk) 11:19, 27 February 2018 (UTC)
Support From Wikidata's point of view, CC-0 is a good choice. But doing this without prior discussion with users from the Wiktionaries is a bit inappropriate.--Jklamo (talk) 11:52, 27 February 2018 (UTC)
- As I noted above: There appears to have been plenty of prior consultation on the wider issues; not least by Léa (in English), (and in French) over the last year and a half. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:28, 27 February 2018 (UTC)
  - @Pigsonthewing: I had a look at those, in English and in French. They seemed to be mostly about arbitrary access being turned on for existing Wikidata items. I'm not sure I saw anything about the planned governance or licensing of lexemes on Wikidata. Are there any particular threads that Lea participated in that you have in mind as representing relevant consultation? Jheald (talk) 15:07, 27 February 2018 (UTC)
    - Consultation relevant to the planned licensing? You're participating in it here. You will find a partial - yet lengthy - list of relevant consultations at Wikidata:Wiktionary/History Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:40, 27 February 2018 (UTC)
      - @Pigsonthewing: Exactly the point. The discussion is happening here, as opposed to with any of the Wiktionary communities, which is inappropriate. --Yair rand (talk) 03:48, 28 February 2018 (UTC)
        This discussion is about the licensing of content on Wikidata; this is therefore the correct venue for it. Nonetheless, it has been announced on Wiktionary, and several people from the Wiktionary community have commented here. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:33, 28 February 2018 (UTC)
        This discussion is about licensing lexicographical content, a kind of data wiktionarians know and wikidatians don't. So cons here from wiktionarians may seem misguided by a lack of knowledge of how Wikidata works, and some pros here from wikidatians may seem misguided by a lack of knowledge of lexicographical data. So I think this conversation has to be restarted with much more explanation for both audiences. Noé (talk) 18:27, 1 March 2018 (UTC)
        I'd be fascinated to learn more of your research into what Wikidatans do, and do not, know. Is it published online anywhere? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:19, 1 March 2018 (UTC)
        I haven't published anything on Wikidatians' work. I did summarize some thoughts about lexicography and Wiktionarians' skills, but I'll be pleased to read any paper on Wikidatians' expertise on lexicographical data. Noé (talk) 09:12, 2 March 2018 (UTC)
        You are the one asserting that "wikidatians [sic] don't know lexicographical content" and "lack knowledge of lexicographical data". I was curious to see whether you would be able to substantiate those assertions, but instead it seems they're merely "thoughts", expressed in an attempt to bolster a very weak argumentum ad verecundiam. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:32, 2 March 2018 (UTC)
        Why sic? In Wikidata editors (Q28859214), I can see Wikidatians as an alias in English. Well, it is hard for me to prove the absence of something. As strakhov points out, Wikidatians here seem to have very little participation in Wiktionaries. I haven't read any analysis or essay by Wikidatians about lexicographical data, except by Denny and Lydia. I'll be happy to change my mind and to read anything that can show that people here are concerned with lexicographical data. Noé (talk) 18:47, 2 March 2018 (UTC)
        I'm not asking - nor expecting - you to change your mind. I'm suggesting that you should not present your opinions as facts, especially when they disparage fellow editors. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:41, 4 March 2018 (UTC)
Strong support --Egon Willighagen (talk) 16:43, 27 February 2018 (UTC)
Strong support given the ridiculously complicated mixture of copyright and database rights for data in different jurisdictions having the simplest to reuse license is very helpful. John Cummings (talk) 17:53, 27 February 2018 (UTC)
- @John Cummings: Some lexicographic data here under CC0 and some lexicographic data in Wiktionaries under CC BY-SA. This is a mixture of licenses, don't you think? Noé (talk) 09:01, 2 March 2018 (UTC)
  - @Noé: The issue is that an individual fact cannot be copyrighted and so cannot have a license; copyright applies to databases of facts only in some jurisdictions, while others have database rights. Also, database rights or copyright in a database are only infringed when 'a significant portion' of the data is reused, and the response I've had from lawyers I've asked about this is that there is no case law defining 'a significant portion', whether in terms of percentage, overall size or anything else. --John Cummings (talk) 22:50, 6 March 2018 (UTC)
Strong support In the above discussion I see a lot of off-topic objection and no convincing reasons for restricting the structured data under discussion. MartinPoulter (talk) 12:35, 28 February 2018 (UTC)
Strong support As a researcher, I find CC-0 much more welcoming than other licenses. We are trying to invest as many resources as possible into contributing to Wikidata with our research, and complicated or overly restrictive licensing terms are an obstacle for us, since we are not legal experts. We don't want to consult lawyers each time we want to try out some new idea and put the results on the Web! (Neither do we want to disrespect the legal terms chosen by a community, like many researchers do!) Also as a data producer and contributor, I am strongly in favour of CC-0. My contributions are donations, which I hope to be useful to all of us (not just for those with a legal department at their command). In research, attribution is usually not done because of legal force, but because of academic standards that each research community has to hold up. I don't believe legal terms are an effective way to enforce respect and honesty in research. --Markus Krötzsch (talk) 17:01, 28 February 2018 (UTC)
Strong support For the reasons stated above. I think our work (and many others') in the biomedical domain shows that there are significant licensing challenges in any data integration effort. Wikidata has greatly simplified access to CC0-licensed resources, and also spurred several biomedical groups to change their license to CC0 based on (at least in part) what it enables within Wikidata. Cheers, Andrew Su (talk) 17:08, 28 February 2018 (UTC)
Support What else? --Succu (talk) 21:16, 28 February 2018 (UTC)
Oppose I was a bit hesitant on this and not really convinced by the arguments advanced.
Wikidata's Wiktionary namespace is likely to supplant much of Wiktionary.org. Contributing to Wiktionary.org hadn't really been favored by the use of MediaWiki-software and I don't think much had been done to develop its infrastructure over the years. Now that a Wikibase-structured installation is to be created, I don't think there will be that much done to integrate its content in Wiktionary.org in an efficient way. At least, I haven't seen any prototype for that. There may not be much use of doing that either, as most if not all information can be included in a Wikibase site in a structured way. This is fundamentally different to Wikidata's main namespace that replaced interwikis in Wikipedia and maybe some infobox content. Wikipedia as such continues to operate as an encyclopedia.
Similar to Wikipedia, Commons and Wiktionary.org use Wikimedia's default licensing model and this hasn't hindered their growth. Commons will eventually have its separate installation of Wikidata(Wikibase) and continue with its licensing model. So the use of structured data doesn't seem to constrain people to use CC0, nor constrain them to store all data within Wikidata itself.
Already now, federated queries and queries to Wikipedia content are possible leading users to retrieve content with different licenses in one query.
From a query perspective, it wouldn't really matter if Wiktionary content is on a separate Wikidata(Wikibase) installation as that of Commons and content could still be made available to Wikidata users. We could hold Wiktionary content on a separate installation and the namespace question wouldn't come up. Additionally, it's not clear why a distinction within Wikidata couldn't be made especially as for now no textual content is available.
From a Wikidata perspective, the suggested approach initially made sense, but then I noticed it prevents us from expanding some of the dictionary-related elements we already have with content from Wiktionary and Wikipedia.
Further, the solution doesn't seem ideal for the Wiktionary community: most if not all of its content would be held outside the sites themselves.
In conclusion, I think it would be good to develop an alternative installation for Wiktionary content as it's being done for Commons. It's regrettable that this wasn't evaluated and presented as an alternative from the beginning.
--- Jura 12:11, 1 March 2018 (UTC)
- Jura, it is not true that it was not evaluated. --Lydia Pintscher (WMDE) (talk) 12:33, 1 March 2018 (UTC)
  - I asked about it at Wikidata talk:Lexicographical data#Separate installation for Wiktionary ?, but the answer I got didn't seem to indicate that. Is it available somewhere?
    --- Jura 12:46, 1 March 2018 (UTC)
Support XIII (talk) 09:21, 2 March 2018 (UTC)
Support. Reh man 10:22, 2 March 2018 (UTC)
Support Language resources are vital for natural language processing technologies. In my experience as a researcher, it is very difficult to easily find and freely access them, since they are often protected by license barriers. CC-0 is essential here! --Hjfocs (talk) 16:52, 2 March 2018 (UTC)
Support. I understand the fears of some Wiktionary editors that Wikidata will take in Wiktionary data that are (for them) under CC-BY-SA, but I think that CC0 is the best licence for reuse, as said above by Hjfocs. Tubezlob (🙋) 17:54, 2 March 2018 (UTC)
Support Ainali (talk) 22:33, 2 March 2018 (UTC)
Question Why is this vote happening here and not in a Request for comment? Noé (talk) 15:10, 10 March 2018 (UTC)
Support The best license IMO for open structured data collection is CC0 and alike. More restrictive licenses create source/license tracking nightmare (having to attach provenance to every piece of data or retain a lawyer to wrestle with a question how much of your federated query result is copyrightable and what should be licensed and alike - which most people would prefer to stay away from) and impede free data reuse and collaboration between data sources. See more on this here: https://wiki.osmfoundation.org/wiki/Licence_and_Legal_FAQ/Why_CC_BY-SA_is_Unsuitable Laboramus (talk) 18:06, 15 March 2018 (UTC)

Support Li Song (talk) 02:42, 24 March 2018 (UTC)

Discussion elsewhere

Please be aware of wikt:Wiktionary:Beer parlour#Wikidata and CC0 licence for lexicographical data. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:28, 23 February 2018 (UTC)

Also in the French Wiktionary: wikt:fr:Wiktionnaire:Wikidémie/février 2018#Licence pour l'espace de nom Lexeme sur Wikidata. --Thibaut120094 (talk) 17:30, 25 February 2018 (UTC)

Mentioned in Actualités, the French Wiktionary monthly magazine (similar to the Signpost). It also exists in English. Noé (talk) 10:30, 1 March 2018 (UTC)

Mentioned in Regards sur l'actualité de la Wikimedia et d'ailleurs, French Wikipedia monthly magazine. Noé (talk) 18:17, 1 March 2018 (UTC)

Wikidata talk:Lexicographical data#Separate installation for Wiktionary ?
Wikidata talk:Lexicographical data#Separate approach/namespace for textual definitions ? – The preceding unsigned comment was added by Jura1 (talk • contribs) at 09:12, 26 February 2018 (UTC).
Wikidata:Project_chat#Gap_between_Wiktionary.org_and_Wiktionary_namespace_at_Wikidata?

A reply to the proposal

One of the remaining tasks is around licensing. Since the beginning of Wikidata all our structured data is released under CC-0. This has helped significantly with spreading our data widely and quickly and thereby helping us give more people more access to more knowledge.
- There is no way to check whether a different license, like the ODbL used by OSM for example, would have done a worse job at this. There was no A/B test with a different license. On the other hand, the impossibility of enforcing the same fair conditions of use on derivative works makes it far less likely to be sustainable, and also goes against good data traceability. Both of these features go against our strategic direction of knowledge equity and the ability to use different forms of free, trusted knowledge.
I am convinced it is in the best interest of Wikidata to extend CC-0 to all structured data namespaces. The reasons (in addition to my reasons for CC-0 in general):
- This has already been replied to in November.
We have fared very well with CC-0 so far and many partners use it as one of the main reasons they are attracted to Wikidata - both for re-use and contribution of data.
- It would be interesting to have a poll of which actors agreed to contribute to Wikidata for this specific reason, and even better a comparison with how many actors refused to participate for this specific reason. Without that kind of metrics, no success can honestly be attributed to this license choice.
Having a mix of licenses is a potential legal minefield that can be exploited by some actors, threatening not only re-users but also our own contributors.
- This is clearly FUD, and it's sad to see such a practice used here. What's more, pretending that Wikidata is under CC0 is not enough to make sure it is. Without appropriate license tracking of data sources, the legal uncertainty of Wikidata grows with the base itself. There is no such scrupulous control; on the contrary, the Wikidata community refuses to admit that its massive imports from other sources, like Wikipedia, are illegal. Indeed: "A person infringes a database right if they extract or re-utilise all or a substantial part of the contents of a protected database without the consent of the owner. It should be noted, however, that extracting or re-utilising a substantial part of the contents" (Database rights: the basics). In these circumstances, until the situation is cleared up, using Wikidata as input is actually legal Russian roulette: there's no harm in playing with it until it blows up in your face.
It is a huge hassle for re-users, in particular small re-users like individual contributors, hobby developers, and small organizations, and will lead to less usage by these, and thus to less spreading of our knowledge.
- On the contrary, that's typically the kind of public that would, on the whole, be positively impacted by a copyleft license. It's a different and far more annoying problem for really big transnational businesses, obviously. They are the only kind of structure that benefits from this lack of equity in reuse.
It is the sound thing for data - much better explained by Luis in his blog posts (1, 2, 3, 4).
- That is one person who is explicitly stating "Wikidata did the right choice"; it's obviously a strongly biased source. Not that everything there should be dismissed, but there is clearly no balanced view of the pros and cons in these blog posts.
It will mean that we can not import some data from Wiktionary and other sources that is incompatible with CC-0 but that is already the case now.
- No, it means you won't be able to legally import any substantial part of Wiktionary into Wikidata. But judging by how large extracts of Wikipedia were injected into Wikidata, it seems that legality is not a very important matter for Wikidata. Sure, it might not be illegal in some unknown country, but at least in Europe and the United States, it seems Wikidata blithely crossed the line of legality. The Wikidata team promised it would not import data from Wikipedia, but they broke this promise. Based on this experience, they cannot be trusted when they state that they won't let massive imports and license laundering of Wiktionary works happen.
We have always leaned towards making re-use easier at the expense of easy importing.
- And more importantly, at the expense of legal confidence.

--Psychoslave (talk) 09:20, 28 February 2018 (UTC)

Here's Lydia's response to your tl;dr screed from the mailing list:

I understand you care a lot about this topic and are posting about it in many places but I have a personal rule that a lot of the people in Wikidata know. I am willing to discuss and explain basically anything on a calm and rational basis. (And I did this on-wiki I believe.) The rule is simple: The more loud, aggressive and pushy someone gets about a topic the less likely I am to engage. This rule has a simple reason: I don't want Wikidata to get into a spiral of shouting. If we do this people get into the mode where only if they shout they get heard so they shout all the time. This is toxic for a community. So I fear I can't contribute to this thread beyond this message.

-- Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:29, 28 February 2018 (UTC)

What is your point here? Do you mean that feedback should stop being given as soon as it goes further than "I agree with your proposal"? I did agree at the time with Lydia's reply, as I perceived how my answer back then could be qualified as "loud, aggressive and pushy". I did apologize for this, both on the list and off-list to people who might have felt personally attacked. Although to my mind her resolution not to participate at all in the conversation was not appropriate conduct, her criticism of the form was fair, but her avoidance of any reply on the substance was far less valid. I didn't want to add fuel to the fire, so I just shut up on this.

Now in this current thread, so far I'm convinced my intervention cannot be subject to the same reproach. So please point me to where you believe I've been impolite, or reply to criticisms with valid factual arguments, or come with new well-documented arguments in favour of the proposal if you have some. But please don't throw my past errors in my face like an anathema and a perpetual prohibition on giving cordial critical feedback. Maybe you are perfect, but mere mortals like me only improve through recognizing and overcoming their errors; perpetually and unduly reproaching them for one of them is not a way to help them progress. Not being able to provide critical feedback is also toxic for a community. --Psychoslave (talk) 09:02, 1 March 2018 (UTC)

Past errors? You were talking about "the Wikidata team self-hagiographic rethoric" literally in the hour before your response here. I would say that if you don't even recognize any issues with what you write without having to be explicitly pointed to them, I don't see enough benefit in spending time to engage with you. --Denny (talk) 17:42, 1 March 2018 (UTC)

Maybe it would be better to put this quote back in its context, where it's part of a direct reply to @Noé: who complained several times in different channels about the lack of exposure of the cons in this section's proposal. So this was more of an allusive joke, with the overly pedantic wording intended to reinforce it. This wasn't "loudly pushed in an aggressive manner" in this thread; it was pulled out of an unlinked context where it can at worst be qualified as sarcastic. Now, instead of trying to discredit each other on ad hominem bases, what about talking about the real problems that create these tensions and trying to solve them?

Namely, how will this choice of CC-0 be the best option for Wiktionaries when all that their communities have built until now is covered by an incompatible license, preventing any massive import that wouldn't cast heavy legal doubt in many countries where possibly most contributors are currently living? Sure, Wikidata doesn't target only Wiktionaries, and it's good to be open to other uses. But our Wikimedia community should be given the tools to build on what was already achieved without this legal doubtfulness. In this regard, using CC-BY-SA for the Lexeme namespace would make far more sense. And it's not only Wiktionary: look at the license of databases like Google ngrams or Les Vocabulaires du Ministère de la Culture et de la Communication: CC-BY-SA 3.0 unported. These are just two very interesting sources that won't be includable if CC-0 is retained as the exclusive license for Lexemes, and there are many others out there. --Psychoslave (talk) 21:39, 1 March 2018 (UTC)

I am sorry, but can we discuss this issue without bringing in arguments like "Wikidata choice toward CC0 was heavily influenced by Denny Vrandečić, who – to make it short – is now working in the Google Knowledge Graph team"? I personally feel that using arguments like this precludes any good faith discussion. Either we are here discussing decisions of licensing, CC0 or otherwise, on their merits, presuming the disagreement stems from a genuine desire to find a better solution and honest disagreement about the means to bring that about. And then stuff like that should not ever be mentioned - and certainly not referred to repeatedly. Or we're trading conspiracy theories and accusations, and then I don't see much use for it, frankly. If somebody is convinced the counterpart is acting in bad faith, no argument is going to sound convincing and no agreement will be deemed satisfactory. It turns into a negative-sum game, and the only winning strategy in this game is not playing it. Laboramus (talk) 18:19, 15 March 2018 (UTC)

Wiktionary edits

Comment Hey folks.

support. ~5910 edits in wiktionaries (~5000 by VIGNERON alone);

VIGNERON (~5055 edits), I9606 (0 edits), Andy Mabbett (97 edits), Sannita (12 edits), Andrawaag (0 edits), Magnus Manske (0 edits), ArthurPSmith (0 edits), Jsamwrites (0 edits), Syced (278 edits), Wittylama (4 edits), Jarekt (24 edits), Pymouss (233 edits), Jc86035 (38 edits), Pasleim (0 edits), Denny (edits), Tpt (0 edits), Toni 001 (0 edits), Jklamo (150 edits), Egon Willighagen (0 edits), John Cummings (1 edit), MartinPoulter (3 edits), Markus Krötzsch (0 edits), Andrew Su (0 edits), Succu (0 edits), XIIIfromTOKYO (0 edits), Rehman (0 edits).

oppose ~158 027 edits in wiktionaries:

Jura1 (0 edits), Noé (13181 edits), Psychoslave (2885 edits), Yair rand (32394 edits), Jheald (1 edit), Metaknowledge (83933 edits), Deryck Chan (8 edits), Lyokoï (23023 edits), Delarouvraie (2602 edits).

neutral/so-so ~127000 edits in wiktionaries:

Pamputt (~102000 edits), Jberkel (23187 edits), Nmaia (1740 edits).

In whose interest is this particular arrangement? Not Wiktionary's, it seems. Shouldn't Wiktionary communities have a say? After all, who's gonna look after lexicographical data in Wikidata? The "0-wiktionary-edits" guys from above? strakhov (talk) 16:09, 2 March 2018 (UTC)
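As a quick sanity check on the tallies above, the per-user counts can be summed with a short Python sketch. The figures are copied from the lists as given; values marked "~" there are approximate, and Denny's count is missing from the list, so it is skipped here rather than guessed. The variable and function names are only illustrative.

```python
# Per-user Wiktionary edit counts as listed in the tallies above.
# "~" values (VIGNERON, Pamputt) are approximate in the source list.
# Denny's count is not given in the list, so it is None and gets skipped.
support = {
    "VIGNERON": 5055, "I9606": 0, "Andy Mabbett": 97, "Sannita": 12,
    "Andrawaag": 0, "Magnus Manske": 0, "ArthurPSmith": 0, "Jsamwrites": 0,
    "Syced": 278, "Wittylama": 4, "Jarekt": 24, "Pymouss": 233,
    "Jc86035": 38, "Pasleim": 0, "Denny": None, "Tpt": 0, "Toni 001": 0,
    "Jklamo": 150, "Egon Willighagen": 0, "John Cummings": 1,
    "MartinPoulter": 3, "Markus Krötzsch": 0, "Andrew Su": 0, "Succu": 0,
    "XIIIfromTOKYO": 0, "Rehman": 0,
}
oppose = {
    "Jura1": 0, "Noé": 13181, "Psychoslave": 2885, "Yair rand": 32394,
    "Jheald": 1, "Metaknowledge": 83933, "Deryck Chan": 8,
    "Lyokoï": 23023, "Delarouvraie": 2602,
}
neutral = {"Pamputt": 102000, "Jberkel": 23187, "Nmaia": 1740}


def total(counts):
    """Sum the known edit counts, ignoring entries without a number."""
    return sum(v for v in counts.values() if v is not None)


totals = {
    "support": total(support),
    "oppose": total(oppose),
    "neutral": total(neutral),
}
for group, edits in totals.items():
    print(f"{group}: {edits} edits")
```

With these figures the known support counts sum to 5895 (the list states ~5910), the oppose counts to 158027, and the neutral counts to 126927 (stated as ~127000).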

Are you arguing that if a single contributor - Metaknowledge - had a change of mind, it would be all fine?

Also, I know I haven't contributed to the Wiktionaries a whole ton (although quite a bit more than what you say, I feel rather omitted :) ). One reason why I did not contribute more was that, to be honest, the idea of having a dictionary of every language, separately maintained by every language, seemed a rather unachievable one. I was convinced for more than a decade that an approach where we centralize the data and maintain it only once is far more productive. In fact, I am arguing that by having lexicographic data in Wikidata we will not only see current Wiktionary contributors contribute to Wikidata, but we will have an influx of new contributors that are currently not contributing to Wiktionary at all. We saw the same in Wikidata with respect to Wikipedia. I would even go so far as to make a bet about how long it will take to have more active contributors working on the lexicographic data in Wikidata than we currently see contributors in any of the Wiktionary projects, if someone is willing to take that bet. This is, in my mind, a great opportunity for increasing the number of contributors, and for increasing the chance for the Wiktionary projects to achieve their mission. --Denny (talk) 16:58, 2 March 2018 (UTC)

Not exactly. I just meant that the only significant Wiktionary contributor supporting this proposal was apparently VIGNERON (sorry if I missed something in my quick recount). I do not see how antagonizing an entire project is good for us, even if it attracts some additional guys from the outside. I do not oppose 'structuring' Wiktionary, as I too find the work done there pretty inefficient (anyway, my Wiktionary experience is pretty scarce), but I'd try to involve those communities instead of outvoting them here by brute force. To ask them and to give them what they need. If a significant (is it?) part of the Wiktionary communities feels their work is plagiarised or mis-licensed by this Wikidata-CC0-lexeme approach, maybe it's time to rethink the proposal (whether it's "legal" or not. That's on the legal team, I'm not a lawyer). strakhov (talk) 17:32, 2 March 2018 (UTC)

Now do the same counts for Wikidata edits - this is, after all, a discussion of what licence to use on Wikidata - something which more than one of your high-scoring Wikitionary examples have not addressed at all. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:28, 2 March 2018 (UTC)

Well. You imply that this thing being stored in Wikidata is something already set in stone. Maybe it's not, or at least it shouldn't, as Jura pointed out. strakhov (talk) 18:59, 2 March 2018 (UTC)

Imply? I'm able to assert it with confidence. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:08, 3 March 2018 (UTC)

Yeah, whatever. Let's see if I get your point:

Should we create the lexeme space? Yeah, it's awesome.
But should not wiktionarians have a say? Nop, because this is all about wikidata and licensing, and we the wikidatians [sic] know so much about that stuff.
But couldn't this thing be installed in Wiktionary instead of here? Nah, I assert with full confidence it has to be here.

Well, I don't know, this all seems pretty shallow, doesn't it? After all Wikimedia Commons is going to host their metadata in their own project, I do not know why wiktionarians should not host their lexemedata there. Sincerely, hosting it in Wikidata seems an excuse for licensing it with CC-0 because mixing licenses is so bad. So, it is here. Then it should be CC0. And here. And CC0... And so on. strakhov (talk) 20:06, 3 March 2018 (UTC)

"Let's see if I get your point" You don't. And don't try to put words into my mouth. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:22, 3 March 2018 (UTC)

Then try some reasoning instead of your "it has to be because it has to be in Wikidata because I assert it". Word the ideas coming out from your mouth as you prefer, I don't care. strakhov (talk) 20:28, 3 March 2018 (UTC)

[ec] Didn't I just tell you not to try to put words into my mouth? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:31, 3 March 2018 (UTC)

Just a comment, not a vote, by Lmaltier (250.000 contributions on fr.wiktionary, and about 1.342.000 page creations including creations by bot). I'm not interested in this vote, because I think that it would be best for Wikidata not to import anything from Wiktionaries, and best for Wiktionaries not to import automatically anything from Wikidata. I would oppose any automated change on fr.wiktionary when something is changed on Wikidata (manual exploitation might be possible nonetheless, e.g. if there can be some alert when linguistic Wikidata info is corrected, because this might be something to be checked on Wiktionaries, but this interest is minor). Wikidata might be helpful to some external projects, but I think that it will be negative for Wiktionaries, as it might attract some good contributors to Wiktionary's detriment (which is what happened with OmegaWiki). It might even be fatal to some Wiktionaries, and remember that Wiktionaries are what's useful to readers (no reader will ever read Wikidata). From the strict Wiktionary point of view, I would be happy if the Lexeme namespace is discontinued altogether, and I think that its negative consequences would probably outweigh its usefulness. Lmaltier (talk) 20:13, 2 March 2018 (UTC)

It is a bit odd that the proposal hasn't much support by people who actually edit Wiktionary.
Personally, I was hoping the new features would allow me to expand Wiktionary-related content at Wikidata I already edited (some was substantially expanded or started by myself) and I notice now that this wouldn't be possible if we moved ahead with the proposal. Looks like we should have made a better use of other Wikidata features earlier.
--- Jura 04:39, 3 March 2018 (UTC)

"[…] to be honest, the idea of having a dictionary of every language, separately maintained by every language, a rather unachievable one. I was convinced for more than a decade that an approach where we centralize the data and maintain it only once is far more productive. In fact, I am arguing that by having lexicographic data in Wikidata we will not only see current Wiktionary contributors contribute to Wikidata, but we will have an influx of new contributors that are currently not contributing to Wiktionary at all.": The usefulness of a way to factor out some information so that it can be more easily shared between the Wiktionaries is clear, and a good solution to this problem would indeed be very welcome, at least by me. But this doesn't imply that the solution should be centralized, nor that productivity is the sole criterion, outweighing all others. Producing a lot of crappy data would still meet the objective of being productive. Moreover, even with a centralized approach, the proposal also has to offer a data model that will actually fulfill the evolving needs of the Wiktionarian community. Sadly, the proposed model doesn't meet that expectation. It's as if someone proposed an Aristotelian or even Newtonian data model to a hypothetical Wikiphys community and hoped that everything that will ever be stated about physics would fit in it, while ignoring that much useful material this community has already created won't be legally usable within it because of the choice of an incompatible license. Not only was almost none of the feedback from the Wiktionary community taken into consideration, but it seems that the idea of hiring people skilled in both linguistics and computer science wasn't considered either, even though dedicated degrees combining both fields exist.
Admittedly, these technical considerations are not the central point here, but since this was advanced as a pro argument above, it is surely fair to provide some perspective on what the present proposal would introduce. --Psychoslave (talk) 06:18, 4 March 2018 (UTC)

Call for civility

I would appreciate if we could keep the conversation civil, on both sides. Fortunately most of us do so, but there is neither a need to dissect every argument presented on each side, nor is it polite to call out anyone's opinion as wrong. We should let everyone express their opinion, whether dissenting or assenting with Lydia, and continue to be friendly to each other. It is obvious that, no matter what happens, not everyone will agree with the course of action, and that's OK. There is no need to further burn bridges. As usual I am reminded of the Wikimania talk last year - was it last year? - it is among people with the same goal that the fiercest fights are fought. We shouldn't. We should all work together towards our common goal, towards our mission: a world in which everyone can share in the sum of all human knowledge - and we should do so in the understanding that even if we disagree in some points, we still have the same goal and are on the same side. So, please, let's be friendly, and in case you still want to write some hot-headed answer, sleep over it at least once.

We're in this together. --Denny (talk) 04:29, 4 March 2018 (UTC)

That being said, Denny, we must take care to state and restate clearly the specific pretenses on which we are approaching this proposed change and state as clearly as possible the bases for the arguments we wish to make about the change, even if we end up just repeating ourselves or stating the obvious within the span of just minutes. It is unfortunate that claims about the effects of this change are being made here with insufficient evidence on either side and that some in the conversation are not thoroughly explicitly declaring the sources for their assertions (instead of pointing to Léa's talk page or some "research on the topic" not summarized anywhere, point to specific sections of the page or give some quotes from there to better defend your point—who knows, maybe they do agree with you and they just don't know it).

(Full disclaimer: I concur precisely with the rationale behind Jheald's, and by extension Yair rand's, oppose vote. I do not think we should antagonize those people who frequently work with lexicographies on Wikimedia projects by letting a choice about lexicographies on this Wikimedia project be nearly wholly determined by those who lack consistent work with lexicographies on Wikimedia projects.) Mahir256 (talk) 08:05, 4 March 2018 (UTC)

A call for civility is fine, and surely we should all embrace any reasonable call for this. Ceasing to analyze and reply to each argument is a far more questionable demand. Not that such an approach cannot come with its own kind of error in its conclusions, but to call it uncivil per se is probably exaggerated, isn't it?

If the Wikimania talk mentioned was recorded and is available somewhere, a link allowing to watch it would be interesting.

Not even making the slightest mention of the well-known concerns of some of our community members when presenting a proposal on this topic, and not making a call for comment on the main Wiktionary talk pages, is probably not the best way to lead to a cordial discussion including all interested stakeholders. Also, calling for a personal bet on who is right on any topic is probably not the most efficient way to avoid generating antagonistic conversations. Do we agree on this? --Psychoslave (talk) 08:36, 5 March 2018 (UTC)

@Denny: I agree completely that it's important to have this discussion civilly.

I feel that part of the reason this discussion has become difficult is that people are talking past each other and discussing different points, partly because Lydia's original proposal was about several different things, with some ambiguity left about the status of the various parts. The proposal was to change the Wikidata license text to include a Lexeme namespace in the list of namespaces licensed under CC0, thus presenting two questions: Should there be a Lexeme namespace on Wikidata, and if so, should it be licensed under CC0? The former is being taken by some to be a given, and it's not even completely clear whether the dev team is willing to let this be subject to community consensus, and if so, which community's consensus matters here. Wikidata? Wiktionary? The Wikimedia community at large? What if the Wiktionary communities get consensus for establishing a separate wikibase installation? Would there then be two projects working on structured lexemes, or would that be reason enough not to add the namespace to Wikidata, or would the developers refuse the request? The ambiguity and the resulting repeated miscommunications are contributing to the rising temperature of the discussion. What we need now is official clarification of what the parameters are here, we need to figure out where the data will go and who decides, and then whichever community or communities are the relevant one(s) should have a discussion about what the license should be. --Yair rand (talk) 21:28, 5 March 2018 (UTC)

The decision about having a Lexeme namespace on Wikidata has been taken already. This has been discussed on Wikidata for several years, and has been discussed with Wiktionary communities since 2016. We've been asking for feedback about the data model during the same period. The development team has been working on it since 2016, and the first version is about to be released.

Here is a short list of the arguments that made us decide on storing lexicographical data on Wikidata, instead of having a separate Wikibase instance on Wiktionary:

Just like Wikidata concepts are not only for Wikipedia, lexicographical data is not only for Wiktionary but for everyone: other Wikimedia projects, third parties, etc. Wikidata is already recognized, both inside and outside the Wikimedia movement, as a central repository of reusable data.
Being able to interlink lexicographical data and data about concepts, reusing the same properties, items, etc., is way easier if all the data is stored in the same place. On another site, editors would have to recreate everything that already exists on Wikidata. Likewise, once everything is on Wikidata, we will be able to build queries mixing concepts and words.
The Wikidata community worked on many tools to make their life easier, to add, edit, reuse data, and these tools may be adapted for lexicographical data.
Wikidata community has strong knowledge about how to handle structured data, and will be curious to learn about lexicographical data, just like they learned about ontologies, and all kind of topics, when editing Wikidata. Together, we can combine our fields of expertise.
Enthusiasm about structured data and multilingual collaboration is way higher on Wikidata than on Wiktionaries. Wikidata community is experienced in multilingual collaboration and can support Wiktionary editors.
We expect to attract more new editors if everything is centralized on Wikidata, than on a new platform that would be stored on Wiktionary.

I hope that helps understand why this decision has been taken. Lea Lacroix (WMDE) (talk) 16:27, 7 March 2018 (UTC)

Can we see links about "The decision about having a Lexeme namespace on Wikidata has been taken already." and "This has been discussed on Wikidata for several years"? So the actual alternative we should discuss is do we want to be able to include data from Wiktionary or not?
--- Jura 16:52, 7 March 2018 (UTC)

Jura, you regularly contribute to 'Wikidata weekly summary'. Are we really expected to believe that you do so without actually reading it? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:58, 7 March 2018 (UTC)

Maybe you can provide the links.
--- Jura 17:02, 7 March 2018 (UTC)

And maybe, since you want them and know where to find them, you could. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:48, 7 March 2018 (UTC)

It seems you can't do that either. Let's wait for Lea Lacroix (WMDE) then.
--- Jura 19:02, 7 March 2018 (UTC)

@Lea Lacroix (WMDE): To respond to a few of your points:

I am reasonably confident that there would not be any overlapping properties between items and lexemes. Links to items in external databases are possible in any case.
Combining expertise would be possible on a new site, where Wikidatians and Wiktionarians would be on equal footing forming a new project. If you don't expect Wikidata users to go to a new site with a blank slate, why would you expect Wiktionary users to venture into a site dominated by a different culture and an existing foreign structure of policy and administration? We would get the most out of both communities with a neutral place to work focused on lexical data.
The Wikidata community does not do direct multilingual collaboration, despite our best efforts. Several mechanisms were imported from Commons, but I don't think we've ever had any important discussion here that had as much as 10% non-English-speaking participation. Wiktionary, on the other hand, is filled with language enthusiasts and some actual professional translators.

These issues deserve to be discussed, but there has been no discussion on this topic, and no consensus in favor of the decision to launch here. There are opportunities here that we may be throwing away. --Yair rand (talk) 21:51, 7 March 2018 (UTC)

Noé (talk) 14:57, 8 March 2018 (UTC)

@Yair rand: All important discussions (those that involve all the contributors) are in English because it's the international language and the one most spoken by Wikidata editors. Sometimes, on the French project chat, we choose to continue the discussion here on the English chat. It's totally normal and necessary. But it doesn't stop the French Bistro from being very active. Thanks to our efforts, 81% of the translatable messages are translated into French. It is possible to contribute to Wikidata only in French, without knowing a single word of English, while discussing the project only in French too. Tubezlob (🙋) 18:17, 8 March 2018 (UTC)

@Lea Lacroix (WMDE): would you respond to the questions above: 'Can we see links about "The decision about having a Lexeme namespace on Wikidata has been taken already." ?'
--- Jura 09:16, 15 March 2018 (UTC)

If I understand correctly, you are asking if there has been a community discussion and decision on-wiki. In that case, there is not. There have been several discussions between the development team and the communities, for example at Wikimania. We discussed the project with many people with diverse skills, over the past four years. Not all discussions happen online. Discussions happening IRL are also valid. Lea Lacroix (WMDE) (talk) 09:42, 15 March 2018 (UTC)

@Lea Lacroix (WMDE): I disagree (as I suspect most do) that IRL discussions are valid substitutions for on-wiki consensus, but I appreciate the clarification. Thank you. --Yair rand (talk) 17:55, 15 March 2018 (UTC)

I also disagree that IRL discussions are valid substitutions for on-wiki consensus. Of course any channel is good for communicating, with the different possibilities each one brings; this opinion is strictly about how to forge a consensus. --Psychoslave (talk) 16:14, 16 March 2018 (UTC)

I wish we had a system where we could refer to the different senses of a word directly - when you say "forge" I assume you mean either the second sense of the second etymology or the first or second sense of the third etymology on Wiktionary, and not the third sense of the second etymology? --Denny (talk) 17:18, 16 March 2018 (UTC)

(@Denny: You can do this, for senses with the appropriate template.) --Yair rand (talk) 19:37, 18 March 2018 (UTC)

Thanks, Yair rand, I didn't know about that template. --Denny (talk) 02:33, 19 March 2018 (UTC)

Wikidata focus - facts about real world or facts in Wikipedia

"en label should be the same as the enwiki article name" - [14] 77.180.168.210 00:41, 24 March 2018 (UTC)

Although, oddly, "There is no requirement that an item's label be the same as the page name on its corresponding Wikimedia site." says Help:Label --Tagishsimon (talk) 01:00, 24 March 2018 (UTC)

Strange (mis)use of "of (P642)" qualifier within significant event (P793) property

Look at Mona Lisa (Q12418): there is a statement there, significant event (P793) → art theft (Q1756454) with qualifier of (P642) → Vincenzo Peruggia (Q361362). I do not think there was a theft of Vincenzo Peruggia; it was a theft by Vincenzo Peruggia. A different example: according to www.museunacional.cat, Portrait of the Girl Rosó Galia (Q23932099) was "Donated by Joan Miró Folguera" to the museum in 1931, which is modeled on Wikidata as Portrait of the Girl Rosó Galia (Q23932099): significant event (P793) → gift (Q707482) with qualifier of (P642) → private collection (Q768717), in 1931. According to this SPARQL query there are 3750 uses of the significant event (P793) / of (P642) pair, and I cannot find any that make sense (at least in English). How is significant event (P793) supposed to be modeled? Usually, change-of-ownership events like acquisition (Q22340494), bequest (Q211557), gift (Q707482) or sales (Q194189) are modeled with the names of the old owner and the new owner, see for example c:Template:ProvenanceEvent, but how do we model that on Wikidata? Do we need to create new qualifiers for that? --Jarekt (talk) 15:25, 21 March 2018 (UTC)

Indeed a problem. I don't know how to deal with it. --Marsupium (talk) 16:11, 24 March 2018 (UTC)
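
For readers who want to inspect the pattern Jarekt describes, here is a minimal sketch of how such a query could be assembled. It is not the query linked in the original post, which is not reproduced in this archive. The function name `build_event_qualifier_query` and its parameters are hypothetical; the `p:`, `ps:` and `pq:` prefixes are the standard Wikidata statement, statement-value and qualifier prefixes. The code only builds a query string and does not contact any server.

```python
# Illustrative only: builds a SPARQL query string, does not run it.
# The function name and parameters are hypothetical helpers for this sketch.

def build_event_qualifier_query(event_property="P793", qualifier="P642", limit=100):
    """Return a SPARQL query listing statements of `event_property`
    that carry a `qualifier` qualifier."""
    return (
        "SELECT ?item ?event ?of WHERE {\n"
        f"  ?item p:{event_property} ?statement .\n"      # statement node on the item
        f"  ?statement ps:{event_property} ?event ;\n"    # main value, e.g. art theft
        f"             pq:{qualifier} ?of .\n"            # the "of" qualifier value
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_event_qualifier_query())
```

The resulting text could be pasted into the Wikidata Query Service to review individual uses one by one.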

Replace redirect

Is there an easy way to replace Keravnos Strovolou FC (Q25975624) with Keravnos Strovolou FC (Q7063280)? Because of the merging to the second item, no label is showing when using {{Wikidata list}} on Wikipedia. Xaris333 (talk) 15:24, 24 March 2018 (UTC)

Some bot will replace the incoming links soon. Sjoerd de Bruin (talk) 15:37, 24 March 2018 (UTC)

Still waiting for the bot. How soon is... soon? Xaris333 (talk) 18:07, 26 March 2018 (UTC)

Some of the changes on the wikidata has lag on the database

Tracked in Phabricator
Task T190667

About a month ago, on 28 February, this edit was made.

Now (26 March), this query still doesn't show the new interwiki link. That means replication has a long delay, even though the reported replag is 0. Something must be wrong. Yamaha5 (talk) 11:04, 26 March 2018 (UTC)

Wikidata weekly summary #305

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- Closed request for adminship: Putnik, Okkn. Welcome on board!
- Closed request for comments: Former ATE

Events
- Upcoming: 1st Workshop on Quality of Open Data, Berlin, July 18–20 (submission deadline May 27)
- Upcoming: EuropeanaTech and Wikidata Workshop Day for GLAMs, Rotterdam (NL), Monday 14 May. A day of GLAM-related workshops around Wikidata and Structured Commons, for beginners and advanced users.

Press, articles, blog posts
- SPLASHes in Wikidata, by Egon Willighagen

Other Noteworthy Stuff
- report about items with identical birth and death dates updated
- Relator, a tool to improve family relations in Wikidata
- Descendants check: consistency across multiple generations
- Due to Easter Monday, the next issue of the Weekly Summary will be sent on Tuesday, April 3rd. Until that day, feel free to add information in there

Did you know?

Development
- New search code for Wikidata merged. You may notice the improvement in the search results output for Wikidata item. However, new code for search is not enabled, only new results format. The search code will be enabled next week.
- Improving formatting of language and lexical category in diff for Lexemes (phab:T189679)
- Allow to remove a Form (phab:T189675)
- Translate the grammatical feature properly on Lexemes (phab:T189143)
- Investigate and fix a bug on Lexemes when undoing an edit (phab:T187215)
- Progress on refactoring the table wb_terms (phab:T189777, phab:T188993, phab:T188279)
- Fixing an error on the caching of the constraint checks (phab:T189842)
- Improving the performance of a table in the database (phab:T180834)
- Improving the way we're building dumps (phab:T177550)
- Investigate on improving Lua functions (phab:T143970)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 13:38, 26 March 2018 (UTC)

Zhao surname

Why are Zhao (Q804886) and Zhao (Q37245391) not merged, and why are they linked with said to be the same as (P460)? What is the difference? Wostr (talk) 14:18, 26 March 2018 (UTC)

looks like somebody thought one of them was a disambiguation page, but it's definitely not now. I think they can be merged (after the said to be the same as (P460) is removed). ArthurPSmith (talk) 14:36, 26 March 2018 (UTC)

One is in Chinese while the other is Latin script. Sjoerd de Bruin (talk) 17:05, 26 March 2018 (UTC)

A bit of a weird distinction, and I really don't get the idea. Sometimes one person has their surname in both scripts in their documents, and sometimes they use a different script (Latin) while having a surname in another script. The same surnames can be written in many, many ways depending on the language. What's more: китайская (趙/赵), корейская (조), вьетнамская (Triệu) фамилия – Russian description (Chinese, Korean and Vietnamese surname, so not only Chinese as in the statements)... If Zhao (Q37245391) covers many languages, shouldn't Zhao (Q804886) be something like a subclass of (P279) of the former? Wostr (talk) 17:49, 26 March 2018 (UTC)

Wikidata:WikiProject_Rowing/reports/P1559_for_rowers_(non-latin_script) gives a good idea how items for first names work for people whose native name isn't in Latin script.
--- Jura 18:01, 26 March 2018 (UTC)

/conflict/ Also, if the difference here is the script only, it's a bit hilarious: if I were a Russian student who came to Poland to study at some university, I would have to present a transcription of my birth certificate. So my name in Cyrillic script, let's say, Пётр Фёдорович Ковалёв (here the surname is Ковалёв) would be transcribed as Piotr Fiedorowicz Kowalow (surname: Kowalow). But Ковалёв and Kowalow are the same surname written in different scripts, so I really don't understand why we need two items for one term. Wostr (talk) 18:09, 26 March 2018 (UTC)

Is there a notability standard for works and authors here?

Pigsonthewing/Andy Mabbett has a template on Wikipedia that cites references automatically based on WikiData entries. I thought that was a cool idea and wanted to make Wikidata entries on various sources and authors that are cited in my Wikipedia articles about paleontology. The scope of inclusion at this project seems really broad but I wanted to check and see if the random paleontology authors and sources cited in my articles are even compatible with this project. Abyssal (talk) 15:46, 26 March 2018 (UTC)

If they are considered reliable sources on Wikipedia, they will be fine here; see also m:WikiCite and WD:Zotero; and you may enjoy using Scholia. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:12, 26 March 2018 (UTC)

Well speak of the devil, hahaha! Thanks, Andy. I'll get to work some time today. Abyssal (talk) 16:16, 26 March 2018 (UTC)

Wikidata does have a general notability standard, which can be found at WD:N. But Andy's summary above is generally correct :-) -- Ajraddatz (talk) 17:47, 26 March 2018 (UTC)

AFAIK the Wikicite team created millions of items for sources, so yes, you can create items for sources. However, I think there were problems with the utilization of items about sources in Wikipedia articles, namely it was too resource intensive. I don't know if the situation has changed since then. Micru (talk) 18:18, 26 March 2018 (UTC)

Member of political party start time is mandatory

Member_of_political_party has start_time as mandatory. How do you determine when someone becomes a member of a political party? That would require access to voter registration records. Is it when they turn 18 years old in the USA? Is it when they win an election for a public office? Or do we remove the mandatory constraint? --RAN (talk) 18:31, 23 March 2018 (UTC)

I'm an election official (justice of the peace) in Vermont (USA). In my state, there is no mechanism to join a political party. When a primary election occurs, the election official asks the voter which party's ballot he/she wants. The next election, the voter could ask for a different party's ballot without having to do anything to change from one party to another. Jc3s5h (talk) 21:01, 23 March 2018 (UTC)

Start time as a mandatory constraint for member of political party (P102) is for all practical purposes nonsense. Moving away from the US situation of voter registration (which, fwiw, may or may not equal membership), it's clear that there will be few sources of information as to when anyone who joined a political party did so.

The constraint was added by a bot [15] and so it would be handy to get some more background from the owner, User:Pasleim, and a pointer to any prior discussion; I've had a brief hunt for such, but have found nothing. --Tagishsimon (talk) 00:56, 24 March 2018 (UTC)

This constraint was not really added by my bot, but just converted from P1646 (mandatory qualifier)=start time (P580) to property constraint (P2302)=required qualifier constraint (Q21510856). The original constraint with P1646 (mandatory qualifier) was already added in 2014 by User:Милан Јелисавчић. --Pasleim (talk) 06:55, 24 March 2018 (UTC)

I can see how it can be used, for say, Donald Trump who ran as a candidate for multiple parties, we would have the dates for that from his Federal Election Commission filings. For any modern US politician switching parties, we would have access to the paperwork. I think we should get rid of the mandatory requirement. --RAN (talk) 01:14, 24 March 2018 (UTC)
- For most cases like this, we would ideally have the party information via qualifiers on candidacy in election (P3602) or position held (P39) statements, rather than a top-level member of political party (P102) anyway. --Oravrattas (talk) 06:09, 27 March 2018 (UTC)

- - The mandatory constraint for member of political party (P102) is impossible to fulfill. Party membership is in many cases registered within the party organisation alone, and may not be public.
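
To make concrete what a required qualifier constraint (Q21510856) asks of a statement, here is a minimal sketch of the check. It is not Wikibase's actual constraint-checking code. The function name `missing_required_qualifiers` and the simplified dict layout are assumptions made for illustration, and the sample qualifier value is a placeholder.

```python
# A toy check, assuming a statement is a plain dict shaped loosely like
# {"qualifiers": {"P580": [...]}}. Not the real Wikibase implementation.

START_TIME = "P580"  # start time


def missing_required_qualifiers(statement, required=(START_TIME,)):
    """Return the required qualifier properties that the statement lacks."""
    present = statement.get("qualifiers", {})
    # A qualifier counts as present only if it has at least one value.
    return [prop for prop in required if not present.get(prop)]


# Placeholder data: one P102-style statement with a start time, one without.
with_start = {"qualifiers": {START_TIME: ["some-date"]}}
without_start = {}

print(missing_required_qualifiers(with_start))     # no violation expected
print(missing_required_qualifiers(without_start))  # reports the missing P580
```

Under this reading, every membership statement without a known joining date would be flagged, which is the situation the commenters above object to.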

Qualifiers for "material used"

I've proposed additional allowed qualifiers for made from material (P186) on its Talk page. Please share your thoughts! - PKM (talk) 18:48, 26 March 2018 (UTC)

How to access the female forms of occupations without expensive calls?

On the Wikidata Infobox talk page on Commons, @Powerek38: asked if the infobox could display actress rather than actor at commons:Category:Hanka Bielicka (and similarly for other female forms of occupations). Which is reasonable enough. However, they then explained that the convention here is not to use actress (Q21169216) (which would have worked out-of-the-box), but to use actor (Q33999) -> female form of label (P2521). Which is, TBH, weird/sexist. It didn't seem possible to do this using parser functions and existing Lua code, so I asked @RexxS: if commons:Module:WikidataIB could be modified to add that functionality. However, doing so apparently requires many expensive arbitrary access calls that would be a huge performance hit (see the discussion linked at the start for the full details).

So, is there a way to access that label without an expensive call? Or could we just start using actress (Q21169216) and similar (with a bot-run to fix existing uses)? Background on how we ended up at this odd situation would be appreciated. Thanks. Mike Peel (talk) 21:15, 26 March 2018 (UTC)

"Which is reasonable enough". Or not, of course. There doesn't seem to be a consensus. [16] --Tagishsimon (talk) 21:32, 26 March 2018 (UTC)

Well, I think this is obviously an ongoing discussion in many language communities around the world, I can certainly say that about Poland and Polish speakers (as that's my home country and my mother tongue). And yes, there are some female forms in modern Polish which would be used by people of more feminist views, but probably not by people who are conservative (linguistically or otherwise). However, in many cases the female forms have become a totally obvious, undisputed part of universally accepted language. In these cases usage of male forms for women looks really strange and artificial. Shortly speaking, I think this issue should get a technical solution. How and when this solution will be used, which forms will be provided as female form of label (P2521) and based on what criteria, that's a different matter all together. Powerek38 (talk) 21:47, 26 March 2018 (UTC)

Yeah, but the original problem is that you have some item like Hanka Bielicka (Q428422). When you request their occupation you get the male form by default and if you want the female form you need to lookup actor (Q33999). This means that data lookups are slower for women than men, which is quite sexist, especially if these extra data lookups are denied because it's thought to be too expensive. Ghouston (talk) 00:41, 27 March 2018 (UTC)

A possible solution would be to use only gender-neutral labels. If a language lacks a gender-neutral term, it could be set to "actor/actress" or whatever, and the caller could optionally select the one desired without needing to make another data query. Ghouston (talk) 00:55, 27 March 2018 (UTC)

Note that there are languages other than English. For most of Slavic languages for occupations using male form for females is grammatically incorrect and there is no gender neutral term.--Jklamo (talk) 08:49, 27 March 2018 (UTC)
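
The cost difference Ghouston describes can be illustrated with a toy model. This is a sketch under stated assumptions, not the code of commons:Module:WikidataIB or of any Wikibase API. The `ToyStore` class, its `label` and `load` methods, and the sample labels are hypothetical. The model assumes that fetching a label is cheap, while reading a statement such as female form of label (P2521) requires loading the whole occupation entity.

```python
# Toy model: counts "expensive" full-entity loads. All names are hypothetical.

class ToyStore:
    def __init__(self, entities):
        self._entities = entities
        self.loads = 0  # number of expensive full-entity loads

    def label(self, entity_id, lang):
        """Cheap lookup: label only, no full entity load is counted."""
        return self._entities[entity_id]["labels"][lang]

    def load(self, entity_id):
        """Expensive lookup: returns the whole entity."""
        self.loads += 1
        return self._entities[entity_id]


def occupation_label(store, person_id, lang, female=False):
    person = store.load(person_id)      # the infobox loads the person anyway
    occupation_id = person["P106"][0]   # occupation (P106)
    if female:
        occupation = store.load(occupation_id)  # the extra expensive load
        for text_lang, text in occupation.get("P2521", []):
            if text_lang == lang:
                return text
    return store.label(occupation_id, lang)     # cheap fallback


# Illustrative data only; the labels are sample values for this sketch.
DATA = {
    "Q428422": {"labels": {"en": "Hanka Bielicka"}, "P106": ["Q33999"]},
    "Q33999": {"labels": {"en": "actor"}, "P2521": [("en", "actress")]},
}

if __name__ == "__main__":
    for female in (False, True):
        store = ToyStore(DATA)
        text = occupation_label(store, "Q428422", "en", female=female)
        print(text, "expensive loads:", store.loads)
```

In this model the female form costs one additional entity load per occupation, which is the overhead the thread is trying to avoid.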

We need your feedback to improve Lua functions

Hello,

If you’re regularly using Lua modules, creating or improving some of them, we need your feedback!

The Wikidata development team would like to provide more Lua functions, in order to improve the experience of people who write Lua scripts to reuse Wikidata's data on the Wikimedia projects. Our goals are to help harmonizing the existing modules across the Wikimedia projects, to make coding in Lua easier for the communities, and to improve the performance of the modules.

We would like to know more about your habits, your needs, and what could help you. We have a few questions for you on this page. Note that if you don’t feel comfortable with writing in English, you can answer in your preferred language.

Feel free to ping or share this message with any editors that could be interested!

Thanks a lot for your help, Lea Lacroix (WMDE) (talk) 08:15, 27 March 2018 (UTC)

Merge

Q51151859 into Q4308005 77.179.31.197 05:48, 31 March 2018 (UTC)

Done —Thibaut120094 (talk) 08:20, 31 March 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 12:56, 2 April 2018 (UTC)

How to show ambulanceman from WW1

William Benjamin Owen (Q18911202) served in WW1 for the British Red Cross and Order of St John of Jerusalem (BRC/StJJ), according to the Army Medal Office WWI Medal Index Cards. How should this be shown for the "military branch"? Adding BRC creates a constraint conflict, and changing the BRC's item seems to be an incorrect way to manage the constraint. Suggestions? Note that I have had a go at adding it as a qualifier to Conflict. — billinghurst sDrewth 12:41, 24 March 2018 (UTC)

I wouldn't consider either the British Red Cross or the Order of St John of Jerusalem to be branches of the military. - PKM (talk) 18:56, 24 March 2018 (UTC)

@PKM: normally neither would I, however, this person has been awarded military medals for military service, and clearly was seen to have participated in the conflict of WW1. So saying "no no no, we don't believe ..." doesn't allow us to capture these facts. — billinghurst sDrewth 12:14, 28 March 2018 (UTC)

New user script for faster navigation

Hey, with the increasing length of items I spend more time on annoying scrolling, especially if there are a lot of values for one property on an item (examples: Linux kernel (Q14579) or Beta-2-microglobulin (Q287896)). Therefore I started writing a script to improve this: User:MichaelSchoenitzer/Updown. For now it does two things:

If there are a lot of values for one property it will add arrows that allow you to jump to the first/last value.
With the keys j and k you can jump to the next/previous statement

Maybe it's helpful for you too. If you have additional ideas for features improving navigation, feel free to propose them. -- MichaelSchoenitzer (talk) 23:55, 26 March 2018 (UTC)

I've added a new feature:

Press the key t to toggle a sticky menu that allows you to jump to a statement.

-- MichaelSchoenitzer (talk) 10:54, 27 March 2018 (UTC)

Nice, thanks!
--- Jura 19:00, 27 March 2018 (UTC)

Make Help:Label translatable

Hello, I was working on bringing the translation in Italian up to date for the links in WD:1, but I noticed that Help:Label doesn't have the Special:Translate links. Is there a reason for this? Thanks! --Sabas88 (talk) 07:48, 27 March 2018 (UTC)

Yes, because each language follows different rules regarding labels, so it doesn't make sense to have English guidelines translated. --β₁₆ - (talk) 13:50, 27 March 2018 (UTC)

I don't agree with that: there should be documentation for labels in every language, translated into every language. That makes sense. Tubezlob (🙋) 19:20, 27 March 2018 (UTC)

"there should be documentation for labels in every language" is OK, but English, Arabic, Tamil, Japanese, Hindi and 250+ other languages can't have the same rules, because every language has its own grammar and lexical features (see also description and alias). --β₁₆ - (talk) 08:34, 28 March 2018 (UTC)

Couldn't it host only the general guidelines and leave the local rules to a local help section? --Sabas88 (talk) 15:42, 28 March 2018 (UTC)

Request to add information to the mini-database of Q6152160 (Jane Duncan)

From Request a Query...

In Jane Duncan (Q6152160), it is correctly stated that 'Jane Duncan' is a pseudonym of the person described here.

But if you look at this person's Wikipedia page, you will see that she also used Janet Sandison as a pseudonym.

The thing is that I personally don't have much practice with Wikidata, and although I do simple things here, I simply don't know how to add a second pseudonym. There is also the matter of adding the source of that information in Wikidata, which I don't know how to do either.

If any Wikimedian knows how to do this, please implement it. Many thanks.

--AnselmiJuan (talk) 13:50, 28 March 2018 (UTC)

After Google Translate ... we have an item Jane Duncan (Q6152160) which seems to be a record for a person, Elizabeth Jane Cameron, filed using one of her pseudonyms, Jane Duncan, as a label. Cameron also used another pseudonym, Janet Sandison. The questions raised are: are we happy with the current record, and how do we record the second pseudonym? --Tagishsimon (talk) 15:10, 28 March 2018 (UTC)

Dear user Tagishsimon. Thank you very much for moving my request here, and for translating my request for help into English. Very kind of you. --AnselmiJuan (talk) 17:44, 28 March 2018 (UTC)

Can a translation admin please mark the changes on this page

Hi

Could someone who is a translation admin pretty please change the text on Template:Data_import_header/text? I'm making some changes to the Wikidata Import Hub so it works better and need to change the header. I've requested to become a translation admin but not been granted yet and need to get this bit done so I can move onto other parts.

Thanks

--John Cummings (talk) 16:49, 28 March 2018 (UTC)

Done Matěj Suchánek (talk) 17:05, 28 March 2018 (UTC)

@Matěj Suchánek: 👍👍 --John Cummings (talk) 17:14, 28 March 2018 (UTC)

interface for data entering is confusing for time value with century precision

This must have been asked and discussed several times before, so I would appreciate to get a pointer to any conclusion.

On Help:Dates#Precision it is very clearly stated that a time value of '+1800-00-00T00:00:00Z' with a precision 7 should be interpreted as the 1800s or a time between years 1800 and 1899. But when you edit Wikidata you have to enter "18. century" to get that value. As "the 18th century" (very similar to "18. century") in English (as well as some other languages) means 1700–1799, it is very likely that you enter "19. century", which gives a time value of '+1900-00-00T00:00:00Z' with precision 7. Today I see that user &beer&love has entered a date of birth (P569) with "20. century" (i.e. time value '+2000-00-00T00:00:00Z') for a lot of people born between 1900 and 1999. See e.g. this edit of Secundino González (Q5639303).

This leads to my question: Shouldn't the interface be changed so that it says 1800s (like on the help page) instead of 18. century in order to reduce the risk of people entering this value for dates between 1700 and 1799? Maybe something else should be chosen in order to tell if 1800s should be read as 1800–1809 or 1800–1899. --Larske (talk) 21:10, 18 March 2018 (UTC)

@Larske: Last time I recall a small discussion on this even worse problem was in January at Wikidata:Project chat/Archive/2018/01#Centuries, started by PKM. Unfortunately it got closed. Indeed I see this as a huge source of wrong data input as well. Nobody really seems to care though. I'm kinda desperate. --Marsupium (talk) 22:21, 21 March 2018 (UTC)

@Marsupium: Thanks for the pointer. I have now added a slightly modified version of my post to the Phabricator task T95553.

Tracked in Phabricator
Task T95553

--Larske (talk) 05:23, 22 March 2018 (UTC)

The « precision 7 should be interpreted as the 1800s or a time between years 1800 and 1899 » is quite strange and confusing (and the « 1800s » part is probably wrong, precision 7 is ~~decennial not centennial~~ centennial not decennial). It was first added on this page by Jarekt and modified by Jc3s5h (and corrected again by Jarekt and Jc3s5h...). Cdlt, VIGNERON (talk) 07:47, 22 March 2018 (UTC)

The earliest version of the JSON datamodel by Christopher Johnson (WMDE) states that precision 7 is 100 years, and the current version agrees. Would VIGNERON please state the basis that "precision 7 is decennial not centennial"? – The preceding unsigned comment was added by Jc3s5h (talk • contribs).

@Jc3s5h: Ooups, my mistake, it's obviously the other way round (this is really confusing /o\ especially for a French speaker as "century"@en = "siècle"@fr and "centennial"@en = "centurie"@fr).

The help page has indicated 1800s for precision 7 (which is a decade, 1800-1809 and not 1800-1899 as currently stated) since your diff last June: Special:Diff/508966983. Do you agree this was a mistake? (Or am I missing something?)

Cdlt, VIGNERON (talk) 11:22, 22 March 2018 (UTC)

In English, 1800s would usually mean 1800 to and including 1899. This does leave the question of what to call the decade 1800 to and including 1809. There is no good solution for this; the decade 1900 to and including 1909 is sometimes jokingly called the nineteen-naughties (since "naught" is a synonym for zero). Jc3s5h (talk) 13:56, 22 March 2018 (UTC)

@Jc3s5h, Jarekt: Before deciding what to call something, it seems reasonable to agree upon which time interval it represents. According to a comment from @Lucas Werkmeister (WMDE) in the Phabricator task T95553, the value "1800 with precision 7" represents the years from 1701 to 1800, i.e. neither 1700–1799 nor 1800–1899, the latter being what is now stated on the help page.

Do you agree that the help page should be changed to say 1701–1800? Then we can discuss the most appropriate name for that type of interval in different languages. --Larske (talk) 21:46, 28 March 2018 (UTC)

The help page agrees with the specifications: mediawikiwiki:Wikibase/DataModel/JSON and mediawikiwiki:Wikibase/Indexing/RDF Dump Format. I disagree with @Lucas Werkmeister (WMDE):. Jc3s5h (talk) 22:39, 28 March 2018 (UTC)

Not sure if related (kind of a TL;DR issue), but to put some links together: phab:T73459, discussion (read the part in English; in Latvian there isn't anything related). --Edgars2007 (talk) 09:59, 22 March 2018 (UTC)
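To make the disagreement in this thread concrete, here is a minimal Python sketch of the two readings of a century-precision value. The function names are hypothetical and are not part of Wikibase; the first function follows the interval quoted from Help:Dates (1800 means 1800 to 1899), the second follows the reading attributed to the Phabricator comment (1800 means 1701 to 1800). The sketch only handles positive years.

```python
from typing import Tuple


def parse_year(time_value: str) -> int:
    """Extract the signed year from a value like '+1800-00-00T00:00:00Z'."""
    sign = -1 if time_value.startswith("-") else 1
    year_text = time_value.lstrip("+-").split("-", 1)[0]
    return sign * int(year_text)


def century_span_help_page(time_value: str) -> Tuple[int, int]:
    """Reading quoted from Help:Dates: 1800 at precision 7 covers 1800-1899."""
    year = parse_year(time_value)
    start = (year // 100) * 100  # round down to a multiple of 100
    return start, start + 99


def century_span_ordinal(time_value: str) -> Tuple[int, int]:
    """Reading attributed to the Phabricator comment: 1800 covers 1701-1800."""
    year = parse_year(time_value)
    end = -(-year // 100) * 100  # round up to a multiple of 100
    return end - 99, end


if __name__ == "__main__":
    value = "+2000-00-00T00:00:00Z"  # what entering "20. century" stores
    print(century_span_help_page(value))
    print(century_span_ordinal(value))
```

Under the first reading, a person born in 1950 falls outside the span of the value stored for "20. century", while under the second reading they fall inside it, which is the data-entry problem described above.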

How to stop bots from adding wrong interwiki links?

I removed the Wikipedia link from Q2402286 to de:Telldenkmal because the subject is not the same: Q2402286, corresponding to English Wikipedia's article en:Tell Monument, is for a specific memorial to William Tell in Altdorf, Switzerland. German Wikipedia's article de:Telldenkmal, on the other hand, is describing memorials to William Tell in general - including, but not limited to the Altdorf monument. The Wikidata item doesn't correspond to that; various things such as the coordinates of the Altdorf monument, the GND ID etc. aren't applicable to the subject "memorials (!) to William Tell". I even left an explanation on the talk page. But of course, bots don't read talk pages, so User:EmausBot re-added the wrong link. How can I remove the link and prevent further wrong bot edits? Gestumblindi (talk) 23:17, 21 March 2018 (UTC)

By adding the sitelink to some other item. Sjoerd de Bruin (talk) 23:27, 21 March 2018 (UTC)

Well, as there is no Wikidata item for the subject "Memorial to William Tell", I guess I have to create one, then... Gestumblindi (talk) 23:29, 21 March 2018 (UTC)

@Sjoerddebruin: I created Q50824634 - ok? Gestumblindi (talk) 23:47, 21 March 2018 (UTC)

@Gestumblindi: - Yes. Good work. (I've amended the EN label & description to emphasise the list quality of the DE article - memorials to William Tell (Q50824634).) --Tagishsimon (talk) 11:09, 22 March 2018 (UTC)

hmmm Tagishsimon, are you sure Q2402286 ? --Hsarrazin (talk) 12:44, 23 March 2018 (UTC)

For similar case of misbehaving bot see https://www.wikidata.org/w/index.php?title=Q49518982&action=history . Bot adds link to non-existing page, human removes it, bot re-adds the wrong data. How do we break that cycle? --Jarekt (talk) 13:17, 22 March 2018 (UTC)

You would either have to attach the link to another item, or get the bot changed somehow... ArthurPSmith (talk) 13:40, 22 March 2018 (UTC)

For the Commonscategory stuff, I started deprecating the statements. It's just kafkaesque ..
--- Jura 15:48, 22 March 2018 (UTC)

@Magnus Manske: ^. Matěj Suchánek (talk) 15:54, 22 March 2018 (UTC)

There are several bots doing this.
--- Jura 17:23, 22 March 2018 (UTC)

Sometimes I try to figure out where the bot gets bad info and try to correct it there, but in this case there is no information on where it is being imported from. --Jarekt (talk) 12:19, 23 March 2018 (UTC)

@Jarekt: In this case it's related to Topic:U76ifhzw3ordaswa and probably these lists. --Edgars2007 (talk) 12:52, 23 March 2018 (UTC)

P373 seems like a good sample to illustrate why overly broad properties don't work well in Wikidata.
--- Jura 09:26, 25 March 2018 (UTC)

I do not find the P373 property to be "overly broad". It is just like a sitelink to a specific namespace. Unfortunately, sitelinks verify that the page exists and change automatically with page renames or redirects, and links to pages on Commons do not. --Jarekt (talk) 18:18, 28 March 2018 (UTC)

I find it overly broad because the same is applied to multiple items and none of the checks we have can ensure that this is done correctly.
--- Jura 06:39, 29 March 2018 (UTC)

How to show that John Andrews (Q126052) served onboard USS Benicia (Q2273142)

Can anyone help me show that a sailor served onboard a named vessel? Breg Pmt (talk) 17:43, 25 March 2018 (UTC)

You could use crew member(s) (P1029) on the ship, but I can't see any specific inverse property. Maybe item operated (P121) is closest. work location (P937) only applies to geographical locations (or does it?). Ghouston (talk) 23:00, 25 March 2018 (UTC)

position held (P39) crew member (Q5184855) / of (P642) USS Benicia (Q2273142) might be another possibility (by analogy with some of the cosmonauts), but it looks a bit too 'round the houses' to be very robust, unless it becomes a recognised idiom for this.

It would also have the disadvantage of not playing well with QuickStatements, if there are lots of ships that one wants to record that somebody has served on. On the other hand, this kind of format doesn't seem to be holding back User:Andrew Gray's work on UK MPs. Jheald (talk) 23:28, 25 March 2018 (UTC)

Maybe the inverse property should be created. crew member(s) (P1029) is already being used on people, like Johann Mohr (Q70370) and Engelbert Endrass (Q64114). Ghouston (talk) 06:14, 26 March 2018 (UTC)

A new "crew member of" property would definitely be useful. Could be used for all kinds of vessels - from ships, submarines and planes to (fictional) spaceships. Also for bigger naval vessels that have been in service for decades, it might be a bit much to add all (notable) persons who served aboard it to the ship's item via crew member(s) (P1029) statements.

Note: at the moment, crew member(s) (P1029) still has a property constraint that limits it to astronauts. I already asked on the property's talk page and WikiProject Space about the removal of the constraint, but got no replies. --Kam Solusar (talk) 12:16, 26 March 2018 (UTC)

@Jheald, Kam Solusar, Ghouston: I agree that broadening crew member(s) (P1029) is not the best way to go - it makes a lot more sense to have the served-on relation on the person rather than the ship, as this is generally consistent with how we handle other similar things like "worked for" or "was a member of". Long-lived ships with very large crews could potentially have a lot of people here and it would get awkward to list them all on the ship item. For spaceflight, this isn't an issue as (so far) we've never had more than a handful of people at any one time.

I think a new "crew member of" property would be a good idea - position held (P39) really isn't a great way to handle things that are more of a "job" or "posting" than an official position. It might be useful for eg commanders, but we already have commander of (DEPRECATED) (P598) for that. Andrew Gray (talk) 21:35, 26 March 2018 (UTC)

I created a proposal at Wikidata:Property proposal/member of the crew of Ghouston (talk) 22:44, 28 March 2018 (UTC)

Icelandic municipality Ölfus

Can someone take a look at Ölfus (Q297010) and Q16420549? Are these two items about the same thing? permanent duplicated item (P2959)? Wostr (talk) 13:58, 26 March 2018 (UTC)

As far as I understand it, Ölfus (Q297010) describes an administrative division while Q16420549 describes a physical-geographical region. --Pasleim (talk) 15:12, 26 March 2018 (UTC)

As an Icelander, I can confirm that Pasleim is right on this one.--Snaevar (talk) 11:37, 29 March 2018 (UTC)

Ok, thanks, I'll add different from (P1889) then. Wostr (talk) 13:03, 29 March 2018 (UTC)

Maximum and minimum values of a property

How can I indicate the maximum and minimum values of some property (e.g. SD card -> memory capacity: minimum:128MB-maximum:2TB)?

I saw minimum value (P2313) and maximum value (P2312) but from what I understand they should be used only as qualifiers of property constraint (P2302). There is also maximum size or capacity (P3559), but there isn't the equivalent for minimum.

And finally, why isn't there a "range" datatype? --Malore (talk) 16:23, 26 March 2018 (UTC)

Some properties seem to be made out to be for maximum without actually mentioning it. The qualifier "criterion used" with the values maximum or minimum can work too. I noticed you made proposals for a series of subproperties of temperature at Wikidata:Property_proposal/Generic#maximum_operating_temperature (etc.).
--- Jura 19:10, 27 March 2018 (UTC)

@Jura1: So my property proposals "maximum operating temperature" and "minimum operating temperature" are equivalent to "operating temperature" qualified with "criterion used"? --Malore (talk) 07:32, 29 March 2018 (UTC)

Yes, for a similar use, see Q50078381#P2067.
--- Jura 07:47, 29 March 2018 (UTC)
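As an illustration of the "criterion used" suggestion above, here is a minimal Python sketch of how two statements on the same property could be told apart by a qualifier. The `Statement` class and the `pick` helper are hypothetical and are not a Wikibase API; plain labels stand in for real property and item IDs, and the figures are the SD card values from the original question.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    prop: str                              # property label (placeholder for a P-id)
    amount: float
    unit: str
    criterion_used: Optional[str] = None   # qualifier value, e.g. "minimum" or "maximum"


def pick(statements: List[Statement], prop: str, criterion: str) -> Optional[Statement]:
    """Return the first statement on `prop` qualified with the given criterion."""
    for statement in statements:
        if statement.prop == prop and statement.criterion_used == criterion:
            return statement
    return None


# Values taken from the question: minimum 128 MB, maximum 2 TB.
sd_card = [
    Statement("memory capacity", 128, "megabyte", criterion_used="minimum"),
    Statement("memory capacity", 2, "terabyte", criterion_used="maximum"),
]

if __name__ == "__main__":
    print(pick(sd_card, "memory capacity", "maximum"))
```

The trade-off is that every consumer has to know to read the qualifier, whereas dedicated "maximum ..." and "minimum ..." properties would make the distinction visible in the property itself.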

Anyone up for a new WikiProject?

Hi after poking around some items of Holocaust victims it struck me that we still have a lot of work to do modelling death. I opened a thread on fb to discuss this and the conclusion was to open a thread here. I think we need a Wikidata:WikiProject Death to gather up all the various properties to help with aspects like Help:Modeling causes but also to help with issues like "burial date = used for death date" or "hospice place = used for death place; last known address was X". Anyone interested? I know it is a bit morbid, but a list of properties and qualifiers would be very helpful for anyone who adds people items regularly. Jane023 (talk) 07:27, 29 March 2018 (UTC)

I'm not sure if it's a good thing to combine these two subjects in one WikiProject. I think the first one does have a dedicated project and properties for the others should have fairly detailed descriptions.

For FB people, maybe Oxford Analytics could be an interesting new topic for a project.
--- Jura 07:39, 29 March 2018 (UTC)

What do you mean by "first one"? The help pages for causality? Jane023 (talk) 07:51, 29 March 2018 (UTC)

https://twitter.com/wikireaper Multichill (talk) 09:18, 29 March 2018 (UTC)

Very funny! I am actually more interested in Public Domain deaths (so creators who died before 1947), and a lot of these people died during WWI or WWII from various war-related causes. Just trying to find some quick answers, but as per usual, I seem to solve these one-by-one and then promptly forget how I solved them. Just make me a page and I will add some examples. Looking for a morbidly interested party (like that twitter account creator?). Jane023 (talk) 09:37, 29 March 2018 (UTC)

User:Andrew Gray has been interested in definitively killing off some UK MPs [17], but only as a sideline to his normal line of business I think. Jheald (talk) 10:45, 29 March 2018 (UTC)

Well I guess we have similar problems with the Dutch parliament - I haven't even looked at those. I am however also interested in definitively killing off a few saints, in addition to resurrecting some of their post-death miracles. Don't ask me how to model those deeds though! Jane023 (talk) 10:57, 29 March 2018 (UTC)

I'd definitely be interested in doing some more work on modelling death - we're okay on the "mechanics", but a bit vague on the circumstances, which historically are often more meaningful (eg we may not know exactly how someone died, but we know it was during a certain riot). For a nicely detailed example, Stan Rogers (Q3025760) gives the date and place of his death, the fact that it was an "accident", and that the direct cause of death was smoke inhalation. But there's no easy way to explain that it was an air crash, or indeed that it was specifically Air Canada Flight 797 (Q2003519), which feels a bit more useful to the reader. Andrew Gray (talk) 12:28, 30 March 2018 (UTC)

@Andrew Gray: "no easy way to explain that it was an air crash, or indeed that it was specifically Q2003519" In the Facebook discussion referred to by Jane, I suggested modelling such cases like this. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:26, 30 March 2018 (UTC)

@Pigsonthewing: "Facet" isn't appropriate, surely? It's intended for subtopics (History of X > X). I've considered using "significant event" qualifiers in a similar way but it feels clunky - especially since there are three or four different properties it could be a qualifier on. Andrew Gray (talk) 15:33, 30 March 2018 (UTC)

It's described as "topic of which this item is an aspect, item that offers a broader perspective on the same topic". How does that not apply? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:32, 30 March 2018 (UTC)

The description is quite vague, but the examples on facet of (P1269) seem pretty clear to me - it's an event that's a part of a larger event, or a concept that's part of a larger concept. A person's involvement with an event isn't a facet of the event in the way that's normally interpreted. Leaving that aside, as noted, there's no obvious way to decide which of the death-related properties to put qualifiers on (manner, cause, place, date, or even military classification), and I think using a property like this would just confuse the end-user. Andrew Gray (talk) 17:35, 30 March 2018 (UTC)

This sounds like a good idea 👍 I recently struggled with the Spanish poet Federico García Lorca, whom the fascists executed by firing squad during the Spanish civil war. Was that a homicide or capital punishment? To this day the answer will depend upon who you ask, regrettably. Moebeus (talk) 16:52, 30 March 2018 (UTC)

How to describe that a journal appears in an online repository

Hi all

How would I describe that a journal appears in the Directory of Open Access Journals, e.g. this? I tried to do this by creating an ID property for the journal, but this was rejected because DOAJ uses ISSN numbers as its numbering system.

Thanks

--John Cummings (talk) 15:29, 29 March 2018 (UTC)

what ? is there a discussion about it ? I would also support a DOAJ ID property... the fact that it uses ISSN does not mean that all publications with ISSN are in it ^^ and DOAJ undoubtedly is a very good source for access to journals and reviews. --Hsarrazin (talk) 15:45, 29 March 2018 (UTC)

ok, found it, Wikidata:Property_proposal/Directory_of_Open_Access_Journals_ID, but I do not understand AT ALL, how to use third-party formatter URL (P3303) to indicate that some publication is in DOAJ :( --Hsarrazin (talk) 15:52, 29 March 2018 (UTC)

As far as I can tell we can't use that property for creating any direct link to DOAJ. It's just something that goes on the actual property page ISSN (P236) to indicate how you would create the url from the id. So you could use it in queries to build the URL for DOAJ but it won't create any automatic linking like the main formatter URL will. NavinoEvans (talk) 16:05, 29 March 2018 (UTC)
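A minimal Python sketch of what "create the url from the id" means in practice: a formatter pattern contains the placeholder `$1`, and a caller substitutes the identifier into it. The pattern below and the sample ISSN are assumptions chosen for illustration only; neither is taken from this discussion or confirmed as the value stored on ISSN (P236).

```python
from urllib.parse import quote

# Assumption: an illustrative DOAJ pattern, not confirmed by the discussion.
DOAJ_PATTERN = "https://doaj.org/toc/$1"


def format_external_url(pattern: str, identifier: str) -> str:
    """Substitute an identifier into a formatter-style pattern containing '$1'."""
    if "$1" not in pattern:
        raise ValueError("pattern must contain the placeholder $1")
    # Percent-encode the identifier so unexpected characters cannot break the URL.
    return pattern.replace("$1", quote(identifier, safe=""))


if __name__ == "__main__":
    # "1234-5678" is a made-up ISSN used only to show the substitution.
    print(format_external_url(DOAJ_PATTERN, "1234-5678"))
```

This only builds a candidate URL; it says nothing about whether the journal is actually listed in DOAJ, which is the gap raised in the replies that follow.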

Any thoughts on using published in (P1433) for creating the link to DOAJ? it seems quite logical to me, but does not really fit with the current description for the 'published in' property(in English anyway). NavinoEvans (talk) 16:08, 29 March 2018 (UTC)

You can also use catalog (P972) (query). Black Lunch Table (Q28781198) uses P972 that way. But it does not create a link here in Wikidata, though. strakhov (talk) 18:03, 29 March 2018 (UTC)

Use the DOAJ URL as a reference. Or rely on DOAJ to give 404 response code where appropriate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:36, 29 March 2018 (UTC)

How to use SPARQL query to get an overview about how many DOAJ URLs constructed that way are link rot (Q1193907) (= HTTP 404 (Q208219))? --Succu (talk) 19:36, 29 March 2018 (UTC)

John Cummings, simply reopen your proposal Wikidata:Property proposal/Directory of Open Access Journals ID. It makes a lot of sense. --Succu (talk) 20:53, 29 March 2018 (UTC)

Thanks all, I think I would like to keep it simple to allow people to query easily. @Succu: the external ID makes a lot of sense to me as well, I don't really understand why it's not OK (lots of ID numbering systems go from 0001 - 9999 and they are allowed to be separate ID properties), if anyone feels the urge to reopen it, good luck to you. catalog (P972) seems like the best if that isn't allowed.

@Pigsonthewing:, sorry, what do you mean? Having 404s feels like something we want to avoid? I feel like I'm quite out of my depth...

I think this highlights the need for some kind of documentation on agreed community norms on how to describe different kinds of data. This would also help people to do more successful queries, if there are 5 different ways to describe the data and we aren't telling people which one we are using they are unlikely to get an accurate result.

Thanks

--John Cummings (talk) 21:39, 29 March 2018 (UTC)

@John Cummings:

please re-open the Property proposal, explaining that the fact that the ID is ISSN does not solve the problem of linking, nor the problem of knowing which publications are in DOAJ or not. I think the previous refusal was made a little too quick, not understanding the real needs for it. As a librarian, you have my full support for this :) --Hsarrazin (talk) 07:21, 30 March 2018 (UTC)

Thanks @Hsarrazin: and @Succu: for picking this back up, let's see what happens. --John Cummings (talk) 09:20, 30 March 2018 (UTC)

A 404 response is perfectly valid and useful in such cases. I've restored the withdrawn proposal to its withdrawn state. Old proposals should not be reopened like that. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:45, 30 March 2018 (UTC)

it is not "old" (2 weeks), and it was "withdrawn" by you, after 3 votes, 1 support and 2 oppose, because you led people to believe that it was not necessary... but you do not propose any valid and efficient way to store access to DOAJ, and even less to know which publications are in DOAJ.

please re-open it, and stop imposing your pov on this matter : others have also something to say... --Hsarrazin (talk) 09:52, 30 March 2018 (UTC)

No, it was withdrawn by John Cummings; I simply marked it up as such. As to your false "you do not propose..." allegation, please read what I have written above. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:09, 30 March 2018 (UTC)

I read carefully all you said about it here and on property proposal ; and I wrote "any valid and efficient way" (practical if you prefer).

do you call that a "practical" way ? this is for geeks and devs, not for ordinary users... having no direct link, and trying to access through a query, just to see if you get a 404 is not what I call "valid and efficient"... it is maybe a solution for a dev' or somebody who works with bots and scripts - and please do not tell me to use Reasonator... when I work on an item, I do not go on Reasonator to check for link then back to work... this is not practical !

how, with that system, would you give me the possibility to get all items that have a DOAJ link ? in a practical way, not through a program that would have to run through all items with ISSN to check whether there is a link or 404 ? --Hsarrazin (talk) 12:10, 30 March 2018 (UTC)

@Pigsonthewing:, what is the proper way to reopen a property proposal? Am I the only one who can do it? Can you explain the 404 thing? --John Cummings (talk) 18:24, 30 March 2018 (UTC)

Problems with National Library of Ireland authority ID

P1946 (P1946) seems to be very broken; please see Property talk:P1946#Links broken.. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:07, 30 March 2018 (UTC)

Scientific Journal vs Academic Journal

It seems scientific journal (Q5633421) or academic journal (Q737498) is being picked completely arbitrarily for the instance of (P31) of a journal.

Number of scientific journal (Q5633421) = ~43K (query)
Number of academic journal (Q737498) = ~10K (query)
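The queries behind the two "(query)" links are not reproduced in this archive. As a hedged sketch, a count of this kind could be built as below: a hypothetical Python helper that produces a SPARQL string for the Wikidata Query Service, where the `wd:` and `wdt:` prefixes are predefined. It counts only direct instance of (P31) values.

```python
import re


def build_instance_count_query(class_qid: str) -> str:
    """Build a SPARQL query counting items with a direct P31 value of class_qid."""
    if not re.fullmatch(r"Q[1-9]\d*", class_qid):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    return (
        "SELECT (COUNT(?journal) AS ?count) WHERE {\n"
        f"  ?journal wdt:P31 wd:{class_qid} .\n"
        "}"
    )


if __name__ == "__main__":
    print(build_instance_count_query("Q5633421"))  # scientific journal
    print(build_instance_count_query("Q737498"))   # academic journal
```

Running the two generated queries side by side would give figures comparable to the ones quoted above, though the numbers change as items are edited.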

It seems the only determinant is the person who is creating them. For example, @Harej creates them as scientific journal (Q5633421) while @John_Cummings creates them as academic journal (Q737498)! (They both import them from general scholarly journal databases.)

There is an old discussion (@Infovarius, John_Vandenberg, BurritoBazooka:) about the distinction between the two at Talk:Q5633421, but no decision has been made on what to do about it. If you do not think the two should be merged, could you provide guidelines on which journals should be scientific journal (Q5633421) and which ones should be academic journal (Q737498)? Mahdimoqri (talk) 19:48, 28 March 2018 (UTC)

@Mahdimoqri:, thanks for the ping. I chose academic journal (Q737498) for importing the journals from Directory of Open Access Journals because I was unsure whether they could all be considered scientific journal (Q5633421). As far as I'm aware there are no guidelines on helping people choose the correct term (this is a wide problem in Wikidata). I trust people here to make the correct decision and change the items I created and updated (I'm not very competent with QuickStatements yet). --John Cummings (talk) 19:53, 28 March 2018 (UTC)

It seems to me scientific journal (Q5633421) should be a subclass of academic journal (Q737498). Academia is usually divided into faculties, of which science is one. Law or art journals aren't science journals, and it can be debated whether engineering, mathematics and computer science are part of science or not. Ghouston (talk) 21:55, 28 March 2018 (UTC)

They can't be merged, anyway, since enwiki and others have articles for both. Ghouston (talk) 22:02, 28 March 2018 (UTC)

I think we should make scientific journal (Q5633421) a subclass of academic journal (Q737498). scientific journal (Q5633421) should be used for journals in the sciences (including social sciences). We might also want a new subclass for "humanities journal" = "academic journal focused on the arts and humanities". I'd happily help with reclassifying the journals we've got. It would be fairly straightforward if they all had "main subject" properties :-). PKM (talk) 17:43, 29 March 2018 (UTC)

@PKM: putting scientific journal (Q5633421) under academic journal (Q737498) sounds great! Do you know what's the process to propose/request this change? I'd be happy to help with the reclassification as well. Mahdimoqri (talk) 13:44, 30 March 2018 (UTC)

@Mahdimoqri: that change has already been made - [18] - and so

⟨ scientific journal (Q5633421) ⟩ subclass of (P279) ⟨ academic journal (Q737498) ⟩

is in place, and you're good to go for the reclassification task. --Tagishsimon (talk) 14:28, 30 March 2018 (UTC)

@Tagishsimon: wonderful! @PKM: any thoughts on what needs to be done for reclassification?

I have created humanities journal (Q51135530) and linked it to the existing list of humanities journals (Q6623807). Please help add labels and descriptions.

@Mahdimoqri: you might start with the items linked on en:List of humanities journals and make sure they are all in the right class. Once those are done, we can create a list of instances of scientific journal (Q5633421) (from the link on its talk page) and start moving things that don't fit there. If anyone has a list already that could be used to automate any part of this process, let us know! - PKM (talk)

Note that we also have history journal (Q627517) and law review (Q746654) as subclasses of academic journal. - PKM (talk) 19:05, 30 March 2018 (UTC)

How to add an item

Hi. On enwiki, I have just made my 175k'th edit (thanks). Now I want to add an item to Wikidata, and I did not find any link that opened that option. (Even worse: I only got dozens of "research your idea by checking ...", "try elsewhere ..." stuff). What is wrong? - DePiep (talk) 20:37, 29 March 2018 (UTC)

@DePiep: You need 176k edits to join the club? Link to "Create a new item" is on the left-side menu, fourth from the top. here's a link. Come back with follow-up questions ;) --Tagishsimon (talk) 20:42, 29 March 2018 (UTC)

That's the spirit! Wikidata people never admit, and always jab & look down at langwiki editors. Would you care to consider that I made an actual question? - DePiep (talk) 20:56, 29 March 2018 (UTC)

Yes. Would you like to concede that I gave the answer. Most of us, btw, are active both on wikidata and wikipedia, so... --Tagishsimon (talk) 21:07, 29 March 2018 (UTC)

Tagishsimon yes, my follow up: how to add a property? - DePiep (talk) 21:35, 29 March 2018 (UTC)

Your "help" was hidden in sarcasm. As I noted: not the first time I met this here at Wikidata. -DePiep (talk) 21:38, 29 March 2018 (UTC)

You should have a link. What do you see on the left panel of the Wikipedia article in the "languages" portion?--Ymblanter (talk) 21:08, 29 March 2018 (UTC)

The link is there, found it, used it, OK. My point is that this is not the way a new "item" (article) is created in langwikis, so I was missing it understandably. Take a look at Wikidata:Main_Page: I followed "Get involved", "Contribute to Wikidata", and then got nowhere wrt creating an item. - DePiep (talk) 21:35, 29 March 2018 (UTC)

Yes, the getting started guide to Wikidata is somewhat lacking, Wikidata:Tours should provide some info, although there are several tours missing... Something I have on my list to try and get started again. --John Cummings (talk) 21:42, 29 March 2018 (UTC)

Cannot add property Q47574 to Q51102989. -DePiep (talk) 21:54, 29 March 2018 (UTC)

@DePiep:, what are you trying to do exactly? I think this may be helpful Wikidata:List_of_properties. If you have 176K edits on Wikipedia and you can't work it out I think this shows how badly we need better instructions.... --John Cummings (talk) 22:00, 29 March 2018 (UTC)

Yes, John Cummings, that is what I am trying to do/say: Wikidata editing does not look like enwiki editing. IMO, up to WD to improve something. If you read a frustration: good reader. - DePiep (talk) 22:14, 29 March 2018 (UTC)

I'm guessing

⟨ dimensionless unit (Q51102989) ⟩ subclass of (P279) ⟨ unit of measurement (Q47574) ⟩

, in which case on the dimensionless unit (Q51102989) record, select "Add statement", choose P279 as the property, and Q47574 as the value. --Tagishsimon (talk) 22:06, 29 March 2018 (UTC)

Thx, but really: P279 was not suggested. -DePiep (talk) 22:14, 29 March 2018 (UTC)

I'm unsure what you mean. Not suggested when & where? fwiw, it seems to me that if there is a relationship between the two items you mention, then it is that one is a subclass of the other, which is encoded here with subclass of (P279). Clearly, you may have some other idea about the relationship, in which case, you'll probably tell us? --Tagishsimon (talk) 22:18, 29 March 2018 (UTC)

As I said: "P279 was not suggested". - DePiep (talk) 22:41, 29 March 2018 (UTC)

Helpful. Still. --Tagishsimon (talk) 23:00, 29 March 2018 (UTC)

Yep, I found it. Loads of effort, had to find en:VPT = WD:thispage; had to explain the problem thrice, got multiple jabs. But still you claim Wikidata is "working" nicely & well? - DePiep (talk) 22:17, 30 March 2018 (UTC)

I've made no such claim, but find your histrionics overdone & dull. --Tagishsimon (talk) 22:53, 30 March 2018 (UTC)

I've reverted DePiep's edit to unit of measurement (Q47574), which changed its English label from "unit of measurement" to "unit of a physical quantity" and its description from "real scalar quantity, defined and adopted by convention, with which any other quantity of the same kind can be compared to express the ratio of the two quantities as a number (International vocabulary of metrology)" to "(International vocabulary of metrology)". I've also nominated dimensionless unit (Q51102989) for deletion, as "Non-notable, uncited, and unlinked.". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:49, 30 March 2018 (UTC)

Nonsense, Pigsonthewing. A "unit" is not a "physical quantity". -DePiep (talk) 22:05, 30 March 2018 (UTC)

Then demonstrate consensus for your change, which I have, absent such consensus, again reverted. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:23, 30 March 2018 (UTC)

Read the SI. (also, Pigsonthewing, you are behaving nasty). - DePiep (talk) 22:28, 30 March 2018 (UTC)

Lay off the abuse. Demonstrate consensus. Cease edit warring. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:31, 30 March 2018 (UTC)

You don't know about physical quantities, so hold back. Also: glad I met you here, Pigsonthewing! So you don't spend your time on xxwiki. -DePiep (talk) 22:37, 30 March 2018 (UTC)

Very well: Wikidata:Administrators' noticeboard#DePiep & unit of measurement. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:52, 30 March 2018 (UTC)

I wrote: "Nonsense, Pigsonthewing. A "unit" is not a "physical quantity"." Can't you handle a fact? -DePiep (talk) 22:58, 30 March 2018 (UTC)

What processes do you use to import data into Wikidata?

Hi all

I'm working on better documentation for data imports, it would be really helpful if you could outline the steps you take to upload data. I'm not super interested in which tools you use, more the processes you take the data through to get it into Wikidata.

Multiple people giving similar answers is helpful, as it will help me to understand common processes.

Thanks very much

--John Cummings (talk) 18:21, 30 March 2018 (UTC)

I'd distinguish two cases:

Adding data to existing items: For that, I normally use a federated SPARQL query, which links my own data to Wikidata. If some property is already covered in Wikidata, I skip the item (because any checking if one or the other value is correct - or both - can be done only intellectually).
Adding new items: Normally, I try to match the items via Mix-n-Match in the first instance to avoid duplicates. Again, with a federated SPARQL query I can preclude any existing items by external ids which are present in my own data, and get the data I want to insert from my own endpoint.

In both cases, I save the data and transform it with a simple Perl script into QuickStatements2 input format. The script adds provenance references, and sometimes data which is only implicit (e.g. P31 Q5, when my data only includes humans). Then I load it in QuickStatements batch mode, using a bot account. Jneubert (talk) 20:05, 30 March 2018 (UTC)

I've described these processes in more detail, with links to the corresponding queries and scripts, in a paper for the NKOS workshop 2017. Jneubert (talk) 20:12, 30 March 2018 (UTC)
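For readers who want to picture the transformation step described above, here is a minimal sketch in Python (not the Perl script mentioned in the comment, which is not shown here). It assumes the tab-separated QuickStatements command format, in which a reference is written with an `S` prefix in place of `P` (so `S248` is "stated in"). The function name `to_quickstatements` and all item IDs in the sample call are placeholders chosen for illustration.

```python
def make_line(item, prop, value, source_item):
    """Build one tab-separated command with a 'stated in' (S248) reference."""
    return "\t".join([item, prop, value, "S248", source_item])


def to_quickstatements(rows, source_item, implicit_claims=()):
    """Turn (item, property, value) rows into QuickStatements-style lines.

    implicit_claims holds statements that the source data only implies,
    e.g. ("P31", "Q5") when every row is known to describe a human.
    They are emitted once per item, before that item's first row.
    """
    lines = []
    seen_items = set()
    for item, prop, value in rows:
        if item not in seen_items:
            seen_items.add(item)
            for extra_prop, extra_value in implicit_claims:
                lines.append(make_line(item, extra_prop, extra_value, source_item))
        lines.append(make_line(item, prop, value, source_item))
    return lines


if __name__ == "__main__":
    # Placeholder IDs: Q1/Q3/Q9 and P2 do not stand for any particular data.
    sample_rows = [("Q1", "P2", "Q3")]
    for line in to_quickstatements(sample_rows, "Q9", [("P31", "Q5")]):
        print(line)
```

The sketch only handles item-valued statements; string values would need to be wrapped in double quotes before being written out.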

Bad constraint on archive date (P2960)

"archive date (P2960)" includes the constraint "used as qualifier constraint (Q21510863)", which specifies that the property should only be used as qualifier and not for references or values. However, this contradicts P2960's other constraint, "used as reference constraint (Q21528959)", which says it should be used for references and not for qualifiers or values. It is impossible for both of these constraints to be simultaneously satisfied, so at least one of them needs to be removed. This property is intended to be used in references, so "used as qualifier constraint (Q21510863)" is the one that needs to go. Cheers, IagoQnsi (talk) 21:28, 30 March 2018 (UTC)

@IagoQnsi: I moved this here, as it's not an admin issue. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:57, 30 March 2018 (UTC)
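The contradiction reported above is the kind of thing a small script could flag. The following is only a sketch in Python: it assumes that the constraints on a property have already been fetched as a list of constraint item IDs, and the names `EXCLUSIVE_SCOPE` and `conflicting_scopes` are made up for this example. The two IDs are the ones quoted in the report.

```python
# Each "used as ..." constraint, mapped to the single place it permits.
EXCLUSIVE_SCOPE = {
    "Q21510863": "qualifier",  # used as qualifier constraint
    "Q21528959": "reference",  # used as reference constraint
}


def conflicting_scopes(constraint_ids):
    """Return the mutually exclusive scopes demanded, or [] if there is no clash."""
    demanded = {EXCLUSIVE_SCOPE[c] for c in constraint_ids if c in EXCLUSIVE_SCOPE}
    return sorted(demanded) if len(demanded) > 1 else []


if __name__ == "__main__":
    # Constraint list for P2960 as described in the report above.
    print(conflicting_scopes(["Q21510863", "Q21528959"]))
```

A property that carries only one of the two constraints yields an empty list, so the check stays quiet in the normal case.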

Is Apollo 11 (Q43653) instance of (P31) mathematical concept (Q24034552)?

Well, since it is instance of (P31) human spaceflight (Q752783) which is subclass of (P279) mathematical concept (Q24034552) it is:

human spaceflight (Q752783) is subclass of (P279) spaceflight (Q5916)
which is subclass of (P279) flight (Q206021)
which is subclass of (P279) displacement (Q190291)
which is subclass of (P279) physical quantity (Q107715)
which is subclass of (P279) quantity (Q309314)
which is subclass of (P279) mathematical object (Q246672)
which is subclass of (P279) mathematical concept (Q24034552)
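The chain above can be reproduced mechanically. The sketch below, in Python, hard-codes only the statements listed in this post (it does not query Wikidata) and uses a breadth-first search to return the path that makes an item an indirect instance of a class. The function names are invented for this illustration.

```python
from collections import deque

# subclass of (P279) statements exactly as listed in the chain above.
SUBCLASS_OF = {
    "Q752783": ["Q5916"],      # human spaceflight -> spaceflight
    "Q5916": ["Q206021"],      # spaceflight -> flight
    "Q206021": ["Q190291"],    # flight -> displacement
    "Q190291": ["Q107715"],    # displacement -> physical quantity
    "Q107715": ["Q309314"],    # physical quantity -> quantity
    "Q309314": ["Q246672"],    # quantity -> mathematical object
    "Q246672": ["Q24034552"],  # mathematical object -> mathematical concept
}
# instance of (P31) statement for Apollo 11.
INSTANCE_OF = {"Q43653": ["Q752783"]}


def subclass_path(start, target, subclass_of):
    """Breadth-first search for a shortest subclass-of path from start to target."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for parent in subclass_of.get(path[-1], []):
            if parent not in seen:
                seen.add(parent)
                queue.append(path + [parent])
    return None


def why_instance_of(item, target, instance_of, subclass_of):
    """Return the item followed by the class path to target, or None."""
    for cls in instance_of.get(item, []):
        path = subclass_path(cls, target, subclass_of)
        if path is not None:
            return [item] + path
    return None


if __name__ == "__main__":
    print(why_instance_of("Q43653", "Q24034552", INSTANCE_OF, SUBCLASS_OF))
```

Printing the path rather than a plain yes/no makes it easier to see which single link in the chain is the questionable one.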

The example is taken from Talk:Q2672914; however, regardless of that particular case, I am worried about the general pattern of carelessly adding statements like subclass of (P279) even when, from some points of view, they are not 100% correct. It makes Wikidata completely useless for finding all things that are instance of something in cases where it might also be described as being instance of a subclass of it.

Does anyone have an idea how to prevent this? My only idea would be to have some intelligent gadget, which tells you that Apollo 11 becomes an instance of a mathematical concept when you want to add instance of spaceflight.--Debenben (talk) 16:12, 29 March 2018 (UTC)

This happens all over Wikidata. I think we have a general problem of defining physical objects and physical activities as subclasses of abstract concepts. In this specific case displacement (Q190291): vector that is the shortest distance from the initial to the final position of a point P should not be a subclass of "quantity". It's a different "displacement" displacement (Q5636358): ship's weight that's a subclass of "quantity" (and it needs a better description as well). Further, I'd argue that "flight" is not a subclass of "displacement^vector" in any case. - PKM (talk) 17:33, 29 March 2018 (UTC)

@User:Neo-Jay, why do you think spaceflight (Q5916) is a subclass of flight (Q206021)? --Succu (talk) 20:33, 29 March 2018 (UTC)

@Succu: The English description of Q5916 (English label "spaceflight") is "essentially an extreme form of ballistic flight..." (Italic type added), which indicates that "spaceflight" is a subclass of "flight". And the English description of Q206021 (English label "flight") is "process by which an object moves, through an atmosphere or beyond it" (Italic type added), which indicates that flights can include moves in outer space (spaceflights). The English Wikipedia article Flight states that flight "is the process by which an object moves through an atmosphere (or beyond it, as in the case of spaceflight)..." (Italic type added). This explicitly defines spaceflight as a type of flight. So I edited spaceflight (Q5916) to be a subclass of flight (Q206021). Please correct me if I am wrong. I admit that my edit was only based upon English meanings of Q5916 and Q206021. Probably these two items have other meanings in other languages.--Neo-Jay (talk) 01:37, 30 March 2018 (UTC)

Concepts would ideally be independent of language. If we accept ballistic flight as a type of flight, like the flight of a cannonball, then space flight seems like a plausible subclass. But then are orbits of artificial satellites and astronomical objects also a subclass of flight? Ghouston (talk) 04:31, 30 March 2018 (UTC)

"Orbit" (Q4130) is a subclass of "trajectory" (Q193139), seemingly different from "flight" (Q206021). --Neo-Jay (talk) 06:54, 30 March 2018 (UTC)

physical quantity (Q107715) subclass of (P279) quantity (Q309314) is questionable. A quantity (Q309314) is a mathematical object, but physical quantity (Q107715) seems to be something more, not something less. Ghouston (talk) 23:30, 29 March 2018 (UTC)

Changed it to physical quantity (Q107715) subclass of (P279) physical property (Q4373292) and physical quantity (Q107715) has characteristic (P1552) quantity (Q309314). Ghouston (talk) 23:38, 29 March 2018 (UTC)

I don't want to be ungrateful, but the problem is not solved by just fixing this particular instance; I can come up with thousands of examples: Moscow Metro (Q5499) is also instance of (P31) mathematical concept (Q24034552), via rapid transit (Q15099348) → urban rail transit system (Q3491904) → railway network (Q2678338) → transport network (Q924286) → spatial network (Q7574076) → graph (Q141488) → mathematical object (Q246672) → mathematical concept (Q24034552). Isn't there a way to ensure that items cannot be subclasses of themselves, items that are subclasses of physical object (Q223557) cannot be subclasses of abstract entity (Q7184903) etc.? Maybe some

Notified participants of WikiProject property constraints?--Debenben (talk) 16:13, 30 March 2018 (UTC) PS: I just checked: It looks like Apollo 11 (Q43653) is still instance of (P31) mathematical concept (Q24034552), maybe via spaceflight (Q5916) → transport (Q7590) → motion (Q79782) → change (Q1150070) → process (Q3249551) → sequence (Q20937557) or something.--Debenben (talk) 16:43, 30 March 2018 (UTC) fixed--Debenben (talk) 17:41, 30 March 2018 (UTC)
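The two checks asked for here (no class is a subclass of itself; nothing sits under both physical object and abstract entity) can be sketched in a few lines of Python. This is only an illustration over a toy graph: the `toy_*` names are invented, only Q223557 and Q7184903 come from the post above, and no such constraint is claimed to exist in Wikidata.

```python
def ancestors(item, subclass_of):
    """All classes reachable from item by following subclass-of links."""
    found = set()
    stack = list(subclass_of.get(item, []))
    while stack:
        cls = stack.pop()
        if cls not in found:
            found.add(cls)
            stack.extend(subclass_of.get(cls, []))
    return found


def check_class(item, subclass_of, disjoint_pairs):
    """Report self-subclassing and membership in two classes declared disjoint."""
    problems = []
    up = ancestors(item, subclass_of)
    if item in up:
        problems.append(f"{item} is a subclass of itself")
    for a, b in disjoint_pairs:
        if a in up and b in up:
            problems.append(f"{item} falls under both {a} and {b}")
    return problems


# Toy data: toy_* identifiers are hypothetical, not real items.
TOY_SUBCLASS_OF = {
    "toy_bridge": ["Q223557", "toy_network"],  # physical object + a network class
    "toy_network": ["Q7184903"],               # network class -> abstract entity
    "toy_a": ["toy_b"],
    "toy_b": ["toy_a"],                        # deliberate loop
}
DISJOINT = [("Q223557", "Q7184903")]  # physical object vs abstract entity

if __name__ == "__main__":
    for name in ("toy_bridge", "toy_a", "toy_network"):
        print(name, check_class(name, TOY_SUBCLASS_OF, DISJOINT))
```

Whether physical object and abstract entity should really be treated as disjoint is a modelling decision for the community; the sketch simply takes the pair as input.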

WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

There was a discussion here a few weeks back about problems like this with our higher-level ontology. I am always reluctant to make what appear to be common sense edits to the upper ontology because the higher you go, the more abstract the relationships seem to be. I'd like to see the Ontology project coordinate a top-down assessment and recommendation of the changes to be made for things like physical objects. I'd prefer to see a proposal-and-consensus approach to modeling our data. Does anyone with a solid grounding in the theory want to take on this effort? - PKM (talk) 18:52, 30 March 2018 (UTC)

I'm willing to put in some time on this as I think that it is a major problem in Wikidata, and I think that I have the appropriate background. One problem is that there are quite a number of ways to go, and each of them appears to have quite vocal proponents. This makes it very hard to achieve official progress, so things are done on an ad hoc basis. Peter F. Patel-Schneider (talk) 19:12, 30 March 2018 (UTC)

This is an issue that comes up again and again. Wikidata is built more with the local than the global view in mind. This has benefits and drawbacks. The constraint reports are one tool to help find such issues. However we need to do more to make it easier to find and solve such ontology problems. If someone has ideas for what can be done in addition please let me know. --Lydia Pintscher (WMDE) (talk) 13:54, 31 March 2018 (UTC)