Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat


Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Requests for deletions can be made here. Merging instructions can be found here.
IRC channel: #wikidata connect
Wikidata Telegram group
On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/07.

How do we keep bot owners from importing the same bad data over and over

How do we keep bot owners from importing the same bad data over and over?

For example, there are multiple instances on Virgil (Q1398) where the bot BotMultichillT (talk • contribs • logs):

imports a large batch of bad data
which is then removed
then imported again a few months later
removed again
imported again
removed again
imported again
removed again
imported again
removed again
imported again
removed again

How are we ever going to make progress as editors cleaning up bad data if the bot owners keep putting the bad data back in? --EncycloPetey (talk) 14:29, 26 March 2020 (UTC)[reply]

See Help:Deprecation. You keep the entry but mark it as a false claim. The entry can't be reinserted as it already exists but no one pays attention to it because it is recorded as false. From Hill To Shore (talk) 15:20, 26 March 2020 (UTC)[reply]
It's not possible to mark aliases that way. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)[reply]
In the specific example you link above where the bot is inserting several different language labels into the English fields, you need to flag it up to the bot operator. Either the source that the bot is using needs to be put on a black list or the logic of where the bot is inserting the data needs to be improved. From Hill To Shore (talk) 15:28, 26 March 2020 (UTC)[reply]
Several people have brought this and similar edits to the bot owner's attention multiple times. There are at least two active threads about ULAN data (from three editors) on his talk page right now. --EncycloPetey (talk) 15:32, 26 March 2020 (UTC)[reply]
The first step is to ping the bot owner to see if they will engage here. @Multichill:. If the bot owner doesn't respond to multiple requests on their talk page or pings to related discussions then you can flag it to administrators to intervene. What often happens is that we find the bot owner is busy and had either not picked up the earlier messages or misunderstood the implications. Most issues can be resolved once the bot operator engages in the discussion. From Hill To Shore (talk) 16:21, 26 March 2020 (UTC)[reply]
@From Hill To Shore: The bot owner has replied below. He refuses to change his bot and is accusing me of vandalism. He has reverted me [1] and claims all the data is valid as English aliases, contrary to my explanations below, simply because they are in another database. --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)[reply]
  • "Bad" doesn't really say much. Sometimes it happens that bots have code errors or do mis-mappings, and the result should be different or added elsewhere. In that case the bot's code has a design issue and the bot should be blocked.
Here that doesn't seem to be the case. I think we discussed these aliases here before and concluded that it's a good idea to add them. Why do you keep deleting them?
It's a common feature of people born before spellings of names were standardized that there isn't just one that was used to refer to them. Bear in mind that Wikidata is not Wikipedia nor an encyclopedia.
If you think some are problematic, you could add them as statements with the given reference, deprecated rank, and a reason for deprecation. Such cleanup would be most helpful, merely suppressing referenced data is not. --- Jura 15:34, 26 March 2020 (UTC)[reply]
"Publiusz Wergiliusz Maro" is not English; it's Polish.
"Publio Virgilio Maron" is not English; it's French.
"Vergil." is not an alias; it's Vergil with a period added.
You think that "... Virgil" is a valid alias? Why? Why would we need an alias with preceding ellipsis?
In short, I don't think you properly looked at the list of added aliases. I don't see how adding 68 statements about deprecation is a good solution to the problem of alias cruft. --EncycloPetey (talk) 15:41, 26 March 2020 (UTC)[reply]

I have no intention of changing the behavior of the bot because the Union List of Artist Names (ULAN) considers these valid aliases. You should not be removing these aliases. See https://www.getty.edu/vow/ULANFullDisplay?find=&role=&nation=&subjectid=500337098 to see that these aliases are all individually sourced. Removing these aliases borders on vandalism. Multichill (talk) 16:41, 26 March 2020 (UTC)[reply]

Removing bad data is not vandalism; it improves the quality of Wikidata content. Why are you refusing to alter the behavior of your bot? --EncycloPetey (talk) 16:51, 26 March 2020 (UTC)[reply]
In your opinion it's bad data. This opinion is not backed by any sources. In my opinion these are valid and useful aliases and my opinion is backed by various sources. Shouting generally doesn't improve your point. Multichill (talk) 17:03, 26 March 2020 (UTC)[reply]
In your opinion, does "Vernacular" mean "English"? Your bot edits indicate that you think so. The data at ULAN you are adding is marked as Vernacular, but your bot is dumping the entire lot of it into the English aliases field, regardless of what language the data is in. The ULAN data does not indicate its language; it is therefore inappropriate to repeatedly claim that it is English. I have not "shouted" anywhere; I have bolded the key lines of my discussion for the ease of readers who do not wish to slog through all of the discussion. Emphasizing is not shouting. --EncycloPetey (talk) 17:45, 26 March 2020 (UTC)[reply]
You are asserting that the ULAN aliases are perfect. But this is clearly not the case. Once imported, if those aliases are improved, the improvements should persist. It would appear that bot-imported aliases, plus improvements by other Wikidata editors, are superior to the original ULAN aliases, and so the original ULAN aliases should not take precedence. —Scs (talk) 22:49, 26 March 2020 (UTC)[reply]
That's not my field, but in general: the fact that some organisation considers something valid does not mean that something is valid. There are many examples in chemistry databases where there are names that are obviously incorrect for an alias in WD (e.g. broader/narrower/related concepts that should have or already have different item in WD). Wostr (talk) 16:50, 26 March 2020 (UTC)[reply]
@Multichill, call me a vandal if you dare, will you? — Mike Novikoff 14:20, 31 March 2020 (UTC)[reply]

It is indeed a problem; I see a lot of incorrect aliases of chemical compounds that are a result of automatic imports (incl. aliases in different languages matched to English, aliases that are names of broader/narrower/related concepts, aliases with capital letters despite the fact that the same alias is already present without capitals). Sometimes such erroneous aliases are propagated to other languages (e.g. English aliases are copied to Welsh, British English, etc.). However, I don't think there is a universal solution for this — many of these errors are caused by imperfections in imported databases — but I think that every mass import should be discussed before it starts (at least a month earlier) in the relevant WikiProject or in another place. This could reduce the number of errors. Wostr (talk) 16:47, 26 March 2020 (UTC)[reply]

    • I think there was some gene bot that messed up a lot of aliases. I think they were mostly fixed. Also, imports of redirects from Wikipedia as aliases, as done for some languages, are known to be problematic. None of this is relevant to the referenced additions by the bot above. --- Jura 17:07, 26 March 2020 (UTC)[reply]
  • What is the primary purpose of aliases? To me they are simply a way to improve search results. If a variant of a name is likely to be used for search purposes, then it is useful to have that as an alternate label. The more aliases the better, generally. Is there some other use for aliases of which I am unaware, that is damaged by having too many? ArthurPSmith (talk) 17:29, 26 March 2020 (UTC)[reply]
    But should Polish, French, and German aliases appear in the English alias field? Should aliases that simply add a period after the name be added? --EncycloPetey (talk) 17:52, 26 March 2020 (UTC)[reply]
    I'd say that enabling searching is one of two equally-important purposes of aliases. But "enabling searching" does not necessarily require explicitly including every possible misspelling and abbreviation and punctuation variation. —Scs (talk) 12:09, 27 March 2020 (UTC)[reply]
  • I'm guessing the problem here is that the Union List of Artist Names does not tag aliases by language, but of course Wikidata does. So yes, the ULAN aliases are all valid; what's invalid is importing them into Wikidata and tagging them as "en". So we need to figure out a way of augmenting the ULAN aliases with language mappings for proper importing, or else find a way to import them into Wikidata either without a language tag, or with some kind of "unknown" or "unspecified" language tag. Or -- here's another idea -- instead of deleting the "bad" aliases, re-tag them with better languages, and then teach the bot not to re-import an alias if it's present under any language tag. —Scs (talk) 17:49, 26 March 2020 (UTC)[reply]
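The last suggestion (teach the bot not to re-import an alias if it is already present under any language tag) is cheap to check. A hypothetical Python sketch, not the bot's actual code, assuming the item's aliases are available grouped by language code:

```python
def present_under_any_language(alias: str, aliases_by_lang: dict) -> bool:
    """True if the alias already exists on the item under any language
    tag, in which case the bot should not (re-)import it as English."""
    return any(alias in values for values in aliases_by_lang.values())

item_aliases = {
    "en": ["Virgil", "Vergil"],
    "pl": ["Publiusz Wergiliusz Maro"],  # re-tagged by a human editor
}
# The bot would now skip this alias instead of re-adding it as English:
print(present_under_any_language("Publiusz Wergiliusz Maro", item_aliases))  # True
print(present_under_any_language("Maro", item_aliases))                      # False
```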
    That's part of the problem. The ULAN database also has aliases that are the same as other aliases, but with punctuation added, such as the name followed by a period, or the name preceded by ellipses. --EncycloPetey (talk) 17:50, 26 March 2020 (UTC)[reply]
I was about to say, the bot should be filtering those out, too, because in this case they're clearly unnecessary. The tricky part is that there are other cases where punctuation can be more significant. So it's not immediately obvious what a bot's rules should be for when to strip insignificant punctuation, or which punctuation is insignificant.
In any case, it seems we do need a better consensus on bot activity. This is the second time in as many weeks we've had complaints here about bots importing poor-quality data. It seems to me that in such situations bot operators should not just be falling back on the defense of "the bot is fine, and the data is fine, stop complaining". —Scs (talk) 18:00, 26 March 2020 (UTC)[reply]

This issue was raised back in 2018 (part1, part2), where I noted that many of the aliases being added by BotMultichillT from ULAN simply do not conform to our alias policy. The issue of adding alternative names marked as "vernacular" or "undetermined language" as English aliases was called out, as was the fact that many ULAN aliases are simply related concepts. The botop was asked to make the bot respect the work of editors removing these incorrect aliases, but apparently nothing was done. Bovlb (talk) 18:19, 26 March 2020 (UTC)[reply]

Can we block the bot until the problem is fixed? It's added the bad data in yet again since this discussion started. --EncycloPetey (talk) 20:15, 26 March 2020 (UTC)[reply]
  • So, having spent a lot of time in editing items about ancient Greek and Latin literature, I agree with many opinions expressed above, namely by @EncycloPetey, Scs, Bovlb:: it's true that ULAN contains a lot of aliases whose language is not stated; consequently, all these names exist, but it's wrong to add them indiscriminately as English aliases, because most of them aren't used in English sources, but in sources written in other languages. I have myself removed a lot of such aliases in the past years and months (this today). Since the problem has already been discussed (as stated by Bovlb), I think the bot should stop additions - or be blocked - until a consensus is reached; my suggestion is that the bot should at least never add an English alias when it is present in at least one label or alias other than English (of course there are a few exceptions, e.g. Publius Vergilius Maro can be a valid label/alias both in English and in Latin and in other languages, but it's better to manage those cases manually). --Epìdosis 20:34, 26 March 2020 (UTC)[reply]
  • Thanks for those links, @Bovlb:. At the risk of reopening that debate (and at the further risk of seeming to criticize the unsung work of bot operators, which is certainly not my intent), I have to point something out.
It's claimed that these "extra" aliases, the ones that some here are complaining about and trying to trim down, are important for searching. That might be true if we had a really dumb search engine, but we don't: the Mediawiki search engine(s) is/are pretty good. If you were to search for, say, "P. Vergilius Maro", and if Q1398 did not have an alias with that exact spelling, your search would still find Q1398 perfectly easily. (I tested this hypothesis by searching for "Q. Vergilius Maro". Similarly if you search for "Mantoano Virgilio" even though the closest explicitly-listed alias is "Virgilio Mantoano".) So there may be a reason for preserving the full breadth of these "extra" aliases, but enabling better searching isn't it. —Scs (talk) 11:31, 27 March 2020 (UTC)[reply]
  • Easily with Special:Search, maybe, provided that the relevant string property is indexed, but most searching at Wikidata is done with the entity selector. If that hasn't happened in 8 years, it's unlikely to ever happen.
Anyway, it's still not stated why it's a problem to have more than two aliases for an item. What are you trying to do with them? Maybe the use case for not having them should be stated. --- Jura 11:51, 27 March 2020 (UTC)[reply]
  • Some of the points I have seen for retaining this data dump of aliases under the English tag suggest that the field is used solely for searching on Wikidata. This is incorrect. The aliases are reused elsewhere, such as in the Commons Creator template, providing a way for users of many projects to interact with our content. By dumping multiple language alias data against the English entry, we present English users with a nonsensical list of names (many of which can't even be read). Of a more serious and damaging nature though, we are hiding the native labels from non-English readers, because the content is against the wrong filter. From Hill To Shore (talk) 13:00, 27 March 2020 (UTC)[reply]
    • Can you provide us a sample from Commons creator template you see as problematic? --- Jura 13:12, 27 March 2020 (UTC)[reply]
      • I can't give you a specific example of a creator that has been affected by this problem, as the scale of the issue is unclear. If bots and human editors are edit warring over this, there may be no specific cases that have had their data read by Commons. However, as a hypothetical example, see Commons:Creator:Rowland Langmaid. This shows a number of aliases if you are viewing Commons in English but will show different aliases if you are set to a different language. From Hill To Shore (talk) 13:30, 27 March 2020 (UTC)[reply]
  • It's a problem to have supposed aliases that aren't really aliases because someone writing something is liable to feel free to choose one as a stylistic matter, and end up writing something inappropriate. - Jmabel (talk) 15:37, 27 March 2020 (UTC)[reply]

I looked at some cases where people are not happy about the bot importing many aliases (Virgil (Q1398), Homer (Q6691), Jerome (Q44248) & Dante Alighieri (Q1067)). The common denominator is that these people don't seem to be instances of (a subclass of) visual artist (Q3391743). I updated the query to only work on visual artists. Is that a good compromise? Multichill (talk) 18:18, 27 March 2020 (UTC)[reply]

This update will limit the problems, but this is not a generic solution. --NicoScribe (talk) 18:40, 27 March 2020 (UTC)[reply]
Moreover do you plan to remove the incorrect values that have already been imported? --NicoScribe (talk) 15:52, 31 March 2020 (UTC)[reply]

This is not a problem of a bot, but a different understanding of what should or should not be allowed as an alias between one user and one bot operator. Blaming the bot script is not going to solve this. Starting a discussion with the bot owner might either solve the issue or end as agreeing to disagree. Edoderoo (talk) 10:29, 30 March 2020 (UTC)[reply]

What do you mean by "one user"? There's a lot of users complaining for more than four years already. Could it for once end as enough is enough? — Mike Novikoff 15:33, 30 March 2020 (UTC)[reply]
@Edoderoo: I am not really blaming the bots of Multichill. They are doing what their operator wants. Moreover "The contributions of a bot account remain the responsibility of its operator [...] In the case of any damage caused by a bot, the bot operator is asked to stop the bot. [...] The bot operator is responsible for cleaning up any damage caused by the bot" per Wikidata:Bots policy.
What do you mean by "one user"? Look at the old discussions and the current one: this is not "between one user and one bot-operator", this is "between 25 users and one bot-operator". Starting the old discussions with the bot-operator did not solve all the issues. --NicoScribe (talk) 15:39, 31 March 2020 (UTC)[reply]
What do you mean by one user? Counter-question: why is the subject "bot owners", in the plural? When the problem is with only one bot owner, it is presented as a generic problem between bot owners and the smart people. Discussing with such pre-positioning is not offering any solution. Edoderoo (talk) 16:11, 31 March 2020 (UTC)[reply]
@Edoderoo: yes, this discussion's title should include the words one bot owner instead of bot owners. So, what is your solution when 25 users disagree with one bot-operator? --NicoScribe (talk) 16:26, 31 March 2020 (UTC)[reply]
@Edoderoo: what is your solution when 25 users disagree with one bot-operator, please? If your solution is to talk, that is exactly what we are trying to do here. --NicoScribe (talk) 14:37, 4 April 2020 (UTC)[reply]
I guess our next move should be to ask for a block of BotMultichillT at AN. It clearly goes against the consensus, yet the bot owner says "I *am* right, period". Some other admin will put *another* period there, won't he? Personally, I'd suggest to forbid to use ULAN at all for any of Multichill's bots. Four years were more than enough to show his mighty skills to filter out junk from crap, let's strike a balance now. And I sincerely hope that WD is not like ruwiki (which I had left more than year ago) where any user with admin flag will do whatever he wants and nothing will ever stop him. — Mike Novikoff 21:44, 7 April 2020 (UTC)[reply]
  • Multichill already "filter[s] out the non-latin strings", so I do not understand why Multichill would be unable to filter out some keywords, such as "called", "dit", "detto", "genannt", "surnommé", "plus connu sous le nom de", "eigentlich", "known as". --NicoScribe (talk) 15:52, 31 March 2020 (UTC)[reply]
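A marker-word filter of the kind described above could be sketched as follows (a hypothetical illustration, not the bot's actual code; the marker words are the ones listed in the comment above):

```python
import re

# "known as"-style marker words in several languages; a ULAN name string
# containing one of these is a description, not a genuine alias.
MARKER_WORDS = ["called", "dit", "detto", "genannt", "surnommé",
                "plus connu sous le nom de", "eigentlich", "known as"]
MARKER_RE = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in MARKER_WORDS) + r")\b",
    re.IGNORECASE)

def looks_like_description(alias: str) -> bool:
    """True if the string contains a marker word and should be skipped."""
    return bool(MARKER_RE.search(alias))

print(looks_like_description("Vergilius, dit Virgile"))   # True
print(looks_like_description("Publius Vergilius Maro"))   # False
```

The word-boundary anchors keep short markers like "dit" from matching inside unrelated words such as "editor".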
  • Note also the Wikidata v. Renon case. Looks like an utter madness, doesn't it? I can't readily propose an algorithm to filter out such things, and moreover I don't suppose it's my burden. What does it mean in practice? Does it mean that Renon should persist for some another five or ten years? Just imagine all the people... no, alas, just imagine how globally this "Renon" will be replicated all across the Universe... well, across the Internet by then. — Mike Novikoff 18:25, 31 March 2020 (UTC)[reply]
  • I wrote to the Getty (the Vocabulary Program chief editor and 2 IT people): There's a heated discussion about Multichill (the author of the Sum of All Paintings project) importing ULAN names and aliases indiscriminately. The example being discussed is Virgil. Two problems are pointed out:
    • 1. Many aliases are a duplicate of another, with just a trailing dot added. Sometimes the dot indicates an abbreviation, but not always. Even an example with a leading ellipsis was pointed out. In some cases the dots are required to show abbreviation (e.g. "Pub.V.M." and "P.V.M.") but in other cases they are parasitic (e.g. "Wergiliusz" vs "Wergiliusz."). Since the dots are not useful for searching, can the Getty fix some of these problems?
    • 2. There is no language tag, so his dumping of all aliases in Wikidata as EN is incorrect. But I don't see an easy way for the Getty to fix this... --Vladimir Alexiev (talk) 06:21, 13 April 2020 (UTC)[reply]
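Problem 1 could also be mitigated on the import side by normalizing candidate aliases before comparing them with those already present. A hypothetical Python sketch (the heuristic for keeping abbreviation dots is an assumption of this sketch, not Getty's or the bot's logic):

```python
def normalize(alias: str) -> str:
    """Drop leading ellipses/dots and a lone trailing dot; keep the
    trailing dot when the alias contains other dots, which suggests an
    abbreviation like 'P.V.M.'."""
    a = alias.strip().lstrip(".\u2026 ").strip()
    if a.endswith(".") and "." not in a[:-1]:
        a = a[:-1]
    return a

def is_duplicate(candidate: str, existing: list) -> bool:
    """True if the candidate normalizes to an alias already present."""
    return normalize(candidate) in {normalize(e) for e in existing}

print(normalize("Wergiliusz."))    # Wergiliusz  (parasitic dot dropped)
print(normalize("P.V.M."))         # P.V.M.      (abbreviation kept)
print(normalize("... Virgil"))     # Virgil      (leading ellipsis dropped)
print(is_duplicate("Vergil.", ["Vergil", "Virgil"]))   # True
```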
If this set of aliases is generally good (that is, if it's worth bulk-importing at all), but if it contains these anomalies that can't readily be automatically filtered out or properly tagged, one solution would be to let the volunteers here clean them up (as indeed some volunteers have been trying to do). That would work if the bot can be persuaded not to re-import the same data over and over. —Scs (talk) 11:13, 13 April 2020 (UTC)[reply]
If I were running this bot, I would alter its current algorithm:
for each alias a in external database D:
    if a is not present on the relevant entity e in Wikidata:
        add a to entity e

and change it to:

for each alias a in external database D:
    if a is not in auxiliary list X and is not present on the relevant entity e in Wikidata:
        add a to entity e
    add a to auxiliary list X
Scs (talk) 12:17, 14 April 2020 (UTC)[reply]
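In runnable terms, the modified loop could look like this (a hypothetical Python sketch; it assumes the bot can persist a simple set between runs to serve as auxiliary list X, and the names are illustrative):

```python
def import_aliases(external_aliases, entity_aliases, already_offered):
    """One import pass.  `already_offered` plays the role of auxiliary
    list X: every alias the bot has ever offered.  An alias removed by a
    human editor is never re-added, because it stays in X."""
    for alias in external_aliases:
        if alias not in already_offered and alias not in entity_aliases:
            entity_aliases.add(alias)
        already_offered.add(alias)
    return entity_aliases, already_offered

entity, offered = import_aliases({"Vergil."}, set(), set())
print("Vergil." in entity)      # True: the first import adds it
entity.discard("Vergil.")       # a human editor removes the bad alias
entity, offered = import_aliases({"Vergil."}, entity, offered)
print("Vergil." in entity)      # False: the bot does not re-add it
```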
@scs: Such a change would resolve the issue of bots edit-warring with human editors but, as ghuron noted in a different discussion, "I need to create infrastructure that can cache hundreds of millions of individual statements. This task alone is significantly more complex than the rest of the bot code." Bovlb (talk) 16:50, 14 April 2020 (UTC)[reply]
@Bovlb: That's a good point; thanks for reminding me.
The difference here is that (a) I don't get the impression there are hundreds of millions of ULAN aliases being imported, and (b) aliases don't have rank, so the people who would like to see the aliases cleaned up have no choice but to try to convince the bot operator to change its algorithm.
(If this problem persists, I'm thinking that the people who would like to see the aliases cleaned up are going to have no choice but to seriously propose adding a whole new ranking/deprecation mechanism for aliases.)
The other question I would ask is whether the ULAN aliases are so vital a dataset that a 100% synchronization process between them and Wikidata has to be ongoing. If not, it seems like it would be enough to make one pass through the importation process, then call it done and move on to some other database to import. Or (since it's true that new people worthy of inclusion are being born and identified every day) there could be a simple mechanism to limit the continuous import to process only people newly added to ULAN. (That would be simple, without requiring an entire "auxiliary list X", if it happens to be the case that ULAN's IDs, like Wikidata's Q-ids, are monotonic.) —Scs (talk) 11:39, 15 April 2020 (UTC)[reply]

Updating population data for US States

Hi, I've parsed the US Census files and linked them to the Wikidata pages for all US states. I'd like to update their populations to match the Census figures. Can someone give me guidance on how exactly I should do this?

I guess I'd need to give a reference; what should it be? Also, how do I record the "point in time" for the latest population data?

Say, I'd like to update https://www.wikidata.org/wiki/Q99 to have population 39,512,223 which is from the 2019 estimates from US Census, from this file: https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv

What should I add exactly?  – The preceding unsigned comment was added by Hyperknot (talk • contribs) at 21:49, 6 April 2020 (UTC).[reply]
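For reference, the usual Wikidata pattern for this is a population (P1082) statement with a point in time (P585) qualifier plus a reference, e.g. reference URL (P854) and retrieved (P813). A hedged QuickStatements sketch of one such statement (the property IDs are real Wikidata properties; the exact date and quantity syntax should be checked against the QuickStatements help page):

```text
Q99|P1082|39512223|P585|+2019-00-00T00:00:00Z/9|S854|"https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv"|S813|+2020-04-06T00:00:00Z/11
```

Here /9 marks year precision and /11 day precision, and the S-prefixed properties attach to the reference rather than acting as qualifiers.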

Major issue

A major database corruption hit Wikidata, also affecting connected sister projects such as Wikipedia, about 11 hours ago (at 23:00:00, 6 April, UTC). A fix is in place, but it may take a while for things to get back to normal. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:57, 7 April 2020 (UTC)[reply]

Update: logged out users might see wrong data for the next 24 hours. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:32, 7 April 2020 (UTC)[reply]
Less "wrong data" and more "poorly rendered client pages" with elements like the wikidata sidebar link, interwiki links and data included in the articles missing. ·addshore· talk to me! 18:30, 7 April 2020 (UTC)[reply]

Watanabe no Tsuna (Q6582069)

I don't know if this is just happening to me or if something else is going on: the enwiki, zhwiki, and jawiki for this subject are not showing interwiki links, while frwiki and fawiki are showing the interwikis fine. I don't see any templates being used in the affected wikis that are suppressing the interwiki links. Sorry if I asked this in the wrong place. Underbar dk (talk) 02:22, 7 April 2020 (UTC)[reply]

@Underbar dk: See phab:T249565. Causing lots of issues, including true duplicates. S.A. Julio (talk) 02:38, 7 April 2020 (UTC)[reply]
Missing interwiki links on client pages will indeed be a symptom ·addshore· talk to me! 10:25, 7 April 2020 (UTC)[reply]

Why isn't "Wikidata item" under Tools column in English wp?

"Wikidata item" isn't listing under Tools column in English Wikipedia for en:One America News Network. How can this be changed for Q17107494, as it is causing an error for the {{official website}} there? X1\ (talk) 06:06, 7 April 2020 (UTC)[reply]

Hello,
An issue that occurred yesterday has been causing various problems with the connections between Wikidata and the other projects. We are working on fixing it as soon as possible. Lea Lacroix (WMDE) (talk) 07:21, 7 April 2020 (UTC)[reply]
There is also a problem that it is now possible to create duplicate items (with the same sitelinks). JAn Dudík (talk) 07:49, 7 April 2020 (UTC)[reply]
Yes, this is a known problem but should stop in the next 24 hours after the table is fully restored. Let me know if the issue happens again after 24 hours. Lea Lacroix (WMDE) (talk) 10:28, 7 April 2020 (UTC)[reply]
Thank you. X1\ (talk) 09:59, 7 April 2020 (UTC)[reply]

Could not save due to an error.

I can't change anything on Q83873593 --Viruscorona2020 (talk) 06:21, 7 April 2020 (UTC)[reply]

It looks like some users have managed to edit that page since you posted this message. https://www.wikidata.org/w/index.php?title=Q83873593&action=history Are you still having problems? Do you have any error message? ·addshore· talk to me! 10:27, 7 April 2020 (UTC)[reply]

These template pages appeared on this item. However, the interlanguage links do not display on any of the wiki sites. Taiwania Justo (talk) 09:59, 7 April 2020 (UTC)[reply]

This is probably phab:T249565 ·addshore· talk to me! 10:24, 7 April 2020 (UTC)[reply]
I edited the item and purged some of the pages; all now link to the Wikidata item. Some links don't display (for example to a different project in a different language) but that is just how it is set up. Peter James (talk) 10:37, 7 April 2020 (UTC)[reply]
Template:PD-user (Q6188601) has the same problem. I cannot work out which related items are affected by this bug. Taiwania Justo (talk) 11:12, 7 April 2020 (UTC)[reply]
I checked some of the pages and the links were there; some may need purging. Peter James (talk) 11:28, 7 April 2020 (UTC)[reply]

Update about the database breakage

Hello all,

A few hours ago, an important issue with the Wikidata database caused various problems: Wikipedias were down for about 20 minutes, and some error messages appeared on Wikidata and other projects. Some of the side effects were disappeared interwiki links, broken tools and gadgets, creation of duplicates on Wikidata, as well as other issues related to the connections between Wikidata and the sister projects.

This breakage was caused by a database table that was temporarily dropped due to a failing script that got activated after the renaming of wb_terms. You can read more technical details in the Phabricator ticket and the incident report.

A patch has been submitted quickly after the breakage and we can confirm that no data prior to the breakage has been lost: as we’re sending this update, all of the data that was stored before and after the issue (April 6th at 23:00 UTC) has been successfully restored.

However, some deletions and redirects that happened during the past 15 hours may currently still have site links attached to them in this table. We will need to remove them as soon as possible.

You may also experience caching issues on sister projects for the next 24 hours if you’re logged out: most of the time, purging the page or performing a null edit on a page will solve it.

If you encounter any further issues, please let us know. Many thanks to the engineers at WMDE and WMF who contributed to address the problem quickly. Lea Lacroix (WMDE) (talk) 18:09, 7 April 2020 (UTC)[reply]

Mass creation of duplicate items

I can see people actively creating duplicate items. Probably because they come to Wikipedia and see that their favorite articles don't have any interlanguage links. I've merged/redirected some ([2], [3], [4]), but Q89666264 just won't merge with Q61686297 for some reason. --Moscow Connection (talk) 11:54, 7 April 2020 (UTC)[reply]

Looks like this is fixed now :) ·addshore· talk to me! 18:29, 7 April 2020 (UTC)[reply]

duplicates impossibile to merge...

probably a consequence of the recent bug...

Q89624137 is a dupe of Q48002038, with the same enwp link... obviously created when adding the tewp link...

contrary to other dupes of the same kind that I successfully merged today, I cannot merge these two; even after removing the en link from Q48002038, there is still an "Error during 'Merge with: Q48002038': Validation failed: SiteLink conflict" message.

Can someone investigate, and merge them, please ? --Hsarrazin (talk) 10:48, 8 April 2020 (UTC)[reply]

edit: OK, there seems to be a third item, Q25576476, which conflicts on the tewp link, but the statements seem to conflict: actor and sports coach? I just merged the three of them, and then unmerged because of this... could someone who understands Telugu please check it? I'm afraid of making a big mistake.
Trouble here: Category:Molluscs of Oceania (Q7849769) with true duplicates Category:Molluscs of Oceania (Q73191082), Category:Molluscs of Oceania (Q86161706) and Category:Molluscs of Oceania (Q86238123). I tried to merge several ways, but remained unsuccessful. Lymantria (talk) 11:50, 8 April 2020 (UTC)[reply]
I merged them; I had to remove the sitelinks from the last two items, and from Category:Molluscs of Oceania (Q86127443), which was also a duplicate. Peter James (talk) 12:48, 8 April 2020 (UTC)[reply]
Thanks, so I still overlooked one. Lymantria (talk) 06:40, 10 April 2020 (UTC)[reply]
Lakshmi Kanakala (Q48002038) is described as an actress, and is also in the category Category:Indian acting coaches (Q86321678); coach (Q41583) was probably assumed to be the correct meaning of "coach". There is now a separate item acting coach (Q28135085) but it's still a subclass of Q41583. Peter James (talk) 13:00, 8 April 2020 (UTC)[reply]

Cannot edit Q664609

Today I edited sitelinks and labels on several items, but when I try the same on Caribbean (Q664609), every time I get: "An error has occured. The save has failed." What's wrong? --bdijkstra (overleg) 11:53, 8 April 2020 (UTC)[reply]

A Commons sitelink had been added, but it was already on Category:Caribbean (Q6140308); I removed it from Q664609, so it should now be possible to edit. Peter James (talk) 12:34, 8 April 2020 (UTC)[reply]
Indeed it is editable now, thanks. How does one figure out that a duplicate sitelink is the cause and how does one find the other sitelink? --bdijkstra (overleg) 14:34, 8 April 2020 (UTC)[reply]
It's usually the cause, but I don't know if there's an easy way of finding them; I searched for the English title, as that is the most likely to be a duplicate, and didn't find a new item that looked likely, then as the Commons link was to a category, I checked the category item and saw that the link was on both. After that I would have asked the same question. Peter James (talk) 16:57, 8 April 2020 (UTC)[reply]
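For future reference, the lookup described above can also be done programmatically: the wbgetentities API accepts a site and a page title and returns the item currently holding that sitelink, which reveals the conflicting item. A minimal sketch that only builds the request URL (the helper name is mine):

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def sitelink_lookup_url(site, title):
    """Build a wbgetentities request URL that resolves a sitelink
    (site + page title) to the item currently holding it."""
    params = {
        "action": "wbgetentities",
        "sites": site,
        "titles": title,
        "props": "sitelinks",
        "format": "json",
    }
    return API + "?" + urlencode(params)

# Fetching this URL returns the Q-id of whichever item holds the
# commonswiki sitelink for this category page.
print(sitelink_lookup_url("commonswiki", "Category:Caribbean"))
```

Fetching the resulting URL (e.g. in a browser) shows the entity JSON, including the item id that owns the duplicate sitelink.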

"The save has failed"?!

@Peter James: Before February 2020, when adding a sitelink failed, the warning box would tell me that the sitelink was already used by another item and suggest whether a move and/or merge was possible. Now this warning box only says "has failed". What happened to the former functions? --Liuxinyu970226 (talk) 01:36, 11 April 2020 (UTC)[reply]

I don't know about that, it still says where the sitelink is used with Special:MergeItems and when merging in QuickStatements. Peter James (talk) 08:14, 11 April 2020 (UTC)[reply]

I moved the Wikipedia page en:Hôtel de Conti to en:Hôtel de Brienne. Apparently the link at Wikidata Q3145768 failed to update. I tried to make the change manually on Wikidata, but got an error message: "Could not save due to an error. The save has failed." An attempt to edit the label of Q3145768 also gave this error. Perhaps this is related to a move I requested of en:Hôtel de Conti (disambiguation) to en:Hôtel de Conti, which deleted the original en:Hôtel de Conti. I did not notice the problem at Wikidata until after that second move was made. --Robert.Allen (talk) 22:08, 8 April 2020 (UTC)[reply]

Part of the database had to be restored from a backup, which had some of the sitelinks attached to both this item and a redirected item Q16844978. Editing the redirect seems to have removed the duplicate links so the item can now be edited. Peter James (talk) 00:01, 9 April 2020 (UTC)[reply]
This partly relates to phab:T249613 I expect ·addshore· talk to me! 13:16, 9 April 2020 (UTC)[reply]

(true) duplicates created after database corruption

After the database corruption on 7th of April 2020, a lot of duplicate items have been created, including true duplicates, for example:

(e.g. [5])

Will all true duplicates be merged automatically? Thanks a lot! --M2k~dewiki (talk) 21:52, 9 April 2020 (UTC)[reply]

Not so. It would be interesting if someone created a list of true duplicates at Wikidata:True duplicates. Lymantria (talk) 06:45, 10 April 2020 (UTC)[reply]

Cannot edit Q89654066

@Addshore: Weirdly, I cannot edit Q89654066 and the edit history is empty, but the item already has two sitelinks. Were some edits lost from the page history? Vojtěch Dostál (talk) 17:19, 10 April 2020 (UTC)[reply]

@Vojtěch Dostál: Merged to Filip Forejtek (Q48614810), the enwp sitelink was duplicated. Thanks. Mike Peel (talk) 18:22, 10 April 2020 (UTC)[reply]

ICMBio ID (external - ID)

Hi guys, I'm having a bit of difficulty structuring the proposal of this new external ID. The problem is that the URL does not always follow the same path: [7] or [8]. But this is the most important ID when we are talking about national reserves in Brazil.

So how can I propose this in a proper manner? Rodrigo Tetsuo Argenton (talk) 12:17, 7 April 2020 (UTC)[reply]

@Rodrigo Tetsuo Argenton: I don't speak Portuguese, so I'm not 100% sure here. Some testing reveals that every page has a unique numeric ID in the URL (1869 and 7922) in your examples. Furthermore, if you input the ID into this URL: https://www.icmbio.gov.br/portal/unidadesdeconservacao/biomas-brasileiros/_/_/ID-, it'll always bring up the content for that ID (even though it's not the canonical URL). The ID could also just be everything after the portal/ in the URL, mosaicosecorredoresecologicos/moscaicos-reconhecidos-oficialmente/1869-mosaico-bocaina and unidadesdeconservacao/biomas-brasileiros/mata-atlantica/unidades-de-conservacao-mata-atlantica/7922-rppn-cabure in your examples. --SixTwoEight (talk) 13:52, 7 April 2020 (UTC)[reply]
@SixTwoEight: thank you for the attention. So you are saying that the structure needs a text entry, not only a numeric ID?
@Ederporto: can you help me here? Maybe Portuguese is needed to create this ID?
Thank you. Rodrigo Tetsuo Argenton (talk) 14:09, 9 April 2020 (UTC)[reply]
@Rodrigo Tetsuo Argenton, SixTwoEight: Hello, will the identifier to be proposed cover only the conservation units, or all of the ICMBIO content? I think short, numeric identifiers are always better, so SixTwoEight's first proposal is great. On the other hand, ICMBIO also has pages for the Brazilian fauna species and for Brazilian research centers on those topics; it would be nice to have them as well. The non-canonical URL works with them too, so I propose three different ICMBIO identifiers:
  • ICMBIO conservation units ID: https://www.icmbio.gov.br/portal/unidadesdeconservacao/biomas-brasileiros/_/_/$1-, with $1 being a number;
  • ICMBIO Brazilian fauna ID: https://www.icmbio.gov.br/portal/faunabrasileira/_/$1- for the Brazilian species, with $1 being a number and;
  • ICMBIO research centers ID: https://www.icmbio.gov.br/portal/centrosdepesquisa/$1 with $1 being a slugified name.
Ederporto (talk) 00:38, 10 April 2020 (UTC)[reply]
Thank you @Ederporto:, I will propose in some time soon. Rodrigo Tetsuo Argenton (talk) 22:24, 14 April 2020 (UTC)[reply]
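SixTwoEight's observation above, that every page carries a unique leading number in its URL slug, can be captured with a small helper (a sketch; the function names are mine and the URL pattern is the non-canonical one noted in this thread):

```python
import re

def icmbio_numeric_id(slug):
    """Extract the leading numeric ID from an ICMBio URL slug,
    e.g. '1869-mosaico-bocaina' -> '1869'."""
    m = re.match(r"(\d+)-", slug)
    return m.group(1) if m else None

def icmbio_url(numeric_id):
    # Non-canonical but working URL pattern noted by SixTwoEight:
    # the path segments before the trailing ID can be anything.
    return ("https://www.icmbio.gov.br/portal/unidadesdeconservacao/"
            f"biomas-brasileiros/_/_/{numeric_id}-")

print(icmbio_numeric_id("7922-rppn-cabure"))  # prints 7922
print(icmbio_url("1869"))
```

This is how a formatter URL with a purely numeric `$1` would behave, matching the first of the three proposed identifiers.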

BNF to Wikidata, RIP

It appears we recently lost one of the most valuable tools for easily importing biographical database entries into Wikidata without needing a degree in computer science. The immensely useful BNF to Wikidata tool appears to be recently defunct and abandoned: see https://www.lehir.net/dicare-tools/. Can we get something similar rebuilt? I will reiterate my previous assertion that Wikidata needs more ways to quickly and easily pull from existing databases. Even if a tool simply imported name, instance of (human), and a single identifier, that would save several time-consuming steps. If concerns about database copyright are a problem, perhaps tackle public domain sources like the Library of Congress Name Authority File. I'm aware of, and sympathetic to, concerns about mass imports of thousands of items all at once, but any efforts to more easily start a single item would make a world of difference. -Animalparty (talk) 18:33, 7 April 2020 (UTC)[reply]

I have created phab:T249697, though this is a very general task that does not include anything specific to do.--GZWDer (talk) 12:44, 8 April 2020 (UTC)[reply]
That's quite the accusation made by user:Envlh on https://www.lehir.net/dicare-tools/
At least the source code was committed to git (https://github.com/envlh/ ) so you can deploy it somewhere else. So the RIP is very premature. Multichill (talk) 16:18, 8 April 2020 (UTC)[reply]
GZWDer Thanks for starting that phab ticket. One really nice thing about BNF to Wikidata was that it allowed importing/adding sourced statements to already existing items, as well as expediting the creation and expansion of new items. -Animalparty (talk) 22:29, 8 April 2020 (UTC)[reply]
-Animalparty, GZWDer, Multichill, Hi, could you add to the Phabricator proposal the latest addition by Envlh: a tool that converts from IdRef to Wikidata? (It basically works the same way as the BNF one.) The code is also on git --René La contemporaine (talk) 08:09, 14 April 2020 (UTC)[reply]
@René La contemporaine: no, you should do that yourself. You can login to Phabricator using your Wikimedia account. Just go to https://phabricator.wikimedia.org/auth/start/?next=%2FT249697 and use the "Wikimedia" option. Multichill (talk) 16:14, 14 April 2020 (UTC)[reply]
Multichill Well... I tried my best but this is really far away from the set of skills I have. Please check if what I said makes some sense. :-(--René La contemporaine (talk) 18:32, 14 April 2020 (UTC)[reply]

Those are the building and address items. Their addresses and building numbers follow the official government prescription at http://www.juso.go.kr. Are they against the notability policy, or do they meet the criterion of a 'clearly identifiable conceptual or material entity'? Thank you. Choikwangmo9 (talk) 01:46, 8 April 2020 (UTC)[reply]

thanks! Choikwangmo9 (talk) 04:35, 8 April 2020 (UTC)[reply]
  • They are different entities, since one is a building and the other is a service operating inside the building. The notability is questionable, especially for the building: how many similar buildings would there be even just in South Korea? Ghouston (talk) 03:45, 8 April 2020 (UTC)[reply]
Approximately 7,243,472, but they are all 'clearly identifiable' physical and material entities. Choikwangmo9 (talk) 04:35, 8 April 2020 (UTC)[reply]
Yes, but notice the 2nd part: "The entity must be notable, in the sense that it can be described using serious and publicly available references." For the same reason that Wikidata can't include all humans without exploding, even if they happen to be listed in serious references like telephone books, it is also unlikely that it would be able to cope with all buildings, all companies, or other large datasets like all particle collision events recorded at CERN or all observed stars and galaxies in the Universe. Some limitation to only the most notable is necessary. Ghouston (talk) 07:18, 8 April 2020 (UTC)[reply]
I don't disagree, but I can't help noting that the current mass import (now that we're done with scholarly articles (Q13442814) and taxa (Q16521)) seems to be every star and galaxy listed in SIMBAD (Q654724)... —Scs (talk) 12:28, 8 April 2020 (UTC)[reply]
  • I don't see anything wrong with entering a single address; the problem would be importing every address, which would be impossible to manage at this time. We have to prioritize our time and recognize our computational limits. Not adding every address has more to do with priority, even if they can be defined by reliable sources. --RAN (talk) 03:51, 9 April 2020 (UTC)[reply]

Interwiki extra

(Apologies if this has been discussed before. I could not find it in the archives.) Template:Interwiki extra (Q21286810) is used on several Wikipedias to "Get extra interwiki links from a specified Wikidata object, to complement existing links for the including article." For example, if a person is known primarily for one incident, we often find that some projects create an article on the person, whereas others have an article on the event, and some may have both. Similarly with duos like Bonnie & Clyde. This template allows a reader to find relevant articles in additional languages. Unfortunately, to use this template, it must be added separately to each article in each project, and it does not help with projects that have no such template, which seems to go against the Wikidata way of hosting such information centrally. Is there a property we can use to represent this, which could be exploited by all client projects? Should there be one? Bovlb (talk) 18:08, 8 April 2020 (UTC)[reply]

I don't understand the middle of your paragraph there; "some projects create an article on the person, whereas others have an article on the person, and some may have both." Have you used a wrong word? Both options seem the same. From Hill To Shore (talk) 18:47, 8 April 2020 (UTC)[reply]
Fixed, thanks. Bovlb (talk) 19:21, 8 April 2020 (UTC)[reply]

To phrase this problem in a different way: The central maintenance of interwiki links was the original motivating use case for Wikidata, and this is an area where we are failing to meet that goal because of our finer ontological distinctions. For example, the Spanish Wikipedia has an article on Kitty Genovese (Q238128), but the English Wikipedia has an article on murder of Kitty Genovese (Q18341392) instead. An ESWP reader does not see "English" on the "In other languages" list, even though ENWP has a highly relevant article on the subject. The corresponding ENWP reader does see a link for "Spanish", but only because an ENWP editor has manually added {{interwiki extra|qid=Q238128}} to the bottom of the page. My proposal is that we devise a way to centralize this information so that all projects can enjoy the benefits. Bovlb (talk) 21:45, 14 April 2020 (UTC)[reply]

Question for working with QS

I tried to use QS on a bit larger scale and did pretty much the same as I did in my smaller tests. Yet it created only six empty items that I have to fill now. Does anybody know QS well enough to have a look at the code and tell me if I did something wrong? I would also like to know if something like this works:

CREATE
LAST|P361|Q666587 (property "part of" (P361) with the Q-value of a cemetery)
Q666587|P527|LAST (property "has part" (P527) with the Q-value of the new item) <- the interesting line for the question

Thanks for your help --D-Kuru (talk) 20:40, 8 April 2020 (UTC)[reply]
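If LAST turns out not to resolve outside the statement immediately following CREATE, one workaround is a two-pass approach: run the CREATE first, note the new item's Q-id, then emit the remaining commands with the concrete id substituted in. A sketch of the substitution step only (pure string templating; the function name and the example Q-id are mine):

```python
def fill_last(commands, new_qid):
    """Replace the LAST placeholder in QuickStatements commands with
    the Q-id of an item created in an earlier pass. Assumes 'LAST'
    does not occur inside labels or other string values."""
    return [line.replace("LAST", new_qid) for line in commands]

template = [
    "LAST|P361|Q666587",   # part of: the cemetery
    "Q666587|P527|LAST",   # has part: the new item
]
# Q99999999 stands in for whatever id the CREATE pass produced.
print(fill_last(template, "Q99999999"))
```

The resulting commands no longer depend on LAST at all, so they can be pasted into a second QS batch safely.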

Hello @D-Kuru: did you use a tab character for separation (instead of the pipe symbol |)? ([...] Hint: You can also use a spreadsheet software such as Excel; Copying/pasting the cells should automatically insert TABs. [...] ) Using OpenOffice and copy/paste works fine for me to create the tab separation; some text editors might convert the tab character to spaces. --M2k~dewiki (talk) 22:11, 8 April 2020 (UTC)[reply]
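If the spreadsheet detour is inconvenient, pipe-separated commands can also be converted to TAB-separated form mechanically, for example with a few lines of Python (a sketch, assuming no pipe characters occur inside values):

```python
def pipes_to_tabs(commands):
    """Convert QuickStatements commands from '|' separators to TABs."""
    return [line.replace("|", "\t") for line in commands]

converted = pipes_to_tabs(["CREATE", "LAST|P361|Q666587"])
print(converted)
```

The output can then be pasted into QS without a text editor silently turning tabs into spaces.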
I think the commands with LAST anywhere but at the beginning still don't work. (The behavior is said to be different when you run batches synchronously and asynchronously, so not really sure about both.) --Matěj Suchánek (talk) 07:28, 9 April 2020 (UTC)[reply]
@M2k~dewiki: I have a template with things that could be added to a new item here. I copy it and change the values to whatever fits the item, or remove them when I cannot find the information. The code was entered exactly as it is shown in the expandable section. I prewrote the code in Notepad++ using the pipe character (as the help page said that it can be used for QSv2). The problem in the end was how QS interpreted the code.
@Matěj Suchánek: Well, then I have to test this. It sucks that this can only be tested on Item creation though. I read that there is a special test page if you want to test stuff, but I don't think that there is a whole Wikidata sandbox where you can create temporary items that never show up in the database and will be deleted once the test is over.
Now to the interesting thing: I copied the code to QS and tested it again. It told me that it would create new items and listed their statements. I did not test whether it would do the same thing again, but the list of items looked completely different. So I guess that it would work.
What I think happened is something I already noticed on Commons. I opened the QS window and edited the code. Since the code is a bit larger and I tried to be as careful as possible, it took about 45 minutes. I think the window had some kind of timeout. What happens after that timeout is that the page is processed without any of the details you entered. On Commons, an image I wanted to upload reset the upload page. Here it might be that the page does not process any statement data.
I have no idea if this is because of Firefox, Windows or Wikimedia. It would be interesting to ask whether Wikimedia pages are designed to have some kind of timeout, but whom would I ask, and where?
--D-Kuru (talk) 20:53, 9 April 2020 (UTC)[reply]
We do have testwikidata:. I think the kind of timeout could be expiration of the edit token used for security or session data. But I'm not sure (networking isn't my field). The best explanation can be given by some Wikimedia engineers who are best reachable via IRC (don't know which channel). --Matěj Suchánek (talk) 08:04, 10 April 2020 (UTC)[reply]
I have to take a closer look at Test Wikidata since it seems to be pretty much what I'm searching for. The rest also needs a deeper look, I guess --D-Kuru (talk) 13:48, 11 April 2020 (UTC)[reply]
@Matěj Suchánek: I took a closer look at it. Well, "The Test Wikidata is for developers to test their code without breaking all of Wikimedia". I also would have to have admin rights to edit it. So unfortunately I cannot use Test Wikidata to test around with creating items. --D-Kuru (talk) 10:56, 16 April 2020 (UTC)[reply]
I don't think so. I can see even anonymous users editing it and QS does offer Wikidata-test. But I cannot help you more now, sorry. --Matěj Suchánek (talk) 11:42, 16 April 2020 (UTC)[reply]

date of baptism -- can we resolve this?

Once again, the discussion of the item-requires-statement constraint (Q21503247) on date of baptism (P1636) has disappeared from this page with no action taken. Now, religion or worldview (P140) is already one of the property's allowed qualifiers. I don't think many people object to stating the religion of the baptism as a qualifier, right? Certainly I don't. That being so, can I get agreement to remove the constraint that additionally requires adding religion or worldview (P140) as a statement of its own? It was this latter constraint that caused dissension and confusion in the past discussions. Levana Taylor (talk) 01:03, 9 April 2020 (UTC)[reply]

Thanks. It wasn't protected, I just didn't want to single-handedly take action, especially since there hasn't been a comment from the person who added it in the first place. Levana Taylor (talk) 04:29, 9 April 2020 (UTC)[reply]
I think it was only added in February [9]. I don't see any prior edit war over it. Ghouston (talk) 04:33, 9 April 2020 (UTC)[reply]

How to resume a property proposal being on hold

When EntitySchema was announced, a property was proposed to link from Wikidata items to schemas, but at the time it was deemed too early to discuss this. Now we know a bit more about EntitySchema, and we are developing a variety of schemas in the ongoing biohackathon; it would be convenient to use such a property to keep track of all the schemas being developed in this context. How can we resume the discussion/acceptance process? --Andrawaag (talk) 09:54, 9 April 2020 (UTC)[reply]

@Andrawaag: Feel free to describe your use case as a comment on this Phabricator task; it may help move things forward. Lea Lacroix (WMDE) (talk) 12:02, 9 April 2020 (UTC)[reply]
@Lea Lacroix (WMDE): I added the biohackathon use case to the above ticket. John Samuel (talk) 08:11, 10 April 2020 (UTC)[reply]

Reordering different values datewise

Consider the item 2020 coronavirus pandemic in West Bengal. It has a property named number of cases. I would like to know whether the values can be displayed according to the qualifier 'point in time'. Currently, they are ordered in the way the user entered them. Adithyak1997 (talk) 10:17, 9 April 2020 (UTC)[reply]

I'm not sure what the limitations on gadgets are, but feel like this is something someone could create one for (order the UI based on series ordinal/point in time/start date qualifiers). Worth noting that this is purely an aspect of the Wikidata UI - I only mention this because I've seen people talk about "order" before and removing/re-adding statements to get the right "order" (which is silly, because the data is only truly ordered by addition of qualifiers). --SilentSpike (talk) 12:21, 9 April 2020 (UTC)[reply]
Also worth noting, you may want to set preferred rank on the most recent values, this way queries will return those. Can be qualified with reason for preferred rank (P7452) most recent value (Q71533355). --SilentSpike (talk) 12:24, 9 April 2020 (UTC)[reply]
I've found one method that works, through the API: read the JSON version of the item, edit the JSON to reverse the statements (without changing IDs, etc.), then rewrite the item with wbeditentity. It's a bit odd, since the edit has an empty diff. Example, [10], which orders the authors. Ghouston (talk) 13:39, 9 April 2020 (UTC)[reply]
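The JSON round-trip described above boils down to reordering one property's claims array before writing the item back. The sorting step could be sketched like this (operating on item JSON in the shape the API returns; qualifier handling is simplified, the function name is mine, and the sample data is fabricated):

```python
def sort_claims_by_point_in_time(item_json, prop):
    """Sort the statements of one property in place by their
    point in time (P585) qualifier, earliest first. Statements
    without the qualifier sort to the front."""
    def key(claim):
        quals = claim.get("qualifiers", {}).get("P585", [])
        if quals:
            # Wikibase time values look like '+2020-04-09T00:00:00Z',
            # so lexicographic order matches date order for CE dates.
            return quals[0]["datavalue"]["value"]["time"]
        return ""
    item_json["claims"].get(prop, []).sort(key=key)
    return item_json

# Two out-of-order "number of cases" (P1603) statements:
item = {"claims": {"P1603": [
    {"qualifiers": {"P585": [{"datavalue": {"value": {"time": "+2020-04-09T00:00:00Z"}}}]}},
    {"qualifiers": {"P585": [{"datavalue": {"value": {"time": "+2020-04-01T00:00:00Z"}}}]}},
]}}
sort_claims_by_point_in_time(item, "P1603")
print(item["claims"]["P1603"][0]["qualifiers"]["P585"][0]["datavalue"]["value"]["time"])
```

The reordered JSON would then be posted back with wbeditentity, exactly as in the empty-diff trick above.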
@Ghouston: My point was really that there's no value in doing so as far as the data itself is concerned and really this is something that should be solved by the UI. --SilentSpike (talk) 17:26, 10 April 2020 (UTC)[reply]
Probably, but maybe you'd need a way to give a hint about which qualifier to sort by. It could be various types of date or a sequence number depending on the statement. Ghouston (talk) 23:26, 10 April 2020 (UTC)[reply]
It could be done as a property : qualifier mapping in the script. Ghouston (talk) 23:51, 10 April 2020 (UTC)[reply]
I had a go at creating a script for this, User:SilentSpike/sortQualifiers.js; unfortunately it doesn't consistently work (which is generous; it's failing more often than not). I think it's because the script is running before the HTML of all the statements is filled in. However, even if it were working consistently, I've realised there is an unavoidable delay because first the UI loads and then it gets re-sorted (which is pretty bad UX). So perhaps this isn't something best solved by a gadget/userscript; maybe it's worth opening a Phabricator ticket to suggest ordering statements by certain qualifier values (similar to how statements themselves have a defined order). --SilentSpike (talk) 17:26, 10 April 2020 (UTC)[reply]
If it was going to be done, I suppose it would best be done in bulk, e.g., for every item with an author statement. But I'm not sure that it's desirable because of the extra load on the server. Although, done this way, you'd only be sorting the statements once, not every time the UI displays the page. Ghouston (talk) 23:32, 10 April 2020 (UTC)[reply]
Maybe it would be difficult to automatically work out which items would need sorting. I don't know if you could write a SPARQL query that would find out-of-order statements. Ghouston (talk) 23:37, 10 April 2020 (UTC)[reply]
Order of statements is insignificant, there is no guarantee you will get statements in any order. See phab:T125493 or phab:T173432. --Matěj Suchánek (talk) 08:56, 11 April 2020 (UTC)[reply]
Thanks for the links, I've still yet to fully work out how to navigate phabrictor effectively! This confirms my understanding that it's a UI issue not a data issue. --SilentSpike (talk) 11:57, 11 April 2020 (UTC)[reply]
There might be a few properties whose values could safely be sorted by the "point in time" qualifier (e.g. population). Ideally this is done directly when some other edit is made. --- Jura 20:48, 13 April 2020 (UTC)[reply]

All Xwiki articles without X language labels

Hi! I'm trying to rewrite and optimize a query for getting all X-wiki articles that don't have an X-language label. Currently I have something like this:

select ips_item_id, ips_site_page
from wb_items_per_site i
where i.ips_site_id = 'lvwiki'
  and not exists (
    SELECT wbit_item_id
    FROM wbt_item_terms
      INNER JOIN wbt_term_in_lang ON wbit_term_in_lang_id = wbtl_id
      INNER JOIN wbt_text_in_lang ON wbtl_text_in_lang_id = wbxl_id
    WHERE
      wbit_item_id = i.ips_item_id
      and wbtl_type_id = 1
      AND wbxl_language = 'lv'
  )
group by ips_item_id
limit 5000;

Maybe somebody sees some way to optimize it? --Edgars2007 (talk) 17:36, 9 April 2020 (UTC)[reply]

There is no aggregation function used, so group by is redundant. --Matěj Suchánek (talk) 08:10, 10 April 2020 (UTC)[reply]
(facepalm) --Edgars2007 (talk) 09:46, 11 April 2020 (UTC)[reply]

Also known as

Is it ok to add in a common misspelling of a name in the "Also known as" field? Such as "Charles Lindburgh (misspelling)". --RAN (talk) 18:01, 9 April 2020 (UTC)[reply]

Yes, if some sources make that spelling error then include it as an alias. It will stop duplicate entries being created as users who search for the misspelling will find the correct item. I wouldn't include it unless other sources have already made the error, as that will just add clutter to the database. In other words, don't include it if you just think that it "may" be a common misspelling but have never seen evidence of the error. From Hill To Shore (talk) 18:22, 9 April 2020 (UTC)[reply]
I recommend being conservative with 'Also known as' values. They aren't necessarily hidden or discreet (i.e. displayed only in Wikidata), but may well be regurgitated elsewhere, for instance Creator templates on Commons. -Animalparty (talk) 16:36, 10 April 2020 (UTC)[reply]
I would also note that adding Charles Lindburgh is fine (assuming people do actually misspell it like that), but we should avoid adding notes like Charles Lindburgh (misspelling) - not completely sure which you're suggesting. Andrew Gray (talk) 20:44, 11 April 2020 (UTC)[reply]
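As an aside, aliases like this can also be added programmatically via the wbsetaliases API action, which takes an item id, a language, and the alias to add. A sketch of the request parameters only (an actual write is a POST and also needs an edit token; the item id here is a placeholder, not Lindbergh's real Q-id):

```python
from urllib.parse import urlencode

def alias_params(qid, language, alias):
    """Parameters for a wbsetaliases call adding one alias."""
    return {
        "action": "wbsetaliases",
        "id": qid,
        "language": language,
        "add": alias,
        "format": "json",
    }

# Add the common misspelling as an English alias (placeholder Q-id):
print(urlencode(alias_params("Q1234", "en", "Charles Lindburgh")))
```

Note the alias itself is stored plain; per the discussion above, annotations like "(misspelling)" would not go into the alias text.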

contemporary constraint

contemporary constraint (Q25796498)

"The entities 1959 Cypriot presidential election (Q4305841) and President of Cyprus (Q841760) should be contemporary to be linked through office contested (P541), but the latest end value of 1959 Cypriot presidential election (Q4305841) is 13 December 1959 and the earliest start value of President of Cyprus (Q841760) is 16 August 1960."

A case where the contemporary constraint is wrong. The election took place some months before independence day. How can the constraint violation be removed from the statement? Xaris333 (talk) 00:44, 10 April 2020 (UTC)[reply]

I added an exception for it on office contested (P541). Ghouston (talk) 05:30, 10 April 2020 (UTC)[reply]
Thanks. Xaris333 (talk) 18:11, 10 April 2020 (UTC)[reply]

unique duplicate identifiers on Q33102818 ?

Good morning! Q33102818 contains the same URL for the AE member ID Chanin_Marie-Lise and for a reference at https://www.ae-info.org/ae/User/Chanin_Marie-Lise. Is my assumption correct? What, when and how should things be added to references? Also, please provide an example which contains all data [I can view and learn through the edit history]. Leela52452 (talk) 04:15, 10 April 2020 (UTC) Suggestion or critique is preferred here[reply]

You are correct: you do not need to add a reference link to an external identifier; the ID number is already a link and its own reference. --RAN (talk) 18:48, 10 April 2020 (UTC)[reply]

List of Wikidata external sources

Hi everyone,

I'm new to Wikidata but I find the project ... just amazing.

As far as I understood by skimming through the docs, data are manually edited but are also automatically imported from external sources.

Is there a list of the external sources used to populate Wikidata?

Sylvain Leroux (talk) 10:41, 10 April 2020 (UTC)[reply]

There are too many of those sources to list, but right now someone is importing many, many galaxies and related solar systems, and for years people have been importing scientific papers and their authors. I also know that someone uploaded every single Dutch street in 2015, and a lot of items are simply created because they belong to a new Wikipedia article in one of the supported languages. Edoderoo (talk) 13:14, 10 April 2020 (UTC)[reply]
  • Bot imports usually have an external identifier, but not all external identifiers are completely loaded. For instance, Findagrave and Familysearch have famous people, but also lots of ordinary people that would be difficult to disambiguate and that just make it difficult to find the one you are looking for, so we only import the famous people. Bot imports need to be approved since we have computational limits. This has caused a queue to form for uploading large data sets. Somewhere we have a list showing the mix and match programs; the list compares the size of all the original data sets and the percent we have uploaded. Does anyone have a link to that table? It showed the percent of VIAF, of Geonames and of other large data sets that were imported entirely or matched against existing entries. --RAN (talk) 15:32, 10 April 2020 (UTC)[reply]
Thanks for the reply Edoderoo. There were actually many questions in one, and my initial message wasn't clear about that. Probably because it wasn't that clear in my mind. If I try to rephrase my thinking:
* Is there a list of the sources regularly explored by bots? As a random example, with such a list I would know if I need to query/parse/cross-reference the Library of Congress Catalog, or if this would be a waste of time since Wikidata already aggregates that content.
* Is there a way to distinguish between bot-imported data and human-curated data? For example, it's surprising (to me!) that COBOL (Q131140) programming paradigm (P3966) object-oriented programming (Q79872) has no reference, whereas COBOL (Q131140) instance of (P31) object-based language (Q899523) has en:wikipedia as the source. Would that mean that in the first case the data were manually entered? Or were they derived from other data? And why is only en:wikipedia quoted as the source in the second case? Does that mean no other Wikipedia corroborates that fact?
Sylvain Leroux (talk) 15:28, 10 April 2020 (UTC)[reply]
Wikidata does not contain all entries in VIAF/LCCN/GND/FAST etc.--GZWDer (talk) 19:48, 10 April 2020 (UTC)[reply]
Somewhere we had a list showing the percent of each that was imported through the mix and match program which found a matching value for a person or place in the database. --RAN (talk) 05:06, 14 April 2020 (UTC)[reply]

Help

Hello everyone, I hope you are all well! I want to learn how to find the Freebase ID! AntonyFragakis (talk) 13:31, 10 April 2020 (UTC)[reply]

Combining values of a statement

Please compare the values given for the statement 'number of pages' in the items 1 and 2. I would like to know whether both the formats given for the statement are permissible or not. Adithyak1997 (talk) 14:32, 10 April 2020 (UTC)[reply]

Disambiguating in the description field

Is there a consistent way to disambiguate in the description field? When I look at Frederick Dent Grant (Q1344993) it says "United States Army general (1850-1912)". That could be read as saying his years of service in the military were from 1850 to 1912. Would it be better as "(1850-1912) United States Army general"? We do need to add "(yyyy-yyyy)" at times, because we have multiple people in large famous families recycling the same name over multiple generations. This leads to people linking the wrong generation in the parent-child concatenations here in Wikidata, especially when they have the same description such as "American politician" or "American businessman". I have been noticing the errors as I add the family tree template at Commons. See Commons:Category:Ulysses S. Grant --RAN (talk) 19:15, 10 April 2020 (UTC)[reply]

I see some that are listed as "American politician, born 1853" or "American politician, died 1921" as a way to disambiguate multiple people of the same name. Is that better than "American politician (1853-1921)" so we do not interpret it as "American politician in office from 1853 to 1921". This doesn't come up often, but we have some families with 4 generations of people with the same first middle and last name, and when we are in the 1700s and early 1800s most people did not have a middle name recorded, unless they were a member of a noble family. --RAN (talk) 00:18, 11 April 2020 (UTC)[reply]

Programming language derivative work

I'm not sure this is the right place to discuss that topic, but I noticed in Pascal (Q81571) a warning because derivative work (P4969) does not apply to items that are instance of (P31) programming language (Q9143). But the statement looks like it makes sense. I see three ways to resolve that:

  1. There is a better-suited property than derivative work (P4969) in that context.
  2. or programming language (Q9143) should be changed to be a direct or indirect instance of work (Q386724).
  3. or derivative work (P4969) should be changed to accept programming language (Q9143). In that case we should also consider changing based on (P144).

What do you think?

Sylvain Leroux (talk) 21:16, 10 April 2020 (UTC)[reply]

Object-orientation vs object-oriented programming

I noticed object-orientation (Q2011845) is wrongly used instead of object-oriented programming (Q79872) as the programming paradigm (P3966) for at least one programming language. I suggest:

  1. A rule should be added to programming paradigm (P3966) so it warns if applied to something other than an instance of (P31) programming paradigm (Q188267)
  2. and object-oriented programming (Q79872) should be made facet of (P1269) object-orientation (Q2011845)

What do you think?

Sylvain Leroux (talk) 21:50, 10 April 2020 (UTC)[reply]

Is dialect a class of programming language?

I'm not sure dialect (Q2458742) is a valid instance of (P31) programming language (Q9143). Given the description, it looks like it came out of an automatic import from StackOverflow.

It seems more meaningful to use dialect of (P4913) to establish the relationship between the derived language and its base language. I did it on Eta (Q51170461). Should I proceed that way, or do I need to revert that change?

Sylvain Leroux (talk) 22:37, 10 April 2020 (UTC)[reply]

In theory, dialect of (P4913) is only for dialects of human languages.
See also Wikidata:WikiProject Informatics/Programming Language and its talk page.
« Given the description, it looks like it came out of an automatic import from StackOverflow. » → Given the history, it did not come from an automatic import from StackOverflow. Visite fortuitement prolongée (talk) 07:28, 11 April 2020 (UTC)[reply]
In theory, dialect (Q33384) is only for dialects of human languages. Visite fortuitement prolongée (talk) 09:00, 11 April 2020 (UTC)[reply]
@Visite fortuitement prolongée: I stopped editing when I realized both dialect (Q33384) and dialect of (P4913) already appear in several language descriptions, and that dialect of (P4913) requires the subject to be an instance of (P31) dialect (Q33384).
Conceptually, is there a difference between the notion of computer-language dialect and of human-language dialect?
Sylvain Leroux (talk) 13:06, 11 April 2020 (UTC)[reply]

I don't know what the policy is regarding moving topics, so I just copied this discussion to the Wikidata:WikiProject Informatics/Programming Language talk page

What happened with logos?

On Bangla Wikipedia (Q427715), its logo image (P154) statement tells me "the Commons link should exist", even though the Commons link does in fact exist. After appending ?action=purge to the URL and confirming the purge, the warning went away for an hour, but after an hour I again saw the "should exist" message. --Liuxinyu970226 (talk) 01:41, 11 April 2020 (UTC)[reply]

'Namwon-ro 527beon-gil' (Korean '-gil' means 'street') refers to one of the regions in Wonju, Gangwon-do, the Republic of Korea, as shown in [11]. 146 buildings are located in this area. I would like to ask for opinions on its notability here. Thank you. Choikwangmo9 (talk) 03:02, 11 April 2020 (UTC)[reply]

Area outline on the displayed map in a commons category

I'm referring to commons:Category:Evangelischer Friedhof Matzleinsdorf. In the Wikidata Infobox you can see that the cemetery has an outline on the map. Yet I'm not sure how to add that outline to other items/did not find any option to add it. An example would be commons:Category:Soldatenfriedhof Wien-Evangelischer Friedhof Matzleinsdorf (Zweiter Weltkrieg, russisch) as a subcategory of the reference category. The cemetery as well as the war cemetery are represented as ways in OpenStreetMap (OSM). Since the outline of the cemetery seems like a perfect fit for the way in OSM, I guess it uses this way somehow. I know there is OpenStreetMap relation ID (P402), but this does not work for nodes or ways.
Does anybody have an idea how this can be added/is already added for some categories? (BTW: I have a draft ready for a property proposal for either OSM reference ID or OSM node ID and OSM way ID) --D-Kuru (talk) 13:57, 11 April 2020 (UTC)[reply]

@D-Kuru: Apparently the areas are stored as Wikidata references on OpenStreetMap. See c:Template_talk:Wikidata_Infobox/Archive_1#where_are_grey_map_areas_stored?. Ghouston (talk) 00:35, 12 April 2020 (UTC)[reply]
Thanks for the info! --D-Kuru (talk) 15:05, 12 April 2020 (UTC)[reply]

What roles can Wikidata play in the current education crisis?

Hi all

Based on conversations I had with the WMF education team on the importance of education in times of crisis, I started an RFC on Wikidata to discuss what roles it can play in the current education crisis caused by the coronavirus (over 90% of learners worldwide are not in school). One way is by helping learners and their parents/guardians find the educational resources they need to continue their education at home. However, currently no one is consistently and systematically linking curricula (what students need to learn) to the educational resources which are available.

Thanks very much

--John Cummings (talk) 14:30, 11 April 2020 (UTC)[reply]

Quickstatements login issue

Hey folks, I can't get logged on to Quickstatements. When I eventually get through, OAuth gives me this error: "Error retrieving token: mwoauthdatastore-request-token-not-found". Any ideas? I'm working on an event this weekend and while we can prepare the data for upload, it's somewhat frustrating not to be able to actually upload it at the end of the event! Lirazelf (talk) 14:50, 11 April 2020 (UTC)[reply]

I also face the same issue (and errors on QuickStatements import). --Misc (talk) 17:57, 11 April 2020 (UTC)[reply]
Seems there is already a phabricator task on it: https://phabricator.wikimedia.org/T249035 --Misc (talk) 19:49, 11 April 2020 (UTC)[reply]
Ah, thanks for that. I've added a comment. Lirazelf (talk) 08:52, 12 April 2020 (UTC)[reply]

This is a notice directed in accordance with the relevant policy for requesting CheckUser. Thank you. --Sotiale (talk) 01:00, 12 April 2020 (UTC)[reply]

In accordance with the policy, here is a notification of my candidacy for CheckUser.--Jasper Deng (talk) 01:04, 12 April 2020 (UTC)[reply]

Added here for completeness--Ymblanter (talk) 19:04, 12 April 2020 (UTC)[reply]

General discussion

I took the liberty of moving these into one section to permit discussion of the general principle of having local checkusers on this project that is independent of the specific applicants. See meta:Steward requests/Checkuser for the process we currently follow, having never had local checkusers. Bovlb (talk) 04:24, 12 April 2020 (UTC)[reply]

  • On the plus side, this project is growing, and it's probably about time that we took this load off the global stewards. On the minus side, a significant fraction of our LTA involves cross-wiki problems, and there may therefore be some need for a CheckUser to be able to check across multiple projects. Bovlb (talk) 04:48, 12 April 2020 (UTC)[reply]
    • @Bovlb: Our current CheckUser policy permits stewards to keep checking for those cases, and we have a sizable number of local sockpuppetry cases as well. A particularly flagrant one I recall is Special:Contribs/JJBullet. Our goal here is to be better able to handle the local cases. CheckUsers also get subscribed to a cross-wiki checkuser-l mailing list for all stewards and CheckUsers globally and are given access to a CheckUser wiki also for those users, which facilitates cooperation with the stewards. I personally have worked closely with stewards on cases requiring CheckUser and would never hesitate to reach out for cross-wiki cases.
    • On another note, I don't think it's really necessary to have a separate discussion for it. The comments sections of our requests should be used for this IMO.--Jasper Deng (talk) 04:54, 12 April 2020 (UTC)[reply]
  1. Do we need a "Request for CheckUser" page? Or should we use the Administrators' noticeboard?
  2. A mailing list may also be needed.
  3. Should we relax the sockpuppet policy once we have CheckUsers? (previous discussion)

--GZWDer (talk) 10:37, 12 April 2020 (UTC)[reply]

In my opinion more CheckUsers are needed; the optimal number is four. More admins should volunteer.--GZWDer (talk) 10:40, 12 April 2020 (UTC)[reply]
I think using the Administrators' noticeboard may create confusion for users from other projects; a separate page would also make it easier to archive the requests and easier for CUs to notice new requests in their watchlists.-BRP ever 11:14, 12 April 2020 (UTC)[reply]
We could definitely set up a request for CheckUser process. On my part, my only really big preference is that I would like the prefill to ask for evidence, since investigations won't be carried out without it.--Jasper Deng (talk) 17:33, 12 April 2020 (UTC)[reply]
I concur with BRPever and Jasper Deng on this. --Kostas20142 (talk) 07:10, 13 April 2020 (UTC)[reply]

QID Emoji proposal

In January 2019, a proposal was made about creating a new type of Emoji Tag Sequence that uses Wikidata QIDs. Here is the revised proposal from November 2019. The Unicode Consortium opened a Public Review which will run until April 20. You may provide feedback on the proposal here.--GZWDer (talk) 02:21, 12 April 2020 (UTC)[reply]

Good lord. If I'm reading this correctly, the intent is basically to let anyone add new emoji to Unicode, without the Unicode consortium having to be involved at all, in effect by letting Wikidata be the registry instead.
Suppose (to use an example from the proposal) I want there to be a Unicode emoji for the flag of NATO. Right now there's not a defined Unicode code point for that. But under this proposal, it would be possible for me literally to embed tags for the characters Q 4 5 9 7 8 8 in a text stream in a special way, such that the sequence (begin tag + 7 tag characters + end tag) would represent one emoji character, and if a recipient looked up Q459788 and discovered that it had an emoji icon for it, hey presto, you'd get a little NATO flag icon in your text stream.
Needless to say this would put Wikidata in an interesting position, as emojiphiles flocked here to try to add or adjust the behavior of their favorite new emoji! It would also open up fantastic new vistas for vandalism and prankery...
Now with this said, the proposal does not say that every Wikidata entity is automatically a new emoji character. No, it's limited to some combination of
  • entities that have a distinct visual representation
  • entities that people think are reasonable to use as emoji
It's not the case that any qid you stuck in a text stream would, necessarily, automatically get rendered on the recipient's screen by, say, fetching the entity's image (P18) or icon (P2910) value. Anybody who implemented something like that would clearly be insane. (But you just know it would happen!) —Scs (talk) 12:13, 12 April 2020 (UTC) edited 12:57, 12 April 2020 (UTC)[reply]
The implementation will likely hardcode all supported emoji, and how an emoji is displayed depends on the font or software (usually a static symbol is created for each supported emoji, independent of the current content of the item), so it is not affected by vandalism unless the vandalism goes unnoticed by developers. In UTF-8 each emoji and tag character is four bytes, so a sequence for flag of NATO (Q459788) is 36 bytes (with "tag base" and "end tag").--GZWDer (talk) 12:39, 12 April 2020 (UTC)[reply]
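The byte arithmetic above can be sketched in Python. Assumptions: the sequence uses a 4-byte base character (U+1F3F4 here, as in existing flag tag sequences), tag characters at U+E0000 plus the ASCII code point, and the cancel tag U+E007F; the characters in the actual proposal may differ.

```python
# Sketch of the tag-sequence byte arithmetic (assumed characters, see above).
BASE = "\U0001F3F4"      # base character (assumption)
CANCEL = "\U000E007F"    # CANCEL TAG

def qid_tag_sequence(qid: str) -> str:
    """Encode a QID as base character + tag characters + cancel tag."""
    tags = "".join(chr(0xE0000 + ord(c)) for c in qid)
    return BASE + tags + CANCEL

seq = qid_tag_sequence("Q459788")
# 9 characters (base + 7 tags + cancel), each 4 bytes in UTF-8
print(len(seq), len(seq.encode("utf-8")))  # 9 36
```

Every code point involved is above U+FFFF, so each one takes four bytes in UTF-8, giving the 36 bytes mentioned above.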
Right, will likely hardcode.
To be clear, as I understand it, today the process for adding new emoji or other characters to Unicode goes something like this:
  • Someone proposes new characters.
  • The Unicode consortium evaluates the proposal and, if approved, assigns new code points.
  • Assuming they want to maintain Unicode compatibility, implementors (of web browsers, smartphone messaging apps, etc.) add rendering support for the new code points in their next release.
The new process would streamline things, taking the Unicode consortium out of the loop:
  • Someone proposes new emoji characters, pointing out that they already have Wikidata QIDs.
  • If they agree, implementors add rendering support for the new QIDs in their next release.
It would be, as I said, insane for an implementor to arrange for unrecognized QIDs to be displayed by automatically, dynamically fetching an image from Wikidata or elsewhere. But given how much people looooovve new emoji, my prediction is that soon enough, even the "streamlined" approval process would be found to be cumbersome, and someone would plough ahead with some kind of automated or semiautomated approach.
So without saying whether I'm for or against this, my prediction is that it would have a huge impact on us here. (When the Unicode consortium was first established, I bet they never imagined they'd find themselves at the forefront of pop-culture iconography debates, either.) —Scs (talk) 13:26, 12 April 2020 (UTC)[reply]

Picture of the Year (Q15635201) (the "Picture of the year" project page) and Picture of the Year (Q28155311) (the "competition") should both link to commons:Commons:Picture of the Year; it was proposed to merge them at Wikidata:Requests for deletions#Q15635201. Initially I thought they should be merged, but it looks like they were intended to be separate because of the instance of (P31) and official website (P856) values, and the use in award received (P166) on other items. Q28155311 currently links to the category. Peter James (talk) 10:36, 12 April 2020 (UTC)[reply]

Help needed to merge Fallopia japonica -> Reynoutria japonica

Hello, I think that Reynoutria japonica (Q18421053) and Fallopia japonica (Q899672) correspond to the same species of plant, just under different names. Indeed, there is already a warning that the link to the "Encyclopedia of Life" is a duplicate. I'm a newbie. I've tried to merge them with the "Merge gadget" by following the instructions. However, there is a conflict on the Welsh sitelinks (which I think is just because the Welsh Wikipedia also has a duplicate). Trying to adjust the link fails (the publishing button gives an error).

I'm giving up... can someone help me?

Pieleric (talk) 13:54, 12 April 2020 (UTC)[reply]

They cannot be merged because the two items are linked with each other by taxon synonym (P1420). And that is intentional - every taxon name has a separate item here, even if it is widely accepted as a synonym. Ahoerstemeier (talk) 14:07, 12 April 2020 (UTC)[reply]

Editing files on Commons with QS

On the QS page I saw a box that says "Commons [batch mode only!]". Does that mean I can use QS to add statements to images on Commons? Example: File:Building at Döblinger Hauptstraße 83 in Döbling, Vienna, Austria PNr°0516.jpg depicts Q64692116. Is there a way to quickly add information like this to the image without a manual edit?
BTW: I'm not talking about modifying the image description page; I'm just talking about linking an item from Wikidata. --D-Kuru (talk) 15:10, 12 April 2020 (UTC)[reply]

I already took a look at Help:QuickStatements, but didn't find anything in there that would help me along. --D-Kuru (talk) 15:12, 12 April 2020 (UTC)[reply]
Yes, you can. It's possible to specify either the filename or the Mid; captions are equivalent to Wikidata labels, otherwise the format is the same. --Matěj Suchánek (talk) 09:07, 13 April 2020 (UTC)[reply]

@Matěj Suchánek: I tested a little bit, but I still don't seem to get how this works.
I want to add Radeon HD 5450 (Q90327087) to File:Sapphire Radeon HD 5450-front PNr°0382.jpg. First I tried the opposite of how you add a statement to a Wikidata item:

  • File:Sapphire Radeon HD 5450-front PNr°0382.jpg|P180|Q90327087
  • "File:Sapphire Radeon HD 5450-front PNr°0382.jpg"|P180|Q90327087
  • Sapphire Radeon HD 5450-front PNr°0382.jpg|P180|Q90327087
  • "Sapphire Radeon HD 5450-front PNr°0382.jpg"|P180|Q90327087

None was working. The page just tells me "No valid commands found". I thought, maybe it's still the other way round:

  • Q90327087|P180|File:Sapphire Radeon HD 5450-front PNr°0382.jpg
  • Q90327087|P180|"File:Sapphire Radeon HD 5450-front PNr°0382.jpg"
  • Q90327087|P180|Sapphire Radeon HD 5450-front PNr°0382.jpg
  • Q90327087|P180|"Sapphire Radeon HD 5450-front PNr°0382.jpg"

The first and the third one won't even run (it is just stuck on run). The second and the fourth one just say "Invalid snak data."
First I tried the batch mode for Commons. Then, since it didn't work, I tried to do it on Wikidata, figuring I might as well test that too. I'm clearly doing something wrong here, but I have no idea what.
--D-Kuru (talk) 14:02, 13 April 2020 (UTC)[reply]

I looked at Help talk:QuickStatements#Commons again. Both the command and the CSV syntax work for me, but conversion from filename to Mids still (or again) doesn't work. (Something made me think it does.) Also, [batch mode only!] is a gotcha: it will only work if you run the batch asynchronously (in the background). --Matěj Suchánek (talk) 14:57, 13 April 2020 (UTC)[reply]
@Matěj Suchánek: Thanks. It finally worked. I'm not sure if it is faster than entering the info by hand, though...
For archive and later lookup purpose the steps:
  1. Find Item on Wikidata
  2. Check the file ID with Minefield (it looks like M123456). Multiple files can be added, one per line. NOTE: The files get resorted by their media ID!
  3. Open a new QS batch
  4. Select "Commons [batch mode only!]" from the drop down menu
  5. Add the files and items like M123456|P180|Q654321 (=> file with ID 123456 depicts item 654321)
  6. Import commands
  7. Check commands and click "Run in background" (!! NOT "run" !!)
  8. Enter a name for the batch (optional)
  9. Check for errors and own mistakes (eg. linking the wrong item to a file)
--D-Kuru (talk) 17:01, 13 April 2020 (UTC)[reply]

Do we know the number of these items, and how long would it take for it to get to 100%? Is it an idea to automate such a link? Thanks, GerardM (talk) 15:27, 12 April 2020 (UTC)[reply]

Former citizenship(s) of people - whether to present them, and if so, how? (ideological issue)

I would like to discuss with you whether and how to present former citizenships of people, due to moving to another country or a country ceasing to exist.

Intro I have noticed that user:sporti edited all items about Slovenian people and added SFRJ (Yugoslavia) as their citizenship. Since SFRJ doesn't exist anymore, people aren't citizens of SFRJ anymore.

Cause Wikipedia uses data from Wikidata, and therefore information about citizenships is shown in article infoboxes. In my opinion, if such information about former citizenships has to be displayed, then I would suggest that a new category ("Former citizenships") should be created where such citizenships would be inserted.

Why change is needed? In the case of Slovenian people and SFRJ, having both the Slovenian citizenship and the SFRJ citizenship in the same category can provoke many people, as it presents an ideological topic that divides the nation. It's a well-known fact that people born in Slovenia before its independence were citizens of SFRJ, but they ceased to be that after independence and became Slovenian citizens. Having SFRJ in the same category of citizenships could be interpreted as support for the former regime and therefore present a controversial view, which should be avoided on an open platform such as Wikipedia.

Therefore I address this question to you, the community: whether and how such data should be presented, so that it doesn't cause ideological conflicts.

Best regards, Gregor – The preceding unsigned comment was added by 89.143.164.113 (talk • contribs) at 15:42, 12 April 2020‎ (UTC).[reply]

This can be resolved by setting country of citizenship (P27) with qualifiers for start time (P580) and end time (P582). One citizenship will end and another will begin. In cases where there are periods of dual citizenship, the overlapping time periods will reveal that as well. From Hill To Shore (talk) 15:48, 12 April 2020 (UTC)[reply]
A Wikidata statement is time-neutral by default. The statement that a person has a specific citizenship just means that there was a time in their life when they had the citizenship. It's useful to use the start time (P580) and end time (P582) qualifiers. In addition you can use the preferred-rank to mark the current citizenship and then Wikipedia infoboxes can read the current citizenship that way.
You may want to mark current (or latest) data with preferred rank.--GZWDer (talk) 19:13, 12 April 2020 (UTC)[reply]
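As a sketch, the qualifier approach described above could look like this in QuickStatements (the person QID is a placeholder, the dates are illustrative, and Q83286 and Q215 are assumed here to be SFRJ and Slovenia respectively):

```
Q99999999|P27|Q83286|P580|+1963-05-01T00:00:00Z/11|P582|+1991-06-25T00:00:00Z/11
Q99999999|P27|Q215|P580|+1991-06-25T00:00:00Z/11
```

The rank on the current citizenship would still need to be set to preferred separately.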

presidential election

1959 Cypriot presidential election (Q4305841) -> candidate (P726) -> Makarios III (Q153509) -> member of political party (P102) --> independent politician (Q327591)

How can I show that Makarios III (Q153509) was supported by AKEL – Left – New Forces (Q22808995)

Another example,

2003 Cypriot presidential election (Q3557575) -> candidate (P726) -> Tassos Papadopoulos (Q200776) -> member of political party (P102) --> Democratic Party (Q816863)

How can I show that Tassos Papadopoulos (Q200776) was also supported by AKEL – Left – New Forces (Q22808995) and Movement for Social Democracy (Q259800)?

Xaris333 (talk) 17:00, 12 April 2020 (UTC)[reply]

I cannot answer the question, but member of political party (P102) --> independent politician (Q327591) seems wrong to me. Q327591 is not a political party, and by using that value you will make it seem like all candidates with no party belong to the same party. I think it instead should be "no value". --Dipsacus fullonum (talk) 17:14, 12 April 2020 (UTC)[reply]
@Dipsacus fullonum: independent politician (Q327591) is an item that says "a politician not affiliated with any political party"; therefore it is a correct option for member of political party (P102). "No value" is a less useful option in this case.
@Xaris333: I've adjusted your templates above so that they work correctly. You need to set them as {{Q|Q4305841}} and {{P|P726}}. The only qualifier that I can see will work here is political coalition (P5832). Was it a formal agreement between the candidate and the other party or did they just choose to support him without his agreement? From Hill To Shore (talk) 17:32, 12 April 2020 (UTC)[reply]
Formal agreement. Not between parties, but between the candidate and the party. Xaris333 (talk) 17:37, 12 April 2020 (UTC)[reply]
In that case maybe nominated by (P4353) would be the better option. You could use the qualifier a few times and say he was the nominated candidate of different parties but a member of only one party. From Hill To Shore (talk) 17:45, 12 April 2020 (UTC)[reply]
@Xaris333: "a politician not afiliated with any political party" is not a party, so it should never be the value of P102 which have political parties as values. Please note that I am not saying that there shouldn't be a claim for P102, but that the special value called "no value" should be used as value. It is explained at Help:Statements#Unknown or no values. --Dipsacus fullonum (talk) 17:48, 12 April 2020 (UTC)[reply]
@Dipsacus fullonum: independent politician (Q327591) is intended for use with member of political party (P102). Where else would you make use of it other than in a call from member of political party (P102)? Rather than telling people that there was a political party made up of independents, the structured data from independent politician (Q327591) tells people that all the people linked to that item were at one point in time independent politicians outside of any party. "No value" tells us nothing and prevents any queries for "independent politicians of country X." If you think this is wrong then we should bring in a third opinion, or at the very least engage in a discussion. Ignoring the point someone raised a few lines above and giving out disputed advice is not very productive. If the issue isn't resolved one way or the other you will just have one set of editors adding an item and another set of editors removing it in an infinite loop. From Hill To Shore (talk) 21:46, 12 April 2020 (UTC)[reply]
@From Hill To Shore: Q327591 is not intended for use with P102. The item was created in 2012 to create interwiki links between many Wikipedia articles about independent politicians. P102 was created later (in 2013) and should only have political parties as values, which Q327591 is not. Nor is it true that "no value" tells us nothing. The claim "person P102 no value" tells us that the person is not a member of any political party, and therefore is an independent politician. And nothing prevents you from querying for persons with "no value" for P102. But use of Q327591 is against the data model for Wikidata and makes queries difficult (like e.g. finding members of the same political party) because Q327591 isn't a political party as expected for values of P102. --Dipsacus fullonum (talk)
@Dipsacus fullonum: That may be the history of it but the current situation is that we have 7,773 links from main space to independent politician (Q327591). There may be some different usages mixed in there, but all of the ones I sampled had it linked to member of political party (P102). For the record, as far as I can recall, I have never added an entry for member of political party (P102), so those 7,773 links are by other users or bot operators. My personal opinion is that I think you are wrong on this. Whatever the historical order of creating the items and properties, linking the two establishes the concept of "independent politician" in association with the individual. No value does not link the concept. However, the key here is consensus, neither you nor I should be dictating Wikidata policy. There is evidence here of a set of editors working against your advice. If you think they are wrong, build a consensus to reverse it. From Hill To Shore (talk) 22:51, 12 April 2020 (UTC)[reply]
There are 4,976 truthy claims for persons with P102 Q327591 and 1,634 truthy claims for persons with P102 "no value" according to these two queries:
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q5 .
  ?item wdt:P102 wd:Q327591 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
Try it!
SELECT ?item ?itemLabel
WHERE
{
  ?item wdt:P31 wd:Q5 .
  ?item a wdno:P102 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
Try it!
--Dipsacus fullonum (talk) 23:10, 12 April 2020 (UTC)[reply]
I think that the claims with P102 Q327591 are incorrect, against policy (which are not made by me) and should be changed because Q327591 is not a political party, and because Wikidata has standard means to indicate that a property have no value which are intended for exactly such cases. --Dipsacus fullonum (talk) 23:10, 12 April 2020 (UTC)[reply]
Just to say that in member of political party (P102) there is Wikidata property example (P1855) with independent politician (Q327591) since June 2016 [12]. Maybe @Jura1: can help us. Xaris333 (talk) 23:13, 12 April 2020 (UTC)[reply]
You will note that Jura's example which was used in Special:Diff/351977470 was changed to use "no value" in 2019 in Special:Diff/958163719. --Dipsacus fullonum (talk)
So? That doesn't mean that the second one is the correct one... As a user, I usually check the examples on a property's page. It has been there since 2016. Maybe that's the reason why some users used it like that. Xaris333 (talk) 00:18, 13 April 2020 (UTC)[reply]

presidential election of 2 rounds

The President of Cyprus is elected using the two-round system; if no candidate gets a majority in the first round of voting, a run-off is held between the top two candidates.

I tried to apply this in Wikidata with 1988 Cypriot presidential election (Q4351905). So, the structure is like this:

I am not sure about the statements I used. Some questions:

Xaris333 (talk) 18:48, 12 April 2020 (UTC)[reply]

This one seems to have a better setup: first round of French presidential election, 2017 (Q29836873). It's not itself an instance of an election. Ghouston (talk) 00:58, 13 April 2020 (UTC)[reply]
Yeah, that one is nice. If you don't want to include that much detail, you could do like 2016 Estonian presidential election (Q22676594), but that way one might miss the interesting part: the person elected in the 4th round wasn't a candidate in the first three. --- Jura 01:05, 13 April 2020 (UTC)[reply]

Thanks! Most problems solved.

  • For the last problem I added a separator to point in time (P585) [13]. Is that ok?
  • Another problem is candidate (P726). Should we add the candidates to the election item or only to the round items? Because, if you add them to the election item, you must add votes received (P1111), but you can't because two of the candidates participated in two rounds. Xaris333 (talk) 01:25, 13 April 2020 (UTC)[reply]
    • In English, it just reads "potential issue" (This candidate statement is missing a qualifier votes received.), not "you must add". I do think adding all candidates on the election items is an advantage, even if they didn't actually reach the stage where they could get votes. --- Jura 01:30, 13 April 2020 (UTC)[reply]
Sorry for my English... So, is it better to have the candidates only in the round items? In 2017 French presidential election (Q7020999) all the candidates are also there with the potential issue. Xaris333 (talk) 01:40, 13 April 2020 (UTC)[reply]
No. Just noticed that someone had borked the main item .. --- Jura 18:59, 13 April 2020 (UTC)[reply]

Infectious disease harmonization

We have all the COVID deaths using the same "cause of death" but looking back at other diseases I want to harmonize the cause of death. Is there a tool I can use to put all the AIDS deaths under death from AIDS-related complications (Q4651894)? We have several synonyms, such as "AIDS" (142) and "HIV" (there were 4, I moved them by hand) and "AIDS related disease" (34), but death from AIDS-related complications (Q4651894) has the largest number and appears to have been created just for this purpose. What tool will allow me to do it easily, rather than one-by-one? --RAN (talk) 20:26, 12 April 2020 (UTC)[reply]
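One sketch of a workflow for this: use a SPARQL query on the Wikidata Query Service to list the items pointing at a synonym, then feed the results into QuickStatements. The QID in the query is a placeholder for whichever synonym item is being retired:

```sparql
# People whose cause of death (P509) is the synonym item to be replaced.
SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5 ;
          wdt:P509 wd:Q12199 .  # placeholder: the synonym item
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```

The exported list can then be turned into `person|P509|Q4651894` commands in QuickStatements, after removing the old statements.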

Help needed to create reference definitions in Wikidata Items of medical terms

Hi, as a laboratory medicine professional from the Netherlands I'm rather new to Wikidata and need some support from the community to sort out things I don't readily understand due to a lack of experience, even though I've done the tutorials and such. To improve the unambiguousness of the interpretation of several Wikidata items in the field of laboratory medicine, I want to add formal definitions from authorities like WHO, ISO and IUPAC as statements on these items. The reference definitions are crucial for the FAIR-LOD principle at a knowledge level.

As an example, global authorities like ISO and WHO have formal definitions in the 'terms and definitions' sections of ISO 15189:2012 and the WHO toolkit 2009, e.g. for the item accreditation (Q705899). I want to present the definition of a term (Wikidata item) as a statement, such as a property "defined as" with datatype text for the value (the original text of the standard). As qualifiers I would suggest "defined in" with the standard document, such as ISO 15189:2012, and "defined by" with the authorized body behind that document, such as ISO.org.

The only properties I ran into were "described by source" (P1343), "stated as", "stated in" and "stated in reference as" ( https://tools.wmflabs.org/hay/propbrowse/ ), but these do not cover this issue. Moreover, "stated as" is allowed only as a qualifier and not as a main property; this restriction I don't understand, by the way.

Therefore, I want to propose this new property in ( https://www.wikidata.org/w/index.php?title=Wikidata:Property_proposal/Pending&action=edit&section=4 ). And here I run into some newbie issues, that is, how to fill in this form and how to arrange the subsequent steps. It is to some extent clear to me how to do this, but to avoid disillusions I would greatly appreciate some support to get me through the math. Best regards, Frans van der Horst, Reinier de Graaf Hospital, Delft, NL frans@semantoya.nl

I think exact match (P2888), equivalent class (P1709), and described at URL (P973) are relevant properties, although I'm not sure exactly where each is preferred. Ghouston (talk) 00:48, 13 April 2020 (UTC)[reply]
  • I suppose the question is how to ensure that accreditation (Q705899) is about the ISO concept and not about a more general one. Or whether we should have another item that is specifically about the ISO definition thereof.
About the inclusion of textual definitions, there was some discussion on Wikidata:Property_proposal/definition. --- Jura 00:58, 13 April 2020 (UTC)[reply]


Hello Frans van der Horst. Regarding ambiguity, there are (at least) two ways of fixing the meaning of an item.
  1. Suggest a property of type "external identifier". In this case, one could imagine a property "ISO 15189:2012 item number". (I don't have access to this standard, but I could imagine that each term defined therein has an item number.) Then, when someone finds this item, they can look up the definition in the standard and be sure that they found the correct one.
  2. Use described by source (P1343), with value <some item for ISO 15189:2012>, and qualifier section, verse, paragraph, or clause (P958) (for the item number), and optionally the qualifier subject named as (P1810).
Here is an example item (identifying a physical/chemical quantity) that demonstrates how both approaches are used (linking to both ISO and IUPAC): absolute chemical activity (Q89097848).
Best wishes. Toni 001 (talk) 09:22, 13 April 2020 (UTC)[reply]

Cattle as a source of gelatin

I'm new to editing Wikidata, so please let me know if there is a better forum for this discussion. I edited cattle (Q830) to indicate this taxon is source of (P1672) gelatin (Q179254) (citations 1 2). User:Succu has repeatedly reverted my edits and the conversation I initiated on their talk page has been unproductive. Would anyone be able to offer some perspective here on how to proceed? Thank you. Scientific29 (talk) 21:35, 12 April 2020 (UTC)[reply]

Hi Scientific29, you asked me for advice on my talk page but unfortunately I am unfamiliar with this taxon is source of (P1672) as I mainly work in biographical items. Using "what links here" reveals several examples of how the property is used and I can't see anything wrong with your proposed edit. @Succu: Can you please explain where Scientific29 is going wrong here? Is this a dispute that one is not a source of the other or that this needs to be recorded in a different way? From Hill To Shore (talk) 00:09, 14 April 2020 (UTC)[reply]
  1. We do not claim cattle (Q830) this taxon is source of (P1672) milk (Q8495), but cattle (Q830) this taxon is source of (P1672) cow's milk (Q10988133)
  2. We do not claim cattle (Q830) this taxon is source of (P1672) butter (Q34172), there are other sources than cow's milk (Q10988133)
  3. The en description of gelatin (Q179254) is mixture of peptides and proteins derived from connective tissues of animals. enWP states derived from collagen (=collagen (Q26868)) taken from animal body parts. So I don't think we should claim cattle (Q830) this taxon is source of (P1672) gelatin (Q179254).
--Succu (talk) 14:35, 14 April 2020 (UTC)[reply]
@Succu: So the key problem is that we are missing an intermediate step, is that correct? If Scientific29 is able to provide sources for "beef gelatin" they can make a new item for it as a subclass of gelatin and then say on cattle (Q830) that this taxon is source of (P1672) "beef gelatin." Would that be a satisfactory approach? From Hill To Shore (talk) 16:32, 14 April 2020 (UTC)[reply]
(ec) I don't think claiming cattle (Q830) this taxon is source of (P1672) "cow milk butter" makes any sense. "cow milk butter" made from material (P186) cow's milk (Q10988133) does. --Succu (talk) 16:44, 14 April 2020 (UTC)[reply]
I am not sure if that is an answer to my last point as you have marked it as an edit conflict. However, referring to a hypothetical about "cow milk butter" instead of talking about gelatin is possibly steering this discussion in the direction of a w:Straw man. Let's try to break this discussion down into components so we can draw out the source of the dispute and find any areas of agreement. Looking on Google, there appear to be several common types of gelatin, mainly marketed to people with certain dietary requirements. Pork based gelatin appears to be sold to Christians and Hindus but not Muslims or Jews (due to religious laws against eating pork). Beef based gelatin appears to be sold to Christians, Muslims and Jews but not to Hindus (due to cows being sacred to Hindus). Fish based gelatin appears to be marketed mainly at Muslims and Jews as it can be shown more easily to comply with their religious laws. This suggests two questions to move this discussion along:
  1. If Scientific29 can provide reliable sources to establish the existence of "Beef based gelatin" as a real world product, would you have any objections to the creation of a new Wikidata item to reflect that product?
  2. If a "Beef based gelatin" item is created and Scientific29 can provide reliable sources to demonstrate that it is sourced from cattle, would you have any objections to using cattle (Q830) this taxon is source of (P1672)?
If you object to either of these proposals, please provide a reason. At the moment I am struggling to understand the source of the dispute. From Hill To Shore (talk) 18:46, 14 April 2020 (UTC)[reply]
Thank you both for your input here. User:Succu, this taxon is source of (P1672) specifically indicates that the taxon-product relationship can be one-to-one or many-to-one: "Some products may be yielded by more than one taxon." To further complicate matters, I believe many commercially available gelatins represent a mixture of several taxa, with single-taxa gelatins produced for religious consumers. Scientific29 (talk) 01:25, 16 April 2020 (UTC)[reply]
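To see how this taxon is source of (P1672) is used in practice (the "what links here" approach mentioned above), a SPARQL query on query.wikidata.org can list current taxon–product pairs. A sketch, not part of the original discussion:

```sparql
# Survey current usage of 'this taxon is source of' (P1672):
# which taxa are recorded as the source of which products?
SELECT ?taxon ?taxonLabel ?product ?productLabel WHERE {
  ?taxon wdt:P1672 ?product .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 200
```

Scanning the results shows whether the community tends to link the taxon to a specific derived product (cow's milk) or a generic one (milk), which is the crux of the dispute above.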

CentralNotice event for Korean users

Hi. I'm the WMKR (Wikimedia Korea) Project Manager.

WMKR wants to encourage Korean users to participate in Wikidata.


So we are going to hold the following event and do CentralNotice as below.


The CentralNotice runs from May 1st 15:00 UTC to May 16th 15:00 UTC

and applies only to the 'Korean language' and the 'Republic of Korea (= South Korea)'.

Thanks for reading :) --이강철 (WMKR) (talk) 02:55, 13 April 2020 (UTC)[reply]

Conflation Of

This query shows 64 "conflations" i.e. mixes of two items:

SELECT ?x ?xLabel ?xDescription WHERE {
  ?x wdt:P31 wd:Q14946528 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
Try it!

Maybe some of them are legit, e.g. "Minesweeper...conflation of several games developed by Microsoft".

But I think that creating conflations of people is harmful @DrGavinR:. These then become targets of wrong external links and create confusion.

1. It's better to keep only records about actual people and add "different from" links between them.

2. Even if there's a very good reason to create a "conflation" item, we already have a type "disambiguation page".

--Vladimir Alexiev (talk) 05:59, 13 April 2020 (UTC)[reply]

  • Q2965940 has a bunch of links that conflate(d) the two persons with that name. I think it would be harmful to arbitrarily assign the identifiers to one of the two persons.
Help:Conflation_of_two_people explains how to go about it. --- Jura 06:22, 13 April 2020 (UTC)[reply]

From now on, I won't do any more edits connected with conflated external identifiers until we've reached consensus about the best way to handle this problem. We need a consistent solution because it happens often with VIAF ID (P214), WorldCat Identities ID (superseded) (P7859) and Semantic Scholar author ID (P4012). --DrGavinR

For anyone just joining, Q87066628 is a good example of the conflation items I've created. In this case all three conflated authors can be identified clearly enough to have a Wikidata item. There may be other cases where not all of the conflated authors can be positively identified, and I have no idea what to do about that. If only one of the authors can be identified clearly enough to have a Wikidata item, then creating a new item to represent the conflation won't work.--DrGavinR (talk) 16:17, 14 April 2020 (UTC)[reply]

In my experience, emailing VIAF corrections to bibchange@oclc.org has no effect. It would be good to get input from: @Merrilee: @Thisismattmiller:

Andrew Gray Andy Mabbett Bamyers99 Ijon Vladimir Alexiev Epìdosis emu Alexmar983 Simon Cobb Pmt Mathieu Kappler MasterRus21thCentury Jheald JordanTimothyJames Maxime Jonathan Groß

Notified participants of WikiProject Biographical Identifiers --DrGavinR

How can I be helpful? Merrilee (talk) 17:33, 14 April 2020 (UTC)[reply]
hi @Merrilee: You wrote at https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors that VIAF errors shouldn't be collected there. But I don't know any better way so I collected at https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors#WorldCat_Identities_errors and emailed bibchange@oclc.org. Now @DrGavinR: says that is ineffectual. We really need a working flow for data corrections between Wikidata and VIAF: can you shed some light? If we can organize it in any more automated manner, that would be perfect: Wikidatians are good at that sort of thing --Vladimir Alexiev (talk) 06:36, 15 April 2020 (UTC)[reply]
In the meantime, do you have another suggestion? A solution that doesn't rely on contributors here emailing a person at every external database (Wikidata has other IDs than VIAF), and that ensures data users relying on these identifiers don't end up with mismatches because someone here arbitrarily assigned an identifier to the wrong item. --- Jura 06:48, 15 April 2020 (UTC)[reply]
Vladimir I've looked at the page and am having a hard time finding an example of a request you've made to bibchange. I am sorry to be so dense but I'm struggling to understand what the problem is here. Merrilee (talk) 13:30, 15 April 2020 (UTC)[reply]

OSM relation but no node or way?

Can someone explain why we have OpenStreetMap relation ID (P402) but no similar properties for OSM node or OSM way? E.g. the village where I live, Dragichevo (Q1074197), has OSM node https://www.openstreetmap.org/node/273878582 but no relation (nobody has geo-traced the village's boundary yet) --Vladimir Alexiev (talk) 06:56, 13 April 2020 (UTC)[reply]

OSM relations are relatively stable, while node and way IDs change in the normal editing process. If someone were to delete your village node and replace it with the drawn boundary (a perfectly reasonable thing to do), the resulting entity would have a new ID. Even with the village as a node, you can link the two on OSM using Key:wikidata. Vahurzpu (talk) 12:18, 13 April 2020 (UTC)[reply]
Vahurzpu is entirely correct, and I would say even the inclusion of relation IDs here is questionable as they're still pretty unstable. --SilentSpike (talk) 13:51, 13 April 2020 (UTC)[reply]

Apparently this duplicate has been created because of the duplicate ORCID iDs

What is the proper merging of the two ? Kpjas (talk) 08:49, 13 April 2020 (UTC)[reply]

@Kpjas: Just merge the two items and the merged item will get two ORCIDs. (For what it's worth, 0000-0002-3777-0960 is "preferred" because it lists more works). Do you observe many duplicates in ORCID? If so, we should organize some reporting to them --Vladimir Alexiev (talk) 08:59, 13 April 2020 (UTC)[reply]
@Kpjas, Vladimir Alexiev: You can use the contact links on the ORCID website to let them know about problems like this; I've sent them a dozen or so such cases. I think they are still working on procedures to address these, but in principle the process is that they contact both registered email addresses to confirm they are the same person, and then deprecate one ID in favor of the other. I believe they've done that successfully in a few cases I sent so far, but it takes time. ArthurPSmith (talk) 17:29, 13 April 2020 (UTC)[reply]
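To gauge how common such duplicates are before organizing a report to ORCID, a SPARQL query on query.wikidata.org along these lines lists items carrying more than one ORCID iD (P496). A sketch, added for illustration:

```sparql
# Items with two or more ORCID iDs — candidate duplicate registrations
SELECT ?person ?personLabel (GROUP_CONCAT(?orcid; separator=", ") AS ?orcids) WHERE {
  ?person wdt:P496 ?orcid .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?person ?personLabel
HAVING (COUNT(?orcid) > 1)
LIMIT 100
```

Note that some hits will be legitimate (ORCID itself sometimes deprecates one ID in favor of another, as described above), so the list is a starting point for review, not an error list.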

Location format error in commons

I was trying to correct a missing location on w:c:File:Austrian Constitutional Court building 01.jpg and w:c:File:Austrian Constitutional Court building 02.jpg and knew nothing about wikidata. The Commons location template is {{object location|Wikidata=Q873868}}. Q873868 had coordinates under statement "headquarters location"; I added a separate statement "coordinate location" with the same coordinates but now the commons page has the error "Lua error in Module:Coordinates at line 764: bad argument #2 to 'format' (string expected, got nil).", unless I also add "#statements:coordinate location|from=Q873868", when the coordinates appear correctly but twice. What am I doing wrong? Finavon (talk) 12:32, 13 April 2020 (UTC)[reply]

I have a small conclusion. What I have understood is that headquarters location and coordinate location are entirely different. Take for example US Bank. It will have a headquarters and various branches. In the Wikidata item of a branch, you need to specify both the coordinate location of the branch and the coordinate location of the headquarters. If any experienced editors find my solution illogical/confusing, please remove it. Adithyak1997 (talk) 15:09, 13 April 2020 (UTC)[reply]
Looks like a mix of several issues:
  1. coordinates under statement "headquarters location" is a valid way to indicate the exact location of headquarters. The module could be enhanced to work with it, too.
  2. I cannot see any code at the exact line that would throw this error (it's close, though). There could have been a cached version that you invalidated with your edit. It could indicate a bug in the module, too.
  3. various branches - the information about them can be split to multiple items. It isn't flexible to store information about multiple entities in a single item.
--Matěj Suchánek (talk) 15:22, 13 April 2020 (UTC)[reply]

Yes, Commons does not find the location when it is in a headquarters location statement (or a location for an image statement). I thought duplicate information in multiple statements might be a problem and have now removed the headquarters location coordinates, leaving only coordinate location. There is still an error in Commons (removed by adding the statement call). I'm just discovering Wikidata and worry I am misusing the item (Constitutional Court) - an organisation - while I want the building's location. Lots of Commons images are being tagged with a location using Wikidata. Advice welcomed. Finavon (talk) 19:49, 13 April 2020 (UTC)[reply]

Are Constitutional Court of Austria (Q873868) and Bank Austria Kunstforum Wien (Q806615) separate entities that just happen to share the same building? If so, we need to end up with an item for each organisation and an item for the building. We can then say that the building is occupied by the two organisations using occupant (P466). The geographical information can be tied into the building item through coordinate location (P625) and to the organisation items through headquarters location (P159). Does this sound like a reasonable approach to more experienced editors? From Hill To Shore (talk) 20:41, 13 April 2020 (UTC)[reply]
I think, yes. From w:de:Bank_Austria_Kunstforum_Wien: "The former bank building has been owned by Signa Holding since 2010 and has also housed the Constitutional Court since August 2012 (Renngasse 2 entrance; this was given the address Freyung 8 on the occasion of the VfGH move)." So it appears to be a former bank building, now owned by w:Signa Holding and occupied by a museum/exhibition centre and a court (VfGH). Helpful, but it does not move my question forward about how to properly remove the Lua error (whatever that is) from a location call in Commons. Finavon (talk) 07:05, 14 April 2020 (UTC)[reply]

different from

@Bouzinac: The way I understand different from (P1889) - and what it says in its description "item that is different from another item, with which it is often confused" - it should only be used when two items can be easily confused. But - are the city Pattaya (Q170919) and the fruit dragon fruit (Q232755) aka "Pitaya" really something which anyone would ever confuse, only because they are spelled a little similarly? see this diff I only use that property for items with the same name and the same (or very similar) type, like two same-named hills, especially if close to each other. Ahoerstemeier (talk) 18:04, 13 April 2020 (UTC)[reply]

Confusion is very much a cultural thing. You are taking the example of the city Pattaya (Q170919) and the fruit dragon fruit (Q232755). The fruit is very little known in France/Europe (better known as "fruit du dragon", not "pitaya"), hence a confusion is possible. That's why enwiki has shown a disambiguation template for a long time, and thus Wikidata should too. Take into account that the main aim of different from (P1889) is to prevent people from merging. Cheers Bouzinac (talk) 18:22, 13 April 2020 (UTC)[reply]
@Bouzinac: I understand the logic behind your edits but I think you should not do it mindlessly nor on the sole basis of WP:EN. For example, you linked Quintilian (Q193769) with names of large numbers (Q1151232), because Quintillian links to Quintillion which is a redirect (!) to Names of large numbers. I can't imagine a situation where a user would mix up the resulting two Wikidata items and merge them. --Jahl de Vautban (talk) 22:07, 13 April 2020 (UTC)[reply]
Yes, according to the English, quintillion is a very large number, not to be confused with Quintillian, by paronymy/misspelling. Perhaps a French speaker wouldn't confuse them but perhaps an English speaker would. Again, it is not my creation but a need (a very light need, I admit) put into the English Wikipedia. It has been populated and farmed from the presence of the template Distinguish; I am also farming frwiki with the template Confusion, using the HarvestTemplate tool here. I tried other languages but it didn't work very well, because the distinguish templates are used differently. Bouzinac (talk) 22:18, 13 April 2020 (UTC)[reply]
  • I don't think different from (P1889) has the same function as a disambiguation template on Wikipedia, or storing paronyms in general. EnWiki also doesn't profit from us setting the disambiguation property this way. I don't think our users are going to try to merge a city with a fruit, as it's easy to understand that the two are different without different from (P1889) having been set. I don't see the value in farming those templates for setting different from (P1889).
Paronymity is also a property of lexemes and not of items towards which the lexemes point. It's different in different languages and listing it in the item will confuse users about the distinction between names and their referents. ChristianKl09:07, 14 April 2020 (UTC)[reply]
[14] is the very property of disambiguation. It is sometimes far-fetched, I agree, sometimes very helpful. It may be far-fetched for you and helpful for others. See for instance Leo Tolstoy (Q7243) and Lev Lvovich Tolstoy (Q1152362), or Giordano Bruno (Q36330) and Bruno Giordano (Q613257). In their wikipedias, someone decided confusion is possible between A and B, therefore different from (P1889) should also reflect that. On top of that, I feel that property could also be helpful as a kind of "see also". Bouzinac (talk) 13:12, 14 April 2020 (UTC)[reply]

Proposal towards a multilingual Wikipedia and a new Wikipedia project

I sent a long email to the Wikimedia-l list and also made the same post to Meta. I published a new paper recently with a proposal for a multilingual Wikipedia and more, and, unsurprisingly, Wikidata plays a central role in that proposal. I am trying to have the discussion not to be too fragmented, so I hope it will happen on Meta or on Wikimedia-l, but I also wanted to give a ping here. Stay safe! --Denny (talk) 01:07, 14 April 2020 (UTC)[reply]

On the website of the Stedelijk Museum Amsterdam (Q924335), the year of production of an artwork is at least sometimes displayed as "0000" when it is unknown, e.g. [15]. This has been imported, generally by BotMultichill, to dozens of items at a minimum, as a statement of inception (P571) being 0, with most of the examples I've looked at being from artists born in the 19th century and dying in the 20th. I think this is pretty clearly an issue, but what is the best way to resolve it? Should these incorrect statements be marked as deprecated, or should they really just be removed as a quirk of the museum's website that got propagated here by a bot? (I tried removing a few, and the bot added them back; I'm not sure what to do about that either.) Is it OK to leave these items without a non-deprecated date statement, or is there an appropriate way to indicate in the items that these works were created within the artists' lifetimes, even if no more specific date is available? I'm not quite sure how to handle this after looking for more information (e.g. Help:Dates, Help:Ranking), and would be interested in any help others can provide.
Affected items include: Kamerinterieur (Q24056498), De danseres Angelica Velez (Q24056105), (Vaas met bloemen) (Q24055222), Zittende man (Q24063088), In de omgeving van Schiedam (Q24060208), Huisjes met bleekveld (Q24059967), De l'intérieur à l'extérieur (Q63184566), Azalea (Q24063419), Landschap met lage duinen (Q24063134), Stilleven paarse bloemen (seringen) (Q24055387), Tuin met maaier (Q61856808), Zelfportret (Q61856615), Landschap met koe (Q24056789), Stilleven met boeken (Q24055423), Muziekinstrumenten (Q24057996), Post-Office Witte Brug at Scheveningue (Q24062267), Débarquement de la pêche (Q24055382), Zonder titel (Q24055394), Still Life (Q61856867), Stierkalf (Q61856729), Landschap met geiten (Q24063478), Kerkinterieur (Q24055430), Avondstemming (Q24063099), Vaas met narcissen (Q24060173), Dienstmeisje (Q24056555), Kinderen aan het strand (Q24056568), Q24056888, Vue de Paris (Eglise de St. Germain l'Auxerrois) (Q24063007), Stilleven chrysanten (Q24055392), Paysage (Q24063009), Stalinterieur (Q24063506), Seascape (Q61857024), Vrouwenportret (Q61856655), Landschap met ven, bij Hilversum (Q24056091), Zelfportret (Q24056525), Ruiters onder een poort (Q24055149), Thunderstorm (Q61857023), Bosrand (Q24057948), Spuistraat bij het Kattegat (Q24056516), Landschap (Q24056443), Begijnhofje te Haarlem (Q61856983), Bloemstilleven (Q24056877). Thanks, Jamie7687 (talk) 01:38, 14 April 2020 (UTC)[reply]

There's been a fair amount of discussion of late about debatable imports by bots, without definitive resolution that I'm aware of, so my suggestion here should not be considered authoritative.
I would agree with you that the 0-valued statements would best be deleted, but you're right, the bot just reinserts them.
So I would suggest (a) lowering the 0-valued statements to a rank of 'deprecated', and perhaps adding a reason for deprecated rank (P2241) qualifier, if we can figure out what the reason for deprecation should be. See also Help:Ranking#Deprecated_rank and Help:Deprecation. —Scs (talk) 03:50, 14 April 2020 (UTC), edited 04:16, 14 April 2020 (UTC)[reply]
(ec) As a trial run, I've deprecated the offending date at Zelfportret (Q61856615) and added a reason of error in referenced source or sources (Q29998666). (It might be more accurate to use a deprecation reason of "value unknown in referenced source", but there's nothing like that in the list of official deprecation reasons at Q52105174.) —Scs (talk) 04:04, 14 April 2020 (UTC)[reply]
Slightly better reason: explicitly designated as wrong in other source (Q76449977). —Scs (talk) 04:10, 14 April 2020 (UTC)[reply]
  • These seem to be old items. Rather than trying to fix them manually, I'd ask the bot operator to look into it. If the bug has been fixed since, it should be fairly easy to delete the statements. --- Jura 04:00, 14 April 2020 (UTC)[reply]
BTW, this reminded me of some other bug: Wikidata:Bot_requests#Cleanup_collection_size_0_= --- Jura 04:16, 14 April 2020 (UTC)[reply]
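For anyone wanting to list the affected statements in bulk rather than item by item, a sketch of a query on query.wikidata.org (assuming the bad values were imported with year 0 and that the items carry collection (P195) pointing at the Stedelijk):

```sparql
# Works in the Stedelijk Museum collection (Q924335) whose inception year was imported as 0.
# p:/ps: is used instead of wdt: so already-deprecated statements are also found.
SELECT ?item ?itemLabel ?date WHERE {
  ?item wdt:P195 wd:Q924335 ;
        p:P571/ps:P571 ?date .
  FILTER(YEAR(?date) = 0)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Such a list would also make it easy to verify, after the bot is fixed, that no year-0 statements remain.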

Wikidata weekly summary #411

Léon Auscher and Category:Léon Auscher

Hello, there is something wrong with the items Léon Auscher (Q65088391) and Category:Léon Auscher (Q90567396) and the connections to fr:Léon Auscher and commons:Category:Léon Auscher. Can somebody correct it? --Havang(nl) (talk) 16:41, 14 April 2020 (UTC)[reply]

@Havang(nl): ✓ Done. There are two items: Léon Auscher (Q65088391) for the person (with Wikipedia link) and Q90567396 for the category. Tubezlob (🙋) 16:54, 14 April 2020 (UTC)[reply]
@Tubezlob: Thanks. --Havang(nl) (talk) 17:27, 14 April 2020 (UTC)[reply]
A separate category item is unnecessary; the Commons category can be added to Q65088391. Q90567396 was created linking to French Wikipedia and Commons. It was then made into a category, based on the English label, and there was an existing item with no sitelink that the Wikipedia link could be moved to, but the "Category:" shouldn't have been there. Peter James (talk) 21:21, 14 April 2020 (UTC)[reply]
Another example of this: when I created Q90585603 it had "Category:Cwm Pennant" as the English label and "Cwm Pennant (Powys)" as the Welsh label. Commons isn't an option when creating an item from Wikipedia; an item created by adding a link to Wikipedia from a Commons category makes it a category in English but not in the language of the Wikipedia article. Peter James (talk) 22:02, 14 April 2020 (UTC)[reply]
Cleaned it up. --- Jura 22:05, 14 April 2020 (UTC)[reply]

Terms of use violation while importing dataset

I'm writing here, because I don't believe anything will be done about this by the user involved (judging by the previous indifference about reported problems with imports). GZWDer just imported the CAS COVID-19 Anti-Viral Candidate Compounds (Q90481889) dataset (with 2 constraint violations per item and with a pseudo-reference, by the way), but the terms of use seem quite clear: this is not CC-0. What's more, it seems that the data is copyrighted. What should be done now? Wostr (talk) 16:52, 14 April 2020 (UTC)[reply]

A lot of datasets are not CC-0 but importable, given that no copyright can be claimed on an individual piece of data.--GZWDer (talk) 16:55, 14 April 2020 (UTC)[reply]
@GZWDer: In this case it doesn't apply, but do note that copyright can be claimed on individual pieces of data due to EU database copyright. It doesn't apply here though since American Chemical Society (Q247556) isn't a European legal entity. --SixTwoEight (talk) 21:03, 14 April 2020 (UTC)[reply]
So are you 100% sure that the data you've imported is now CC-0? Wostr (talk) 20:36, 14 April 2020 (UTC)[reply]

How can I add an interwiki link on a discussion page to the German Wikisource? Is this possible, and where can I find the abbreviation (the prefix code of the wiki) I need to use for that? I haven't found it at MediaWiki, but maybe I haven't looked at the right page. --Hogü-456 (talk) 20:22, 14 April 2020 (UTC)[reply]

You can see a list of all interwiki prefixes on the German Wikisource at https://de.wikisource.org/wiki/Spezial:Interwikitabelle . You should be able to find a corresponding table on every project under Special Pages. Bovlb (talk) 21:20, 14 April 2020 (UTC)[reply]

Tiles signs

Is there any Wikidata item I can use for files in Commons:Category:Station tile signs in the Netherlands? When I search for 'tiles' in Wikidata I get nowhere. Smiley.toerist (talk) 09:29, 15 April 2020 (UTC)[reply]

Mostly in combination with P1071 (location). Smiley.toerist (talk) 10:23, 15 April 2020 (UTC)[reply]
Agreed, but then I have to create a whole top-down structure: 'tile', 'tile signs', 'station tile signs' and 'Station tile signs in the Netherlands'. I have little experience in this and don't want to upset the whole applecart. There is Q3695082 (sign), Q468402 (tile), Q55488 (railway station) and Q55 (the Netherlands). Which combinations are useful? Most information can be derived in other ways. A specific railway station has a location, a country and of course is a railway station. It seems to me that only the Q3695082 (sign) and Q468402 (tile) combination is useful to create. Smiley.toerist (talk) 10:59, 16 April 2020 (UTC)[reply]

Is there a way to get the OWL file for wikidata?

The only ontology that I have found so far is http://wikiba.se/ontology-1.0.owl. Is there a different url / method to retrieve / navigate the ontology?

--Helt cs (talk) 13:16, 15 April 2020 (UTC)[reply]

@Helt cs: The OWL file you cite is just for the structure of the RDF embedding. I'm not sure exactly what you're looking for, but:

See also Wikidata:RDF. Bovlb (talk) 13:37, 15 April 2020 (UTC)[reply]

@Bovlb: In object-oriented programming, there is a difference between the structure of an object (the class) and the actual object (the instance). The same is true for classic relational databases, such as MySQL or Postgres, where there is the database schema and the actual rows (instances) which populate the tables. I am looking for something similar for Wikidata, i.e. a description of the schema, without the instances themselves; I thought that OWL would be exactly that. If it isn't, or isn't applicable to Wikidata, I am curious why that is so, and what I could do to work around that issue.
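One way to think about it: Wikidata deliberately has no fixed OWL schema. Its classes and properties are themselves items, so the "schema" is emergent and lives in the data itself, via instance of (P31) and subclass of (P279) statements, and is explored with SPARQL rather than retrieved as a file. A sketch, using painting (Q3305213) as an arbitrary example class:

```sparql
# Walk the emergent class hierarchy: all superclasses of 'painting' (Q3305213)
SELECT ?class ?classLabel WHERE {
  wd:Q3305213 wdt:P279* ?class .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

Replacing `wdt:P279*` with `wdt:P31/wdt:P279*` would instead classify a given instance, which is roughly the "object to class" lookup familiar from OO programming.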

MediaWiki:Wikibase-validator-sitelink-conflict currently says: Site link $1 is already used by item $2. Perhaps the items should be merged and one of them deleted? Request deletion of one of the items at d:Wikidata:Requests for deletions, or ask at d:Wikidata:Interwiki conflicts if you believe that they should not be merged. This is the message that you see when you try to undo or rollback the deletion of sitelinks that have subsequently been added elsewhere.

I think this is bad advice, because it is not our practice to merge and then delete; rather, we merge and leave a redirect. This is important because Wikidata ids are intended to be permanent identifiers and may be used by external sites. Keeping the redirect allows eventual merging of records, whereas deletion does not.

I propose to change this message to:

Site link $1 is already used by item $2. Perhaps the items should be merged. Ask at d:Wikidata:Interwiki conflicts if you believe that they should not be merged.

Bovlb (talk) 16:27, 15 April 2020 (UTC)[reply]

 Strong support sure, the current formulation is really outdated! --Epìdosis 16:32, 15 April 2020 (UTC)[reply]
Note also the discussion at Wikidata:Requests_for_comment/Redirect_vs._deletion which suggests that Deleting is however appropriate if an item has not existed longer than 24 hours and if it's clear that it's not in use elsewhere. Bovlb (talk) 19:41, 15 April 2020 (UTC)[reply]

Groups of countries

I'm currently preparing a mapping of the country classification of the 20th Century Press Archives (Q36948990) to Wikidata. One problem I came across were groups of countries or territorial entities, such as European colonies in Africa (Q90696277) or Russian peripheral countries (Q90303093). At some point in time, these groups of countries were considered important enough that not only did newspapers publish about them, but they also formed a distinct category for collecting knowledge about the world. I have looked through some historical and current examples, but found no elegant solution for how they could be modelled:

  1. Should they be considered a class? (e.g., a subclass of dependent territory (Q161243)?)
  2. Should they be considered an instance? (e.g., an instance of administrative territorial entity of more than one country (Q15646667)?)

Both do not fit well. Finally I found geopolitical group (Q52110228), which however is not much in use. Are there other solutions out there, of which I'm not aware? Cheers, Jneubert (talk) 16:52, 15 April 2020 (UTC)[reply]

@Jneubert: What about something like instance of (P31) -> group (Q16887380) -> of (P642) -> country (Q6256)? --SilentSpike (talk) 19:20, 15 April 2020 (UTC)[reply]
@SilentSpike: Thanks, this is indeed a valid option, which did not come to my mind. I hesitate however, because it is much more difficult to query, and leaves a lot of room for variations (group -> of -> country|state|colony|...).
I'm now considering a rename of geopolitical group (Q52110228) to "group of countries", for which the description "group of independent or autonomous territories sharing a given set of traits" would fit perfectly (as would the other statements). For a group like European colonies in Africa (Q90696277), where the common status was forcefully imposed, the current label "geopolitical community" definitely does not fit. The change would make clear that the group is not necessarily self-defined (which is also in line with its current use for e.g. Four Asian Tigers (Q190918)).
If there are no objections here, I'd go forward and suggest the change on the item's talk page. Cheers, Jneubert (talk) 06:53, 16 April 2020 (UTC)[reply]
How about "geographic region" linked with "part of" from the relevant items? --- Jura 07:34, 16 April 2020 (UTC)[reply]
I had considered "geographic region" as well, but that implies a (somehow contiguous) geographic shape, not a political grouping of possibly geographically disparate entities (e.g. the "geopolitical community" G4 nations (Q838116), that is Brazil, Germany, Japan, and India, or the "European colonies ..." example above). So it could be used in some cases, but not in others. Jneubert (talk) 08:13, 16 April 2020 (UTC)[reply]
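The querying difficulty mentioned above can be made concrete. The following is a sketch (not an endorsed pattern) contrasting the two modelling options as SPARQL for the Wikidata Query Service: a dedicated class needs one truthy triple, while the group (Q16887380) plus of (P642) pattern needs the full statement node, a qualifier, and an explicit list of P642 targets for every variation.

```python
# Option 1: items that are instances of a dedicated class,
# e.g. geopolitical group (Q52110228) -- a single truthy triple.
QUERY_CLASS = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q52110228 .
}
"""

# Option 2: items modelled as instance of (P31) group (Q16887380)
# qualified with of (P642) -> country (Q6256). Variants such as
# "group of states" or "group of colonies" would each need another
# row in the VALUES clause.
QUERY_GROUP = """
SELECT ?item WHERE {
  ?item p:P31 ?stmt .
  ?stmt ps:P31 wd:Q16887380 ;
        pq:P642 ?target .
  VALUES ?target { wd:Q6256 }
}
"""
```

Either string can be sent to the public endpoint at https://query.wikidata.org/sparql; the point is simply that option 2 is both longer and open-ended.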

There are quite a few items on Wikidata whose topics are covered together in one article on Wikipedia, but which do have a redirect leading to that main article. I have seen this very often with list articles or other articles where two very similar topics are described.
On the item itself I saw that you can select the redirect page, and there even is some sort of qualifier that says it's an (intentional) sitelink to a redirect. Yet I only get the error "Could not save due to an error. The save has failed."
Any ideas? --D-Kuru (talk) 19:14, 15 April 2020 (UTC)[reply]

D-Kuru See Help:Sitelinks#Linking_to_Wikimedia_site_pages. If a redirect already exists on Wikipedia, the simplest way to link it to Wikidata is to temporarily break the redirect (e.g. remove the "#"), connect Wikidata to the page, then restore the redirect to its original target. See for instance the revision history of Guy Lombardo and His Royal Canadians. -Animalparty (talk) 21:22, 15 April 2020 (UTC)[reply]
Thanks for the hint. Even though I understand the reason behind it, it is still a stupid idea if you want to connect to a redirect. It takes two extra edits and more time for the exact same result. --D-Kuru (talk) 23:04, 15 April 2020 (UTC)[reply]
See phab:T54564 for a proposal to permit sitelinks that are redirects. Bovlb (talk) 23:41, 15 April 2020 (UTC)[reply]
It takes less effort to link to a Wikipedia redirect than to create a new article or redirect, or even to add a couple of additional properties to the item. I don't know exactly why one can't link directly to a redirect immediately (possibly to prevent inadvertently linking to typos or alternate spellings of existing items as opposed to sub-topics: there are multiple classes of redirects), but I don't think the process is stupid. -Animalparty (talk) 04:28, 16 April 2020 (UTC)[reply]
I believe the rationale is: 1) most redirects represent synonym expressions; 2) an item with a redirect sitelink is therefore likely to be a duplicate of another item (with the redirect target); and 3) we don't like having duplicates. Bovlb (talk) 04:41, 16 April 2020 (UTC)[reply]

Birth place or place of birth register

Hello all!

Adam Alsing (Q3355833) was physically born in Stockholm (Q1754). "The problem is that the field [ place of birth (P19) ] normally is where the mother of the child lived, or more exactly, where she was filed in the registry books when the birth took place". Most sources state that the person was born in Karlstad (Q25457), due to that being the birth domicile (Q10501047) (place of register). How should this be denoted properly on the item? Both can't be used for birthplace, right? Or if they should, which should get the preferred rank?

This was discussed in 2015 without much success or progress at Property talk:P19#method, and since this person, unfortunately, passed away yesterday due to COVID-19 (Q84263196) this is a high-impact item (in Sweden). (tJosve05a (c) 22:43, 15 April 2020 (UTC)[reply]

Personally, I've assumed it was supposed to be geographical place where the birth took place. Some people are even born in transit, e.g., in a ship or aircraft. The only time I remember it making a difference in an item I edited is Stewart McSweyn (Q36672959); his parents were from King Island, but since that's a small island with few medical facilities, people are generally not born there. Ghouston (talk) 23:55, 15 April 2020 (UTC)[reply]
If both can be sourced, one would want to include either anyways, ideally qualified with, e.g., criterion used (P1013). The question is then which rank to use. If this can't be answered all would have normal rank. --- Jura 05:04, 16 April 2020 (UTC)[reply]
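Jura's suggestion can be sketched in the Wikibase JSON claim model: two place of birth (P19) statements, each qualified with criterion used (P1013), both left at normal rank since neither reading is clearly preferred. This is only an illustration; the criterion item IDs "Q_PHYSICAL" and "Q_REGISTER" are hypothetical placeholders, and suitable real items would have to be found first.

```python
def item_snak(prop, qid):
    """Build a wikibase-entityid value snak for a property/item pair."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "id": qid},
        },
    }

def pob_claim(place_qid, criterion_qid, rank="normal"):
    """One P19 statement qualified with P1013 (criterion used)."""
    return {
        "type": "statement",
        "rank": rank,  # both stay "normal" if neither reading is preferred
        "mainsnak": item_snak("P19", place_qid),
        "qualifiers": {"P1013": [item_snak("P1013", criterion_qid)]},
    }

claims = {
    "P19": [
        pob_claim("Q1754", "Q_PHYSICAL"),   # Stockholm, physical birthplace
        pob_claim("Q25457", "Q_REGISTER"),  # Karlstad, place of register
    ]
}
```

A reference block citing the source for each value would be added alongside each statement in the same way.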
For the most part, modern man is going to be born in the town of the nearest hospital. Which is not going to be the town of "OMG, my water broke, let's go to the hospital", or the town where one resides immediately after. I always list place of actual birth, which would be the town with the hospital, not the address filled in on the admissions form. Quakewoody (talk) 10:34, 16 April 2020 (UTC)[reply]
Wikipedia is an international project, so words have the meaning they have in ordinary language, not the meaning they have in some obscure law or rule in some individual country, state, or province. The place of birth is the physical place of birth, regardless of any strange definition that might exist in some law.
Laws do create requirements about where the birth should be registered, where the birth records are kept, where the records may be accessed, and who may access the records. Often, this is irrelevant for Wikidata because for people who are still alive or who died recently, the official records of their birth are not accessible to the general public, and even if they were, it would be difficult to reliably determine that a certain official birth record refers to some well-known adult who is the topic of a Wikidata item. Therefore, Wikidata should rely on reliable independent published sources, not official birth records.
In the cases where an official birth record would be appropriate to use, if there is an "official" birth location that differs from the physical birth location, that should be regarded as a matter of referencing (that is, something that helps in finding the correct official birth record). It is not a fact that pertains to the person, only a referencing detail. Jc3s5h (talk) 10:59, 16 April 2020 (UTC)[reply]
To return to the original question: I think the question was how to record the "place of birth" that appears in w:Swedish_passport#Identity_information_page. place of birth (P19) seems fine. --- Jura 11:15, 16 April 2020 (UTC)[reply]