Shortcuts: WD:PC, WD:CHAT, WD:?

Wikidata:Project chat


Wikidata project chat
A place to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.

Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Other places to find help

For realtime chat rooms about Wikidata, see Wikidata:IRC.
On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/04.

Problems with naive user merges with Distributed game: duplicate authors

I just spent several hours going through and fixing this new user's edits, which were almost all merges of human items based on Magnus Manske's "The Distributed Game (88): Duplicate authors #distributed-game". Of the 59 merges done, I assessed 26 as wrong or unjustified - 44%. Many of them merged people with wildly different names (the tool seems to only check last name and at least one matching initial?). I know there's some underlying logic regarding common co-authors, but whatever it's doing, something there is broken. Can the tool be limited to users who are at least auto-confirmed and have done something else in Wikidata first? Does anybody have another suggestion here? It's certainly a useful tool, and a (slight) majority of the edits here were good, but it's an awful lot of work to fix bad merges - and that would have been worse if I had waited until the bot redirected all the associated articles to the new items. ArthurPSmith (talk) 20:22, 2 April 2024 (UTC)

I'm troubled by experienced editors having to spend hours on a cleanup that should have been unnecessary. The game has a tiny notice that's barely noticeable: "Please make sure that the items are really about the same entity!". Gamifying merging is giving people the impression that this is something they can take lightly (!). In my opinion it's not enough to limit this game to people who are autoconfirmed. The message should be visually more noticeable and read something like "It is important to make sure the two items really are the same before merging, so please click on each of the links to examine their contents before merging." It should come with a checkbox labelled "Please confirm you've read this and understood", which saves the confirmation to the user's settings. This way it's harder to claim ignorance. Infrastruktur (talk) 12:29, 3 April 2024 (UTC)
Thanks for the suggestions. @Magnus Manske: Can the labeling be at least changed? Or maybe the underlying logic needs a look? It shouldn't be merging "Yonghoon Choi" with "Yunsoo Choi", or "Chang-Bao Li" with "Chuanyou Li" for example. Also I have a concern that one bad merge will lead to others - if two people were not actually coauthors of the same person, but a bad merge makes them seem like they are, this can have cascading effects. ArthurPSmith (talk) 16:54, 3 April 2024 (UTC)[reply]
I have also had to spend a fair bit of time recently checking and unmerging edits made via this game and I echo the points raised above. For me, it shouldn't be possible to merge items with different ORCID iD (P496) claims, and a constraint to prevent the merger of such items would avoid most of the issues I have encountered. Having looked at the tool, I am surprised how little guidance is given about how to identify items that can safely be merged. There really must be a warning that the information provided in the tool is insufficient to make an informed decision about whether two items represent the same entity. I have to admit I'm not a fan of gamification, but this trivialises something that is actually quite complicated, hence the amount of time required to fix the incorrect merges. Simon Cobb (User:Sic19 ; talk page) 22:50, 3 April 2024 (UTC)
I will amend the message in the game, and have a look at the duplicate candidate generator. --Magnus Manske (talk) 13:36, 4 April 2024 (UTC)[reply]
Thanks for looking at this. It really is useful to merge duplicates, but bad merges can be quite hard to fix. ArthurPSmith (talk) 19:24, 4 April 2024 (UTC)[reply]
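The ORCID-based safeguard suggested above could look roughly like the following. This is only a minimal sketch, assuming each item is available as a plain dict mapping property IDs to sets of values; `conflicting_orcids`, `can_merge` and the sample items (with made-up identifiers) are hypothetical and are not part of the game's actual code.

```python
ORCID = "P496"  # ORCID iD property

def conflicting_orcids(item_a, item_b):
    """Return True if both items carry ORCID iDs and share none of them."""
    a = item_a.get(ORCID, set())
    b = item_b.get(ORCID, set())
    return bool(a) and bool(b) and a.isdisjoint(b)

def can_merge(item_a, item_b):
    """Block the merge only when the ORCID iDs clearly disagree."""
    return not conflicting_orcids(item_a, item_b)

# Hypothetical items with made-up identifiers.
x = {ORCID: {"0000-0000-0000-0001"}}
y = {ORCID: {"0000-0000-0000-0002"}}
z = {}  # no ORCID iD at all

print(can_merge(x, y))  # False
print(can_merge(x, z))  # True
```

A check like this only rejects pairs whose identifiers disagree; it says nothing about pairs where one or both items lack an ORCID iD, so those would still need human review.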
@Sic19, Magnus Manske: A followup - I've been tracking these edits recently and they seem much improved, though there are still several problem cases. One frequent problem now is caused by a Wikidata entry based on an ORCID id where the name does not match the ORCID - rather somehow the name is that of a different author on some co-written articles. I'm guessing there are some data flow issues between ORCID and publishers and Crossref and whatever data source was used for wikidata imports (usually Europe PubMedCentral?). Can some sort of look-up be done on the ORCID id's before merging to confirm the names actually match? Not sure what best steps here are. There are definitely a lot of duplicate ORCID cases too and it would be a shame not to put together those duplicates on our end. Another common case I'm running into is where two people have the same name, and one of their ORCID records includes papers from the other person, usually because the paper list was supplied by some institutional search rather than the person themselves. Hard to fix issues where ORCID records themselves are incorrect. ArthurPSmith (talk) 17:08, 11 April 2024 (UTC)[reply]
The mismatching names in ORCID and Wikidata is quite a subtle problem. From the examples I've seen, the problem is usually caused by an ORCID being associated with the wrong co-author on the version of record and then being replicated in Crossref. To identify errors, I've compared the names in the ORCID dump and Wikidata labels. There are ~3500 items that are very likely to represent a different entity to that of the attached ORCID and a further 800 merged items with 2 or more ORCIDs that are now conflating different identities from ORCID. I'm not sure what to do now as the relationship between these items and any authored papers is messy. Do you have any thoughts about how to fix these items without ending up with papers with incorrect authors? Simon Cobb (User:Sic19 ; talk page) 21:06, 16 April 2024 (UTC)[reply]
@Sic19: Ooh hey that's great that you've already done this! Here's how I've been fixing these so far, but it's labor-intensive:
1. Fix the label(s):
  • If there's not much metadata (i.e. no sitelinks, maybe just P31 and P106 main statements) and only the ORCID id, change the English label to match the ORCID label and remove other labels. This is only appropriate if the ORCID label is in Latin text of course.
  • Otherwise deprecate the ORCID id on the item with a reason for deprecation that it's about a different subject.
2. Go to "What Links Here" and see what articles are linked (should be P50 relations).
  • If it's a relatively small number (up to 5) go through them by hand and check the name on the original article against what Wikidata has, and correct the entries by hand if necessary (adjust series ordinal and/or object named as values, or remove the author (P50) relation completely and replace it with an author name string (P2093) instead).
  • For larger numbers I bring up the author disambiguator author page and look for any patterns I can to mass-fix the entries. For example if there is a clear group of papers with a different object named as (P1932) qualifier from the actual name, those can be filtered and mass reverted in the tool. Unfortunately a lot of these cases are missing object named as (P1932) qualifiers, which makes this harder.
Could you make your list available for others to work on to fix these? ArthurPSmith (talk) 21:30, 16 April 2024 (UTC)[reply]
@ArthurPSmith: The lists of items with an incorrect label and bad merges are in this file: Wikidata-ORCID data issues 20240417.zip.
That approach to fixing the issues is similar to what I've been doing but the author mode in the author disambiguator will save time, thanks! It would be great if we could develop a process to bulk update some of these items. I'm going to experiment with using the Crossref API to check the authors on the linked papers at some point. Simon Cobb (User:Sic19 ; talk page) 17:14, 17 April 2024 (UTC)
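For anyone wanting to reproduce the kind of name comparison between ORCID names and Wikidata labels described in this thread, here is a rough sketch. It is not the method actually used to build the lists above: the normalisation steps, the `names_match` helper and the 0.9 threshold are all assumptions chosen for illustration.

```python
import unicodedata
from difflib import SequenceMatcher

def normalize(name):
    """Casefold, strip accents and punctuation, and sort the name parts."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.casefold())
    return " ".join(sorted(cleaned.split()))

def names_match(orcid_name, label, threshold=0.9):
    """Return True if the two names are similar enough after normalisation."""
    ratio = SequenceMatcher(None, normalize(orcid_name), normalize(label)).ratio()
    return ratio >= threshold

print(names_match("Choi, Yonghoon", "Yonghoon Choi"))  # True
print(names_match("Yonghoon Choi", "Yunsoo Choi"))     # False
```

Sorting the name parts makes "Family, Given" and "Given Family" compare as equal, but it would not help with initials, transliteration variants or non-Latin labels, which would need separate handling.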

Dealing with different Google Knowledge Graph IDs for same item

Atoiya (Q4286366) has two values for Google Knowledge Graph ID (P2671), triggering a Potential Issue. The first value, /g/12264jtz, was added by Lockalbot (@Lockal:); the second value, /g/122z2rt5, was added when 芝高 merged Q15621844 into Atoiya (Q4286366) (to Q15621844, the value had been added by Lockalbot, too).

What is the best way to deal with this? Judging from the descriptions of the items prior to merging, it seems that Q15621844 referred to the cape (and as part of Japan, which has an ongoing dispute over the region in question with Russia, which controls it) while Q4286366 has the English description “human settlement in Yuzhno-Kurilsk, Sakhalin Oblast, Russia”. Should the two items be unmerged perhaps? --Data Consolidation Officer (talk) 11:05, 6 April 2024 (UTC)[reply]

I restored the separate item for the cape. Both Google Knowledge Graph IDs are currently "Cape Atoiya" with no additional information so I don't know if they should both be in the cape item. Google Knowledge Graph has many duplicates, sometimes for the same name and others with different names; this is usually the result of merging pages and is not an issue - the constraint should probably be removed or have deprecated rank similar to VIAF ID (P214). Peter James (talk) 12:32, 6 April 2024 (UTC)[reply]
Agreed. We should consider removing the constraint limiting an item to a single Google Knowledge Graph identifier. There are often multiple knowledge graph ids describing the same thing. Iamcarbon (talk) 20:45, 9 April 2024 (UTC)[reply]
What’s the point of having Google Knowledge Graph ID (P2671) as an external identifier if it does not uniquely (and comprehensibly) identify a concept? (For transparency, I remark that this very property has always appeared fishy to me. Its values simply redirect to Google searches, without any means of telling which concept, if any, the “knowledge graph ID” represents, as opposed to a simple but ambiguous search string.) --Data Consolidation Officer (talk) 15:48, 13 April 2024 (UTC)[reply]
In most cases, these ids are unique and comprehensive -- and when used with the knowledge graph API (https://cloud.google.com/enterprise-knowledge-graph/docs/search-api), can be used to verify item details. One particular case I've found them super helpful is with identifying works of art via Google Vision, which returns a Google Knowledge Graph Id / Freebase Id. These can be used to lookup the Wikibase item and find additional object details.
In some rare cases, some entities have multiple identities, usually with a slight name difference. For example: "iPhone" and "Apple iPhone". They both describe the same thing. Iamcarbon (talk) 05:49, 16 April 2024 (UTC)
@Data Consolidation Officer: the kgids disambiguate between identically named items. it looks like it just redirects to a search for a string but it isn't really. BrokenSegue (talk) 05:51, 16 April 2024 (UTC)[reply]
@BrokenSegue: I suspected it does, but it always looked to me as if when following the link, Google just searches for a specific term. In the case of Atoiya (Q4286366), the term was identical (“Cape Atoiya”) for both identifiers (and the search results contained pages of online shops where you could buy a cape, way off), something that has always made me question the usefulness of these identifiers. (Users of Google’s enterprise API may be able to make more use of them, though, judging from Iamcarbon’s post.) --Data Consolidation Officer (talk) 10:26, 21 April 2024 (UTC)[reply]

Birth after father's death

We need to change the calculation so that we do not get the error message unless the difference between death and birth is more than 9 months, perhaps 10. See George Francis Valentine Scott Douglas (Q75268023) RAN (talk) 11:54, 10 April 2024 (UTC)[reply]

People have used "exception to constraint" to get rid of these messages, but this doesn't scale. The next step people tend to use is adding "separator" but "object has role" is too ambiguous for this purpose, we would need a qualifier that is more or less single-purpose. Got any ideas? Make a new one maybe? Infrastruktur (talk) 15:52, 10 April 2024 (UTC)[reply]
Just noticed "separator" only works for the single value constraint. :-( But it does seem like a good way to mark claims that are manually checked. @Lucas Werkmeister (WMDE): Something similar for contemporary constraint might be a good idea. At least it's a solution that doesn't have complexity issues. Infrastruktur (talk) 18:58, 10 April 2024 (UTC)[reply]
@Infrastruktur: I don’t understand what you’re trying to do. What does this have to do with a “separator”? What are the date of birth / date being separated from? Lucas Werkmeister (WMDE) (talk) 10:18, 11 April 2024 (UTC)[reply]
Forget about "separator", that was my mistake. I was interested in hearing if you thought it would be feasible to add a way to manually mark claims such as child (P40) with a qualifier, basically telling the constraint checker that this claim has been manually checked, so don't show an error message here. Basically doing what "exception to constraint" does, except the exception info is moved to the claims themselves, so it should be more scalable I guess. Infrastruktur (talk) 16:24, 11 April 2024 (UTC)
I wouldn’t mind adding that, I think… should be relatively simple to implement in WBQC, at least. It’s not an ideal solution, but it’s not like we have any much better solutions lined up either (“constraint exceptions don’t scale” has been a known issue for a while, and the last proposal I dimly recall, which I think would’ve encoded exception lists as additional items, was probably worse). My main concern would be that people would object to these qualifiers, but maybe I’m being too paranoid there ^^ Lucas Werkmeister (WMDE) (talk) 09:35, 12 April 2024 (UTC)[reply]
I was thinking about how to encode the exception. If we use a single new qualifier "constraints manually checked" that would remove warnings for any and all constraints, which might be ok. Trying to encode information about individual exceptions in the predicate position strikes me as a bad idea, but they could be encoded in the object position if there was a URI prefix reserved for this purpose. The "wdno:" prefix encodes which property it pertains to, so likewise an "wdnoexception:" prefix could encode which property and exception it pertained to e.g. "?statement_node pq:P99999 ("constraint manually checked") wdnoexception:P40-Q25796498". Edit: Or maybe something like "?statement_node wdnoexception:P40 ("constraint manually checked") wd:Q25796498" would be better after all? It would add new things to the data model so it's not something that can be rushed. Edit 2: Or since we know which property from the claim itself, we could do without any new URI prefix at all which is actually way better, e.g. "?statement_node pq:P99999 ("constraint manually checked") wd:Q25796498". Infrastruktur (talk) 13:56, 12 April 2024 (UTC)[reply]
I think any addition to the technical data model is a non-starter, to be honest – if we want this soon, we should just encode it using the normal data model and accept that it won’t be 100% precise. I was thinking we could reuse exception to constraint (P2303) = subject type constraint (Q21503250) as a qualifier, and when it appears outside of a property constraint (P2302) statement, reinterpret it as “ignore all constraints of this constraint type in this statement” (i.e. in the main snak, qualifiers and references). Or create a new property, of course. (In theory, we could use a URL-valued property, where the value is the URI of a property constraint like http://www.wikidata.org/entity/statement/P31-A89E967D-82B7-4081-BAE3-BADF28B4E7E3; this would let us distinguish between multiple constraints on the same property with the same constraint type, but it would look ugly in the UI and also be much harder to edit.) Lucas Werkmeister (WMDE) (talk) 15:55, 15 April 2024 (UTC)
I like your first suggestion. Would it be worth using exception to constraint (P2303) only for the main-snak, and make a similar property that is transitive (applies to qualifiers and references as well)? Infrastruktur (talk) 21:06, 15 April 2024 (UTC)[reply]
We could also do that, sure. Would need a new property proposal, I guess ^^
(Also, disclaimer: right now I’m not sure who’s “responsible” for pushing this proposal forward, if we want to go ahead with it; I’m not sure if I should be doing it as a staff account – I see myself more in the role of implementing it once it’s been decided. But I’m also not sure how much more consensus we need, or if we think it’s enough if nobody objected to reinterpreting P2302 here.) Lucas Werkmeister (WMDE) (talk) 15:38, 17 April 2024 (UTC)[reply]
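To make the qualifier-based idea discussed here more concrete, the check could be sketched as below. This is purely illustrative and is not WBQC code: the statement layout (a dict with a "qualifiers" mapping) and the `should_check` helper are assumptions, and Q25796498 is simply the constraint type item used in the examples earlier in this thread.

```python
EXCEPTION_TO_CONSTRAINT = "P2303"

def should_check(statement, constraint_type):
    """Return False if the statement opts out of this constraint type.

    A P2303 qualifier on an ordinary statement is read as
    "ignore all constraints of this type in this statement".
    """
    exempted = statement.get("qualifiers", {}).get(EXCEPTION_TO_CONSTRAINT, [])
    return constraint_type not in exempted

# Hypothetical child (P40) statement that has been manually checked.
statement = {
    "property": "P40",
    "qualifiers": {EXCEPTION_TO_CONSTRAINT: ["Q25796498"]},
}

print(should_check(statement, "Q25796498"))  # False, the exception applies
print(should_check(statement, "Q21503250"))  # True, other types are still checked
```

As noted in the discussion, this cannot tell apart two constraints of the same type on the same property, which is the imprecision that would have to be accepted.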
  • Perhaps we can do a search for all the children born within 10 months of the father's death and mark them all object_has_role=born after father's death (Q105083598) and rewrite the rule for the error message so that it is not triggered when object_has_role=born after father's death (Q105083598). I am not familiar with how the error message rules are coded to make the changes myself. Do we have an error message when a child is born after the mother's death, which would indicate that the child belongs to a different spouse of the husband? --RAN (talk) 18:16, 11 April 2024 (UTC)[reply]
I suspect a general implementation of such a check (not limited to humans) would have a high complexity cost. It also trades false positives for false negatives. Infrastruktur (talk) 18:44, 11 April 2024 (UTC)[reply]
Notified participants of WikiProject property constraints

Lucas Werkmeister (WMDE) (talk) 09:36, 12 April 2024 (UTC)[reply]

@Lucas Werkmeister (WMDE) how costly would it be to implement the solution that RAN proposed? ChristianKl 23:04, 18 April 2024 (UTC)
I can’t really answer that question, as I don’t think it’s really an implementable proposal yet… the constraint is defined here, and it’s not really clear to me how you would encode not triggered when object_has_role=born after father's death (Q105083598) in the constraint parameters. Lucas Werkmeister (WMDE) (talk) 13:56, 22 April 2024 (UTC)[reply]
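The tolerance asked for at the start of this thread (no error unless the birth is more than about 10 months after the father's death) can be illustrated with a small date calculation. This is a sketch under stated assumptions, not how the constraint checker is implemented: `add_months`, `born_too_late` and the sample dates are made up, and it ignores the fact that many Wikidata dates only have year precision.

```python
import calendar
from datetime import date

def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    total = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(total, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def born_too_late(father_death, child_birth, max_months=10):
    """True if the child was born more than max_months after the father's death."""
    return child_birth > add_months(father_death, max_months)

# Made-up dates for illustration.
print(born_too_late(date(2000, 1, 15), date(2000, 9, 1)))  # False
print(born_too_late(date(2000, 1, 15), date(2001, 1, 1)))  # True
```

With `max_months=10`, a child born eight months after the father's death raises no error, while one born almost a year later still does.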

Property documentation / Current uses

I apologize if this has been mentioned before, and I'll just repeat: the little box on the properties' talk page that summarizes the usage of the property in a table hasn't been working for a while, I think. Maybe this is it: Module:Property_documentation. Pallor (talk) 15:05, 12 April 2024 (UTC)[reply]

The bot that updates this information runs once a week, has it been longer than that? Infrastruktur (talk) 21:29, 13 April 2024 (UTC)[reply]
Infrastruktur You're right, it was updated today. I didn't know that, thanks. Perhaps it would help to understand how it works if the date of the next update was written in that section. Thanks for the answer! Pallor (talk) 11:44, 19 April 2024 (UTC)[reply]

Can't edit Wikidata

Hello! I'm a project manager for the Albanian Wikipedia and we were trying to do an activity dedicated to Wikidata today but we quickly noticed that very few of the involved members had the "Edit" button. Can someone tell me what is going on? I assume there should be a kind of threshold that maybe they have yet to reach but I have limited knowledge in that direction in regard to Wikidata. Any information would be appreciated. Thank you! - Vyolltsa (talk) 10:26, 13 April 2024 (UTC)[reply]

Were the people involved using a mixture of devices? The Wikidata interface on mobile devices is fairly limited. Alternatively, can you give an example of a page where the problem occurred? It may be that the page has been semi-protected due to vandalism by users that aren't logged in (semi-protection also blocks newly registered users). From Hill To Shore (talk) 11:40, 13 April 2024 (UTC)
@Vyolltsa: can you give us an example username that's having a problem? In general pages are open for editing even by new users. Though generally we suggest you preregister an event to prevent users from getting banned. BrokenSegue (talk) 14:35, 13 April 2024 (UTC)
@BrokenSegue where do events get preregistered? ChristianKl 17:10, 18 April 2024 (UTC)

On Vandalism Tracking

Since around the turn of the year, I have been working on a tool to efficiently track bad edits on Wikidata. The effort was motivated by my observations regarding patrolling work on Wikidata:

  • Special:RecentChanges lists too many changes. Even if you narrow it to unpatrolled changes and filter out other generally low-risk changes, you are left with a haystack that is replaced by another one in about two hours. Except for obvious cases, fast reviewing is impossible since you need to gather some context first.
  • Special:AbuseFilter works well for isolated cases/patterns or can help add tags to recent changes. But it is not scalable to all properties/languages/etc. and has access only to limited context.
  • Property constraints are a very well maintained, integrated and understood system. Every property has some constraints, and violation of a constraint means something is wrong or someone is doing something wrong. But reports of violations are scattered over thousands of pages with no indication of recency and no available integration with RecentChanges and AbuseFilter.
  • More ideas: [1][2].
  • See also Wikidata:Project chat/Archive/2024/02#An unexpected effect of Covid-19 on the Wikidata project?

I took the best of each and started compiling weekly reports of changes during a time window that introduced some constraint violations. It involved creating a Python library with algorithms for some (not all) constraint types and some custom ones.

The reports are available here: Special:PrefixIndex/User:MatSuBot/Reports/Vandalism. The most recent reports should be the most accurate since patrolling status is available for them. (Reports older than a month cannot use patrolling status, some entries could have been solved in the meantime.)

This is just a single step for better counter-vandalism on Wikidata, it cannot replace the inefficient patrolling process yet. The reports are not real-time, they cover cases of more-or-less obvious vandalism, and deal only with changes to claims (i.e., not labels and descriptions). More effort would be needed to establish an automated (maybe AI-powered) system for countering vandalism.

I'm curious if others find such reports useful or if there is a way to make them (more) useful. (Daily reports instead of weekly? Cover more types of violations? Missed cases of vandalism?)

--Matěj Suchánek (talk) 12:37, 14 April 2024 (UTC)[reply]
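As a rough illustration of the idea of reporting changes that introduced constraint violations within a time window, here is a sketch that walks through an item's revisions in order and remembers which revision introduced each violation that is still open at the end. It is not the actual library behind the reports: the data layout, the `attribute_violations` name and the sample data are assumptions.

```python
def attribute_violations(initial, revisions):
    """Map each violation still open at the end of the window to the
    revision that introduced it.

    initial:   violations present before the window started
    revisions: list of (rev_id, violations_after_rev), oldest first
    """
    open_since = {}
    previous = set(initial)
    for rev_id, current in revisions:
        current = set(current)
        for violation in current - previous:
            open_since[violation] = rev_id    # newly introduced here
        for violation in previous - current:
            open_since.pop(violation, None)   # fixed again within the window
        previous = current
    return open_since

# Hypothetical window: "b" is introduced and later fixed, "c" stays open.
window = [(1, {"a", "b"}), (2, {"a", "b", "c"}), (3, {"a", "c"})]
print(attribute_violations({"a"}, window))  # {'c': 2}
```

Violations that already existed before the window (like "a" here) are never reported, which keeps the report focused on recent changes.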

this is really cool work. you might want to reach out to the people in https://phabricator.wikimedia.org/T341820 where they are working on a new anti-vandalism model for wikidata. incorporating your library into their workflow seems like it'd be useful. the newest model is much better than ORES.
Did you consider incorporating ORES into your reports? In theory it should be ok at predicting vandalism.
I think enwiki has bots monitor vandalism rates and report that in a template (see https://en.wikipedia.org/wiki/Template:Vandalism_information). not sure if it's really useful though. BrokenSegue (talk) 18:55, 14 April 2024 (UTC)[reply]
I am aware of that initiative (I recall providing a handful of labels) as well as the Automoderator (which I believe is more focused on Wikipedia, though).
However, I don't think it's necessary (and a good idea) to insist on incorporating right now. I can see there is already some progress that I don't want to steer down. Also, my library reflects a community-maintained system, i.e., something that can change any time. If they had the model trained w.r.t. some snapshot of constraints, but we had it changed later, it could lose its precision (i.e., instead of , they'd need something like , which is definitely harder to model).
I think we can just keep them separate, and benefit from advantages of each.
"Did you consider incorporating ORES into your reports?" I'm not sure what you mean. Like indicating its score in the reports in a separate column? The problem with ORES is it evaluates each revision independently, not as a sequence (like I do).
"enwiki has bots monitor vandalism rates": Looks like that is based on the amount of recent reverts (real-time). However, I am dealing rather with backlog.
Idea: Instead of (or as another alternative to) a complex ORES model that tries to handle everything, have a very simple one using limited context (type of change, language changed, property involved, etc.) such that it allows to simply filter for changes that have somewhat higher probability of being reverted (IP changes to instance of (P31), sex or gender (P21), pseudonym (P742), English description, etc.). The goal is to allow synchronous patrolling using RecentChanges filters, while leaving the rest to report-based backlog. IMO, just using the "reverted" tag (mw-reverted) for training should be sufficient. (But see also phab:T357163.)
Another idea: see phab:T358729, but ignore the AbuseFilter part. (I really need to save my ideas somewhere.)
--Matěj Suchánek (talk) 09:02, 15 April 2024 (UTC)[reply]
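The "very simple model using limited context" idea above could be prototyped along these lines: estimate, from past edits labelled by whether they were reverted, how often each combination of features gets reverted, then flag new changes whose combination is above a threshold. Everything here (the feature tuples, `revert_rates`, `is_risky`, the 0.5 threshold and the sample history) is a hypothetical sketch, not an existing tool or model.

```python
from collections import Counter

def revert_rates(history):
    """Estimate the revert rate per feature combination.

    history: iterable of (features, was_reverted), where features is a
    hashable tuple such as (editor_type, property_id).
    """
    totals, reverted = Counter(), Counter()
    for features, was_reverted in history:
        totals[features] += 1
        reverted[features] += int(was_reverted)
    return {f: reverted[f] / totals[f] for f in totals}

def is_risky(features, rates, threshold=0.5):
    """Flag a change if its feature combination is reverted often enough."""
    return rates.get(features, 0.0) >= threshold

# Made-up training data for illustration.
history = [
    (("ip", "P21"), True),
    (("ip", "P21"), True),
    (("ip", "P21"), False),
    (("user", "P21"), False),
]
rates = revert_rates(history)
print(is_risky(("ip", "P21"), rates))    # True
print(is_risky(("user", "P21"), rates))  # False
```

Feature combinations never seen in the history default to a rate of 0.0 here, which is a deliberate simplification; a real filter would need some prior for unseen cases.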
@Matěj Suchánek: I am sure you are aware of Wikidata:WikiProject Counter-Vandalism - that would be a good place to save your ideas and add links to what you've done. ArthurPSmith (talk) 16:40, 15 April 2024 (UTC)[reply]
Update: There is now a link to the reports on Wikidata:WikiProject Counter-Vandalism. I intend to generate the newest report every Tuesday. I have also made improvements to detecting whether an instance of vandalism has already been dealt with. --Matěj Suchánek (talk) 09:21, 22 April 2024 (UTC)[reply]

Edit groups tool seems to be broken

Per the Edit groups discussion, there still seems to be a problem. Pintoch is apparently busy, so could someone please look into the matter? Kpjas (talk) 16:13, 15 April 2024 (UTC)

This needs more attention. Sdkbtalk 19:10, 22 April 2024 (UTC)[reply]

Twinkle or similar?

Does Wikidata have Twinkle or some similar tool for dealing with vandalism? Sjö (talk) 04:56, 16 April 2024 (UTC)[reply]

@Sjö: Wikidata:WikiProject Counter-Vandalism lists the counter-vandalism tools used here. Edits in Wikidata are generally very different from edits in regular wikipedia projects, so the tools applicable there don't work so well here. ArthurPSmith (talk) 14:06, 16 April 2024 (UTC)[reply]

Items to be careful with, controversial topic

We should watch out for items like unborn child (Q63177820), human embryo (Q63177917) and human fetus (Q26513). They are currently appearing on a subclass loop report. Considering the controversial nature of the topic, I think it's best to discuss this with a wider audience and decide together what we should do about it. author TomT0m / talk page 07:47, 16 April 2024 (UTC)

regarding the controversy can't we just add qualifiers to capture the disputes (which might mean the loop remains) BrokenSegue (talk) 03:44, 17 April 2024 (UTC)[reply]
From what I can find on Wikipedia, for humans "embryo" refers to the stage before "fetus". As long as "unborn child" and "human" are strictly superclasses I think we can avoid any controversy and still fix the classification. So "unborn child" should not be a subclass of "human fetus" or "human embryo" and I think you're good. The levels will then look like: 1. Human, unborn child 2. human embryo, human fetus.
I don't do classifications, but it looks like you do, so what do you think? Infrastruktur (talk) 05:00, 17 April 2024 (UTC)[reply]
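For readers unfamiliar with how a subclass loop like the one reported here can be detected, below is a generic depth-first search sketch. The graph uses placeholder names and made-up edges; it does not reproduce the actual subclass of (P279) statements on the items mentioned above, and `find_cycle` is a hypothetical helper, not the code behind the report.

```python
def find_cycle(subclass_of, start):
    """Return a list of classes forming a loop reachable from start, or None.

    subclass_of maps each class to the classes it is a subclass of.
    """
    path, on_path, done = [], set(), set()

    def visit(node):
        if node in on_path:
            # Walked back onto the current path: this is the loop.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for parent in subclass_of.get(node, ()):
            cycle = visit(parent)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)

# Placeholder classes with made-up edges.
looped = {"A": ["B"], "B": ["C"], "C": ["A"]}
fixed = {"A": ["B"], "B": ["C"], "C": []}
print(find_cycle(looped, "A"))  # ['A', 'B', 'C', 'A']
print(find_cycle(fixed, "A"))   # None
```

Removing any single edge of the loop is enough to make the check pass, which is why the choice of which statement to drop is the part that needs discussion.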

old-computers.com ID Property:P5936

The site is definitely down (the creator said the availability will not be fixed). Can we modify the property to link to the Wayback Machine? Arosio Stefano (talk) 01:36, 17 April 2024 (UTC)

i updated it but it takes a little time to update on the interface BrokenSegue (talk) 03:43, 17 April 2024 (UTC)[reply]
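For reference, pointing an identifier at the Wayback Machine usually amounts to prefixing the original formatter URL with the archive's address. The sketch below shows the idea; the `example.org` template is a placeholder, not the real formatter URL of this property, and `wayback_formatter` and `expand` are hypothetical helper names.

```python
WAYBACK_PREFIX = "https://web.archive.org/web/"

def wayback_formatter(formatter_url, timestamp="*"):
    """Wrap a formatter URL so that it points to archived copies.

    A timestamp of "*" asks the Wayback Machine to list all captures.
    """
    return f"{WAYBACK_PREFIX}{timestamp}/{formatter_url}"

def expand(formatter_url, value):
    """Substitute an identifier value for the $1 placeholder."""
    return formatter_url.replace("$1", value)

# Placeholder template, not the property's actual formatter URL.
original = "https://example.org/item?id=$1"
archived = wayback_formatter(original, "2020")
print(expand(archived, "42"))
# https://web.archive.org/web/2020/https://example.org/item?id=42
```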

Systematic error related to patrollable edits

If an entity has unpatrolled edits and an established user then makes a new edit to it, there is no header message indicating that the previous edits are unpatrolled. See e.g. Q12376063. Is this important topic really not being discussed anywhere?! Estopedist1 (talk) 06:08, 17 April 2024 (UTC)

The UnpatrolledEdits gadget was only checking the last edit. It also did so by analyzing the DOM of the diff page. It seems cleaner just to ask the API and check for all unpatrolled edits in one go; here's a prototype rewrite that does just that [3]. An interface admin can replace the existing gadget. There are 3 lines commented out that have to do with translation, but it should be obvious how to enable that. Infrastruktur (talk) 08:37, 18 April 2024 (UTC)
If this works correctly I have no objections to replacing the code; it will probably save some server calls and be quicker too. Sjoerd de Bruin (talk) 08:51, 18 April 2024 (UTC)
Audited my own code for safety and fixed one minor issue. Infrastruktur (talk) 22:02, 18 April 2024 (UTC)[reply]
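To give an idea of what "asking the API" for all unpatrolled edits of one page might involve, here is a sketch that only builds the request URL. It is not the gadget's code (which is JavaScript); the parameter choice is an assumption based on the MediaWiki `list=recentchanges` module, whose `rcshow` filter on patrol status requires patrol rights, and it should be checked against the API documentation before use.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

def unpatrolled_query_url(title, limit=50):
    """Build a recent-changes query for unpatrolled edits to one page."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctitle": title,          # restrict to a single page
        "rcshow": "!patrolled",    # only changes not yet patrolled
        "rcprop": "ids|user|timestamp",
        "rclimit": str(limit),
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"

print(unpatrolled_query_url("Q12376063"))
```

Because recent changes are only kept for a limited time, a query like this would miss older unpatrolled edits.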

New update about splitting the Wikidata Query Service graph is out

There is a new update relative to the experiment with splitting the Wikidata Query Service graph.

For those who don't know about this, the Search team is currently running an experiment to split the Wikidata Query Service graph and use federation for the queries that need access to all subgraphs. This is a breaking change, which will require a number of queries to be rewritten, either to access a new SPARQL endpoint, or to use federation. We want to have a good understanding of the trade-offs before we commit to any long-term solution.

A new proposal for the split has also been published, everyone is encouraged to read it. We are open for feedback until May 15th 2024. Sannita (WMF) (talk) 14:09, 17 April 2024 (UTC)[reply]

Ability to revert batch edits appears broken

See Wikidata_talk:Edit_groups#Not_working?. Sdkbtalk 18:54, 17 April 2024 (UTC)[reply]

Umm, why isn't this causing alarm? Sdkbtalk 17:18, 19 April 2024 (UTC)[reply]

Merged Uzbek ministries

In 2023, a number of Uzbek ministries were merged. Most notably, Ministry of Economy and Finance of Uzbekistan (Q17072985) was merged with Ministry of Economy of Uzbekistan (Q16952230). I cannot merge the items nor keep one for historic purposes because of the doubled interwikis. Any suggestions on what to do? Best, Fordaemdur (talk) 07:15, 18 April 2024 (UTC)

Use replaced by (P1366) and replaces (P1365). Sjoerd de Bruin (talk) 08:09, 18 April 2024 (UTC)[reply]
Thank you! Fordaemdur (talk) 08:15, 18 April 2024 (UTC)[reply]

Google Knowledge Graph ID (P2671)

Google Knowledge Graph ID (P2671) was recently vandalized by a user, who switched all the info to "Pluntar Athletic Club". For some reason, some of that is still appearing on the page, even if everything has been reverted. Does anyone know how to resolve it? Bricks&Wood (talk) 17:33, 18 April 2024 (UTC)[reply]

Simply purged the page with gadget... Seems now it appears in a proper way. --Wolverène (talk) 21:04, 18 April 2024 (UTC)[reply]
Actually it does not, purging/cache cleaning was not enough. (Also, I believe that such frequently-used properties should be indefinitely semi-protected). --Wolverène (talk) 23:59, 18 April 2024 (UTC)[reply]
All properties are already indefinitely semi-protected. --Matěj Suchánek (talk) 06:10, 19 April 2024 (UTC)[reply]
Oh, and why I did not know this... Every day's a lesson. :) --Wolverène (talk) 06:39, 19 April 2024 (UTC)[reply]
@Matěj Suchánek: I hope you are right about protection. I full-protected it. The user related to this vandalism was clever: did 50 good edits, then waited some days (when got autoconfirmed) and then mass-vandalism! Estopedist1 (talk) 11:16, 21 April 2024 (UTC)[reply]

Link to other wiki projects

Is there a format `[[]]` or template `{{}}` to compose a link to other wiki projects, within a Wikidata page, to display rather like an external link? Example below:

https://en.wikipedia.org/wiki/Roundabout vs https://www.google.com/

Like in en-wp, [[Roundabout]]. JuguangXiao (talk) 01:27, 19 April 2024 (UTC)

You can prefix the page name in the square brackets with : and a language code to link to a specific language Wikipedia page. So [[:en:Roundabout]] renders as en:Roundabout. In addition, you can do things like [[:en:Roundabout|arbitrary text]] to render it as arbitrary text. This page on enwiki goes into the concept in greater depth: en:Help:Interwiki linking. It also lists some of the interwiki prefixes for multilingual projects. -- William Graham (talk) 02:56, 19 April 2024 (UTC)
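
For anyone generating such links in bulk, the pattern described above can be sketched in a few lines of Python. The function name `interwiki_link` is hypothetical and exists only for this example; it just assembles the wikitext string and does not check that the prefix or page exists.

```python
from typing import Optional


def interwiki_link(prefix: str, title: str, label: Optional[str] = None) -> str:
    """Return wikitext for an inline interwiki link such as [[:en:Roundabout]]."""
    # The leading colon keeps the link inline in the page text.
    target = f":{prefix}:{title}"
    if label:
        # A pipe followed by text replaces the displayed link text.
        return f"[[{target}|{label}]]"
    return f"[[{target}]]"


print(interwiki_link("en", "Roundabout"))
print(interwiki_link("en", "Roundabout", "arbitrary text"))
```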

Question

Is it correct for

⟨ former bridge (Q11486300) ⟩ subclass of (P279) ⟨ bridge (Q12280) ⟩

and/or

⟨ proposed bridge (Q44665130) ⟩ subclass of (P279) ⟨ bridge (Q12280) ⟩

? If not, what is the appropriate way to connect these items? — Martin (MSGJ · talk) 14:15, 19 April 2024 (UTC)

@MSGJ: How about followed by (P156), at least for former bridge? I would not use subclass of in that way. Do you see a reason to do so? SM5POR (talk) 16:54, 19 April 2024 (UTC)
I don't see how P156 would be relevant here. In my naive thinking, a former bridge is a bridge which has fallen down, and a proposed bridge is a bridge which has not yet been built, i.e. they are both bridges. But I understand that other views exist — Martin (MSGJ · talk) 18:25, 19 April 2024 (UTC)[reply]
I would say that something is a bridge within Wikidata if there's a point in time where the claim of it being a bridge is true. There's no good reason for former bridge (Q11486300) to exist, as it's just bridge (Q12280) with end time (P582), and using it makes it harder for people to query correctly.
A proposed bridge, however, never existed, so maybe proposed bridge (Q44665130) fictional or mythical analog of (P1074) bridge (Q12280). ChristianKl 20:26, 20 April 2024 (UTC)
Using instance of (P31) destroyed building or structure (Q19860854) might be better, but dissolved, abolished or demolished date (P576) could be used. As is common with edge cases in WD, you'd get tripped up. Ditto abandoned project (Q21514702). But I don't think the solution for structures is a shadow set of former_X items. Vicarage (talk) 04:14, 21 April 2024 (UTC)[reply]
Yes, that would also be better. Replacing former_x items would be a step forward. ChristianKl 18:09, 21 April 2024 (UTC)
I'd not use P576, because I've often seen Wikipedias enter closure dates when the bridge was simply rebuilt or refurbished. So P576 does not automatically mean the bridge was demolished, because of misleading data input from various Wikipedias. Bouzinac💬✒️💛 19:52, 21 April 2024 (UTC)
@ChristianKl I think, though this might indeed not be a very popular take, that a naive query that searches for "bridges" (say, using "wdt:" in SPARQL) should only find actual bridges. There is no harm in it being more complicated if you look for historical data or bridges at a certain date. You'd have to write such a query anyway.
One way to achieve this is to give preferred rank to a different instance of (P31) value. Viewed as a general rule, it does not make querying harder at all, just the opposite.
Without that, if you look for, say, French communes or departments, you get historical data that you most likely do not want, and you then have to do something more complicated to exclude the old ones. author  TomT0m / talk page 18:19, 21 April 2024 (UTC)
I don't think properties can work like that. Just for an example. A dead person cannot be a spouse, but when a person dies we do not change statements like spouse to be deprecated. And when we query for a person's spouses, we usually do not want just the current spouse — Martin (MSGJ · talk) 18:31, 21 April 2024 (UTC)[reply]
Persons are different from bridges or administrative jurisdictions. With persons you get a whole lot of philosophical issues, and I think it's best not to change the way we treat people, or maybe animals. For businesses or objects, the issues are not the same at all, and we can afford something like that, in my opinion, if it's convenient.
With people we typically don't use subclasses of human (Q5) items, it's also generally an exception. author  TomT0m / talk page 18:36, 21 April 2024 (UTC)[reply]
(and also, the you most likely do not want them is not really valid for persons, as we regularly query for deceased persons, it's kind of reversed.) author  TomT0m / talk page 18:55, 21 April 2024 (UTC)[reply]
wdt:P31 wd:Q12280; wdt:P5816 wd:Q56557159 would be better for old bridges: you query "bridge" + "ruined", for instance. Bouzinac💬✒️💛 19:49, 21 April 2024 (UTC)
I don't know who you are responding to, but I think it is kind of a point in my favor. If most people querying bridges actually search for bridges they can cross now, they might not want to know all the different ways there are to note that a bridge is no longer usable, forever. If there are several clauses to add to the query to take into account all the different possibilities, it complicates the querying further.
This is why I'd push the simple principle « if you query (not living organism) entities using wdt: and, say, instance of (P31), you should get entities that are not destroyed or permanently out of service ». Minimum knowledge should be the rule for typical usage. author  TomT0m / talk page 19:59, 21 April 2024 (UTC)
For a building, destruction is rarely total: castles become ruins then archaeological sites, and buildings are reused, factories become museums. It's always going to be muddy.
I think the combination X and NOT destroyed is better than having "former_X" for every sort of physical object. And checking the UK, with 4000-odd bridges, only a half dozen are marked destroyed or former bridges, so it's not a common problem. (Oh, and we have New London Bridge (Q56739652), now in Arizona!) Vicarage (talk) 20:15, 21 April 2024 (UTC)
One problem with "former bridge" is that it's unclear whether the items are about a ruin of a bridge or just a spot where there used to be a bridge in the past. ChristianKl 23:09, 21 April 2024 (UTC)
You don't have to have a statement for every kind of physical object, for two reasons:
  1. if there is no more appropriate value, you can use a generic "former object" item (high in the class tree) and use it as the preferred value for instance of; this does the trick of not showing the result in simple queries
  2. if the preferred instance of (P31) statement changes to something like "ruin", for example, this also works.
author  TomT0m / talk page 09:12, 22 April 2024 (UTC)[reply]
Wikidata operates under the open-world hypothesis. If the status of an item changes, that change won't immediately show up in Wikidata. In the semantics on which Wikidata is founded, a missing claim about a closure is just data incompleteness. If you think someone should have an expectation of all those items having information about being out of service, that would mean to say that we have wrong data in a lot of cases. ChristianKl 22:59, 21 April 2024 (UTC)
I don't think so. You just have to consider that an "end date" might be added at any time to an instance of statement; it's semantically equivalent to a non-updated "conservation status" with no end date or point in time. The information is incomplete in both cases, with exactly the same status under the open-world hypothesis.
Unless you give instance of (P31) statements a special status under the open-world hypothesis compared to other statements, I don't see the difference. Any statement with a "begin date" might be incomplete, any "timed" statement at least, and might even miss a "begin date" (this happens). This is just missing information, all the time, for stuff that evolves in time. author  TomT0m / talk page 09:19, 22 April 2024 (UTC)
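
To make the querying options from this thread concrete, here is a minimal sketch of the three query shapes that were mentioned, wrapped in Python only so the strings can be turned into request URLs. The `wdt:` prefix returns best-rank statements (preferred if any exist, otherwise normal), which is why the preferred-rank approach would hide former bridges from the first query, while `p:`/`ps:` matches statements of every rank. The constant and function names are invented for this example, the item Q56557159 is used as "ruined" only because that is how it was described above, and subclasses of bridge are ignored for brevity.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Naive query: best-rank instance of (P31) statements only.
TRUTHY_BRIDGES = "SELECT ?item WHERE { ?item wdt:P31 wd:Q12280 . }"

# Statements of every rank, for historical lookups.
ALL_RANK_BRIDGES = "SELECT ?item WHERE { ?item p:P31/ps:P31 wd:Q12280 . }"

# "X and NOT destroyed": exclude the conservation state quoted in the thread
# and anything carrying a dissolved, abolished or demolished date (P576).
BRIDGES_NOT_DESTROYED = """SELECT ?item WHERE {
  ?item wdt:P31 wd:Q12280 .
  FILTER NOT EXISTS { ?item wdt:P5816 wd:Q56557159 . }
  FILTER NOT EXISTS { ?item wdt:P576 ?demolished . }
}"""


def query_url(sparql: str) -> str:
    """Build a GET URL for the query service; nothing is sent here."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': sparql, 'format': 'json'})}"


for query in (TRUTHY_BRIDGES, ALL_RANK_BRIDGES, BRIDGES_NOT_DESTROYED):
    print(query_url(query))
```

The third query shows the cost raised in the discussion: each additional way of recording that a bridge is gone needs its own exclusion clause.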

Expanding the scope of network bands (P8097) to cover all frequency bands

If you are interested, feel free to join the discussion about expanding the scope of network bands (P8097) to cover all frequency bands at Property talk:P8097. Thanks! –Samoasambia 18:27, 19 April 2024 (UTC)[reply]

Sachsen-Anhalt

Hi, the German state Q1206 correctly has inception 3 Oct 1990, the day of German reunification. Located in the administrative territorial entity has two entries:

  • Germany, starting 7 Oct 1990. Can somebody change that to 3 Oct 1990, please? It was a German state starting with reunification.
  • East Germany. However, this state is a "federated state of Germany", so this does not apply to former East Germany. Sachsen-Anhalt's territory belonged to East Germany, but not this federal state. Could someone please remove this second value?

I'm not allowed to make these edits. Also, if I've overlooked something, please tell me. Thanks.--178.201.237.227 19:20, 19 April 2024 (UTC)[reply]

I made the edits. ChristianKl 02:24, 21 April 2024 (UTC)

merge of Cyanobacteria (Q93315)/Cyanobacteriota (Q25577567)

Hello,

These two pages should be merged, Cyanobacteriota being only a recent change (last year) for the former phylum name Cyanobacteria. The two names are, for all practical purposes, synonymous. Since the change is recent ("Cyanobacteria" remaining far more common), only two Wikipedias (French and Spanish) have renamed their page to Cyanobacteriota. This is quite inconvenient, since only their redirect is now linked to the other language wikis about cyanobacteria (about 80 languages), and their main page no longer is.

Thank you 80.214.24.201 04:11, 20 April 2024 (UTC)[reply]

Hello, please also see Wikidata:Project_chat/Archive/2024/04#Cyanobakterien_(Q93315) M2k~dewiki (talk) 21:53, 20 April 2024 (UTC)[reply]

Spam filter exception?

I am trying to add an old domain, seen on https://web.archive.org/web/20061017125903/https://vgcats.com/badmushrooms/?strip_id=0, to the entry for en:VGCats (Q1290453). The spam filter, however, prevents me from adding the CJB domain. How do I ask for an exception for the spam filter for Q1290453?

Thank you, WhisperToMe (talk) 20:15, 21 April 2024 (UTC)[reply]

Did some checking for you. cjb.net site seemed to have been a free hosting provider who at some point allegedly was caught pushing spyware on unsuspecting users. It would be inadvisable to add any sort of local override. This sucks because I know many people like Scott's stuff. Infrastruktur (talk) 21:34, 21 April 2024 (UTC)[reply]

Merger

Pages Q113336174 and Q125552166 should be merged Futurolog21 (talk) 10:52, 22 April 2024 (UTC)[reply]

→ ← Merged
I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. RVA2869 (talk) 12:54, 22 April 2024 (UTC)[reply]

Next / alongside

New Shuttle (Q846381) is a people mover system that is constructed alongside the Jōetsu Shinkansen (Q912219) and Tōhoku Shinkansen (Q900665) railway lines. Is shares border with (P47) the right property? Smiley.toerist (talk) 10:59, 22 April 2024 (UTC)

Wikidata weekly summary #624

memory capacity (P2928)

So is this property about RAM capacity or storage capacity? It's not exactly clear. Trade (talk) 21:42, 22 April 2024 (UTC)

If you read the property proposal, it's supposed to be a qualifier that's used with has part(s) (P527). It seems like some users have used it for different purposes, which leads to bad data because sometimes people might mean RAM and other times storage. One solution would be to disallow the usage as main value and delete the uses that don't follow the data model that the property proposal suggests. ChristianKl 02:03, 23 April 2024 (UTC)
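
Before disallowing main-value use, it may help to see how the property is actually used. The sketch below builds two SPARQL queries: one listing items where memory capacity (P2928) appears as a main statement, and one listing uses as a qualifier on has part(s) (P527), which is the model described above. The function name `capacity_uses_query` is hypothetical, and the queries have not been tuned for the size of the result set.

```python
def capacity_uses_query(as_qualifier: bool) -> str:
    """Return a SPARQL query listing uses of memory capacity (P2928).

    as_qualifier=True  -> uses as a qualifier on has part(s) (P527) statements
    as_qualifier=False -> uses as a main statement value on the item itself
    """
    if as_qualifier:
        return """SELECT ?item ?part ?capacity WHERE {
  ?item p:P527 ?statement .
  ?statement ps:P527 ?part ;
             pq:P2928 ?capacity .
}"""
    return """SELECT ?item ?capacity WHERE {
  ?item p:P2928/ps:P2928 ?capacity .
}"""


print(capacity_uses_query(as_qualifier=False))
print(capacity_uses_query(as_qualifier=True))
```

Comparing the two result lists would show how many statements a cleanup would have to touch.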