Wikidata:Constraint violation report input
This page is currently inactive and is retained for historical reference. Either the page is no longer relevant or consensus on its purpose has become unclear. To revive discussion, seek broader input via a forum such as the project chat.
One of the very important tasks on Wikidata is data maintenance. Constraint violation reports help with finding problematic items and fixing them. A team of students is working on improving them. Please provide initial input for them on this page.
This is working with the current system
- Lists of violations: Wikidata:Database_reports/Constraint_violations/All_properties supplies lists of constraint violations; however only very few locations invite users to resolve these issues. -- LaddΩ chat ;) 21:31, 16 October 2014 (UTC)
- Clear constraints & usage info: once a user knows that important information is available from property talk pages, the list of applicable constraints and general usability rules are pretty clearly stated there. -- LaddΩ chat ;) 22:06, 16 October 2014 (UTC)
- Templates are easy to set up and the reports work really well for a lot of constraints. -Tobias1984 (talk) 20:43, 18 October 2014 (UTC)
This isn't working with the current system
- Not native: The scheme for detecting constraint violations relies fully on User:KrBot and its operator Ivan A. Krestinin. A scheme native to Wikidata should be developed. -- LaddΩ chat ;) 21:59, 16 October 2014 (UTC)
- No feedback: Users get no feedback when they enter property values that violate constraints set up for that property. -- LaddΩ chat ;) 21:59, 16 October 2014 (UTC)
- Template-based: Property constraints rely on custom-developed {{Constraint}} templates, installed manually on property talk pages, such as Property talk:P215. Necessary constraint information should be supplied by property statements. -- LaddΩ chat ;) 21:59, 16 October 2014 (UTC)
- Time lag: 12 hours is the minimum delay before an edit appears in the constraint report. The maximum delay is 36 hours (if all works fine). The minimum is defined by the 12-hour time lag of incremental dumps: [1]. The maximum is defined by the same parameter plus the incremental dump period (24 hours). It would be great to remove the 12-hour time lag for Wikidata. This lag is relevant for Wikipedia, but not for Wikidata, as far as I can see. The dumps could perhaps also be generated twice per day. -- Ivan A. Krestinin (talk) 03:46, 17 October 2014 (UTC)
- Compound types: Properties that are allowed on multiple distinct types cannot be expressed. Example: https://www.wikidata.org/wiki/Property:P136 is described as "genre of a creative work or genre in which an artist works", so it would need Type:work|artist, but currently that is not supported. JanZerebecki (talk) 11:19, 17 October 2014 (UTC)
- An immediate solution could be to split it into two different properties so different constraints can be applied to each one.--Micru (talk) 23:21, 20 October 2014 (UTC)
- Qualifiers: There is no way to set qualifier constraints, or constraints for the source having certain statements. -Tobias1984 (talk) 20:45, 18 October 2014 (UTC)
- your input and signature
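The compound-type point above (P136 needing something like Type:work|artist) can be illustrated with a small sketch. This is not how KrBot or the {{Constraint}} templates work; it only shows what a "one of several types" check might look like. The in-memory item model, the function name `violates_type_constraint` and the class IDs `Q_work`, `Q_artist` and `Q_mountain` are all hypothetical placeholders, and subclass relations are ignored for brevity.

```python
from typing import Dict, Iterable, List

# Hypothetical model of one item's claims: property ID -> list of values.
Claims = Dict[str, List[str]]


def violates_type_constraint(
    claims: Claims,
    allowed_classes: Iterable[str],
    type_prop: str = "P31",  # "instance of"
) -> bool:
    """Return True if the item is an instance of none of the allowed classes.

    Only direct "instance of" values are checked; a real checker would
    probably also need to follow subclass relations.
    """
    allowed = set(allowed_classes)
    return not any(cls in allowed for cls in claims.get(type_prop, []))


# Placeholder items that all carry a genre (P136) statement.
novel = {"P31": ["Q_work"], "P136": ["Q_genre"]}
painter = {"P31": ["Q_artist"], "P136": ["Q_genre"]}
mountain = {"P31": ["Q_mountain"], "P136": ["Q_genre"]}

# "work OR artist" expressed as a list of allowed classes.
GENRE_ALLOWED = ["Q_work", "Q_artist"]
```

With this, `novel` and `painter` pass the check while `mountain` is reported, which is the behaviour a single-type constraint cannot express.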
Ideas for new useful features around constraint violations
- Could use Wikidata Query and the constraint conditions on the Property talk pages to generate almost-live (~10-20 min out-of-date) reports on format violations. --Magnus Manske (talk) 14:47, 16 October 2014 (UTC)
- I would second this and push for it to be a lag-free aspect of the API. Ideally the system would check for constraint violations and report them before (or at the same time as) the assertion is stored in the database. Making this realtime would greatly increase the chances that (a) a user who makes a mistake could fix it while they are still there, and (b) bots could detect that they were going haywire before writing thousands of erroneous statements. --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
- See Missing property pairs ("mother" in one item should have "child" in the other etc.) --Magnus Manske (talk) 14:47, 16 October 2014 (UTC)
- That might / would / could / should involve
- a way to mark two properties as exact inverses of one another;
- There's "inverse" in the overview table.
- automatically inserting the "child" property when either "mother" or "father" were entered, and automatically inserting "mother" or "father" when "child" is there and the "gender" is known. Question: What should trigger the automatism? Could be the insertion or a change of "gender", which may be a pretty complicated and error prone thing to configure in a general way, I think. --Purodha Blissenbach Discussion 19:27, 16 October 2014 (UTC)
- I offered this a year ago, was booed down. It could cement vandalism, after all!!1! --Magnus Manske (talk) 08:05, 17 October 2014 (UTC)
- Exclusive statements, e.g. a human should not have a location. --Magnus Manske (talk) 14:53, 16 October 2014 (UTC)
- "Disconnect" items; e.g., a species should be connected (through "parent taxon") to "biota" (universal root for life). See the second half of this blog post. --Magnus Manske (talk) 15:00, 16 October 2014 (UTC)
- maybe a tool that allows fixing a specific kind of constraint violation with one click, and at the same time takes it out of the report --Denny (talk) 15:16, 16 October 2014 (UTC)
- a way to list exceptions for any of the constraints --Denny (talk) 15:16, 16 October 2014 (UTC)
- possible to make more complicated constraints. For example every item that uses Rijksmonument ID (P359) should have located in the administrative territorial entity (P131) -> <someitem> and that <someitem> should be instance of (P31) -> municipality of the Netherlands (Q2039348). Multichill (talk) 20:24, 16 October 2014 (UTC)
- +1 Logical operations on constraints are sometimes vital like "there should be this property OR this property". --Infovarius (talk) 12:49, 18 October 2014 (UTC)
- A graphical user interface for creating/editing constraints would be very helpful. Ideally you could add a constraint to an existing property, hit 'preview' and see how many items would be impacted. (Which would just use a WD query of course) --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
- A Fix? button. When constraint violations exist, it may be possible to fix them automatically in large batches. The simplest approach is obviously just to roll them all back. Giving access, from the property page itself, to tools that would impact all statements that violate a constraint would be very powerful. (maybe too powerful?) --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
- Some values should be unique for a certain property. Some non-unique values are candidates for merging. It would be useful to have a script for easy merging (without copy and paste): some JavaScript which takes two Qs on one line and makes a link to the merging API. JAn Dudík (talk) 21:13, 16 October 2014 (UTC)
- I propose, instead or in addition, a maintenance property within an item. The property type should be string. A string should consist of a) the status of the message ('new' or 'done'), b) a description of the problem (e.g. 'same as q...', 'Q... is of wrong or missing type') and, optionally, c) an external source (e.g. 'en:WP', 'OSM'). Of course, such a string could look different if a human writes it. There are some advantages: 1) The message is where the problem is and can be seen easily. 2) Tools, bots and modules could work with them. For example, a Lua module could show a warning in the infobox, and many bots and humans could add and delete maintenance properties. Additionally, a bot that works with an external database, e.g. OpenStreetMap, could be integrated. 3) The Query Editor could be used to list the problems, if some additional code is written which could e.g. sort the output. 4) This system is updatable: we could define a new or extended message string whenever we want. 5) We could even give warning messages if an editor changes a value but doesn't change the references too. Or we could define a new status 'monitoring', which means that a bot is monitoring a certain value or the whole item and becomes active after an update (e.g. by posting a message somewhere). 6) Maybe we would need the status 'keep it' too. I saw e.g. a constraint violation where the grave of a cat is set, which is not allowed for this property. If we set that value to 'keep it', the bot shouldn't treat that value as a violation anymore, but it could still be found with an extended Query Editor. 7) Another status could be 'in use', which says that a value is used in e.g. Wikipedia X. (see this too) --Molarus (talk) 23:22, 2 November 2014 (UTC)
- Constraints that involve rank and qualifier. E.g. a violation if there are multiple statements on an item for property population with rank normal that have non-overlapping dates and no other qualifiers. --JanZerebecki (talk) 23:02, 3 December 2014 (UTC)
- your input and signature
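Two of the ideas above lend themselves to a short illustration: the chained constraint (items using P359 should have a P131 value that is itself an instance of Q2039348) and the missing inverse pairs ("mother" in one item, "child" in the other). The following is only a rough sketch over a made-up in-memory data set, not an existing Wikidata API. The function names, the item IDs such as `Q_a` and `Q_town`, and the property labels `P_mother` and `P_child` are hypothetical placeholders.

```python
from typing import Dict, List, Tuple

# Hypothetical model: item ID -> property ID -> list of values.
Items = Dict[str, Dict[str, List[str]]]


def chained_violations(items: Items, trigger: str, link: str,
                       type_prop: str, required_class: str) -> List[str]:
    """Return IDs of items that use `trigger` but have no `link` value
    whose target is an instance (via `type_prop`) of `required_class`."""
    bad = []
    for item_id, claims in items.items():
        if trigger not in claims:
            continue  # the constraint only applies to items using the trigger
        ok = any(
            required_class in items.get(target, {}).get(type_prop, [])
            for target in claims.get(link, [])
        )
        if not ok:
            bad.append(item_id)
    return bad


def missing_inverse(items: Items, prop: str,
                    inverse: str) -> List[Tuple[str, str]]:
    """Return (subject, object) pairs where subject has `prop` -> object
    but object lacks `inverse` -> subject."""
    missing = []
    for subj, claims in items.items():
        for obj in claims.get(prop, []):
            if subj not in items.get(obj, {}).get(inverse, []):
                missing.append((subj, obj))
    return missing


# Made-up example data; only P359, P131, P31 and Q2039348 come from the
# discussion above, everything else is a placeholder.
monuments: Items = {
    "Q_a": {"P359": ["12345"], "P131": ["Q_town"]},
    "Q_b": {"P359": ["67890"], "P131": ["Q_elsewhere"]},
    "Q_c": {"P359": ["11111"]},
    "Q_town": {"P31": ["Q2039348"]},
    "Q_elsewhere": {"P31": ["Q_other"]},
}

family: Items = {
    "Q_kid1": {"P_mother": ["Q_mum"]},
    "Q_kid2": {"P_mother": ["Q_mum"]},
    "Q_mum": {"P_child": ["Q_kid1"]},
}

monument_problems = chained_violations(
    monuments, "P359", "P131", "P31", "Q2039348")
family_problems = missing_inverse(family, "P_mother", "P_child")
```

Both functions only report candidates; whether the missing statement should be inserted automatically is exactly the open question raised in the thread.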