Wikidata:Constraint violation report input


Data maintenance is one of the most important tasks on Wikidata. Constraint violation reports help find problematic items so they can be fixed. A team of students is working on improving these reports. Please provide initial input for them on this page.

This is working with the current system

This isn't working with the current system

Ideas for new useful features around constraint violations

  • Could use Wikidata Query and the constraint conditions on the Property talk pages to generate almost-live (~10-20 min out-of-date) reports on format violations. --Magnus Manske (talk) 14:47, 16 October 2014 (UTC)
    • I would second this and push for it to be a lag-free aspect of the API. Ideally the system would check for constraint violations and report them before, or at the same time as, the assertion is stored in the database. Making this real-time would greatly increase the chances that (a) a user who makes a mistake could fix it while they are still there, and (b) bots could detect that they were going haywire before writing thousands of erroneous statements. --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
  • See Missing property pairs ("mother" in one item should have "child" in the other etc.) --Magnus Manske (talk) 14:47, 16 October 2014 (UTC)
    • That might / would / could / should involve
      1. a way to mark two properties as exact inverses of one another;
      2. automatically inserting the "child" property when either "mother" or "father" is entered, and automatically inserting "mother" or "father" when "child" is there and the "gender" is known. Question: What should trigger the automation? It could be the insertion or a change of "gender", which may be a pretty complicated and error-prone thing to configure in a general way, I think. --Purodha Blissenbach Discussion 19:27, 16 October 2014 (UTC)
  • Exclusive statements, e.g. a human should not have a location. --Magnus Manske (talk) 14:53, 16 October 2014 (UTC)
  • "Disconnect" items; e.g., a species should be connected (through "parent taxon") to "biota" (universal root for life). See the second half of this blog post. --Magnus Manske (talk) 15:00, 16 October 2014 (UTC)
  • Maybe a tool that fixes a specific kind of constraint violation with one click and, at the same time, takes it out of the report. --Denny (talk) 15:16, 16 October 2014 (UTC)
  • a way to list exceptions for any of the constraints --Denny (talk) 15:16, 16 October 2014 (UTC)
  • Make it possible to define more complicated constraints. For example, every item that uses Rijksmonument ID (P359) should have located in the administrative territorial entity (P131) -> <someitem>, and that <someitem> should be instance of (P31) -> municipality of the Netherlands (Q2039348). Multichill (talk) 20:24, 16 October 2014 (UTC)
    +1 Logical operations on constraints are sometimes vital, such as "there should be this property OR this property". --Infovarius (talk) 12:49, 18 October 2014 (UTC)
  • A graphical user interface for creating/editing constraints would be very helpful. Ideally you could add a constraint to an existing property, hit 'preview' and see how many items would be impacted. (Which would just use a WD query of course) --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
  • A Fix? button. When constraint violations exist, it may be possible to fix them automatically in large batches. The simplest approach is obviously just to roll them all back. Giving access, from the property page itself, to tools that would affect all statements that violate a constraint would be very powerful. (maybe too powerful?) --Genewiki123 (talk) 20:35, 16 October 2014 (UTC)
  • Some values should be unique for a certain property. Some non-unique values are candidates for merging. It would be useful to have a script for easy merging (without copy and paste): some JavaScript that takes two Qs on one line and makes a link for the merging API. JAn Dudík (talk) 21:13, 16 October 2014 (UTC)
  • I propose, instead or in addition, a maintenance property within an item. The property type should be string. A string should consist of a) the status of the message ('new' or 'done'), b) a description of the problem (e.g. 'same as q...', 'Q... is of wrong or missing type') and, optionally, c) an external source (e.g. 'en:WP', 'OSM'). Of course, such a string could look different if a human writes it. There are some advantages:
    1. The message is where the problem is and can be seen easily.
    2. Tools, bots and modules could work with them. For example, a Lua module could show a warning in the infobox. Many bots and humans could add and delete maintenance properties. Additionally, a bot that works with an external database, e.g. OpenStreetMap, could be integrated.
    3. The Query Editor could be used to list the problems, if some additional code is written, which could e.g. sort the output.
    4. This system is updatable. We could define a new or extended message string whenever we want.
    5. We could even give warning messages if an editor changes a value but doesn't change the references too. Or we could define a new status 'monitoring', which means that a bot is monitoring a certain value or the whole item and becomes active after an update (e.g. by pasting a message somewhere).
    6. Maybe we would need the status 'keep it' too. For example, I saw a constraint violation where the grave of a cat is set, which is not allowed for this property. If we set that value to 'keep it', the bot should no longer treat it as a violation, but it could still be found with an extended Query Editor.
    7. Another status could be 'in use', which says that a value is used in e.g. Wikipedia X. (see this too) --Molarus (talk) 23:22, 2 November 2014 (UTC)
  • Constraints that involve rank and qualifiers. E.g. a violation if there are multiple statements on an item for the property population with rank normal that have non-overlapping dates and no other qualifiers. --JanZerebecki (talk) 23:02, 3 December 2014 (UTC)
  • your input and signature
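
Two of the ideas above, the chained constraint on Rijksmonument ID (P359) and the missing property pairs check, can be sketched on toy in-memory data. This is only an illustrative sketch, not how any existing Wikidata tool works: the function names `chained_violations` and `missing_inverse`, the data layout, and all item IDs except Q2039348 are hypothetical, and the property IDs P25 (mother) and P40 (child) are not given in the discussion and are assumed here.

```python
# Toy data: item ID -> {property ID -> list of values}.
# All item IDs except Q2039348 are made up for illustration.
# P25 (mother) and P40 (child) are assumed IDs, not taken from the discussion.
ITEMS = {
    "Q_monument_ok": {"P359": ["12345"], "P131": ["Q_gemeente"]},
    "Q_monument_bad": {"P359": ["67890"], "P131": ["Q_province"]},
    "Q_monument_missing": {"P359": ["11111"]},
    "Q_gemeente": {"P31": ["Q2039348"]},
    "Q_province": {"P31": ["Q_other_class"]},
    "Q_mother": {"P40": ["Q_child_ok"]},
    "Q_child_ok": {"P25": ["Q_mother"]},
    "Q_child_unlinked": {"P25": ["Q_mother"]},
}


def chained_violations(items, trigger, link, target_prop, target_value):
    """Return items that use `trigger` but have no `link` value whose
    target item carries `target_prop` -> `target_value`."""
    violations = []
    for item_id, claims in items.items():
        if trigger not in claims:
            continue  # the constraint only applies to items using `trigger`
        satisfied = any(
            target_value in items.get(target, {}).get(target_prop, [])
            for target in claims.get(link, [])
        )
        if not satisfied:
            violations.append(item_id)
    return sorted(violations)


def missing_inverse(items, prop, inverse):
    """Return (subject, object) pairs where subject has `prop` -> object
    but object lacks `inverse` -> subject."""
    missing = []
    for item_id, claims in items.items():
        for target in claims.get(prop, []):
            if item_id not in items.get(target, {}).get(inverse, []):
                missing.append((item_id, target))
    return sorted(missing)


# Items with a Rijksmonument ID that are not located in a Dutch municipality.
print(chained_violations(ITEMS, "P359", "P131", "P31", "Q2039348"))
# -> ['Q_monument_bad', 'Q_monument_missing']

# "mother" statements with no matching "child" statement on the other item.
print(missing_inverse(ITEMS, "P25", "P40"))
# -> [('Q_child_unlinked', 'Q_mother')]
```

A real report would read the statements from a query service instead of a dictionary, and an "OR" constraint of the kind Infovarius mentions could be expressed by combining the results of several such checks.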