Shortcuts: WD:DP, WD:DEVPLAN

Wikidata:Development plan: Difference between revisions

Revision as of 09:39, 28 April 2016

This is the Wikidata development plan. It is ordered by when items will likely be started but the order is not necessarily fixed. It can change depending on changed priorities.

Done

Badges

Some clients want to add extra metadata (badges) to their articles. This includes things like "good article", "featured article" and the importance and quality of an article.

If your wiki uses this feature, please ensure it's listed.

  • The badges will be defined on Wikidata items' sitelinks as Wikidata items, like "featured article (Q123456)".
  • The items allowed as badges will be defined in the Wikidata configuration settings. Initially "good article" and "featured article" badges will be available.
  • The user can set and change a badge by selecting one from this predefined list of badges.
  • On the clients, sitelinks with badges will have an icon in the list of language links in the sidebar. There will be default icons shipped by a Wikimedia-specific Wikibase extension, but wikis can choose to use their own icons.
  • If further customization on a per-wiki level is needed, that can be done by using the CSS classes that are set on sitelinks with badges. There will be CSS classes that map to badge ids (for example Q120 could map to class GA). There will also be canonical classes which do not depend on any setting (like wb-badge-Q123) on every sitelink with badges.
  • The badges can be queried via a special page called Special:PagesWithBadges on every client wiki.
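
The two kinds of CSS classes described above can be illustrated with a minimal sketch. The canonical `wb-badge-Q…` pattern and the Q120/GA example come from the list above; the `BADGE_CLASS_NAMES` table and the `badge_css_classes` function are hypothetical names used only for illustration and are not the actual Wikibase implementation.

```python
# Hypothetical per-wiki setting: maps badge item ids to short class names.
BADGE_CLASS_NAMES = {"Q120": "GA"}


def badge_css_classes(badge_ids, class_names=BADGE_CLASS_NAMES):
    """Return the CSS classes for a sitelink carrying the given badges."""
    classes = []
    for badge_id in badge_ids:
        # Canonical class, independent of any setting.
        classes.append(f"wb-badge-{badge_id}")
        # Setting-dependent class, only if the wiki configured one.
        if badge_id in class_names:
            classes.append(class_names[badge_id])
    return classes
```

A wiki that has not configured a class for a badge would still get the canonical class, so per-wiki stylesheets can always target `wb-badge-…`.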

Technical details

These badges correspond to Wikidata items.

Remaining problems

  • phab:T73887: Other projects sidebar should show badges if applicable

Merges and redirects

When two different items about the same topic are created, they can be merged. Labels, descriptions, aliases, sitelinks and statements are merged if they do not conflict. The item that is left empty can then be turned into a redirect to the other. This way, Wikidata IDs can be regarded as stable identifiers by third parties.
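
The "merge if they do not conflict" rule can be sketched for per-language terms such as labels. This is a simplified illustration, not the Wikibase merge code: the function name `merge_terms` and the plain-dict representation are assumptions, and what happens to conflicting values is left open here (they simply stay with the source item).

```python
def merge_terms(target, source):
    """Merge per-language terms (for example labels) from source into target.

    A value is moved only if the target has no value for that language or
    already has the same value. Conflicting values stay behind.
    Returns (merged_target, leftover_source).
    """
    merged = dict(target)
    leftover = {}
    for lang, text in source.items():
        if lang not in merged or merged[lang] == text:
            merged[lang] = text
        else:
            leftover[lang] = text
    return merged, leftover
```

Under this sketch the source item only becomes empty, and therefore eligible to be turned into a redirect, when `leftover_source` is empty.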

Mono-lingual text datatype

Users can add strings and specify a language for them. They can for example enter the motto of a country in the country’s language. It is shown in this language to all users regardless of their language setting.

JSON dumps

For 3rd party re-use and analysis of the data in Wikidata, we provide JSON dumps in addition to the regular XML dumps. The JSON dump contains the canonical JSON representation of all entities (as opposed to the brittle internal JSON representation found in the XML dumps). The JSON representation of individual entities is also available via Wikidata’s linked data interface (Special:EntityData).
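
A minimal sketch of reading such a dump, assuming the layout is one JSON array with one entity per line (so each entity line ends with a comma and the first and last lines are brackets). The `iter_entities` helper and the two sample lines are made up for illustration; a real dump would be streamed from a compressed file instead of a list.

```python
import json


def iter_entities(lines):
    """Yield entity dicts from an iterable of dump lines."""
    for line in lines:
        # Drop whitespace and the trailing comma that separates array elements.
        line = line.strip().rstrip(",")
        if line in ("[", "]", ""):
            continue  # array brackets and blank lines carry no entity
        yield json.loads(line)


# Illustrative sample, not real dump content.
sample = [
    "[",
    '{"id": "Q64", "type": "item", "labels": {"en": {"language": "en", "value": "Berlin"}}},',
    '{"id": "P36", "type": "property", "labels": {}}',
    "]",
]
ids = [entity["id"] for entity in iter_entities(sample)]
```

Processing line by line avoids loading the whole array into memory, which matters for a dump that contains every entity.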

Quantities without units

Users are able to enter quantitative data in Wikidata and re-use it on the clients as well as outside of Wikimedia. It is possible to express statements like “Berlin has an estimated population of 3,397,469 (±100) as of 31 July 2013” or “Berlin has an area of 891.85 km²”. At first only unitless quantities are supported. A later deployment adds a small number of units, and the set is expanded in future deployments. When viewing an item with such quantitative data, the user sees the values according to local conventions (decimal separator, unit conversion). On the client this data is accessed via the parser function and Lua and is also shown according to the content language. The API provides a way to access the data in the preferred format of the request sender.
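
Displaying a unitless quantity according to local conventions can be sketched as follows. The sketch assumes a quantity value carrying `amount`, `lowerBound` and `upperBound` as signed decimal strings; the helper names and the separator parameters are hypothetical, and unit conversion is not covered.

```python
from decimal import Decimal


def localize(number, decimal_sep=".", group_sep=","):
    """Format a Decimal with the given decimal and grouping separators."""
    text = f"{number:,}"  # English-style grouping first
    # Swap both separators in one pass so they cannot clobber each other.
    return text.translate(str.maketrans({",": group_sep, ".": decimal_sep}))


def format_quantity(value, decimal_sep=".", group_sep=","):
    """Render a quantity, appending its uncertainty margin if it has one."""
    amount = Decimal(value["amount"])
    text = localize(amount, decimal_sep, group_sep)
    if "lowerBound" in value and "upperBound" in value:
        margin = max(Decimal(value["upperBound"]) - amount,
                     amount - Decimal(value["lowerBound"]))
        if margin > 0:
            text += f" (±{localize(margin, decimal_sep, group_sep)})"
    return text


population = {"amount": "+3397469", "lowerBound": "+3397369", "upperBound": "+3397569"}
english = format_quantity(population)
german = format_quantity(population, decimal_sep=",", group_sep=".")
```

Using `Decimal` rather than floats keeps the entered digits intact, which is relevant to the rounding and exact-value problems listed below.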

Remaining problems

  • phab:T68580: Better support for exact values in Quantity DataType
  • phab:T59589: make it possible to show the + sign in quantities on item pages

In other projects sidebar

mw:Beta Features/Other projects sidebar

On an article a user can see links to the same topic on other projects in the sidebar. This is similar to how different languages of the same project are linked.

Entity suggester

When entering a new statement, the user is shown a number of properties that they are likely to use. These properties are calculated based on which properties are used in similar items. In future versions suggestions should also be made for values.
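
One simple way to read "properties used in similar items" is co-occurrence counting. The sketch below is only an illustration under that assumption: it treats any item that shares at least one property with the current item as similar, which is much cruder than a production suggester, and `suggest_properties` is a hypothetical name.

```python
from collections import Counter


def suggest_properties(current, items, limit=3):
    """Suggest property ids that often co-occur with the current ones.

    current: property ids already used on the item being edited.
    items:   iterable of sets of property ids, one set per existing item.
    """
    current = set(current)
    counts = Counter()
    for props in items:
        if current & props:  # "similar": shares at least one property
            counts.update(props - current)  # count only properties not yet used
    return [prop for prop, _ in counts.most_common(limit)]


existing_items = [
    {"P31", "P569", "P19"},
    {"P31", "P569"},
    {"P31", "P17"},
    {"P17", "P36"},
]
top = suggest_properties({"P31"}, existing_items, limit=1)
```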

Statements on properties

To improve maintainability of the data in Wikidata, it is possible to add statements to property pages. This is used to store constraints for properties. An example of such a constraint is that the winner of a certain award must be human. Another example would be that the number of inhabitants needs to be a positive integer.
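
A minimal sketch of checking values against such constraints, using the "positive integer" example from the paragraph above. The property key `"inhabitants"`, the `constraints` table and both function names are hypothetical; the sketch only reports violations and does not block input.

```python
def is_positive_integer(value):
    """True for integers greater than zero (booleans are excluded)."""
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def find_violations(statements, constraints):
    """Return the (property, value) pairs that fail their property's constraint."""
    return [
        (prop, value)
        for prop, value in statements
        if prop in constraints and not constraints[prop](value)
    ]


# Hypothetical constraint table keyed by a readable property name.
constraints = {"inhabitants": is_positive_integer}
violations = find_violations(
    [("inhabitants", 3397469), ("inhabitants", -5), ("motto", "text")],
    constraints,
)
```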

Language fallback

When viewing an item that links to other items, the labels for these items are shown in the user's language. If labels in this language are not available, labels are shown in other languages that the user is likely to speak.
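
The fallback behaviour can be sketched as a lookup along an ordered chain of language codes. How the chain is built for a given user is not specified here, so the chain in the example is an assumption, as is the `pick_label` name.

```python
def pick_label(labels, fallback_chain):
    """Return (language, label) for the first language in the chain that has a label.

    labels:         dict mapping language code to label text.
    fallback_chain: language codes, starting with the user's own language.
    Returns None if no language in the chain has a label.
    """
    for lang in fallback_chain:
        if lang in labels:
            return lang, labels[lang]
    return None


labels = {"de": "Berlin", "en": "Berlin"}
# Assumed chain for illustration: user language first, then likely alternatives.
chosen = pick_label(labels, ["de-at", "de", "en"])
```

Returning the language together with the label lets the interface indicate that a fallback language was used.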

Data usage tracking

To ease maintenance of the data in Wikidata it is possible to get a list of all articles certain data is used in. Users are thereby able to see which articles are affected by changes they are making. This also allows a better overview of where and how Wikidata’s data is used in Wikimedia’s projects.

Remaining problems

  • phab:T49727: show properties used in an article
  • phab:T66591: write API module that gives a list of pages that use a given item
  • phab:T73498: benchmark database performance of usage tracking
  • phab:T75220: populate entity usage table during database schema update
  • phab:T93191: correctly track redirect usage

Not to be done

  • Usage tracking outside Wikimedia’s projects

Primary sources tool

Wikidata:Primary sources tool

Data from another database can be enhanced with references before being added to Wikidata.

Access to data from arbitrary items

Users on the client are able to include data from any Wikidata item they choose by specifying its ID. This expands on their ability to access data of the item currently associated with the page via a sitelink. This access is possible via both the parser function and Lua.

Remaining problems

  • phab:T107722: Make client change handling scale by batching updates
  • phab:T106190: {{#property:…}} parser function should enforce the parser's restricted function limit
  • phab:T95567: Preload sitelinks based on usage tracking data
  • phab:T76156: mw.wikibase: Use __index to lazy load entity contents
  • phab:T76159: Preload labels and descriptions for Lua and the parser function based on usage tracking data

Improved user experience for referencing: Duplication of an existing reference

It is possible to add a new reference by simply re-using another one from the same item.

Quantities with units

Users are able to enter quantitative data in Wikidata and re-use it on the clients as well as outside of Wikimedia. It is possible to express statements like “Berlin has an estimated population of 3,397,469 (±100) as of 31 July 2013” or “Berlin has an area of 891.85 km²”. At first only unitless quantities are supported. A later deployment adds a small number of units, and the set is expanded in future deployments. When viewing an item with such quantitative data, the user sees the values according to local conventions (decimal separator, unit conversion). On the client this data is accessed via the parser function and Lua and is also shown according to the content language. The API provides a way to access the data in the preferred format of the request sender.

Remaining problems

  • phab:T117031: Represent normalized unit values in full values RDF
  • phab:T112082: QuantityParser must pass-through valid unit representations
  • phab:T111770: Decide how to represent quantities with units in the "truthy" RDF mapping
  • phab:T111022: Improve input for quantities with units
  • phab:T110673: Meaningful ranking for the selectable units
  • phab:T108807: Edit summary for unit changes
  • phab:T95425: Quantity formatter rounding causes significant data loss
  • phab:T77983: Use localized unit symbols
  • phab:T77978: Support unit conversion

Not to be done yet

  • Currency conversion (Conversion rates and the value of a single currency change over time. That is very complex to model.)

Wikidata Query Service

Phabricator project

The data in Wikidata can only be used to its full potential with a way to query this data. This is done using a SPARQL endpoint.
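
A minimal sketch of preparing a request for a SPARQL endpoint. It assumes the public endpoint at `https://query.wikidata.org/sparql`, the `wd:`/`wdt:` prefixes being predefined there, and the `query` and `format` URL parameters; the helper name is hypothetical. The sketch only builds the URL and does not send it.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed endpoint of the public query service.
ENDPOINT = "https://query.wikidata.org/sparql"

# "What has the capital Paris": items whose capital (P36) is Paris (Q90).
QUERY = "SELECT ?item WHERE { ?item wdt:P36 wd:Q90 }"


def build_query_url(sparql, endpoint=ENDPOINT):
    """Return a GET URL that asks the endpoint to run the query and answer in JSON."""
    return endpoint + "?" + urlencode({"query": sparql, "format": "json"})


url = build_query_url(QUERY)
# Decode the URL again to show that the query survives encoding unchanged.
decoded = parse_qs(urlparse(url).query)
```

The resulting URL could be fetched with any HTTP client; a descriptive User-Agent header is advisable when doing so.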

Improved user experience for referencing: One-step-adding

Adding a statement including its reference is done in one step. It is not necessary to first save the statement and then add the reference to it.

Mobile view

Wikidata should offer a view that is optimized for mobile devices.

Remaining problems

  • phab:T115013: Display Wikidata pages on mobile watchlist with labels instead of Q id titles
  • phab:T95878: Make Wikidata editable on mobile web
  • phab:T95883: Investigate how to use different/new diff views for entity diffs on mobile

Formula datatype

Users can enter formulas in Wikidata using MathML html markup and a subset of AMS-LaTeX. They are rendered nicely.

Remaining problems

  • phab:T126349: RDF export for the math data type should not export input texvc string but its MathML representation
  • phab:T125712: Show TeX source for math values in diffs

Not to be done yet

  • editing optimized for mobile devices

In progress

Access for remaining sister projects

The remaining sister projects have access to sitelinks and data via Wikidata. The roll-out is staged to allow the communities to adapt. The planned order is:

  • Wikisource
    • Sitelinks: 14.01.2014 ✓ Done
    • Data: 25.02.2014 ✓ Done
    • Oldwikisource not done yet (see phab:T64717)
    • Edition interwiki links not done yet (see phab:T128173)
  • Wikiquote
    • Sitelinks: 08.04.2014 ✓ Done
    • Data: 10.06.2014 ✓ Done
  • Wikinews
    • Sitelinks: 19.08.2014 ✓ Done
    • Data: 02.12.2015 ✓ Done
  • Wikidata itself
    • Sitelinks: 19.08.2014 ✓ Done
    • Data: 19.08.2014 ✓ Done
  • Commons (not including file metadata!)
    • Sitelinks: 23.09.2013 ✓ Done
    • Data: 02.12.2014 ✓ Done
  • Wikibooks
    • Sitelinks: 24.02.2015 ✓ Done
    • Data: 22.09.2015 ✓ Done
  • Wikiversity
    • Sitelinks: 23.02.2016 ✓ Done
    • Data: 03.05.2016
  • Meta, MediaWiki, Wikispecies
    • Sitelinks: 20.10.2015 ✓ Done
    • Data: 02.12.2015 ✓ Done
  • Incubator
    • Sitelinks: ?
    • Data: ?

Not to be done (yet)

  • Wiktionary

UI redesign

Wikidata:UI redesign input

Reading and editing Wikidata is joyful and intuitive on desktops, tablets and mobile phones. The interface is visually pleasing, integrates nicely with other Wikimedia projects and contains no jargon. The interface provides users with the information they were looking for quickly and does not overwhelm them (e.g. deprecated data is hidden initially and information is ordered in an intuitive way). It invites the user to add additional information (including qualifiers and sources) and offers little nudges towards making correct and useful contributions by offering suggestions. Erroneous contributions and vandalism are discouraged. Navigating and editing the website is fast. Both the data and the interface are localized according to the user’s locale and language preferences. Where no data is available in a particular language a fallback is used.

Not to be done

  • enforcing user-defined constraints on data input

Improved internal consistency checks

https://phabricator.wikimedia.org/project/profile/1202/ and Wikidata:Constraint violation report input

It is easy for a user to find and understand constraint violation reports. Fixing an item that violates a constraint is easy. When viewing an item with a statement that violates a constraint the user can easily spot the wrong statement.

Consistency checks against 3rd parties

https://phabricator.wikimedia.org/project/profile/1203/

We check Wikidata's data against other databases. Inconsistencies are visible to the user when viewing an item.

RDF export

For 3rd party re-use and analysis of the data in Wikidata, we provide access to the data in RDF. This RDF will not assert facts, but rather represent claims. The RDF representation of individual entities is available via Wikidata’s linked data interface (Special:EntityData).

Improved watchlist integration

Wikidata:Watchlist integration improvement input

Users on Wikipedia and its sister projects need to see changes on Wikidata that affect their articles easily and comprehensively in their watchlist. This includes support for showing Wikidata changes when using the enhanced recent changes option.

Hover cards

When a user hovers over a link to an item a small card is shown that holds the most important information about that item.

Article Placeholder

Wikidata:Article placeholder input

As a Wikipedia reader I want to get information about a topic even if no article is available in my language. When searching for a topic that doesn't have an article in my language I am presented with basic information from Wikidata and get the option to write one.

Better handling of identifiers

Identifiers are better handled when visually separated from other statements. They need to be moved to their own section in the sidebar and get their own datatype. The linking of identifiers to their database should be done in Wikibase itself instead of a gadget.


Wikimedia Commons

c:Commons:Structured data

Wikimedia Commons holds a huge amount of multimedia files available for the other Wikimedia projects and the world to use. Structured data support for Wikimedia Commons is important to make it easier to maintain the files and make reuse, especially 3rd-party reuse, easier. The structured data support comes in two ways. The first is by providing access to the data stored in Wikidata. This includes things like the date of birth of an artist. The second way is by enabling Wikimedia Commons itself to store structured data related to the files stored there. This includes things like the license and subject of a photo for example.

When a new file is contributed, the uploader is asked to provide some information like tags, creator name and license in the upload wizard. Users are then able to access and edit this structured data via a form as well as an API (similar to how it is done on Wikidata). It is easy to specify and retrieve the licensing and provenance information of a multimedia file. Additionally it is easy to tag and categorize images based on concepts from Wikidata. Tags and other file information are shown in the user’s language to accommodate the multi-lingual audience of Wikimedia Commons. All this information can be used to easily search for files that fit certain criteria like “picture of a cat and a child from 2010, licensed under CC-BY-SA”.

Technical details

The data is stored on a “data” page attached to the file’s page that is similar to Wikidata’s item pages. Commons is thereby a repository and at the same time its own client as well as a client of wikidata.org.

On hold

Complex queries

Users are able to write queries that are more complex than the simple queries. This includes queries like “all poets who lived in 1982” or “all cities with more than 1 million inhabitants”. They are entered (using the semantics as embodied by Semantic MediaWiki’s Ask extension) in a page in the Query namespace and internally saved as JSON. They are then executed when resources are available, usually not immediately. The result is cached. A query can be set to rerun at regular intervals or on demand by an administrator. The result of the query is shown on the same page. It can also be accessed via the API. The clients can include the result of a query in their pages, for example to create list articles. The result will be a list of items. It can then be manipulated as needed by Lua. More result formatters and visualisations will be made available in future deployments based on Semantic MediaWiki’s result formatters. These queries make Wikidata even more useful to the Wikimedia projects and the world and are needed by the community to maintain the large database.

Optional

  • transitive queries
  • disjunction

Todo

Infobox demos + documentation

We provide a few demo infoboxes and good documentation for people to use as a starting point for moving infoboxes on Wikipedia towards using more data from Wikidata.

Improved user experience for referencing

Wikidata:Referencing improvements input

We make adding references easier.

Nudging users about adding or changing references

When adding a statement without a reference the editor is nudged to provide one. When a statement is changed but not its reference the editor is made aware of it and nudged to change the reference.

Wizard

A user adds one piece of information of a reference like its ISBN. The tool then automatically adds the other necessary information.

Article history integration

Editors on a client can look at the change history of an article and see all Wikidata changes relating to this article. This way they can see all changes affecting their article without having to go to another project.

Multi-lingual text datatype

Users can add strings and specify a language for them. This is similar to the mono-lingual text datatype. However, translations in more than one language can be provided. The one in the user’s language is shown and the others can be shown on demand.

Geo-shape datatype

Users can enter geo-shapes in Wikidata. They can for example use it to store the outline of a country.

Technical details

This will likely be realised using leaflet.js.

Wiktionary support

Wikidata:Wiktionary

A number of proposals have been put forth for using structured data on Wiktionary.

Beyond the planning horizon of this plan

Structured Wikiquote

m:Wikidata/Development/Wikiquote

Installing Wikibase on Wikiquote and allowing structured data on this sister project has lots of advantages. However, there is also lots of work to do in the software to support all features needed for this proposal.

Access from 3rd party wikis

MediaWiki installs outside the Wikimedia cluster are able to make use of the data on Wikidata similar to how they make use of images from Wikimedia Commons via InstantCommons.

Withdrawn

Simple queries

Users are able to pose simple queries to Wikidata via a SpecialPage as well as the API. Wikidata can answer queries like “What has the ISBN 2-01-202705-9” or “What has the capital Paris”. These queries are restricted to one property/value pair and return a list of items. The returned result only includes items where the statement is marked as preferred. These queries are most useful for use with one of the many identifiers in Wikidata that connect the knowledge base to other databases.

This is canceled in favor of Wikidata Query Service and the external identifier look up service.
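
For illustration, the withdrawn design (one property/value pair, preferred statements only, a list of items as the result) can be sketched over an in-memory structure. The data layout, the item ids `Q1` to `Q3` and the function name are hypothetical and unrelated to how Wikibase stores statements.

```python
def simple_query(items, prop, value):
    """Return ids of items that have a preferred statement prop = value.

    items: dict mapping item id to a list of statements, each a dict with
           the keys "property", "value" and "rank".
    """
    return [
        item_id
        for item_id, statements in items.items()
        if any(
            s["property"] == prop and s["value"] == value and s["rank"] == "preferred"
            for s in statements
        )
    ]


# Hypothetical items; P36 (capital) with the value Q90 (Paris).
items = {
    "Q1": [{"property": "P36", "value": "Q90", "rank": "preferred"}],
    "Q2": [{"property": "P36", "value": "Q90", "rank": "normal"}],
    "Q3": [{"property": "P36", "value": "Q64", "rank": "preferred"}],
}
matches = simple_query(items, "P36", "Q90")
```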

Not to be done

Querying for sources or qualifiers

Things to keep in mind

Some data types are easier to query than others. Time, Geo and Quantity values require range queries. For the Item and String data types, simple equality is sufficient.
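
The difference between the two kinds of lookup can be sketched with plain Python structures: equality fits a hash map, while a range query needs values kept in sorted order. The index layouts and names below are illustrative assumptions, not the storage used by Wikibase.

```python
from bisect import bisect_left, bisect_right


def equality_query(index, value):
    """Item and String values: a dictionary lookup is enough."""
    return index.get(value, [])


def range_query(sorted_pairs, low, high):
    """Time, Geo and Quantity values: find all entries with low <= value <= high.

    sorted_pairs: list of (value, item_id) tuples sorted by value.
    """
    values = [value for value, _ in sorted_pairs]
    start = bisect_left(values, low)    # first position with value >= low
    end = bisect_right(values, high)    # first position with value > high
    return [item_id for _, item_id in sorted_pairs[start:end]]


# Hypothetical data: ISBN strings and population counts.
isbn_index = {"2-01-202705-9": ["Q1"]}
population_index = [(50_000, "Q2"), (1_200_000, "Q3"), (3_397_469, "Q4")]

by_isbn = equality_query(isbn_index, "2-01-202705-9")
big_cities = range_query(population_index, 1_000_000, float("inf"))
```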