Wikidata:WDQS and Mediawiki API

See mw:Wikidata_query_service/User_Manual/MWAPI for implementation documentation.

Wikidata Query Service + Mediawiki API = <3

Configured template

Each API request should have a configured template that describes its inputs and outputs. This is required (at least for now) because the MediaWiki API and the Query Service see data differently: for the former, the input is a URL and the output is a tree-like data structure; for the latter, the data is either triples or tabular.

Currently the template must be pre-configured, but if this model works, we may allow dynamic definitions to be created right in the query.

Example service configuration:

{
  "services": {
    "Categories": {
      "params": {
        "action": "query",
        "prop": "categories",
        "titles": {
          "type": "list"
        },
        "cllimit": {
          "type": "int",
          "default": 500
        }
      },
      "output": {
        "items": "/api/query/pages/page/categories/cl",
        "vars": {
          "category": "@title",
          "title": "/api/query/pages/page/@title"
        }
      }
    }
  },
  "endpoints": [
    ".wikipedia.org",
    "www.mediawiki.org",
    "www.wikidata.org"
  ]
}

This defines one service, named Categories, which implements the MediaWiki Categories API (prop=categories).

Example API query:

SELECT * WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:api "Categories" .
    bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
    bd:serviceParam mwapi:titles "Albert Einstein" .
    bd:serviceParam mwapi:category ?category .
    bd:serviceParam mwapi:title ?title .
  }
}

Choosing the service

The API query should always provide two parameters: wikibase:api and wikibase:endpoint. Currently, both must be constants.

Example:

bd:serviceParam wikibase:api "Categories" .
bd:serviceParam wikibase:endpoint "en.wikipedia.org" .

The api parameter should refer to an existing service template, like the one above. The endpoint should be either a hostname or a URI node pointing to one of the allowed wiki endpoints. In the case of a URI, only the hostname is used.

Allowed endpoints

The endpoint hostname should have one of the entries in the endpoints list of the configuration above as its suffix. This means en.wikipedia.org is OK, but (with this config) en.wiktionary.org and fakewikipedia.org are not.
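The suffix rule can be illustrated with a minimal Python sketch. It is not the service's actual implementation: the function names endpoint_host and is_allowed are hypothetical, and the sketch assumes a plain string-suffix test on the hostname, as described above.

```python
from urllib.parse import urlparse

# Mirrors the "endpoints" list from the example configuration.
ALLOWED_ENDPOINTS = [".wikipedia.org", "www.mediawiki.org", "www.wikidata.org"]


def endpoint_host(endpoint: str) -> str:
    """Return the hostname for either a bare hostname or a full URI."""
    if "://" in endpoint:
        # For a URI, only the hostname part is used.
        return urlparse(endpoint).hostname or ""
    return endpoint.lower()


def is_allowed(endpoint: str, allowed=ALLOWED_ENDPOINTS) -> bool:
    """True if the endpoint's hostname ends with one of the allowed suffixes."""
    host = endpoint_host(endpoint)
    return any(host.endswith(suffix) for suffix in allowed)
```

Because ".wikipedia.org" starts with a dot, fakewikipedia.org does not match it, while any language subdomain of wikipedia.org does.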

Forming the query

The query is described by a set of query parameters. Query parameters can be either constant or variable. For constant parameters, such as action and prop, the value is specified in the template. For variable parameters, the type is given (currently ignored, but it may be validated later) and a default can be supplied. If no default is supplied, the parameter must be bound (see below). If the default is "" (the empty string) and the parameter is not bound, it is omitted from the query.

The parameter bindings are specified in the service params like this:

bd:serviceParam mwapi:titles "Albert Einstein" .

The object of the triple (the parameter value) can be either a constant or a variable.

Note that currently input and output parameters should have distinct names.
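The rules for constant parameters, defaults and bound values can be sketched in Python as follows. This only illustrates the rules stated above under simplifying assumptions: resolve_params is a hypothetical helper, bindings are passed as a plain dictionary, and the type field is ignored, as it currently is in the service.

```python
from urllib.parse import urlencode

# Mirrors the "params" section of the Categories template above.
CATEGORIES_PARAMS = {
    "action": "query",
    "prop": "categories",
    "titles": {"type": "list"},
    "cllimit": {"type": "int", "default": 500},
}


def resolve_params(template: dict, bindings: dict) -> dict:
    """Combine template parameters with the values bound in the query."""
    resolved = {}
    for name, spec in template.items():
        if not isinstance(spec, dict):
            # Constant parameter: the template supplies the value.
            resolved[name] = str(spec)
        elif name in bindings:
            # Variable parameter bound in the SERVICE clause.
            resolved[name] = str(bindings[name])
        elif "default" not in spec:
            # No default and no binding: the parameter must be bound.
            raise ValueError(f"parameter {name!r} must be bound")
        elif spec["default"] != "":
            resolved[name] = str(spec["default"])
        # An empty-string default with no binding omits the parameter.
    return resolved


params = resolve_params(CATEGORIES_PARAMS, {"titles": "Albert Einstein"})
query_string = urlencode(params)
```

Here query_string holds the URL-encoded parameters that would be appended to the wiki's API URL; the exact URL the service builds is not shown in this document.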

Processing the result

The output of the service is processed using XPath. The items setting defines an XPath expression that selects the collection of elements that is the source for the results. Each entry in vars is evaluated relative to each element of that collection to produce a single result row.

Note that in the example configuration category is an attribute of the collection element (@title), whereas title is an absolute path evaluated from the root of the response.
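To make the items/vars evaluation concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the service's code: the response in SAMPLE_RESPONSE is a hypothetical, shortened XML document shaped to fit the items path, and select is a hand-rolled helper that handles only the simple paths used in this configuration rather than full XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened API response used only for illustration.
SAMPLE_RESPONSE = """
<api>
  <query>
    <pages>
      <page title="Albert Einstein">
        <categories>
          <cl title="Category:1879 births" />
          <cl title="Category:1955 deaths" />
        </categories>
      </page>
    </pages>
  </query>
</api>
"""

# Mirrors the "output" section of the Categories template.
OUTPUT = {
    "items": "/api/query/pages/page/categories/cl",
    "vars": {"category": "@title", "title": "/api/query/pages/page/@title"},
}


def select(root, context, path):
    """Evaluate a tiny path subset: 'a/b' or '/root/a/b', optionally ending in '@attr'."""
    steps = path.strip("/").split("/")
    base = context
    if path.startswith("/"):
        # Absolute path: the first step must name the document root.
        if steps[0] != root.tag:
            return []
        base, steps = root, steps[1:]
    attr = None
    if steps and steps[-1].startswith("@"):
        attr = steps.pop()[1:]
    nodes = base.findall("/".join(steps)) if steps else [base]
    if attr is None:
        return nodes
    return [n.get(attr) for n in nodes if n.get(attr) is not None]


def rows(xml_text, output):
    """Produce one result row per element matched by the items path."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for item in select(root, root, output["items"]):
        row = {}
        for var, path in output["vars"].items():
            values = select(root, item, path)
            first = values[0] if values else None
            row[var] = first.text if isinstance(first, ET.Element) else first
        result.append(row)
    return result


result_rows = rows(SAMPLE_RESPONSE, OUTPUT)
```

With this sample, each cl element yields one row whose category comes from the element itself and whose title comes from the page element found from the root.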

The binding to the variables is provided in the service params like this:

bd:serviceParam mwapi:category ?category .
bd:serviceParam mwapi:title ?title .

It is OK to omit one of the configured output vars; in that case it is simply ignored.

Example

Example query showing the categories of English Wikipedia pages that describe laureates of the Nobel Prize in Physics:

The following query uses these:

  • Properties: award received (P166)
    SELECT * WHERE {
      ?item wdt:P166 wd:Q38104 .
      ?item rdfs:label ?name .
      FILTER(lang(?name) = "en")
      SERVICE wikibase:mwapi {
              bd:serviceParam wikibase:api "Categories" .
              bd:serviceParam wikibase:endpoint "en.wikipedia.org" .
              bd:serviceParam mwapi:titles ?name .
              bd:serviceParam mwapi:category ?category .
              bd:serviceParam mwapi:title ?title .
      }
    }
    

Run it on http://wdqs-test.wmflabs.org/, since the production service does not have the prototype enabled.