Wikidata:SPARQL query service/WDQS backend update/April 2024 scaling update


Hello all!

We’ve been moving forward on the WDQS Graph Split, so it is time for an update!

We have new documentation to help the migration to the split graph:

  • Federation limits: Explanation of the limitations of the SPARQL federation as used on the graph split. This might help you understand what is possible and what isn’t when you need to federate the main WDQS graph with the scholarly subgraph.
  • Federated queries examples: This document explains how to rewrite queries to use SPARQL federation over the split graph. We’ve taken a number of real-life examples and rewritten them to use federation. While rewriting queries is not always trivial, all the examples that we tried can be made to work over a split graph.
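
As a rough illustration of what such a rewrite can look like, the sketch below builds a federated query with the standard SPARQL 1.1 SERVICE keyword: the author's label is read from the main graph, while the articles are fetched from the scholarly subgraph. The endpoint URL and the helper name build_federated_query are placeholders invented for this example, not the actual split services, and the choice of wdt:P50 (author) is only illustrative. Please refer to the two documents above for the real service addresses and the limits that apply.

```python
import re

# Hypothetical placeholder URL (assumption): NOT the real scholarly endpoint.
SCHOLARLY_ENDPOINT = "https://scholarly.example.org/sparql"


def build_federated_query(author_qid, endpoint=SCHOLARLY_ENDPOINT, limit=10):
    """Return SPARQL text meant to be sent to the main-graph endpoint.

    The SERVICE block delegates the article lookup to the scholarly
    subgraph; everything outside it is evaluated on the main graph.
    """
    # Only accept plain item IDs so nothing else can be spliced into the query.
    if not re.fullmatch(r"Q[1-9][0-9]*", author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return f"""PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?article ?authorName WHERE {{
  # Evaluated on the main graph: the author item and its English label.
  wd:{author_qid} rdfs:label ?authorName .
  FILTER(LANG(?authorName) = "en")
  # Evaluated on the scholarly subgraph: articles listing that author.
  SERVICE <{endpoint}> {{
    ?article wdt:P50 wd:{author_qid} .
  }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    print(build_federated_query("Q42"))
```

Keeping the SERVICE block small and binding the author item before it is one way to limit how much data crosses between the two graphs, though whether that is sufficient for a given query depends on the federation limits described above.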

We have been reaching out to people who will be impacted by the graph split. In particular, we have been having conversations with community members close to the Scholia and Wikicite projects. In that context, we have realized that our initial split proposal (moving all instances of scholarly articles to a separate graph, i.e. ?entity wdt:P31 wd:Q13442814) is not sufficient. We have prepared a second and final proposal that refines this split to make it easier to use. See WDQS Split Refinement for details. We are open to feedback until May 15th, 2024; please send it to the related talk page.
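
To make the initial rule concrete, here is a toy sketch of a split on ?entity wdt:P31 wd:Q13442814. It assumes, as a simplification, that every triple whose subject is an instance of scholarly article moves to the scholarly graph and everything else stays in the main graph. The function name split_entities and the article ID Q999000001 are invented for the example; this is not the actual update pipeline, and the refined proposal differs from this rule.

```python
SCHOLARLY_ARTICLE = "Q13442814"  # scholarly article
INSTANCE_OF = "P31"              # instance of


def split_entities(triples):
    """Partition (subject, property, value) triples into (main, scholarly).

    Assumption for this sketch: an entity with P31 = Q13442814 moves to
    the scholarly graph together with all triples it is the subject of.
    """
    scholarly_subjects = {
        s for s, p, o in triples
        if p == INSTANCE_OF and o == SCHOLARLY_ARTICLE
    }
    main, scholarly = [], []
    for triple in triples:
        target = scholarly if triple[0] in scholarly_subjects else main
        target.append(triple)
    return main, scholarly


if __name__ == "__main__":
    example = [
        ("Q42", "P31", "Q5"),                # a human: stays in the main graph
        ("Q999000001", "P31", "Q13442814"),  # made-up article ID
        ("Q999000001", "P50", "Q42"),        # its author statement moves with it
    ]
    main, scholarly = split_entities(example)
    print(len(main), "main triples;", len(scholarly), "scholarly triples")
```

Note that in this toy split the author statement ends up in the scholarly graph while the author item itself stays in the main graph, which is the kind of situation where federation between the two graphs becomes necessary.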

While we refine this split, we are starting work on the missing pieces needed to make the graph split available. This includes modifying the update pipeline to support the split and better automating the data loading process. We are also working on a migration plan, which we will communicate as soon as it is ready. Our current assumption is that, once the split services are available, we will allow about 6 months for the migration before shutting down the full graph endpoint.

We need your help more than ever! If you have use cases that need access to scholarly articles, please read "Federation Limits" and "Federated Queries Examples", rewrite and test your queries, and add your working examples to "Federated Queries Examples". Send your general feedback to the project page.

On a side note, WDQS isn’t the only SPARQL endpoint exposing the Wikidata graph. Have a look at "Alternative endpoints", which lists a number of alternatives not hosted by WMF; these might be helpful during the transition.

Thanks!

Guillaume