Wikidata talk:REST API feedback round/Archive1

Please provide your feedback here. It’d be great if you could copy the template at the bottom of the page to make it easier to structure the feedback. Thank you!

Feedback on initial specification

Feedback by Albert.meronyo and pasqLisena

This is a great initiative, and a great improvement over the current API. We can only expect that with a full-blown REST API and OpenAPI specification, more applications will be able to easily read from and contribute to Wikidata.

  • I’m currently using the action API in this way:
    • We are not using the action API at the moment
  • I think this is great about the proposed REST API (incl. why):
    • It relies on Swagger/OpenAPI, which is the de facto industry standard
    • It makes full use of HTTP verbs/methods
    • Includes schema information
  • I think this could be improved about the proposed REST API (incl. why):
    • We agree with Dragan’s points on the overlap with SPARQL functionality, and have actually developed tooling (grlc and SPARQL Transformer) that bridges the gap between publishing SPARQL queries and publishing equivalent REST APIs. These tools have been warmly welcomed in the community (more info at https://d2klab.github.io/swapi2020/). SPARQL Transformer offers ways to choose the desired JSON output of a query, merge output bindings around common ids, and parse datatype values. grlc reads a query collection (stored locally or in an online repository) and builds a REST API on top of it, following the OpenAPI specification and exposing Swagger documentation at the same time.

Assuming that the SPARQL queries implementing the API functionality are well documented (e.g. description, parameters, desired JSON output, etc.), these tools can automatically build the OpenAPI specification and appropriately answer HTTP API calls.

    • We believe that the usage of these tools could greatly improve the maintenance life-cycle of the new Wikidata REST API, minimizing it to the task of documenting and curating SPARQL queries
    • We would be very happy to invest part of our time supporting an eventual adaptation and deployment of these tools to support the new Wikidata REST API
    • At the same time, we are aware that not all the expected functionality of the API proposal (e.g. DELETE) is fully supported in our tools; but we would be very keen to use this chance to improve them
  • Concerns I have:
    • The route organisation/modelling can be a bit confusing, with the verbose /entities/ and /statements/ prefixes?
  • Other comments:
    • We are available for a chat anytime!

Albert.meronyo (talk) 19:21, 3 February 2021 (UTC)

Feedback by Pyfisch

  • I’m currently using the action API in this way: editing items, fetching recent changes, (querying user information)
  • I think this is great about the proposed REST API (incl. why):
    • The PATCH method and JSON Patch. I recently spent a lot of time building a framework around wbgetentities and wbeditentity that allows me to fetch an entity, apply some changes, and send a single request to wbeditentity. To accomplish this I track changes made to the typed entity representation and finally manually construct the commands that should be sent to wbeditentity. With the new API I can just serialize the typed entity representation to JSON and use the patch algorithm to determine the changes to send to the API (see the sketch after this comment).
    • Authentication through HTTP headers, and not needing to fetch tokens (OAuth is sufficient for me)
  • I think this could be improved about the proposed REST API (incl. why):
    • There should not be different paths for items and properties. Wikidata already uses http://www.wikidata.org/entity/Q42 and http://www.wikidata.org/entity/P42 and therefore the API should just be /entities/{entity_id}. Separation is already provided by the initial letter. This simplifies code that handles different types of entities.
    • The additional data attached to revisions (comment, tags, bot) are not part of the item representation. Therefore an HTTP-purist approach would be to send them as query parameters. Then the HTTP PATCH body looks exactly like in the RFC and can use the proper MIME type.
  • Concerns I have:
    • Is it possible to avoid having qualifiers-order and snaks-order and instead store the snaks in a list?
    • Is it possible to check if additional constraints would be violated by the changed statements before saving? This may improve the data quality of Wikidata in the long run as tools wouldn't add contradictory information.
  • Other comments:
    • Besides access to the current revision it would be useful for some tasks (e.g. vandalism detection) to access previous revisions and structured diffs.

Thank you for working on this! --Pyfisch (talk) 12:53, 13 October 2020 (UTC)
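
A minimal sketch of the diff-based workflow described above, assuming the third-party Python jsonpatch library; the entity documents are toy examples:

import jsonpatch  # third-party library: pip install jsonpatch

# Fetch the entity, edit the local copy, then derive the patch to send.
original = {"labels": {"en": {"language": "en", "value": "Duglas Adams"}}}
modified = {"labels": {"en": {"language": "en", "value": "Douglas Adams"}}}

# make_patch() computes an RFC 6902 operation list from the two documents;
# this list is what the client would send as the body of the PATCH request.
patch = jsonpatch.make_patch(original, modified)
print(patch)  # [{"op": "replace", "path": "/labels/en/value", "value": "Douglas Adams"}]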

@Lydia Pintscher (WMDE): The "give feedback" button directs users to another site. Maybe all feedback should be moved to one place? --Pyfisch (talk) 15:52, 13 October 2020 (UTC)
Uhhhh Yes! Thanks for noticing. Will fix by moving the main page. --Lydia Pintscher (WMDE) (talk) 16:02, 13 October 2020 (UTC)
For me personally, I disagree with merging the items and properties endpoints. Having them separate is very useful. Nicereddy (talk) 04:31, 16 October 2020 (UTC)
@Nicereddy: Could you give an example and argue why it's more useful to separate them? I can't think of any. What about Lexemes then? Should they also get their own endpoint?--So9q (talk) 12:09, 19 December 2020 (UTC)
One additional question: How will maxlag work in the REST API? I'd appreciate it if the congestion management were done on the server side and the server just returned 429 Too Many Requests and a Retry-After header when the bot or tool should slow down (see the sketch below). This leaves the possibility open for more advanced server-side load management schemes in the future. --Pyfisch (talk) 09:58, 16 October 2020 (UTC)
+1. Using a 429 with Retry-After makes a lot of sense. Husky (talk) 12:31, 3 November 2020 (UTC)
+1.--So9q (talk) 12:09, 19 December 2020 (UTC)
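
A hedged sketch of the client side of that scheme, using the Python requests library; the URL handling is illustrative only:

import time
import requests

def get_with_backoff(url, max_retries=5):
    """GET a resource, honouring 429 Too Many Requests and Retry-After."""
    for _ in range(max_retries):
        resp = requests.get(url)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Retry-After is assumed to be in seconds here; a robust client
        # would also handle the HTTP-date form of the header.
        time.sleep(int(resp.headers.get("Retry-After", "1")))
    raise RuntimeError("server kept rate-limiting us")
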
Thank you again for the feedback! Some replies to points you raised:
  • We’ll think more about the separate endpoint for Items and Properties. It seems people are split on this. We’re tracking this in phab:T264086.
  • We’ll also think more about the edit metadata you mentioned. That was forgotten so far.
  • Lists: that seems like a more fundamental change to the data model that’s out of the scope of this work but we’ll put it on the list of things to think about for future improvements.
  • Checking constraints: that is currently also out of the scope of this work but will be considered for later. One issue there is that not all constraints are local to the Item and could currently be checked on save. I agree though that this would be good to have in the future.
  • Historic revisions: This has come up several times in unrelated discussions. It’s probably too much for the first version but looks like something that should follow soon after.
  • Maxlag and congestion management: very good point. We’ll ponder it.
--Lydia Pintscher (WMDE) (talk) 08:12, 19 November 2020 (UTC)

Feedback by BrokenSegue

  • I'm a bit confused by the GET items endpoint. This just lists arbitrary items in no particular order? That doesn't sound useful. Also, in my experience, APIs that paginate tend to give you a token for the next page, which solves some issues around the data changing during listing and prevents you from asking for page 1000, which might be expensive (see the sketch after this comment).
  • A main use case I have in mind is looking up items based on an identifier, e.g. show me the item with this ISBN. Is something like that in scope for this project?
  • Will the API server convert these API requests that modify multiple statements into multiple edits?
  • How is authorization going to be done? None of the endpoints mention the required parameters for this, though clearly you plan to support it.
  • I assume "_fields" is a list of Property IDs?

BrokenSegue (talk) 12:29, 13 October 2020 (UTC)
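
A minimal sketch of the token-based pagination pattern suggested above; the endpoint URL and the next_token field are hypothetical, not part of the proposal:

import requests

def iter_items(url="https://example.org/w/rest.php/wikibase/v0/entities/items"):
    """Yield items page by page, following an opaque continuation token."""
    params = {}
    while True:
        page = requests.get(url, params=params).json()
        yield from page["items"]
        token = page.get("next_token")  # hypothetical continuation field
        if token is None:
            return
        params["token"] = token  # hypothetical query parameter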

BrokenSegue: You might find the Linked Data Fragments endpoint to be of interest. It's incredibly fast and supports your ISBN use case (although ISBNs are a bad example because the formatting is all over the place). --Matthias Winkelmann (talk) 20:35, 16 October 2020 (UTC)
Thank you for the feedback! Some replies to points you raised:
  • For things like ISBN lookup we’ll have to provide other ways to do this efficiently. It was also one of the things that came up as part of our investigation around how to improve the Query Service situation.
  • Will the API convert requests that modify multiple statements into multiple edits: currently our thinking is no.
  • Authorization will happen via HTTP. We’ll make that clear in the documentation in the future.
  • We’ll improve the documentation for the _fields parameter. It is for the top-level entity fields.
--Lydia Pintscher (WMDE) (talk) 08:13, 19 November 2020 (UTC)
In the spirit of KISS I think it's best to keep multiple edits in one "changeset". If you want different changesets to show up in the history, send multiple requests instead.--So9q (talk) 12:14, 19 December 2020 (UTC)

Feedback by Dragan Espenschied

This is a great initiative.

I would suggest including querying of the SPARQL endpoint "associated" with a Wikibase in the API. Wikibase, as just an extension to MediaWiki, is not that useful on its own. Wrapping SPARQL querying into the new API would make it much easier for scripts to achieve certain tasks and avoid iterating over lists or storing interim query result tables on disk. --Dragan Espenschied (talk) 12:57, 13 October 2020 (UTC)

  • I’m currently using the action API in this way:
    • Editing items via Wikidata Integrator
    • Adding pages to MediaWiki and connecting them to items via sitelinks
    • Performing actions on items that cannot be done via OpenRefine (especially deletion of statements)
  • I think this is great about the proposed REST API (incl. why):
    • It will be possible to create better libraries for data manipulation
    • Getting rid of the "page" model for editing items is critically important
    • Step towards editing history per statement
  • I think this could be improved about the proposed REST API (incl. why):
    • Inclusion of SPARQL endpoint would make for a much more complete API, so that a Wikibase is defined by a repo, a frontend, and a query service.
    • Mass-editing functionality could be added to avoid having to launch a call for every single edit, and "hacks" like QuickStatements could be replaced
  • Concerns I have:
    • It is foreseeable that parts of the community will demand the creation of features in the REST API that could be better served by SPARQL. The pywikibot legacy in particular seems to be an issue.
    • Since full use of Wikibase includes interactions with MediaWiki, a mix of the new REST API and the existing action API can easily become a mess to work with

--Dragan Espenschied (talk) 05:47, 22 October 2020 (UTC)

Thank you for the feedback! Some replies to points you raised:
  • SPARQL: We decided not to integrate SPARQL in this API at this point and instead focus on covering at least the feature set the current action API provides so we have a solid base to build on in the future.
  • Mixing REST for the Wikibase-specific API modules with action API modules from MediaWiki core: The WMF is working on a similar REST API for MediaWiki core, so this should work nicely together in the future.
--Lydia Pintscher (WMDE) (talk) 08:14, 19 November 2020 (UTC)

Feedback by Hogü-456

I think it is good if APIs are improved. I am not good at programming, so I don't use the API. If you want to make it easier to get data from Wikidata, then this is a first step. I think the processing afterwards should also be easier. Many people can use spreadsheets to process data, but processing in web applications, for example, is not as widely known. How to make it easier to process data from a database is something I think about a lot. --Hogü-456 (talk) 18:57, 13 October 2020 (UTC)

Feedback by ArthurPSmith

The API as it stands is adequate, if a bit clunky; I think a bigger improvement could be made by going to a GraphQL model, which allows the client to determine the shape of the response, rather than having a set collection of fixed shapes from the various REST endpoints. It could also greatly simplify the overall API. For example, if somebody needs labels and sitelinks but not statements for an item, that would allow a single API query. If somebody wants a particular set of property statements, but not all of them, then that could be specified. And what if I want to fetch data on linked items? Say I want all the English labels on employers of all the authors of a particular work: I believe that could be done through a single GraphQL API query, rather than a (possibly lengthy?) series of REST queries (see the sketch after this comment). This may be getting more into SPARQL territory of course, and I use WDQS for a lot of such data fetching now. But I think it's at least possible a GraphQL approach could be much more developer-friendly on both sides. ArthurPSmith (talk) 19:09, 13 October 2020 (UTC)
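
A hedged sketch of how that employer-labels example might look against a purely hypothetical GraphQL schema (none of the field names below are real), sent from Python:

import requests

# Hypothetical query: English labels of employers (P108) of the authors
# (P50) of one work. The schema is invented purely to illustrate how the
# client, not the server, decides the shape of the response.
QUERY = """
{
  entity(id: "Q12345") {
    statements(property: "P50") {
      value {
        statements(property: "P108") {
          value { label(language: "en") }
        }
      }
    }
  }
}
"""
resp = requests.post("https://example.org/graphql", json={"query": QUERY})
print(resp.json())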

I've created and used GraphQL APIs and I like them quite a bit. My main reservation with using them here would just be the amount of work necessary to get them to function correctly with Wikidata's model. It wouldn't be trivial by any means. Nicereddy (talk) 04:27, 16 October 2020 (UTC)
Thank you for the feedback! GraphQL is actually on the table for the future. We see this REST API as a necessary step on the way there. --Lydia Pintscher (WMDE) (talk) 08:14, 19 November 2020 (UTC)
@ArthurPSmith: There is already a GraphQL API for Wikidata, it seems.
I recently created Wikidata:LexUse and use a single SPARQL query to get what I need from WD, then add usage examples using WikibaseIntegrator, and then add the lexeme to the user watchlist using the Watch REST API. That works fine, and I don't get more data than needed by crafting the SPARQL wisely.--So9q (talk) 13:11, 19 December 2020 (UTC)

Feedback by GZWDer

I'm struggling to figure out quite what you mean here. Is it that, when using the /entities/items endpoint to look up multiple entities, you would like to be able to get multiple entities, but without some fields from those entities? Or is this about editing? Adam Shorland (WMDE) (talk) 12:18, 11 November 2020 (UTC)

Feedback by Ghouston

There are two problems I noticed with the old API (I haven't really looked at the new proposal):

  1. There doesn't seem to be a mechanism for dealing with large numbers of statements on an item; e.g., when requesting authors of articles, maybe you don't want to get a couple of thousand authors from certain CERN papers. Some kind of paged interface and an ability to set a limit on results, like a batch size?
  2. The API seems to be non-symmetric when requesting items: you can get items that appear on the right side of a statement, but not the left. E.g., you can't ask for items that have a member of (P463) The Beatles (Q1299) statement, so people instead use has part(s) (P527) and part of (P361) because they have inverses. Since templates only have access to the API, and not the query engine, they can't get the values otherwise. This also interacts with the problem above, in case you ask for all instances of human (Q5) or something. Ghouston (talk) 03:34, 14 October 2020 (UTC)
Thanks for your feedback! Some replies to the points you raised:
  • Pagination: We’ll look into that more. Good point.
  • Symmetric Properties: That’s outside the scope of the current work but it’s good to have it on the radar for the future as this is also something people have been thinking about in other areas.
--Lydia Pintscher (WMDE) (talk) 08:15, 19 November 2020 (UTC)
+1 for some kind of pagination.--So9q (talk) 13:15, 19 December 2020 (UTC)

Feedback by Isaac (WMF)

  • I’m currently using the action API in this way: as a researcher and tool builder, which generally means that I have a large list of Wikidata items or Wikipedia articles that I'm trying to filter / augment with the values associated with a few select properties in Wikidata such as instance-of (P31), gender (P21), or coordinate location (P625).
  • I think this could be improved about the proposed REST API (incl. why): a use case that recurs quite frequently in my research / tool building is filtering a list of articles based on a certain Wikidata property. For instance, I might have 500 Wikipedia articles and I want to know which of them are articles about women. Currently, this requires either making 1000 calls to wbgetclaims or requesting the entire set of Wikidata claims for each article from wbgetentities and then filtering based on instance-of = human and sex-or-gender = <list of relevant gender values>. Extending wbgetclaims to take multiple items (and ideally properties too) would greatly reduce the I/O and latency of these tools (see the sketch after this comment). For example, we could include this logic in our WikiGapFinder tool.
  • Other comments: I skimmed the specifications and couldn't tell if this sort of request was out of scope but thanks for taking on this work and being open to feedback. In general, standardizing is much appreciated as I have also run into issues where I wrote code to operate on the Wikidata dumps and then switched to the API just to find that the schema of the data had shifted underneath me.

Thanks! --Isaac (WMF) (talk) 15:34, 14 October 2020 (UTC)
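
Not the proposed REST API, but for reference: the existing wbgetentities module already accepts up to 50 pipe-separated IDs per call, which cuts the 500-article case down to 10 requests (the filtering still happens client-side). A minimal sketch:

import requests

API = "https://www.wikidata.org/w/api.php"
QIDS = ["Q42", "Q254", "Q7251"]  # example items
WOMAN = {"Q6581072"}  # sex or gender (P21) values counted here; extend as needed

def entity_ids(claims, prop):
    """IDs of the item values of one property, skipping novalue/somevalue snaks."""
    return {c["mainsnak"]["datavalue"]["value"]["id"]
            for c in claims.get(prop, [])
            if c["mainsnak"]["snaktype"] == "value"}

for i in range(0, len(QIDS), 50):  # wbgetentities allows 50 IDs per call
    data = requests.get(API, params={
        "action": "wbgetentities", "ids": "|".join(QIDS[i:i + 50]),
        "props": "claims", "format": "json"}).json()
    for qid, entity in data["entities"].items():
        claims = entity.get("claims", {})
        if "Q5" in entity_ids(claims, "P31") and entity_ids(claims, "P21") & WOMAN:
            print(qid)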

Thanks for your feedback!
We will look more into the filtering use case but this sounds like something where we would recommend actually calling the API many times.
You mention schema differences. Could you tell us which dumps you were using and which differences you saw if you remember?
--Lydia Pintscher (WMDE) (talk) 08:15, 19 November 2020 (UTC)
@Isaac (WMF): Is there a reason why you don't use a SPARQL query for that instead, or the GraphQL endpoint?--So9q (talk) 13:19, 19 December 2020 (UTC)
@Lydia Pintscher (WMDE): @So9q: I apologize, because this is an embarrassingly long time to respond, so I suspect it's long past useful. The schema differences arise between the XML dumps of the Wikidata articles (e.g., wikidatawiki-latest-pages-meta-current...) and the JSON dumps or API: the XML lacks the datatype key while the JSON dumps / API have it. The reason I didn't use a SPARQL query is that I have a very specific set of QIDs that I want information on. I suppose I could just run a daily SPARQL query that gathers all items that match my criteria and then check locally against that, but that introduces extra overhead, and it is easier and more up-to-date to always gather that information on the fly from the APIs (though it requires a lot of API calls, which adds to latency). Perhaps there is a way to use SPARQL to quickly gather arbitrary information on just a specific set of, e.g., 100 QIDs, but I assumed that wasn't particularly feasible to do quickly. --Isaac (WMF) (talk) 20:38, 29 June 2021 (UTC)
@Isaac (WMF): I still don't understand what it is you need. Do you need to know if changes occurred to a range of QIDs? If you want to know whether 500 Wikipedia articles are about women, would it not be easier to add relevant categories to them and use PetScan? If you could be more clear about what you are trying to do, it would be easier to help you get the data in the most efficient way (could you give an example?).
That said, I really like your suggestion to enable the API to take a list of items and/or properties (HTTP is expensive, so hitting the endpoint many times loads down the whole network, which is not great, nor environmentally sound). --So9q (talk) 22:15, 29 June 2021 (UTC)

Feedback by Lagewi

  • I mainly use the API in Citation.js, where I need to fetch all information related to an entity that might be used in a formatted reference or other related machine-readable formats (which currently means anything in CSL-JSON). However, I cannot request the publisher of a book before I have the QID of that publisher. My current implementation first fetches the book, then the publisher, then the publisher location, then the country that location is in, to get something like "Oxford University Press, Oxford, UK". It groups these requests as much as possible, so if it needs to fetch two books with two different publishers it still only has to make 4 requests. This could probably be simplified with a single SPARQL query, but I would have to revisit the code.
  • I really like that this proposes a REST API that takes the structure of the data in Wikidata into account.
  • This is probably inherent to REST APIs but given how I use the API that could result in a lot of requests. There was/is work being done on a GraphQL API I believe, but if that is implemented using this API it might not help much.

Thanks! --Lagewi (talk) 22:10, 15 October 2020 (UTC)

Thank you for your feedback!
Good to understand the use case. We’ll look into how to better handle that as it has come up in other places as well. It’ll not happen in the first version of the REST API though.
--Lydia Pintscher (WMDE) (talk) 08:16, 19 November 2020 (UTC)
@Lagewi: FYI, there is already a [GraphQL endpoint] available. Also, what you want seems totally possible in a single SPARQL query that only takes a book QID.--So9q (talk) 13:23, 19 December 2020 (UTC)

Feedback by Nicereddy

  • I’m currently using the action API in this way:
  • I think this is great about the proposed REST API (incl. why):
    • This is soooo much better than the current Wikidata API, thank you so much! I might end up creating a Ruby gem to wrap this API once it's available :)
    • Having it use normal REST conventions is generally a fantastic idea, it's much easier to reason about when it's built specifically for Wikibase and has sensible API paths.
  • I think this could be improved about the proposed REST API (incl. why):
    • I'd like to hear a bit more about how errors will work with this API. One of the biggest problems I've had with the current API is that the errors can be very generic, and so difficult to debug. e.g. 'invalid snak' could mean a few dozen different possible issues with my API request. Good documentation on what a valid snak looks like and better error messages around that would be greatly appreciated!
    • Some improved documentation around the valid values of certain typed properties would be nice to have (e.g. a Snak has a "snaktype", and I'm pretty sure there are only a handful of valid values for it, and a Statement has a "rank", which also only has a few valid values IIRC)
  • Concerns I have:
    • The 'update' methods for qualifiers and references should probably be PUT rather than POST requests, right?
    • This isn't a huge problem, but the GET /entities/items/{entity_id} endpoint (and some other endpoints) has a statements property in its response, and the property isn't typed.
    • Some endpoints don't have very good example response data, e.g. GET /statements/{statement_id}/qualifiers.
  • Other comments:
    • I just want to restate how enthusiastic I am about this, this seems like a huge step forward from the current API. I very much look forward to further progress here! :)

Thanks for your work! Nicereddy (talk) 04:24, 16 October 2020 (UTC)

One other thing I thought of that is worth mentioning: it'd be nice to be able to make requests about multiple entities at once. E.g. if I wanted to see all video games by a specific developer, I'd query that via SPARQL, and then I could take the QIDs for each of the returned entities and query for information about them all at once. So having the ability to specify IDs in the GET /entities/items route would be useful. Of course, this might not be performant if there are too many, so it'd be important to limit how many can be requested at once. Nicereddy (talk) 02:28, 20 October 2020 (UTC)
Thanks for your feedback!
It’d be lovely if you created a Ruby gem for it in the future!
About error reporting: We will have to spec that out some more. The situation hopefully already improves with HTTP status codes and the ability to PATCH defined resources which should help make it easier to understand where the error is occurring.
About documenting valid values for snaktype and rank: Will do! Good catch.
About PUT vs. POST for update on qualifiers and references: That was a conscious decision. Please see https://github.com/wmde/wikibase-rest-api-proposal/blob/master/GOTCHAS.md for the reasoning.
--Lydia Pintscher (WMDE) (talk) 08:17, 19 November 2020 (UTC)

Feedback by Egonw

Awesome! This is a lot to digest. I was looking at the OpenAPI specification (thanks, keep that :) and was looking for APIs for the following without noticing a clear method (I may be overlooking it, but this at least indicates one way I'd be using the API):

  • give me all items that are an instance of (possibly transitively, via subclass of) some type (e.g. all things that are chemical compounds)
  • give me all items with a statement with some property (e.g. all items with an InChIKey)

I guess some of these can be implemented with optional arguments for existing methods. Looking forward to this API! --Egon Willighagen (talk) 09:07, 17 October 2020 (UTC)

Thanks for your feedback!
The first one seems like something that should continue to be served by SPARQL at this point. The second one we’re looking into a specialized API for.
--Lydia Pintscher (WMDE) (talk) 08:17, 19 November 2020 (UTC)

Feedback by Andrawaag

  • I’m currently using the action API in this way:
    • Fetching the JSON model of Wikidata items for bulk editing and writing the fully edited JSON back upon completion, in both Wikidata and Wikibase
    • Logging in and creating sessions in bot code
    • Merging Wikidata items when needed
    • Label search
  • I think this is great about the proposed REST API (incl. why):
    • My main issue with the current action API is that it is difficult to navigate and that there are many redundancies implemented. For example, it took a while, and a suggestion from the community, before I learned of the wbeditentity call, where a Wikibase item is fetched in its entirety, which allows local editing before resubmitting. Using this call really increased performance, because before that an item was edited statement by statement and qualifier by qualifier, simply because those calls exist as well. Although the REST API is not implemented yet, just by looking at it, it is immediately obvious which call is needed.
  • I think this could be improved about the proposed REST API (incl. why):
    • It would be nice if a GUI function from Wikibase were explicitly mapped to the corresponding REST call, or vice versa. A bit like the current practice in the WDQS, where we are able to produce bot code in various languages to reproduce the same effect. So, to give an example: if I know how to list all properties in a Wikibase (special pages -> list all properties), it should be easy to navigate to the API call with the same functionality. Currently it is not immediately obvious how to perform a function through the API, although it is obvious that there is an underlying function call.
    • One of the things I am currently missing is more fine-grained parametrised function calls, for example fine-tuning of label search. As far as I know it is not possible to do a "detailed" label search. If I search for "apple" I will get a list of all items that have the word "apple" as a substring, even if there is no item on the topic itself. Given the large number of scholarly articles being added, there are a lot of articles on a main topic while the main topic itself does not have a Wikidata item. This is not easily seen, since the API call returns a list of Wikidata items which then need to be parsed one by one, whereas an additional parameter (e.g. "only_exact_match"=true) would speed up the retrieval process. Another improvement could be to combine label search with existing properties and/or items, e.g. return all Wikidata items that have the label "Ebola" and are an instance of a river.
    • Add a call to fetch an item given a Wikipedia page / Commons file. Currently, it seems to me that only the direction from Wikidata to the sister projects is supported, not the other way around.
    • Is it possible to filter the list of properties by data type? So instead of all properties, a list of properties of type external-id.
    • Limit writing/posting to whole items only. As with the action calls, it is possible to change a Wikidata item either entirely or by its components. I wonder whether, if only full item edits were supported, performance might improve, because the system would not have to deal with single label, description, statement, qualifier and reference writes to the API. I was once told that every edit, small or big, leads to a full rewrite of the item in the underlying database. So first preparing all edits locally and then submitting leads to fewer rewrites of the full JSON in the database.
    • Support metadata REST calls
      • A call that returns the last time an item or statement was updated.
      • A revision history including the editors: who did what, when. This would facilitate optimal bot edits. A bot could, for example, skip an edit if the last edit on that statement was not made by a bot account.
  • Concerns I have:
    • I hope that this welcome improvement of the API does not result in reduced support and development of the Wikidata Query Service. Once you understand the Wikidata data model, navigating Wikidata and querying this knowledge graph is a lot easier than understanding the action/REST APIs. From a user perspective, SPARQL is a far more powerful API framework than REST. With SPARQL we basically have unlimited combinations of API calls, whereas with a REST API we are at the mercy of the choices of the REST development and maintenance team. Having said this, I do understand that this luxury of unlimited API calls leads to very complex querying, which can lead to serious performance issues. Ideally, we maintain a nice and delicate balance between the WDQS and the proposed REST API.
  • Other comments:
    • Can EntitySchemas be considered in this REST API?

Thanks a lot for your feedback!
  • The search point you raise is something that needs to be tackled in Elastic, independent of this new API I’d say.
  • We’ll have a closer look at fetching an Item based on its Wikipedia article.
  • Adding the datatype when listing Properties: We’ll look more into filtering in general, among other things by datatype.
  • Support for metadata: Revision and last-modified information are in the headers now (e.g. https://wmde.github.io/wikibase-rest-api-proposal/#/items/get_entities_items__entity_id_). Figuring out when and by whom a Statement was last edited is, unfortunately, a different beast that we don’t currently have a scalable solution for.
  • Taking away resources from WDQS: It is actually our hope that with this new API (and some others potentially) we can convince some users to migrate away from it for their use cases that don’t require WDQS. That should buy us some breathing room there as well.
  • Entity Schemas: Potentially in a future version, yeah.
--Lydia Pintscher (WMDE) (talk) 08:19, 19 November 2020 (UTC)

Feedback by Tpt

  • I’m currently using the action API in this way:
    • Massive insertion or update of item/property/lexeme/mediainfo content with tools like Wikidata Toolkit. I rely a lot on wbeditentity to avoid making a lot of edits per item.
    • Editing gadgets that sometimes use wbeditentity, or the more specific endpoints if applicable.
  • I think this is great about the proposed REST API (incl. why):
    • It looks much much nicer and user friendly than the current API.
    • The PUT and PATCH actions that allow changing an entity directly, in a more systematic way than `wbeditentity`. It will make the WikidataToolkit implementation much simpler if we get support for all entity types (lexemes, forms, senses and mediainfo).
  • I think this could be improved about the proposed REST API (incl. why):
    • Having a `GET /entities/items/{siteid}/{title}` method that redirects to `GET /item/{qid}` would be great to cover the "fetch from sitelink" use case, which is quite common. This operation is used quite a lot by Wikidata Toolkit and my gadgets.
    • It would be amazing to have soon support for Lexemes. It's where `wbeditentity` lacks the most features. Complete adoption by WikidataToolkit will be blocked by that.
    • I am not sure I understand the current use case of the `GET /items` action, which does not allow any filtering.
  • Other comments:
    • I am not sure why the `/items/` and `/properties/` parts are useful in the API. What about just having e.g. `GET /entities/{entity_id}`? The entity typing information is already in the entity ID.
    • It's a great draft. Thank you!
Thanks a lot for your feedback!
Your idea for fetching from sitelink sounds interesting. We’ll look into that.
Support for Lexemes: Noted! :D
Separate endpoints for Items and Properties: Noted! It seems people are split on this and we’ll ponder it more.
--Lydia Pintscher (WMDE) (talk) 08:20, 19 November 2020 (UTC)

Feedback by Husky

  • I’m currently using the action API in this way:
  • I think this is great about the proposed REST API (incl. why):
    • Clear REST-like paths for things like descriptions/labels are very useful and less confusing than the Action API.
    • Seems to take some design decisions from the Wikimedia REST API, which is easier to understand than the Action API.
  • I think this could be improved about the proposed REST API (incl. why):
    • I see no way of getting data for multiple items. Maybe the /entities/items/ endpoint should have a parameter where you can indicate QIDs separated by commas?
  • Concerns I have:
    • I know selecting specific languages is not implemented in this spec, but that makes it a lot less useful for frontend (web) based applications, where the data requested should be limited to keep the app performant (e.g. when querying items like Q42 with lots of labels). The same is true for not being able to get multiple items in a single call; this will also reduce the performance of web apps.
  • Other comments:
    • It might be useful to offer an option to simplify the deeply nested JSON structure that you sometimes get from the API. Especially missing values (e.g. not having a specific language in an object) take up a lot of time when dealing with the data.
    • I'm glad we're making some progress on simplifying the Wikidata API!

Husky (talk) 12:28, 3 November 2020 (UTC)

Thanks a lot for your feedback!
We’ll look more into language handling as well as requesting data for a list of entities before implementing the first version.
Could you elaborate on what you mean by your point about the deeply nested JSON structure and simplifying it?
--Lydia Pintscher (WMDE) (talk) 08:21, 19 November 2020 (UTC)
@Husky: Filtering on a list of language_codes= sounds like a good idea to reduce payload size. :) Please respond to Lydia's question if possible.--So9q (talk) 13:37, 19 December 2020 (UTC)

Feedback by User:DCausse (WMF)

  • I’m currently using the action API in this way:
    • Explore the data for debugging Wikidata Query Service
  • I think this is great about the proposed REST API (incl. why):
    • A lot clearer than the action API
  • I think this could be improved about the proposed REST API (incl. why):
    • While labels, aliases and sitelinks have ways to filter the ones you want based on some criteria (lang, site), statements do not seem to have such an ability. I'm often interested in only a few statements. Similar to wbgetclaims, I think that GET /entities/{entity_type}/{entity_id}/statements should expose params to filter on property and rank (see the sketch after this comment).
  • Concerns I have:
  • Other comments:

DCausse (WMF) (talk) 09:27, 4 November 2020 (UTC)
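
A hedged sketch of what such filtering might look like; the property and rank query parameters are hypothetical, not part of the current spec:

import requests

# Hypothetical: fetch only the preferred-rank P31 statements of Q42.
resp = requests.get(
    "https://example.org/w/rest.php/wikibase/v0/entities/items/Q42/statements",
    params={"property": "P31", "rank": "preferred"},  # hypothetical params
)
print(resp.json())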

Thank you for your feedback! We’ll look more into filtering. --Lydia Pintscher (WMDE) (talk) 08:21, 19 November 2020 (UTC)

Feedback by knuthuehne

I think this is a really great idea to make Wikibase more approachable for newcomers, thank you for the effort. Just adding a couple of technical comments here:

  • Concerns I have:
    • I agree with BrokenSegue's comment about the GET collection endpoints and their possible uses. Without a sort property, I am not sure what they would be used for.
    • Pagination with only an offset seems dangerous to me for an API of this size. Cursor pagination might be preferable.
  • Other comments:
    • I'm curious to hear why RFC 6902 was picked for the PATCH methods. Most people in my bubble seem to prefer RFC 7396 - JSON Merge Patch (see the comparison after this comment).
    • I was a bit confused when browsing the OpenAPI documentation, where the first example starts with
{
  "items": [
    {
      "id": "string",
      "type": "string",

Where I guess it should say

{
  "items": [
    {
      "id": "Q42",
      "type": "item",


I think that having good examples in the documentation would really help me as a potential consumer to get a feeling for the API.
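
For readers unfamiliar with the two RFCs, a minimal illustration of the same label edit in both formats (values invented). One practical difference: Merge Patch cannot address individual array elements, which matters for statement lists.

# RFC 6902 (JSON Patch): an explicit list of operations, applied in order.
json_patch = [
    {"op": "replace", "path": "/labels/en/value", "value": "Douglas Adams"},
    {"op": "remove", "path": "/aliases/en/0"},  # can target array elements
]

# RFC 7396 (JSON Merge Patch): a partial document merged into the original.
# It cannot express single-array-element edits or test preconditions.
merge_patch = {"labels": {"en": {"value": "Douglas Adams"}}}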

Thanks a lot for your feedback!
We’ll look more into pagination.
About Patch: The thinking behind this is documented here: https://github.com/wmde/wikibase-rest-api-proposal/blob/master/DECISION-RESEARCH.md#updating-resources
We’ll fix the examples. Thanks for spotting it.
--Lydia Pintscher (WMDE) (talk) 08:22, 19 November 2020 (UTC)

Feedback by Yan Wong

  • I’m currently using the action API to display (but not edit) Wikidata information in HTML form for biological species on http://www.onezoom.org, if there is no available Wikipedia page. If there is a Wikipedia page in the appropriate language, the Wikipedia REST API is used to display a "basic" (unornamented) version of the Wikipedia page, including removing external links if the user so requires. A similar thing for Wikidata would be great.
  • I think this is great about the proposed REST API (incl. why):
    • The proposed REST API would be particularly helpful if (like the Wikipedia API) it could be injected directly into another page, using a CORS request.
  • I think this could be improved about the proposed REST API (incl. why):
  • Concerns I have:
    • As above, I would like to ensure CORS is allowed for REST API pages, and that the HTML content handler provides a tabular layout for Wikidata, for ease of viewing.

HYanWong (talk) 14:39, 5 May 2021 (UTC)

Feedback by XXX

  • I’m currently using the action API in this way:
  • I think this is great about the proposed REST API (incl. why):
  • I think this could be improved about the proposed REST API (incl. why):
  • Concerns I have:
  • Other comments: