Wikidata:Contact the development team/Query Service and search/Archive/2020/12


http outputs from WDQS

WDQS produces HTTP links to Wikidata items (both as the link in an a href in the WDQS interface, and when a result set is downloaded). Meanwhile, Wikidata appears to be a website served over HTTPS. Why is WDQS unaware of Wikidata's HTTPS? Why is WDQS in effect serving insecure URLs in preference to secure ones? Might we now move to WDQS serving HTTPS links? --Tagishsimon (talk) 10:06, 1 December 2020 (UTC)

Please see phab:T226453. DCausse (WMF) (talk) 08:00, 2 December 2020 (UTC)
I think the ticket is about another feature and some blog entry can hardly justify negligent handling of https on WMF servers. --- Jura 08:31, 2 December 2020 (UTC)
In the past I have complained about this as well, but I think this is not an issue any longer. Wikimedia projects including Wikidata and WDQS meanwhile use HTTP Strict Transport Security (Q2438540) headers, so users access entity pages on Wikidata via HTTPS anyways even if the WDQS output explicitly uses the "insecure" canonical HTTP URLs. —MisterSynergy (talk) 08:57, 2 December 2020 (UTC)
I should have pointed to phab:T153563 which has a lot more useful resources to understand the current situation. DCausse (WMF) (talk) 10:46, 2 December 2020 (UTC)
Are there test cases that ensure this works correctly and is this continuously monitored? --- Jura 10:52, 2 December 2020 (UTC)
Do you refer to the HSTS header? It is standard technology these days, supported by all non-ancient browsers, and this scenario is the use case for it. I am not sure whether it is being monitored, or needs to be monitored at all. For my setup, however, it does indeed work as expected: I was just able to verify that my browser automatically requests HTTPS links from the WDQS results lists, although the results are listed with the canonical HTTP URLs; there are also the HSTS headers visible in each request (all confirmed via the browser's developer tools). I really do not worry about this any longer, and I also do not think that anyone else needs to. —MisterSynergy (talk) 12:33, 2 December 2020 (UTC)
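As a rough illustration of the mechanism described above, here is a minimal Python sketch of what a browser does with an HSTS header: it remembers hosts that sent `Strict-Transport-Security` and rewrites later `http://` requests to those hosts as `https://`. The header value and the helper names (`parse_hsts`, `upgrade_if_known`) are assumptions made for the example, not what Wikimedia servers necessarily send, and the sketch ignores `max-age` expiry and `includeSubDomains` matching.

```python
from urllib.parse import urlsplit, urlunsplit


def parse_hsts(header_value: str) -> dict:
    """Parse a Strict-Transport-Security header value into its directives."""
    directives = {}
    for part in header_value.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Valueless directives (e.g. includeSubDomains) are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


def upgrade_if_known(url: str, hsts_hosts: set) -> str:
    """Rewrite http:// to https:// when the host is a known HSTS host."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in hsts_hosts:
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


# Hypothetical header value, used only for illustration.
policy = parse_hsts("max-age=31536000; includeSubDomains; preload")
known_hosts = {"www.wikidata.org"} if "max-age" in policy else set()
upgraded = upgrade_if_known("http://www.wikidata.org/entity/Q42", known_hosts)
```

The canonical HTTP URL in the result list stays unchanged as text; only the request the browser actually makes is upgraded.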
@DCausse (WMF): You point to phab:T226453 "Concept URI in sidebar on Wikidata uses HTTP instead of HTTPS", which is for sure a similar issue, but not the same issue. My question was plain, and the discussion has not answered it. If Wikidata uses https URLs, why does the output of WDQS use http, and can this be changed, please? For the avoidance of doubt, at the very least it makes combining datasets sourced from WDQS and from the Wikidata front-end, or from Petscan, a pain, to the extent that one has to deal with the URLs being different. That an http call will be redirected to https is of little solace when the problem shows up in, for instance, string-slicing URLs in my spreadsheet. It's plainly wrong, or wrong-headed, to issue http claims when objects are actually to be found at https, and I tend to think we should be trying to do a little better than this. --Tagishsimon (talk) 16:23, 13 December 2020 (UTC)
Sorry for pointing at phab:T226453, as said in another comment I should have pointed to phab:T153563 in the first place. I agree that if we were to start this again https would have been chosen from the get go. Changing it now was deemed too disruptive because it is thought that references to http IRIs are too widespread. DCausse (WMF) (talk) 08:20, 14 December 2020 (UTC)
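For the dataset-combining problem raised above, one workaround is to join on the item ID rather than on the URL string. The following is a minimal Python sketch under the assumption that the inputs look like `http://www.wikidata.org/entity/Q42` (the form WDQS outputs) or `https://www.wikidata.org/wiki/Q42` (the page URL); the function names are made up for the example, and other URL shapes would need their own patterns.

```python
import re
from typing import Optional

# Accepts http or https, and either the /entity/ concept URI or the /wiki/ page URL.
_ITEM_URL = re.compile(r"^https?://www\.wikidata\.org/(?:entity|wiki)/(Q[1-9]\d*)$")


def qid_from_url(url: str) -> Optional[str]:
    """Return the item ID (e.g. 'Q42') from a Wikidata item URL, or None if it does not match."""
    match = _ITEM_URL.match(url.strip())
    return match.group(1) if match else None


def same_item(url_a: str, url_b: str) -> bool:
    """True when both URLs identify the same item, regardless of scheme or path style."""
    qid_a, qid_b = qid_from_url(url_a), qid_from_url(url_b)
    return qid_a is not None and qid_a == qid_b


wdqs_url = "http://www.wikidata.org/entity/Q42"
page_url = "https://www.wikidata.org/wiki/Q42"
match_found = same_item(wdqs_url, page_url)
```

Joining on the extracted ID avoids depending on which scheme or path a given tool happens to emit.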

suggestions on Query Service input GUI

list of additions

A blank line separates suggestions. Multiline suggestions won't have a blank line. Please add yours.

#defaultView:Map{"hide":["?coor"]}

#TEMPLATE={"template":"list of concepts of type by selection criteria" }

hint:Prior hint:gearing "forward".

hint:Prior hint:rangeSafe true.

FILTER(    ?date >= "1925-00-00"^^xsd:dateTime
        && ?date <  "1926-00-00"^^xsd:dateTime )

SERVICE bd:sample { ?item wdt:P31 wd:Q41176 . bd:serviceParam bd:sample.limit 42 }

SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" .
    bd:serviceParam wikibase:api "Search" .
    bd:serviceParam mwapi:srsearch ?search .
    bd:serviceParam mwapi:srnamespace "0" .
    ?item wikibase:apiOutputItem mwapi:title .
  }

SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
                         ?item rdfs:label ?itemLabel .
                         ?item schema:description ?itemDescription .
                         ?item skos:altLabel ?itemAltLabel .
                       } # manual mode of label service


The following help complete lines. Maybe another sample than Q41176 could be used.

wdt:P31 wd:Q41176

wdt:P31/wdt:P279* wd:Q41176

wdt:P31/wdt:P279+ wd:Q41176

wdt:P31/wdt:P279? wd:Q41176

p:P31/ps:P31 wd:Q41176

p:P31/ps:P31/wdt:P279* wd:Q41176

p:P31 [ ps:P31 wd:Q41176 ; a wikibase:BestRank ]

p:P31 ?st . 
	?st ps:P31 wd:Q41176 . 
	?st a wikibase:BestRank . 
	?st wikibase:rank ?rank . 
	OPTIONAL { ?st pq:P582 ?end }

schema:about ?item ; schema:name ?pagetitle ; schema:isPartOf <https://en.wikipedia.org/>

schema:about ?item ; schema:name ?pagetitle ; schema:isPartOf / wikibase:wikiGroup "wikipedia"

wikibase:sitelinks ?sl ; wikibase:statements ?st ; wikibase:identifiers ?ids


Comments

Some time ago, we added a few suggestions to the GUI at https://query.wikidata.org (see phab:T150950).

Above are a few additional ones. Maybe an MWAPI one for category search or federation could be added as well. Feel free to improve/complete the list.

@DCausse (WMF): --- Jura 12:17, 7 December 2020 (UTC)

Hello @Jura1:, can you explain more clearly what suggestions you're proposing? Can you evaluate the impact they could have on people running queries (how many people would benefit from this? what kind of improvement could it bring to them?) and give us any extra information that could help us define the priority of these requests?
At the moment, we are not performing many changes on the Query Service GUI, as we are already working on many other features. This will continue next year. If a request proves to have a high priority, we could consider adding it to the roadmap, but I'm afraid that minor improvements cannot be added any time soon. Lea Lacroix (WMDE) (talk) 09:58, 8 December 2020 (UTC)
These seem to be questions mostly for developers: my guess would be it has no impact on people running queries (they are only used when people write queries and shouldn't impact performance, but it's not really my field). Also, checking queries people actually run could help determine how frequently people use these patterns. Supposedly the ones that were selected earlier were determined in a similar way. Which one do you generally use?
I think adding these is a resource efficient way of helping users to write queries. It probably takes much less effort than developing a query writing tool.
Given that it's mostly about improving Query Service, maybe WMF search team is better equipped to respond to the questions you ask and implement it. @DCausse (WMF): what do you think? In terms of work, I suppose one would just need to copy the above line to the configuration. --- Jura 11:39, 8 December 2020 (UTC)
The Query Service GUI is on WMDE's side, and David and I already had a chat about the answer to give you, which is the one I posted above.
Adding extra resources for experienced users and creating a new interface for beginners are two different things with different target audiences. My questions were aiming at defining if the change would be valuable for more people than only the person who requested it. If more people show interest in having these extra suggestions, we could consider moving forward with the idea. Lea Lacroix (WMDE) (talk) 12:25, 8 December 2020 (UTC)
Ok, would you have some numbers on the use of the existing ones/the ones I previously had added? Can you help determine the frequency of the above patterns? Obviously, some patterns might not be used because people don't find them or don't know about them. --- Jura 13:21, 8 December 2020 (UTC)

Prefix should be s: but it returns wds:

Hi, I expected this to return the s: prefix, so how come it returns the wds: prefix instead? wds:Q36949-91bc1581-43b0-78c1-4970-c2480d22c56c

According to the entity's ttl at https://www.wikidata.org/wiki/Special:EntityData/Q36949.ttl, the prefix is s:, not wds:; you can search for Q36949-91bc1581-43b0-78c1-4970-c2480d22c56c in that ttl.

SELECT *
WHERE {
  # p:P2218 links the item to its statement nodes, which WDQS displays with the wds: prefix
  wd:Q36949 p:P2218 ?vv .
}
Besides my response here, can I just note that your "should" is very presumptive. It would be ideal if the turtle and rdf manifestations of wikidata used consistent prefixes. It would be interesting to know why they are not consistent. Interesting to know whether, now that the inconsistency has been raised, a change will be made. But there is no "should" about it. --Tagishsimon (talk) 15:39, 13 December 2020 (UTC)
Hi, thanks, and sorry, I didn't mean that it has to be one way or the other. I just meant that since the ttl shows it as s:, I thought the query service would follow the ttl, so it's not working as I expected; I don't know what word I could use other than "should". I thought there was a single triple store database used throughout the Wikidata system, but now I see that the Wikidata SPARQL endpoint seems to be querying a different copy of the triple store from the one displayed at https://www.wikidata.org/wiki/Special:EntityData/Q36949.ttl, or maybe it's processed differently. --Esia1688 (talk) 14:41, 14 December 2020 (UTC)
Prefixes (turtle, SPARQL) are "local" to a file or a SPARQL query and are not required to be the same everywhere, but I cannot agree more that it would be a lot more consistent if they were the same. I tried to dig into Phabricator to find possible explanations, in vain; if someone recollects specific reasons, I would love to hear them. Note that this is not the sole prefix suffering from this difference:
If someone feels strongly that this inconsistency should be addressed please feel free to file a ticket in phabricator and attach it to this discussion.
To answer your last question, yes, there are some differences between what you might see in the Wikidata RDF dumps and the query service; they are listed here: mw:Wikibase/Indexing/RDF_Dump_Format#WDQS_data_differences. DCausse (WMF) (talk) 14:47, 14 December 2020 (UTC)
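To make the point about prefixes being "local" concrete, here is a small Python sketch that expands a prefixed name into its full IRI. It assumes that both `wds:` (in WDQS) and `s:` (in the Special:EntityData ttl) are declared for the namespace `http://www.wikidata.org/entity/statement/`; that mapping is an assumption of the example and should be checked against the `@prefix` lines of the ttl file. The `expand` helper is made up for illustration.

```python
# Assumed prefix declarations; verify against the actual @prefix / PREFIX lines.
WDQS_PREFIXES = {"wds": "http://www.wikidata.org/entity/statement/"}
TTL_PREFIXES = {"s": "http://www.wikidata.org/entity/statement/"}


def expand(prefixed_name: str, prefixes: dict) -> str:
    """Expand 'prefix:local' into a full IRI using the given prefix table."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


statement_id = "Q36949-91bc1581-43b0-78c1-4970-c2480d22c56c"
from_wdqs = expand("wds:" + statement_id, WDQS_PREFIXES)
from_ttl = expand("s:" + statement_id, TTL_PREFIXES)
same_node = from_wdqs == from_ttl
```

Under that assumption both names expand to the same IRI, so the two outputs refer to the same statement node even though the abbreviations differ.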
Hi, I see. I also just realized that the prefix is not really so important for getting values in the query. Previously I had an issue with querying Wikibase:quantityUnit; I thought it was because of the prefix issue, but I realized that my SPARQL variable was written incorrectly. This is now fixed, so it no longer matters to me whether it is wds: or s:. Thanks for your help. --Esia1688 (talk) 04:40, 15 December 2020 (UTC)