Shortcuts: WD:RAQ, w.wiki/LX

Wikidata:Request a query/en

From Wikidata
Jump to navigation Jump to search


Request a query
Fishing in the Wikidata river requires both an idea where to look for fish and a suitable fishing method. If you have the former, this page can help you find the latter.

This is a page where SPARQL 1.1 Query Language (Q32146616) queries can be requested. Please provide feedback if a query is written for you.

For sample queries, see Examples and Help:Dataset sizing. Property talk pages include also summary queries for these.

For help writing your own queries, or other questions about queries, see Wikidata talk:SPARQL query service/queries and Wikidata:SPARQL query service/query optimization.

Help resources about Wikidata Query Service (Q20950365) and SPARQL: Wikidata:SPARQL query service/Wikidata Query Help and Category:SPARQL.

To report an issue about the Query Service (interface, results views, export...) please see Wikidata:Contact the development team/Query Service and search.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/05.


Filter out spesific items from a property[edit]

This query is going to be used in a Wikidata list. What I want is a query that generates a list of ships registered in Norway, and fetches data from significant event (P793), but only shipwrecking (Q906512), scuttling (Q1786766) and ship breaking (Q336332). The query must still return ship items without any of those three.

This is the query I made, but I don't know how to filter out the spesific entries:

SELECT ?item ?itemLabel ?IMO_number ?event ?eventLabel  WHERE {
  ?item wdt:P8047 wd:Q20.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nb". }
  OPTIONAL { ?item wdt:P793 ?event. }
  OPTIONAL { ?item wdt:P458 ?IMO_number. }
}
Try it!
Cavernia (talk) 10:41, 22 November 2022 (UTC)[reply]
@Cavernia: We meet again. This gets only the three events of interest to you. Let's keep going with this until you get exactly what you're after ... maybe share the test Listeria page URL so I can see how it looks.
SELECT ?item ?itemLabel ?IMO_number ?event ?eventLabel  WHERE {
  ?item wdt:P8047 wd:Q20.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nb". }
  OPTIONAL { ?item wdt:P793 ?event. FILTER(?event IN (wd:Q906512, wd:Q1786766, wd:Q336332))}
  OPTIONAL { ?item wdt:P458 ?IMO_number. }
}
Try it!
--Tagishsimon (talk) 13:10, 22 November 2022 (UTC)[reply]
Excellent, thanks a lot! --Cavernia (talk) 21:55, 22 November 2022 (UTC)[reply]
@Tagishsimon Now I've got another problem. I also want to show the year/date of the events which is added as qualifier, and the place. How can I return only the date/place for the actual event, and not dates/places for all events? The best solution would be to concatenate the event, date and place in the same coloumn. --Cavernia (talk) 18:57, 25 November 2022 (UTC)[reply]
@Cavernia: So this sort of direction. There is considerable scope for fiddling with the format of the dates, including just a SUBSTR(str(?date_),1,8), or translating months into strings - 07 -> July &c using a formulation such as BIND(IF(?month="7","July",IF(?month="8","August,"")) as ?monthLabel). Right now I've used some code with evaluates date precision and the length of day and month strings, to give a consistent yyyy:mm:dd output, providing days or months only when justified by the date precision. You say "concatenate the event, date and place", but I'm not sure where we're getting the "place" from. You can probably see how to work it to the SPARQL below, but come back here if not and/or for any other tweaks.
SELECT ?item ?itemLabel ?IMO_number ?event ?eventLabel ?eventdate WHERE {
  ?item wdt:P8047 wd:Q20.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nb". }
  OPTIONAL { ?item p:P793 ?stat .
             ?stat ps:P793 ?event. 
             OPTIONAL {?stat pqv:P585 [
                wikibase:timePrecision ?precision ;
                wikibase:timeValue ?date_ ].
                BIND(IF(strlen(str(month(?date_)))=2,str(month(?date_)),concat("0",str(month(?date_)))) as ?month) 
                BIND(IF(strlen(str(day(?date_)))=2,str(day(?date_)),concat("0",str(day(?date_)))) as ?day) 
                BIND(IF(?precision=9,year(?date_),
                       IF(?precision=10,concat(str(year(?date_)),":",?month),
                         IF(?precision=11,concat(str(year(?date_)),":",?month,":",?day),""))) as ?date)                      
             }
             ?event rdfs:label ?eventLabel . filter(lang(?eventLabel)="nb")
             BIND(CONCAT(?eventLabel," ",str(?date)) as ?eventdate) 
            
            FILTER(?event IN (wd:Q906512, wd:Q1786766, wd:Q336332))
           }
  OPTIONAL { ?item wdt:P458 ?IMO_number. }
}
Try it!
--Tagishsimon (talk) 19:35, 25 November 2022 (UTC)[reply]
Here's an added hack to get the monthLabel, fwiw.
SELECT ?item ?itemLabel ?IMO_number ?event ?eventLabel ?eventdate ?monthitemLabel WHERE {
  ?item wdt:P8047 wd:Q20.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "nb". }
  OPTIONAL { ?item p:P793 ?stat .
             ?stat ps:P793 ?event. 
             OPTIONAL {?stat pqv:P585 [
                wikibase:timePrecision ?precision ;
                wikibase:timeValue ?date_ ].
                BIND(IF(strlen(str(month(?date_)))=2,str(month(?date_)),concat("0",str(month(?date_)))) as ?month) 
                BIND(IF(strlen(str(day(?date_)))=2,str(day(?date_)),concat("0",str(day(?date_)))) as ?day) 
                BIND(IF(?precision=9,year(?date_),
                       IF(?precision=10,concat(str(year(?date_)),":",?month),
                         IF(?precision=11,concat(str(year(?date_)),":",?month,":",?day),""))) as ?date) 
       
                BIND(month(?date_) as ?monthdate)
                ?monthitem p:P279 ?statm . 
                ?statm ps:P279 wd:Q18602249.
                ?statm pq:P1545 ?monthdate2.
                FILTER(str(?monthdate2)=str(?monthdate))             
             }
             ?event rdfs:label ?eventLabel . filter(lang(?eventLabel)="nb")
             BIND(CONCAT(?eventLabel," ",str(?date)) as ?eventdate) 
            
            FILTER(?event IN (wd:Q906512, wd:Q1786766, wd:Q336332))
           }
  OPTIONAL { ?item wdt:P458 ?IMO_number. }
}
Try it!
--Tagishsimon (talk) 19:59, 25 November 2022 (UTC)[reply]

List of place names in Cornwall, in both English and Cornish language[edit]

I've tried getting this Wikidata List (Listeria powered) to work, with no avail. If someone has a minute PLEASE can you correct it? I'm also trying to get the pronounciation (in both languages), but it's above me, as they're language specific!

Much of the data is on Wikidata, but I will try and get a Mix N Match project to get this database on Commons of Cornish names onto WD asap. Any help would be appreciated! THANKS! Meur ras! Llywelyn2000 (talk) 16:45, 24 November 2022 (UTC)[reply]

@Llywelyn2000: Probably need to tighten up the spec for the Listeria version of this. I get 15k rows of things in Cornwall (council area), and Listeria creaks north of 3000, has a hard limit of 5000. What direction do you want to take this query?
SELECT ?item ?itemLabel_en ?itemLabel_kw
WHERE
{
  ?item wdt:P131/wdt:P131* wd:Q22338583.
  OPTIONAL {?item rdfs:label ?itemLabel_en. filter(lang(?itemLabel_en)="en") }
  OPTIONAL {?item rdfs:label ?itemLabel_kw. filter(lang(?itemLabel_kw)="kw") }
}
Try it!
Oh. Perhaps restriced to
SELECT ?item ?itemLabel_en ?itemLabel_kw ?dedicated_toLabel_en ?dedicated_toLabel_kw
WHERE
{
  {?item wdt:P131/wdt:P131* wd:Q22338583.} hint:Prior hint:runFirst true.
  ?item wdt:P825 ?dedicated_to. 
  OPTIONAL {?item rdfs:label ?itemLabel_en. filter(lang(?itemLabel_en)="en") }
  OPTIONAL {?item rdfs:label ?itemLabel_kw. filter(lang(?itemLabel_kw)="kw") }
  OPTIONAL {?dedicated_to rdfs:label ?dedicated_toLabel_en. filter(lang(?dedicated_toLabel_en)="en") }
  OPTIONAL {?dedicated_to rdfs:label ?dedicated_toLabel_kw. filter(lang(?dedicated_toLabel_kw)="kw") }
}
Try it!
--Tagishsimon (talk) 02:08, 25 November 2022 (UTC)[reply]
Many thanks @Tagishsimon:, but the list should include towns or villages only, ordered by parish. Thanks! I'm sorry I didn't make that clear. Llywelyn2000 (talk) 10:05, 26 November 2022 (UTC)[reply]
@Tagishsimon: - Your first example is best, but needs to be limited (for WD Lists) to c. 3,000. Can you suggest a way to do this eg using the starting letter A-D then I could create a second list E-J etc? Thanks! Llywelyn2000 (talk) 11:07, 29 November 2022 (UTC)[reply]

Class tree[edit]

I'm looking to have something like this be made, I don't know if it is possible with queries. Based on subclasses.

user account (Q3604202)

social media account (Q102345381)
verified account or profile (Q28378282)
X Premium account (Q115148082)
premium account (Q2108670)
private account (Q58370623)
protected Twitter account (Q78680253)

Just an organized way to see the subclasses of an item. Is this possible? Thanks! -wd-Ryan (Talk/Edits) 20:58, 25 November 2022 (UTC)[reply]

@Wd-Ryan: Seems so, using the Tree view
#defaultView:Tree
SELECT ?level0 ?level0Label ?level1 ?level1Label ?level2 ?level2Label ?level3 ?level3Label ?level4 ?level4Label
WHERE 
{
  BIND(wd:Q3604202 as ?level0)
  ?level1 wdt:P279 ?level0. 
  OPTIONAL {?level2 wdt:P279 ?level1.
    OPTIONAL {?level3 wdt:P279 ?level2.
      OPTIONAL {?level4 wdt:P279 ?level3.} } }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } 
} order by ?level1Label ?level2Label ?level3Label ?level4Label
Try it!
--Tagishsimon (talk) 21:19, 25 November 2022 (UTC)[reply]
Oh this is amazing, queries can do anything! -wd-Ryan (Talk/Edits) 21:33, 25 November 2022 (UTC)[reply]

Filtering on novalue[edit]

I want to list battleships where vessel class (P289) is missing, but am getting those where it is set to <no value>. I was hoping a query like

SELECT DISTINCT ?item ?itemLabel WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?item WHERE {
      ?item p:P31 ?statement0.
      ?statement0 (ps:P31/(wdt:P279*)) wd:Q182531.
      MINUS {
        ?item p:P289 ?statement2.
        ?statement2 (ps:P289/(wdt:P279*)) _:noValueP289_1.
      }
      MINUS {
        ?item p:P289 ?statement1.
        ?statement1 (ps:P289/(wdt:P279*)) _:anyValueP289_1.
      }
    }
  }
}
Try it!

but that doesn't work, and I'm struggling to find documentation on the anyValue logic to see whether noValue was the correct variant Vicarage (talk) 11:17, 27 November 2022 (UTC)[reply]

@Vicarage: Documentation: https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#Novalue
SELECT ?item ?itemLabel 
WHERE 
{
  ?item p:P31/ps:P31/wdt:P279* wd:Q182531.
  filter not exists {?item a wdno:P289.}
  filter not exists {?item p:P289 [].}
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } 
}
Try it!
--Tagishsimon (talk) 11:59, 27 November 2022 (UTC)[reply]
Exactly what I wanted. Thanks! Vicarage (talk) 12:09, 27 November 2022 (UTC)[reply]
It can, fwiw, be thought of best as 'member of the class of items which have <no value> for whichever property' ... 'a' being shorthand for rdf:type. --Tagishsimon (talk) 12:14, 27 November 2022 (UTC)[reply]

Units in plural form using lexemes?[edit]

Hi, how i can get formatted unit in plural in swedish form using lexemes? Below is the example query.

SELECT ?place ?placeLabel ?elev ?unit ?unitLabel
WHERE
{
  ?place p:P2044/psv:P2044 ?placeElev.
  ?placeElev wikibase:quantityAmount ?elev.
  ?placeElev wikibase:quantityUnit ?unit.


  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
LIMIT 1
Try it!

Br, -- Zache (talk) 10:20, 29 November 2022 (UTC)[reply]

This should work I think. This isn't a very efficient way of doing things by the way. You can get normalized values in meters and simply ignore looking up the words for the units, and instead just manually set the unit words.
SELECT ?place ?placeLabel ?elev ?ulabel #?unit ?prepr ?ulabel_ #?sense ?lexeme ?form
WITH {
  SELECT ?place (SAMPLE(?elev_st_) AS ?elev_st)
  WHERE {
    #?place wdt:P17 wd:Q34 . # Reduce the working set.
    #?place wdt:P31 wd:Q8502 .

    ?place p:P2044 ?elev_st_ .
    ?elev_st_ a wikibase:BestRank .
  }
  GROUP BY ?place
  LIMIT 100
} AS %i
WITH {
  SELECT ?place ?elev_st ?elev ?unit ?ulabel_ (SAMPLE(?prepr_) AS ?prepr) #?sense ?form ?lexeme
  WHERE {
    INCLUDE %i

    ?elev_st psv:P2044 [ wikibase:quantityAmount ?elev; wikibase:quantityUnit ?unit ] .
    ?unit rdfs:label ?ulabel_. FILTER(LANG(?ulabel_)="sv")

    OPTIONAL { 
      ?sense wdt:P5137 ?unit . 
      ?lexeme ontolex:sense ?sense . 
      ?lexeme ontolex:lexicalForm ?form. 
      ?lexeme dct:language wd:Q9027 .
      ?form wikibase:grammaticalFeature wd:Q146786 .
      ?form ontolex:representation ?prepr_ .
    }
  }
  GROUP BY ?place ?elev_st ?elev ?unit ?ulabel_ #?sense ?form ?lexeme
} AS %f
WHERE {
  INCLUDE %f
  BIND(IF(ABS(?elev) >= 2, ?prepr, ?ulabel_) AS ?ulabel)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv,en" }
}
Try it!
Infrastruktur (talk) 14:39, 29 November 2022 (UTC)[reply]
Here's an example of using normalized values (always using metres). English has two plural forms, whilst swedish seems to use the same spelling for singular as plural form.
SELECT ?place ?placeLabel ?elev ?ulabel #?unit
WITH {
  SELECT ?place (SAMPLE(?elev_st_) AS ?elev_st)
  WHERE {
    #?place wdt:P17 wd:Q34 . # Reduce the working set.
    #?place wdt:P31 wd:Q8502 .

    ?place p:P2044 ?elev_st_ .
    ?elev_st_ a wikibase:BestRank .
  }
  GROUP BY ?place
  LIMIT 100
} AS %i
WHERE {
  INCLUDE %i
  ?elev_st psn:P2044 [ wikibase:quantityAmount ?elev; wikibase:quantityUnit ?unit ] .
  BIND(IF(ABS(?elev) >= 2, "meter", "meter") AS ?ulabel)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv,en" }
}
Try it!
Infrastruktur (talk) 15:23, 29 November 2022 (UTC)[reply]
I was trying to figure out how to do this universally. Ie. I get some random unit from SPARQL or in the Lua module and then I should try to render the value. It would be also nice if I could change the language if needed between Finnish and Swedish.
However, I needed to get some push in the right direction on how to do this. This is what i came up based on your example.
SELECT DISTINCT ?item ?itemLabel ?elev ?unit_label #?unit ?sense ?lexeme ?form_1 ?prepr_1 ?form_2 ?prepr_2
WHERE 
{
  BIND(wd:Q13428 as ?item)
  ?item p:P2044/psv:P2044 ?placeElev.
  ?placeElev wikibase:quantityAmount ?elev.
  ?placeElev wikibase:quantityUnit ?unit.
  ?sense wdt:P5137 ?unit . 
  ?lexeme ontolex:sense ?sense .
  ?lexeme dct:language wd:Q1412 . # Finnish
  ?lexeme ontolex:lexicalForm ?form_1. 
  ?form_1 wikibase:grammaticalFeature wd:Q110786 .
  ?form_1 wikibase:grammaticalFeature wd:Q131105 .
  ?form_1 ontolex:representation ?prepr_1 .
  ?lexeme ontolex:lexicalForm ?form_2. 
  ?form_2 wikibase:grammaticalFeature wd:Q146786 .
  ?form_2 wikibase:grammaticalFeature wd:Q131105 .
  ?form_2 ontolex:representation ?prepr_2 .
  BIND(IF(?elev = 1,  ?prepr_1, ?prepr_2) AS ?unit_label)

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } # Helps get the label in your language, if not, then en language
}
Try it!
Zache (talk) 16:32, 29 November 2022 (UTC)[reply]
This should grab both finnish and swedish labels at the same time. By the way, your check on elevation won't work if elevation is negative or has decimals.
SELECT DISTINCT ?item ?itemLabel ?elev ?unit_label #?dlangLabel ?unit ?sense ?lexeme ?form_1 ?repr_1 ?form_2 ?repr_2
WHERE {
  BIND(wd:Q13428 as ?item)
  ?item p:P2044 ?place_st .
  ?place_st a wikibase:BestRank .
  ?place_st psv:P2044 ?placeElev.
  ?placeElev wikibase:quantityAmount ?elev.
  ?placeElev wikibase:quantityUnit ?unit.
  ?sense wdt:P5137 ?unit . 
  ?lexeme ontolex:sense ?sense .

  ?lexeme dct:language ?dlang . FILTER (?dlang IN (wd:Q1412, wd:Q9027))

  ?lexeme ontolex:lexicalForm ?form_1. 
  ?form_1 wikibase:grammaticalFeature wd:Q110786 .
  ?form_1 wikibase:grammaticalFeature wd:Q131105 .
  OPTIONAL { ?form_1 wikibase:grammaticalFeature ?indef_1 . FILTER (?indef_1 IN (wd:Q53997857)) }
  ?form_1 ontolex:representation ?repr_1 .
  FILTER((?dlang = wd:Q9027 && BOUND(?indef_1)) || (?dlang = wd:Q1412))

  ?lexeme ontolex:lexicalForm ?form_2. 
  ?form_2 wikibase:grammaticalFeature wd:Q146786 .
  ?form_2 wikibase:grammaticalFeature wd:Q131105 .
  OPTIONAL { ?form_2 wikibase:grammaticalFeature ?indef_2 . FILTER (?indef_2 IN (wd:Q53997857)) }
  ?form_2 ontolex:representation ?repr_2 .
  FILTER((?dlang = wd:Q9027 && BOUND(?indef_2)) || (?dlang = wd:Q1412))

  BIND(IF(ABS(?elev) >= 2,  ?repr_2, ?repr_1) AS ?unit_label)

  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
Try it!
Infrastruktur (talk) 20:02, 29 November 2022 (UTC)[reply]

adding language+script for query output[edit]

How may I update the query, https://w.wiki/63ms, that returns zh to include script code? For instance, Chinese traditional script, zh-Hant? Thank you. jshieh (talk) 17:59, 30 November 2022 (UTC)[reply]

Is this what you had in mind? Or did you mean to match objects of any kind of chinese and report which variant?
SELECT ?s ?label (LANG(?label) AS ?lc)
Where {
    ?s ?p "Smithsonian Institution"@zh .
    ?s rdfs:label ?label . FILTER (langmatches(lang(?label), "zh"))
}
LIMIT 100
Try it!
Infrastruktur (talk) 18:24, 30 November 2022 (UTC)[reply]
In case you meant the latter, here's a query for that too.
SELECT ?s ?o (LANG(?o) AS ?lc)
Where {
  {
    SELECT DISTINCT ?lcs WHERE { 
      [] wdt:P424 ?lcs . FILTER(SUBSTR(?lcs, 1, 2) = "zh")
    }
  }
  BIND ( STRLANG("Smithsonian Institution", ?lcs) AS ?o)
  ?s ?p ?o . hint:Prior hint:runLast true .
}
LIMIT 100
Try it!
Infrastruktur (talk) 18:55, 30 November 2022 (UTC)[reply]

querying when a particular statement was last modified[edit]

Hi, I know how to query when an Entity was last modified, but is there a way to do that kind of query on a single statement itself? An example would be querying someone's Position Held and getting a) position info (with start/end times if exist) and b) the last time each of those statements was updated. So something like this but with info that lets me display when that statement was last modified

SELECT ?person ?personLabel ?position_held ?position_heldLabel ?start ?end WHERE { 
 SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
 ?person wdt:P39 ?position_held;
         p:P39 [
           rdf:type wdno:P39;
           pq:P580 ?start;
           pq:P582 ?end
         ].

}

LIMIT 100

Ruthbrarian (talk) 19:56, 30 November 2022 (UTC)[reply]

I don't think this can be done with a query. Infrastruktur (talk) 14:27, 1 December 2022 (UTC)[reply]

Items with the greatest number of P31 values?[edit]

I tried

SELECT ?item (COUNT(?value) AS ?referenceCount) WHERE {
  ?item wdt:P31 ?value.
} GROUP BY ?item
ORDER BY DESC(?referenceCount)
LIMIT 10
Try it!

but it times out. Lectrician1 (talk) 22:00, 30 November 2022 (UTC)[reply]

@Lectrician1: WDQS's Blazegraph times out before returning a result b/c it has to evaluate all P31s for all items to complete the query, but is given only 60 seconds to do so. Try Qlever - https://qlever.cs.uni-freiburg.de/wikidata/8yO2Os - which is more or less up-to-date and very fast - ~7 seconds for your query. --Tagishsimon (talk) 14:37, 1 December 2022 (UTC)[reply]

Filter presidents with his start date and end date[edit]

i tried 190.11.86.134 01:31, 3 December 2022 (UTC)[reply]

Concatenate multiple values for a column[edit]

This lists the most recently died cast members of a television show with the episodes in which they appear.

select ?person ?personLabel ?epLabel ?dod where
{
  $ep wdt:P31 wd:Q21191270.
  $ep wdt:P179 wd:Q79784.
  $ep wdt:P161 ?person.
  ?person wdt:P570 ?dod.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
} order by desc(?dod) LIMIT 50
Try it!

This returns one row per pair (person, episode). Is there a way to have one row per person, with all episode titles concatenated as a single cell in that row, maybe comma-separated? Otherwise persons with many appearances fill up the result quickly. So for Gregory Itzin the resulting cell would be "The One with Phoebe's Wedding, The One with Ross's Inappropriate Song".--176.199.18.181 20:49, 4 December 2022 (UTC)[reply]

Something like:
select ?person ?personLabel (GROUP_CONCAT(DISTINCT ?epLabel_;separator=" /// ") as ?epLabel) ?dod where
{
  $ep wdt:P31 wd:Q21191270.
  $ep wdt:P179 wd:Q79784.
  $ep wdt:P161 ?person.
  ?person wdt:P570 ?dod.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". 
                         ?person rdfs:label ?personLabel .
                         ?ep rdfs:label ?epLabel_ . }
} GROUP BY ?person ?personLabel ?dod order by desc(?dod) LIMIT 50
Try it!
--Tagishsimon (talk) 20:57, 4 December 2022 (UTC)[reply]

How do I sort a list of articles from a PagePile chronologically?[edit]

I'm trying to sort a PagePile of pages about individual human beings chronologically by their dates of birth. Is there any way to do that using SPARQL? 2602:306:C541:CC60:C067:5D78:17CE:2ACA 06:52, 5 December 2022 (UTC)[reply]

As far as I know, the Wikidata Query Service cannot load data directly from PagePiles. I am afraid that the easiest option is to process and sort the data externally. e.g. in OpenRefine. I would like to be proved wrong, though :) Vojtěch Dostál (talk) 08:55, 5 December 2022 (UTC)[reply]
This needs a manual step, but the approach I would use is to convert the PagePile into Wikidata items (if it isn't already), download that list (which is plain QIDs), do a search-and-replace to replace every "Q" with "wd:Q", and then feed that list into a SPARQL query using the VALUES method - VALUES ?person { wd:Q1 wd:Q2 etc}. I am not offhand sure how long a list this can cope with, but I just took a recent 2000-item pagepile, converted to WD, and fed it into the service with a simple query - the browser doesn't like displaying the page afterwards, but the query seems to have run OK in a couple of seconds.
I tried to replicate this use case by generating ~4000 people (everyone born in Glasgow), pasting it into VALUES, and asking the service to generate birthdates: again this one ran fine and very promptly. It was probably a bit slower than generating the list in the same query, but still a couple of seconds rather than taking ages. Again, though, the browser got a bit upset: it was happy to display the search result, but if I tabbed away and came back, it would have trouble displaying the page. This might be browser-dependent.
If you need to line the results back up with the original pagenames in the pagepile, this will be a little tricky but you could add a search element to generate the correct sitelink for everyone. Andrew Gray (talk) 12:50, 5 December 2022 (UTC)[reply]
select ?person (sample(?date) as ?date_sample) ?enwiki_pagename
where
{
  values ?person { wd:Q1405 wd:Q9554 wd:Q36359 wd:Q635 
                   wd:Q76 wd:Q1394 wd:Q23 wd:Q935 }
  optional { ?person wdt:P569 ?date }
  optional { ?article schema:about ?person ; schema:isPartOf <https://en.wikipedia.org/> ; schema:name ?enwiki_pagename }
} group by ?person ?enwiki_pagename order by ?date_sample
Try it!
Here's an example - it finds the items from the values list (the items can be seperate by spaces, linebreaks, or both) and then optionally looks for their enwiki page name + date of birth, and sorts the resulting list. The sample element is to pick one date if there are two equally preferred ones, which happens a bit more than you might like and would get you duplicates. Andrew Gray (talk) 13:04, 5 December 2022 (UTC)[reply]
As for how many values you can submit, I looked into that back in June this year: «Aside from the time-limitation, values statement increase the size of the data that needs to be transferred to the endpoint, and the proxy will terminate the request if it including protocol overhead is bigger than about 1024kiB (which translates to a query size of about 750kiB)». Infrastruktur (talk) 20:09, 5 December 2022 (UTC)[reply]
750K, wow - so that suggests the practical limit might be somewhere in the region of a few tens of thousands of items? Andrew Gray (talk) 23:43, 5 December 2022 (UTC)[reply]
@Andrew Gray That could be about right. I was usually able to do VALUES on about 10,000 items but not too many more. Vojtěch Dostál (talk) 16:52, 6 December 2022 (UTC)[reply]

Performers by province by edition[edit]

Hello! I'm trying to make this work. I want to show the province of the performers at a given performing contest by year. I have a problem when I filter the labels by language (I'm copying from some previous examples I have). How I'm doing this wrong? Thanks!

#defaultView:BarChart
SELECT ?year (COUNT(?herrialdea) AS ?count) (SAMPLE(?herrialdeaLabel) AS ?label) WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "eu,es,fr,en". }
  ?Bertsosaio wdt:P31 wd:Q8247090.
  ?Bertsosaio wdt:P710 ?bertsolari.
  ?bertsolari wdt:P19 ?jaiolekua.
  ?jaiolekua (wdt:P131*) ?herrialdea.
  {?herrialdea wdt:P31 wd:Q162620.}
  UNION
  {?herrialdea wdt:P31 wd:Q34876.}
  ?Bertsosaio wdt:P582 ?data.
  BIND(STR(YEAR(?data)) AS ?year)
  FILTER((LANG(?herrialdeaLabel)) = "eu")
}
GROUP BY ?herrialdeaLabel ?year
Try it!

Theklan (talk) 08:55, 5 December 2022 (UTC)[reply]

@Theklan: You cannot do additional filters on a label SERVICE as far as I know, the service outputs only one (first available) label based on the sorted language tags "eu,es,fr,en". You can retrieve the eu label like this
#defaultView:BarChart
SELECT ?year (COUNT(?herrialdea) AS ?count) (SAMPLE(?herrialdeaLabel) AS ?label) WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "eu,es,fr,en". }
  ?Bertsosaio wdt:P31 wd:Q8247090.
  ?Bertsosaio wdt:P710 ?bertsolari.
  ?bertsolari wdt:P19 ?jaiolekua.
  ?jaiolekua (wdt:P131*) ?herrialdea.
  {?herrialdea wdt:P31 wd:Q162620.}
  UNION
  {?herrialdea wdt:P31 wd:Q34876.}
  ?Bertsosaio wdt:P582 ?data.
  BIND(STR(YEAR(?data)) AS ?year)
  ?herrialdea rdfs:label ?herrialdeaLabel FILTER((LANG(?herrialdeaLabel)) = "eu") .
}
GROUP BY ?herrialdeaLabel ?year
Try it!

--Vojtěch Dostál (talk) 08:58, 5 December 2022 (UTC)[reply]

Thanks! I have copied that from a working query with other items... so it seems that it is possible... in some cases?. Thanks, yours is working perfectly! Theklan (talk) 09:11, 5 December 2022 (UTC)[reply]

Environmental Performance Index by Country[edit]

Hello, I was wondering is it possible to calculate EPI (or at least some sort of environmental-related index) using WikiData. I looked through properties of Q6256 but couldn't find anything about environment. Maybe it could be done some other way? Hopejesus55 (talk) 07:29, 6 December 2022 (UTC)[reply]

Stopping a timeout on castles in the UK[edit]

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimization#Optimization_strategies explains a timeout workaround to get a list of museums in Northern Ireland, which works. Great I thought, I'd use that for castle (Q23413) in the United Kingdom (Q145) which keeps timing out. But no joy, it still times out, which really frustrating as there are only about 4000 castle sites in the UK

SELECT ?place ?placeLabel ?placeDescription ?location ?tripadvisor 
  (GROUP_CONCAT(?categoryLabel; SEPARATOR = ", ") AS ?categories) WHERE {
  {
    SELECT ?place ?categoryLabel ?placeLabel ?placeDescription ?tripadvisor ?location WHERE {
      ?place wdt:P131* wd:Q145 .
      wd:Q23413 ^wdt:P279*/^wdt:P31 ?place .
      ?place wdt:P31 ?category.
      OPTIONAL { ?place wdt:P3134 ?tripadvisor.}
      ?place wdt:P625 ?location.
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en-GB,en". }
    }
  }
}
GROUP BY ?place ?placeLabel ?placeDescription ?location ?tripadvisor
Try it!

Vicarage (talk) 12:29, 6 December 2022 (UTC)[reply]

Hi. Your inner select (sometimes) works for me, but it gives about 80k results (not ~4k) – much more concating ;) What about changing ?place wdt:P131* wd:Q145 . to ?place wdt:P17 wd:Q145 ., is it acceptable (i'm not sure how much of these castles don't have P17)? Because it's not timing out, as it gives much less results for concating in outer select :) Piastu (talk) 13:05, 6 December 2022 (UTC)[reply]
Thanks, that will do as castles do have P17, just not roofs! Vicarage (talk) 16:06, 6 December 2022 (UTC)[reply]

How to randomly select 100 male persons?[edit]

It seems most examples of randomisation in SPARQL use RAND with ORDER BY, so I tried the following two queries, but I always get timeout. I guess there are too many people in Wikidata and the use of ORDER BY cannot handle. Any idea how to get the list of random 100 male persons (with Wikipedia links)? Many thanks!

SELECT 
DISTINCT ?person ?sitelink ?random 
WHERE{
?person wdt:P31 wd:Q5 .
  ?person wdt:P21 wd:Q6581097 .
  ?sitelink schema:about ?person .
  ?sitelink schema:inLanguage "en" .     
  ?sitelink schema:isPartOf <https://en.wikipedia.org/> .
  BIND(RAND() AS ?random) .
}
ORDER BY ?random
LIMIT 100
Try it!
SELECT 
DISTINCT ?person ?sitelink ?randomhash 
WHERE{
  ?person wdt:P31 wd:Q5 .
  ?person wdt:P21 wd:Q6581097 .
  ?sitelink schema:about ?person .
  ?sitelink schema:inLanguage "en" .     
  ?sitelink schema:isPartOf <https://en.wikipedia.org/> .
  
  BIND(SHA512(CONCAT(STR(RAND()), STR(?person))) AS ?randomhash) .
}
ORDER BY ?randomhash
LIMIT 100
Try it!

217.68.49.93 16:12, 7 December 2022 (UTC)[reply]

Wikidata uses the Blazegraph query engine; it has a SAMPLE service answering the requirement for a random set of triples, along the lines below. Not sure whether/how it is possible to get a random set via SPARQL in the absence of the service.
SELECT ?person ?sitelink ?article
WHERE{
  SERVICE bd:sample {
   ?person wdt:P21 wd:Q6581097 .
    bd:serviceParam bd:sample.limit 2000 .
  } 
  ?person wdt:P31 wd:Q5 .
  ?sitelink schema:about ?person ;
    schema:isPartOf <https://en.wikipedia.org/> ;
    schema:name ?article .
}
LIMIT 100
Try it!
 – The preceding unsigned comment was added by Tagishsimon (talk • contribs) at 17:39, 7 December 2022‎ (UTC).[reply]
The rand() function has a bit of a gotcha, the value is the same during the query so at best it can only be used for creating a random seed. The hashing trick generally works well on small sets, unfortunately by combining "order by", "distinct" and "limit" you force materialization of the entire solution set, and discard most of it. And since there is a lot of people on wikidata the result is timeout. Infrastruktur (talk) 17:25, 7 December 2022 (UTC)[reply]
Thank you for the answers! The answer query above works, but the outcome is always the same and cannot be used with ORDER BY etc, I think. So, yes it is hard to get truly random results in every query. If there are more ideas, please post them. Highly appreciated. 2001:610:450:40:0:0:0:14 17:40, 7 December 2022 (UTC)[reply]
Getting the same results is a side-effect of the query results being cached. Try altering the query ever so slightly to fix it. I usually add a comment with a number that I increment. Infrastruktur (talk) 17:51, 7 December 2022 (UTC)[reply]
Sorry, I forget to mention. Actually, the goal is to get 100 random people from Wikipedia not Wikidata (thus Wikipedia URI are retrieved in the query). I thought SPARQL is the way to go, but if there are other API ways to achieve this, please let me know. Thank you! 145.90.228.20 18:01, 7 December 2022 (UTC)[reply]
You can do the sorting by wrapping the first query which gets the result as subselect and then sort the result from it. About getting 100 random people from wikipedia. I think that query already will work like you wanted as it filters out the results without wikipedia page (?) :
SELECT * WHERE {
  SELECT ?person ?sitelink ?article
                 WHERE{
                   SERVICE bd:sample {
                     ?person wdt:P21 wd:Q6581097 .
                     bd:serviceParam bd:sample.limit 2000 .
                   } 
                   ?person wdt:P31 wd:Q5 .
                   ?sitelink schema:about ?person ;
                             schema:isPartOf <https://en.wikipedia.org/> ;
                             schema:name ?article .
                 }
  LIMIT 100
}
ORDER BY ?article
Try it!
--Zache (talk) 13:12, 8 December 2022 (UTC)[reply]
Yes, it's worth looking at Zache's query again in slo-mo. SPARQL requires ORDER to be placed before LIMIT in a simple query. The main query fetches a random 2000 items having P21=male, and then discards those without WP articles or who are not human. Imposing ORDER before LIMIT biases the report outcome to people whose names start ABCD &c and disfavours those starting WXYZ &c ... i.e. the SPARQL orders the random set by article name, and the limit chops off the bottom of the list; example below. What we want is to impose the limit first, and then order the set of 100, and that's what Zache's sleight-of-hand nested query does.
#This query favours people whose names start ABC &c over those starting XYZ &c.
  SELECT ?person ?sitelink ?article
                 WHERE{
                   SERVICE bd:sample {
                     ?person wdt:P21 wd:Q6581097 .
                     bd:serviceParam bd:sample.limit 2000 .
                   } 
                   ?person wdt:P31 wd:Q5 .
                   ?sitelink schema:about ?person ;
                             schema:isPartOf <https://en.wikipedia.org/> ;
                             schema:name ?article .
                 }
  ORDER BY ?article LIMIT 100
Try it!
--Tagishsimon (talk) 19:45, 8 December 2022 (UTC)[reply]

Using the results of a search as a series of properties to use[edit]

I would like to see the urls of heritage register entries for places, but there are a lot worldwide, so I wanted to use a query to generate the property list to search. So for castles in Kent I'd look for UK registers. Something like

SELECT DISTINCT ?castleLabel ?registerLabel ?URLprefix ?entry WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
  {
    SELECT DISTINCT ?castle ?register ?URLprefix ?entry WHERE {
      ?register p:P31 ?statement0.
      ?statement0 (ps:P31/(wdt:P279*)) wd:Q18618628.
      ?register p:P17 ?country.
      ?country (ps:P17/(wdt:P279*)) wd:Q145.
      ?register wdt:P1630 ?URLprefix. 
      ?castle p:P31 ?statement1.
      ?statement1 (ps:P31/(wdt:P279*)) wd:Q23413.
      ?castle p:P7959 ?county.
      ?county (ps:P7959/(wdt:P279*)) wd:Q67479626.
      # pseudo code
      #?castle wdt:p?register ?entry .
    }
  }
}
Try it!

The problem being I don't know if you can use a variable as the middle part of a triplet. Vicarage (talk) 08:51, 9 December 2022 (UTC)[reply]