Wikidata talk:SPARQL query service/queries/Archive/2016

From Wikidata
Jump to navigation Jump to search
This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Decode URL?

Is there a way to get all Wikipedia sitelinks in a URL decoded way, e.g. for Wikidata:SPARQL_query_service/queries#Countries_that_have_sitelinks_to_en-wiki instead of "<https://en.wikipedia.org/wiki/South%20Africa>", just "<https://en.wikipedia.org/wiki/South Africa>" or "South Africa".
--- Jura 15:43, 31 December 2015 (UTC)

That would be a good candidate for custom function. Please file a phabricator ticket. There is an encoding function but seems to be no decoding one. --Smalyshev (WMF) (talk) 22:51, 1 March 2016 (UTC)

Shortest path / Erdos number problem: Deneuve number

Any idea how I could solve this: Wikidata_talk:WikiProject_Movies#Most_active_contemporary_actresses.3F. Is there a way to calculate the number of films it takes to link two actors - based on cast member (P161)? I got up to 3 by checking each distance separately in one query, but is there a way to do it for any distance on query.wikidata.org directly ?
--- Jura 08:31, 5 January 2016 (UTC)

@Jura1: What you need to solve your problem is breadth-first traversal, i.e. something like this, then optionally shortest path to display the link chain for a given person like this.
--- Nono314 (talk) 08:10, 13 January 2016 (UTC)

How to check pairwise inequality

I've got a few variables ?item1, ?item2, ..., ?itemn. How can I check if they are pairwise unequal? Do I really need to state n(n-1)/2 inequalities? --Jobu0101 (talk) 11:40, 5 January 2016 (UTC)

Label of OPTIONAL

SELECT ?country ?countryLabel ?capital WHERE {
  ?country wdt:P31 wd:Q3624078 .
  FILTER NOT EXISTS { ?country wdt:P31 wd:Q3024240 } .
  OPTIONAL { ?country wdt:P36 ?capital} .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
} ORDER BY ?countryLabel
Try it!

Since ?capital ist OPTIONAL here I can't access ?capitalLabel. Is there another way to print the label? Something like OPTIONAL { ?country wdt:P36 ?capital . ?capital enLabel ?capitalLabel}. --Jobu0101 (talk) 15:10, 5 January 2016 (UTC)

See Wikidata:SPARQL_query_service/queries#Optional_requirements -- Jheald (talk) 18:25, 6 January 2016 (UTC)
Something like [..] seems to work better, if the spouses have no label in the target language.
--- Jura 18:51, 6 January 2016 (UTC)
Oh no, it doesn't work either.
--- Jura 18:53, 6 January 2016 (UTC)

@Jheald: Thank you. I updated my query:

SELECT ?country ?countryLabel ?capital ?capital_label WHERE {
  ?country wdt:P31 wd:Q3624078 .
  FILTER NOT EXISTS { ?country wdt:P31 wd:Q3024240 } .
  OPTIONAL { ?country wdt:P36 ?capital} .
  ?capital rdfs:label ?capital_label FILTER (lang(?capital_label) = "en") .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
} ORDER BY ?countryLabel
Try it!

Unfortunately, I get an java.util.concurrent.ExecutionException now. --Jobu0101 (talk) 20:15, 12 January 2016 (UTC)

@Jobu0101: You need the label look-up to be inside the OPTIONAL { ... } block, like this: tinyurl.com/zrarcfo
Otherwise, ?capital might not be bound to a value when the query gets to ?capital rdfs:label ?capital_label, in which case the query would try setting ?capital to anything that had a label -- ie the entire database! -- so that is why the query times out if the label look-up is floating outside the OPTIONAL clause.
Hope that makes some sense! All best, Jheald (talk) 20:41, 12 January 2016 (UTC)
@Jheald: Thank you so much. It makes sense! Do you also know the answer of my below question? --Jobu0101 (talk) 20:49, 12 January 2016 (UTC)
Just bear in mind that it needs labels in the language you are querying. The above wont work correctly for "zh" or "en-gb".
--- Jura 20:54, 12 January 2016 (UTC)

How does the ORDER change the number of results?

Consider

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?film ?filmLabel ?release WHERE {
  ?film wdt:P31/wdt:P279* wd:Q11424 .
  ?film p:P577 ?release_statement .
  ?release_statement v:P577 ?release . FILTER(year(?release)=2015) .
  ?release_statement pq:P291 wd:Q183 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de" .
  }
} ORDER BY ?release ?filmLabel
Try it!

and

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?film ?filmLabel ?release WHERE {
  ?film wdt:P31/wdt:P279* wd:Q11424 .
  ?film p:P577 ?release_statement .
  ?release_statement v:P577 ?release . FILTER(year(?release)=2015) .
  ?release_statement pq:P291 wd:Q183 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de" .
  }
} ORDER BY ?release
Try it!

The only difference is that in the second query I don't order by ?filmLabel. But why do I get three further results in the upper case (584 instead of 581)? --Jobu0101 (talk) 20:11, 12 January 2016 (UTC)

Good question! Don't know the answer; but presumably it's because some films have either multiple release statements, or multiple values of ?release, which may or may not all be different; and the query somehow picks up slightly different multiples depending on the details of how it is optimised, which depends on the specified ordering.
If you use the DISTINCT codeword to select only unique output lines, i.e. SELECT DISTINCT ?film ?filmLabel ?release WHERE ... , then it finds 580 either way. Jheald (talk) 20:51, 12 January 2016 (UTC)
@Jheald: Here it finds 583 with DISTINCT. Wild Tales (Q16672466) is the item that we get twice without using DISTINCT. I don't see two different valid mappings for the variables with ?film=Q16672466, do you? Can you explain why we get Wild Tales (Q16672466) twice? There is only one publication date claim with Germany qualified. --Jobu0101 (talk) 21:10, 12 January 2016 (UTC)
Probably this? --ValterVB (talk) 21:32, 12 January 2016 (UTC)
@ValterVB: Thank you, that explains why we've got Wild Tales (Q16672466) twice. --Jobu0101 (talk) 22:59, 12 January 2016 (UTC)

@Jheald: Concerning the original question I've found out that it isn't a matter of the order. If you take one of those queries and execute them over and over again you get randomly 580 and 583 results. Try it tinyurl.com/zga4qgb! So that's a bug, do you agree? --Jobu0101 (talk) 22:59, 12 January 2016 (UTC)

I think there are two servers handling queries and their data might not always be identical.
--- Jura 08:15, 13 January 2016 (UTC)
Here are the three items that are neglected in the queries with 580 items as result:
  1. You're Not You (Q15046598)
  2. Dessau Dancers (Q20972616)
  3. In the Heart of the Sea (Q15079316)
Maybe the reason that one server neglects them is the same as in the paragraph below. One server did not catch the last edits. --Jobu0101 (talk) 11:52, 13 January 2016 (UTC)
I just edited Dessau Dancers (Q20972616) to catch the bad server's attention and it worked. Now it is shown in both result sets. So we get 581 and 583 items as results. --Jobu0101 (talk) 11:58, 13 January 2016 (UTC)

Strange result

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?instanceLabel ?item ?itemLabel ?property ?awardLabel WHERE {
  ?item wdt:P31 ?instance .
  ?instance wdt:P279* wd:Q11424 .
  {{
    ?item p:P166 ?award_statement .
  	?award_statement v:P166 ?award
  }  UNION {
    ?item p:P1411 ?award_statement .
  	?award_statement v:P1411 ?award
  }}.
  ?item ?property ?award .
  ?award wdt:P31 wd:Q19020 .
  FILTER NOT EXISTS { ?award_statement pq:P805 ?ceremony } .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
} ORDER BY ?property ?instanceLabel ?itemLabel
Try it!

I don't understand why we get Thelma & Louise (Q658041) as result. In the claim with value Academy Award for Best Film Editing (Q281939) we are using statement is subject of (P805) as qualifier. But in our query we have FILTER NOT EXISTS { ?award_statement pq:P805 ?ceremony } which forbids statement is subject of (P805) as qualifier. --Jobu0101 (talk) 23:06, 12 January 2016 (UTC)

Similarly for The Day of the Jackal (Q901140). I can't see anything obvious.
Looking at what is recorded for Thelma & Louise (Q658041) -- tinyurl.com/ju7pdat -- it looks as though that last diff from 9 January was never added to the SPARQL database.
This is something we need to draw to the attention of User:Smalyshev (WMF), because the SPARQL database copy shouldn't be getting out of sync in this way. Jheald (talk) 10:44, 13 January 2016 (UTC)
Thanks for the repot, I'll take a look. --Smalyshev (WMF) (talk) 08:22, 14 January 2016 (UTC)

Most cited sources

I'm getting repeated timeouts on the following query:

PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
SELECT ?source (count(distinct ?statement) as ?num) WHERE {
   ?statement prov:wasDerivedFrom/pr:P248 ?source .
} group by ?source order by desc ( ?num ) limit 5

Anyone know if there's any way to make this work? --Yair rand (talk) 21:41, 13 January 2016 (UTC)

Try dropping distinct, I'm not sure it is necessary in this context... --Smalyshev (WMF) (talk) 08:28, 14 January 2016 (UTC)

Checking if ?item is containted in list of values

How can I say that ?item should be bound to one element in a finite given list (for example {wd:Q5 , wd:Q7})? --Jobu0101 (talk) 23:45, 13 January 2016 (UTC)

VALUES ?item {wd:Q5 wd:Q7} . --Yair rand (talk) 23:56, 13 January 2016 (UTC)
Thank you! --Jobu0101 (talk) 12:56, 14 January 2016 (UTC)

SPARQL template problem and filter count question

Hi! I've got two new questions.

1. Why is the template not working here?
SELECT ?actor (COUNT(?film) AS ?films) WHERE {
  {SELECT DISTINCT ?actor ?film WHERE {
    VALUES ?film {wd:Q172241 wd:Q47703 wd:Q184768} .
     ?film wdt:P161 ?actor
  } }
} GROUP BY ?actor ORDER BY DESC(?films)
Try it!
2. Is there a possibility to filter the above query such that I get only results where ?films is greater than 1 without putting a third SELECT with FILTER around the query? --Jobu0101 (talk) 12:55, 14 January 2016 (UTC)

1.:fixed, 2:it can be done even shorter:

SELECT ?actor (COUNT(DISTINCT(?film)) AS ?films) (GROUP_CONCAT(?film) As ?filmography) WHERE 
{
    VALUES ?film {wd:Q172241 wd:Q47703 wd:Q184768} .
    ?film wdt:P161 ?actor
} 
GROUP BY ?actor 
HAVING (?films > 1)
ORDER BY DESC(?films)
Try it!

--
--- Jura 13:16, 14 January 2016 (UTC)

Thank you very much. I appreciate your new query. So the template problem came up because I used "}}" in my query, making the template think that the query is already over. --Jobu0101 (talk) 13:32, 14 January 2016 (UTC)

Why is the syntax highlighting for VALUES and HAVING so strange? --Jobu0101 (talk) 14:09, 14 January 2016 (UTC)

It's more fun :) But you don't have to worry about that. --Edgars2007 (talk) 08:53, 18 January 2016 (UTC)
It's on purpose to be more fun? --Jobu0101 (talk) 09:24, 18 January 2016 (UTC)
If that would be meant to be taken seriously, then there won't be smiley. My first guess was, that VALUES and HAVING isn't in the standart SPARQL (or something in this direction), but that of course isn't true. But as I said, you don't have to worry about that, because the query itself works. --Edgars2007 (talk) 10:29, 18 January 2016 (UTC)

Problem with prefix count query

SELECT (SUBSTR(?imdb,1,2) AS ?prefix) (COUNT(DISTINCT(?imdb)) AS ?number) WHERE {
  ?item wdt:P345 ?imdb
} GROUP BY ?prefix
Try it!

Why does this query not work and how do I fix it? --Jobu0101 (talk) 18:45, 16 January 2016 (UTC)

@Jobu0101: I think what you want is this:
SELECT ?prefix (COUNT(DISTINCT(?imdb)) AS ?number) WHERE {
  ?item wdt:P345 ?imdb
  BIND (SUBSTR(?imdb,1,2) AS ?prefix)     
} GROUP BY ?prefix ORDER BY DESC(?number)
Try it!
A key issue, if you're using GROUP BY, is that you can't include any variable in the SELECT unless it's one of the variables that you're grouping on, or it's inside one of the statistical functions. Since ?imdb doesn't appear in the GROUP BY, you can't SUBSTR on it in the SELECT.
So instead, the query above moves the SUBSTR inside the GROUP BY block, and then everything works. Jheald (talk) 19:20, 16 January 2016 (UTC)
Thank you so much. I'm so happy you made my query work. --Jobu0101 (talk) 19:32, 16 January 2016 (UTC)

Only items without English article

I've got several SPARQL query which run quite fast. They show me some items (bound to ?item) with certain properties. Now I want to put the extra condition to the query that only such items are shown which don't have a sitelink to the English Wikipedia. I tried to add things like FILTER NOT EXISTS {?article schema:about ?item . ?article schema:inLanguage "en"} but with that extra I always get a QUERY TIMEOUT. What shall I add to my queries instead? --Jobu0101 (talk) 22:45, 17 January 2016 (UTC)

A first thing to try might be:
     OPTIONAL {
          ?article schema:about ?item .
          ?article schema:inLanguage "en" .
     }
     FILTER(!bound(?article))
SPARQL has three different syntaxes for negation, and they don't all always get optimised the same way. So trying this form instead just might help.
What we hope the system isn't doing is building a list of all the articles in English, in order to then do a Hash_anti-join.
Instead we hope it's just taking your subset, and actually running a filter on it.
I believe that there are optimiser directives that can be made, to force it to process the filtering in a particular way. But it's something I have never really got to grips with; and nor did I find that part of the documentation very easy. Jheald (talk) 01:25, 18 January 2016 (UTC)
It might also depend where you place it (first condition or last condition). Here "Filter not exists" seems to work fine.
--- Jura 09:11, 18 January 2016 (UTC)
@Jheald: Your solution works and it is very fast. Thank you! --Jobu0101 (talk) 10:11, 18 January 2016 (UTC)

Items with P1411 without qualifier P585

I am trying to make a query to list nominated for (P1411) items that don't have a point in time (P585) qualifier. I want to display the following data: QID, Label, QID for value, label for value, Value QID for for work (P1686) (if P1686 is present) and label for values of P1686. I can't get the P1686 part to work. This is what I have so far:

PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?item ?itemLabel ?nom ?nomLabel ?werk
{
  ?item wdt:P1411 ?nom .
        FILTER NOT EXISTS { ?nom pq:P585 ?x } .
  OPTIONAL {?item pq:P1686 ?werk} .
  
  
SERVICE wikibase:label {
    bd:serviceParam wikibase:language "nl,en" .
  }
} ORDER BY ASC(?item)

The column werk stays empty. I checked the first item in the list and it has a value for P1686. What am I doing wrong? Mbch331 (talk) 14:23, 20 January 2016 (UTC)

@Mbch331: a qualifier connects from a statement as its subject, not an item, so you need something like this:
PREFIX p:  <http://www.wikidata.org/prop/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
SELECT ?item ?itemLabel ?nom ?nomLabel ?werk
{
  ?item wdt:P1411 ?nom .
        FILTER NOT EXISTS { ?nom pq:P585 ?x } .
  OPTIONAL {?item p:P1411 ?nom_stmt . ?nom_stmt pq:P1686 ?werk} .
    
SERVICE wikibase:label {
    bd:serviceParam wikibase:language "nl,en" .
  }
} ORDER BY ASC(?item)
Try it!

I've altered the filtering, now it correctly filters, but there are duplicate statements. Per nom_stmt it shows all P1411 claims for the item and not only the one connected to nom_stmt.

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
SELECT ?item ?itemLabel ?nom ?nomLabel ?nom_stmt ?werk ?werklbl
{
  ?item wdt:P1411 ?nom .
  ?item p:P1411 ?nom_stmt
   FILTER NOT EXISTS { ?nom_stmt pq:P585 ?x } .
  OPTIONAL {?nom_stmt pq:P1686 ?werk . OPTIONAL {?werk rdfs:label ?werklbl filter (lang(?werklbl) = "nl") .} }
  
SERVICE wikibase:label {
    bd:serviceParam wikibase:language "nl,en" .
  }
} ORDER BY ASC(?item)
Try it!

So how to filter that? Or do I need to export it as JSON and import it into a PHP script for the check? Mbch331 (talk) 13:59, 21 January 2016 (UTC)

@Mbch331: Try this: tinyurl.com/glfxt3y Now only the nomination connected to ?nom_stmt should be returned, not all nominations connected to ?item Jheald (talk) 14:57, 21 January 2016 (UTC)
@Jheald:: Thanks. I have a lot to learn about SPARQL. Mbch331 (talk) 15:07, 21 January 2016 (UTC)
@Mbch331: SPARQL takes a little bit of getting used to, but once you have got into its way of thinking, I find it very neat and very powerful. (And by the look of it, you have already got a pretty good grasp).
As you get to grips with it, please do comment here and make suggestions about the attached project page -- what could be improved, what could be explained better, what was unexpected, what were sticking points that were difficult, what is not there or missing that should be added, etc...
It's a community resource, so let's see what we can all do to improve it together -- to try to make the learning experience progressively easier for each new person than it was the last. Jheald (talk) 18:32, 21 January 2016 (UTC)

Redirects and deleted items as result

Many of the results of the following query are redirects or deleted:

SELECT ?item1 ?item2 ?item1Label ?item2Label ?imdb WHERE {
  ?item1 wdt:P345 ?imdb FILTER (SUBSTR(?imdb,1,2)='nm') .
  ?item2 wdt:P345 ?imdb FILTER (STR(?item1)<STR(?item2)) .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
} ORDER BY ?imdb
Try it!

Do you know how this is possible? --Jobu0101 (talk) 20:15, 20 January 2016 (UTC)

Query deadline with a filter?

Hello, as part of the Wiki Loves Women writing contest I am trying to find which African women are missing an article on French wikipedia. At this time I have the following query :

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX q: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX schema: <http://schema.org/>

SELECT DISTINCT ?pers ?paysLabel ?persLabel (COUNT(distinct ?sitelink) as ?count) WHERE {
  ?pers wdt:P31 wd:Q5 .
  ?pers wdt:P21 wd:Q6581072 .
  ?pers wdt:P27 ?pays .
  ?pays wdt:P30 wd:Q15 .
  ?sitelink schema:about ?pers .
  filter not exists {
    ?pays wdt:P30 ?continent 
          filter( ?continent != wd:Q15) .
  }
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "fr,en" .
  }
} group by ?pers ?paysLabel ?persLabel
order by Desc(?count)
limit 10
Try it!

This works fine, but it includes items with a link on fr wiki. I have been trying to add the following string to this query : FILTER NOT EXISTS { ?wfr schema:about ?pers . ?wfr schema:inLanguage "fr" }

But when I add this line, I get an error message of a "Query deadline". I suspect it is because of my query being too complicated, any idea on how to optimize it? Thanks, Symac (talk) 20:40, 20 January 2016 (UTC)

Same issue as with #Only items without English article above.
Solution posted at Wikidata:Bistro#Spqarql et WikiLovesWomen.3F. Jheald (talk) 21:40, 20 January 2016 (UTC)

Last edit

Is it also possible to ask for the last edit of an item using SPARQL?  – The preceding unsigned comment was added by Jobu0101 (talk • contribs) at 14:05, 21 January 2016 (UTC).

I think that information may only be stored in the SQL tables; but pinging @Smalyshev (WMF): who will know much more. Jheald (talk) 14:47, 21 January 2016 (UTC)
@Jobu0101: You can see last revision and timestamp by using schema:dateModified and schema:version:
PREFIX schema: <http://schema.org/>SELECT * WHERE {
  wd:Q1 schema:dateModified ?time .
  wd:Q1 schema:version ?rev .
}
Try it!

--Smalyshev (WMF) (talk) 07:15, 25 January 2016 (UTC)

Which properties contains a value

I know it isn't quite nice to ask "Hey, can you write a query for me?", but in this case I must be that guy, who asks that. Currently I don't have any idea how to query that and didn't find anything useful for my query in question in examples. So... I have Latvia (Q211) and I would like to know, which properties have used it as value. And if the change isn't trivial, then it would be nice to know, how the query would look like for tree (tree[211][150][17,131]). --Edgars2007 (talk) 20:37, 22 January 2016 (UTC)

@Edgars2007: Really quick & dirty, for the first part I got that : https://query.wikidata.org/#PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0A%0A%0ASELECT%20distinct%20%3Fprop%20where%20{%20%0A%20%20%3Fitem%20%3Fprop%20wd%3AQ211%0A}
For the second part, would not be that hard to write. In the tools on query.wikidata.org there is a WDQ translator. Just mix the result and my query :) author  TomT0m / talk page 21:47, 22 January 2016 (UTC)
So simple query, I feel quite lame :D Thanks! --Edgars2007 (talk) 22:05, 22 January 2016 (UTC)
No, the change to tree isn't so trivial to me. Also mixing doesn't help. Could you write also that? Dirty is completely fine. --Edgars2007 (talk) 22:33, 22 January 2016 (UTC)
@Edgars2007: It would look like that I think. Is this intended there is only a few results ? https://query.wikidata.org/#prefix%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0Aprefix%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0ASELECT%20distinct%20%3Fprop%20WHERE%20{%0A%20%20%3Ftree0%20%28wdt%3AP150%29*%20%3Fitem%20.%0A%20%20%3Ftree0%20%28wdt%3AP17%7Cwdt%3AP131%29*%20wd%3AQ211%20.%0A%20%20%3Fitem%20%3Fprop%20wd%3AQ211%20.%0A}

@TomT0m: I don't think so. In theory I would suspect that there are at least as many properties, as were for the first one. And isn't your query wrong? I start to understand SPARQL, and I think this is what I need. I get property for those latitem, which are in tree, right? --Edgars2007 (talk) 08:04, 6 February 2016 (UTC)

@Edgars2007: Oh yes you're right, I don't think I fully understood the TREE statement in WDQ so I was not sure of what you wanted :) This query is "any property that has a statement with a value in any item (wd knows is) located in Latvia (Q211)  View with Reasonator View with SQID". By the way, a similar WDQ example on the help page translated in SPARQL : https://query.wikidata.org/#prefix%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0Aprefix%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0ASELECT%20%3Fitem%20WHERE%20{%0A%20%20%3Ftree0%20%28wdt%3AP150%29*%20%3Fitem%20.%0A%20%20%3Ftree0%20%28wdt%3AP17%7Cwdt%3AP131%29*%20wd%3AQ30%20.%0A%20%20%3Fitem%20wdt%3AP138%20wd%3AQ676555%20.%0A} timesout unfortunately with current wdqs. author  TomT0m / talk page 12:39, 6 February 2016 (UTC)
author  TomT0m / talk page 12:39, 6 February 2016 (UTC)

Log() function?

Is there a way to calculate a logarithm with sparql/blazegraph ? If not, apparently there are various ways to define a function server side. Could this be added?
--- Jura 12:07, 25 January 2016 (UTC)

As a workaround, I came up with "strlen(STR(ROUND(?quantity)))"
--- Jura 13:04, 25 January 2016 (UTC)
@Jura1: I think those functions are usable in SPARQL. Just need to understand how :) I'll let you know. author  TomT0m / talk page 13:23, 25 January 2016 (UTC)
Actually no they are not : https://query.wikidata.org/#PREFIX%20math%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2005%2Fxpath-functions%2Fmath%2F%23%3E%0A%0ASELECT%20%3Fval%20WHERE%20{%0A%20BIND%28%20math%3Alog%28100%29%20AS%20%3Fval%29%20.%0A} seems to returns an unsupported function exception. It seems that they are not included in the SPARQL standard yet. Maybe a feature request to the Blazegraph guys would do the trick. author  TomT0m / talk page 13:41, 25 January 2016 (UTC)
Thanks for investigating. For it to work, sounds like something User:Smalyshev (WMF) would need to do.
--- Jura 14:14, 25 January 2016 (UTC)
When I wanted cosine() and sine() I ended up writing a power series -- see last example under Wikidata:SPARQL_query_service/queries#Working_with_co-ordinates (now about to become redundant, with real geo-query functions about to be activated with the upgrade to Blazegraph 2.0).
If you know roughly the size of what you want to take the logarithm of, you could do something similar based on the series for log(1+x) Jheald (talk) 20:38, 25 January 2016 (UTC)
Not really. I tried to find a way to define the "scale of values" link on property talk pages of quantities, e.g. at Property_talk:P2370. For this type of overview, the "number of digits" approach can do.
--- Jura 15:09, 26 January 2016 (UTC)
Looks like currently it isn't supported. I wonder what do you want to use logarithms? Can you provide a use case example? Even better, submit a ticket to phabricator.wikimedia.org to project "Wikidata Query Service" and describe it there. We are able to define custom functions, it just requires some work. --Smalyshev (WMF) (talk) 22:50, 27 January 2016 (UTC)

How to filter on language of monolingual text?

I want all items with a female form of label (P2521) statement, which doesn't have a statement in Dutch (languagecode nl). Getting all items with a P2521 claim is easy, but no idea how to filter on the language of the P2521 claim. Mbch331 (talk) 13:53, 25 January 2016 (UTC)

Try adding ". FILTER(?lang!="nl")" to the one for all languages at Help:Female_form_of_label.
--- Jura 14:14, 25 January 2016 (UTC)
Just discovered that myself with help of the query's on top of the property documentation. But thanks. Mbch331 (talk) 14:16, 25 January 2016 (UTC)
Good to hear that they work! .. BTW, I added one ranked by P106 to the help page. Top NL one is "politicus".
--- Jura 16:35, 25 January 2016 (UTC)

Deleted result

@Smalyshev (WMF):

SELECT ?item WHERE {
  ?item wdt:P345 "nm3943521"
}
Try it!

This query returns a deleted result (Q20882663). --Jobu0101 (talk) 14:09, 26 January 2016 (UTC)

Seems to be the same server sync issue encountered above:
--- Jura 12:27, 30 January 2016 (UTC)

Blank error

This query:

SELECT ?item ?type  WHERE { 
  ?item wdt:P345 ?imdb . 
  ?item wdt:P31 ?type. 
  filter (substr(?imdb,0,2) ='nm')
  filter (?type != wd:Q5)
}
Try it!

gives as a result ERROR: and that's it. It doesn't help me to understand if and what I'm doing wrong. Mbch331 (talk) 12:11, 30 January 2016 (UTC)

You broke it ;)
.. it works with "LIMIT 10". If you replace "FILTER" with "MINUS" it runs sample.
--- Jura 12:27, 30 January 2016 (UTC)

Grouping by only one variable

Is there any way / workaround to group a sparql query by only one variable from the SELECT statement? I have the following query and want to group it by ?player only:

SELECT ?number ?playerLabel ?birthday ?nationalityLabel
WHERE 
{
  wd:Q21011427 wdt:P527 ?player .
  OPTIONAL { ?player wdt:P569 ?birthday }.
  OPTIONAL { ?player wdt:P27 ?nationality }.
  OPTIONAL { ?player wdt:P1618 ?number }
    
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de" .
  }
}
Try it!

How can I do the grouping? Adding GROUP BY ?player doesn't work obviously. Is there any possible way? Thanks, Yellowcard (talk) 11:15, 1 February 2016 (UTC)

Sometimes one can avoid "GROUP BY" by using "SELECT DISTINCT", but this doesn't work in this case.
Maybe grouping by player and birthday with "GROUP_CONCAT" gives you what you look for (sample).
--- Jura 11:39, 1 February 2016 (UTC)
@Yellowcard: "Group by" is usually used to make operation on the grouped results, for example counting the number of exact count of a value found in the set of ?player. If you want only one line of result per player (nationality) you should use "distinct" before ?player.
<syntaxhighligh lang="SPARQL">

PREFIX wikibase: <http://wikiba.se/ontology#> PREFIX wd: <http://www.wikidata.org/entity/> PREFIX wdt: <http://www.wikidata.org/prop/direct/> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT distinct ?number ?playerLabel ?birthday ?nationalityLabel WHERE {

 wd:Q21011427 wdt:P527 ?player .
 OPTIONAL { ?player wdt:P569 ?birthday }.
 OPTIONAL { ?player wdt:P27 ?nationality }.
 OPTIONAL { ?player wdt:P1618 ?number }
   
 SERVICE wikibase:label {
   bd:serviceParam wikibase:language "de" .
 }

}</syntaxhighligh>

Or if you want to count the number of nationalities you mays use something like
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/> 
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?player (count(?nationality) as ?count)
WHERE 
{
  wd:Q21011427 wdt:P527 ?player .
  OPTIONAL { ?player wdt:P569 ?birthday }.
  OPTIONAL { ?player wdt:P27 ?nationality }.
  OPTIONAL { ?player wdt:P1618 ?number } . 

  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de" .
  }
} group by ?player
author  TomT0m / talk page 11:48, 1 February 2016 (UTC)
UPDATE: you may also use an aggregation operation that concatenates all the results : [1] . Needs to be reworked so that it concatenates the labels :) Hint : it does not seem to work just adding ...Label :) author  TomT0m / talk page 11:53, 1 February 2016 (UTC)
Looks good, Jura, thanks. Only problem seems to be the number as I have multiple numbers for players with multiple nationalities. I tried to fix this by using STRBEFORE and remove the part after the first comma, but that deletes the numbers of the players with only one nationality ...:
SELECT (STRBEFORE(GROUP_CONCAT(?number; separator=", "),",") as ?numbers) ?playerLabel ?birthday (GROUP_CONCAT(?nationalityLabel; separator=", ") as ?nationalities)
WHERE 
{
  wd:Q21011427 wdt:P527 ?player .
  OPTIONAL { ?player wdt:P569 ?birthday }.
  OPTIONAL { ?player wdt:P1618 ?number }
  OPTIONAL { 	?player wdt:P27 ?nationality .
  				?nationality rdfs:label ?nationalityLabel FILTER(lang(?nationalityLabel)="de")
    		}
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "de" .
  }
}
GROUP BY ?player ?playerLabel ?birthday
Try it!
Any idea to fix this, maybe? Yellowcard (talk) 11:57, 1 February 2016 (UTC)
Not really. If you want to do a list on Wikidata or Wikipedia, you could use Listeria to define the fields you want to display and merely select items with SPARQL sample.
--- Jura 12:02, 1 February 2016 (UTC)
@Yellowcard: Not sure why you're removing the number before the first comma, if you actually want to keep it. Why not this tinyurl.com/zdh2mkp ? Or this tinyurl.com/z7qoo8a, if you want to suppress repeated identical numbers ? Jheald (talk) 15:09, 1 February 2016 (UTC)
@Jheald: The first one is what Jura suggested before and I tried to remove the part after the comma, not the part before. However, using DICTINCT is a good idea, thanks for that!
@Jura1: Listera seems to be down at the moment, at least for me. Or can you reach [2]? Cheers, Yellowcard (talk) 15:36, 1 February 2016 (UTC)
It seems to be out temporarily. For the query, maybe ?birthday should get concatenated as well in case one is missing.
Good idea Jheald! BTW, can we sort within GROUP_CONCAT?
--- Jura 16:53, 1 February 2016 (UTC)
@Jura1: How would you concatenate ?birthday? Yellowcard (talk) 17:13, 1 February 2016 (UTC)
Just like the other fields sample.
--- Jura 17:20, 1 February 2016 (UTC)
OK, but what is the purpose of this? Do I avoid a grouping of all players who don't have set a birthday (I currently don't have a test scenario for that)?
And a general question: Why can't I choose fields in the SELECT statement that I don't group, just as it works in MySQL and other database languages? Is this a known bug or does it have any useful background? Yellowcard (talk) 17:38, 1 February 2016 (UTC)
I deleted a date to test it :( Not sure about the reasons behind it. I'm somewhat new to SPARQL as well.
--- Jura 17:47, 1 February 2016 (UTC)

Group by ?award_statement but show ?award

Why is [3] and [4] possible but not [5]? And how can I fix it? What I actually want is to show ?award in the result table but to group by ?award_statement in order to get different lines in the result for Geena Davis (Q280098) and Susan Sarandon (Q133050) because they belong to different nominations. --Jobu0101 (talk) 09:51, 3 February 2016 (UTC)

If you remove "order by", it works. The error "com.bigdata.rdf.internal.NotMaterializedException: Vocab(2)" is mentioned in BLZG-1591,
different from the ones I usually get ("bad aggregate" or "timeout").
--- Jura 10:36, 3 February 2016 (UTC)
Thanks. So if I want them sorted I could use a surrounding SELECT only for sorting. --Jobu0101 (talk) 18:14, 3 February 2016 (UTC)
The question in the previous section leads me to conclude that "GROUP_CONCAT" is a workaround for many things.
BTW, maybe the error is a regression linked to the the update. @Smalyshev (WMF): what do you think? --- Jura 05:35, 4 February 2016 (UTC)
Nevermind, it's mentioned in [6].
--- Jura 06:18, 4 February 2016 (UTC)
@Jura1: @Jobu0101: there seems to be a bug with NotMaterializedException, we are working on it, it'll be fixed soon I hope. Looks like there's some issue with sorting or grouping by URI-typed nodes. --Smalyshev (WMF) (talk) 06:23, 4 February 2016 (UTC)
It appears that you need to repeat all selected properties in "GROUP BY", but by doing that OPTIONAL looses its effect. Is this consistent with SPARQL syntax?
--- Jura 06:27, 4 February 2016 (UTC)
An alternative that is sometimes be useful is to use (SAMPLE(?var) AS ?var), rather than putting ?var in the GROUP BY, if you know that ?var will have at most one value within the group (or if you are only interested in at most one value). I think if you use SAMPLE, or alternatively GROUP_CONCAT, in this way then that can cope with OPTIONAL values not necessarily being there. Jheald (talk) 15:52, 4 February 2016 (UTC)
Interesting idea! Hadn't thought about using it that way.
--- Jura 18:23, 8 February 2016 (UTC)

Claims with item as value

How do I list all items which have a claim with a certain item as value? Something like that: ?item ?property wd:Q???????. --Jobu0101 (talk) 14:20, 8 February 2016 (UTC)

Pretty much as you wrote:
SELECT ?item ?itemLabel ?prop ?propLabel WHERE {
   ?prop wikibase:directClaim ?wt_prop .
   ?item ?wt_prop wd:Q1744
   SERVICE wikibase:label {
     bd:serviceParam wikibase:language "en" .
   }
}
Try it!

gives all items with properties with Madonna (Q1744) as their object. (Alternatively, just look up Madonna on Reasonator). Jheald (talk) 14:43, 8 February 2016 (UTC)

Thank you very much! --Jobu0101 (talk) 09:57, 10 February 2016 (UTC)

Bug? "bes" works, but not "best"

In this query, I try to change "bes" to "best" and it disappears in the result. Same happens when changing "as" to "all". I had the same problem when trying a version with BIND. Any idea what's happening / how to fix it?
--- Jura 18:23, 8 February 2016 (UTC)

That's so odd, it's got to be a code bug -- especially when it breaks with BIND + SAMPLE as well. The version of Blazegraph is being upgraded to Blazegraph 2.0 soon (there's a couple of reversions & a race condition that have apparently been identified and need to be ironed out first) -- so it would be worth trying again, I think, once that is installed. Jheald (talk) 21:33, 8 February 2016 (UTC)
On further thought, it's probably worth pinging @Smalyshev (WMF): about this -- so that he can check this before Blazegraph 2.0 is fully rolled out; then, if the bug is still persisting, it's just possible the developers might be able to fix it in the current shake-down phase for version 2.0 Jheald (talk) 14:18, 10 February 2016 (UTC)
Looks like a bug. I'll check it out. --Smalyshev (WMF) (talk) 19:16, 10 February 2016 (UTC)
Checked on BG 2.0, does not seem to happen anymore. So probably will be fixed in 2.0 upgrade. I'll keep an eye on it just in case. --Smalyshev (WMF) (talk) 19:17, 10 February 2016 (UTC)
@Smalyshev (WMF): That's good news.
While you're looking at BG 2.0, is there any chance of seeing whether this bug is still present? It looks like somebody just left it out of a list of allowed commands. I filed a bug for it 4 months ago (BLZG-1564) with an example query, but so far it looks as if not a single living soul has ever read it. Jheald (talk) 20:08, 10 February 2016 (UTC)
Actually, the behaviour has changed since I originally filed it. The test query used to give two blank columns on the right-hand side; now, for no apparent reason, it just times out. Jheald (talk) 20:17, 10 February 2016 (UTC)
@Jheald: I'll check it out, but probably will take a bit of time as we're busy with bugs that block 2.0 upgrade. Once those are out of the way, I'll check on this one. --Smalyshev (WMF) (talk) 00:40, 17 February 2016 (UTC)
A workaround for this is to use something else than a string, e.g. SELECT (wikibase:statementProperty as ?use).
--- Jura 13:45, 14 February 2016 (UTC)

+Example(s)

I think this example by Matěj Suchánek also can be added in Wikidata:SPARQL query service/queries. One more useful example can be comparison of labels in different languages in the same item, for example to find all items about French communes where Italian labels differ from French. --XXN, 21:52, 8 February 2016 (UTC)

Multiple values

SELECT ?item ?itemLabel ?iedvien
WHERE {
  ?item wdt:P131 ?iedvien
  VALUES ?item { wd:Q15218492 }

 SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Try it!

Can somebody explain, why the query gives only one P131 value, while Aizkalne (Q15218492) has two (at least currently)? --Edgars2007 (talk) 14:01, 13 February 2016 (UTC)

@Edgars2007: One is higher ranked than the other. The "wdt:" namespace is set up such as to return only the best ranked available claims. There is other namespaces if you want a query that retrieves all the information. author  TomT0m / talk page 15:16, 13 February 2016 (UTC)
(edit conflict) OK, now I know, why (wdt vs p prefix), thank you Madonna and Voll for reminder. While in this case label for iedvien isn't important, how could I get real label for it?
PREFIX p: <http://www.wikidata.org/prop/>SELECT ?item ?itemLabel ?iedvien ?iedvienLabel
WHERE {
  ?item p:P131 ?iedvien
  VALUES ?item { wd:Q15218492 }

 SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
Try it!
--Edgars2007 (talk) 15:19, 13 February 2016 (UTC)

Try this:

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>

SELECT ?item ?itemLabel ?iedvien ?iedvienLabel
WHERE {
  ?item p:P131 ?statement . 
  ?statement v:P131 ?iedvien .
  VALUES ?item { wd:Q15218492 }

 SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}|
Try it!

--Jobu0101 (talk) 15:26, 13 February 2016 (UTC)

Thanks guys for help :) --Edgars2007 (talk) 15:29, 13 February 2016 (UTC)

Counts for property talk pages

I expanded the "counts" link on property talk pages a bit (e.g. on Property talk:P26 top right corner). It seems that when counting statements with "BestRank" (wdt), deleted items can get included, while counts using wikibase:rank (Normal/Preferred/Deprecated) don't include them .. or maybe my query needs to be fixed.
--- Jura 14:03, 14 February 2016 (UTC)

Can't use aliases for labels from SERVICE wikibase:label

I want to use a query in Listeria (Magnus' tool), but that doesn't recognise mixed case labels. So I thought of creating all lowercase aliases for the labels, but that doesn't work. Then the result is empty for those fields. I can't use rdfs:label here, because that makes the query too hard and gives a timeout.

My query is:

select distinct ?item (?itemLabel as ?itemlabel) ?person (?personLabel as ?personlabel) ?humancheck (?humancheckLabel as ?humanchecklabel) WHERE
{
  ?item wdt:P58 ?person .
  OPTIONAL {?person wdt:P31 ?humancheck .}
  FILTER(?humancheck NOT IN (wd:Q5,wd:Q159979,wd:Q2985549,wd:Q10648343,wd:Q6934877,wd:Q16017119,wd:Q61002,wd:Q3046146,wd:Q14073567,wd:Q14756018))
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "nl,en" .
   } 
  } ORDER BY ?person
Try it!

Any ideas how to fix this? I also left a message on Magnus' talk page requesting Listeria to respect the case of the variables. Mbch331 (talk) 14:21, 14 February 2016 (UTC)

Something like this helped me get around the Listeria limitation. Sample at Help:Female form of label/lists/nl.
--- Jura 15:08, 14 February 2016 (UTC)
This doesn't have the Dutch labels, but does run within the time: tinyurl.com/z66dlg5 Jheald (talk) 00:56, 15 February 2016 (UTC)
Magnus fixed Listeria, so aliases are no longer needed. Mbch331 (talk) 07:08, 15 February 2016 (UTC)
@Jheald: Thanks for reminding us of this option. I added three basic samples to WQS.
--- Jura 12:09, 17 February 2016 (UTC)

Sometimes I need to rethink my logic for SPARQL to give the right result

For a tool I created I needed to return all the items (+current value and GUID for statement) with the value for YouTube channel ID (P2397) that don't start with UC and aren't 24 characters long (a correct YouTube channel id starts with UC and is 24 characters long). I wrote a query and it seemed to work:

PREFIX p: <http://www.wikidata.org/prop/>
SELECT ?item ?YouTube ?YouTubeEntity WHERE {
	?item wdt:P2397 ?YouTube .
    ?item p:P2397 ?YouTubeEntity .
    filter(SUBSTR(str(?YouTube),1,2)!="UC")
           }
Try it!

No problems until there was an item with 2 channel id's. My tool ran 4 times instead of the expected 2 times. Because both GUID's were returned for each value.

After trying various things, I finally found out that my logic was wrong. I needed to get the GUID's first and not the values for the P2397 claim, but the values for the GUID's.

PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX p: <http://www.wikidata.org/prop/>
SELECT ?item ?YouTube ?YouTubeEntity WHERE {
	?item p:P2397 ?YouTubeEntity .
    ?YouTubeEntity ps:P2397 ?YouTube .
    filter(SUBSTR(str(?YouTube),1,2)!="UC")
           }
Try it!

Posting it here, so people who run into the same problem can find the solution. Learned 2 things: don't always use wdt: and sometimes my logic needs adjusting. Mbch331 (talk) 21:56, 16 February 2016 (UTC)

The Sparql keyword DISTINCT can help cut down duplicate returns. Jheald (talk) 23:44, 16 February 2016 (UTC)
In this case that didn't help. All rows were unique, however with the first query for each value of P2397 all GUID's were returned (because I was thinking the wrong way, the GUID can't be derived from the value, only from the item). The first query fetched the GUID and value independent of each other, which made each row still being unique. There was no way of telling which GUID belonged to which value. The second query fetches the value based on the GUID, so the needed dependency was there. Mbch331 (talk) 08:14, 17 February 2016 (UTC)

Queries

Can someone help me with a SPARQL query which lists pairs of items with the same numeric value for KOATUU ID (P1077)?

Also I'm interested in queries which finds different items with:

  • 1) the same title of sitelinks for pages in certain different wikipedias (similar algorithm and result with Pasleim's projectmerge)
  • 2) titles of sitelinks for certain wikipedia(s) matched by advanced search (to find pages in one or more wikipedia(s) [and their items]) with similar page_name, for example pairs like: enwiki.title & enwiki.title_(disambiguation), or enwiki.title_(disambiguation) & frwiki.title_(homonymie) where enwiki.title=frwiki.title.
I can run SQL Queries against databases for this, but I'd like to see how they looks in SPARQL and to make use of features of this medium also.

--XXN, 15:13, 19 February 2016 (UTC)

@XXN: got this for identical Kotaku ids : https://query.wikidata.org/#PREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0A%0ASELECT%20%3Fi1%20%3Fi1Label%20%3Fi2%20%3Fi2Label%20where%20{%0A%20%20%3Fi1%20wdt%3AP1077%20%3Fval%20.%0A%20%20%3Fi2%20wdt%3AP1077%20%3Fval%20.%0A%20%20filter%28%20!%28%3Fi1%20%3D%20%3Fi2%29%20%26%26%20str%28%3Fi1%29%20%3C%20str%28%3Fi2%29%20%29%0A%20%20SERVICE%20wikibase%3Alabel%20{%0A%09%09bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cfr%22%20.%0A%20%20}%0A} author  TomT0m / talk page 16:33, 19 February 2016 (UTC)
Thanks! --XXN, 17:04, 19 February 2016 (UTC)
FYI, I created a template to generate such queries with a property number : Template:Identical id value. I'll create the corresponding Partial query next. author  TomT0m / talk page 17:41, 19 February 2016 (UTC)
Would be very nice to extend a little bit the first request (pairs of items with the same numeric value for KOATUU ID (P1077)) with query, which finds items with the same value for two (or more) properties, for example, the same Sports-Reference.com Olympic athlete ID (archived) (P1447) and date of birth (P569). It would help identify the real duplicates better. OK, maybe date of birth (P569) isn't the best example, but you got the idea. P.S. It was "requested" also here. --Edgars2007 (talk) 19:13, 19 February 2016 (UTC)
To answer my own question (was too lazy to make that query myself at that moment): the query. But I assume, it can't be used as replacement for constraint violations, right? It gives some old results. Like in my query it gives Watt W. Webb (Q15075639) and Watt W. Webb (Q7974918), which were merged a while ago. Question (can't do some tests): this piece of code will find those pairs, where i1 or i2 has P27:Q211:

{ ?i1 wdt:P27 wd:Q211 . } UNION { ?i2 wdt:P27 wd:Q211 . }? --Edgars2007 (talk) 12:59, 24 February 2016 (UTC)

Actually (contrary to my autodeleted post a little bit earlier) your union constructs 2 identical datasets except for the name of the column. As the variables ?i1 and ?i2 are not shared beetween your alternatives you will get something like
?1 ?i2
plop /
plop2 /
/ plop
/ plop2
It's some kind of disjoint sum. If both variables had the same name you would get
?1
plop
plop2
because the two alternatives are identical
So as I understand from your post my approach is wrong. What would be the right one? Oh, to correct me: need those pairs, where i1 and/or i2 has P27:Q211. If I'm saying some stupid things and my UNION will do the job, then sorry. Can't stick words from your post together, but I should :) --Edgars2007 (talk) 19:38, 26 February 2016 (UTC)
@Edgars2007: the template {{Identical id value}} can now be passed several properties. The duplicated now are those items who have similar values for each of this properties. Is this what you wanted ? author  TomT0m / talk page 17:56, 7 March 2016 (UTC)
Looks nice, thanks! --Edgars2007 (talk) 18:09, 7 March 2016 (UTC)

Blazegraph 2.0 is coming

Upgrade is happening later today [7]. It would be interesting to see queries illustrating new features of version 2.0.
--- Jura 10:41, 24 February 2016 (UTC)

It seems that "not in" stopped working.
--- Jura 19:04, 25 February 2016 (UTC)
@Smalyshev (WMF): can you confirm this? "not in" after logical-or doesn't seem to work. It seems to be a bug of Blazegraph.
--- Jura 09:14, 27 February 2016 (UTC)
Looks like a bug. I'll take a look. Thanks for the report! --Smalyshev (WMF) (talk) 23:24, 29 February 2016 (UTC)
@Jura1:, Yes, it is a bug, see https://jira.blazegraph.com/browse/BLZG-1785 --Smalyshev (WMF) (talk) 22:48, 1 March 2016 (UTC)
@Smalyshev (WMF): I wonder if it's the same bug. Items that should be returned have either no P17 or a P17 that's not P17=Q30 or P17=Q148 (for above sample query, based one of WQS' samples).
They seem to be testing something where MINUS and FILTER don't return the same elements. I don't think that happened before either, but I'm not sure.
--- Jura 23:08, 1 March 2016 (UTC)
Actually, may it is it is the same Version with Minus.
--- Jura 23:24, 1 March 2016 (UTC)

Prefix

To not confuse users, would it be okay if we change the prefixes in the examples to the standard prefixes defined in mw:Wikibase/Indexing/SPARQL Query Examples#Standard Prefixes? --Pasleim (talk) 18:46, 24 February 2016 (UTC)

Sounds good, but I'm not sure how to edit the ones at http:// tinyurl.com . Seems they need to be imported first.
--- Jura 18:49, 24 February 2016 (UTC)

Women by DOB

Hi! My next :) query is women ranked by period of birth. I wrote something like this. Currently there is only two periods, just for simplicity. I get error about FILTER, so I don't know, what the result looks like. In the result I would like to get time period and count of women born in that period. --Edgars2007 (talk) 19:29, 26 February 2016 (UTC)

Try something like this.
--- Jura 19:42, 26 February 2016 (UTC)
Here is one including the periods.
--- Jura 20:09, 26 February 2016 (UTC)
Your first one was my very initial query. The second one is perfect, thank you! Hope, I don't hit the timeout with the resulting query. --Edgars2007 (talk) 20:13, 26 February 2016 (UTC)
The "by century" one I added to property talk pages (top right corner) might work for you as well.
--- Jura 20:16, 26 February 2016 (UTC)
That sounds like a very good alternative. Thanks. --Edgars2007 (talk) 20:47, 26 February 2016 (UTC)

Linking to Commons file pages

There ought to be a simpler way to get SPARQL queries to give links to Commons file pages than having to do the following:

     ?item wdt:P18 ?pic .                                              # has pic
     BIND(STRAFTER(str(?pic), "Special:FilePath/") AS ?pic1) .         # get just the name bit
     BIND(IRI(CONCAT("https://commons.wikimedia.org/wiki/File:", ?pic1)) AS ?pic2).    # turn into link for file page

Pinging @Smalyshev (WMF): ? Jheald (talk) 22:08, 1 March 2016 (UTC)

To get around #Decode_URL? I had Listeria use a template to do the conversion.
--- Jura 22:13, 1 March 2016 (UTC)
As of now, there is no better way, but maybe we need one. Not sure how to do it though since there's only one value for wdt: - maybe custom function? --Smalyshev (WMF) (talk) 22:36, 1 March 2016 (UTC)
I don't think it's a function we want, so much as appropriate triples.
The situation seems to me very similar to the case for "External identifiers" -- where one wants to be able to retrieve either the identifier or the URL, so a new data-type is being defined, to make both accessible.
For Commons images one may want the unescaped file-name (which is what is shown on item pages), the URL for the file page (which is what is linked on item pages), or the URL for the image itself (which is what wdt:P18 is currently returning in queries).
Ideally all three should be easily available from the serialisation in the triple-store -- perhaps with an appropriate wikibase: predicate hung off the p:P18 statement. Jheald (talk) 23:21, 1 March 2016 (UTC)
Please note that RDF is the export format, which can and will be used by third-party tools. So we can not assume every tool knows what "Commons images" are and how to distinguish commons and non-commons images. Also, in RDF semantics having same property have both URI (IRI) and string values is very frowned upon. So if we say P18 is the image URL, it should always be URL. I am not convinced storing three values (which would be largely the same and easily derived from each other) is the best solution. I'd like to hear more about which use cases need which. --Smalyshev (WMF) (talk) 23:38, 1 March 2016 (UTC)
phab:T76042 may also be relevant here. --Smalyshev (WMF) (talk) 00:02, 2 March 2016 (UTC)
Do we say that P18 is the image URL? As far as Wikidata is concerned, on the property talk page, Property_talk:P18, we say that P18 is the filename string, from which the file-page is defined with a URL formatter (and the URL for a file would be returned by a different URL formatter).
The situation of having multiple URL formatters for single property is exactly analogous to what is needed for external identifiers -- compare eg VIAF ID (P214), which has two URL formatters, one for a URL for humans specified by formatter URL (P1630), the other a URL for data specified by formatter URI for RDF resource (P1921).
But what SPARQL returns (currently) is neither of the two, but simply the identfier itself which can be a lot more readable for humans reading the results of a query.
It's worth remembering the fuss there was over MediaViewer, when editors thought people wouldn't be seeing the file description pages with the attribution & licensing information for their files. If SPARQL is only serving up just the images, that may well not be at all popular. Jheald (talk) 00:37, 2 March 2016 (UTC)
The other thing to note is that we're going to be looking to these properties to point to the ways to make more than just images available. As Commons structured data becomes a reality, we may want to tabulate not just the URL for an image, but metadata about it as well -- when was it uploaded, what date does it represent, how is it licensed, what size is it, etc. The 'full' versions of these properties that point to Commons images will need to point to objects which in turn can point to the Commons entities from which this metadata can be obtained, not just image URLs. Jheald (talk) 02:02, 2 March 2016 (UTC)

Optional and Group by

Query.

The main problem. I want to get those items, which don't have at least one of those identifiers (which are in OPTIONAL), ranked by number of sitelinks (you know, kind of more popular films). To get those IDs in the results, I have to include them in GROUP BY, but that screws up everything (returned are those items, which have all five IDs).

The secondary problem. What piece of code I should add to get those items, which don't have all five properties? No, it's not related to main problem (at least directly). If I remove all that sitelinks and group by part, then between wanted results I do get those films, which already has all five IDs. Tried with UNION, but that times out. Ideas? --Edgars2007 (talk) 10:12, 10 March 2016 (UTC)

If you aggregate the fields with GROUP_CONCAT or SAMPLE, you can remove them from GROUP BY (see discussion further up).
This query use "?workaround" to find items with both maps and images.
--- Jura 10:39, 10 March 2016 (UTC)
Thank you, now everything is fine. If somebody is interested, here is the resulting query. --Edgars2007 (talk) 11:07, 10 March 2016 (UTC)

Items with a given label in any language

Hi! Is there an easy way to get items with a given label in at least one language? So I want to find all items that have the label "abc" in English and/or German and/or French and/or ... Any idea? Yellowcard (talk) 01:32, 22 March 2016 (UTC)

Yes is useful also for me. --ValterVB (talk) 20:30, 24 March 2016 (UTC)
select distinct ?item ?label (lang(?label) as ?lang) where {
  ?item rdfs:label ?label filter(str(?label) = "Paris")
} limit 10
Try it!

@Yellowcard, ValterVB:

A better one, with only one result by item for the label "Paris" sorted by number of languages :

select  ?item  (group_concat(lang(?label); separator=", ") as ?lang) (count(?label) as ?nb)where {
  select ?item ?label where {?item rdfs:label ?label filter(str(?label) = "Paris") } limit 350
} group by (?item) order by ?nb
Try it!

Unfortunately it seems that the service can't handle to find more than 400 labels in total before the timeout, so this result is not complete. author  TomT0m / talk page 09:15, 25 March 2016 (UTC)

And if I need only one language? ex. Alla the item with italian label="Paris" or italian label starting with "Paris"? (I'm thinking to dynamic disambiguation pages )
I have a Category:Partial query for that :) {{LabelContains}}
select ?item where { {{tl|LabelStartsWith|?item|Paris|it}} } }}
will give [8]
Except of course I should write a "startwith" version /o\ author  TomT0m / talk page 09:48, 25 March 2016 (UTC) ✓ Done
(works with limit 10, but timeouts with unlimited number of results) author  TomT0m / talk page 10:06, 25 March 2016 (UTC)

Display "tree map"

Is there a query where we can get this to work? I keep getting "generating view".
--- Jura 12:50, 26 March 2016 (UTC)

sssloww

Please see https://phabricator.wikimedia.org/T130797
--- Jura 11:54, 27 March 2016 (UTC)

Quantities

How to work with quantities? I have a query, where I want to order lakes by area (P2046) (which probably has mainly two quantity types: hectare (Q35852) and square kilometre (Q712226)). First, I would need to get the hectare (Q35852)/square kilometre (Q712226)/etc. itself shown. Then I could move forward myself or with further questions :) Edgars2007 (talk) 11:12, 13 April 2016 (UTC)

Trying converting with conversion to SI unit (P2370)
--- Jura 11:14, 13 April 2016 (UTC)
this does the conversion to SI units --Pasleim (talk) 12:19, 13 April 2016 (UTC)
Nice, thank you! --Edgars2007 (talk) 12:43, 13 April 2016 (UTC)
@Pasleim, Edgars2007: and now the partial query template : Template:Value in SI unit :) author  TomT0m / talk page 14:10, 13 April 2016 (UTC)
Thank you. Edgars2007 (talk) 19:41, 13 April 2016 (UTC)

Search for items with certain label or "also known as" string

Hi! I want to write a query that lists all items that have a certain string as label or "also known as" string. It would be cool if additionally ?lang could be bound to the respective language. Can you help me with that query? As string take for example "TV total". --Jobu0101 (talk) 16:54, 27 April 2016 (UTC)

I want to search to be case insensitive. So it would also be nice if the result string could also be bound to a variable. --Jobu0101 (talk)
@Jobu0101: This does the job :
select ?item ?label ?lang where {
  
  { { ?item rdfs:label ?label } union  { ?item skos:altLabel ?label } }
  filter (lcase(str(?label)) = lcase("Paris") )
  bind(lang(?label) as ?lang) .
} limit 10
Try it!
however this does not work for your example, too long to compute. Also maybe it would be useful to take only one result per item. author  TomT0m / talk page 19:32, 27 April 2016 (UTC)
Thank you. Yeah, it's sad that such functions take such a long time to compute. --Jobu0101 (talk) 19:39, 27 April 2016 (UTC)
@TomT0m: Can I speed up the query by allowing only a few languages? For example English, German and French? --Jobu0101 (talk) 19:43, 27 April 2016 (UTC)
You can try : https://query.wikidata.org/#select%20%3Fitem%20%3Flang%20where%20{%0A%20%20%0A%20%20{%20{%20%3Fitem%20rdfs%3Alabel%20%3Flabel%20filter%28lang%28%3Flabel%29%20%3D%20%22fr%22%20%7C%7C%20lang%28%3Flabel%29%20%3D%20%22en%22%20%7C%7C%20lang%28%3Flabel%29%20%3D%20%22de%22%20%29%20}%20union%20%20{%20%3Fitem%20skos%3AaltLabel%20%3Flabel%20filter%28lang%28%3Flabel%29%20%3D%20%22fr%22%20%7C%7C%20lang%28%3Flabel%29%20%3D%20%22en%22%20%20%7C%7C%20lang%28%3Flabel%29%20%3D%20%22de%22%29%20}%20}%0A%20%20filter%20%28lcase%28str%28%3Flabel%29%29%20%3D%20lcase%28%22Paris%22%29%20%29%0A%20%20bind%28lang%28%3Flabel%29%20as%20%3Flang%29%20.%0A}%20limit%2010 I tried to put the filtering inside the union in case of the union operation is costly, not sure it's efficient. author  TomT0m / talk page 20:01, 27 April 2016 (UTC)
@TomT0m: Thanks. Maybe I'll use the websearch for my purpose. It is much faster. After that I could collect the results and put them into the query to go on working with them.

Problem with simple query

Hi! Is there a bug or am I missing something? I don't see why this query doesn't return a result:

SELECT ?item ?itemLabel ?instance ?instanceLabel
WHERE {
  VALUES ?item {wd:Q23010088} . ?item wdt:P31 ?instance .
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}
}
Try it!

--Jobu0101 (talk) 11:13, 29 April 2016 (UTC)

Possibly Wikidata:Project_chat#Query_service_corruption.
--- Jura 11:42, 29 April 2016 (UTC)
Thanks. --Jobu0101 (talk) 11:52, 29 April 2016 (UTC)
I missed the VALUES keyword. Handy. author  TomT0m / talk page 16:40, 29 April 2016 (UTC)
I learned about it above ;). --Jobu0101 (talk) 17:05, 29 April 2016 (UTC)

Optional column with grouping

Hi! Can you help me with manipulating the following query?

PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

SELECT ?film ?ordinal ?filmLabel ?imdb ((MIN(?pub)) AS ?publication) WHERE {
    ?film p:P179 ?series .
    ?series v:P179 wd:Q2484680.
    ?series pq:P1545 ?ordinal .
    ?film wdt:P345 ?imdb .
  	OPTIONAL {?film wdt:P577 ?pub}
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
GROUP BY ?film ?filmLabel ?imdb ?ordinal
ORDER BY ?publication
Try it!

I'd like to make ?ordinal optional. So all items without having ?ordinal set should also appear with an empty entry in the ?ordinal column. Unfortunately, this doesn't work with OPTIONAL around ?series pq:P1545 ?ordinal because grouping is stronger and won't take such a row. --Jobu0101 (talk) 22:00, 30 April 2016 (UTC)

If I replace ORDER BY ?publication by ORDER BY ?ordinal the list isn't ordered correctly because ?ordinal is saved as string. Can that be fixed, too? --Jobu0101 (talk) 22:17, 30 April 2016 (UTC)
Answer to original question: use sample. Result. --Edgars2007 (talk) 05:34, 1 May 2016 (UTC)
Use ORDER BY ASC(xsd:integer(?ordinal)). --Succu (talk) 06:06, 1 May 2016 (UTC)

@Edgars2007, Succu: I want to thank both of you. That's great! I've got a further question about the solution to the original question. In this case using SAMPLE is a good solution because we've got at most one ordinal number per item. But in general there might be applications where we have many matches. For those items I want the query to behave like with GROUP BY. So I want to get one line per value. Only if there is no match at all I want a line, too (with no value in the respective column). Is there also a soltion for that kind of behaviour? --Jobu0101 (talk) 07:28, 1 May 2016 (UTC)

Hmm, this query shows something wrong for Bond 24, but this probably would be a good start. --Edgars2007 (talk) 08:43, 1 May 2016 (UTC)
@Edgars2007: Thanks. I think the problem occurs because we've got three publication dates. By the way, what I originally intended is one extra row for each value. In your solution all values are concatenated into one row. --Jobu0101 (talk) 14:26, 1 May 2016 (UTC)
Yeah, sorry. Too much words in your question - didn't read it carefully. OK, now I understand what you want have experimented a little bit, but without results. If somebody else wants to experiment, this should return 5 rows, the last person in VALUES list doesn'tdoesn't have the particular property. --Edgars2007 (talk) 08:38, 2 May 2016 (UTC)
  • You could replace ?series pq:P1545 ?ordinal with OPTIONAL { ?series pq:P1545 ?ord } . BIND(IF(!BOUND(?ord),"?",?ord) as ?ordinal) . BTW, shouldn't we provide several ways to number Bond movies? Broccoli's view might not be the only one.
    --- Jura 08:49, 2 May 2016 (UTC)
Thank you. Yes, we should actually create an EON Bond series Wikidata item and move the ordinals over to that series. --Jobu0101 (talk) 09:02, 2 May 2016 (UTC)

Label on precision?

The proposal at Wikidata:Property_proposal/Property_metadata#Wikidata_time_precision would allow to add a label to the number of date-timeprecision.
--- Jura 21:35, 2 May 2016 (UTC)

Please help with filtering by non-constant language range

Dear folks, I'm trying to write a bash script for automatic translation of the table of results of the 2016 Democratic Party primaries so that I need the label and article URL for every U.S. state or territory in several languages. I'm struggling, though, with getting the label that corresponds to the ?lang value of each result row. In the code below, line 36, the language code "pt" is hard-coded, so that ?stateLabelInLang always comes in Portuguese. But, misteriously, if you try to comment out line 36 and uncomment line 35, you'll see ?stateLabelInLang returning always blank. Sorry if this question sounds stupid, but it's my first SPARQL query and I have been struggling with this issue for several hours... 😰
Thank you!! --Usien6 (talk) 05:43, 4 May 2016 (UTC)

SELECT ?stateId ?statehoodLabel ?stateFips ?lang ?stateLabelInLang ?stateArticleInLang WHERE {

    ?stateId wdt:P131 wd:Q30

  .
    ?stateId wdt:P31 ?statehoodId

  OPTIONAL
    {
        ?statehoodId rdfs:label ?statehoodLabel
      .
        FILTER (
          str( lang(?statehoodLabel) ) = str( "en" )
        )
    }

  .
    {
        ?stateId wdt:P883 ?stateFips
      .
        FILTER(
          REGEX(?stateFips, "^[A-Z][A-Z]$")
        )
    }

  .
    VALUES ?lang { "cs" "es" "fi" "fr" "it" "ja" "ko" "mr" "nl" "pl" "pt" "ru" "sv" }

  OPTIONAL
    {
        ?stateId rdfs:label ?stateLabelInLang
      .
        FILTER (
          #str( lang(?stateLabelInLang) ) = str( ?lang ) # <-------- Not working as expected...
          #LANGMATCHES(lang(?stateLabelInLang), str(?lang)) # <----- Won't work either...
          LANGMATCHES(lang(?stateLabelInLang), str("pt"))
        )
    }

  OPTIONAL
    {
        ?stateArticleInLang schema:about ?stateId
      .
        ?stateArticleInLang schema:inLanguage ?lang
      . FILTER (
        STRSTARTS(str(?stateArticleInLang), CONCAT("https://", ?lang, ".wikipedia.org/"))
      )
    }

#} ORDER BY ASC(?lang) DESC(?statehoodId) ASC(?stateFips) LIMIT 800 # Production closing
} ORDER BY DESC(?statehoodId) ASC(?stateFips) ASC(?lang) LIMIT 50 # Debug-friendly closing
Try it!



I haven't looked at the entire thing, but usually I do language filtering with lang(?stateLabelInLang) = ?lang. It fails when there is no label in a given language. For US states, this may not be an issue.
--- Jura 06:00, 4 May 2016 (UTC)
Dear @Jura1: Thank you, but it does not work. Label availability is certainly not an issue here because when the language code is hardcoded, it just works. I wonder if ?lang ever reaches the scope of FILTER, even though it does reach the other FILTER scope, below. 🤔 Kind regards, --Usien6 (talk) 16:18, 5 May 2016 (UTC)
@Usien6: did you try without the "str" function in "str("br")" ? As far as I understand, str is necessary for a string with its language (str gives the raw string, and "lang" gives the lang as a string). Seems useless and bug generating prone.
Is that one OK ?
SELECT ?stateId ?statehoodLabel ?stateFips ?lang ?stateLabelInLang ?stateArticleInLang WHERE {

    ?stateId wdt:P131 wd:Q30

  .
    ?stateId wdt:P31 ?statehoodId

  OPTIONAL
    {
        ?statehoodId rdfs:label ?statehoodLabel
      .
        FILTER (
          lang(?statehoodLabel) = "en"
        )
    }

  .
    {
        ?stateId wdt:P883 ?stateFips
      .
        FILTER(
          REGEX(?stateFips, "^[A-Z][A-Z]$")
        )
    }

  .
    VALUES ?lang { "cs" "es" "fi" "fr" "it" "ja" "ko" "mr" "nl" "pl" "pt" "ru" "sv" }

  OPTIONAL
    {
        ?stateId rdfs:label ?stateLabelInLang
      .
        FILTER (
          #str( lang(?stateLabelInLang) ) = str( ?lang ) # <-------- Not working as expected...
          LANGMATCHES(?stateLabelInLang, ?lang) # <----- Won't work either...
          # LANGMATCHES(lang(?stateLabelInLang), str("pt"))
        )
    }

  OPTIONAL
    {
        ?stateArticleInLang schema:about ?stateId
      .
        ?stateArticleInLang schema:inLanguage ?lang
      . FILTER (
        STRSTARTS(str(?stateArticleInLang), CONCAT("https://", ?lang, ".wikipedia.org/"))
      )
    }

#} ORDER BY ASC(?lang) DESC(?statehoodId) ASC(?stateFips) LIMIT 800 # Production closing
} ORDER BY DESC(?statehoodId) ASC(?stateFips) ASC(?lang) LIMIT 50 # Debug-friendly closing
Try it!

──────────────────────────────────────────────────────────────────────────────────────────────────── @TomT0m: Nope. ?stateLabelInLang still all-blanks. I'm starting to think there must be a bug in the endpoint software...  🤔 --Usien6 (talk) 00:35, 7 May 2016 (UTC)

Server down?

There are problems with SPARQL: ERROR: <html> <head><title>502 Bad Gateway</title></head> <body bgcolor="white"> <center><h1>502 Bad Gateway</h1></center> <hr><center>nginx/1.9.4</center> </body> </html>. --Jobu0101 (talk) 18:28, 7 May 2016 (UTC)

Jobu0101, see Wikidata:Contact the development team#Query service returns 502 Bad Gateway for all queries. --Succu (talk) 21:28, 7 May 2016 (UTC)

Items without labels

I'm trying to get most used occupation (P106) without label in Latvian. This times-out. Ideas for optimization? I could use inner SELECT with limit (for P31:Q5), but that is a little bit too high level for my SPARQL knowledge :) --Edgars2007 (talk) 18:14, 8 May 2016 (UTC)

Here is one. To avoid timeouts, I dropped Q5 ..
--- Jura 18:47, 8 May 2016 (UTC)
Nice, thanks (as always)! --Edgars2007 (talk) 18:52, 8 May 2016 (UTC)
I added it to Wikidata:Status_updates/Next.
--- Jura 08:17, 9 May 2016 (UTC)
Thanks for mentioning me as co-author. I don't think I made enough contribution to that query to deserve mention, but anyway - thanks :) --Edgars2007 (talk) 17:32, 9 May 2016 (UTC)


GROUP_CONCAT and ORDER BY

It seems that once in a while "ORDER BY" works to order the results in GROUP_CONCAT. Is there a way to sort it consistently?
--- Jura 08:28, 20 May 2016 (UTC)

Pairs

I'm searching a way for working with pairs given as values. Can you give me the correct syntax?

SELECT ?film ?pos WHERE {
    VALUES (?film,?pos) {(wd:Q172241,1) (wd:Q47703,2) (wd:Q184768,3) (wd:Q163872,4) (wd:Q104123,5)}
}
Try it!

--Jobu0101 (talk) 19:05, 20 May 2016 (UTC)

@Jobu0101: Close -- you just need to remove the commas, and then you're there:
SELECT ?film ?pos WHERE {
    VALUES (?film ?pos) {(wd:Q172241 1) (wd:Q47703 2) (wd:Q184768 3) (wd:Q163872 4) (wd:Q104123 5)}
}
Try it!
-- Jheald (talk) 23:05, 21 May 2016 (UTC)
Great, thank you so much! --Jobu0101 (talk) 23:07, 21 May 2016 (UTC)

Auto-completion broken ?

It seems that autocompletion stopped working recently. It seems to work when one does the first version of a query, but not later.
--- Jura 12:53, 25 May 2016 (UTC)

Works again.
--- Jura 08:10, 11 June 2016 (UTC)

Data outdated

This revert is 15 days ago but SPARQL still doesn't consider it: QUERY. There are many such cases. What can we do about it (apart from manually doing some edit in the problematic items)? --Jobu0101 (talk) 16:50, 30 May 2016 (UTC)

@Jobu0101: Please tell me about them (you can post on my talk page, for example) if they still happen. Please provide item ID, query and edit diff (just as you did). I need more data about updates missing to be able to detect why it happens. --Smalyshev (WMF) (talk) 07:56, 11 June 2016 (UTC)
Thank you for trying to fix the problem. I'll tell you about what I'll find in future. --Jobu0101 (talk) 08:03, 11 June 2016 (UTC)
Somehow we got used to WQS showing somewhat different results so it took us weeks to figure out that deletions aren't reflected at all.
--- Jura 08:09, 11 June 2016 (UTC)

@Smalyshev (WMF): See this query. The result of "Hungary vs. Portugal" is not shown. --Jobu0101 (talk) 10:23, 23 June 2016 (UTC)

@Jobu0101: I think Hungary v Portugal (Q24202018) for Hungary uses number of points/goals conceded (P1359) but query looks for number of points/goals/set scored (P1351). Smalyshev (WMF) (talk) 22:50, 23 June 2016 (UTC)
Thank you for pointing that out. I corrected it. And sorry for posting a wrong example here. I didn't realise that. --Jobu0101 (talk) 05:38, 24 June 2016 (UTC)

@Smalyshev (WMF): Why is Andria-Corato train collision (Q25826723) not shown as result of this query? --Jobu0101 (talk) 07:51, 15 July 2016 (UTC)

On Andria-Corato train collision (Q25826723), number of deaths (P1120) value is marked as deprecated. Deprecated values do not show up in wdt:, though you can use p:/ps: combo to reach them. I have no idea why it's deprecated but that's what it says now. --Smalyshev (WMF) (talk) 19:55, 15 July 2016 (UTC)
@Smalyshev (WMF): I'm sorry. Of course I know that but I didn't realize it. So this was also false alarm. --Jobu0101 (talk) 20:43, 15 July 2016 (UTC)

@Smalyshev (WMF): Maybe this time: Deutschland. Made by Germany (Q25550740) should be listed in [9]. --Jobu0101 (talk) 17:34, 17 July 2016 (UTC)

Meta property to give possibility of doing queries on data set

Hi! In German Wikipedia I'm managing a list of more than 2000 Wikidata items which don't have a German Wikipedia article yet but are supposed to get one. I'm saving the proposed lemma there which gives red links in German Wikipedia much more meaning because they are connected to Wikidata items via my list. Now I want to use SPARQL to get information about those Wikidata items. Unfortunately,

VALUES ?item {wd:Q4661163 wd:Q6047017 wd:Q7774919 wd:Q2500010 wd:Q3745588 wd:Q7769569 wd:Q3821504 wd:Q2328102 wd:Q1755547 wd:Q4768146 wd:Q2357448 wd:Q3999751 wd:Q21076993 wd:Q19796097 wd:Q2428163 wd:Q2159546 wd:Q5167336 wd:Q2615948 wd:Q1134452 wd:Q4659823 wd:Q1753966 wd:Q2293234 wd:Q6090671 wd:Q3988563 wd:Q13222089 wd:Q18811608 wd:Q21456235 wd:Q21458397 wd:Q17560788 wd:Q18155444 wd:Q21468276 wd:Q21443157 wd:Q19750433 wd:Q21226415 wd:Q21246065 wd:Q19800919 wd:Q19264029 wd:Q21368958 wd:Q21174964 wd:Q19985304 wd:Q21095992 wd:Q21096016 wd:Q19608810 wd:Q7286814 wd:Q21334163 wd:Q21174979 wd:Q21328804 wd:Q19730207 wd:Q21334154 wd:Q21031221 wd:Q19878674 wd:Q21096093 wd:Q21075747 wd:Q21096096 wd:Q18164602 wd:Q21095949 wd:Q21031165 wd:Q21095960 wd:Q17998997 wd:Q21095964 wd:Q21007575 wd:Q18125702 wd:Q20032060 wd:Q6940888 wd:Q21007702 wd:Q15831673 wd:Q19895570 wd:Q20674963 wd:Q18152555 wd:Q20080465 wd:Q20240920 wd:Q20962826 wd:Q16025079 wd:Q21027084 wd:Q20962711 wd:Q20899547 wd:Q2582809 wd:Q16249670 wd:Q2894345 wd:Q20898682 wd:Q21008502 wd:Q20873793 wd:Q16998034 wd:Q20873819 wd:Q3795639 wd:Q20873754 wd:Q21008496 wd:Q20850498 wd:Q4204793 wd:Q16677629 wd:Q20850481 wd:Q18711323 wd:Q19706919 wd:Q14453450 wd:Q20817249 wd:Q21008478 wd:Q20011685 wd:Q20971266 wd:Q20751867 wd:Q20971263 wd:Q20630763 wd:Q20971249 wd:Q17071553 wd:Q20795460 wd:Q16531377 wd:Q20795339 wd:Q20795366 wd:Q20817245 wd:Q16661269 wd:Q20856821 wd:Q14859564 wd:Q18170211 wd:Q20643578 wd:Q20795559 wd:Q16249045 wd:Q17605514 wd:Q18920916 wd:Q20113542 wd:Q20855648 wd:Q20855694 wd:Q20113476 wd:Q20855717 wd:Q20080444 wd:Q20742831 wd:Q19984067 wd:Q16007945 wd:Q19998797 wd:Q19998687 wd:Q6726709 wd:Q15839083 ...}

doesn't work anymore because 2000 items are too many for such kind of queries. So I'm asking you for a solution. My solution proposal would be to create a kind of fake property (a meta property) which is set for an item if and only if this item is part of my list. Then I can replace my big value list by simply ?item wdt:P_META_PROPERTY_NUMBER ?sth. Are there already such properties and are they accepted by the community? Or do you have a different solution? --Jobu0101 (talk) 06:01, 4 June 2016 (UTC)

You could give this meta property even more meaning by not setting it to some value but to the proposed lemma for that item in German Wikipedia. This would be a very interesting property also for other languages (or wikis). A qualifier could select the Wikipedia for which the lemma proposal is meant. --Jobu0101 (talk) 06:05, 4 June 2016 (UTC)
Hmm, have you tried tabernacle? Probably not perfect solution, but still. --Edgars2007 (talk) 07:09, 4 June 2016 (UTC)

EM query question

Why is Romania missing in the teams column (first row) of this query?

Maybe because they lost? :D I would be more worried about United Kingdom... --Edgars2007 (talk) 05:44, 12 June 2016 (UTC)
Good point. So why is the "United Kingdom" shown in the line of Wales v Slovakia (Q24201879)? --Jobu0101 (talk) 11:00, 12 June 2016 (UTC)
Okay, I see the reason: Wales national association football team (Q180857) country (P17) United Kingdom (Q145). --Jobu0101 (talk) 11:02, 12 June 2016 (UTC)

Updated query: It really seems to be the case that the problem occurs whenever winner (P1346) is set. Is this a bug? --Jobu0101 (talk) 18:39, 13 June 2016 (UTC)

Okay, I solved the problem. It was my fault. I used ?country twice. --Jobu0101 (talk) 18:46, 13 June 2016 (UTC)
I was just going to write that. BTW, I'd try to use "country for sports" for all teams.
--- Jura 18:52, 13 June 2016 (UTC)
Yeah, that's what I'm using here. --Jobu0101 (talk) 19:03, 13 June 2016 (UTC)
Looks good. This to check for missing ones.
--- Jura 19:16, 13 June 2016 (UTC)
I was just trying something similar: [10]. --Jobu0101 (talk) 19:21, 13 June 2016 (UTC)

List also 0 matches and a sorting problem

Hi guys! How can I manipulate this query such that also items with ?listedParticipants=0 will be shown. Furthermore, why does sorting not work? --Jobu0101 (talk) 12:44, 17 June 2016 (UTC)

@Jobu0101: What you mean by "does sorting not work"? Could you provide more details? You could probably capture those with listedParticipants being zero if you put second clause with ?participant under OPTIONAL too. --Smalyshev (WMF) (talk) 22:56, 23 June 2016 (UTC)
@Smalyshev (WMF): Thanks. OPTIONAL gave me the lines I was looking for. But the sorting problem remains. Sorting by ?instance should group all World Cups together and within that sorting by ?start should order them chronologically. None of that happens. --Jobu0101 (talk) 08:06, 24 June 2016 (UTC)
I think for sorting to work, the variables need to be in the SELECT part.
--- Jura 16:06, 1 July 2016 (UTC)

wikibase:timePrecision "9"^^xsd:integer

@Smalyshev (WMF): I have two questions about this query:

SELECT ?team ?player ?playerLabel ?birth WHERE {
  ?team wdt:P31 wd:Q24199684 .
  ?team wdt:P710 ?player .
  ?player wdt:P569 ?birth . ?timevalue wikibase:timeValue ?birth . ?timevalue wikibase:timePrecision "9"^^xsd:integer
  SERVICE wikibase:label{bd:serviceParam wikibase:language "en,de"}
}
ORDER BY ?birth ?player
Try it!
  1. Why do we get results for this query? For example Steven Davis (Q355807) has precision "day" for his birthday and should not be listed here.
  2. Why do the lines repeat? I know that I can stop that by putting a DISTINCT in the query but I want to understand the reason. Is is a bug?

--Jobu0101 (talk) 08:12, 30 June 2016 (UTC)

This query does not do what you think it does, because ?timevalue is not bound to ?player. What you probably want is this:
SELECT ?team ?player ?playerLabel ?birth ?precision WHERE {
  ?team wdt:P31 wd:Q24199684 .
  ?team wdt:P710 ?player .
  ?player p:P569/psv:P569 [ wikibase:timeValue ?birth ; wikibase:timePrecision ?precision ]
  SERVICE wikibase:label{bd:serviceParam wikibase:language "en,de"}
}
ORDER BY ?birth ?player
Try it!

--Smalyshev (WMF) (talk) 21:56, 30 June 2016 (UTC)

@Smalyshev (WMF): Thank you very much. I didn't know about the square brackets so far. Where can I read about them? --Jobu0101 (talk) 15:50, 1 July 2016 (UTC)
Probably this could answer the initial questions. --Edgars2007 (talk) 17:21, 1 July 2016 (UTC)

Group by language of labels?

Is it possible to group a given set of items by number of labels in each language? I'd like to see which languages are covered well for these items and which languages are not given for any label at all. -- JakobVoss (talk) 12:38, 1 July 2016 (UTC)

Yeah, that's possible. Use ?item rdfs:label ?label to access the label and lang(?label) to access the language. --Jobu0101 (talk) 15:48, 1 July 2016 (UTC)

Thanks, that's more then required! I was looking for -- JakobVoss (talk) 20:08, 2 July 2016 (UTC)

SELECT ?lang (COUNT(DISTINCT ?label) as ?items) WHERE {
    # replace this comment by some query to set ?item
    ?item rdfs:label ?label .
    BIND(lang(?label) as ?lang) 
}
GROUP BY ?lang
Try it!

504 Gateway Time-out

SPARQL doesn't work currently. --Jobu0101 (talk) 06:51, 6 July 2016 (UTC)

Welsh Saints

There should be at least a thousand Welsh Saints on this list. Currently it has a dozen! It should include all saints born or who worked in Wales (Q25) (or counties, towns and villages in Wales). I've used canonization status (P411) = saint (Q43115) but would also like to add Catholic saint (Q3464126). I wish it was just 'drag-and-drop'! Examples of saints NOT listed are: Carantoc (Q2758802), Canna (Q3404543), Saint Eigen (Q7401197), Llechid (Q13129679) and Beuno (Q612390). Llywelyn2000 (talk) 13:26, 19 July 2016 (UTC)

A little bit better query. As I still don't know, how to properly use tree functionality in SPARQL, it still contains only some 60 items. --Edgars2007 (talk) 14:18, 19 July 2016 (UTC)
Is this a SPARQL question or is the result so minimal because of lack data in Wikidata? --Jobu0101 (talk) 14:25, 19 July 2016 (UTC)
@Edgars2007: Thanks! Copied and pasted the code here (which looks different to piped code I've used previously!), with no luck. Any suggestions please? Llywelyn2000 (talk) 14:56, 19 July 2016 (UTC)

Getting the language name from lang()

What do you think of Wikidata:Property proposal/lang? I suppose people around here all tried it ..
--- Jura 15:14, 29 July 2016 (UTC)

Efficient use of P2837 (P2837)

P2837 can be used to get the month name for a date. MONTH(?date) gets us a number and the item for the month has P2837.

Below a series of samples, some work others don't.

All should give a row with the following columns:

  • wd:Q42
  • 3
  • March
(1) Sample
SELECT ?item ?m ?monthnameLabel
{
  	VALUES ?item { wd:Q42 }
	?item wdt:P569 ?date
  	BIND(MONTH(?date) as ?m)

  	?monthname wdt:P2837 ?monthnb
	FILTER(?monthnb = ?m)

	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }      
}
Try it!
(2) It works also as
SELECT ?item ?m ?monthnameLabel
{
  	VALUES ?item { wd:Q42 }
	?item wdt:P569 ?date .

  	?monthname wdt:P2837 ?m
  	FILTER(MONTH(?date)=?m)

	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }      
}
Try it!
(3) It doesn't work with
SELECT ?item ?m ?monthnameLabel
{
  	VALUES ?item { wd:Q42 }
	?item wdt:P569 ?date
  	BIND(MONTH(?date) as ?m)

  	?monthname wdt:P2837 ?m

	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }      
}
Try it!

The problem with the last one seems to be that there is a difference in datatype between MONTH() and wdt:P2837. Is there a way to improve these queries?
--- Jura 12:44, 31 July 2016 (UTC)

The result of wdt:P2837 is of type xsd:decimal and the result of MONTH() is xsd:integer. So the solution is:
SELECT ?item ?m ?monthnameLabel
{
  	VALUES ?item { wd:Q42 }
	?item wdt:P569 ?date
  	BIND(xsd:decimal(MONTH(?date)) as ?m)

  	?monthname wdt:P2837 ?m

	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }      
}
Try it!
Found out by using the function DATATYPE, which gives as it's result the datatype of a value. Mbch331 (talk) 14:30, 31 July 2016 (UTC)
Thanks! I should looking into these functions.
--- Jura 19:16, 31 July 2016 (UTC)

Translation

I propose to mark this page for translation. Any opposion?--GZWDer (talk) 20:24, 2 August 2016 (UTC)

You might want to copy-edit it before .. personally, I didn't find it an easy read, but it's probably not meant as a starter.
--- Jura 22:20, 2 August 2016 (UTC)

Query to get all relatives

Hi, I am trying to write a query, to list all relatives without WikiTreeID of a person, that has WikiTreeID. My problem is, that relation is not always listed. Maybe also relation can be retrieved on better way from wdt:P.., but I don't know how. Can someone help me with this.

The following query uses these:

  • Properties: WikiTree person ID (P2949)  View with Reasonator View with SQID, father (P22)  View with Reasonator View with SQID, mother (P25)  View with Reasonator View with SQID, spouse (P26)  View with Reasonator View with SQID, child (P40)  View with Reasonator View with SQID, P7, P9, sex or gender (P21)  View with Reasonator View with SQID
    #defaultView:Table
    SELECT ?p ?pLabel ?pDescription ?pWikiTree ?pGenderLabel ?rel ?w ?wLabel ?wDescription ?wGenderLabel WHERE {
      {
    FILTER(STRSTARTS(?pWikiTreeID, 'Dal'))
        ?p wdt:P2949 ?pWikiTreeID. # wikitreeID set
        
        { ?p wdt:P22 ?w. values ?rel {'Father'}.}  UNION
        { ?p wdt:P25 ?w. values ?rel {'Mother'}.}  UNION
        { ?p wdt:P26 ?w. values ?rel {'Spouse'}.}  UNION
        { ?p wdt:P40 ?w. values ?rel {'Child'}.}   UNION
        { ?p wdt:P7  ?w. values ?rel {'Brother'}.} UNION
        { ?p wdt:P9  ?w. values ?rel {'Sister'}.}
        
        FILTER NOT EXISTS {?w wdt:P2949 ?wWikiTreeID.}  # wikitreeID empty
    
        BIND (URI(CONCAT("http://www.wikitree.com/wiki/",?pWikiTreeID)) AS ?pWikiTree) # extend wikitreeID
        OPTIONAL { ?p wdt:P21 ?pGender.}
        OPTIONAL { ?w wdt:P21 ?wGender.}
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    LIMIT 100
    

 – The preceding unsigned comment was added by Lesko987a (talk • contribs).

If you only want direct relations that can work. (I added a template to your post).
Otherwise you might want to try the Genghis Khan query. Not sure how to change "linkType" there to use any family relationship. However, this can easily get very large results (check Magnus Manske's http://magnusmanske.de/wordpress/?p=232 )
--- Jura 07:03, 8 August 2016 (UTC)
I found the solution. With BIND it works.
{ ?p wdt:P22 ?w. BIND('Father' AS ?rel)} UNION
--Lesko987a (talk) 14:40, 10 August 2016 (UTC)

All items without a description

This is a question Edoderoo posted on nlwiki, but I think the chance of getting a correct answer is higher here. He wants all items (that match instance of (P31)Wikimedia category (Q4167836)) without a description in Dutch. How to do that? This one doesn't work:

select ?item where {
  select ?item ?itemDescription where {{ SELECT ?item ?itemDescription  WHERE  { ?item wdt:P31 wd:Q4167836 . {service wikibase:label{bd:serviceParam wikibase:language "nl" . }}} LIMIT 9999 } filter (!bound(?itemDescription)) }
 }
Try it!

This one times out (even with a limit of 10):

SELECT ?item ?itemDescription  WHERE  { 
	?item wdt:P31 wd:Q4167836 . 
	OPTIONAL { ?item schema:description ?itemDescription  }
	filter (!bound(?itemDescription)) 
}
Try it!

Any other suggestions? Mbch331 (talk) 06:05, 12 August 2016 (UTC)

SELECT ?item ?itemLabel WHERE {
    {
        SELECT ?item WHERE {
           
           hint:Query hint:optimizer "None" .
           {
               SELECT ?item WHERE {
                   ?item wdt:P31 wd:Q4167836 .
               } LIMIT 300000
           }
		   OPTIONAL { ?item schema:description ?itemDescription  }
	filter (!bound(?itemDescription)) 
          
        }
    }
           
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "nl" .
    }
Try it!
}
A dirty one... Analyzing first 300,000 instances of Wikimedia category (Q4167836) returned two items: alternatives to general relativity (Q3702775), Category:Jimmy Wales (Q6890598). It times-out with 500,000. --Edgars2007 (talk) 06:19, 12 August 2016 (UTC)

If there weren't that many elements, something like the below should do.

The following query uses these:
  • Properties: instance of (P31)  View with Reasonator View with SQID
    SELECT ?item 
    WHERE
    { 
    	?item wdt:P31 wd:Q12308941 . 
    	OPTIONAL { ?item schema:description ?d .  FILTER(lang(?d)="nl") }
    	FILTER (!bound(?d)) 
    }
    ORDER BY RAND()
    LIMIT 10
    

Here it's really something for SQL.
--- Jura 08:22, 12 August 2016 (UTC)

Related: Petscan request. --Edgars2007 (talk) 08:41, 12 August 2016 (UTC)
Easier solution: https://phabricator.wikimedia.org/T141553
--- Jura 08:59, 12 August 2016 (UTC)
That would be solution to this specific problem, Petscan solution is general :) --Edgars2007 (talk) 08:20, 13 August 2016 (UTC)