User:Rdrg109/3/4

From Wikidata

Introduction

This page contains a SPARQL query that can be used to build a frequency table of identifiers of online accounts used in a filtered set of items.

Motivation

I wanted to know which online accounts are used by people who were born in People's Republic of China (Q148).

Content

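SELECT
  ?property
  ?propertyLabel
  ?count
# %0: properties that are identifiers of online accounts
WITH {
  SELECT DISTINCT ?property ?wdt {
    ?property
      a wikibase:Property;
      wdt:P31/wdt:P279* wd:Q105388954;
      wikibase:directClaim ?wdt.
  }
} AS %0
# %1: the filtered set of items (here: occupation (P106) programmer (Q5482740))
WITH {
  SELECT DISTINCT ?item {
    ?item wdt:P106 wd:Q5482740.
  }
  # Cap the set so that the query does not time out
  LIMIT 1234
} AS %1
# %2: for each property, count its statements on the selected items
WITH {
  SELECT ?property (COUNT(*) AS ?count) {
    INCLUDE %0.
    INCLUDE %1.
    ?item ?wdt [].
  }
  GROUP BY ?property
} AS %2
{
  INCLUDE %2.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}.
}
ORDER BY DESC(?count)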

Additional notes

You can change the content of the subquery named %1 so that the query acts on another filtered set of items (e.g. people that have the statement occupation (P106) programmer (Q5482740)). If the set you want to act on contains a lot of items, the query may time out. A workaround is to use LIMIT in that subquery (as shown in the query above). The counts then reflect only the limited sample, not the full set.

I think this is useful for finding out which platforms are used by a set of items. For example, if the set consists of people that have occupation (P106) programmer (Q5482740), GitHub username (P2037) will show up. If the query is run on people that were born in People's Republic of China (Q148), Weibo user ID (P3579) will show up (link to the query).
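
As a rough sketch of the swap described in these notes, the subquery named %1 could be replaced with something like the fragment below to select people born in People's Republic of China (Q148). This is not necessarily what the linked query does. It assumes that "born in" is modelled as place of birth (P19) whose country (P17) is Q148, and it keeps the LIMIT value from the query above purely as a placeholder.

```sparql
# Hypothetical replacement for the subquery named %1.
# Assumption: an item counts as "born in Q148" when its place of birth (P19)
# has country (P17) People's Republic of China (Q148).
WITH {
  SELECT DISTINCT ?item {
    ?item wdt:P19/wdt:P17 wd:Q148.
  }
  # Placeholder cap; adjust it if the query times out or the sample is too small
  LIMIT 1234
} AS %1
```

The rest of the query stays unchanged, because %2 only needs %1 to bind ?item.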