User:Rdrg109/1/25

Introduction

This page contains SPARQL queries for getting the number of uses of properties in direct claims, qualifiers and references.

Motivation

Someone in the Wikidata Telegram group asked whether it would be possible to get the number of uses of each property (message link). I thought I could help, so I wrote these queries.

Content

Frequency table of uses of properties in direct claims

The following query shows the number of uses of each property in direct claims.

SELECT
  ?property
  ?propertyLabel
  ?count
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?property ?count {
    INCLUDE %0.
    ?property wikibase:directClaim ?wdt.
  }
} AS %1
{
  INCLUDE %1.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}.
}
ORDER BY DESC(?count)

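To understand this query, note that wikibase:directClaim links a property to its direct-claim (wdt:) predicate, so it can be used to keep only the predicates that correspond to direct claims. A direct claim is a truthy statement, i.e. a statement with the best non-deprecated rank for its property: preferred rank if any such statement exists, normal rank otherwise. In the minimal working example below, only wdt:P31 is matched.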

SELECT ?predicate ?property {
  VALUES ?predicate {
    p:P31
    pr:P31
    pq:P31
    wdt:P31
  }
  ?property wikibase:directClaim ?predicate.
}
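
As a quick sanity check of a single row of the frequency table, one property can be counted on its own. The sketch below assumes wdt:P31 as the example predicate; that choice is only for illustration, and any other wdt: predicate should work the same way.

```sparql
# Counts the truthy (wdt:) triples of a single property.
# wdt:P31 is an example choice for illustration only.
SELECT (COUNT(*) AS ?count) {
  [] wdt:P31 [].
}
```

The result should correspond to the ?count value shown for that property in the frequency table above.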

Frequency table of uses of properties in qualifiers

The following query shows the number of uses of each property in qualifiers.

SELECT
  ?property
  ?propertyLabel
  ?count
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?property ?count {
    INCLUDE %0.
    ?property wikibase:qualifier ?wdt.
  }
} AS %1
{
  INCLUDE %1.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}.
}
ORDER BY DESC(?count)

As with the previous query, the key is that wikibase:qualifier links a property to its qualifier (pq:) predicate, so it can be used to keep only the predicates that correspond to qualifiers. In the minimal working example below, only pq:P31 is matched.

SELECT ?predicate ?property {
  VALUES ?predicate {
    p:P31
    pr:P31
    pq:P31
    wdt:P31
  }
  ?property wikibase:qualifier ?predicate.
}

Frequency table of uses of properties in references

The following query shows the number of uses of each property in references.

SELECT
  ?property
  ?propertyLabel
  ?count
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?property ?count {
    INCLUDE %0.
    ?property wikibase:reference ?wdt.
  }
} AS %1
{
  INCLUDE %1.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}.
}
ORDER BY DESC(?count)

As with the two previous queries, the key is that wikibase:reference links a property to its reference (pr:) predicate, so it can be used to keep only the predicates that correspond to references. In the minimal working example below, only pr:P31 is matched.

SELECT ?predicate ?property {
  VALUES ?predicate {
    p:P31
    pr:P31
    pq:P31
    wdt:P31
  }
  ?property wikibase:reference ?predicate.
}

Frequency table of uses of properties in direct claims, qualifiers and references

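The following query shows the number of uses of each property in direct claims, qualifiers and references. Because the three named subqueries are joined on ?property, only properties that are used at least once in all three contexts appear in the results.
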
SELECT
  ?property
  ?propertyLabel
  ?countInDirectClaims
  ?countInQualifiers
  ?countInReferences
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?property (?count AS ?countInDirectClaims) {
    INCLUDE %0.
    ?property wikibase:directClaim ?wdt.
  }
} AS %1
WITH {
  SELECT ?property (?count AS ?countInQualifiers) {
    INCLUDE %0.
    ?property wikibase:qualifier ?wdt.
  }
} AS %2
WITH {
  SELECT ?property (?count AS ?countInReferences) {
    INCLUDE %0.
    ?property wikibase:reference ?wdt.
  }
} AS %3
{
  INCLUDE %1.
  INCLUDE %2.
  INCLUDE %3.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE]"}.
}
ORDER BY DESC(?countInDirectClaims)

Frequency table of uses of properties in direct claims, qualifiers and references, and their values in a given language

Results of the query sorted by ?countInQualifiers. ml (the language code for Malayalam) was used as the language code and schema:description as the predicate. Both values were set in named subquery %4.
Results of the query sorted by ?countInReferences. ko (the language code for Korean) was used as the language code and schema:description as the predicate. Both values were set in named subquery %4.

Someone asked me if I could add a condition to the query so that it shows the properties that have a description in English but don't have a description in a given language.

I thought the query would be more useful if it included all properties in the results, with a column showing Yes or No depending on whether a property has a description in that language. This makes it possible to sort the table by any column and still see whether each property has a description in that language.

To set the language in which the query looks for missing values (e.g. missing labels or missing descriptions), edit named subquery %4. The line to edit is the one with the FILTER statement: replace the string there with the code of the language you are interested in (e.g. "es" for Spanish, "ko" for Korean, "qu" for Quechua).
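
As a minimal sketch of that edit, the standalone query below reproduces only the language check of named subquery %4, with "es" (Spanish) chosen as an example language. The usage counts are left out, so it is an illustration of the FILTER line rather than a replacement for the full query.

```sparql
# Standalone sketch of the language check done in named subquery %4.
# "es" is an example value; replace it with any language code.
SELECT ?property ?value ?valueExists {
  ?property wikibase:propertyType [].
  OPTIONAL {
    ?property schema:description ?value.
    FILTER(LANG(?value) = "es")
  }
  # ?value stays unbound when no description exists in that language.
  BIND(IF(BOUND(?value), "Yes", "No") AS ?valueExists)
}
```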

There are two main modifications that can make the results of the query more useful:

  • Check something other than descriptions. The query can be reused to list properties that are missing a label (rdfs:label), a description (schema:description), aliases (skos:altLabel) or even the value of a property (e.g. Wikidata usage instructions (P2559)) in a given language. To do this, change the predicate used in the language-filtered OPTIONAL block of named subquery %4.
  • Hide the column ?value. You might want to do this if you are only interested in whether a property has a description in a given language, in which case the descriptions themselves add little to the results.
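
The two modifications above can be combined. The sketch below is a standalone simplification of named subquery %4, not the full query: it checks rdfs:label instead of schema:description, uses "qu" (Quechua) as an example language, and leaves ?value out of the results.

```sparql
# Simplified variant: checks labels instead of descriptions and hides ?value.
# "qu" and rdfs:label are example choices; swap in another language code
# or predicate (e.g. schema:description) as needed.
SELECT ?property ?valueExists {
  ?property wikibase:propertyType [].
  OPTIONAL {
    ?property rdfs:label ?value.
    FILTER(LANG(?value) = "qu")
  }
  BIND(IF(BOUND(?value), "Yes", "No") AS ?valueExists)
}
ORDER BY ?valueExists
```

With skos:altLabel the same pattern would return one row per alias, so a property with several aliases in that language would appear more than once.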

NOTE: This query shows inaccurate information for some properties due to a bug. I already reported it in T323423.

SELECT
  ?property
  ?propertyLabel
  ?value
  ?valueExists
  ?countInDirectClaims
  ?countInQualifiers
  ?countInReferences
WITH {
  SELECT DISTINCT ?property {
    ?property wikibase:propertyType [].
  }
} AS %a
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?property (?count AS ?countInDirectClaims) {
    INCLUDE %0.
    ?property wikibase:directClaim ?wdt.
  }
} AS %1
WITH {
  SELECT ?property (?count AS ?countInQualifiers) {
    INCLUDE %0.
    ?property wikibase:qualifier ?wdt.
  }
} AS %2
WITH {
  SELECT ?property (?count AS ?countInReferences) {
    INCLUDE %0.
    ?property wikibase:reference ?wdt.
  }
} AS %3
WITH {
  SELECT * {
    INCLUDE %a.
    OPTIONAL{INCLUDE %1}
    OPTIONAL{INCLUDE %2}
    OPTIONAL{INCLUDE %3}
    OPTIONAL {
      ?property schema:description ?value
      FILTER(LANG(?value) = "ml").
    }
    BIND(IF(BOUND(?value), "Yes", "No") AS ?valueExists).
  }
} AS %4
WITH {
  SELECT * {
    INCLUDE %4.
    BIND(IF(BOUND(?countInDirectClaims), ?countInDirectClaims, 0) AS ?countInDirectClaims)
    BIND(IF(BOUND(?countInQualifiers), ?countInQualifiers, 0) AS ?countInQualifiers)
    BIND(IF(BOUND(?countInReferences), ?countInReferences, 0) AS ?countInReferences)
  }
} AS %5
{
  INCLUDE %5.
  SERVICE wikibase:label {bd:serviceParam wikibase:language "en"}.
}

Bonus

Predicates for filtering predicates that use PIDs

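The following query can be used to experiment with the predicates that link a property to the predicates containing its property identifier (PID). The comment next to each value names the linking predicate that matches it; change the predicate in the last triple pattern to see which of the listed values is kept.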

SELECT ?predicate ?property {
  VALUES ?predicate {
    p:P31 # wikibase:claim
    pq:P31 # wikibase:qualifier
    pqn:P2911 # wikibase:qualifierValueNormalized
    pqv:P31 # wikibase:qualifierValue
    pr:P31 # wikibase:reference
    prn:P214 # wikibase:referenceValueNormalized
    prv:P31 # wikibase:referenceValue
    ps:P31 # wikibase:statementProperty
    psn:P356 # wikibase:statementValueNormalized
    psv:P31 # wikibase:statementValue
    wdt:P31 # wikibase:directClaim
    wdtn:P356 # wikibase:directClaimNormalized
  }
  ?property wikibase:directClaimNormalized ?predicate.
}
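
The same relationships can also be explored from the other direction, starting from one property. The sketch below assumes wd:P31 as the example property and lists every link in the wikibase: namespace that goes out of it. The variable names ?link and ?target are chosen here for illustration.

```sparql
# Lists every wikibase: link going out of a single property entity.
# wd:P31 is an example choice; any property entity should work.
SELECT ?link ?target {
  wd:P31 ?link ?target.
  # Keep only links whose IRI starts with the wikibase: namespace.
  FILTER(STRSTARTS(STR(?link), STR(wikibase:)))
}
ORDER BY ?link
```

Not every row is a PID-bearing predicate: wikibase:propertyType, for instance, points to the datatype of the property.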

Predicates that don't have a PID

If you are curious, the following query shows the predicates that don't correspond to any PID, together with the number of times each one is used.

SELECT
  ?wdt
  ?count
WITH {
  SELECT ?wdt (COUNT(*) AS ?count) {
    [] ?wdt [].
  }
  GROUP BY ?wdt
} AS %0
WITH {
  SELECT ?wdt ?count {
    INCLUDE %0.
    MINUS {[] wikibase:claim ?wdt} # p:
    MINUS {[] wikibase:qualifier ?wdt} # pq:
    MINUS {[] wikibase:qualifierValueNormalized ?wdt} # pqn:
    MINUS {[] wikibase:qualifierValue ?wdt} # pqv:
    MINUS {[] wikibase:reference ?wdt} # pr:
    MINUS {[] wikibase:referenceValueNormalized ?wdt} # prn:
    MINUS {[] wikibase:referenceValue ?wdt} # prv:
    MINUS {[] wikibase:statementProperty ?wdt} # ps:
    MINUS {[] wikibase:statementValueNormalized ?wdt} # psn:
    MINUS {[] wikibase:statementValue ?wdt} # psv:
    MINUS {[] wikibase:directClaim ?wdt} # wdt:
    MINUS {[] wikibase:directClaimNormalized ?wdt} # wdtn:
  }
} AS %1
{
  INCLUDE %1.
}
ORDER BY DESC(?count)