Wikidata:Pywikibot - Python 3 Tutorial/Iterate over a SPARQL query

Pywikibot can also iterate over the results of a SPARQL query. The advantage is that the query preselects the items the bot should look at, and it can express more conditions than the autolist queries shown in the previous examples.

The easiest way to build and test the query is with the Wikidata Query Service at https://query.wikidata.org/

Once the query works, create a file (my_query.rq) in your project directory and save the query in it. Keeping your queries separate from the rest of your code is good practice and will not confuse your syntax highlighter.

Importing the query is straightforward in Python:

with open('my_query.rq', 'r') as query_file:
    QUERY = query_file.read()
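
If you want the script to fail early when the query file is unusable, the read can be wrapped in a small helper. The following is only a sketch: load_query is a hypothetical name, not part of Pywikibot, and the check is a plain substring test rather than a real SPARQL parser. It assumes the file is saved as UTF-8 and that the query is expected to bind the ?item variable.

```python
from pathlib import Path


def load_query(path):
    """Read a SPARQL query file and check that it mentions ?item.

    Hypothetical helper for illustration; not part of Pywikibot.
    """
    # Read the whole file as text, assuming UTF-8 encoding.
    query = Path(path).read_text(encoding="utf-8")
    # Plain substring check, not a SPARQL parser: it only catches the
    # obvious mistake of naming the result variable something else.
    if "?item" not in query:
        raise ValueError(f"{path} does not use the ?item variable")
    return query
```

With this helper, QUERY = load_query('my_query.rq') replaces the two lines above.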

Your query needs to use the ?item variable for the items you want to list. Then you can use the query string to call the page generator:

#!/usr/bin/python3

import pywikibot
from pywikibot import pagegenerators as pg

with open('my_query.rq', 'r') as query_file:
    QUERY = query_file.read()

wikidata_site = pywikibot.Site("wikidata", "wikidata")
generator = pg.WikidataSPARQLPageGenerator(QUERY, site=wikidata_site)

That is all you need to do. Because the site passed in is the Wikidata repository, the generator yields one ItemPage for each ?item result, and you can loop over it.
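
While testing a bot it is often enough to look at the first few results. The sketch below shows one way to do that. first_item_ids is a hypothetical helper name, and it assumes the generator yields objects with a getID() method, as ItemPage objects do.

```python
from itertools import islice


def first_item_ids(generator, limit=5):
    """Return the IDs of at most `limit` items from a page generator.

    Hypothetical helper for illustration; assumes each yielded object
    has a getID() method, as pywikibot.ItemPage does.
    """
    # islice stops pulling from the generator after `limit` items,
    # so the remaining results are never iterated.
    return [item.getID() for item in islice(generator, limit)]
```

Calling print(first_item_ids(generator)) with the generator created above would print a short list of item IDs. The limit only affects how many results are iterated, so a LIMIT clause in the query itself is still the better way to keep the query fast.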

Iterate using PreloadingEntityGenerator

"""
Iterate over a query given in a string using page generators.
"""
import pywikibot
from pywikibot import pagegenerators

QUERY = """
#Cats
SELECT ?item
WHERE 
{
  ?item wdt:P31 wd:Q146. # Must be an instance of (P31) a cat (Q146)
}
LIMIT 10
"""

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

# Wrap the SPARQL generator so that entity data is preloaded in batches.
generator = pagegenerators.PreloadingEntityGenerator(
    pagegenerators.WikidataSPARQLPageGenerator(QUERY, site=repo)
)

for item in generator:
    if 'en' in item.labels:
        print(item.labels['en'])
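
The loop above skips every item that has no English label. If a label in another language is acceptable, a fallback can be sketched as follows. pick_label is a hypothetical helper name, and the language order is only an example. It assumes that item.labels behaves like a dictionary mapping language codes to label strings, which is how the loop above already uses it.

```python
def pick_label(labels, languages=("en", "de", "fr")):
    """Return the first available label in order of preference, or None.

    Hypothetical helper for illustration; `labels` is assumed to map
    language codes to label strings, like item.labels.
    """
    for lang in languages:
        if lang in labels:
            return labels[lang]
    # No label exists in any of the preferred languages.
    return None
```

Inside the loop, label = pick_label(item.labels) then returns either a label or None, so the bot can decide what to do with items that have no usable label.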