Wikidata:WikidataCon 2017/Notes/Wikidata Toolkit: How to use Wikidata in Java

Title: How to use Wikidata in Java

Note-taker(s): Lucas

Speaker(s)

Name or username: Tpt

Contact (email, Twitter, etc.): @Tpt93 thomas@pellissier-tanon.fr

Slides: https://docs.google.com/presentation/d/e/2PACX-1vTusPACVFXTzEd64pDU1WqzM1BFMK3uJ-SO0ATA-OAkJiLZmjkGL7YVkqJQOxXVl7YIhBxN8Eut7Rid/pub

Documentation: https://www.mediawiki.org/wiki/Wikidata_Toolkit

Eclipse setup: https://www.mediawiki.org/wiki/Wikidata_Toolkit/Eclipse_setup

git clone https://github.com/Wikidata/Wikidata-Toolkit.git

Dump: https://people.wikimedia.org/~hoo/tmp/wikidata-20171028-all-first2500.json.gz

http://wikidata.org/entity/Q24075199

Abstract

Wikidata Toolkit is a Java library for accessing Wikidata and other Wikibase installations. It can be used to create bots and to download and parse dumps, for example to run complex analyses of Wikidata content. The aim of this workshop is to give a quick introduction to Wikidata Toolkit and to show an easy way to create bots and process Wikidata dumps.

Collaborative notes of the session

Wikidata Toolkit

Wikidata client library in Java, developed in 2014/15 by TU Dresden

fast processing based on dumps (e.g. for queries that cannot be done on WDQS)

easy editing, without all the complexity of the API

proof of concept of RDF mapping

nice Java objects for Wikidata items, can be used with Java Stream APIs

dump processing: automatically downloads most recent dump and processes it with your processor

RDF export is mostly outdated

some statistics utilities and lots of examples

hands-on session: with computer and an IDE (see slides for instructions)

TutorialExample fails at first because it runs in offline mode. To work around this, disable offline mode (comment out line 51), start the program, stop it quickly, and overwrite the partially downloaded file with the truncated dump linked above (“Dump”).
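
To make the "processes it with your processor" idea from the notes concrete, here is a minimal sketch that uses only the Java standard library and deliberately not the Wikidata Toolkit classes. It assumes the dump is a gzipped JSON array with one entity per line, each line followed by a comma. The class name DumpLineSketch and its methods are hypothetical names chosen for illustration. The toolkit itself goes further: it also downloads the dump and hands your processor Java objects instead of raw JSON text.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class; not part of Wikidata Toolkit.
final class DumpLineSketch {

    // Streams a gzipped dump line by line and passes the raw JSON text of
    // each entity to the given processor. Returns the number of entities seen.
    static long processDump(InputStream gzipped, Consumer<String> processor) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(gzipped), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entity = stripArraySyntax(line);
                if (!entity.isEmpty()) {
                    processor.accept(entity);
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    // Removes the JSON array syntax around an entity line: the "[" and "]"
    // lines become empty, and a trailing comma is dropped.
    static String stripArraySyntax(String line) {
        String s = line.trim();
        if (s.equals("[") || s.equals("]")) {
            return "";
        }
        if (s.endsWith(",")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    // Helper for the demo: gzips a string in memory so no file is needed.
    static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Tiny made-up stand-in for a dump, in the assumed line-per-entity layout.
        String sample = "[\n{\"id\":\"Q1\"},\n{\"id\":\"Q2\"}\n]\n";
        long n = processDump(new ByteArrayInputStream(gzip(sample)),
                json -> System.out.println("entity JSON: " + json));
        System.out.println("entities processed: " + n);
    }
}
```

To try this on the truncated dump linked above, an InputStream opened on the downloaded .json.gz file could be passed to processDump in place of the in-memory sample. Because only one line is held in memory at a time, the same loop should also work for much larger dumps.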