Wikidata:Database download


Wikidata makes copies of its content available for anyone to download.

Note that there are several other ways to access structured content from Wikidata that do not require a full copy of the database.

Database dumps

There are several kinds of database dumps. Note that the JSON and RDF dumps are considered stable interfaces, while the XML dumps are not. Changes to the file formats used by the stable interfaces are made according to the Stable Interface Policy.

<span id="JSON_dumps_(recommended)_"></span>

JSON dumps (recommended)

JSON dumps containing all Wikidata entities in a single JSON array can be found under https://dumps.wikimedia.org/wikidatawiki/entities/. The entities in the array are not necessarily in any particular order; for example, Q2 doesn't necessarily follow Q1. The dumps are created weekly.

This is the recommended dump format. Please refer to the JSON structure documentation for information about how Wikidata entities are represented.

Hint: Each entity object (data item or property) is placed on a separate line in the JSON file, so the file can be read line by line, and each line can be decoded separately as an individual JSON object.
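The line-by-line reading described in the hint can be sketched as follows. This is a minimal illustration, not an official reader: it assumes that the opening `[` and closing `]` of the array sit on their own lines and that every entity line except the last ends with a comma. The function name `iter_entities` and the sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded entity per line of a Wikidata JSON dump."""
    for line in lines:
        line = line.strip()
        # Assumption: the array brackets are on lines of their own.
        if line in ("[", "]", ""):
            continue
        # Assumption: entity lines are separated by a trailing comma.
        yield json.loads(line.rstrip(","))


# Hypothetical, heavily shortened sample standing in for a real dump file.
sample = [
    "[\n",
    '{"type": "item", "id": "Q1"},\n',
    '{"type": "property", "id": "P31"}\n',
    "]\n",
]
ids = [entity["id"] for entity in iter_entities(sample)]
print(ids)  # ['Q1', 'P31']
```

Because the function takes any iterable of lines, it can be fed an open file handle, so the full array never has to be held in memory.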

Note that the files use parallel compression, so some decompressors cannot reliably unpack them. On Windows you can use, for example, Bzip2. On *nix systems, use lbzip2, which can decompress Bzip2 files in parallel. pbzip2 is not a good choice because it can only decompress in parallel those files that were themselves compressed with pbzip2.
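If you prefer to read the compressed file directly from a program, the following sketch shows one option. It assumes that the dump is a multi-stream Bzip2 file (several compressed streams concatenated), which is one common result of parallel compression; Python's standard `bz2` module reads such files since version 3.3. The in-memory sample data is hypothetical and stands in for a real dump file.

```python
import bz2
import io

# Two independently compressed streams concatenated into one "file".
# This imitates a multi-stream archive (an assumption about the dump layout).
part_one = bz2.compress(b'{"id": "Q1"},\n')
part_two = bz2.compress(b'{"id": "Q2"}\n')
multistream = io.BytesIO(part_one + part_two)

# bz2.open accepts a path or a binary file object; in text mode it
# yields decoded lines and continues across stream boundaries.
with bz2.open(multistream, mode="rt", encoding="utf-8") as handle:
    lines = [line.rstrip("\n") for line in handle]

print(lines)  # ['{"id": "Q1"},', '{"id": "Q2"}']
```

This approach decompresses in a single thread, so for a full dump it will likely be much slower than a parallel tool such as lbzip2; it is mainly convenient for streaming without an intermediate uncompressed file.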

A fairly recent dump is currently also available via torrent: wikidata-20240101-all.json.gz (130.53 GiB) on academictorrents.com (magnet link).

JsonDumpReader is a PHP library for reading these dumps.

RDF dumps

First, canonical RDF dumps in the Turtle and N-Triples formats can be found under https://dumps.wikimedia.org/wikidatawiki/entities/. The mapping is described in the RDF Dump Format documentation. These full dumps are labeled all.

Secondly, so-called truthy dumps are provided. They use the N-Triples (nt) format. They follow the same format as the full dumps but are limited to direct, truthy statements. Therefore, they do not contain metadata such as qualifiers and references.
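Since truthy statements are plain triples, an nt file can be filtered line by line. The sketch below is a simplified illustration, not a full N-Triples parser: it assumes one triple per line with single spaces between subject, predicate, and object, and it assumes that direct properties use the IRI prefix `http://www.wikidata.org/prop/direct/`. The function name `iter_triples` and the two sample lines are illustrative only.

```python
from typing import Iterable, Iterator, Tuple

# Assumed IRI prefix of direct ("truthy") properties.
DIRECT = "http://www.wikidata.org/prop/direct/"


def iter_triples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Split simple N-Triples lines into (subject, predicate, object)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subjects and predicates contain no spaces; the object may.
        subject, predicate, rest = line.split(" ", 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the terminating " ."
        yield subject, predicate, obj


# Illustrative sample lines, not copied from a real dump file.
sample = [
    "<http://www.wikidata.org/entity/Q42> "
    "<http://www.wikidata.org/prop/direct/P31> "
    "<http://www.wikidata.org/entity/Q5> .",
    "<http://www.wikidata.org/entity/Q42> "
    '<http://www.w3.org/2000/01/rdf-schema#label> "Douglas Adams"@en .',
]

wanted = f"<{DIRECT}P31>"
matches = [t for t in iter_triples(sample) if t[1] == wanted]
print(len(matches))  # 1
```

For anything beyond quick filtering, a proper RDF library is the safer choice, because N-Triples allows whitespace variations that this sketch does not handle.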

The -all dump files contain all entity information in Wikidata except ordering (of aliases, of statements, etc.), which is not naturally represented in RDF. The -truthy dump files encode the *best* statements (i.e. those with the highest rank for each (subject, property) pair) as single RDF triples; qualifiers and references are omitted.
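The selection of *best* statements can be sketched on the JSON representation of an entity. This is an illustration under stated assumptions, not the code that produces the dumps: it assumes that each statement carries a `rank` of `preferred`, `normal`, or `deprecated`, that preferred statements win when present, and that deprecated statements are never considered best. The function name `best_statements` and the statement ids are hypothetical.

```python
from typing import Dict, List


def best_statements(statements: List[dict]) -> List[dict]:
    """Return the highest-ranked statements for one (subject, property) pair."""
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    if preferred:
        return preferred
    # Assumption: deprecated statements are never "best".
    return [s for s in statements if s.get("rank") == "normal"]


# Hypothetical claims of a single entity, keyed by property id.
claims: Dict[str, List[dict]] = {
    "P1082": [
        {"id": "s1", "rank": "normal"},
        {"id": "s2", "rank": "preferred"},
        {"id": "s3", "rank": "deprecated"},
    ],
    "P31": [
        {"id": "s4", "rank": "normal"},
        {"id": "s5", "rank": "deprecated"},
    ],
}

truthy = {
    pid: [s["id"] for s in best_statements(statements)]
    for pid, statements in claims.items()
}
print(truthy)  # {'P1082': ['s2'], 'P31': ['s4']}
```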

Dumps of the Wikidata Lexeme namespace in the Turtle and N-Triples formats can be found in the same place, with the suffix lexemes.

For details on the RDF dump format please see the page RDF Dump Format.

Partial RDF dumps

WDumper is a third-party tool to create custom Wikidata RDF dumps. Entities and statements may be filtered.

XML dumps

Full XML dumps of Wikidata are available at https://dumps.wikimedia.org/wikidatawiki/.

Warning: The format of the JSON data embedded in the XML dumps is subject to change without notice, and may be inconsistent between revisions. It should be treated as opaque binary data. It is strongly recommended to use the JSON or RDF dumps instead, which use canonical representations of the data!

Incremental dumps (or Add/Change dumps) for Wikidata are also available for download. These dumps contain the content that was added in the last 24 hours, reducing the need to download the full database dump. They are considerably smaller than the full database dumps.

They can be found at https://dumps.wikimedia.org/other/incr/wikidatawiki/.

Older JSON and RDF dumps

Data model

The data model is documented on a separate page. It describes the fundamental building blocks of Wikidata's data.

Database schema

An overview of the database schema is available on a separate page. (This is not the schema of the data in Wikidata.)

License

These dumps may be used for personal or commercial purposes, backups, or offline use. All structured data from the main, Property, Lexeme, and EntitySchema namespaces is available under the Creative Commons CC0 License. Text in the other namespaces is available under the Creative Commons Attribution/Share-Alike License; additional terms may apply. Media items and other content are available under other licenses, as detailed on their description pages.

See also