Wikidata:WikiProject Cultural events/Data sources
Data Sources - Cultural events
The table below gives an overview of the data sources from which data about cultural events can be imported, listed by country. When listing potential data sources for a given country, try to be as comprehensive as possible.
Country | Publisher | Database | Coverage | Wikidata item | Mapping information | Rights | Contact | Comments | Ingestion Status |
---|---|---|---|---|---|---|---|---|---|
CH | Swiss Theatre Collection | Theatre Metadata - Performing Arts Festivals | Inventory of performing arts festivals in Switzerland | | | CC-0 | User:Beat Estermann | | |
CH | Swiss Theatre Collection | Swiss Theatre Metadata - Anniversary Performing Arts Festivals | Inventory of anniversary performing arts festivals (Festspiele) in Switzerland | | | CC-0 | User:Beat Estermann | | |
CH | Swiss Theatre Collection | Swiss Theatre Metadata - Outdoor Theatre Events | Inventory of outdoor theatre events (Freilichttheater) in Switzerland | | | CC-0 | User:Beat Estermann | | |
CH | Swiss Theatre Collection | Swiss Theatre Metadata - Revues, Cultural Nights, Vorfasnacht | Inventory of revues, cultural nights, and carnival-related performance events (Revues, Kulturnächte, Vorfasnacht) in Switzerland | | | CC-0 | User:Beat Estermann | | |
Describe the Data Sources on Wikidata
Databases suitable for data ingestion into Wikidata should be described on Wikidata itself. For each database, an item needs to be created on Wikidata, so that the database can be cited as a source when ingesting data into Wikidata. Refer to Help:Sources for information about how to use sources in Wikidata. When it comes to importing data about heritage institutions, the types of data sources most commonly found are:
- Online databases: In the case of an online database, a Wikidata property should be created that corresponds to a unique identifier used to refer to items in the database. For the database itself, a Wikidata item should be created. Refer to Help:Sources#Databases for further guidance. Example: <insert an example of an online database here.>. Note: In fall 2016, it was impossible to properly reference online databases using the Quick Statements Tool, a tool commonly used to batch ingest data into Wikidata. As of December 2016, a new version of the tool is under development. <Check the next release of the tool and update this note if necessary.>
- Database dumps / database exports: In the case of a database dump or a database export, the source file is typically available in a spreadsheet format (e.g. CSV or Excel) or in a hierarchical format (e.g. XML). In this case, a Wikidata item should be created for the database itself (example: Swiss GLAM Inventory (Q26933296)) and for the specific export file (example: Swiss GLAM Inventory, 16 September 2016 (Q27477970)).
- Simple web pages: In some cases you may find relevant lists on simple web pages. If the source is a simple website, it does not need to be described as a separate Wikidata item. Refer to Help:Sources#Web page for further information about how to source statements to simple web pages.
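As an illustration of how a database export can be cited as a source during batch ingestion, the following Python sketch builds a single tab-separated command line. It assumes the QuickStatements V1 text syntax, in which reference properties are written with an "S" prefix (S248 for stated in (P248), S813 for retrieved (P813)); check the current documentation of the tool before relying on it. The function name, the subject item (the Wikidata Sandbox), the statement itself, and the retrieval date are purely illustrative; only the export file item Q27477970 comes from the example above.

```python
def quickstatements_line(item, prop, value, stated_in, retrieved):
    """Return one tab-separated command with a source reference.

    Hypothetical helper: the S-prefixed reference syntax is an assumption
    about QuickStatements V1 and should be verified against the tool's help.
    """
    # S248 = stated in (P248), S813 = retrieved (P813).
    # "/11" marks day precision for the date value.
    retrieved_value = f"+{retrieved}T00:00:00Z/11"
    return "\t".join(
        [item, prop, value, "S248", stated_in, "S813", retrieved_value]
    )


line = quickstatements_line(
    item="Q4115189",        # Wikidata Sandbox, used as a stand-in subject
    prop="P17",             # country
    value="Q39",            # Switzerland
    stated_in="Q27477970",  # Swiss GLAM Inventory, 16 September 2016
    retrieved="2016-09-16", # illustrative retrieval date
)
print(line)
```

Citing the item of the specific export file rather than the item of the database as a whole is a design choice made for this sketch: it records exactly which version of the data a statement was taken from.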
Unique Identifiers
Before ingesting data into Wikidata, we usually want to make sure that our source database contains a unique identifier which can later be used to match the data on Wikidata with the data in the source database. This is particularly useful in the case of future updates to the source database. There are two commonly used approaches to ingesting such unique identifiers into Wikidata:
- The use of a single-source identifier: In this case, a particular Wikidata property is created that corresponds to a unique identifier used in a given database. Example: Elvis Presley (Q303) has a Library of Congress authority ID (P244) : "n78079487". Typically, identifier properties have their corresponding Wikidata entity, in the case of our example: Library of Congress Authorities (Q13219454).
- The use of a multi-source identifier: In this case, a particular Wikidata property and a Wikidata item are used that are generic for a particular domain; multi-source identifiers should be unique in combination with a qualifier. Example: The item Judith Holding the Head of Holofernes (Q17319619) has an inventory number (P217) : "SK-A-1" that is further qualified by collection (P195) : Rijksmuseum (Q190804). In this case, the collection and the inventory number combined form a unique identifier (the inventory number on its own does not have to be unique across collections). The corresponding Wikidata entity in this example is: accession number (Q1417099).
With regard to the data sources concerning cultural events, it is recommended to use single-source identifiers for well-established databases. In this case, you will have to:
- Create a Wikidata entity for the identifier (if it does not already exist).
- Propose the creation of a corresponding property. Note: creating new properties requires community approval, which may take several weeks.
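The difference between the two identifier approaches described in this section can be sketched in Python as a question of which fields make up the matching key. This is a minimal illustration, not an existing tool: the `Identifier` type and the `match_key` function are hypothetical, and `Q_OTHER_COLLECTION` is a placeholder rather than a real item. The property and item IDs for the two real examples are taken from the text above.

```python
from typing import NamedTuple, Optional


class Identifier(NamedTuple):
    """An identifier statement, optionally scoped by a qualifier."""
    prop: str
    value: str
    qualifier_prop: Optional[str] = None
    qualifier_value: Optional[str] = None


def match_key(identifier):
    """Return the tuple used to match a source record to a Wikidata item."""
    if identifier.qualifier_prop is None:
        # Single-source identifier: property and value are unique on their own.
        return (identifier.prop, identifier.value)
    # Multi-source identifier: only unique together with its qualifier.
    return (
        identifier.prop,
        identifier.value,
        identifier.qualifier_prop,
        identifier.qualifier_value,
    )


# Library of Congress authority ID (P244) of Elvis Presley (Q303).
single = Identifier("P244", "n78079487")

# Inventory number (P217) qualified by collection (P195): Rijksmuseum (Q190804).
multi = Identifier("P217", "SK-A-1", "P195", "Q190804")

# Same inventory number in a different, hypothetical collection.
other = Identifier("P217", "SK-A-1", "P195", "Q_OTHER_COLLECTION")

print(match_key(single))
print(match_key(multi) == match_key(other))
```

The last comparison is false: two records sharing an inventory number are still distinct as long as their collections differ, which is why the qualifier has to be part of the key.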
Mapping Between the Data Structure in the Source Files and the Data Structure on Wikidata
Another important step in preparing the ingestion of data from the source databases into Wikidata is mapping the data structure of the source files to the data structure on Wikidata. For this purpose, you should create a sub-page of this page that is specific to your country (unless one already exists). Example: Mapping information for data sources covering Swiss heritage institutions.
The data needs to be mapped at two levels:
- Properties: For each property contained in the source data file, a corresponding property needs to be identified (or newly created) on Wikidata. If the source data is in table format, the column headers usually represent the properties, while each row typically represents one item. A list of properties commonly used in relation to cultural events can be found in the Data Structure section.
- Classes / controlled vocabularies: In some cases, the values of the properties may be simple strings, as for example in a physical address ("Museum Street 1"); in this case, no further mapping is needed. In other cases, the values are controlled vocabularies represented on Wikidata by specific classes or entities. In this case, the values in the source data file need to be mapped to those specific classes or entities. See the Mapping Information for the Swiss GLAM Inventory for a series of examples. An overview of controlled vocabularies commonly used in relation to cultural events can be found in the Data Structure section; an overview of the typology of cultural events currently in use on Wikidata can be found in the Typology section. Depending on the dataset, it may be necessary to complement these controlled vocabularies by creating new items on Wikidata before the ingestion of the data can begin.
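The two mapping levels can be sketched in Python as two lookup tables, one from column headers to properties and one from controlled-vocabulary values to items. This is a minimal sketch under stated assumptions: the column names, the sample rows, and the `map_rows` function are hypothetical, and the event-type IDs are placeholders rather than real Wikidata items. Only country (P17), instance of (P31), and Switzerland (Q39) refer to existing Wikidata entities.

```python
import csv
import io

# Level 1: column header in the source file -> Wikidata property.
PROPERTY_MAP = {
    "country": "P17",     # country
    "event_type": "P31",  # instance of
}

# Level 2: controlled-vocabulary value -> Wikidata item.
# The event-type IDs below are placeholders, not real QIDs.
VALUE_MAP = {
    "country": {"CH": "Q39"},  # Switzerland
    "event_type": {
        "Festspiel": "Q_FESTSPIEL_PLACEHOLDER",
        "Freilichttheater": "Q_OUTDOOR_THEATRE_PLACEHOLDER",
    },
}

# Hypothetical source data: one row per event, one column per property.
SAMPLE = (
    "name,country,event_type\n"
    "Example Festival,CH,Festspiel\n"
    "Other Event,CH,Zirkus\n"
)


def map_rows(text):
    """Return (statements, unmapped) for CSV text with a header row."""
    statements, unmapped = [], []
    for row in csv.DictReader(io.StringIO(text)):
        for column, prop in PROPERTY_MAP.items():
            raw = row[column]
            target = VALUE_MAP[column].get(raw)
            if target is None:
                # No item yet for this value: it has to be mapped or
                # created on Wikidata before ingestion can begin.
                unmapped.append((column, raw))
            else:
                statements.append((row["name"], prop, target))
    return statements, unmapped


statements, unmapped = map_rows(SAMPLE)
print(statements)
print(unmapped)
```

Collecting unmapped values separately, instead of skipping them silently, gives a list of the vocabulary terms for which new items may need to be created first.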