Wikidata:Property proposal/Carnegie Hall person ID

Carnegie Hall agent ID

Originally proposed at Wikidata:Property proposal/Authority control

Description: identifier for a person or ensemble in the Carnegie Hall Linked Open Data (LOD)
Represents: Carnegie Hall linked open data (Q30500746)
Data type: External identifier
Domain: music-related human (composer, performer, etc.) or group (ensemble)
Allowed values: [1-9]\d*
Example: Sergei Rachmaninoff (Q131861) → 52002 (turtle)
Source: http://data.carnegiehall.org/
Formatter URL: http://data.carnegiehall.org/names/$1 (http://data.carnegiehall.org/names/$1/turtle for RDF)
Robot and gadget jobs: Can get data from http://data.carnegiehall.org/sparql/, see below
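
To illustrate how the allowed-values pattern and the formatter URL fit together, here is a minimal Python sketch. The pattern and the two URL templates are copied from the proposal; the function names (`is_valid_id`, `expand`) are hypothetical, and the `re.ASCII` flag is an assumption that only ASCII digits are intended.

```python
import re

# Pattern and URL templates are taken from the proposal above.
# re.ASCII restricts \d to 0-9 (assumption: only ASCII digits are intended).
ALLOWED = re.compile(r"[1-9]\d*", re.ASCII)
FORMATTER_URL = "http://data.carnegiehall.org/names/$1"
TURTLE_URL = "http://data.carnegiehall.org/names/$1/turtle"


def is_valid_id(value: str) -> bool:
    """Return True if the whole string matches the allowed-values pattern."""
    return ALLOWED.fullmatch(value) is not None


def expand(template: str, value: str) -> str:
    """Substitute a validated ID for the $1 placeholder (hypothetical helper)."""
    if not is_valid_id(value):
        raise ValueError(f"not a valid Carnegie Hall ID: {value!r}")
    return template.replace("$1", value)


print(expand(FORMATTER_URL, "52002"))
print(expand(TURTLE_URL, "52002"))
```

The pattern rejects leading zeros and empty strings, so a value such as `052002` would not pass validation.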
Motivation

Carnegie just released their LOD, which was well-publicized in the LODLAM & GLAMwiki communities. This is important data for any music researcher.

Example record: http://data.carnegiehall.org/names/52002 (Rachmaninov) includes:

  • Matches to LCNAF, WD, DBpedia, MusicBrainz
  • gndo:professionOrOccupation (to lcMarRel: and to local http://data.carnegiehall.org/roles)
  • gndo:playedInstrument
  • schema:birthDate, schema:deathDate
  • dbo:birthPlace, dbo:deathPlace (to http://sws.geonames.org/: use rdfs:label and gn:parentCountry/rdfs:label)

Some stats (ask me for the queries):

  • Total persons: 95823
  • Persons with at least one match: 11863 (12.4%)
  • Breakdown of matches per target domain
11373	http://id.loc.gov/authorities/names (11.9% of persons)
 6582	http://dbpedia.org/resource          (6.9% of persons)
 3554	https://musicbrainz.org/artist       (3.7% of persons)
 3157	http://www.wikidata.org/entity       (3.3% of persons)
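
The percentages can be recomputed from the raw counts. The following is a small Python sketch using only the numbers listed above; the helper name `share` is hypothetical.

```python
# Counts are copied from the stats above; `share` is a hypothetical helper.
TOTAL_PERSONS = 95823

MATCHES = {
    "at least one match": 11863,
    "http://id.loc.gov/authorities/names": 11373,
    "http://dbpedia.org/resource": 6582,
    "https://musicbrainz.org/artist": 3554,
    "http://www.wikidata.org/entity": 3157,
}


def share(count: int, total: int = TOTAL_PERSONS) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


for target, count in MATCHES.items():
    print(f"{count:>6}  {target}  ({share(count)} of persons)")
```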

This query can be used to export data about Persons with no matches (omit the "filter not exists" line if you also want those with matches):

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX gn: <http://www.geonames.org/ontology#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX schema: <http://schema.org/>
select ?x ?name ?birth ?death ?birthCity ?birthCountry ?deathCity ?deathCountry {
    ?x a foaf:Person; foaf:name ?name.
    # keep only persons without any external match
    filter not exists {?x skos:exactMatch ?y}
    optional {?x schema:birthDate ?birth}
    optional {?x schema:deathDate ?death}
    # city and country labels come from the linked GeoNames resources
    optional {?x dbo:birthPlace ?bp. ?bp rdfs:label ?birthCity. optional {?bp gn:parentCountry/rdfs:label ?birthCountry}}
    optional {?x dbo:deathPlace ?dp. ?dp rdfs:label ?deathCity. optional {?dp gn:parentCountry/rdfs:label ?deathCountry}}
  } limit 100
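
As a rough sketch of how a bot could run such a query, the Python below builds a request for the endpoint and flattens the response into plain rows. The endpoint URL is the one named in the proposal. It is an assumption that the endpoint follows the standard SPARQL 1.1 Protocol (GET with a `query` parameter) and can return `application/sparql-results+json`. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Endpoint from the proposal. Assumption: it accepts GET requests with a
# `query` parameter and can return SPARQL JSON results.
ENDPOINT = "http://data.carnegiehall.org/sparql/"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a GET request carrying the query and asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows_from_json(text: str) -> list[dict[str, str]]:
    """Flatten SPARQL JSON results into one dict per row.

    Variables left unbound by an OPTIONAL clause are simply absent from a row.
    """
    data = json.loads(text)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


def fetch_rows(query: str, endpoint: str = ENDPOINT) -> list[dict[str, str]]:
    """Send the query over the network and return the flattened rows."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=30) as resp:
        return rows_from_json(resp.read().decode("utf-8"))
```

Calling `fetch_rows` with the query text above would return up to 100 rows; use `row.get("birthCity")` rather than indexing, since the optional fields may be missing.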

Vladimir Alexiev (talk) 08:56, 22 June 2017 (UTC)

WikiProject Music has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Vladimir Alexiev Jonathan Groß Andy Mabbett Jneubert Sic19 Wikidelo ArthurPSmith PKM Ettorerizza Fuzheado Daniel Mietchen Iwan.Aucamp Epìdosis Sotho Tal Ker Bargioni Carlobia Pablo Busatto Matlin Msuicat Uomovariabile Silva Selva 1-Byte Alessandra.Moi CamelCaseNick Songceci moz AhavaCohen Kolja21 RShigapov Jason.nlw MasterRus21thCentury Newt713 Pierre Tribhou Powerek38 Ahatd JordanTimothyJames Silviafanti Back ache AfricanLibrarian M.roszkowski Rhagfyr 沈澄心 MrBenjo S.v.Mering

Notified participants of WikiProject Authority control

WikiProject Cultural heritage has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

Discussion

Check how many of Carnegie's not-yet-matched Persons are found in LOD. I tried to match the first few:

Note that the data.carnegiehall.org/names/ namespace also includes groups/ensembles -- for this reason, I've suggested a different label for this property, "Carnegie Hall entity ID". Many of the ensembles are also not found in LOD:

Rhudson (talk) 17:22, 24 June 2017 (UTC)

It seems Carnegie has many people who are missing from the worldwide LOD. This means both that matching won't be very profitable and that there is an opportunity to create such people in the worldwide LOD. VIAF mostly has records for people who PUBLISHED music or about whom books were published, so even very important performers may be missing from LOD! --Vladimir Alexiev (talk) 09:24, 23 June 2017 (UTC)

  • Agree with @Rhudson, who is Carnegie's Manager of Archives, but I changed "entity" to "agent", since in WD "entity" means "anything".

@Rhudson: the large number of entities that are not found in any of WD, VIAF, or LOC means you need to make a decision: are most or all of them "notable enough" to be recorded in WD? And do you have enough data about them? WD is fairly lax about notability, but for each set of data there needs to be an interested community behind it. --Vladimir Alexiev (talk) 19:23, 25 June 2017 (UTC)

I think Carnegie Hall is a serious and public source and thus the entities are notable according to our notability policy. ChristianKl (talk) 18:52, 26 June 2017 (UTC)