Wikidata:Property proposal/Dimensions grant/project ID
Dimensions grant ID
Originally proposed at Wikidata:Property proposal/Authority control
Motivation
Grants fund projects that drive research, so they are crucially important pieces of the science data universe.
Just look at the first example, that's 48M EUR, don't we want to know what it produced? :-) Vladimir Alexiev (talk) 10:22, 24 April 2019 (UTC)
Discussion
- Support Definitely looks useful, I didn't realize there were public APIs for this info. ArthurPSmith (talk) 18:35, 24 April 2019 (UTC)
- For clarification: Dimensions does not provide a dump (and maybe not even machine-readable entities). But SciGraph, to which Dimensions has donated data, provides excellent RDF --Vladimir Alexiev (talk) 07:43, 25 April 2019 (UTC)
- Added examples cc @Egon_Willighagen: --Vladimir Alexiev (talk) 07:56, 25 April 2019 (UTC)
- https://scigraph.springernature.com/grant.3795343#tabjsonld shows "sdSource": "s3://com.uberresearch.data.processor/core_data/20181219_192338/projects/base/cordis_projects_3.xml.gz". CORDIS has provided open data about all projects at the EU Open Data Portal. But I think Dimensions/SciGraph has a lot more: 400k while all of FP7 had 8k projects in the main programme Cooperation, plus maybe 20k in "fuzzier" areas like eg People. Anyway, if there are volunteers to ingest CORDIS project & grant info, I'll help --Vladimir Alexiev (talk) 08:01, 25 April 2019 (UTC)
- Comment I also suggest adding the regular expression (something like $\d+$). Finally, I note that the cited data dump in Figshare is not CCZero. We should ask them for permission to use that data to populate Wikidata with this information, or better, ask them to deposit a subset of that data set with the identifiers as CCZero. --Egon Willighagen (talk) 08:04, 25 April 2019 (UTC)
- Looked at 10 slices and they all show \d{7}, added
- Will ask Michele Pasin about CC0 --Vladimir Alexiev (talk) 14:04, 26 April 2019 (UTC)
- Michele said they've considered a different license several times but will stick with the current one :-( --Vladimir Alexiev (talk) 10:27, 25 May 2019 (UTC)
- Comment @Vladimir Alexiev, Egon Willighagen: I allowed myself to change the RegEx, because no ID seems to start with a 0 (→ ^[1-9]\d{6}$). Is that ok for you? --Eihel (talk) 09:17, 21 May 2019 (UTC)
- Oh, interesting call! I guess this actually applies to other external identifiers too. I'll try to remember this :) --Egon Willighagen (talk) 11:19, 21 May 2019 (UTC)
- Support. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:49, 26 April 2019 (UTC)
- Support Simon Cobb (User:Sic19 ; talk page) 12:20, 7 May 2019 (UTC)
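For illustration, the format constraint agreed above (^[1-9]\d{6}$: seven digits, no leading zero) can be checked with a few lines of Python. This is only a minimal sketch: the function name is_valid_grant_id is made up for this example, and apart from 3795343 and 3770927 (both mentioned on this page) the sample values are invented.

```python
import re

# Format constraint from the discussion: seven digits, first digit not 0.
# re.ASCII restricts \d to the characters 0-9.
DIMENSIONS_GRANT_ID = re.compile(r"^[1-9]\d{6}$", re.ASCII)


def is_valid_grant_id(value: str) -> bool:
    """Return True if value matches the proposed format constraint.

    The function name is hypothetical, chosen for this sketch only.
    """
    return DIMENSIONS_GRANT_ID.fullmatch(value) is not None


# IDs taken from this page, plus invented counter-examples.
for candidate in ["3795343", "3770927", "0123456", "123456", "12345678"]:
    print(candidate, is_valid_grant_id(candidate))
```

Using fullmatch rather than match means a trailing newline after the ID is rejected as well.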
Conditional support Hello Vladimir Alexiev,
- Would you please add a description?
- P.S.: Modification already done by 99of9. Thanks. --Eihel (talk) 14:02, 25 May 2019 (UTC)
To include example 4 (NANOPUZZLES), it must belong to one of the domains (project or grant) you mentioned (e.g. via an instance of (P31) statement in the item, like the other examples). Also, could you add a fifth example belonging to the domain research grant (Q54875403)? Domains are the future constraints. Thank you in advance. Greetings. --Eihel (talk) 17:36, 22 May 2019 (UTC)
- @Eihel: Will try to add examples. Leading non-zero is fine --Vladimir Alexiev (talk) 10:27, 25 May 2019 (UTC)
- Neutral (previously Strong oppose) @Vladimir Alexiev: This request cannot go beyond the proposal stage. When a property is created, it must be released under CC0 1.0. But, as the License page states, all the data that could be entered into WD would be under CC BY-NC 4.0, as would any use of that data (NonCommercial). See also Egon Willighagen's comment above and the reply. Moreover, no one will ever be able to create this property, for the same reasons. Really sorry. --Eihel (talk) 15:40, 25 May 2019 (UTC)
- For clarity, I do not see the license as a problem by itself: it means we cannot automate pulling in the data. And it certainly does not require this property to be anything other than CC0. In fact, we have plenty of properties for non-CC0 databases: that is possible because you basically cannot copyright identifiers. @Eihel: I do not see where this proposal says the property should not be CC0. I have trouble understanding your opposition. It's certainly not supported by my comment. Can you please clarify? --Egon Willighagen (talk) 09:40, 26 May 2019 (UTC)
- Same, @Eihel: your argument makes absolutely no sense to me. Regards, VIGNERON (talk) 17:02, 27 May 2019 (UTC)
- @Vladimir Alexiev, Egon Willighagen, VIGNERON: Hello everyone. Yes, I'm not a license specialist. The proposal, and then the creation, of a property is what matters most in this debate. As you are not part of the user group, I give you this picture:
- Egon's intervention contradicts what I wrote. ("Contradicts" is too strong a word: I was thinking more of the French "est en porte-à-faux", but I do not know the English term.) What I (believe I) understand: what is inserted in WM can be reused for commercial purposes, and so NonCommercial content conflicts with that (cf. Terms of Use). As Vigneron's reply does not teach me much, I would like to call on others to help me see more clearly: @Mdennis (WMF), Sylvain Boissel WMFr:, the editor of ToU and a francophone, Ash Crow here. Can CC BY-NC 4.0 content be integrated into WD? Then I vote neutral. Cordially. --Eihel (talk) 11:29, 14 June 2019 (UTC)
- I may also call on @KFrancis (WMF): (Senior Paralegal). Hello Madam, could you tell me more about the problem above, please? You are identified as a person who addresses these issues. Cordially. --Eihel (talk) 11:42, 14 June 2019 (UTC)
- (Originally written in French, for simplicity): content under CC BY-NC 4.0 obviously cannot be integrated into Wikidata (with some exceptions), but that poses no problem at all for creating this property, since the property does not integrate any content whatsoever. Its purpose is just to link via the identifier. And an identifier cannot, by its very nature, be subject to copyright (fortunately, because if "3770927" were protected, nobody could compute 3770926+1 any more). PS: the Wikimedia Foundation's legal team is not there to answer users' questions (it deals with broader, global issues). Regards, VIGNERON (talk) 13:12, 14 June 2019 (UTC)
- Neutral (previously Strong oppose) @Considering.Different.Routes:
The description of this property is "ID of a scientific grant or project", and this conflates two very different types of object: a Grant and a Project. That the two are very different can be seen from the many cases where a project is supported by several grants (in sequence or even in parallel), or where a grant supports different projects (often related to each other, but still distinct). This holds even within European Commission funding, which is often cited as a case where grant and project coincide. For instance, the majority of projects funded by grants from the LIFE funding programme also benefit from grants at the national and local levels (from various European countries). 15:30, 03 June 2019 (UTC).
- @Considering.Different.Routes: Thanks for the pointer to DINGO. 1. I fixed the description as requested, please change your vote. 2. If you're one of the authors, could you change this horrible gray on yellow color? My eyes are not so good, I can barely read it. 3. Where do I post fixes, GitHub? 4. To show the strength of your conviction, could you make Grants at least for the example projects, connect them as appropriate, and maybe make the beneficiaries and roles? --Vladimir Alexiev (talk) 21:47, 3 June 2019 (UTC)
- @Vladimir Alexiev: 1) thanks for changing the description and the name of the property, 2) I am one of the authors, we'll improve the documentation, including coloring and layout ;-), 3) yes please, comments and suggestions are welcome on GitHub, and thanks for them, 4) yes, I can model that, it will appear as soon as possible --Considering.Different.Routes.
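The point made above, that a project can be supported by several grants and a grant can support several projects, can be sketched as a small data model. This is purely illustrative: the class names, fields and example labels below are invented for this sketch and are neither the DINGO ontology nor existing Wikidata items.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Grant:
    """A funding award; label and funder are hypothetical fields."""
    label: str
    funder: str


@dataclass
class Project:
    """An undertaking that may be supported by any number of grants."""
    label: str
    funded_by: List[Grant] = field(default_factory=list)


def projects_funded_by(grant: Grant, projects: List[Project]) -> List[Project]:
    """Return the projects that list the given grant among their funders."""
    return [p for p in projects if grant in p.funded_by]


# Invented example data: a programme-level grant and a national co-funding grant.
programme_grant = Grant("Programme grant (hypothetical)", "EU-level funder")
national_grant = Grant("National co-funding (hypothetical)", "National funder")

# One project with two grants, and a second project sharing one of them.
project_a = Project("Project A", [programme_grant, national_grant])
project_b = Project("Project B", [programme_grant])

shared = projects_funded_by(programme_grant, [project_a, project_b])
print([p.label for p in shared])
```

Keeping Grant and Project as separate types lets the many-to-many relation be expressed directly, which a single combined "grant or project" record could not do.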
Grant vs Project
- At the Wikibase meeting last year in Berlin, there was consensus that grants and projects are different entities. That could be clarified in the proposal. --Egon Willighagen (talk) 08:04, 25 April 2019 (UTC)
- Grants fund all kinds of undertakings: projects, programs, centers, chairs, excellent researchers, coordination actions, etc. See, e.g., NIH grants. We could have different items for the grant vs the project funded by that grant, but I think that's counter-productive in the context of WD. --Vladimir Alexiev (talk) 14:04, 26 April 2019 (UTC)
- I looked at instances of grants.
- Some are projects eg Interrogating the folding and function of membrane proteins by mass spectrometry (Q63345006), Structure and dynamics of oligomeric intermediates in amyloid assembly (Q63345007). Similarly, most of the FP7 and Horizon funding is for concrete project-oriented activities
- Others seem to be for ongoing activities, eg Industrial CASE Account - University of Leeds 2015 (Q63344156), DTP 2016-2017 University of Leeds (Q63342051)
- Others are series of grants on a common topic or by a common funding source, eg Killam Research Fellowship (Q56627685), White Rose Social Sciences Doctoral Training Partnership (Q63344487).
- It seems we need to bring some order to this heterogeneity, which needs extra modeling. Any takers? @Egon Willighagen: What do you think? And which "call" do you mean above? --Vladimir Alexiev (talk) 10:27, 25 May 2019 (UTC)
- At the Wikibase meeting in Berlin, an ontology for Grants, Projects and related concepts was presented and further discussed, which may be useful. It also has a "Wikidata variant" (which uses as many already existing Wikidata properties as possible). You can find its documentation here http://w3id.org/dingo . Considering.Different.Routes 15:50, 03 June 2019 (UTC)
@Vladimir Alexiev, ArthurPSmith, Egon Willighagen, VIGNERON, Considering.Different.Routes: and @Pigsonthewing, Sic19: Done Dimensions grant ID (P6854)
P.S. The majority of your examples are not part of the domain you mentioned (or a subclass of it). For example: "Violation: Entities using the Dimensions grant ID property should be instances of research grant (or of a subclass of it), but GMES Space Component Data Access currently isn't." --Eihel (talk) 02:46, 15 June 2019 (UTC)
- @Eihel: I challenged @Considering.Different.Routes: to prove the mettle of his ontological convictions by creating separate entries for the grants that have funded the example projects/programs but I guess he hasn't done it.
- Should we add a second type research grant (Q54875403) to those examples or add more types to the constraint? I would be in favor of the former because the examples are in a sense also grants.
- Here is a query
- @Egon Willighagen: what do you think? --Vladimir Alexiev (talk) 14:26, 15 July 2019 (UTC)
- @Vladimir Alexiev: Sorry for the late reply. Grants and projects are distinct things. The ontology by @Considering.Different.Routes: that came out of the Wikibase meeting in Berlin is solid. DINGO is not the product of Considering.Different.Routes alone, but the outcome of work by many Wikidata users. Some arguments (not a complete list): the set of project partners is not the same as the set of grant partners; a grant is more at the level of a contract; a project can have more than one grant; the timelines are often different; the project certainly produces output *after* the grant has ended (exceptions are rare). So, a grant identifier is for a grant. That said, using this property on a Project is reasonable: we have many (external) identifiers on items where the database entity (of the identifier) is of a different type than the Wikidata item on which it is used. But a Project != Grant. --Egon Willighagen (talk) 13:24, 20 July 2019 (UTC)
- Oh, and as for the question about allowed domains for this property: that depends on whether we want to change current Wikidata practices. Ontologically, the domain is only Grant, but currently in Wikidata the domain just means "can be used for items of that type" and does not carry the meaning that anything with the property belongs to that domain. (If it does, then Wikidata is pretty broken.) --Egon Willighagen (talk) 13:27, 20 July 2019 (UTC)
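The constraint violation quoted above, and the two ways of resolving it that were suggested (adding research grant (Q54875403) as a second type on the example items, or adding more types to the constraint), can be sketched as follows. This is a toy check on in-memory data, not how Wikidata's constraint system is implemented: the function names are invented, and Q_PROJECT and Q_FELLOWSHIP are placeholder IDs rather than real items. Only Q54875403 comes from the discussion.

```python
from typing import Dict, Set

RESEARCH_GRANT = "Q54875403"  # research grant, as cited in the discussion
PROJECT = "Q_PROJECT"         # placeholder ID, not a real item
FELLOWSHIP = "Q_FELLOWSHIP"   # placeholder ID, not a real item

# Hypothetical subclass-of edges: a fellowship is a kind of research grant.
SUBCLASS_OF: Dict[str, Set[str]] = {FELLOWSHIP: {RESEARCH_GRANT}}


def with_superclasses(cls: str, subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return cls together with every class reachable via subclass-of."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def satisfies_domain(instance_of: Set[str], allowed: Set[str],
                     subclass_of: Dict[str, Set[str]] = SUBCLASS_OF) -> bool:
    """True if any type of the item is an allowed class or a subclass of one."""
    return any(with_superclasses(c, subclass_of) & allowed for c in instance_of)


# Item typed only as a project: violates a grant-only constraint.
print(satisfies_domain({PROJECT}, {RESEARCH_GRANT}))
# Option 1: add research grant as a second type on the item.
print(satisfies_domain({PROJECT, RESEARCH_GRANT}, {RESEARCH_GRANT}))
# Option 2: add more types to the constraint instead.
print(satisfies_domain({PROJECT}, {RESEARCH_GRANT, PROJECT}))
```

Either option clears the violation in this toy model. Which one is preferable is the modeling question discussed above, not something the check can decide.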