Wikidata:Property proposal/propdef


definition in SPARQL of calculated property

Originally proposed at Wikidata:Property proposal/Generic

   Not done
Description: SPARQL definition for a possible calculated property. Format: include ?item and ?value
Data type: String
Allowed values: must include ?item and ?value
Example:
  • Q55237971 → ?item wdt:P212 ?p212 . BIND(REPLACE(?p212, "-", "") AS ?value)
  • <country of birth (new item)> → ?item wdt:P19/wdt:P17 ?value
  • <year of birth (new item)> → ?item wdt:P569 ?dob . BIND(YEAR(?dob) AS ?value)
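
As a minimal sketch of how a definition like the examples above might be turned into a runnable query, the Python below checks the "must include ?item and ?value" constraint and wraps the fragment in a SELECT. The function names (`has_variable`, `build_query`) and the default LIMIT are illustrative assumptions, not part of the proposal or of any Wikidata tooling.

```python
import re


def has_variable(fragment: str, name: str) -> bool:
    """Return True if the SPARQL variable ?name occurs as a whole token."""
    return re.search(r"\?" + re.escape(name) + r"\b", fragment) is not None


def build_query(fragment: str, limit: int = 10) -> str:
    """Wrap a stored definition fragment in a full SELECT query.

    Raises ValueError if the fragment lacks ?item or ?value, mirroring
    the "Allowed values" constraint in the proposal.
    """
    for var in ("item", "value"):
        if not has_variable(fragment, var):
            raise ValueError(f"definition must include ?{var}")
    return f"SELECT ?item ?value WHERE {{ {fragment} }} LIMIT {limit}"


# Fragment taken from the first example in the proposal.
isbn_fragment = '?item wdt:P212 ?p212 . BIND(REPLACE(?p212, "-", "") AS ?value)'
print(build_query(isbn_fragment))
```

The generated text would still have to be submitted to a SPARQL endpoint separately; nothing here assumes that the query service interprets stored definitions on its own.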

Motivation

If the feature ever becomes available, these can be used for its definition. In the meantime, these can be used to build queries.
--- Jura 06:06, 30 June 2018 (UTC)

Discussion

 Comment I think this idea is very interesting. I have two questions regarding this model:

  1. Is it possible, at this time, to get the Q55237971 value of DSM-5 (Q3064664), for example, in Wikidata Query Service by using the Sandbox-String (P370) in Q55237971? Or is an external tool required to build the queries?
  2. As this property stores only string values, it will be difficult to find the available calculated properties starting from the original properties, such as ISBN-13 (P212) in this case. Isn't there a need for a property or a qualifier to specify which properties are used in the calculation?

Thank you, --Okkn (talk) 04:56, 6 July 2018 (UTC)
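
One way to picture question 1 is as a two-step process: first read the stored definition string, then evaluate it for a single item. The sketch below only builds the two query strings; it assumes the definition is stored under Sandbox-String (P370) on Q55237971 as described in the question, and the helper names (`lookup_query`, `value_query`) are hypothetical. The string used for step 2 is the example fragment from the proposal, standing in for whatever step 1 would actually return.

```python
def lookup_query(definition_item: str, storage_property: str = "P370") -> str:
    """Step 1: query that reads the stored definition string from an item."""
    return (
        f"SELECT ?definition WHERE "
        f"{{ wd:{definition_item} wdt:{storage_property} ?definition }}"
    )


def value_query(definition: str, subject: str) -> str:
    """Step 2: query that evaluates the definition for one subject item."""
    return (
        f"SELECT ?value WHERE "
        f"{{ VALUES ?item {{ wd:{subject} }} {definition} }}"
    )


step1 = lookup_query("Q55237971")

# Assumed result of running step 1 (the fragment from the proposal's example).
stored = '?item wdt:P212 ?p212 . BIND(REPLACE(?p212, "-", "") AS ?value)'
step2 = value_query(stored, "Q3064664")

print(step1)
print(step2)
```

Both strings would have to be sent to the query service by some client, which is the "external tool" part of the question.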

  • Thanks for your interesting questions.
    For (1), I think it's possible to do this, either similar to the (indirect) queries on Property_talk:P4316#Sample_query, or (after editing) with the query built through the formatter URL on specific items. Maybe there is also some way to construct the queries directly (which I would be interested to learn about).
    For (2), I had thought about this too. Eventually, I think we should also link the properties being used in one way or another from Q55237971.
    --- Jura 05:11, 6 July 2018 (UTC)
    •  Support OK, I also want this property. But I'm not sure the name is appropriate. What you call a "calculated property" is not a Wikidata property (Q18616576). The name "property definition" gives the impression that the subject of this property is a Wikidata property (Q18616576). Isn't the "calculated property" something like a formatter? Of course I understand it is not restricted to the string data type... --Okkn (talk) 15:49, 6 July 2018 (UTC)
    • It's currently mentioned at Help:Data_type#Calculated_property (disclaimer: I added it). Supposedly, there are various ways of implementing this: maybe as some shortcut when writing queries, or as triples directly generated from existing properties. In any case, I think the definition would need to be stored someplace. Initially, I thought this could be added to property entities and labeled "calculated property", but there might be several properties that could apply. Hence the current proposal. If there are better labels, that would be good. We can also change it later if it's problematic. Maybe "SPARQL" should already be in the label.
      --- Jura 07:27, 7 July 2018 (UTC)
  •  Oppose I see no advantage for this property compared to just adding this code snippet to a SPARQL query help page. --Pasleim (talk) 05:53, 8 July 2018 (UTC)
  •  Comment I think this would benefit much more from upstream discussion with the community and the dev team. If there is no commitment from WMDE to integrate that somehow, this property might not be desirable: users could expect that it is reflected in the Wikidata query service, or it might encourage people to run wasteful SPARQL queries on their own to simulate that functionality. What is really the problem you are trying to solve? More examples of use cases should be gathered. For instance, the requirement that the SPARQL fragment uses both "?item" and "?value" seems to be made up for your own needs, but would not necessarily be generic enough for other use cases. There is a lot of previous work on rule-based inference in RDF stores, which is generally more principled. Marking as not ready. − Pintoch (talk) 13:27, 8 July 2018 (UTC)
    • If you prefer "?s" and "?o" as variables, that is fine for me. I think in Wikidata, we generally use "?item" (and possibly "?value" for "?o"). If there are other suggestions, I'd be most interested. This is the first time I hear that we should discourage people from using SPARQL.
      --- Jura 13:38, 8 July 2018 (UTC)
    •  Comment I have tried writing a "wasteful" SPARQL query, which generates a SPARQL query to resolve a calculated property. --Okkn (talk) 15:23, 8 July 2018 (UTC)
  •  Comment @Lymantria: There are several unresolved issues in the above discussion, so I don't see why you marked this as ready. I've removed that status. The concerns I have are: (1) the name of this proposed property is confusing; (2) I would like to see a handful more examples of how it would be used (are we setting a standard for the parameter names as suggested in the description? Also see Pintoch's comment above), including issues of handling more complex calculations that may involve qualifiers etc. (if that is part of this proposal); (3) Pasleim's comment above was responded to by Jura with an oblique reference (not directly related to this discussion at all, other than constructing a SPARQL query in SPARQL). I'd like to see an actual concrete example where that could be helpful for this purpose: is there some category of calculated properties like this that you might search across several of them in such a way that it would be helpful to have SPARQL do the construction, rather than just cutting and pasting in the code for one calculated property you are looking at? If it's just the latter that would be used, then I think it is better to paste such SPARQL code on the discussion page of the item in question, as an example of how you would do it in Wikidata. ArthurPSmith (talk) 18:10, 13 July 2018 (UTC)
  • @Okkn: I updated the property name above given the preceding comment. I hope this is still acceptable to you.
    Not sure what to say to the suggestion to store data in an unstructured way on talk pages. Obviously, some users do prefer talk pages and hardly ever venture into adding structured data.
    --- Jura 05:39, 15 July 2018 (UTC)
    • Okkn said "The name of "property definition" gives the impression that the subject of this property is Wikidata property (Q18616576).", which was my main concern. Adding the "in SPARQL" is helpful, but I think the word "property" needs to be removed since the domain is not properties but items. Just "definition in SPARQL" would probably be fine. I really would like to see a concrete example where you think you could make use of this within a SPARQL context. Structured databases are useful when you need a machine to do stuff with them, but I have a hard time seeing what a machine would do with the values of this property; that's why a real example would be important. ArthurPSmith (talk) 14:17, 16 July 2018 (UTC)
      • Maybe "of calculated property" at the end of the label works better. I'm hesitant to drop it entirely. I think the sample query Okkn provided is perfectly valid. Obviously, either a way needs to be found to interpret it directly, or the query needs to be executed in two steps.
        --- Jura 13:15, 28 July 2018 (UTC)
        • Okkn's and your examples are "valid" but do not provide any insight into the purpose here. Yes, you can look up a string via SPARQL, but you're pasting in a longer piece of boilerplate code to do something you could do more directly with just that string. How is this an improvement? If there was some case where you had to do this for several different "calculated properties" at once, or you didn't know which "calculated property" was needed when writing the query (i.e. using SPARQL to find the QID rather than BIND'ing it at the start), then I could see a purpose, but I'm having a hard time imagining how that would even work. Surely you have something more in mind here? Or not? ArthurPSmith (talk) 13:29, 30 July 2018 (UTC)
  • Marking as  Not done per Arthur's analysis, as the discussion has stalled with no consensus. − Pintoch (talk) 09:05, 7 September 2018 (UTC)
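
On question 2 in the discussion above (recording which properties a definition draws on), one possible approach is to derive the list mechanically from the stored string. The Python below is only a sketch under that assumption: it scans a fragment for property IDs written with the usual `wdt:`, `p:`, `ps:` and `pq:` prefixes. The function name `properties_used` is hypothetical, and the pattern would miss properties written as full IRIs.

```python
import re

# Matches property IDs that follow common Wikidata SPARQL prefixes.
PROPERTY_PATTERN = re.compile(r"\b(?:wdt|p|ps|pq):(P\d+)\b")


def properties_used(fragment: str) -> list[str]:
    """Return the property IDs referenced in a fragment, in first-seen order."""
    seen: list[str] = []
    for pid in PROPERTY_PATTERN.findall(fragment):
        if pid not in seen:
            seen.append(pid)
    return seen


# Fragments taken from the examples in the proposal.
print(properties_used('?item wdt:P19/wdt:P17 ?value'))
print(properties_used('?item wdt:P212 ?p212 . BIND(REPLACE(?p212, "-", "") AS ?value)'))
```

A list obtained this way could, for instance, be stored as separate statements or qualifiers alongside the definition, which is the kind of link question 2 asks about.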