User talk:TomT0m/Classification

Events, concepts, items, resources...

I think the first question this Help page should answer is what is being classified, and after that, why we need to classify it. The current title of this page implies that we classify Wikidata items, but how can an item be a subclass of another item? :) Danneks (talk) 15:32, 6 September 2014 (UTC)

@Danneks: Absolutely, that's how it was before someone renamed it for odd, black-magic reasons. I restored it to the correct state. TomT0m (talk) 13:00, 7 March 2015 (UTC)

@TomT0m, Danneks: Did you find the answer to the "first question": what is classified? FreightXPress (talk) 15:48, 29 April 2015 (UTC)

@FreightXPress: Of course: it's the objects and notions that items refer to, not the items themselves. We do not mean the class of all items about churches; we mean the class of all churches. TomT0m (talk) 15:53, 29 April 2015 (UTC)
So, the Help page is about classification of e.g. church building (Q16970)? I don't understand "objects and notions". Do you mean that items are a concept from Wikidata, but what is classified is not that concept, but the real-world things that an item represents? FreightXPress (talk) 16:00, 29 April 2015 (UTC)
Yes, except that purely abstract things like numbers do not fit well in that scheme. The notion of a church is a class of real-world objects. TomT0m (talk) 16:09, 29 April 2015 (UTC)
@TomT0m: Wikidata:Glossary#Item says "represents a real-life topic, concept, or subject". What is the shortest expression to answer User:Danneks? Maybe 'Help:Classification describes the classification of each "real-life topic, concept, or subject" that is represented by an item in Wikidata'? Or is Help:Classification also concerned with classification of Wikidata:Glossary#Property, or Wikidata:Glossary#Query? Or with one of the pages in the namespaces mentioned at Category:Contents by namespace? FreightXPress (talk) 17:53, 29 April 2015 (UTC)
Think of it as the main namespace. We don't add "Item:" before item links, but we do add "Property:", as in Property:P21. This is the same: of course we're classifying real-world things and concepts; this is the main content of Wikidata. If we want to classify properties, we will create a Property classification page. TomT0m (talk) 18:12, 29 April 2015 (UTC)

@TomT0m: You mean in full length it would be Item:Q1 like Property:P21, but this is shortened to Q1 for items and left at full length as Property:P21 for properties? And it would be Help:Item:Classification and Help:Property:Classification, or alternatively Help:Item classification and Help:Property classification, in full length, but is shortened to Help:Classification and Help:Property classification? That sounds systematic! I made a table below.

Key | Item | Property
Glossary | Wikidata:Glossary#Item | Wikidata:Glossary#Property
Explicit/Full length page name | Item:Q1 | Property:P21
Short page name | Q1 | Property:P21
Explicit/Full length name for help page (colon and initial uppercase) | Help:Item:Classification | Help:Property:Classification
Explicit/Full length name for help page (space and lowercase per TomT0m) | Help:Item classification | Help:Property classification
Short name for help page | Help:Classification | Help:Property classification

FreightXPress (talk) 19:33, 29 April 2015 (UTC)

Awards


I WOULD LIKE TO ADD THIS AWARD TO MY LIST OF AWARDS:

         HONORARY DOCTORATE IN THE ARTS from THE UNIVERSITY OF THE ARTS, PHILADELPHIA, PA.
          Rosalyn Drexler

The problem of "instance of"

The presented classification gives the following definition of "instance of": "According to this principle, tokens (or instances, or individuals), are concrete objects, or events involving concrete objects, who are localized in time and space". Even if this definition is correct, that does not make it useful: according to this classification, each water molecule in a glass is an instance of water molecule. But this view is not relevant from the point of view of knowledge: knowledge doesn't care about copies, because once you know the characteristics of one thing you don't need to know the same characteristics of another individual copy. From the point of view of knowledge, tokens (or instances, or individuals) are concepts, objects, events, and so on that can be described by at least one specific and intrinsic characteristic. From this point of view, localization in time and space is not an intrinsic characteristic: these are referential properties, meaning that you need a time reference and a spatial reference in addition to the individual. If I now change the definition of the time reference or of the spatial reference, then according to the first definition I will have a new individual. Localization in time and space is only one possible way of defining an individual, and not a sufficient one.

Moreover, how can I classify a feeling like sadness with that definition? Sadness is an instance of feeling, isn't it? What is its localization? We see the limitation of the proposed definition of an individual, which is too restrictive when dealing with knowledge. The localization definition works only for material and countable things. Snipre (talk) 19:53, 9 April 2015 (UTC)

@Snipre: Did I answer this correctly on the RfC page? I missed this message. TomT0m (talk) 15:54, 29 April 2015 (UTC)

@Snipre: "knowledge doesn't care about copies because once you know the characteristics of one thing you don't need to know the same characteristics of another individual copy" - how then could you count the molecules? How could one distinguish the 1-USD coin in your pocket from the 1-USD coin in my pocket? (The coin is just an imaginary example.) Each copy shares the characteristics of the class, but can be distinguished by other characteristics. - FreightXPress (talk) 16:05, 29 April 2015 (UTC)

Tokens are not necessarily concrete

"According to this principle, tokens (or instances, or individuals), are concrete objects, or events involving concrete objects, who are localized in time and space."

I disagree with this characterization of what makes a token. As far as the discussion here is concerned, a token can be just about anything. Consider, for example, biology. It is a perfectly good token even though it is not a concrete object or an event involving concrete objects. Starting out with the thesis that the real tokens somehow involve the notion of concreteness has, I think, damaged the organization of Wikidata. Peter F. Patel-Schneider (talk) 17:47, 24 November 2015 (UTC)

This was not really how Wikidata started. Wikidata started with no real idea of what would make a token, and has vast confusion between, say, a book as an artwork and a book as a concrete object. For biology, I'd say that the principle (the Quine token/type distinction) works: biology can be considered as a collection of experiments, papers, ... that define what we know about biology. Even for thinking we can adopt a material perspective: our knowledge is encoded somehow in our brains and we can transmit it. Said differently: we can lose knowledge about biology if we lose all the traces of some work and its authors die without transmitting it. author  TomT0m / talk page 17:34, 24 November 2015 (UTC)
One can argue that everything bottoms out on concrete things, like Douglas Adams. One can also argue that there are no concrete things, and that the only things that count are sensations. This is philosophy. What is being done in Wikidata is representation, not philosophy, so there are somewhat different considerations that come into play.
I certainly agree that in Wikidata one has to be careful about the distinction between a concrete physical object consisting of a number of printed pages and the abstract object that is closely related to the sequence of characters printed on those pages. However, both of these objects are Wikidata items, and it is fine to just start out with the abstract object. It is necessary, as you imply, to be clear when creating a class as to which of these belongs to the class. However, forcing the "token as concrete object" viewpoint is not helpful in my opinion. Peter F. Patel-Schneider (talk) 17:47, 24 November 2015 (UTC)
In short, you agree with me, but you can't provide a case where it's useful not to do this without resorting to somewhat extreme viewpoints. I claim the token/type relationship is a real help in thinking: with it in mind I could solve a lot of questions that have no other convincing answer otherwise, and it rarely fails. If we adopt this as a guideline we will provide answers to those questions; if not, we have no replacement. author  TomT0m / talk page 19:42, 24 November 2015 (UTC)
I agree that it is necessary to make a careful distinction between classes and their instances. (Actually the problem usually occurs when subclasses and instances are mixed up.) It is the requirement that tokens be concrete localized objects or be events involving such objects that I feel is not helpful in this distinction. There are many examples of things that could be non-classes that are not concrete, such as love (Q316) and theory of emotion (Q9357054). Peter F. Patel-Schneider (talk) 20:09, 24 November 2015 (UTC)
@Peter F. Patel-Schneider: Love has a lot of tokens. I guess you have lived some of them. Did you not read my page carefully? This point is addressed :) The theory is more interesting; it certainly has tokens imho, similarly to artworks: a theory can be taught, described in documents, used to treat someone or to design things. Just the same way a DVD or a score can be played and a book can be read. When you read a book, concrete things happen in your brain; you live something. And you can remember and share this experience with your friends.
If you start thinking like this, you'll notice that of course love can be classified. It can be classified as a type of emotion just the same way as a ship class can be classified as such. author  TomT0m / talk page 20:17, 24 November 2015 (UTC)
Does love have lots of tokens? Maybe, but that is not the point. Love is itself a perfectly good non-class, even though it is neither a concrete object nor an event involving such objects. Putting love in a class is also not the point. Both non-classes and classes can themselves belong to classes.
As far as I can tell from your page you want to require love to be a class because it is neither a concrete object nor an event involving such objects. You could model love as the class of all loving events if you wanted to, but that is not necessary as you could model the relationship between love and loving events in other ways. I do actually think that modelling love as a class is a good way to go, but I don't want to force people to think of love as a class from the very beginning.
The theory of emotions is more difficult to look at as a concrete object. You may argue that theories relate to concrete objects or explain concrete objects, but I don't think that you can say that a theory is a class even though it is not a concrete object or an event involving concrete objects. Peter F. Patel-Schneider (talk) 20:43, 24 November 2015 (UTC)
I don't want to force anybody to do anything. I want to find good and practical principles to build Wikidata and good uses of instance of (P31) and subclass of (P279), and it turns out that this all adds up pretty well.
For theories: "I don't think that you can say that a theory is a class even though it is not a concrete object or an event involving concrete objects" => Take a theory like general relativity (Q11452). It obviously relates to the real world, as it tries to describe rules that govern real-world objects, rules that we can use to plan a space trip, for example. Theories are meant, just like Wikidata, to say something about the world, so they relate to the real world. Theories are also knowledge. We have been transmitting them from human to human, or from human to paper to human, for a long time. So there are several ways we can relate theories to tokens; hence I think there are perfectly good ways to express theories in Wikidata by following the scheme I propose here.
For example, theories define classes. Quarks are described by the standard model, and without the standard model we could not really have the "quark" class in Wikidata. Similarly, a theory written once in a book that is never read again is useless. It's useful when we use the theory to calculate something, or when someone explains the principles of the theory. I think it makes total sense, just as we can assimilate "love" to the set of all the love tokens, or an artwork to all its manifestations, to consider computations involving the theory as concrete manifestations of the theory, and to assimilate the learning or teaching of that theory to the transmission of information, like copying a DVD ... author  TomT0m / talk page 21:19, 24 November 2015 (UTC)
@TomT0m: Maybe, but what does this have to do with tokens (non-classes) being concrete? My view is that tokens need not be concrete, for example theories. My view is that stating something like this is not helpful. Peter F. Patel-Schneider (talk) 22:08, 1 December 2015 (UTC)
@Peter F. Patel-Schneider: Then theories do not need to be tokens. Classes (and higher-order classes) are abstract objects. We can fully mirror the token/class relationship with a class/metaclass one, a metaclass being defined as a class of "classes of tokens" (a set of sets of tokens, if a class is a set of "only tokens"), by shifting one step up in abstraction. So we just don't need theories to be tokens to classify them. At least that holds for theories that try to model the world; it's a different question for pure math, which by definition does not try to be "instantiated" in reality. author  TomT0m / talk page 09:47, 2 December 2015 (UTC)
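A minimal sketch of the "set of sets of tokens" reading described above, assuming tokens are plain strings, a class is a frozenset of tokens, and a metaclass is a frozenset of classes. All names here (the example tokens, `order`) are hypothetical illustrations, not anything defined on the Help page or in Wikidata.

```python
# Hypothetical toy model: tokens are strings, a class is a frozenset of
# tokens, and a metaclass is a frozenset of classes (a set of sets of tokens).
experiment_1 = "experiment: pea plant crossing"   # token (illustrative)
paper_1 = "paper: on plant hybrids"               # token (illustrative)
orbit_obs = "observation: planet orbit"           # token (illustrative)

biology = frozenset({experiment_1, paper_1})      # first-order class
astronomy = frozenset({orbit_obs})                # first-order class
science = frozenset({biology, astronomy})         # metaclass: a class of classes


def order(x):
    """Return 0 for a token, 1 for a class of tokens, 2 for a metaclass, and so on."""
    if not isinstance(x, frozenset):
        return 0
    if not x:
        return 1  # assumption: an empty class is treated as first-order
    return 1 + max(order(member) for member in x)


print(order(paper_1), order(biology), order(science))  # 0 1 2
print(biology in science)   # True: biology is an instance of the metaclass
print(paper_1 in science)   # False: a token is not an instance of the metaclass
```

In this reading, membership never skips a level: a paper belongs to biology, and biology belongs to science, but the paper does not belong to science.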

[Exdent]

@TomT0m: But what then of the notion that class membership grounds out at concrete objects or events over such objects? Are you retreating from that? If not, then where do theories ground out? Peter F. Patel-Schneider (talk) 16:52, 2 December 2015 (UTC)
@Peter F. Patel-Schneider: I don't really get the question, and I'm under the impression that we're going in circles. But as for theories, they certainly refer to, define, and describe concrete objects or events. I'll quote myself: "Biology can be considered as a collection of experiments, papers, ... that define what we know about biology. Even for thinking we can adopt a material perspective: our knowledge is encoded somehow in our brains and we can transmit it. Said differently: we can lose knowledge about biology if we lose all the traces of some work and its authors die without transmitting it." A theory can be considered as a class of knowledge. Say we have a class "scientific experiment". We can very well subclass this as "biological scientific experiment". Similarly we can define a class "biological scientific publication / book / lectures" and define them as "scientific experiments / publications (resp.) that aim to understand/collect/describe natural organisms". Let's call "biology" the superclass of both "biological scientific experiment" and "biological scientific publication / book / lectures" and describe it as "class of human activities that aim to understand and describe nature and share this knowledge". I think this is pretty accurate. Let's define "science" as a metaclass. This metaclass is defined as "a science is a class of classes of human activities that aim to understand, describe and share knowledge about a specific part of nature". Then "Biology" is an instance of "science" whose focus is biological organisms. author  TomT0m / talk page 17:13, 2 December 2015 (UTC)
For a specific theory, say, the theory of evolution, I tend to think one solution would be to treat it as an intellectual production, that is, the same way as pieces of music, novels, stories, ... Indeed, a theory in the natural sciences aims to study and understand something, and proposes a description (ideally quantitative, in the sense that it can be used to make quantitative predictions about the evolution of instances of the studied object). I think we could assimilate a theory to the class of all its manifestations (descriptions/publications, usages as token events, ...) exactly the same way we can assimilate an artwork to its manifestations, and that this would make a lot of sense. In addition, a theory can be the support of a class of objects, as "quarks" can be described by the "standard model". author  TomT0m / talk page 17:49, 2 December 2015 (UTC)
The fact that theories may be about concrete objects or events is not relevant here. According to the modelling methodology that you are promoting, everything has to be either a concrete token or a class. I don't see how Newton's theory of gravitation or physics is either one of these. They certainly are not concrete objects, localized in time and space. They are not events over concrete objects either. Nor are they really classes. A particular theory is indeed an intellectual production, and is thus akin to musical compositions or novels, but this does not make a theory either a concrete object or an event or a class. Modelling physics as a class is even less viable in my opinion. Just because there are experiments and theories and other things that are part of physics does not make physics a class of all these things and neither does it make these things instances of physics. One should not represent Newton's theory of gravity as an instance of physics, for example, or a particular observation of gravitational interaction as an instance of Newton's theory.
In my view there are perfectly good tokens that need to be modelled in Wikipedia, including theories, sciences, novels, and pieces of music, that are neither concrete objects nor events involving such objects. Peter F. Patel-Schneider (talk) 22:00, 2 December 2015 (UTC)
@Peter F. Patel-Schneider: you keep saying that without really answering my arguments, so I'll stop here, sorry. author  TomT0m / talk page 10:21, 3 December 2015 (UTC)

Classes without concrete tokens

What should be the way of introducing classes and classification in Wikidata, particularly if concreteness is removed? I am partial to something like "A class is an item that collects together a group of entities that have some commonality. For example, human (Q5) is a class that collects together all entities that belong to the species Homo Sapiens, such as Douglas Adams (Q42), and ship type (Q2235308) is a class that collects together ship types, such as guided missile cruiser (Q1361980)."

The next thing to say should then be something about the importance of being clear about what belongs to the class and what does not. The ship type class could be used as an illustration, showing that a particular ship is not an instance of the class but is instead an instance of the ship class and can be an instance of one or more ship type classes. This naturally leads to the idea of metaclasses without going into anything about higher-order logics, which is at best a red herring here. Peter F. Patel-Schneider (talk) 17:53, 24 November 2015 (UTC)

Then there should be discussion of the instance-of relationship and the subclass relationship.
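The ship illustration above can be sketched as a small lookup, assuming two toy relations that play the role of instance of (P31) and subclass of (P279). This is only a sketch: the item "USS Example" and the helper names are hypothetical, labels are used instead of real Wikidata IDs, and no Wikidata API is involved.

```python
# Toy statements (labels only; "USS Example" is a made-up particular ship).
INSTANCE_OF = {  # item -> classes it is directly an instance of (P31-like)
    "Douglas Adams": {"human"},
    "USS Example": {"guided missile cruiser"},
    "guided missile cruiser": {"ship type"},
}
SUBCLASS_OF = {  # class -> direct superclasses (P279-like)
    "guided missile cruiser": {"ship"},
}


def superclasses(cls):
    """Return cls together with everything reachable through subclass-of."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def is_instance(item, cls):
    """True if item is an instance of cls, directly or via a subclass chain."""
    return any(cls in superclasses(c) for c in INSTANCE_OF.get(item, ()))


print(is_instance("USS Example", "ship"))                  # True
print(is_instance("USS Example", "ship type"))             # False
print(is_instance("guided missile cruiser", "ship type"))  # True
```

The particular ship is an instance of "ship" through the subclass chain, while only the class "guided missile cruiser" is an instance of "ship type", which is the metaclass idea without any higher-order machinery.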

You should give more examples of things, oops, classes, without instances :) author  TomT0m / talk page 17:35, 24 November 2015 (UTC)
I did give one - Douglas Adams. It probably would be better to start out with more examples of simple classes, maybe having the first two examples be human and ship and having the next paragraph discuss ship type and its relationship to ship and ships. Peter F. Patel-Schneider (talk) 17:51, 24 November 2015 (UTC)
Not very convincing as an example; an old Bible character would have been better. Plus, we have sources that acknowledge Douglas Adams existed. As Wikidata is about what can be claimed with sources, this means we have sources that can prove he qualifies as a token, which is philosophically enough. Feel free to add more examples. author  TomT0m / talk page 19:45, 24 November 2015 (UTC)
Not very convincing at what? That Douglas Adams has no instances? I'm not arguing about existence in the real world (whatever that is) at all. Existence in the real world is a completely separate matter.
Perhaps you meant classes that have no instances in Wikidata. I don't think that that relates to the argument so far at all. I do agree that such classes (such as quark (Q6718)) are important for Wikidata, but that's a separate aspect of classes. It probably does belong on a page discussing classes and classification, but my suggestion here is only about how to introduce classes without discussing concrete tokens. Peter F. Patel-Schneider (talk) 20:17, 24 November 2015 (UTC)
Sorry, it seems I made a mistake: I meant classes with no instances at all, and that are not metaclasses, since a metaclass's instances are not tokens. Of course, whether or not there are instances in Wikidata does not matter; the important thing is whether or not there are tokens in the real world. In short, I challenge the need to remove concreteness, as it's a way to provide a well-founded relation (Q338021) for our subclass/instance-of system while relating it to real-world entities. author  TomT0m / talk page 20:26, 24 November 2015 (UTC)
Of course, there is the bottom class, but it's a special case :) author  TomT0m / talk page 20:27, 24 November 2015 (UTC)
Well, there are "classes" with no instances at all, such as the class of triangular squares. There are in fact very many such "classes". These classes are probably not very useful in Wikidata, as Wikidata lacks the machinery to define them. But again this is not relevant to whether the only non-classes should be concrete objects and events involving such objects.
If all non-classes are supposed to be concrete objects or events over such objects, then where does the theory of emotions fit? It doesn't seem to be a class (what would its instances be?) but it is certainly neither a concrete object nor an event.
My claim doesn't rest on these kinds of non-classes alone. My claim is that requiring a modelling methodology where non-classes are concrete is not helpful because it requires setting up such a base for all classes. For example, there is no need to set up a concrete base for colours (even assuming that one could come up with one). All that is needed is a determination of what makes a colour. So one could be agnostic as to what the concrete objects underlying colours are while creating colour and some various colours. One does have to be careful here so as to not mix levels (and have, for example, red be both a subclass and an instance of colour), but there is no need to have a concrete base at the bottom.
I'm not saying that there are not many cases where it is useful to set up a concrete base, just that requiring a concrete base in all cases is not the right way to go because it rules out non-concrete non-classes as well as requiring extra work where the concrete base may be unclear. Peter F. Patel-Schneider (talk) 22:09, 24 November 2015 (UTC)
For colors, please discuss at Are_colors_instance-of_or_subclass-of_color and comment there. I proposed what I think is a working scheme based on the token/class principle. In short: color is the sensation that someone has when a certain class of light reaches their eyes. It happens that we call the "color" of an object the kind of sensation we feel when white light hits the object and is reflected into our eyes. This just works, and I won't let you say it's not useful, as otherwise the question of how to model this remains open and non-consensual :) author  TomT0m / talk page 10:21, 25 November 2015 (UTC)
@TomT0m: My use of color here was to illustrate a point. Color classes and individuals don't need to be defined in terms of concrete objects or events on such objects. It might be possible to define colors as classes over events of individuals perceiving colored objects or classes of colored objects, but it is not necessary to do so. Peter F. Patel-Schneider (talk) 22:13, 1 December 2015 (UTC)
@Peter F. Patel-Schneider: I understand that, and I probably would have supported this a few years ago. However, I came to understand that without such a principle, it's pretty hard to come up with solutions and to solve modelling problems in Wikidata, because of the lack of a credible pre-existing guideline. I know of no real previous modelling effort with such a vast variety of things to model. This principle (the token/class one) is a pre-existing, credible and, as I try to demonstrate, practical one that provides answers. Combined with higher-order classes, we CAN also classify non-tokens. So colors can very well be seen both as classes of concrete tokens and as something more abstract with metaclassing (type of primary color, for example). So the question is not "but it is not necessary to do so" but "how to do this", which I ask you. And how do we choose between competing models, and how do we represent different models? Imho a class system based on the token/class principle (with higher-order classes) has the power to represent different models WHILE, as much as possible, giving a way to translate from one model to another by providing a "pivot" principle to translate between them. This allows us to model all those models in Wikidata and to express how they relate to reality, BECAUSE with a token/type principle we always try to relate them to reality. author  TomT0m / talk page 09:39, 2 December 2015 (UTC)

[Exdent]

@TomT0m: I am not arguing against a distinction between a class and the instances of the class. I am, however, arguing against a requirement that class modelling be based on concrete objects or events over such objects. I feel strongly that this requirement is counter-productive, as there are important cases where a concrete basis is either missing or hard to find. For example, where is the concrete basis for colours? Where is the concrete basis for theories? Where is the concrete basis for songs? Where is the concrete basis for movie genres? Where is the concrete basis for band tours? All these can be effectively modelled without making a determination on a concrete basis. However, all these cannot be modelled without making a determination of what items are instances of the classes. Requiring the specification of a concrete basis only makes a hard task harder.
Requiring a concrete basis also isn't all that helpful in forcing modelling choices. What counts as a concrete object or event over such objects? You could say that a band tour is an event over concrete objects. You could say that a band concert is a class of concrete listening events. You could say that a song is a concrete specification of some concrete musical events or you could say that a song is a class of concrete performance events.
There have been several large efforts to build general-purpose ontologies. Some of these start with a pre-existing philosophy of ontology construction and are probably not so interesting for Wikidata. However, both SUMO [1] and the Cyc ontology [2] have lessons for Wikidata, in my opinion. Peter F. Patel-Schneider (talk) 16:26, 2 December 2015 (UTC)
@Peter F. Patel-Schneider: I'm not really impressed by your examples, and your alleged problems are not real problems to me, at least most of them. A tour, for example, is an event, a composite one: it's a big event composed of a sequence of smaller events. These smaller events have a lot in common: each concert shares, at least to some extent, some characteristics with the preceding one in the tour, so if needed we can just define a class of all such concerts, and this class can be used as a "unique" item to refer to this show as an artwork as such. This should not be a problem. The basis for songs and other artistic works has been extensively discussed and already has a compelling model: https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records Here are the answers I would give, and now I'd expect some answers, and not more questions, from you :) author  TomT0m / talk page 16:38, 2 December 2015 (UTC)
@TomT0m: The point that I am trying to make is that the requirement that items either be tokens (concrete objects or events involving concrete objects) or classes is not helpful, even if metaclasses are allowed. I provided what I think are several telling examples where such tokens are missing or difficult to determine, including theories, colours, and media.
In many areas, of course, there are fairly obvious concrete objects, e.g. for animals. In these areas it can be useful to start out with these concrete objects and build up from there. This can prevent problems like having both Douglas Adams (Q42) and dog (Q144) be instances of animal (Q729).
However, a problem like mixing together species and individual animals is not exactly a problem about tokens. It is instead a problem of mixing together different kinds of things, i.e., a problem with defining what belongs to a class. For example, it is not necessary for color (Q1075) to be defined up from tokens to see that having gray (Q42519) be both an instance and a subclass of it is a problem. This problem actually comes from the lack of a good description of what should and should not be instances of color (Q1075). One can see a well-thought-out description of some of the problematic issues related to conceptual works at the site you refer to, https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records, where there is no notion of concrete tokens to be seen.
My view is thus that starting out from tokens only obscures the real issue of poor descriptions or definitions of classes and is not a good foundation for class modelling or classification in Wikidata. Peter F. Patel-Schneider (talk) 21:18, 2 December 2015 (UTC)
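The level-mixing problem mentioned above (an item that is both an instance and a subclass of the same class) can be detected without any reference to tokens. A minimal sketch, assuming toy dictionaries of direct statements that mirror the gray/color situation described in the comment; the function name `mixed_level_items` and the data are hypothetical, and only direct statements are compared, not inherited ones.

```python
def mixed_level_items(instance_of, subclass_of):
    """Return sorted (item, class) pairs where item is both a direct
    instance and a direct subclass of the same class."""
    conflicts = []
    for item, classes in instance_of.items():
        # Intersect the item's P31-like targets with its P279-like targets.
        for cls in classes & subclass_of.get(item, set()):
            conflicts.append((item, cls))
    return sorted(conflicts)


# Toy statements mirroring the situation described in the discussion.
instance_of = {
    "gray (Q42519)": {"color (Q1075)"},
    "Douglas Adams (Q42)": {"human (Q5)"},
}
subclass_of = {
    "gray (Q42519)": {"color (Q1075)"},
}

print(mixed_level_items(instance_of, subclass_of))
# [('gray (Q42519)', 'color (Q1075)')]
```

The check only needs a decision about which statements exist, which matches the point that the difficulty lies in describing what belongs to a class.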

User namespace

  • I think that "Classes definition" text is too verbose and lacks examples for the scope of this page (should be moved to "WikiProject Reasoning").
Most users want Wikidata property for the relationship between classes (Q28326461) and/or Wikidata property for the relationship of the element to its class (Q28326730) at respective properties.
  • I'm not able to link this page from Q28326484. I propose moving the text from TomT0m/Classification to Help:Basic membership properties or Help:Classes and instances.
d1g (talk) 04:09, 15 January 2017 (UTC)