Property talk:P8248


Documentation

Colon Classification
type of library classification, to be used for topics, not for written works; use with qualifier "edition (P747)" with item value "CC 6" or "CC 7"
Format “[a-zA-ZΔΣ0-9-]+[a-zA-Z0-9.:;,()'\s]*”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#Format, SPARQL
Single value: this property generally contains a single value. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#Single value, SPARQL
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#Unique value, SPARQL (every item), SPARQL (by value)
Qualifiers “has edition or translation (P747), reason for deprecated rank (P2241), reason for preferred rank (P7452)”: this property should be used only with the listed qualifiers. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#allowed qualifiers, SPARQL
Required qualifier “has edition or translation (P747)”: this property should be used with the listed qualifier. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#mandatory qualifier, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#Entity types
Scope is as main value (Q54828448): the property must be used in the specified way only (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P8248#Scope, SPARQL
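As an illustration of the Format constraint listed above, here is a minimal Python sketch that checks candidate values against the pattern. The helper name `matches_cc_format` is hypothetical, the sample strings are only examples, and treating the pattern as a whole-string match is an assumption about how the constraint is applied.

```python
import re

# Pattern copied from the Format constraint above.
# Assumption: the constraint is meant to match the entire value.
CC_FORMAT = re.compile(r"[a-zA-ZΔΣ0-9-]+[a-zA-Z0-9.:;,()'\s]*")


def matches_cc_format(value: str) -> bool:
    """Return True if the whole value matches the constraint pattern."""
    return CC_FORMAT.fullmatch(value) is not None


if __name__ == "__main__":
    # The first sample is the call number quoted later on this page;
    # the second is a made-up string containing a character outside the pattern.
    for candidate in ["L,45;421:6;253:f.44'N5", "L#45"]:
        print(candidate, matches_cc_format(candidate))
```

A check like this only tests the shape of a value; it says nothing about whether the class number is a correct CC number.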

Please notify projects that use this property before big changes (renaming, deletion, merge with another property, etc.)

Literature Box gadget


@Epìdosis, Carlobia, Bargioni, VIGNERON, Nomen ad hoc: @Giaccai, Ijon, Mahir256, Bodhisattwa, Trade: @NahidHossain, ArthurPSmith: (participants of the property creation)

I see that there is now a gadget to fill in the values. If I understand correctly, the gadget listed at Wikidata:Tools/Edit_items (which I am translating) puts the values on works and authors, not on genre items (see the gadget page). This surprised me, so I checked the property proposal page; according to the proposal, the plan was not to use the property on works. Did something change? Is the gadget doing the right thing? (If I’m totally wrong, please forget this, and sorry for the disturbance.)

author  TomT0m / talk page 14:48, 9 February 2022 (UTC)

The property represents the value of a bibliographic classification (Colon Classification) that can apply to a variety of subjects/concepts; authors, literary works (Romeo and Juliet), and classic works (Aristotle's Logic) are considered subjects/concepts too. So the gadget is consistent in creating values admitted by CC (such as those for authors and their works). Furthermore, CC could also be extended to editions and copies of written and literary works, but that would require a different format. We would like to develop it, but we are still working to refine and improve the gadget for authors and works. Possible applications of CC to literary authors and works are described in the first articles cited in the bibliography on this page: https://www.wikidata.org/wiki/User:Bargioni/CC_literature_box. I hope this resolves your doubt. Carlobia (talk) 15:31, 9 February 2022 (UTC)
@Carlobia The current practice in Wikidata for this kind of value is to try to minimise redundancy. To achieve that, we try to put the classification code not on the works but on the topics they classify … See for example this query, which lists everything with a Dewey classification code on Wikidata: there are 6000 results, which is very few.
The approach is to have an item on Wikidata for the topic, and to put a classification code on the topic item if needed. The works and the topic are linked through main subject (P921). So, for example, we don’t really need to put "004" on all works about computing; we can retrieve the code indirectly with a query such as this one:
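select ?work ?workLabel ?class ?classLabel ?dewey {
  bind( "004" as ?dewey)
  ?class wdt:P1036 ?dewey .   # topic item carrying the Dewey code
  ?work wdt:P921 ?class .     # works whose main subject is that topic
  # the label service fills in ?workLabel and ?classLabel
  service wikibase:label { bd:serviceParam wikibase:language "en". }
}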
(This one is a bit too lax; to be more complete, I guess we should also search for subtopics of computing using Wikidata properties such as subclass of (P279) or part of (P361).)
This avoids duplicating every classification code on every work, since there can be many works on any given topic, while still allowing the information to be retrieved. This was discussed in the property proposal.
So we would like to avoid making users add essentially useless statements and do useless work. To justify doing things differently for this property and adding information to a gazillion items, I guess we would need a reason? author  TomT0m / talk page 17:08, 9 February 2022 (UTC)
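To make the indirect-lookup idea in the comment above concrete, here is a small Python sketch. It is not a Wikidata client: the three dictionaries are hypothetical in-memory stand-ins for Dewey Decimal Classification (P1036), subclass of (P279), and main subject (P921) statements, and the work and topic names are invented for illustration.

```python
from typing import Optional

# Hypothetical stand-ins for Wikidata statements (invented data).
DEWEY_CODE = {"computing": "004"}                 # topic -> P1036 value
SUBCLASS_OF = {"operating systems": "computing"}  # topic -> broader topic (P279)
MAIN_SUBJECT = {                                  # work -> topic (P921)
    "Work A": "computing",
    "Work B": "operating systems",
    "Work C": "gardening",
}


def code_for_topic(topic: Optional[str]) -> Optional[str]:
    """Walk up the subclass links until a topic carrying a code is found."""
    seen = set()  # guards against cycles in the subclass links
    while topic is not None and topic not in seen:
        if topic in DEWEY_CODE:
            return DEWEY_CODE[topic]
        seen.add(topic)
        topic = SUBCLASS_OF.get(topic)
    return None


def code_for_work(work: str) -> Optional[str]:
    """Retrieve a work's code through its main subject, not from the work itself."""
    return code_for_topic(MAIN_SUBJECT.get(work))


if __name__ == "__main__":
    for work in MAIN_SUBJECT:
        print(work, code_for_work(work))
```

The code is stored once, on the topic; each work only needs its main subject statement.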
@TomT0m, Carlobia: If a particular classification uniquely identifies a work, and that's all it ever could identify, then I suppose it's fine to apply it to the work. But please ensure this is used as an external identifier, that is, a unique identifier that is not shared between different works. UDC, LOC and similar library classifications usually place multiple works under the same classification code, so they are identifiers only for the topics, not for the specific works classified. ArthurPSmith (talk) 18:07, 9 February 2022 (UTC)
CC makes it possible to arrive at unique class numbers, i.e. identifiers, both for authors and for works. That's why I think it is perfect for the semantic web environment. Carlobia (talk) 18:34, 9 February 2022 (UTC)
@TomT0m DDC and CC are different classifications, structured in different ways; one consequence is how well each can define a class (the main object of a classification). If classes are large, the things inside them tend to be messy; if classes are smaller, things can be identified more easily. The smaller the better, especially when the classes in a classification try to identify or describe the same universe (and this is the case). Let's take three different topics: a book on the American novel of the 20th century, a book on F. Scott Fitzgerald, and a book on The Sound and the Fury by Faulkner. DDC puts all three topics under one number (813.5), while Colon Classification gives the three topics three different numbers. (This is not unusual for a library classification: LCC classifies works too; see your own query, modified.) The example shows how an author and a work can be the topic of another work, and shows that CC is able to express that difference and thereby organize books better by means of smaller classes. The creator of CC was well aware of DDC and its issues in this respect, and created CC precisely to solve them. If we accept a property for an expressive classification scheme, we must allow the scheme to express itself. There is no redundancy, just more expressivity. Carlobia (talk) 18:30, 9 February 2022 (UTC)
@Carlobia Your link seems to point to this page, so you might want to correct it.
That your classification is finer-grained than Dewey’s is fine, of course; Dewey is really not fine-grained, but that does not change my point at all.
See for example the enwiki example at w:en:Colon classification:
  • Medicine,Lungs;Tuberculosis:Treatment;X-ray:Research.India'1950
This is summarized in a specific call number:
  • L,45;421:6;253:f.44'N5
On Wikidata we could have something like
It seems to me this encodes almost the same information as « L,45;421:6;253:f.44'N5 » in a clearer way that is easier to query, and that we could compute the code from the Wikidata statements if we have in Wikidata the information that "Tuberculosis" is "421" or "L,45;421". That we can compute the code by an algorithm from the statements is the exact definition of redundancy.
Do we agree that you don’t intend to specify the code on the item « work W »? author  TomT0m / talk page 18:55, 9 February 2022 (UTC)
I am very sorry, I do not understand your point. I will try to answer, but I apologize if this tentative answer is not correct. In your example on work W things seem clear, but I have a counterexample showing that there is more to a complex subject than two or three 'component parts'. Let's take just two keywords as P921 of a work: Philosophy and History. They can mean "History of Philosophy" or "Philosophy of History", and the two topics are quite different; so the 'computed' (or redundant) code would have to record only the right one. There are two pieces plus their correct relationship. The right value cannot be added by an algorithm because, as far as I know, no machine can solve this problem at the moment. Second attempt. The Colon Classification class number works as an identifier for authors and works. It could work as an identifier for a specific edition too. Its call number (a class number plus a book number) would be the equivalent of, let's say, an ISBN. In what way is an ISBN redundant or not (it is the result of an algorithm, starting from the 978 prefix, then the number of the country, the number of the publisher, the identifier of that edition in the publisher catalog, and a final check digit)? Colon Classification is not my classification. It is a public asset, and I think it has the potential to be applied in the semantic web, and in Wikidata too. I can assure you that Property P8248 expresses class numbers created correctly by CC in all its current instances in Wikidata. What is wrong with this? What should be changed? Is there any way to solve the issue you are raising? Carlobia (talk) 19:42, 9 February 2022 (UTC)
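For reference on the ISBN comparison above: of the ISBN elements listed, only the final check digit is derived by calculation from the others. The sketch below shows the standard ISBN-13 check digit rule (alternating weights 1 and 3, modulo 10); the function name `isbn13_check_digit` is just an illustrative choice.

```python
def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits.

    Hyphens and spaces are ignored. Digits are weighted 1, 3, 1, 3, ...
    and the check digit brings the weighted sum up to a multiple of 10.
    """
    digits = [int(ch) for ch in first12 if ch.isdigit()]
    if len(digits) != 12:
        raise ValueError("expected exactly 12 digits")
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return (10 - total % 10) % 10


if __name__ == "__main__":
    # 978-0-306-40615-7 is a commonly cited example ISBN.
    print(isbn13_check_digit("978-0-306-40615"))
```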
@Carlobia
What is wrong, in my humble opinion, is that instead of trying to code this in a totally non-Wikidatan way, we should figure out how to express the same things using Wikidata statements, and fill the gaps where Wikidata is not expressive enough yet. Maybe this means that we need an item "Philosophy of History", for example (except we already have one in this case: philosophy of history (Q190721)).
In your ISBN example, computing the code would be entirely feasible if we stored the identifier of the edition in the publisher's catalog, but we never will. So it is much easier to copy-paste the ISBN or build a URL from it.
For the Colon Classification, it’s not clear to me at all how to use it. Take "L,45;421:6;253:f.44'N5": it needs to be decoded before any query, for example with the query service, can be run to find similar items … author  TomT0m / talk page 20:17, 9 February 2022 (UTC)
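As a rough idea of what such a decoding step might look like, the Python sketch below splits a class number at its connecting symbols. The symbol-to-facet labels are an assumption based on the usual description of CC facets (personality, matter, energy, space, time) and on the enwiki example quoted earlier; the names `FACET_BY_SYMBOL` and `split_class_number` are hypothetical. A real decoder would also need the CC schedules to turn each number into a concept, and would have to handle further notation.

```python
import re

# Assumed reading of the connecting symbols; the labels are illustrative only.
FACET_BY_SYMBOL = {
    "": "main class",
    ",": "personality",
    ";": "matter",
    ":": "energy",
    ".": "space",
    "'": "time",
}

# An optional connecting symbol followed by a run of non-symbol characters.
TOKEN = re.compile(r"([,;:.']?)([^,;:.']+)")


def split_class_number(code: str) -> list:
    """Split a class number into (facet label, value) pairs."""
    return [(FACET_BY_SYMBOL[symbol], value) for symbol, value in TOKEN.findall(code)]


if __name__ == "__main__":
    for facet, value in split_class_number("L,45;421:6;253:f.44'N5"):
        print(facet, value)
```

Splitting only exposes the pieces; it does not by itself say which items on Wikidata the pieces correspond to.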
@TomT0m I do not understand what "a totally non-Wikidatan way" means. The Wikidata item philosophy of history shows that the two keywords philosophy + history on any other item are not equivalent to it, in Wikidata too. It seems to suggest that we also need more complex subjects, with two or more ideas and a syntax to join them properly. That Wikidata is not expressive enough (yet!) is a point; I absolutely agree. I would very much like to study it, even though I know it is one of the hardest issues to deal with in subject indexing. When you say "For the Colon Classification, it's not clear at all to me how to use it", a doubt arises. Are you talking about Colon Classification or about bibliographic classification schemes in general? E.g., is 2--4359167 (Dewey Decimal Classification (P1036)) clearer to you than L,45;421:6;253:f.44'N5? Or PL2947.C59 S3613 (Library of Congress Classification (works and editions) (P8360))? Or 06c,15k (Løøv classification (P6709))? These values are taken from Wikidata item properties, not from Wikipedia. How do you use them, and why do you feel they are easier to use than CC numbers? 151.38.51.170 20:51, 9 February 2022 (UTC)