User:Jeblad/plans and constituents

From Wikidata

The class/property search

The class/property search traverses the hierarchy upward, looking for every entity that branches off to a message. Each branch taken incurs a penalty, and any branch whose weight drops below a certain level is pruned.

There can be several claims, and therefore several starting points for the upward traversal of the hierarchy. Each starting point is weighted according to the rank of its statement, so a preferred statement weighs somewhat more than a normal statement. Several preferred statements might add up to make a distant class with a plan the overall winner. Deprecated statements are not traversed.

It is probably better to do a breadth-first search, prioritizing entities with higher weight over those with lower weight.
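
The search described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the rank weights, the penalty factor, the pruning threshold, and the dictionary shapes (`parents`, `plans`, `claims`) are hypothetical and are not fixed anywhere on this page. A priority queue is used so that the heaviest candidates are expanded first, which is one way to read "prioritize those with highest weight".

```python
import heapq
from collections import defaultdict

# Assumed numbers; the page only says preferred weighs "somewhat more" than normal.
RANK_WEIGHT = {"preferred": 1.5, "normal": 1.0}  # deprecated is left out on purpose
PENALTY = 0.8     # hypothetical factor applied on every upward step
THRESHOLD = 0.3   # hypothetical level below which a branch is pruned


def reach_from(start, weight, parents):
    """Walk upward from one starting class, heaviest first.

    Returns a dict mapping each reached class to the best weight it was reached with.
    """
    reached = {}
    heap = [(-weight, start)]  # negate weights because heapq is a min-heap
    while heap:
        negative, node = heapq.heappop(heap)
        if node in reached:    # already reached with an equal or higher weight
            continue
        reached[node] = -negative
        step = -negative * PENALTY
        if step < THRESHOLD:   # prune: do not climb further along this branch
            continue
        for parent in parents.get(node, ()):
            if parent not in reached:
                heapq.heappush(heap, (-step, parent))
    return reached


def score_plans(claims, parents, plans):
    """Add up, per plan, the weights contributed by every traversable claim.

    claims:  list of (class_id, rank) pairs taken from the item's statements
    parents: dict mapping a class id to the ids of its superclasses
    plans:   dict mapping a class id to the plan ids attached to that class
    """
    totals = defaultdict(float)
    for class_id, rank in claims:
        weight = RANK_WEIGHT.get(rank)
        if weight is None:     # deprecated (or unknown) ranks are not traversed
            continue
        for node, reached_weight in reach_from(class_id, weight, parents).items():
            for plan in plans.get(node, ()):
                totals[plan] += reached_weight
    return dict(totals)
```

With these assumed numbers, an item that has a normal claim two steps below some class and a preferred claim one step below the same class lets the plan on that shared class outscore a plan attached directly to the first claim's class, which is the "distant class wins" effect mentioned above.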

Content determination

The actual data comes from statements in ordinary items, that is, statements about the topics described in Wikipedia articles. These statements use properties (predicates) that themselves have statements pointing to one or more messages.

The possible messages are collected by an upward search in the hierarchy, trying to find all properties that branch off to a message.

At least one message must exist for a statement to be considered for inclusion. Which message to use is decided during optimization of the content generation.
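
A minimal Python sketch of this filtering step follows. The dictionary shapes and the identifiers used with it (such as `prop:birthplace`) are hypothetical, and the sketch ignores weights and penalties: it only collects the messages reachable upward from a statement's property and drops statements that reach none.

```python
from collections import deque


def messages_for(prop, superproperties, messages):
    """Collect messages reachable from `prop` by walking up the property hierarchy.

    superproperties: dict mapping a property id to the ids of its parent properties
    messages:        dict mapping a property id to the message ids it points to
    """
    found, seen, queue = [], set(), deque([prop])
    while queue:
        current = queue.popleft()      # breadth-first: nearest properties first
        if current in seen:
            continue
        seen.add(current)
        for message in messages.get(current, ()):
            if message not in found:
                found.append(message)
        queue.extend(superproperties.get(current, ()))
    return found


def candidate_statements(statements, superproperties, messages):
    """Keep only statements whose property leads to at least one message.

    Each result pairs the statement with all its possible messages; picking
    one of them is left to the later optimization step.
    """
    result = []
    for statement in statements:
        options = messages_for(statement["property"], superproperties, messages)
        if options:
            result.append((statement, options))
    return result
```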

Examples

Page 62

Instance of message
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix wdt: <http://www.wikidata.org/prop/direct/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix schema: <http://schema.org/> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix wikibase: <http://wikiba.se/ontology#> .

wd:Q1234567890 a wikibase:Item ; # fake URI
	rdfs:label "message a"@en ;
	skos:prefLabel "message a"@en ;
	schema:name "message a"@en ;
	schema:description "message to use for data values"@en;
	wdt:P31				# "instance of"
		wd:Q1234567891 .	# "NLG Message"

The generated structure will have pointers to the actual data.

Document structuring

The actual data comes from ordinary items, that is, those that represent articles in Wikipedia. Such an item has an "instance of" statement, and that class (resource) may have statements pointing to one or more plans.

The possible plans are collected by an upward search in the hierarchy, trying to find all items that branch off to a plan.

At least one of the plans must be a document plan. This is used to build the first iteration of the document, that is the article, for the item's class. If several are found, the tie is broken on the weight of the plans, and the one with the highest weight is used.

All other document plans pointed to by the found classes are then stripped of their titles and merged to build an overall plan.
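
The selection and merge can be sketched in Python like this. The plan dictionaries and their keys (`weight`, `title`, `constituents`, `is_document`) are hypothetical, and both the merge order (by descending weight) and the removal of duplicate constituents are assumptions; the page does not say how the merge is ordered.

```python
def build_overall_plan(plans):
    """Pick the heaviest document plan and merge the other document plans into it.

    plans: list of dicts with the keys 'weight', 'title', 'constituents'
           and 'is_document' (all names are illustrative only)
    """
    documents = [plan for plan in plans if plan["is_document"]]
    if not documents:
        raise ValueError("at least one document plan is required")

    # Highest weight wins; on equal weights max() keeps the first one listed.
    primary = max(documents, key=lambda plan: plan["weight"])
    overall = {
        "title": primary["title"],
        "constituents": list(primary["constituents"]),
    }

    others = sorted(
        (plan for plan in documents if plan is not primary),
        key=lambda plan: plan["weight"],
        reverse=True,
    )
    for plan in others:
        # Only the constituents are carried over; the plan's own title is dropped.
        for constituent in plan["constituents"]:
            if constituent not in overall["constituents"]:
                overall["constituents"].append(constituent)
    return overall
```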

Examples

Page 64

Instance of plan
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q42> # fake URI
  wd:label "plan A" ;
  wd:description "some generic plan covering the topic" ;
  wd:instance_of "NLG Plan" ; # this is really an item
  wd:constituent # could have any number of constituents, but is expected to have at least one
    <http://www.wikidata.org/entity/Q142> . # fake URI
Instance of constituent
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "constituent A" ;
  wd:description "some generic constituent covering a small part of the topic" ;
  wd:instance_of "NLG Constituent" . # this is really an item
Instance of satellite
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "satellite A" ;
  wd:description "some generic satellite and its rhetorical relation to the nucleus" ;
  wd:instance_of "NLG Satellite" ; # this is really an item
  wd:relation <http://www.wikidata.org/entity/Q342> ; # fake URI
  wd:satellite <http://www.wikidata.org/entity/Q242> . # fake URI, instance of Plan or Message
Extended constituent (rhetoric)
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "rhetoric constituent A" ;
  wd:description "some generic rhetoric covering the topic" ;
  wd:instance_of "NLG Constituent" ; # this is really an item
  wd:instance_of "NLG Rhetoric" ; # this is really an item, necessary if nucleus/satellite is not given, but should be enforced
  wd:nucleus <http://www.wikidata.org/entity/Q42> ; # fake URI, instance of Plan or Message
  wd:satellite  # could have any number of satellites, but is expected to have at least one
    <http://www.wikidata.org/entity/Q242> . # fake URI
Extended constituent (set)
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "set of constituent A" ;
  wd:description "some generic rhetoric covering the topic" ;
  wd:instance_of "NLG Constituent" ; # this is really an item
  wd:instance_of "NLG Set" ; # this is really an item, necessary if relation/constituent is not given, but should be enforced
  wd:relation <http://www.wikidata.org/entity/Q242> ; # fake URI, instance of Plan or Message
  wd:constituent  # could have any number of constituents, but is expected to have at least one
    <http://www.wikidata.org/entity/Q242> . # fake URI
Extended plan (paragraph)
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "document plan A" ;
  wd:description "some generic document plan covering the topic" ;
  wd:instance_of "NLG Plan" ; # this is really an item
  wd:instance_of "NLG Paragraph" . # this is really an item, necessary as it can't be derived from a property, but should be enforced
Extended plan (document)
@prefix wd: <http://wikiba.se/ontology-beta#> .
<http://www.wikidata.org/entity/Q142> # fake URI
  wd:label "document plan A" ;
  wd:description "some generic document plan covering the topic" ;
  wd:instance_of "NLG Plan" ; # this is really an item
  wd:instance_of "NLG Document" ; # this is really an item, necessary if title is not given, but should be enforced
  wd:title <http://www.wikidata.org/entity/Q242> . # fake URI, instance of PhraseSpec or Message
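
The shapes in the examples above can also be read as a small set of record types. The Python sketch below is one possible in-memory reading, not a definition: the class and field names mirror the fake properties used in the examples, relations are kept as plain strings, and a title is modelled as a `Message` only (the examples also allow a PhraseSpec there).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Message:
    label: str


@dataclass
class Constituent:
    label: str


@dataclass
class Plan:
    label: str
    # Expected to hold at least one constituent, though this is not enforced here.
    constituents: List[Constituent] = field(default_factory=list)


@dataclass
class Paragraph(Plan):
    pass


@dataclass
class Document(Plan):
    title: Optional[Message] = None  # assumption: PhraseSpec titles are left out


Target = Union[Plan, Message]  # what a nucleus or satellite may point at


@dataclass
class Satellite:
    label: str
    relation: str      # stands in for the item naming the rhetorical relation
    satellite: Target


@dataclass
class Rhetoric(Constituent):
    nucleus: Optional[Target] = None
    satellites: List[Satellite] = field(default_factory=list)


@dataclass
class ConstituentSet(Constituent):
    relation: Optional[str] = None
    constituents: List[Constituent] = field(default_factory=list)
```

For instance, `Document("document plan A", [Rhetoric("rhetoric constituent A", nucleus=Message("message a"))], title=Message("title"))` builds a document plan with a single rhetoric constituent.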