Wikidata:WikiProject Events and Role Frames

The primary aims of WikiProject Events and Role Frames are

  • to define a set of properties that consistently model event occurrences and their participants;
  • to fill gaps in Wikidata regarding items for events and actions; and
  • to encourage use of the proposed model and newly introduced items across Wikidata.

Join us!

Motivation

One of the known weaknesses of Wikidata is the spotty coverage of event classes and their role structures. Let’s look at the spotty coverage issue first.

Spotty Event Coverage

One of the most common verbs in most languages is “to see”, e.g., “I see a house”, “Je vois une maison”, “я вижу дом”, “Ich sehe ein Haus”. The closest concept in Wikidata is visual perception (Q162668), described as “ability to interpret the surrounding environment using light in the visible spectrum.” We argue that an ability to do something is distinct from actually doing it. The latter is an event that takes place in space-time and, as such, should be a direct or indirect subclass of occurrence (Q1190554). Our notion of events includes actions. We examined over 11,500 rolesets contained in PropBank (Q7250039) that describe English predicates (mostly verbs) and identified over 7,500 potentially missing Wikidata items. Each of these “gaps” needs further examination to determine whether it warrants a new item, but the list gives us a starting point.

We want to emphasize that although we start with English, the gaps are semantic, not lexical. We should use multiple languages to identify semantic gaps. Wikidata contains concepts at different levels of granularity, e.g., killing (Q844482), capital punishment (Q8454), execution by shooting (Q15747939). English does not have a single verb for the last of these, but Russian does. The item going (Q19279529) is defined as “to move from a place to another” and has aliases “leave”, “to go” and “go”. There is also departure (Q21171241), which includes “leaving”, but there is no item for arrival. Russian has many verbs that specify different modalities of “moving from one place to another”, e.g., “приехать”, “уехать”, “подъехать”, “съехать”, “заехать”, etc. That does not mean that we must create a new item for each of these modalities; going (Q19279529) might suffice.

As we examine Wikidata events, we may also want to clean up some of the “subclass of” statements. For example, departure (Q21171241) is not currently a subclass of going (Q19279529). The item execution (Q3966286), defined as “homicide as capital punishment”, does not seem to be connected to capital punishment (Q8454).

Event Role Structures

All action events have semantic core roles: "eating" has the "eater" and the "eaten"; "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and an "eaten", regardless of how and in which language it is expressed. Most of the existing items for action classes do not mention these roles. For example, throwing (Q12898216), defined as “launching of a ballistic projectile by hand”, does not have any statements that indicate the existence of the thrower, the target, or the projectile, let alone specify the kinds of entities likely to fill these roles.

The item eating (Q213449), defined as “ingestion of food to provide for all organisms their nutritional or medicinal needs”, uses the practiced by (P3095) property pointing to eater (Q20984678), defined as “human or other live being who eats something”. There are a few problems here. The description of “eating” clearly indicates that it applies to all organisms, yet the description of “eater” applies only to humans. The property practiced by (P3095) is defined as “type of agents that study this subject or work in this field”, which makes it more suitable for specifying someone’s occupation. And indeed, eater (Q20984678) is an instance of occupation (Q12737077), without any indication that “eater” is an occupation of an organism. The item eating (Q213449) also employs the property uses (P2283) to point to food (Q2095) and human digestive system (Q9649). This does not differentiate between the two values, and the second applies only to humans.

The point of these examples is that currently, Wikidata either does not specify action roles or does so in an ad hoc and inconsistent way. There are two property proposals currently under discussion, "agent of action" and "object of action", that partially remedy this problem. In our opinion, these proposals, while going in the right direction, are limited. First, even informally, “agent of action” and “object of action” don’t apply to all actions. For example, in “the car broke down” there is no agent. Nor is the lecture an agent in “the lecture bored John”. One can argue that events bore people, but we would not call them “agents” (PropBank calls this role the “boring entity”). More importantly, as noted in the proposal discussion, the proposals do not make a distinction between classes and instances. Our examples are event instances. When we define the “eater” role for “eating”, we want it to describe the kinds of concepts whose instances might play that role in an instance of “eating”. For example, we might say that the “eater” is expected to be an organism and the “eaten” is expected to be food.

Another attempt to address this problem is the proposal for a property “frame element” in which concepts such as dog (Q144) are connected via “frame element of” to the event frames such as dog walking (Q38438) in the spirit of FrameNet (Q1322093). As noted in the proposal discussion, to describe event frames the link direction should be reversed, e.g., “dog walking” “has frame element” “dog”. But even then, the property “has frame element” does not tell us much about the relationship between “dog walking” and “dog”.

Proposal

Our proposal is inspired by PropBank, which has over 11,500 rolesets that define predicates and their core roles. PropBank and its derivatives have been used extensively in text annotations and semantic parsing, including Abstract Meaning Representations. Many rolesets are cross-referenced with FrameNet (Q1322093), VerbNet (Q7920918) and other knowledge sources. The key to our proposal is a new property that we propose to call “event role”. The key difference between it and “has frame element” is the item at the end of the link. In our proposal, it is a new item, specific to the <event, role> pair, that describes the role in greater detail. For example, eating (Q213449) would have two statements:

eating (Q213449) → event role → Q_eater

eating (Q213449) → event role → Q_eaten

Both Q_eater and Q_eaten are instances of Q_event_role. The Q_eater item will be unique and serve as an anchor for whatever information we might want to associate with the “eater” role of “eating”. For example, we may specify that the eater is expected to be an organism and the eaten is expected to be some form of food. To do that, we need another new property that we call “selectional preference” which is used only with the instances and subclasses of Q_event_role. For example:

Q_eater → selectional preference → organism (Q7239)

Multiple statements with “selectional preference” should be interpreted as an “OR”, i.e., the filler of the role slot should descend from at least one of the selectional preference items. The meaning of “descend” could be application-specific, but, generally, we mean a combination of “subclass of”, “parent taxon” and “instance of” properties. Violations of selectional preferences often signal metaphoric use as in “the house ate the savings”. Other information such as dietary restriction statements can also be attached to the event role items.
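The “OR” reading of selectional preferences can be illustrated with a minimal Python sketch. It assumes a toy in-memory graph in which a single PARENTS mapping stands in for the combination of “subclass of”, “parent taxon” and “instance of” links. The item names, the edges, and the function names are illustrative placeholders, not actual Wikidata statements or an existing API.

```python
from collections import deque

# Toy "descends from" edges: child -> set of parents. These stand in for a
# mix of "subclass of", "parent taxon" and "instance of" links. The items and
# edges are illustrative placeholders, not real Wikidata statements.
PARENTS = {
    "dog": {"animal"},
    "animal": {"organism"},
    "house": {"building"},
    "bread": {"food"},
}

# Selectional preferences per event role item (hypothetical identifiers).
SELECTIONAL_PREFERENCE = {
    "Q_eater": {"organism"},
    "Q_eaten": {"food"},
}


def ancestors(item):
    """Return the item itself plus everything reachable through PARENTS."""
    seen = {item}
    queue = deque([item])
    while queue:
        current = queue.popleft()
        for parent in PARENTS.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def satisfies_preference(role, filler):
    """True if the filler descends from at least one preferred item (OR)."""
    preferred = SELECTIONAL_PREFERENCE.get(role, set())
    # In this sketch, a role with no stated preference accepts any filler.
    if not preferred:
        return True
    return bool(ancestors(filler) & preferred)


print(satisfies_preference("Q_eater", "dog"))    # True
print(satisfies_preference("Q_eater", "house"))  # False
```

Under these assumptions, a False result does not mean the statement is wrong. As with “the house ate the savings”, it may simply flag a candidate metaphoric use for review.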

How many event role items may be needed? Our estimate based on the PropBank rolesets is about 25,000 – 30,000. It might also be possible to cluster the event role items and create a “subclass of” hierarchy. We want to stress that the proposed event role items are not lexical or grammatical constructs. The existence of a killer in a killing event is not tied to any language or grammar. It is a part of the ‘killing’ concept.

For completeness, we also propose “role in event” - the inverse property to “event role”.
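Because each event role item is specific to one <event, role> pair, the inverse statements could be derived mechanically. The following Python sketch assumes that “event role” statements are available as simple pairs. The function name and the data layout are hypothetical, and Q_eater and Q_eaten are the placeholder identifiers used above.

```python
# "event role" statements as (event class, event role item) pairs.
# Q_eater and Q_eaten are placeholder identifiers, not existing items.
EVENT_ROLE = [
    ("eating (Q213449)", "Q_eater"),
    ("eating (Q213449)", "Q_eaten"),
]


def invert_event_roles(statements):
    """Derive "role in event" (role item -> event class) from "event role"."""
    inverse = {}
    for event, role in statements:
        # Each role item should belong to exactly one event class.
        if role in inverse and inverse[role] != event:
            raise ValueError(f"{role} is attached to more than one event class")
        inverse[role] = event
    return inverse


ROLE_IN_EVENT = invert_event_roles(EVENT_ROLE)
print(ROLE_IN_EVENT["Q_eater"])  # eating (Q213449)
```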

So far, we discussed the representation of event classes. The proposed “event role” property applies only to event classes, not instances. We also need to discuss event instances. For example, assassination of Abraham Lincoln (Q1025404) is an instance of assassination (Q3882219). Our proposal will create Q_assassin and Q_assassinated event role items associated with assassination (Q3882219). We propose to create one new property "event argument" with a qualifier "argument type" to represent the roles in an event instance, for example:

assassination of Abraham Lincoln (Q1025404) → event argument → Abraham Lincoln (Q91) [qualifier: argument type → Q_assassinated]

assassination of Abraham Lincoln (Q1025404) → event argument → John Wilkes Booth (Q180914) [qualifier: argument type → Q_assassin]
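To make the instance-level model concrete, here is a minimal Python sketch of “event argument” statements carrying an “argument type” qualifier, together with a consistency check that each argument type is an event role of the instance's class. The class and function names are hypothetical, and Q_assassin and Q_assassinated are the placeholder role items proposed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventArgument:
    """One "event argument" statement with its "argument type" qualifier."""
    value: str          # the participant, e.g. a person item
    argument_type: str  # the event role item this participant fills


LINCOLN = "assassination of Abraham Lincoln (Q1025404)"
ASSASSINATION = "assassination (Q3882219)"

# Class level: event role items attached to the event class (placeholders).
EVENT_ROLE = {ASSASSINATION: {"Q_assassin", "Q_assassinated"}}

# Instance level: "instance of" link and the event arguments.
INSTANCE_OF = {LINCOLN: ASSASSINATION}
EVENT_ARGUMENTS = {
    LINCOLN: [
        EventArgument("Abraham Lincoln (Q91)", "Q_assassinated"),
        EventArgument("John Wilkes Booth (Q180914)", "Q_assassin"),
    ],
}


def invalid_arguments(event):
    """Return arguments whose type is not a role of the event's class."""
    allowed = EVENT_ROLE.get(INSTANCE_OF.get(event), set())
    return [a for a in EVENT_ARGUMENTS.get(event, [])
            if a.argument_type not in allowed]


def fillers(event, role):
    """Return the participants that fill the given role in the event."""
    return [a.value for a in EVENT_ARGUMENTS.get(event, [])
            if a.argument_type == role]


print(invalid_arguments(LINCOLN))      # []
print(fillers(LINCOLN, "Q_assassin"))  # ['John Wilkes Booth (Q180914)']
```

Keeping the role on the qualifier means that a single property covers every event class, at the cost of one lookup from the instance to its class to validate the argument types.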

One might object that in this example, these properties convey the same information as target (P533), whose value is Abraham Lincoln (Q91), and perpetrator (P8031), whose value is John Wilkes Booth (Q180914). Unfortunately, many instances of assassination (Q3882219) use different properties, or none at all, to indicate the assassin and the victim. We propose to use a uniform approach even if it causes some redundancy.

We are aware that creating new properties in Wikidata is a time-consuming and difficult process. Our proposal involves 1 new property for event classes (event role), 2 for event roles (role in event, selectional preference) and 1 property and 1 qualifier for event instances (event argument, argument type). We are open to revising our model to follow any existing practices of modeling semantic frames in Wikidata.

Statistics

(to be filled in later)

Queries

(to be filled in later)

Current tasks

(to be filled in later)

Participants

The participants listed below can be notified using the following template in discussions:
{{Ping project|Events and Role Frames}}

Related links