Wikidata:WikiProject Events and Role Frames


The primary aims of WikiProject Events and Role Frames are

  • to define a set of properties that consistently model eventualities (states, processes and events) and their participants (Pustejovsky, 2021), where by eventualities we mean things that occur or happen, as simple as throwing a ball or mowing the lawn, or as complex as one country invading another country or a town being flattened by a tornado;
  • to fill gaps in Wikidata regarding items for these states/processes/events/actions; and
  • to encourage use of the proposed model and newly introduced items across Wikidata.

Pustejovsky, James (2021). The Role of Event-Based Representations and Reasoning in Language. In Caselli T, Hovy E, Palmer M, Vossen P, eds. Computational Analysis of Storylines: Making Sense of Events. Cambridge University Press.

Join us!

Motivation

One of the known weaknesses of Wikidata is the spotty coverage of eventualities (processes, states, events) and their prototypical participant structures. Let’s look at the spotty coverage issue first.

Spotty State/Process/Event/Action Coverage

One of the most common verbs in most languages is “to bring”, e.g., “I brought flowers to my mother”, “J'ai apporté des fleurs à ma mère”, “Я принёс цветы маме”, “Ich habe meiner Mutter Blumen mitgebracht”. Until we added "bringing (Q124457329)" in February 2024, there was no such concept in Wikidata. We examined over 11,500 rolesets contained in PropBank (Q7250039) that describe English predicating expressions (mostly verbs) and identified over 7,500 potentially missing Wikidata items. Each of the semantic Q item “gaps” needs further examination to determine if it warrants a new item and a corresponding lexeme, but the list gives us a starting point. We will also be carefully comparing the PropBank entries to the over 7,500 English verbs already defined as Wikidata lexemes. Ideally we will be able to expand these lexemes with additional sense distinctions and predicate argument structures as well as fill in missing entries. We want to emphasize that although we start with English, the Q item gaps are semantic, not lexical, and we should use multiple languages, including Chinese, Hindi, Arabic, and Russian, to determine appropriate fillers. We also plan to rely heavily on the careful curation of Czech, German, English and Spanish common predicate argument structures to be found in SynSem to ensure cross-lingual compatibility of our item mappings. We welcome input from other languages as well.

State/Process/Event/Action Role Structures

All states/processes/events/actions have core semantic roles - "eating" has the "eater" and the "eaten", "throwing" has the "thrower", the "target" and the "projectile". These roles are not optional. Every act of "eating" has an "eater" and the "eaten" independently of how and in which language it is expressed. Most of the existing items for such classes do not mention these roles. For example, "throwing (Q12898216)", defined as “launching of a ballistic projectile by hand” does not have any statements that indicate the existence of the thrower, the target, or the projectile, let alone the specifications of the kinds of entities these attributes are likely to be.

Some Wikidata items for event/action concepts include statements for some of the semantic roles. For example, "eating (Q213449)" uses the "practiced by (P3095)" property whose object is "eater (Q20984678)". Although "practiced by (P3095)" is defined as “type of agents that study this subject or work in this field”, it is often used to indicate an agent of an action. We can expand the definition of "practiced by (P3095)" to encompass the semantic role use as well, or alternatively we could replace it with a new "has semantic role" property. In either event, we would want to qualify it with the type of role, in this case an Agent. In "eating (Q213449)" there is also a property "uses (P2283)" that points to "food (Q2095)" to indicate what is "eaten". This property also has many other uses. In the absence of a Wikidata property that can be used exclusively to indicate the value of a semantic role, using existing properties would require adding qualifiers such as "object has role (P3831)" to indicate that the value is a semantic role.

Caveats

This project does not address the problem of ontological consistency of Wikidata items. But, as we examine Wikidata events, we might also fill in the gaps in some of the “subclass of” inheritance relations. For example, departure (Q21171241) is not currently a subclass of going (Q19279529). The item execution (Q3966286) defined as “homicide as capital punishment” does not seem to be connected to capital punishment (Q8454).

[(arw) There are other semantic issues beyond subclass of (P279), such as skos:altLabel. For example, bringing (Q124457329) is a subclass of moving (Q115095261), which has an altLabel of "renaming". This is simply highlighted as an additional item that may be corrected. (map) Mahir has addressed this as he describes in the Discussion page.]

Proposal

The proposal outlines a step-by-step procedure for expanding Wikidata state/process/event/action coverage. It has four steps:

1. Adding missing Wikidata state/process/event/action classes as Q items, as well as English lexeme L items, and lexeme sense S items. We will use item for this sense (P5137) or predicate for (P9970) to tie the senses to the concepts.

2. Adding missing state/process/event/action roles to both concepts and lexemes. For English lexeme senses we can use the already defined has semantic argument (P9971). For concepts denoted by Q items our preference is a new "has semantic role" property, as explained below.

3. Specifying selectional preferences for state/process/event/action roles.

4. Adding role specifications to the state/process/event/action instances.


Step 1. Adding missing Wikidata state/process/event/action classes

We propose to go systematically over the PropBank RoleSets (see below for examples of PropBank Frame Files). For each RoleSet, we look for an existing Q item class that represents the same concept. For example, when we examine PropBank's "see.01" defined as "to perceive an object with one's eyes", we find two relevant Q items: "visual perception (Q162668)" and "seeing (Q25374341)". The former is described as "ability to interpret the surrounding environment using light in the visible spectrum" and the latter as "the event of perceiving something using eyesight" which looks like a better candidate. The other 5 PropBank senses of "see", entered as L items, would map to quite different Q items, if available. The mappings can be specified by creating a new identifier, "PropBank ID".

If no such item is found, we create one. For example, we could not find a Q item equivalent to PropBank's "bring.01" defined as "carry along with, move literally or metaphorically". We created a new Q item "bringing (Q124457329)" described as "transporting something toward somebody/somewhere". We also added translation (P5972) for Russian, French, Spanish, Chinese and Punjabi and made it a "subclass of" "moving (Q115095261)".
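The lookup-then-create procedure of Step 1 can be sketched as a small offline routine. This is an illustration only: the `RoleSet` class, the `known_mappings` dictionary and the `find_gaps` helper are hypothetical names, the `example.99` roleset is a made-up placeholder, and the mapping is hand-filled from the examples in this section rather than retrieved from Wikidata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSet:
    """A PropBank roleset: an ID such as 'see.01' plus its definition."""
    roleset_id: str
    definition: str


# Hand-filled mapping taken from the examples in this section (assumption:
# a real run would search Wikidata for candidate items instead).
known_mappings = {
    "see.01": "Q25374341",     # seeing
    "bring.01": "Q124457329",  # bringing
    "eat.01": "Q213449",       # eating
}


def find_gaps(rolesets, mappings):
    """Return the rolesets that have no Q item yet and need manual review."""
    return [rs for rs in rolesets if rs.roleset_id not in mappings]


rolesets = [
    RoleSet("see.01", "to perceive an object with one's eyes"),
    RoleSet("bring.01", "carry along with, move literally or metaphorically"),
    RoleSet("example.99", "made-up placeholder with no mapped item"),
]

gaps = find_gaps(rolesets, known_mappings)
print([rs.roleset_id for rs in gaps])
```

Each roleset returned by `find_gaps` is only a candidate gap; as noted above, it still needs examination before a new item is created.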

Step 2. Adding missing event/action roles

When we find or create an appropriate state/process/event/action Q item, we will go over the roles of the RoleSet. In the "eat.01" example, there are two roles: the "consumer, eater" and the "meal". For each role, we look for a Q item statement that describes the role.

eating (Q213449) | practiced by (P3095) | eater (Q20984678)

describes the "consumer, eater" role. RoleSet "eat.01" indicates that this role is a "PAG" - a Prototypical Agent or Actor. Since the "practiced by (P3095)" property may have uses other than designating a semantic frame role, we add a qualifier:

eating (Q213449) | practiced by (P3095) | eater (Q20984678) | object has role (P3831) | actor (Q23894381)

We will also add an English Lexeme for eat, as Lxxxx-eat with Sense Lxxxx-eat-S1 (defined using the property, ontolex:sense). Then, Lxxxx-eat-S1 can use has semantic argument (P9971) to reference eater (Q20984678), qualified with object has role (P3831) actor (Q23894381).

PropBank's "meal" role of "eat.01" is a PPT (Prototypical Patient). The statement that best describes it is:

eating (Q213449) | uses (P2283) | food (Q2095)

We can add a qualifier to this statement specifying the role type:

eating (Q213449) | uses (P2283) | food (Q2095) | object has role (P3831) | patient (Q170212)

The Lexeme can be treated similarly.

We provide a mapping between PropBank's prototypical roles such as "PAG" and "PPT" and the corresponding Q items in the table below in the Semantic Roles section.

If we cannot find a suitable qualifier value for object has role (P3831), we can use "semantic role (Q117747915)" as a "back off". Ideally the Q items in our table would all be declared as subclasses of "semantic role (Q117747915)".
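The qualifier scheme with its back-off can be sketched as follows. This is a minimal sketch, not a Wikidata API: the `Statement` class and the `add_role_qualifier` helper are hypothetical, the tag mapping covers only the two tags used in this section, and the `XYZ` tag is a made-up value used to trigger the back-off.

```python
from dataclasses import dataclass, field

SEMANTIC_ROLE_BACKOFF = "Q117747915"  # semantic role

# Partial tag-to-item mapping, limited to the two tags used in this section.
ROLE_ITEM_FOR_TAG = {
    "PAG": "Q23894381",  # actor
    "PPT": "Q170212",    # patient
}


@dataclass
class Statement:
    """Subject, property and value of a statement, plus its qualifiers."""
    subject: str
    prop: str
    value: str
    qualifiers: dict = field(default_factory=dict)


def add_role_qualifier(statement, function_tag):
    """Attach object has role (P3831), backing off to the generic item."""
    role_item = ROLE_ITEM_FOR_TAG.get(function_tag, SEMANTIC_ROLE_BACKOFF)
    statement.qualifiers["P3831"] = role_item
    return statement


# eating | practiced by | eater, qualified as actor
eater = add_role_qualifier(Statement("Q213449", "P3095", "Q20984678"), "PAG")
# eating | uses | food, qualified as patient
food = add_role_qualifier(Statement("Q213449", "P2283", "Q2095"), "PPT")
# An unmapped (made-up) tag falls back to semantic role (Q117747915)
other = add_role_qualifier(Statement("Q213449", "P2283", "Q2095"), "XYZ")

print(eater.qualifiers, food.qualifiers, other.qualifiers)
```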

When an event/action class does not have a statement describing a core semantic role, we look for an existing Q item that most closely describes that role. For example, "creation (Q11398090)" (process during which something comes into being and gains its characteristics) corresponds to PropBank's "create.01". It has a statement for the "creator" role but no statements for "Arg1-PPT thing created". The item that best describes the object of creation is "artificial object (Q3619132)". To complete the frame, we add the following statement:

creation (Q11398090) | has characteristic (P1552) | artificial object (Q3619132) | object has role (P3831) | patient (Q170212)

We used the very generic "has characteristic (P1552)" property because we could not find a more specific existing one. The qualifier provides the interpretation.

Another example is "military offensive (Q2001676)" which does not have statements describing the attacker and the defender. The item "attacker (Q31924059)" seems appropriate for the attacker role and "defender (Q111729140)" for the defender role:

military offensive (Q2001676) | has characteristic (P1552) | attacker (Q31924059) | object has role (P3831) | agent (Q392648)

military offensive (Q2001676) | has characteristic (P1552) | defender (Q111729140) | object has role (P3831) | theme (Q118826633)

The agent (Q392648) and theme (Q118826633) would more appropriately be defined at a higher level, perhaps for the Q item aggression (Q191797), and inherited by military offensive (Q2001676).
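Declaring roles on a higher-level class and inheriting them could work roughly as in the sketch below. It is an illustration under stated assumptions: the `SUBCLASS_OF` and `DECLARED_ROLES` tables are toy data, the subclass link from military offensive to aggression is assumed for the example, and `inherited_roles` is a hypothetical helper.

```python
# Toy subclass-of (P279) links; the link from military offensive to
# aggression is an assumption made for this illustration.
SUBCLASS_OF = {"Q2001676": "Q191797"}  # military offensive -> aggression

# Roles declared directly on a class: role item -> role type item.
DECLARED_ROLES = {
    "Q191797": {
        "Q31924059": "Q392648",      # attacker -> agent
        "Q111729140": "Q118826633",  # defender -> theme
    },
}


def inherited_roles(class_id):
    """Collect roles from a class and its ancestors; the nearest class wins."""
    roles = {}
    seen = set()
    current = class_id
    while current is not None and current not in seen:
        seen.add(current)  # guards against cycles in the toy hierarchy
        for role, role_type in DECLARED_ROLES.get(current, {}).items():
            roles.setdefault(role, role_type)
        current = SUBCLASS_OF.get(current)
    return roles


print(inherited_roles("Q2001676"))
```

With this arrangement, military offensive (Q2001676) carries no role statements of its own and still resolves to the attacker and defender roles declared on its parent.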

When no role Q item is found, we need to create one.

Step 3. Specifying selectional preferences for event/action roles

Each role in an event/action frame typically describes the classes of entities that would normally be expected to play that role in that frame's instances. For example, we normally expect that the "eater" in an "eating" instance would be an organism. Because these expectations can be violated, we call them selectional preferences, not restrictions. Unfortunately, Wikidata does not have an existing property to specify selectional preferences, and we have to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object has role (P3831)" qualifier:

eater (Q20984678) | has characteristic (P1552) | organism (Q7239) | object has role (P3831) | selectional preference (Q124051768)

It may be that the Q items that instantiate the roles/arguments already provide sufficient information. For instance, eater (Q20984678) is defined as "human or other live being who eats something" and uses has characteristic (P1552) to reference organism (Q7239). Exactly how selectional preferences should be included requires more discussion.
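The difference between a preference and a restriction can be made concrete with a soft check that flags an unexpected filler without rejecting it. The sketch below is illustrative only: it uses plain labels instead of item IDs, the subclass links are toy assumptions, and `is_a` and `check_filler` are hypothetical helpers.

```python
# Toy subclass-of chain using labels; these links are assumptions
# made for illustration, not statements copied from Wikidata.
SUBCLASS_OF = {
    "human": "organism",
    "cat": "organism",
    "acid": "chemical substance",
}

# Selectional preference per role (eater prefers organism, as in the text).
PREFERRED_CLASS = {"eater": "organism"}


def is_a(cls, ancestor):
    """Walk the subclass chain upward and report whether ancestor is reached."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == ancestor:
            return True
        seen.add(cls)
        cls = SUBCLASS_OF.get(cls)
    return False


def check_filler(role, filler_class):
    """Return 'expected' or 'unexpected'; the filler is never rejected."""
    preferred = PREFERRED_CLASS.get(role)
    if preferred is None or is_a(filler_class, preferred):
        return "expected"
    return "unexpected"


print(check_filler("eater", "cat"))   # a typical eater
print(check_filler("eater", "acid"))  # "acid ate through the metal"
```

An "unexpected" result marks a figurative or unusual use for review rather than an error, which is the behaviour the term "preference" implies.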

Step 4. Adding role specifications to the event/action instances

When a new event/action instance is created, ideally, the creator should consult the class of the instance and make sure that the semantic roles of the class are instantiated. For example, suppose we want to enter the event of the creation of Mickey Mouse by Walt Disney on 18 November 1928. Let's call the ID for this event Q_mm_creation. Wikidata uses over 300 properties to indicate event/action instance roles. We can pick "creator (P170)" for the creator role, "has effect (P1542)" for the created artifact role and "point in time (P585)" for time. We add the following 3 statements:

Q_mm_creation | creator (P170) | Walt Disney (Q8704) | object has role (P3831) | creator (Q2500638)

Q_mm_creation | has effect (P1542) | Mickey Mouse (Q11934) | object has role (P3831) | artificial object (Q3619132)

Q_mm_creation | point in time (P585) | 18 November 1928 | object has role (P3831) | point in time (Q186408)

We are using the "object has role (P3831)" qualifier to specify the role played by the object. In the case of event/action classes, we used high-level semantic role items such as "agent (Q392648)" or "theme (Q118826633)" as the objects of "object has role (P3831)". In the case of event/action instances we use the actual role items such as "creator (Q2500638)" or "attacker (Q31924059)".

Since we are planning to add semantic roles to companion lexemes as well, an alternative approach would be to follow the path from the Q item to the lexeme, perhaps using item for this sense (P5137) or predicate for (P9970), and retrieve the semantic roles for event instantiations from the lexeme. The more languages we have represented the more desirable this would be. That also might require one or more additional properties. In the short term, associating the roles with the Q items would seem to make them more generally accessible.

Also note that we do not propose to attach the "default" roles such as "location", "start", "end", "point in time" to the event/action classes since all events/actions take place in defined times and places. The instances, though, should specify them (if known). See Semantic Roles below for more details.

Wikidata contains a very large number of event/action instances. For example, "Petsamo–Kirkenes Offensive (Q705222)" is one of many instances of "military offensive (Q2001676)". Currently, it has the following statements for the attacker and the defender roles:

Petsamo–Kirkenes Offensive (Q705222) | participant (P710) | Soviet Union (Q15180)

Petsamo–Kirkenes Offensive (Q705222) | participant (P710) | Nazi Germany (Q7318)

These statements do not specify who was the attacker and who was the defender. Ideally, we should add the "object has role (P3831)" qualifier to indicate the role:

Petsamo–Kirkenes Offensive (Q705222) | participant (P710) | Soviet Union (Q15180) | object has role (P3831) | attacker (Q31924059)

Petsamo–Kirkenes Offensive (Q705222) | participant (P710) | Nazi Germany (Q7318) | object has role (P3831) | defender (Q111729140)
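Qualifier additions of this kind could be prepared in bulk. The sketch below formats the two statements above as tab-separated lines in the style of QuickStatements version 1 commands (item, property, value, then qualifier property and value). The `role_assignment_line` helper is a hypothetical name, and the exact behaviour on statements that already exist is an assumption to verify against the current QuickStatements documentation before any upload.

```python
def role_assignment_line(event, participant, role):
    """Format: event, participant (P710), value, object has role (P3831), role."""
    return "\t".join([event, "P710", participant, "P3831", role])


lines = [
    # Petsamo-Kirkenes Offensive: Soviet Union as attacker
    role_assignment_line("Q705222", "Q15180", "Q31924059"),
    # Petsamo-Kirkenes Offensive: Nazi Germany as defender
    role_assignment_line("Q705222", "Q7318", "Q111729140"),
]

print("\n".join(lines))
```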

Alternatively, we could replace the current use of participant (P710) and its qualifier in the above statement with target (P533), thus avoiding the object has role (P3831) qualifier.

Obviously, we cannot inspect all existing event/action instances and "fix" them but, at least, we now have a method for doing it.
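Candidates for such fixes could be found with a query rather than by inspection. The sketch below only builds the text of a SPARQL query that looks for participant (P710) statements lacking an object has role (P3831) qualifier on instances of a given class; it does not run it. The `unqualified_participants_query` helper is hypothetical, and the query assumes the standard prefixes of the Wikidata Query Service (`wd:`, `wdt:`, `p:`, `ps:`, `pq:`).

```python
def unqualified_participants_query(event_class, limit=100):
    """Build a SPARQL query for participants that have no role qualifier."""
    return f"""
SELECT ?event ?participant WHERE {{
  ?event wdt:P31 wd:{event_class} ;
         p:P710 ?statement .
  ?statement ps:P710 ?participant .
  FILTER NOT EXISTS {{ ?statement pq:P3831 ?role . }}
}}
LIMIT {limit}
""".strip()


# Instances of military offensive (Q2001676)
query = unqualified_participants_query("Q2001676")
print(query)
```

The query matches direct instances only; covering subclasses as well would need a property path on the class pattern.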

--Anatole Gershman (talk) 21:33, 12 June 2024 (UTC)

Semantic Roles

Semantic roles, originally termed cases, are often also referred to as predicate arguments, slots, thematic relations (VerbNet, LIRICS), frame elements (FrameNet), etc. Thematic relations traditionally only refer to the core arguments of the predicating element, and do not include more adjunct-like information found in temporal and locative modifiers. The latter can be applied quite generally and are considered more peripheral. Defining adjuncts precisely has remained a persistent challenge for the linguistics community, making it difficult to distinguish consistently between core and peripheral arguments. The term "semantic roles" can encompass both. Time and place are critical elements of useful descriptions of event instances.

Our stated goal is a mapping between Wikidata items and PropBank semantic roles. The original aim of PropBank was to add semantic role information to the syntactic structures in the Penn Treebank. Since there is no one-to-one mapping between syntactic constituents and semantic roles, annotators were asked to examine every clause in the Penn Treebank featuring a specific lexical item, such as "throw" as a predicating element, and assign the most suitable semantic role label to each one. A PropBank Frame File, listing the different coarse-grained senses of the lexical item and appropriate argument structures for each one, was referred to during this process.

For example, the frame for "throw", as listed below, indicates an ARG0-PAG, a prototypical agent (Dowty, 1990), an ARG1-PPT, a prototypical patient or theme, and an ARG2-GOL (the goal or destination of the entity being thrown). There can be up to six numbered core arguments, and a dozen additional peripheral ARGMs, marked individually with function tags such as manner (MNR), locative (LOC), direction (DIR), comitative/accompanier (COM), etc. There are also several more syntactic function tags to mark modals (MOD), negations (NEG), discourse markers (DIS), etc. The full list with their definitions can be found in the PropBank Guidelines, available at the PropBank GitHub site linked below. The example frame files referred to above are provided below in the Examples of PropBank Frames section.

After the original 50K sentence Penn Treebank was PropBanked, funding was provided to expand the number of genres, and now almost 2M tokens of English have been PropBanked, as well as text in several other languages including Chinese, Arabic, Korean, Hindi, Urdu, German, French, Russian, Spanish, etc. English PropBank has also been mapped to VerbNet and FrameNet as part of SemLink: Mapping together PropBank/VerbNet/FrameNet, and one can browse a combined representation of those three resources at the Unified Verb Index. PropBank's coverage has also been extended to provide support for Abstract Meaning Representation (AMR) annotation (which uses PropBank Frame Files), unifying PropBank rolesets across different parts of speech.

Below is a table listing our recommended semantic role labels for Wikidata that are mapped to PropBank labels and are adopted from the Uniform Meaning Representation (UMR) project. They have been carefully reviewed to ensure that they accommodate cross-linguistic typological variation (Bonial et al., 2011, A Hierarchical Unification of LIRICS and VerbNet Semantic Roles (Q118174236); Van Gysel et al., 2021, Designing a Uniform Meaning Representation for Natural Language Processing (Q115519832)). For the most part we are relying on existing Wikidata has semantic argument (P9971) definitions to realize our PropBank semantic roles. At this point they also include Start, Temporal and Place.

UMR/PropBank Semantic Roles to Wikidata Items Mapping
UMR Semantic Role | Wikidata item | PropBank Function Tag | Description | Example
Actor | actor (Q23894381) | PAG | An animate entity who performs an action | "The chef prepared the meal."
Force | force (Q126009669) | PAG | An event or inanimate entity that acts upon an undergoer in a way that is usually spontaneous, forceful, and direct | "The wind blew the door open."
Causer | agent (Q392648) | PAG, CAU | An animate entity that acts on another actor to cause them to engage in the action | "My grandmother made me eat liver."
Undergoer | undergoer (Q111335542) | PPT | The entity that undergoes the action when it is not clearly a Patient or Theme. | "The kitten licked her fingers."
Patient | patient (Q170212) | PPT | Subclass of undergoer. The patient is an undergoer in an event that is usually structurally changed, for instance by experiencing a change of state or condition; is often acted upon by an agent; is causally involved or directly affected by other participants; and exists independently of the event. | "The chef prepared the meal."; "The roommates painted the walls."
Theme | theme (Q118826633) | PPT | Subclass of undergoer. The theme is an undergoer that is central to an event or state that does not have control over the way the event occurs, is not structurally changed by the event, and/or is characterized as being in a certain position or condition throughout the state. Often in motion. | "She packed her suitcase for the trip."
Recipient | recipient (Q20820253), addressee (Q19720921) | GOL | The entity that receives something. | "The librarian handed me a book."
Experiencer | experiencer (Q1242505) | PPT, PAG | The entity that directly experiences a sensation or emotion | "Many tourists saw the accident."; "He felt a sense of relief."
Stimulus | stimulus (Q109566760) | PAG, CAU | The entity that causes an emotional or mental state | "The loud noise startled the cat."
Instrument | instrument (Q6535309) | MNR | An inanimate entity used to perform an action or event | "The rock broke the window."; "She cut the paper with scissors."
Start | origin (Q3885844) | DIR | The entity from which an action originates or the starting point of an action or event | "I flew from Heathrow."; "The bidding opened at $5."
Goal | goal (Q109405570) | GOL | Where an action is directed. In motion verbs, the final destination | "He ran to the store."
Companion | companion (Q106645134) | COM | An animate entity that accompanies another entity or entities and is presented as an oblique argument; who an action was done with | "I went to the movies with friends."
Material/Source | material (Q214609), source (Q31464082) | DIR | The location, entity, or material from which an action or event originates | "Water flowed from the faucet."; "I milked the cow."; "The shirt is made of cotton."
Place | location (Q109377685) | LOC | The place where an event or action occurs | "The party will be at the park."
Affectee | affectee (Q125995757) | PPT | Entity positively or negatively affected by the circumstances of an event or action without being the primary undergoer | "The movie made her cry."
Cause | cause (Q2574811) | CAU | Why an event or inanimate entity brings about an action or event | "The pool was closed because of lightning."
Temporal | duration (Q2199864), time (Q12322185), Frequency (Q125995799) | TMP | When an action took place. This includes all temporal referents, such as dates, duration, frequency, order, repetition, etc. | "He went to the store yesterday."; "I've been reading email for three hours."; "They cleaned the kitchen first."; "She lost her keys again."
Extent | extent (Q125953445) | EXT | The degree or amount to which something happens | "He ran five miles."; "The price increased by 5%."
Manner | means (Q12774177) | MNR | The way in which something is performed | "He worked quickly and mechanically."
Reason | cause (Q2574811) | PRP | The reason, explanation or justification for an event or action | "I went to the store because we were out of milk."; "He left early because he had another meeting."
Purpose | cause (Q2574811) | PRP | The purpose or intended objective of an event or action | "I went to the store to buy milk."; "He left early to get to another meeting."
Attribute | attribute (Q109674924) | PRD | The quality or characteristic ascribed to an entity | "The house is big."
Result | result (Q2995644) | PRD | The entity described by a secondary predicate | "She kicked the door shut."; "You scared me to death."; "He painted the door red."
Direction | direction (Q2151613) | DIR | Motion along a specified (literal or figurative) path | "I walked down the street."; "I turned left."
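Because several UMR roles share one PropBank function tag, a tag alone does not determine the Wikidata role item. The sketch below indexes a subset of the table by tag to make that one-to-many relation explicit; `ROLE_ROWS` copies only six rows, and `candidates_by_tag` is a hypothetical helper.

```python
from collections import defaultdict

# (UMR role, Wikidata item, PropBank function tags): a subset of the table.
ROLE_ROWS = [
    ("Actor", "Q23894381", ["PAG"]),
    ("Force", "Q126009669", ["PAG"]),
    ("Causer", "Q392648", ["PAG", "CAU"]),
    ("Patient", "Q170212", ["PPT"]),
    ("Theme", "Q118826633", ["PPT"]),
    ("Goal", "Q109405570", ["GOL"]),
]


def candidates_by_tag(rows):
    """Group (role, item) pairs under every function tag they are listed with."""
    index = defaultdict(list)
    for role, item, tags in rows:
        for tag in tags:
            index[tag].append((role, item))
    return dict(index)


index = candidates_by_tag(ROLE_ROWS)
print(index["PAG"])  # more than one candidate: a human choice is still needed
```

A tag with a single candidate can be mapped mechanically, while tags such as PAG and PPT require a decision based on the role description.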

--Anatole Gershman (talk) 22:04, 21 June 2024 (UTC)

Examples of PropBank Frames

Here are the complete PropBank frames referenced above.


Bring

bring.01 - carry along with, move literally or metaphorically

bring (v.)

Role Label Role Description
ARG0-PAG bringer
ARG1-PPT thing brought
ARG2-GOL benefactive or destination: brought-for, brought-to
ARG3-PRD attribute, state after bringing, secondary action
ARG4-DIR ablative, brought-from

active, benefactive:

She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT]

Eat

eat.01 - consume, consuming

Aliases: eat (v.) eating (n.)

Role Label Role Description
ARG0-PAG consumer, eater
ARG1-PPT meal

Arg0, 1: His [ARG0-PAG] eating [REL] carrots [ARG1-PPT] constantly [ARGM-TMP] has tinted his skin a suspiciously bright orange hue.

Throw

throw.01 - throw, sending through the air, manually, projection of an object through space

Aliases: throw (v.) throwing (n.) throw (n.)

Role Label Role Description
ARG0-PAG thrower
ARG1-PPT thing thrown
ARG2-GOL thrown at, to, over, etc.


See

see.01 - view

Aliases: see (v.) seeing (n.) sight (v.) sight (n.)

Role Label Role Description
ARG0-PAG viewer
ARG1-PPT thing viewed
ARG2-GOL attribute of arg1, further description

sight-n: both args:

The climax is his visit to the dead man 's house and his [ARG0-PAG] sight [REL] of the body [ARG1-PPT].

Create

create.01 - create

Aliases: create (v.) creation (n.)

Role Label Role Description
ARG0-PAG creator
ARG1-PPT thing created
ARG2-VSP materials used
ARG3-GOL benefactive
ARG4-PRD attribute of ARG1

Creation [REL] of a new , realistic U.S. policy [ARG1-PPT]

Attack

attack.01 - to make an attack, criticize strongly

Aliases: attacking (n.) attack (n.) attack (v.)

Role Label Role Description
ARG0-PAG attacker
ARG1-PPT entity attacked
ARG2-PRD attribute

Metaphorical attack, illness:

The new medication has reduced Sally 's [ARG1-PPT] asthma [ARG0-PAG] attacks [REL] .
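The bracketed annotations in the examples above can be read mechanically. The sketch below is a simplification under a stated assumption: each label is taken to apply to all text since the previous label, which is exact for the single-word spans in the "bring" example but over-captures when unlabelled context precedes a span. The `parse_annotation` helper and the regular expression are illustrative, not part of PropBank tooling.

```python
import re

# Some text, then a label in square brackets such as [ARG0-PAG] or [REL].
LABELLED_SPAN = re.compile(r"\s*(.*?)\s*\[([A-Z0-9-]+)\]")


def parse_annotation(text):
    """Return (span, label) pairs; text after the last label is dropped."""
    return [(span, label) for span, label in LABELLED_SPAN.findall(text)]


example = "She [ARG0-PAG] brought [REL] them [ARG2-GOL] shame [ARG1-PPT]"
pairs = parse_annotation(example)
print(pairs)
```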

Potential New Properties

a) has semantic role

As we mentioned above, few of the existing event/action classes have fully specified semantic roles. For example, "creation (Q11398090)" does not have a statement for the "created". We used "has characteristic (P1552)" - a very generic property, with an "object has role (P3831)" qualifier to indicate the role function:

creation (Q11398090) | has characteristic (P1552) | artificial object (Q3619132) | object has role (P3831) | patient (Q170212)

It seems desirable to have a more specific property, "P_has_semantic_role", for this purpose. It would be a sub-property of "has characteristic (P1552)" and used when no existing property such as "practiced by (P3095)" could be found to indicate a semantic role. We would still use the qualifier to indicate the role function.

The current version of this property proposal can be found here: [[1]]


b) has selectional preference

We also mentioned that Wikidata does not have an existing property to specify selectional preferences and that we had to resort to a common substitution by using a combination of "has characteristic (P1552)" with an "object has role (P3831)" qualifier:

eater (Q20984678) | has characteristic (P1552) | organism (Q7239) | object has role (P3831) | selectional preference (Q124051768)

Here again, it seems desirable to have a more specific property, "P_has_selectional_preference". With this dedicated property, there will be no need for the "object has role (P3831)" qualifier.

--Anatole Gershman (talk) 15:46, 24 June 2024 (UTC)

Statistics

(to be filled in later)

Queries

(to be filled in later)

Current tasks

(to be filled in later)

Participants


The participants listed below can be notified using the following template in discussions:
{{Ping project|Events and Role Frames}}
