Wikidata:WikiProject Reference Verification
This is a research and development project that helps Wikidata editors check the quality of external references using AI/ML models. The method is based on the approach described in the academic paper ProVe[1].
We are developing a backend that deploys the AI/ML models and serves as an inference server. On top of it, we aim to launch practical tools such as a Wikidata gadget, a dashboard, and a bot-maintained worklist update page.
Info: You can use the AutoEdit tool to quickly add labels and descriptions for WikiProject Reference Verification in many languages.
Introduction
Motivation
Wikidata is a repository of information that gathers data from many different sources and topics. It stores this data as semantic triples, which are used in various important applications on the modern web, including Wikipedia infoboxes and search engines.
Wikidata mainly serves as a secondary source of information. To be trustworthy and useful, it needs well-documented and verifiable sources for its data. However, checking the quality of these sources, especially whether they actually support the information in Wikidata, has mostly been done manually, and that manual process does not scale as Wikidata grows.
ProVe aims to solve this problem. It's an automated system that checks whether a piece of information (a triple) in Wikidata is supported by the text from its listed source. This approach can help ensure the quality of Wikidata's content more efficiently as it continues to expand.
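To make the idea concrete, the flow can be summarised in three steps: verbalize the triple into a plain-language sentence, fetch the text of the cited reference, and test whether that text entails the sentence. Below is a minimal, illustrative sketch of such a flow; the naive template verbalizer and the NLI model named in it are assumptions for illustration, not the actual components of the ProVe pipeline.

```python
# Illustrative sketch of a ProVe-style check (not the actual pipeline code).
# Assumptions: a naive template verbalizer and an off-the-shelf NLI model;
# the model name below is an example choice, not the one ProVe uses.
from transformers import pipeline

nli = pipeline("text-classification", model="facebook/bart-large-mnli")

def verbalize(subject: str, predicate: str, obj: str) -> str:
    """Turn a Wikidata triple into a plain-English claim sentence."""
    return f"{subject} {predicate} {obj}."

def verify(triple: tuple, reference_text: str) -> str:
    """Check whether the reference text entails the verbalized claim."""
    claim = verbalize(*triple)
    result = nli([{"text": reference_text, "text_pair": claim}])[0]
    # Map NLI labels onto the SUPPORTS / REFUTES / NOT ENOUGH INFO verdicts.
    return {"entailment": "SUPPORTS",
            "contradiction": "REFUTES"}.get(result["label"].lower(),
                                            "NOT ENOUGH INFO")

print(verify(("Douglas Adams", "was educated at", "St John's College"),
             "Adams studied at St John's College, Cambridge."))  # SUPPORTS
```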
Challenges
Implementing ProVe to help Wikidata editors poses several challenges:
- How to best support Wikidata editors' workflow based on ProVe results
- How to design system architecture for ProVe to support AI/ML inference and integrate with Wikidata tools using Toolforge and gadgets
- What is the most effective method to present ProVe results for reusability
- How to handle claims or triples that lack references
Worklists (Under Development)
- These are worklists of priority items whose references need checking, for example with ProVe. They are based on the number of incoming and outgoing links of each item and on their relevance to various use cases.
- https://kclwqt.sites.er.kcl.ac.uk/page/worklist/generationBasics
This Wikidata Item Verification Table ranks Wikidata items (e.g., countries, concepts, or entities) by how well their claims are supported by external references. Each item has several associated claims, and the table shows how many of those claims are supported, refuted, or lack sufficient information, together with the number of external sites connected to the item. Ranking combines two signals: items with more supported claims rank higher, while many refuted or unverifiable claims push an item down; the Number of Connected Sites also counts, since items with more connections generally have more sources available for verification. Here is an example of how the table might look with different Wikidata items and their verification results (a hypothetical scoring sketch follows it):
Wikidata Item (QID) | Not Enough Info | Refutes | Supports | Number of Connected Sites
---|---|---|---|---
United States of America | 429 | 42 | 177 | 406
Turkey | 124 | 39 | 13 | 402
Japan | 243 | 12 | 9 | 401
Russia | 152 | 21 | 8 | 401
Italy | 152 | 34 | 9 | 392
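The exact ranking formula is not published on this page; reading the description together with the example rows suggests ordering primarily by connected sites and secondarily by supported claims. The sketch below is a hypothetical implementation of that reading, not the project's actual code.

```python
# Hypothetical ranking sketch for the item worklist above. The key order
# (connected sites first, then supported claims) is inferred from the
# example rows; the project's real formula may differ.
from dataclasses import dataclass

@dataclass
class ItemStats:
    label: str
    not_enough_info: int
    refutes: int
    supports: int
    connected_sites: int

def rank(items: list) -> list:
    # More connected sites means more sources available for verification;
    # among similarly connected items, more supported claims rank higher.
    return sorted(items,
                  key=lambda i: (i.connected_sites, i.supports),
                  reverse=True)

worklist = rank([
    ItemStats("Japan", 243, 12, 9, 401),
    ItemStats("United States of America", 429, 42, 177, 406),
    ItemStats("Russia", 152, 21, 8, 401),
])
print([i.label for i in worklist])
# ['United States of America', 'Japan', 'Russia'] (matches the table order)
```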
- https://kclwqt.sites.er.kcl.ac.uk/page/worklist/pagePileList
This Wikidata Claim Verification Table is designed to verify individual Wikidata claims against external sources using AI/ML models. Each claim is transformed into a natural language sentence (the Final Verbalization) and compared to its external reference to determine whether it is supported or refuted. Claims are ranked by the AI system's confidence in the text entailment result: claims with high SUPPORTS scores rank higher, while claims labelled REFUTES or NOT ENOUGH INFO rank lower; the number and relevance of the sentences extracted from the source also influence the ranking. Here is an example of how the table would look with claims and their verification details (a sketch of the aggregation step follows it):
Final Verbalization | Source URL | Extracted Sentences (NLP) | Sentence Scores (NLP) | Top Matching Sentence | Evidence TE Probabilities | Final Decision (Claim Label) | Wikidata Item (QID)
---|---|---|---|---|---|---|---
The semicircular canal is described by source in Gray's Anatomy (20th edition). | Gray's Anatomy Reference | ['The osseous labyrinth consists of three parts: the vestibule, semicircular canals, and cochlea.', 'Another sentence from the text...'] | [0.2726, 0.1543] | The osseous labyrinth consists of three parts: the vestibule, semicircular canals, and cochlea. | [0.85 SUPPORTS, 0.10 REFUTES, 0.05 NOT ENOUGH INFO] | NOT ENOUGH INFO | Semicircular canal |
The vestibular system is studied in the field of audiology. | British Academy of Audiology | ['Audiology professionals are involved in helping to diagnose problems with the vestibular system.', 'Another sentence from the text...'] | [0.3424, 0.1678] | Audiology professionals are involved in helping to diagnose problems with the vestibular system. | [0.88 SUPPORTS, 0.07 REFUTES, 0.05 NOT ENOUGH INFO] | SUPPORTS | Vestibular system |
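To connect the columns: each extracted sentence receives a relevance score, the top-scoring sentence supplies the entailment (TE) probabilities, and those are aggregated into the final claim label. A self-contained sketch of that aggregation step follows; the relevance threshold in it is a made-up value chosen only so the sketch reproduces the two example rows (note that the first row ends up NOT ENOUGH INFO despite a high SUPPORTS probability, because its sentence scores are low), and it is not ProVe's real rule.

```python
# Sketch of the aggregation behind the table columns: the top-scoring
# extracted sentence supplies the evidence, and the claim label is the
# highest TE probability, unless no sentence is relevant enough. The 0.3
# threshold is a made-up value chosen so the sketch reproduces the two
# example rows; it is not ProVe's real rule.

def label_claim(sentences: list, relevance: list, te_probs: dict,
                min_relevance: float = 0.3) -> tuple:
    """Return (Top Matching Sentence, Final Decision)."""
    top_score, top = max(zip(relevance, sentences))
    if top_score < min_relevance:
        # Weak evidence: no extracted sentence is relevant to the claim.
        return top, "NOT ENOUGH INFO"
    return top, max(te_probs, key=te_probs.get)

# First example row: high SUPPORTS probability, but low sentence scores.
print(label_claim(
    ["The osseous labyrinth consists of three parts: the vestibule, "
     "semicircular canals, and cochlea.",
     "Another sentence from the text..."],
    relevance=[0.2726, 0.1543],
    te_probs={"SUPPORTS": 0.85, "REFUTES": 0.10, "NOT ENOUGH INFO": 0.05},
))  # (..., 'NOT ENOUGH INFO')
```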
- We plan to publish a set of RDF triples derived from ProVe results.
- This will allow Wikidata editors and users to query, via SPARQL, the subset of Wikidata items whose ProVe results need attention (a hypothetical query sketch follows this list).
- Worklist pages
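Once these triples are published, such a worklist could be retrieved with a query along the following lines. The endpoint URL and the prove: vocabulary below are placeholders, since the actual RDF schema has not been released yet.

```python
# Hypothetical SPARQL lookup of published ProVe results. The endpoint URL
# and the prove: predicates are placeholders; the real vocabulary is TBD.
import requests

ENDPOINT = "https://example.org/prove/sparql"  # placeholder endpoint

QUERY = """
PREFIX prove: <https://example.org/prove/ns#>
SELECT ?item ?statement ?verdict WHERE {
  ?statement prove:item ?item ;
             prove:verdict ?verdict .
  FILTER (?verdict IN ("REFUTES", "NOT ENOUGH INFO"))
}
LIMIT 100
"""

resp = requests.get(ENDPOINT, params={"query": QUERY},
                    headers={"Accept": "application/sparql-results+json"})
for row in resp.json()["results"]["bindings"]:
    print(row["item"]["value"], row["verdict"]["value"])
```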
Participants
- Elena Simperl
- Albert Meroño
- Odinaldo Rodrigues
- Miriam Redi
- Yiwen Xing
- Yihang Zhao
- Jongmo Kim
- So9q (talk • contribs • logs)
- salgo60 (talk • contribs • logs)
References
1. Amaral, G., Rodrigues, O., & Simperl, E. (2022). ProVe: A pipeline for automated provenance verification of knowledge graphs against textual sources. Semantic Web (Preprint), 1–34.