Wikidata:Property proposal/wikimedia revision identifier

wikimedia revision identifier

Originally proposed at Wikidata:Property proposal/Property metadata

   Not done
Description: identification of Wikimedia page revision
Data type: Number (not available yet)
Domain: Wikimedia page
Allowed values: revision identifier
Example: imported from Wikimedia project (P143) English Wikipedia (Q328) → 1234
Planned use: To specify the exact revision identifier of a Wikimedia page
Formatter URL: https://www.wikidata.org/w/index.php?title=Wikidata:Main_Page&oldid=$1, https://en.wikipedia.org/w/index.php?title=Main_Page&oldid=$1 etc.
See also: wiki Page Revision ID by DBpedia
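
As an illustration of how the formatter URL would be applied, here is a minimal Python sketch that substitutes a revision identifier for the $1 placeholder. The function name expand_formatter_url is hypothetical, and the check that identifiers are positive integers is an assumption about the allowed values rather than something stated in the proposal.

```python
def expand_formatter_url(formatter_url: str, revision_id: int) -> str:
    """Return the formatter URL with $1 replaced by the revision identifier."""
    # Assumption: revision identifiers are positive integers.
    if revision_id <= 0:
        raise ValueError("revision identifier must be a positive integer")
    return formatter_url.replace("$1", str(revision_id))


# Formatter URL taken from the proposal; 1234 is the value from the example row.
EN_WIKIPEDIA = "https://en.wikipedia.org/w/index.php?title=Main_Page&oldid=$1"
print(expand_formatter_url(EN_WIKIPEDIA, 1234))
# prints https://en.wikipedia.org/w/index.php?title=Main_Page&oldid=1234
```

Note that the page title is part of each formatter URL, so in practice a separate pattern (or a title-independent permalink) would be needed per page.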
Motivation

Currently bots make use of imported from Wikimedia project (P143) to specify the source of information. For example, when some property values are extracted from an English Wikipedia page, imported from Wikimedia project (P143) is used as a property along with the item English Wikipedia (Q328). When the source is set manually, the property stated in (P248) is used. Currently we are not tracking the exact revision identifier of the Wikipedia page used as a reference for the (extracted) information. The proposed property can be used as a qualifier to specify the exact identifier. Jsamwrites (talk) 15:33, 27 May 2017 (UTC)

Consider the case when information is extracted from a Wikipedia page and fed to Wikidata. imported from Wikimedia project (P143) can be used to specify the wiki. Let's assume there is conflicting information concerning a property value in two wiki revisions, A and B. A Wikidata editor or bot extracted the value from revision A and updated Wikidata. Meanwhile, Wikipedia editors have updated the property value (article) in revision B. A user coming to Wikidata must be able to conclude that the property value may be stale, since the information came from revision A and Wikipedia has since changed and is currently at revision B. The user can update the Wikidata entry, if needed. Jsamwrites (talk) 17:50, 22 August 2017 (UTC)
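
The revision A versus revision B scenario above could be checked roughly as follows. This is only a sketch under stated assumptions: ImportedClaim, latest_revision_id and may_be_stale are hypothetical names, the response shape is the default JSON layout returned by a MediaWiki action=query&prop=revisions&rvprop=ids request, and the current revision number 1300 is made up for illustration. Fetching the response over the network is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportedClaim:
    """Hypothetical record of a value imported together with its source revision."""
    source_wiki: str          # e.g. "English Wikipedia (Q328)"
    page_title: str
    imported_revision: int    # value of the proposed qualifier
    value: str


def latest_revision_id(api_response: dict) -> int:
    """Read the newest revision id from a query/revisions API response.

    Assumes the default JSON layout, where "pages" is keyed by page id
    and the response describes exactly one page.
    """
    pages = api_response["query"]["pages"]
    page = next(iter(pages.values()))
    return page["revisions"][0]["revid"]


def may_be_stale(claim: ImportedClaim, current_revision: int) -> bool:
    """True if the page has been edited since the value was imported."""
    return current_revision != claim.imported_revision


# Revision A = 1234 (from the proposal's example); revision B = 1300 is invented.
claim = ImportedClaim("English Wikipedia (Q328)", "Main_Page", 1234, "some value")
response = {"query": {"pages": {"15580374": {"revisions": [{"revid": 1300, "parentid": 1299}]}}}}
print(may_be_stale(claim, latest_revision_id(response)))  # prints True
```

A differing revision only signals that the page changed, not that the imported value itself changed, so the result is a prompt for review rather than proof of an error.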

Discussion
  • The problem is that imported from Wikimedia project (P143) doesn't really tell people where a fact comes from. It could come from many different places on a specific Wikipedia. It doesn't allow a reader to check the provenance of data. This means there's no good way for a bot to tell that a fact removed from a Wikipedia should also be deleted from Wikidata. It can also make it hard to go and fix the error in the Wikipedia when we find in Wikidata that a value we imported was wrong. ChristianKl (talk) 20:39, 28 May 2017 (UTC)
    I believe that is Multichill's point. It is a pointer that we utilised at the time of the creation of the data; it is an artefact of the process at that time, not the means we should be using now. We can and should do better. billinghurst sDrewth 02:44, 29 May 2017 (UTC)
We still import new data from Wikipedia into Wikidata. Of course, it's better to enter data by hand with proper references, but in many cases the work of entering data by hand is a completely different project than a quick import. If a person wants to copy the proper citation from EnWiki, they might not know which EnWiki page holds the citation information. ChristianKl (talk) 11:37, 30 May 2017 (UTC)
I agree with ChristianKl. There are a lot of tools currently available that make use of imported from Wikimedia project (P143), and sadly they are not tracking the exact version of the Wikipedia page. Thereby, we lose the exact source of information. Even considering Multichill's comment that imported from Wikimedia project (P143) is a short-term solution, we are still losing track of a very important piece of information regarding the correct permalink page. However, it's not a vague pointer, as pointed out by billinghurst. Revision identifiers are commonly used even in source code versioning systems to point to a particular commit. Luckily all Wikimedia properties give us the revision identifiers, and it is left to us to make use of them while developing (import) tools. Jsamwrites (talk) 18:01, 30 May 2017 (UTC)
It can be used similarly to "pages". It is not redundant if one wants to specify the exact version.
The URL formatter could be used to "compare" what was changed.
Bots would appreciate this property even more than humans because Wikipedia/Wikidata dumps are periodic.
There is another property, Wikimedia database name (P1800), which uses Wikimedia in its name (English). In this way, we are tracking all Wikimedia revision numbers. Jsamwrites (talk) 17:20, 7 June 2017 (UTC)
Hi d1g, I changed the datatype to Integer.