User:Research Bot/issues


This page tracks issues that have come up with edits made by User:Research Bot.

Please help describe and resolve existing issues. Use Template:Curate issue to add new ones.


Duplicates

duplicates at source

label duplicates at source
samples Strategies used in the clinical trials of gene therapy for cancer (Q38563811) and Strategies used in the clinical trials of gene therapy for cancer (Q41924800)
description both items are apparently about the same publication, but the latter carries a non-working PubMed publication ID (P698)
to do build queries to identify all of them, merge items
query if same journal only
impact anecdotal
status open


duplicate

label duplicate
samples
description two or more items are created for the same article
to do normalize DOIs? make the DOI check case-insensitive? (phab:..), merge duplicates
query
impact high for impacted items, overall: TBD
status open
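A case-insensitive DOI check could be sketched as follows. This is a minimal illustration, not the bot's actual logic: the helper names `normalize_doi` and `find_doi_duplicates`, the list of resolver prefixes, and the choice of upper case as the comparison form are all assumptions made for this example.

```python
from collections import defaultdict

# Resolver prefixes that sometimes precede a bare DOI (assumed list, not exhaustive).
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: str) -> str:
    """Return a comparison key for a DOI: trimmed, prefix-free, upper case."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # DOIs are case-insensitive, so one fixed case is enough for comparison.
    return doi.upper()


def find_doi_duplicates(items):
    """Group item IDs by normalized DOI and keep groups with two or more items.

    `items` is an iterable of (item_id, doi) pairs.
    """
    groups = defaultdict(list)
    for item_id, doi in items:
        groups[normalize_doi(doi)].append(item_id)
    return {doi: ids for doi, ids in groups.items() if len(ids) > 1}
```

Each group returned is only a merge candidate; the items still need to be compared before merging.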


Missing statements

non-English article

label non-English article
samples
description only the English title is imported; apart from the enclosing brackets, it is presented as if it were the original title
to do Add Q66724591, identify language in language of work or name (P407), add actual title with preferred rank, remove brackets from English label. See also: Bot requests#Add original title of scientific articles
query
impact low/medium
status open


mostly blank item

label mostly blank item
samples Q28681089, Q28705231, Q28661357
description possibly metadata missing at source or not correctly imported
to do complete
query [1], maybe all of: [2]
impact low
status open


no "published in" statement

label no "published in" statement
samples Q24799245
description published in (P1433) is usually missing because the item for the publication venue did not exist when the item for the article was created. At Q24799245, it seems to be due to a mismatch between the journal's official title and the title used in the source
to do
query Special:Search/haswbstatement:P31=Q13442814 -linksto:P:P1433
impact TBD
status open


no author (closed)

label no author
samples
description seems to happen if there is explicitly no author listed or for corporate authors
to do
query
impact low
status closed (import not planned)


Description

no English description

label no English description
samples Wikipedia: An Info-Communist Manifesto (Q66711338) (by another bot, had description)
description item lacks a description such as "article published <date>" which can lead to its confusion with items for non-articles
to do complete description, bot request at Wikidata:Bot_requests#Add_description_to_items_about_articles
query Special:Search/haswbstatement:P31=Q13442814 -article
impact medium/high for items with short labels/titles, low for others
status open


Character set

mojibake

label mojibake
samples
description
to do fix import, repair items; Krbot fixes some of them in author name strings
query
impact high for affected items, tbd overall
status open
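The entry above does not say which kind of mojibake occurs. Assuming the common case where UTF-8 bytes were decoded as Latin-1, a repair attempt could look like the sketch below. The function name `repair_mojibake` is hypothetical, and strings damaged in other ways are returned unchanged.

```python
def repair_mojibake(text: str) -> str:
    """Try to undo 'UTF-8 read as Latin-1' damage; return the input if that fails."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of mojibake (or already correct): leave the text alone.
        return text
```

A correctly encoded name containing non-ASCII characters normally fails the UTF-8 decode step and passes through untouched, but results should still be reviewed before items are repaired in bulk.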


P31

P31 value (closed)

label P31 value
samples Q64665886
description use of scientific publication (Q591041) instead of scholarly article (Q13442814)
to do replace
query [3]
impact high for impacted item, low overall
status closed (edits by another bot)


use of scholarly article (Q13442814) in P31

label use of scholarly article (Q13442814) in P31
samples Q21563418, Q56162962
description scholarly article (Q13442814) is used for every type of article from scholarly publications. This is correct for most articles, but not all.
to do see discussion at Wikidata:Project_chat#Use_of_scholarly_article_(Q13442814). Possible solution: use more general article (Q191067)
query
impact low
status


Label and title

English label includes a final "." (closed)

label English label includes a final "."
samples
description the bot imports a dot added by the source it's using, generally not present in the original
to do TBD/add original title
query
impact low
status closed (considered an editorial decision)


title statement includes a final "." (closed)

label title statement includes a final "."
samples
description the bot imports a dot added by the source it's using, generally not present in the original
to do TBD/add original title
query
impact low
status closed (considered an editorial decision)


label is in CAPS (closed)

label label is in CAPS
samples Q66615545
description label is added in capital letters
to do normalize label per Help:Label
query
impact high
status closed (edits by another bot)
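Labels of this kind can be found with a simple check. The sketch below is an assumption-laden illustration: the function name `is_all_caps` and the `min_letters` threshold (used to avoid flagging short acronym-only labels) are made up for this example.

```python
def is_all_caps(label: str, min_letters: int = 10) -> bool:
    """Flag labels written entirely in capitals.

    Labels with fewer than `min_letters` letters are skipped so that short
    acronyms are not reported.
    """
    letter_count = sum(1 for ch in label if ch.isalpha())
    # str.isupper() is True when every cased character is upper case.
    return letter_count >= min_letters and label.isupper()
```

The check only finds candidates. Restoring the right capitalization (proper nouns, gene names, acronyms inside the title) still needs the rules in Help:Label.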


title is in CAPS (closed)

label title is in CAPS
samples Q66615545
description
to do normalize title
query
impact medium
status closed (edits by another bot)


English label is copied to other languages (closed)

label English label is copied to other languages
samples
description label in English (translation or original title of article) is copied to other languages, e.g. nl
to do TBD/delete?
query
impact low/medium for these languages as it requires additional maintenance there
status closed (edits by another bot)


title/label includes HTML tag

label title/label includes HTML tag
samples Q28937634
description labels include HTML tags such as <i> or <br>
to do remove HTML tags from the label and title, and add the markup version to the title statement with the qualifier title in HTML (P6833)
query see constraint reports on P1476
impact low overall, medium for items with the issue
status open
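Stripping the tags while keeping the text could be sketched with Python's standard `html.parser` module. The class `_TextExtractor` and the function `strip_html` are hypothetical names for this example; the original string would be kept for the title in HTML (P6833) qualifier.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text content and ignore tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # A line break separates words, so replace it with a space.
        if tag == "br":
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(title: str) -> str:
    """Return the title without HTML tags and with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(title)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

A title is affected when `strip_html(title)` differs from the title itself after whitespace is collapsed.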


title/label includes "formula: see text"

label title/label includes "formula: see text"
samples Q35209922
description as the formula is complicated to reproduce, the label/title merely includes "formula: see text"
to do add qualifier title in LaTeX (P6835) to the title with complete title in LaTeX markup
query see constraint reports on P1476
impact low
status open


title/label include $$ or $ and TeX markup

label title/label include $$ or $ and TeX markup
samples Q29396284, Q30039935 (by another user)
description
to do add qualifier title in LaTeX (P6835) to the title with complete title in LaTeX markup, normalize title/label. Automated conversion may be possible
query see constraint reports on P1476
impact low overall, medium for items with the issue
status open
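Finding affected titles could start from a pattern match on the dollar delimiters. The sketch below only detects candidate spans and does not convert them; the names `TEX_SPAN` and `find_tex_spans` are assumptions for this example, and titles that mention two dollar amounts would be false positives.

```python
import re

# Either $$...$$ (display math) or $...$ (inline math); display form is tried first.
TEX_SPAN = re.compile(r"\$\$(.+?)\$\$|\$([^$]+)\$")


def find_tex_spans(title: str) -> list:
    """Return the contents of all $-delimited spans found in a title."""
    return [m.group(1) or m.group(2) for m in TEX_SPAN.finditer(title)]
```

An empty result means the title has no dollar-delimited markup; anything else is worth a manual look before a title in LaTeX (P6835) qualifier is added.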


missing title element

label missing title element
samples A Novel Monoallelic Nonsense Mutation in the NFKB2 Gene Does Not Cause a Clinical Manifestation (Q64100401)
description part of the title/label is missing
to do complete title
query
impact high for impacted items, overall: TBD
status open


Other

author name string conversion

label author name string conversion
samples
description author name string (P2093) is meant to be a temporary solution. Eventually these should be converted to author (P50). Given their sheer number, it's not easy to find which strings are most in need of conversion. Papers with hundreds of authors aren't easy to edit in Wikidata
to do some tools are available to resolve them, e.g. /author-disambiguator/ or an identifier-based one. A report of the most frequent strings could be interesting (e.g. as provided by Scholia's /missing pages, which exist, e.g., for topics, venues, organizations, authors, awards). A tool starting from a given paper could be helpful too.
query
impact TBD
status open
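A report of the most frequent strings could be sketched as below, assuming the author name string (P2093) values have already been exported into a plain list. The function name `most_frequent_name_strings` and the whitespace normalization are assumptions for this example.

```python
from collections import Counter


def most_frequent_name_strings(name_strings, top=10):
    """Return the `top` most common author name strings with their counts."""
    # Collapse runs of whitespace so trivially different strings count together.
    normalized = (" ".join(s.split()) for s in name_strings)
    return Counter(s for s in normalized if s).most_common(top)
```

Identical strings can belong to different people, so a high count points to where conversion work would pay off most, not to a single author.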


coverage

label coverage
samples n/a
description it's unclear which parts are imported or skipped
to do
query TBD, check by id?
impact TBD
status open


recent import

label recent import
samples n/a
description recent articles might not be imported, leaving the corpus out of date
to do request made at Wikidata:Bot_requests#weekly_import_of_new_articles
query
impact medium
status open

