Wikidata talk:Item quality campaign

Pilot Campaign Analysis Result

Ping EpochFail, GerardM, Abián, QZanden, BrillLyle, Lymantria, Jsamwrites, Alessandro Piscopo, ChristianKl

Hi! I have finished analyzing the pilot campaign results. You can find the analysis here. I have also proposed some changes to the existing item quality criteria. --Glorian WD (talk) 11:26, 3 April 2017 (UTC)

Full Campaign

Ping EpochFail, GerardM, Abián, QZanden, BrillLyle, Lymantria, Jsamwrites, Alessandro Piscopo, ChristianKl

Hi! I'd like to let you know that the full campaign has been launched. It contains around 5,000 items that have to be labeled. --Glorian WD (talk) 20:02, 10 April 2017 (UTC)

The new campaign is called "Item quality (5k sample)". Glorian and I have worked to make sure that the dataset is as representative of the whole quality scale as possible. In my estimation, rating 10 items (a workset) should take around 5 minutes, so rating the whole set should take us about 42 hours in total. --EpochFail (talk) 20:39, 10 April 2017 (UTC)
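For anyone who wants to check the 42-hour figure, here is a minimal sketch of the arithmetic. It uses only the numbers quoted above (about 5,000 items, 10 items per workset, 5 minutes per workset); the function name is made up for illustration and the result is an estimate, not measured data.

```python
# Figures quoted in the comment above; none of this is measured data.
TOTAL_ITEMS = 5000          # "around 5,000 items"
ITEMS_PER_WORKSET = 10      # one workset
MINUTES_PER_WORKSET = 5     # estimated rating time per workset


def estimated_hours(total_items: int, items_per_workset: int,
                    minutes_per_workset: float) -> float:
    """Return the total rating time in hours, rounding worksets up."""
    worksets = -(-total_items // items_per_workset)  # ceiling division
    return worksets * minutes_per_workset / 60


if __name__ == "__main__":
    hours = estimated_hours(TOTAL_ITEMS, ITEMS_PER_WORKSET, MINUTES_PER_WORKSET)
    print(f"{hours:.1f} hours")
```

That is 500 worksets at 5 minutes each, or roughly 41.7 hours of rating time shared among all labelers.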
Hi Glorian, thanks for letting me (us) know. I have a question: are there any labeling specifications for scientific articles or proteins? I have already encountered them, but I am still in doubt about what label I should give them... Q.Zanden questions? 20:44, 10 April 2017 (UTC)
@QZanden: I rated C because there are no translations at all and no image. How do you do it? Tubezlob (🙋) 16:15, 11 April 2017 (UTC)
@Tubezlob: I rated B because of the many statements and references... Q.Zanden questions? 16:19, 11 April 2017 (UTC)
@QZanden: I agree, I hesitated with B. Tubezlob (🙋) 16:32, 11 April 2017 (UTC)
Hi Glorian, thanks for the information. I just started labelling and I wonder what we have decided for category pages like "Category:Lamborghini (Q6700148): Wikimedia category". Should I label it as 'C'? A similar question arises for list pages. I don't think either of these types of pages has images. Jsamwrites (talk) 11:08, 11 April 2017 (UTC)

We need information on how to label special pages:

  • items edited only by bots (proteins, genes, scientific articles)
  • categories
  • list pages
  • pages of Wikinews
  • pages on Wiktionary, which will follow soon (24 April)

Please add to the list if you find other cases. Tubezlob (🙋) 16:32, 11 April 2017 (UTC)

Hey @QZanden:, @Jsamwrites:, and @Tubezlob:!
I think bio-related items such as proteins are "D" items because they generally lack translations (i.e. they have only one complete set of translations: label and description). As you can see from Wikidata:Item_quality#Grading_scheme, class "D" is usually for items that have only one complete set of translations (label and description) in any language. In addition, the item should contain some relevant statements.

Class "C" is typically given to items which have references (including references to Wikimedia projects like Wikipedia) and more than one complete set of translations (label and description) in any language.

Class "B" is generally given to items which have many external references (i.e. references other than Wikipedia) and some complete sets of translations in the 8 important languages specified on Wikidata:Item_quality#Translations.

Finally, class "A" is usually for items that have external references from different sources (a plurality of external references) and many complete sets of translations (label and description) in the 8 important languages specified on Wikidata:Item_quality#Translations.
Hope this helps! If not, please do ask again :) --Glorian WD (talk) 18:58, 11 April 2017 (UTC)
Glorian WD, if I understand you right: items without a Chinese set of translations are rated as "C" or below? --Succu (talk) 20:30, 11 April 2017 (UTC)
Nope. It falls in class "B" if the other important translations are complete. --Glorian WD (talk) 20:50, 11 April 2017 (UTC)
What's the reason for ignoring other criteria, such as having solid references, and making this a major degrading rule? --Succu (talk) 20:55, 11 April 2017 (UTC)
It falls in class "B", assuming the item has external references in addition to the translations that I have mentioned. --Glorian WD (talk) 20:58, 11 April 2017 (UTC)
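To make the rules of thumb in this thread easier to compare, here is a rough sketch of them as a decision function. It is only an illustration of the informal summary above, not the official grading scheme: the `ItemSummary` fields are hypothetical, and every numeric threshold is an assumption, since the thread only says "some", "many" and "a lot". Class A is assumed to need all 8 important languages, which matches the reply that an item missing one of them falls in class B.

```python
from dataclasses import dataclass
from typing import Optional

# All thresholds are illustrative assumptions; the thread gives no numbers.
MANY_EXTERNAL_REFS = 5      # assumed meaning of "many external references"
DISTINCT_SOURCES_FOR_A = 2  # assumed meaning of "a plurality" of sources
SOME_IMPORTANT_LANGS = 4    # assumed meaning of "some" of the 8 languages
ALL_IMPORTANT_LANGS = 8     # assumed requirement for class A


@dataclass
class ItemSummary:
    """Hypothetical per-item counts; not a real Wikidata API structure."""
    complete_translations: int            # languages with label AND description
    important_complete_translations: int  # same, among the 8 important languages
    statements: int
    references: int                       # all references, incl. Wikimedia projects
    external_references: int              # references other than Wikimedia projects
    distinct_external_sources: int


def rough_grade(item: ItemSummary) -> Optional[str]:
    """Return 'A'..'D' per the summary in this thread, or None if below D."""
    if (item.distinct_external_sources >= DISTINCT_SOURCES_FOR_A
            and item.important_complete_translations >= ALL_IMPORTANT_LANGS):
        return "A"
    if (item.external_references >= MANY_EXTERNAL_REFS
            and item.important_complete_translations >= SOME_IMPORTANT_LANGS):
        return "B"
    if item.references >= 1 and item.complete_translations > 1:
        return "C"
    if item.complete_translations >= 1 and item.statements >= 1:
        return "D"
    return None


# A well-referenced protein item with a single complete translation.
protein = ItemSummary(1, 1, 15, 10, 10, 3)
# An item complete in 7 of the 8 important languages, with external references.
missing_one_language = ItemSummary(7, 7, 20, 10, 6, 3)
print(rough_grade(protein), rough_grade(missing_one_language))
```

Under these assumed thresholds the protein item comes out as "D" despite its many references, which is the case QZanden and Tubezlob were unsure about, and the item missing one important language comes out as "B".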

Hi, great to see that the campaign has already been completed. When will the data be released? I am currently working on a study about Wikidata quality and user group characteristics, and item quality labels would be a suitable measure of quality for our research. Thanks, --Alessandro Piscopo (talk) 15:49, 18 May 2017 (UTC)

Removing Unwanted Pages from Full Campaign

Ping GerardM, Abián, QZanden, BrillLyle, Lymantria, Jsamwrites, Alessandro Piscopo, ChristianKl, Adelheid Heftberger, Mvolz, Tubezlob

Due to a bug that let unwanted pages slip into the full campaign, EpochFail and I decided to launch a new campaign which does not contain unwanted pages (Wikimedia template, Wikimedia disambiguation page, Wikimedia category, and so on). But rest assured, we saved the items that you labeled in the previous campaign (i.e. the ones that you have done today), so they can be used for training the machine learning algorithms after the campaign. We sincerely apologize for this bug. Please let me know about any problems that you encounter! --Glorian WD (talk) 19:25, 11 April 2017 (UTC)
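As an illustration of the kind of filtering described above, here is a minimal sketch that splits a sample into kept and removed items by type. How the campaign sample was actually filtered is not described here, so the `type` field, the `split_sample` helper and the item dictionaries are all hypothetical. The set of types lists only the three named in the announcement; the "and so on" means it is incomplete.

```python
# Types named in the announcement; the list there ends with "and so on",
# so this set is incomplete and only a starting point.
UNWANTED_TYPES = {
    "wikimedia template",
    "wikimedia disambiguation page",
    "wikimedia category",
}


def split_sample(items):
    """Split items into (kept, removed) using a hypothetical 'type' field."""
    kept, removed = [], []
    for item in items:
        if item.get("type", "").strip().lower() in UNWANTED_TYPES:
            removed.append(item)
        else:
            kept.append(item)
    return kept, removed


sample = [
    {"id": "Q6700148", "type": "Wikimedia category"},  # example raised on this page
    {"id": "Q_EXAMPLE", "type": "human"},              # placeholder, not a real ID
]
kept, removed = split_sample(sample)
print([item["id"] for item in removed])
```

Keeping the removed items in a separate list, rather than discarding them, mirrors the point that labels already given in the earlier campaign were saved.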