Wikidata:Requests for permissions/Bot/CanaryBot 3
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved --Lymantria (talk) 08:40, 26 May 2018 (UTC)
CanaryBot 3
Operator: Ivanhercaz
Task/s: Remove trailing full stops/periods from item descriptions
Code: Being developed. Old code is available as an IPython Notebook in PAWS.
Function details: Help:Description states that "Descriptions are not full sentences, and should not end in periods/full stops", so full stops/periods at the end of descriptions should be removed. This script removes the full stop/period at the end of a description. It does not affect any other punctuation; it only removes the last full stop/period in a description.
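A minimal sketch of the behaviour described above, not CanaryBot's actual code. The function name is hypothetical, and the decision to leave descriptions ending in two or more dots untouched (so that an ellipsis is not shortened) is an assumption of this sketch, not something stated in the task.

```python
def strip_trailing_full_stop(description: str) -> str:
    """Return the description without its final full stop, if it has one.

    Hypothetical helper for illustration only.
    """
    text = description.rstrip()
    # Assumption: "..." is an ellipsis, not a sentence end, so skip it.
    if text.endswith(".") and not text.endswith(".."):
        return text[:-1]
    # No single trailing full stop: return the input unchanged.
    return description


print(strip_trailing_full_stop("Spanish painter."))
print(strip_trailing_full_stop("and so on..."))
```

Other punctuation, such as a trailing question mark, is returned unchanged. Abbreviations that legitimately end in a dot are not detected by this sketch.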
I would need to make some test edits on Wikidata to confirm that the script works and has no errors. I can't test on test.wikidata because there is no Wikidata Query Service for it, and one does not seem likely to be implemented soon.
Thanks in advance for your attention!
Regards, Ivanhercaz (Talk) 16:08, 13 May 2018 (UTC)
Comments about the task after the discussion below: Following the discussion below, I want to specify here, in the function details, some changes to the task:
- I am going to remove full stops/periods only in languages for which I have confirmed that removing the full stop from descriptions is correct.
- If any user would like me to run this task in another language, they only need to send me a message confirming that it is correct to remove full stops from descriptions in the language they are interested in.
- I will try to develop a log system that reports the languages in which the bot finds a full stop in a description but is not authorized to work. Then, after some reports, I could ask native speakers about it in the project chats of the respective languages.
- At this moment, the languages in which I am planning to work are Spanish, English and Italian.
Regards, Ivanhercaz (Talk) 17:30, 14 May 2018 (UTC)
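The language restriction and the planned log could look roughly like the sketch below. The names `AUTHORIZED_LANGUAGES` and `plan_edits` are hypothetical, and the input is assumed to be a plain mapping from language code to description rather than any particular bot framework's data structure.

```python
from collections import Counter

# Languages confirmed so far: Spanish, English and Italian.
AUTHORIZED_LANGUAGES = {"es", "en", "it"}


def plan_edits(descriptions, skipped=None):
    """Split {language: description} into planned edits and a skip tally.

    Hypothetical helper: returns the new descriptions for authorized
    languages, plus a Counter of languages that had a trailing full stop
    but are not authorized.
    """
    skipped = Counter() if skipped is None else skipped
    edits = {}
    for lang, text in descriptions.items():
        if not text.endswith("."):
            continue  # nothing to do for this language
        if lang in AUTHORIZED_LANGUAGES:
            edits[lang] = text[:-1]
        else:
            skipped[lang] += 1  # tally only; the description is left untouched
    return edits, skipped


edits, skipped = plan_edits(
    {"es": "pintor español.", "pl": "polski malarz.", "en": "Spanish painter"}
)
print(edits, dict(skipped))
```

Passing the same `Counter` into repeated calls accumulates a per-language count that could then be reported to the relevant project chats.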
- That rule is for English; it may not be valid for all languages. --ValterVB (talk) 17:21, 13 May 2018 (UTC)
- No objections if this is applied to English only. In some languages there is a (slight) possibility that this mark is used for a purpose other than ending a sentence. Wostr (talk) 22:19, 13 May 2018 (UTC)
- @ValterVB, Wostr: Yes, you are both right. I thought that this rule applied to all descriptions; now I know it holds for English. But it also holds for Spanish, and probably for more languages. For Spanish I can point to the examples in Help:Description/es, where the third point ends with "debe evitarse añadir el punto final" (in English, "adding the final full stop should be avoided").
- If a user of another language thinks this task could be useful for descriptions in their language, we can discuss it and I could run the bot for the suggested language too. Obviously, the user would have to explain why descriptions in their language should not end with a full stop.
- Thank you for your comments!
- Regards, Ivanhercaz (Talk) 23:21, 13 May 2018 (UTC)
- For Italian it is OK to delete them. --ValterVB (talk) 05:55, 14 May 2018 (UTC)
- @ValterVB, Wostr: I added some details about this task at the top of the discussion, below the "Function details" point. Wostr, since you are in the discussion, do you know whether full stops at the end of Polish descriptions are correct or incorrect?
- Thank you to both of you for your comments.
- Regards, Ivanhercaz (Talk) 17:34, 14 May 2018 (UTC)
- My main doubt is about German and non-Latin languages. One last thing: can you also check the ellipsis (Q32518)? --ValterVB (talk) 17:58, 14 May 2018 (UTC)
- I think most Polish descriptions with a full stop at the end should be corrected, but I can think of some situations in which the final full stop should not be removed, e.g. after a year («w 1999 r.», in 1999) or after the name of a company («spółka zależna PBG S.A.», subsidiary company of PBG S.A.; S.A. = spółka akcyjna, joint-stock company). So there may be many situations in which an acronym or abbreviation is the final word of a description, and many Polish acronyms and abbreviations use dots. There may also be other situations, beyond acronyms and abbreviations, in which full stops are correct, but none come to mind right now.
- However, I think in most cases it can be manually corrected to a form without dots, e.g. «w 1999 r.» to «w 1999 roku» (or maybe «w 1999»). So in languages like Polish the approach could be different: (1) generate a list of a few hundred descriptions with full stops, (2) have someone fluent in the language check this list, (3) mark the descriptions that should not be corrected, (4) remove the full stops from everything else on the list. That is, however, quite a different task from yours and a bit harder (not only technically, but also because it requires cooperation with someone fluent in the language). Wostr (talk) 18:49, 14 May 2018 (UTC)
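As a rough illustration of steps (1) to (3), a script could pre-sort the dot-ended descriptions so that a fluent reviewer sees the likely abbreviations first. The two regular expressions below are hypothetical heuristics chosen to match the examples given here («r.» and «S.A.»); they are assumptions of this sketch and are not a complete rule for Polish or any other language.

```python
import re

# Hypothetical heuristics, not a complete rule for any language:
# a dotted acronym such as "S.A." at the very end of the description...
DOTTED_ACRONYM = re.compile(r"(?:\w\.){2,}$")
# ...or a final token of one or two characters such as "r.".
SHORT_FINAL_TOKEN = re.compile(r"(?:^|\s)\w{1,2}\.$")


def needs_human_review(description: str) -> bool:
    """Guess whether the final dot might belong to an abbreviation."""
    text = description.rstrip()
    return bool(DOTTED_ACRONYM.search(text) or SHORT_FINAL_TOKEN.search(text))


def triage(descriptions):
    """Split dot-ended descriptions into (to_review, to_fix) lists."""
    to_review, to_fix = [], []
    for text in descriptions:
        if not text.rstrip().endswith("."):
            continue  # no trailing full stop, so nothing to triage
        (to_review if needs_human_review(text) else to_fix).append(text)
    return to_review, to_fix


to_review, to_fix = triage(
    ["w 1999 r.", "spółka zależna PBG S.A.", "polski malarz.", "polski malarz"]
)
print(to_review)
print(to_fix)
```

The heuristics err on the side of sending too much to review; the `to_fix` list would still need the spot check by a fluent speaker described in step (2).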
- I honestly think that automating a task with such a long, probably endless, list of exceptions isn't a good idea. By approving every change manually (that is, simply typing Y or N, or pressing a button), we would make sure these mistakes aren't made; besides, the operator would save a lot of programming time and all the later checks and reversions. Above all, thanks for your help, Ivanhercaz. --abián 10:46, 15 May 2018 (UTC)
- @Wostr: First of all, thank you very much for your analysis and your help. Yes, this morning, while checking the results of the SPARQL query, I noticed the problem with abbreviations and other similar issues. Your idea is great, Wostr, but I admit that it requires time that I don't have at the moment. I have thought about doing this task semi-automatically, approving each change manually, as abián suggested in the previous comment. Running the task fully automatically right now could be problematic for the reasons both of you gave. So I am thinking about offering two options for each description: 1. the full stop is wrong: remove it and change the description; 2. the full stop is correct: add the description to a list of descriptions to review. Then, with the list generated by option 2, I could build an exception list with regular expressions, with the help of native speakers of the language under review.
- What do you think about it?
- Regards, Ivanhercaz (Talk) 12:49, 15 May 2018 (UTC)
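The two-option workflow proposed above could be sketched as follows. This is only an illustration under stated assumptions: `review` and `ask_on_console` are hypothetical names, the decision is a simple y/n answer, and no edit is actually sent to Wikidata.

```python
def ask_on_console(text):
    """Default decision source: prompt the operator on the terminal."""
    return input(f"Remove the full stop from {text!r}? [y/n] ")


def review(descriptions, decide=ask_on_console):
    """Collect a manual decision for each dot-ended description.

    Returns (changes, review_list): `changes` maps an old description to
    its new text (option 1), and `review_list` holds descriptions whose
    full stop was judged correct (option 2).
    """
    changes, review_list = {}, []
    for text in descriptions:
        if not text.endswith("."):
            continue  # only dot-ended descriptions need a decision
        if decide(text).strip().lower().startswith("y"):
            changes[text] = text[:-1]
        else:
            review_list.append(text)
    return changes, review_list


# Scripted answers stand in for the operator so the example runs unattended.
answers = {"pintor español.": "y", "empresa filial de PBG S.A.": "n"}
changes, review_list = review(list(answers), decide=answers.get)
print(changes)
print(review_list)
```

Keeping the decision function as a parameter means the same loop works with a terminal prompt, a button in a notebook, or a later exception list built from the reviewed descriptions.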
- Sorry for the delayed answer. Yes, this may work, and it would be a very good thing to check the dot-ended descriptions in such a semi-automatic way. Wostr (talk) 19:41, 20 May 2018 (UTC)
- Don't worry, Wostr, I am still working on the code for this task. Anyone can check the code. So, Wostr, in the next improvements to the code I will add the variables needed to work with Polish too.
- I will try to produce a CSV file with the descriptions whose full stops the bot is not going to remove. Then I would publish the files so that we can work with native speakers and be sure of what to do in each case.
- Regards, Ivanhercaz (Talk) 19:48, 20 May 2018 (UTC)
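A CSV of the skipped descriptions could be produced with Python's standard `csv` module, as in the sketch below. The column names and the `write_review_csv` helper are assumptions of this sketch, and "Q0" is a placeholder, not a real item ID.

```python
import csv
import io


def write_review_csv(rows, handle):
    """Write (item ID, language, description) rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(["item", "language", "description"])
    writer.writerows(rows)


# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_review_csv([("Q0", "pl", "w 1999 r.")], buffer)
print(buffer.getvalue())
```

To write a real file instead, the handle could come from `open(path, "w", newline="", encoding="utf-8")`, which is the form the `csv` documentation recommends so that line endings are not doubled.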