User talk:AdrianoRutz



Flor WMCH (talkcontribs)

Hello,

I work for Wikimedia CH, where I am in charge of supporting the French-speaking community. I noticed from the Wikidata MOOC map that you are in Switzerland, so I am getting in touch to invite you to take part in local events.

For your information, we have a calendar here on Meta listing events that take place all over Switzerland. On 27 April 2024 we are organising a get-together in Bern for discussion and meeting people around the Wikimedia projects (more information here on our website).

Looking forward to meeting you!

AdrianoRutz (talkcontribs)

Hello,

Thank you very much for getting in touch; glad to see the MOOC map was useful! I am not yet sure I can attend on the 27th, but I will sign up if I can.

See you soon, perhaps,

AdrianoRutz (talkcontribs)
Reply to "Wikimedia events in Switzerland"

'Group of stereoisomers' batch from Nov 7th '23

Wostr (talkcontribs)
AdrianoRutz (talkcontribs)
Wostr (talkcontribs)

Hi, I have to write to you because this is the third time I have had to revert some of your semi-automatic edits (QS) regarding:

  1. UDYKDZHZAKSYCO-CIBWSTISSA-N (Q105270642)
  2. JWYXXCZNFFQYGQ-UOMKELITSA-N (Q105136469)
  3. UDYKDZHZAKSYCO-LSSWMSPRSA-N (Q5049581)

I understand the nature of these edits; however, we have to think of some process to prevent such changes. For items #1 and #2, the names from PubChem are invalid. In the first case we have kastalagina (Q104387655), and UDYKDZHZAKSYCO-CIBWSTISSA-N (Q105270642) is its stereoisomer. As I don't see any other valid name for it, I decided to use the InChIKey as the label. For the second item (JWYXXCZNFFQYGQ-UOMKELITSA-N (Q105136469)) we have prymnezyna 1 (Q7253248).

In the case of UDYKDZHZAKSYCO-LSSWMSPRSA-N (Q5049581), I think that an InChIKey as a temporary name is much better than a PubChem CID (CID 3002104).

I tried to sum up this problem on a WikiProject subpage: Wikidata:WikiProject Chemistry/Names exceeding character limit. However, it didn't receive any interest on the WikiProject discussion page, so it's merely a proposal for how to deal with this issue.

I'd suggest that any change of the en:label (in the future probably the mul:label) from an InChIKey to another label should be done manually or reported somewhere (Wikidata talk:WikiProject Chemistry/Names exceeding character limit?) for manual checking.
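One way to apply this suggestion in a semi-automatic workflow is to filter the batch before it is submitted. The following is only a minimal sketch: the function names and the tuple layout are hypothetical, and it assumes the current label of each item has already been fetched. It relies only on the standard InChIKey shape (blocks of 14, 10, and 1 uppercase letters separated by hyphens).

```python
import re

# Standard InChIKey shape: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_RE = re.compile(r"[A-Z]{14}-[A-Z]{10}-[A-Z]")


def needs_manual_review(current_label: str) -> bool:
    """Return True when the existing label is an InChIKey placeholder."""
    return INCHIKEY_RE.fullmatch(current_label.strip()) is not None


def split_label_updates(updates):
    """Split (qid, current_label, proposed_label) tuples into two lists.

    Items whose current label is an InChIKey go to the manual list,
    everything else stays in the automatic list.
    """
    automatic, manual = [], []
    for qid, current, proposed in updates:
        target = manual if needs_manual_review(current) else automatic
        target.append((qid, proposed))
    return automatic, manual
```

The manual list could then be posted on the talk page mentioned above instead of being sent to QuickStatements.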

AdrianoRutz (talkcontribs)

Hi @Wostr,

Thank you for getting in touch and notifying me. I was indeed trying to import masses, names, and CIDs from PubChem based on InChIKeys. Regarding the very long names, I am aware of the issue. We did the same as you (using the InChIKey as the label) in the context of https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry_Natural_products.

I have high hopes that https://www.wikidata.org/wiki/Wikidata:Property_proposal/Pending#IUPAC_name will also help improve this situation.

Thank you again, I will avoid any unsupervised semi-automated updates from now on.

Über den Mechanismus der elektrolytischen Stromleitung in Kristallen

Mykhal (talkcontribs)

Let me inform you that I have fixed the item Q114754880, which had, among other things, a broken title and an incorrect issue date (taken from the wrong item in the Crossref API). It was probably created automatically somehow. Regards, --Mykhal (talk) 21:39, 12 August 2023 (UTC)

AdrianoRutz (talkcontribs)

Thank you very much; it was indeed created automatically.

Wostr (talkcontribs)
AdrianoRutz (talkcontribs)
AdrianoRutz (talkcontribs)

By the way, your remark also allowed me to spot a priority list of chemicals to disambiguate, in case it is useful: https://w.wiki/6gi$

Wostr (talkcontribs)
AdrianoRutz (talkcontribs)
RPI2026F1 (talkcontribs)
AdrianoRutz (talkcontribs)

Hi, I am using https://github.com/SuLab/WikidataIntegrator and I opened an issue to try to fix this upstream (https://github.com/SuLab/WikidataIntegrator/issues/197). I know @Daniel Mietchen had similar issues and wrote a maintenance query to partially address this point. @Daniel Mietchen, do you have any updates on this? @RPI2026F1, thank you for reporting. I will try to come up with a solution that handles this more broadly, so I might not correct everything right away. Thank you for your patience.

RPI2026F1 (talkcontribs)

If your script is in Python, you should be able to sanitize the title yourself. How are you fetching the titles?
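The thread does not say exactly what was wrong with the titles, so the following is only a sketch of what such sanitizing could look like, assuming the raw titles may contain HTML entities, inline markup tags, and stray whitespace. The function name `sanitize_title` is hypothetical; only the Python standard library is used.

```python
import html
import re
import unicodedata

# Matches simple markup tags such as <i>, </sub>, <scp> (tag name must start with a letter).
TAG_RE = re.compile(r"</?[A-Za-z][^>]*>")


def sanitize_title(raw: str) -> str:
    """Clean a title string before it is written as a label or title statement."""
    text = html.unescape(raw)                  # "&Uuml;" -> "Ü", "&amp;" -> "&"
    text = TAG_RE.sub("", text)                # drop inline markup tags
    text = unicodedata.normalize("NFC", text)  # use composed Unicode characters
    return " ".join(text.split())              # collapse newlines and repeated spaces
```

For example, `sanitize_title("&Uuml;ber den <i>Mechanismus</i>\n  der Stromleitung")` yields the title on a single line, without tags or entities.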

Noraberto (talkcontribs)

Hi Adriano,


You are producing tofu characters (�). Please switch your software to UTF-8.
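The replacement character U+FFFD typically appears when bytes are decoded with the wrong codec and errors are replaced instead of raised. As a minimal sketch (the helper names are hypothetical, and it assumes the upstream data is UTF-8 encoded), decoding strictly and checking strings before upload could look like this:

```python
REPLACEMENT_CHAR = "\ufffd"  # the "tofu" character


def decode_utf8(raw: bytes) -> str:
    """Decode strictly, so bad input raises instead of silently turning into tofu."""
    return raw.decode("utf-8")


def has_tofu(text: str) -> bool:
    """Return True if the text already contains a replacement character."""
    return REPLACEMENT_CHAR in text


# The same bytes decoded two ways:
raw = "Über".encode("utf-8")
garbled = raw.decode("ascii", errors="replace")  # wrong codec: bytes become U+FFFD
clean = decode_utf8(raw)                         # correct codec: text is intact
```

A batch could skip, and log for review, any value for which `has_tofu` returns True.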

AdrianoRutz (talkcontribs)

Hi! Fixed, Thank you!

Wostr (talkcontribs)

Hey, could you explain to me the current situation with deleting a lot of P703 statements? I see on my watchlist that you are deleting many statements added in the past by NPImporterBot, while in the meantime the same bot is adding some statements (cf. for example the edit history of Q421964: https://www.wikidata.org/w/index.php?title=Q421964&curid=398641&action=history). Were your actions discussed with the bot operator? I get the impression that at least some statements deleted by you are then re-added by the bot...

AdrianoRutz (talkcontribs)

Hi @Wostr!


Thank you for getting in touch! I am actually both operating the bot (together with @Bjonnh) and performing some cleaning. I kept the habit of performing the deletions from my personal account through QS and adding via the bot. Some deep cleaning was needed because of taxon synonymy chains accumulated over time and the recent introduction of P10585, so I went for a drastic cleanup, and there might be a (very small) overlap between what is deleted and then re-added. I took care not to delete things that were uploaded by the bot on our side, so I guess everything should be fine. By the way, some of the "re-added" statements sometimes have different references as qualifiers.


Apart from the heavy traffic, did you notice anything particularly disturbing?

Wostr (talkcontribs)

Okay, thanks. I just got the impression that these changes might not have been discussed, because there is no visible link between your account and the bot account, nor any discussion between you and the bot owner. However, since my impression was wrong, I can only wish you a good day :)

AdrianoRutz (talkcontribs)

You are perfectly right. I just added the info on both the bot and my personal account pages, just in case! (We operate in the context of Wikidata Chemistry Natural products.) Thank you very much again, and I wish you a good day too!

Mykhal (talkcontribs)

I don't think isotopically modified compounds are "stereoisomers". (I'm referring to e.g. this edit, and many others.) Regards, —Mykhal (talk) 08:10, 31 March 2024 (UTC)

AdrianoRutz (talkcontribs)

Thank you for bringing it to my attention. I believe there are only a few hundred, which I'll fix once the batch is finished.

Mykhal (talkcontribs)

I don't think a batch job is unstoppable. You are still adding nonsense. --Mykhal (talk) 08:19, 3 April 2024 (UTC)

AdrianoRutz (talkcontribs)

Batch management has not been easy lately because of the instability of QuickStatements, but I just manually started another batch to remove the incorrect entries (approx. 5,700). I will keep cleaning up the rest as soon as it appears. I have already corrected the issue for the next batches. Thank you again.
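The thread does not describe how the correction for later batches was made. One possible check, sketched here under the assumption that a standard InChI string is available for each candidate, is to exclude any compound whose InChI carries an isotopic layer (the layer introduced by "/i") before treating it as a stereoisomer. The function name is hypothetical.

```python
def has_isotopic_layer(inchi: str) -> bool:
    """Return True if the InChI string contains an isotopic layer ("/i...")."""
    # The first two segments are the version prefix ("InChI=1S") and the formula,
    # so only the remaining layers are inspected.
    layers = inchi.split("/")[2:]
    return any(layer.startswith("i") for layer in layers)


# Heavy water carries an isotopic layer; ordinary water and L-alanine do not.
heavy_water = "InChI=1S/H2O/h1H2/i/hD2"
water = "InChI=1S/H2O/h1H2"
l_alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
```

Pairs in which either member returns True could be left out of the stereoisomer batch and handled separately.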

AdrianoRutz (talkcontribs)
Reply to ""Stereoisomers""
Succu (talkcontribs)
AdrianoRutz (talkcontribs)

The import is still not finished, with almost a million still to go. I will look at it once it is done. Thank you for your feedback.

AdrianoRutz (talkcontribs)

@Succu The import is more or less halfway done.

I just want to further clarify what you mean by "fix it".

Let me start by explaining what I did: I took the latest (v3.6) Open Tree Taxonomy (https://files.opentreeoflife.org/ott/ott3.6/ott3.6.tgz) and joined all OTT IDs to the mappings OTT has internally (GBIF, NCBI, IRMNG, WoRMS). For all taxa, if one (or several) of those identifiers was present, I added the corresponding OTT ID.
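The join described above can be sketched roughly as follows. This is not the script that was actually used. It assumes the archive's taxonomy file stores one taxon per line with fields separated by tab, pipe, tab, with the OTT ID in the first field and the source references (for example `ncbi:289280,gbif:...`) in the fifth. The function name and the sample line in the usage note are hypothetical.

```python
WANTED_SOURCES = ("gbif", "ncbi", "irmng", "worms")


def build_source_index(lines, wanted=WANTED_SOURCES):
    """Map (source, external_id) -> set of OTT IDs that cite that identifier."""
    index = {}
    for line in lines:
        if line.startswith("uid"):  # skip the header row, if present
            continue
        fields = line.rstrip("\n").split("\t|\t")
        if len(fields) < 5:
            continue  # not a taxon row in the assumed layout
        ott_id, sourceinfo = fields[0].strip(), fields[4].strip()
        for ref in sourceinfo.split(","):
            source, _, external_id = ref.partition(":")
            if source in wanted and external_id:
                index.setdefault((source, external_id), set()).add(ott_id)
    return index
```

With a made-up row such as `"111\t|\t100\t|\tArctia\t|\tgenus\t|\tncbi:289280,gbif:999\t|\t\t|\t\t|\t"`, looking up `("ncbi", "289280")` in the resulting index gives `{"111"}`, which can then be matched against the NCBI ID already present on a Wikidata item.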

Given that I have added almost two million IDs so far, the "violation" rate seems to be around 0.1%, which seems reasonable.

So the two types of violations concerned are:

- If a GBIF ID and an NCBI ID on the same item link to different OTT IDs, then the item will end up with two OTT IDs.

- If two different GBIF (or other) IDs on two different items link to the same OTT ID, then this OTT ID will be present on both items.
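Both kinds of violation can be listed before upload. The following is only an illustrative sketch, with hypothetical names and data shapes: `item_ids` maps each item to the external identifiers it carries, and `ott_index` maps each external identifier to the OTT IDs that cite it.

```python
from collections import defaultdict


def find_violations(item_ids, ott_index):
    """Return (items with several OTT IDs, OTT IDs shared by several items).

    item_ids:  {qid: [(source, external_id), ...]}
    ott_index: {(source, external_id): {ott_id, ...}}
    """
    item_to_ott = {}
    ott_to_items = defaultdict(set)
    for qid, external_ids in item_ids.items():
        ott_ids = set()
        for key in external_ids:
            ott_ids |= ott_index.get(key, set())
        item_to_ott[qid] = ott_ids
        for ott_id in ott_ids:
            ott_to_items[ott_id].add(qid)
    # First case: one item would receive more than one OTT ID.
    multi_ott = {q: o for q, o in item_to_ott.items() if len(o) > 1}
    # Second case: one OTT ID would be added to more than one item.
    shared_ott = {o: q for o, q in ott_to_items.items() if len(q) > 1}
    return multi_ott, shared_ott
```

The two returned dictionaries could be exported as review lists, while the remaining unambiguous pairs go into the batch.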

How do you expect me to fix both of these?

Succu (talkcontribs)
AdrianoRutz (talkcontribs)

As I said, the import is only about halfway done. I had a look at it, and it will be added based on the NCBI ID (289280) linking to Arctia.

Reply to "Fix it!"