Wikidata talk:Report on potential duplicate Lexemes
- @DVrandecic (WMF): For example Lexeme:L54814 and Lexeme:L299405 have different and unrelated etymologies, so it is worth keeping them separate.--GZWDer (talk) 07:14, 6 October 2020 (UTC)
- @GZWDer: yes, agreed. This is a list of potential duplicates. I need to improve my script to filter the list down further. I think it is currently too long to curate manually.
- Lexeme pairs that should be removed:
- those that are linked as homographs
- those that have different forms
- those that are classified differently
- but the last one will be difficult. Let's see how far I get. --DVrandecic (WMF) (talk) 16:49, 6 October 2020 (UTC)
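The three exclusion rules listed above could be sketched roughly as follows. This is only an illustration, not the script that generated the report: the `Lexeme` record, its field names, and `is_potential_duplicate` are hypothetical, and "classified differently" is assumed here to mean a different lexical category.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    """Hypothetical, simplified lexeme record (not the Wikibase data model)."""
    id: str
    lemma: str
    language: str
    category: str                       # assumed: lexical category, e.g. "noun"
    forms: frozenset = frozenset()      # spellings of all forms
    homographs: frozenset = frozenset() # IDs linked via homograph lexeme (P5402)


def is_potential_duplicate(a: Lexeme, b: Lexeme) -> bool:
    """Return True if the pair should stay on the potential-duplicates list."""
    # Only lexemes with the same lemma in the same language are candidates.
    if a.lemma != b.lemma or a.language != b.language:
        return False
    # Rule 1: drop pairs already linked as homographs (in either direction).
    if b.id in a.homographs or a.id in b.homographs:
        return False
    # Rule 2: drop pairs whose sets of forms differ.
    if a.forms != b.forms:
        return False
    # Rule 3: drop pairs that are classified differently (assumed: category).
    if a.category != b.category:
        return False
    return True
```

A pair survives only if it passes all three checks, so the list shrinks with every rule that can actually be evaluated; the third rule is the one that depends most on how "classified" is defined.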
- It would be easier to read as a table with language and lexical category. --- Jura 10:34, 8 October 2020 (UTC)
- @Yurik: is there some problem with the bot that created the Russian ones? --- Jura 10:36, 8 October 2020 (UTC)
- https://petscan.wmflabs.org/?psid=17565904 finds 2126 Russian ones. In the meantime, I fixed both French duplicates. --- Jura 10:44, 8 October 2020 (UTC)
- I suppose they are really different but I should watch them (and add homograph lexeme (P5402)). --Infovarius (talk) 22:24, 21 April 2021 (UTC)
- It is not entirely clear what a homograph lexeme (P5402) is. Lexeme:L54814 and Lexeme:L299405 have the same lemma, but some of their forms differ. So should Lexeme:L54814 and Lexeme:L299405 be linked with homograph lexeme (P5402)? So far I have only been using this property when the lexemes share forms. -- Finn Årup Nielsen (fnielsen) (talk) 09:11, 9 May 2023 (UTC)