Wikidata talk:Lexicographical data/Archive/2020/02

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Russian import from wiktionary

Hi, I recently switched from editing Wiktionary to editing Wikidata lexemes. I see that a lot of Russian words have been imported from Wiktionary together with their forms. Is this accepted as CC0? If so, I would like to do something similar for the Danish, Swedish, Norwegian and English Wiktionaries. Would someone like to help with that?--So9q (talk) 13:09, 19 November 2019 (UTC)

@So9q: Some of the wiktionary communities have expressed reservations on this sort of mass action due to copyright license differences (Wikidata is CC-0 while the Wiktionaries are CC-BY). In the case of Russian they obtained a consensus there before initiating the import (and note that senses were not included in the import). If you can communicate with the communities you are suggesting and ensure this would be ok with them then I'm sure it would be fine. You should also request "bot approval" here for any mass import of thousands or more of items or lexemes. ArthurPSmith (talk) 15:03, 19 November 2019 (UTC)
User talk:ArthurPSmith, I think your answer isn't following the definitions of the copyright licenses. When content is copyrighted under CC-BY, the rights belong to all the editors, and no group of editors can "obtain a consensus" and stop the BY (attribution).
So9q, this is what Yurik, the editor who imported from Russian, wrote on my talk page:
"According to this paper, words and their forms are not copyrightable, but senses are."
You can use the not copyrightable data in any way you want. Uziel302 (talk) 09:51, 11 December 2019 (UTC)
Thanks! I saw this answer from Yurik and read the page recently. This means that we don't have to ask for permission (if I understood correctly) from the Wiktionaries we extract from, but I think it would be preferable to do it in a cooperative manner and ask them if they want to link back to Wikidata from their pages, as the Russian Wiktionary now does.--So9q (talk) 15:23, 11 December 2019 (UTC)
@So9q: I asked the Danish Wiktionary here. There was not much discussion. — Finn Årup Nielsen (fnielsen) (talk) 19:51, 5 February 2020 (UTC)
@fnielsen: I really liked the way you asked :). There was only one answer and it was quite odd. As this answer was not opposed, I think we should go ahead and prepare the import without further consultation. We can notify them when we start and finish.--So9q (talk) 10:41, 6 February 2020 (UTC)

Taxon names?

I am attending a workshop around the taxonomic data model on Wikidata, and one of the issues discussed is to what extent lexicographical data could help keep taxa and their names more properly separated. Any insights and opinions on this would be most welcome. --Daniel Mietchen (talk) 15:26, 13 February 2020 (UTC)

  • Support I've thought for a while that would be helpful. However, is there clearly a language associated with taxon names? Maybe we should have a special "scientific Latin" language? ArthurPSmith (talk) 17:43, 13 February 2020 (UTC)
  • Scientific names are not "Latin names". What about homonyms? --Succu (talk) 18:27, 13 February 2020 (UTC)
    • What exactly would the problem with homonyms be, do you think? One possible scheme is « one lexeme » for the name and one « sense » within that lexeme for each homonym. author  TomT0m / talk page 13:06, 15 February 2020 (UTC)
  • In fr.wiktionary we use « Conventions internationales » (= international conventions) for the "language" and then « Nom scientifique » (= scientific name) for the "part of speech". I guess we can get some inspiration here. We use the international conventions for all the terms that are not specific to one language but are recognized internationally; we do the same with measurement units. Only then do we add the information for each language bound to those conventions (like vernacular names, etc.). V!v£ l@ Rosière /Murmurer…/ 03:30, 14 February 2020 (UTC)
  • I agree that these should not be Latin. The words are sometimes (not always) derived from Latin, but they are scientific symbols that don't belong to a single language. They should have the "mul" code, but Lexemes don't support this... --Infovarius (talk) 11:25, 14 February 2020 (UTC)
  • Taxon common names would fit nicely in lexeme namespace. --- Jura 09:34, 15 February 2020 (UTC)
  • Taxon scientific names are a vocabulary of its own in a language of its own, this language could be named « biological taxonomy name » for example. author  TomT0m / talk page 13:06, 15 February 2020 (UTC)
  • Aren't they already fully handled as items? Not sure what lexemes would add. Maybe africana (L247716) should be linked from the item? --- Jura 15:14, 18 February 2020 (UTC)

Update the Wikidata-Glossary

I just noticed that the Wikidata:Glossary is missing all the lexeme-related terms. Can someone (with more expertise than me) add them there? -- MichaelSchoenitzer (talk) 01:33, 18 February 2020 (UTC)

It has Lexemes, but not Forms and Senses - I guess I can add those. What else do you think ought to be on there? ArthurPSmith (talk) 14:45, 18 February 2020 (UTC)
Done - but I also noticed we have Wikidata:Lexicographical data/Glossary which is very detailed. I added links from the main glossary. ArthurPSmith (talk) 15:06, 18 February 2020 (UTC)
Hmm .. heavy text. Good thing I don't recall reading it. I suppose we should be using lexemes for the glossary. --- Jura 15:19, 18 February 2020 (UTC)

Southern African languages

I work from southern Africa. When I arrive on a page it gives me English, German, Zulu and Xhosa which is fine, but nowhere, not even under "All entered languages" can I find any other language from southern Africa, or Afrikaans, which appears nowhere. "In more languages" is completely useless, as it closes all languages when I click on it. "Configure", under the latter, takes me to a technical page that makes no sense to me. In short, I am prevented from adding any names or descriptions in Afrikaans, or other southern African languages besides Zulu and Xhosa. JMK (talk) 20:48, 28 January 2020 (UTC)

@JMK: If you indicate a list of languages on your userpage (as I just did now--note that those two-letter abbreviations are ISO 639 language codes for German and the eleven official languages of South Africa), then those languages (if supported on Wikidata) will be shown to you by default so that you can edit labels/descriptions/aliases in those languages without any further setup. Mahir256 (talk) 21:43, 28 January 2020 (UTC)
I noticed the difference, thank you so much. I would recommend that Wikidata support as many languages as possible, if that is not a hassle. JMK (talk) 19:24, 22 February 2020 (UTC)
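
For readers following this thread, the language list described above is typically declared with the Babel parser function on the user page. The line below is only a sketch of what such a declaration might look like; the exact codes placed on JMK's page are not quoted in the thread, so this particular list (German plus the eleven official languages of South Africa) is an assumption made for illustration.

```wikitext
<!-- Illustrative only: German followed by the eleven official South African languages -->
{{#babel:de|af|en|nr|nso|ss|st|tn|ts|ve|xh|zu}}
```

A proficiency level can be appended to a code (for example af-N or en-3). Languages that Wikidata does not support for labels and descriptions will not appear in the term box even when they are listed.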

IATE terms import

Hi, has anyone contacted/lobbied the IATE team to release the 7.1 million terms under CC0? Who should I talk to about this? Wikimedia Belgium?

An import of this size could increase our lexemes by a factor of 28 :).--So9q (talk) 10:36, 25 February 2020 (UTC)

@Dimi z: (EU Policy Project Manager) could probably tell us more about this. Cheers, VIGNERON (talk) 14:54, 27 February 2020 (UTC)
