Wikidata:Requests for comment/Spelling convention for labels and descriptions in English

An editor has requested the community to provide input on "Spelling convention for labels and descriptions in English" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.

If you have an opinion regarding this issue, feel free to comment below. Thank you!

I'd like to invite you to discuss the default spelling convention for the en label in the Wikidata project. Currently, there is no clear guideline on which spelling convention should be used and this may lead to inconsistencies and debates among users. With English having multiple regional variations, it is essential to establish a consistent approach to labelling that respects these differences.

Given that Wikidata is a global knowledge base that aims to provide neutral and unbiased information, I believe it's essential that the en label favours the so-called International English or Oxford spelling convention. This convention is widely recognized and used globally and is adopted by various international organizations such as the UN, WHO, IMF, NATO, etc.

It's worth noting that there are already separate en-us, en-gb, en-ca labels, which specifically cater to their respective English spelling conventions. Therefore, it's logical to keep the en label as neutral as possible, avoiding any perceived bias towards a specific regional variation of English.

By adopting the International English spelling convention for the en label, we can uphold a consistent approach that resonates with Wikidata's diverse global audience.

Labels and descriptions in English are crucial for facilitating search, discovery, and understanding of entities. However, the lack of a clear standard for handling ambiguous words can lead to inconsistencies, errors, and confusion.

By addressing this challenge, I believe we can provide a better experience for our users.

Carbonaro. (talk) 14:45, 25 June 2024 (UTC)

This probably should be a discussion on Help:Label. I would note that Wikidata is all about inconsistencies :) Looking at en:List of countries by English-speaking population there are a lot of places that use English in their own way beyond the 3 different labels we already have. Help:Label for English specifically states the spelling should be the "most common" - so how does one assess that? I think in general we are fine if the editor creating an item or adding an English label uses whatever spelling that editor is accustomed to. If a new label replaces the old one, the old one should normally be kept on as an alias (for consistency and ease of search). But I don't think it matters which one is the main label and which is an alias. ArthurPSmith (talk) 20:45, 1 July 2024 (UTC)
Thanks for your input. There have been various discussions on Help talk:Label, Wikidata:Project chat, Wikidata:Report a technical problem and Phabricator, but no consensus has been reached so far. In the interest of promoting consistency, I invite comments on establishing a standardized framework. As many users have already pointed out, I suggest that Wikidata, as a global knowledge base, use International English for the en label, with regional variations designated by their respective country codes (e.g., en-us for American English and en-gb for British English). Carbonaro. (talk) 15:04, 2 July 2024 (UTC)
I'd not heard of Oxford English, and while it has a code en-GB-oxendict, I don't think we should be flooding WD with uses of it compared with en and en-gb. Pragmatically as an en-gb native I assumed en was the North American camp, and I had en-gb for when I wanted to record a variant, as that seems the main demarcation in https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Spelling.
WD is weak in that it shows en rather than en-gb at the top of every article and in searches, even though I declare my preference. If it were better tailored then I could work in an en-gb world and have en or en-us issues less prominent.
We also have the mul changes coming https://www.wikidata.org/wiki/Help:Default_values_for_labels_and_aliases. We need to decide what English flavour applies there.
I see that enWP aren't consistent, they have https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style#National_varieties_of_English, but their main focus is avoiding edit wars, and can use redirects to mask their inconsistencies, which we can't do.
So I think getting the software to handle languages better, both with mul code and better user focus, is more important than a mass change of labels at present. Vicarage (talk) 15:14, 10 July 2024 (UTC)
"can use redirects [...] which we can't do" - of course Wikidata does have redirects, but actually to my mind, the way "alias" labels work in Wikidata is pretty much identical to how redirects work in Wikipedia: these are alternative labels for the item that are available through the search interface. So I think aliases are just fine for handling this in a similar manner in Wikidata. However there is no equivalent to aliases for descriptions on items, so the language of a description would perhaps be an obstacle here. ArthurPSmith (talk) 13:51, 29 July 2024 (UTC)
I think that the only manageable rules are:
  1. If the item has an English Wikipedia page, the en label should match it.
  2. If the item is the product of a specific country and physically located in an English-speaking country then there is a strong preference that the en label matches the English dialect that would be used in that country.
  3. If the item is not a physical entity and is a concept and is strongly and uniquely associated with a specific English-speaking country, we should prefer (but not require) that the en label is in the English dialect of that country.
Anything more than that would need/devolve into a huge number of special cases. Trying to establish a preferred en label for items that do not strongly relate to an English-speaking country is fraught and will cause more label edit wars than it would solve.
And after all I've written, I think my #2 and #3 are likely a bridge too far as well. William Graham (talk) 00:03, 11 July 2024 (UTC)
Always using the English Wikipedia label would be unfair. There are 9 other English projects and 9 multilingual projects with sitelinks. We don't actually want the label to always match the sitelink either, as Help:Label already explains. - Nikki (talk) 09:56, 11 July 2024 (UTC)
Moreover, Wikipedia and Wikidata are two separate projects. That being said, to me it's incorrect to follow en:WP:MOS. Carbonaro. (talk) 14:47, 11 July 2024 (UTC)
As I expected, my intuition of what would work was off the mark. I think it's a perfect example of how establishing rules on these is a bit of a fool's errand. Thus, I remain skeptical that a rule or guidance can be created and implemented. William Graham (talk) 15:00, 11 July 2024 (UTC)
All right, imagine a bot systematically reviewing all articles, swapping the generic en label for a neutral alternative and relocating regional variations to their respective en-us and en-gb labels. Would this approach spark any adverse reactions? Carbonaro. (talk) 18:58, 27 July 2024 (UTC)
I would argue there are no "neutral alternatives" in most cases and having a bot make changes programmatically would create the same, if not more, arguments and edit wars. William Graham (talk) 19:21, 27 July 2024 (UTC)
 Oppose in current form. This RfC hasn't articulated any specific policy/guidance proposals in its current form. I only see a "plan to make a plan". I'm unable to make any informed analysis or commentary when it's so open-ended. I think the proposer(s) should step back, write their own set of proposed rules/policy with others, and then return at a later time with a new RfC. William Graham (talk) 19:31, 27 July 2024 (UTC)
 Oppose I agree. Vicarage (talk) 21:09, 27 July 2024 (UTC)

Thanks for your participation anyway. The suggestion is, as I said initially, to name the items according to the rules of International English. That is:

International English (en)  British English (en-gb)  US English (en-us)
flavouring                  flavouring               flavoring
organization                organisation             organization
metre                       metre                    meter
anaesthesia                 anaesthesia              anesthesia
traveller                   traveller                traveler
inquiry                     enquiry                  inquiry

It would have been prudent for me to initiate our conversation with this information. As I continue to acquaint myself with the intricacies of this project, I appreciate your understanding and patience. Carbonaro. (talk) 08:05, 28 July 2024 (UTC)
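
For illustration only, the comparison table above could be expressed as a small lookup, which might help when reasoning about what a bot or gadget would need to know. This is a minimal sketch, not an existing Wikidata tool: the names `LANG_CODES`, `ROWS`, `variants_for` and `codes_differing_from_en` are hypothetical, and the data is limited to the six example rows from the table.

```python
from typing import Dict, List, Optional

# Language codes in the same column order as the comparison table above.
LANG_CODES = ("en", "en-gb", "en-us")

# Example rows copied from the table; this is not a complete word list.
ROWS = [
    ("flavouring", "flavouring", "flavoring"),
    ("organization", "organisation", "organization"),
    ("metre", "metre", "meter"),
    ("anaesthesia", "anaesthesia", "anesthesia"),
    ("traveller", "traveller", "traveler"),
    ("inquiry", "enquiry", "inquiry"),
]


def variants_for(word: str) -> Optional[Dict[str, str]]:
    """Return the spelling per language code for a word in any listed variant.

    Returns None when the word is not covered by the example rows.
    """
    needle = word.lower()
    for row in ROWS:
        if needle in row:
            return dict(zip(LANG_CODES, row))
    return None


def codes_differing_from_en(word: str) -> List[str]:
    """List the regional codes whose spelling differs from the proposed en label."""
    variants = variants_for(word)
    if variants is None:
        return []
    return [code for code in LANG_CODES[1:] if variants[code] != variants["en"]]
```

Under the proposal, `variants_for("flavoring")["en"]` would give the en label for that word, and `codes_differing_from_en` would indicate which regional labels actually need a separate value. Words outside the table return `None`, which reflects the concern raised above that many labels have no obvious neutral spelling.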