Wikidata talk:WikiProject Names/Archive/1


Question about the family name/given name of an author under a pseudonym

Hello, kind and very French-speaking contributors...

It might be a good idea to internationalise this project a little, if you want people other than French speakers to participate ;)

A question of principle: for a person known under a pseudonym, should the family name/given name be those of the pseudonym or those of the real name?

  • e.g. "George Sand"
  • another example: Laure Conan (Q3218752) - is her given name "Marie-Louise-Félicité" or "Laure" or both? - the same goes for the family name...

Kind regards, --Hsarrazin (talk) 07:21, 10 September 2014 (UTC)

For the second example, "Marie-Louise Félicité Angers, dite Laure Conan", I would add three: Marie-Louise (Q18012396), Félicité (Q18012403), Laure (Q3218740) --- Jura 17:36, 10 September 2014 (UTC)

Derivative question/corollary

Ash Crow
Dereckson
Harmonia Amanda
Hsarrazin
Jura
Чаховіч Уладзіслаў
Joxemai
Place Clichy
Branthecan
Azertus
Jon Harald Søby
PKM
Pmt
Sight Contamination
MaksOttoVonStirlitz
BeatrixBelibaste
Moebeus
Dcflyer
Looniverse
Aya Reyad
Infovarius
Tris T7
Klaas 'Z4us' van B. V
Deborahjay
Bruno Biondi
ZI Jony
Laddo
Da Dapper Don
Data Gamer
Luca favorido
The Sir of Data Analytics
Skim
E4024
JhowieNitnek
Envlh
Susanna Giaccai
Epìdosis
Aluxosm
Dnshitobu
Ruky Wunpini
Balû
★Trekker

Notified participants of WikiProject Names

See Andrea H. Japp (Q2846340): she is an author much better known under her pseudonym than under her real name. There are 3 possibilities for given names:

  1. given name (P735) contains all given names, and each one is qualified with "birth name", "pseudonym", etc.
  2. pseudonym (P742) and birth name (P1477) each have given name (P735) as qualifier…
  3. both solutions, as long as queries cannot search in qualifiers…

So, what is your advice? --Hsarrazin (talk) 19:11, 5 October 2014 (UTC)

I'd add them primarily in given name (P735) to make sure it can be found when checking P735 on the item. --- Jura 16:55, 12 October 2014 (UTC)
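The difference between options 1 and 2 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested dictionaries are a simplified stand-in and not the real Wikibase JSON format, the "role" qualifier key is hypothetical, and the given-name values are placeholders rather than real items.

```python
P735, P742, P1477 = "P735", "P742", "P1477"

# Option 1: every given name is a main P735 statement, qualified by its role.
# "role" is a hypothetical qualifier key used only for this sketch.
option_1 = {
    P735: [
        {"value": "given-name item A", "qualifiers": {"role": ["pseudonym"]}},
        {"value": "given-name item B", "qualifiers": {"role": ["birth name"]}},
    ],
}

# Option 2: given names appear only as P735 qualifiers on P742 / P1477.
option_2 = {
    P742: [{"value": "pseudonym string", "qualifiers": {P735: ["given-name item A"]}}],
    P1477: [{"value": "birth name string", "qualifiers": {P735: ["given-name item B"]}}],
}


def main_given_names(item):
    """Given names found as main P735 values; qualifiers are ignored."""
    return [statement["value"] for statement in item.get(P735, [])]


def all_given_names(item):
    """Main P735 values plus P735 qualifiers found on any other statement."""
    names = main_given_names(item)
    for prop, statements in item.items():
        if prop == P735:
            continue
        for statement in statements:
            names.extend(statement["qualifiers"].get(P735, []))
    return names
```

A lookup that only checks main P735 values finds both names under option 1 and nothing under option 2, which is the reason given above for adding them primarily in P735.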

Female surnames

Hello, there might be a small problem with female surnames in Czech and Slovak: surnames are usually inflected (gender inflection (Q1124523)) into a female variant. In most cases there is no separate article about the female variant of the surname; one exists only when the female form can derive from more than one male surname.

Typical examples:

  • Novák -> Nováková
  • Fišer -> Fišerová
  • Malý -> Malá (adjective surnames)

A second problem is that these articles are usually marked as disambiguations, even if they contain an infobox, the origin of the name and a list of people. JAn Dudík (talk) 08:12, 10 September 2014 (UTC)

We could create items for the female form and link these with a new property to the basic form of the surname. --- Jura 17:50, 10 September 2014 (UTC)
Shall we go ahead and do that? ----- Jura 10:32, 17 September 2014 (UTC)
@JAn Dudík, Jura1: What is the consensus regarding this question? I personally believe that Sidorov/Sidorova are 2 forms of the same surname (see en:Sidorov) and we can use female form of label (P2521) to specify female label on any language that needs it. But if you decided that we need 2 separate items, I'm fine with this. --Ghuron (talk) 16:43, 25 January 2018 (UTC)
@JAn Dudík, Jura1, Ghuron: The same issue arises with Polish surnames: Kowalski (Q3199417) → Kowalska (Q37469735). Should we follow this pattern and create thousands of items for female-form names? To complicate things further, Polish has a somewhat outdated pattern Nowak → Nowakowa that is rarely seen nowadays; usually "Nowak" is used instead, as in "Anna Nowak". Male surnames ending in "-ski" and "-cki" definitely have the female forms "-ska" and "-cka". Sometimes the female form is put into the alias of the male-form item; is that legitimate?
Is there a guideline/consensus for surnames like these ? Kpjas (talk) 20:31, 6 February 2019 (UTC)

My suggestion: two separate items should be created for Czech and Slovak surnames, one for the male base form and one for the female form, e.g. Novák (Q16516593) and Nováková. What do you think? --HarryNº2 (talk) 19:17, 8 August 2020 (UTC)

@HarryNº2: Please discuss at Wikidata:Project_chat#Gender_forms_of_surnames Vojtěch Dostál (talk) 13:31, 18 January 2021 (UTC)
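For readers comparing the two proposals in this thread, here is a minimal Python sketch of the single-item approach, where one surname item carries a female form of label (P2521) per language. The dictionary layout and the "example-nowak" identifier are hypothetical and used only for illustration; only Q16516593 comes from the discussion.

```python
# Hypothetical in-memory stand-in for surname items; not the Wikibase data model.
SURNAMES = {
    # Novák, QID as quoted in the discussion above.
    "Q16516593": {
        "label": {"cs": "Novák"},
        "female_form_of_label": {"cs": "Nováková"},  # P2521-style value
    },
    # Placeholder identifier for a surname with no recorded female form.
    "example-nowak": {
        "label": {"pl": "Nowak"},
        "female_form_of_label": {},
    },
}


def surname_label(item_id, lang, female):
    """Return the surname label, using the female form when one is recorded."""
    item = SURNAMES[item_id]
    base = item["label"][lang]
    if not female:
        return base
    # Fall back to the base form, as in "Anna Nowak".
    return item["female_form_of_label"].get(lang, base)
```

With two separate items, the lookup would instead follow a link from the female-form item to the base-form item; which of the two models to use is the open question of this thread.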

Historical persons

Another problem I see concerns some historical persons whose names have been translated into Latin or other languages (John Calvin (Q37577): Jan/Johannes/Juan/John/Jean Kalvín/Calvin/Kalvyn/Calvinus). Which form should be used? French? Latin? Some other? JAn Dudík (talk) 20:36, 11 September 2014 (UTC)

All of them, with the one the person actually used as the preferred value. -Ash Crow (talk) 20:39, 14 September 2014 (UTC)

Property proposal(s)

Please see Wikidata:Property_proposal/Person#second_or_maternal_family_name_of_Spanish_name. --- Jura 06:11, 11 September 2014 (UTC)

Please comment there. ----- Jura 10:32, 17 September 2014 (UTC)

Fun list :)

Please see Wikidata:WikiProject Names/first names (1) ----- Jura 16:15, 14 September 2014 (UTC)

New report: items by label (same or different)

At User:Jura1/Person names, Ivan made a list of items used with P735 and P734 based on the labels they have in various languages using Roman script. --- Jura 16:34, 14 September 2014 (UTC)

I hope no one starts "working" on this list by separating "different" names and surnames, as they are not intrinsically different. --Infovarius (talk) 18:33, 14 September 2014 (UTC)
Well, they are different enough that the only sustainable way to work consistently for all given names and surnames is to have an item per variant, linked to each other with said to be the same as (P460). -Ash Crow (talk) 20:38, 14 September 2014 (UTC)
They are as different as a pear is from an apple. We have been separating them for some weeks now. The goal is to respect consistency, explain the links between given names and add some properties that relate to one but not the other (Okki is willing to indicate the matching saints' celebration days, and this isn't a bad idea, as this property is used in some given-name infoboxes).
That also allows consistency between the property and the item value for given names and surnames.
Finally, it allows etymological links to be shown (the surname Rose is named after the given name Rose, which is named after the Rosa flower).
According to my watchlist, at least 5 users are working according to this methodology: Jura1, Okki, Harmonia Amanda, Ash Crow and me. --Dereckson (talk) 20:46, 14 September 2014 (UTC)
Looking at this list I see another issue: some names have spelling variants, in Czech especially in vowel length, i/y and ending letters (Justina/Justýna, Vasil/Vasyl, Tereza/Terezie/Teréza, Anastasie/Anastázie, Magdalena/Magdaléna...). Separating these names only because the most used variant is unique among other languages is not good, because the less common variant (older, archaic) is usually redirected to the main article. Creating a separate article for each variant of each name would be useless. JAn Dudík (talk) 05:57, 15 September 2014 (UTC)
Indeed, no need to create articles for these items. On Wikipedia, a single article can include lists based on various items linked through "P460". Some languages already do this quite thoroughly. --- Jura 19:54, 15 September 2014 (UTC)
Don't forget we create a general-purpose database, not a Wikipedia helper tool.
If the variants are really used, ie when there is at least one item instance of Q5 with this given name, create an item.
The core issue is: what is a variant? What is a variant for you could be two distinct given names for other people. There is no objective criterion to separate them. So 'said to be the same' claims are the perfect solution for such cases. --Dereckson (talk) 20:17, 17 September 2014 (UTC)
So you have at last become acquainted with the problem of transliteration duplicates, thanks to Jan Dudik. Look at my example below. Many of the groups of names in the list do not differ in Cyrillic-script languages. So you are prophets of a Latin-centric point of view... Infovarius (talk) 12:40, 17 September 2014 (UTC)
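To make the "one item per variant, linked with said to be the same as (P460)" idea concrete, the following Python sketch groups variants into families by following such links. It assumes, for illustration only, that the links can be treated as symmetric and transitive, which real P460 claims do not guarantee; the names are the Czech examples quoted above and the link list itself is made up.

```python
from collections import defaultdict


def variant_groups(same_as_pairs):
    """Group names connected by 'said to be the same as' links.

    Assumption for this sketch: links are symmetric and transitive.
    """
    graph = defaultdict(set)
    for a, b in same_as_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    groups = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first walk collecting everything reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Illustrative links only; not taken from actual Wikidata claims.
links = [("Tereza", "Terezie"), ("Terezie", "Teréza"), ("Justina", "Justýna")]
groups = variant_groups(links)
```

A Wikipedia article that covers several variants could then list people from every item in one group, while each variant keeps its own item.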

Father's name (patronymic) in Russian names

Hello,

Since I recently tried to add transliterated names from Russian items, I think there should also be a property for those... a lot of people can only be distinguished through their father's name... Ivanovitch (in French), Ivanovich (in English), etc. should be values to add to those (generally) Russian people.

Also, should there be items for the feminine version, or just a "feminine value" (Ivanovna), as both are used among siblings to give the father's name?

A specific property ? an adaptation of the mother's name (in Spanish) ? what do you think... ? --Hsarrazin (talk) 11:49, 16 September 2014 (UTC)

For readers of the above question, w:Eastern_Slavic_naming_customs#Patronymic gives an introduction. --- Jura 12:36, 16 September 2014 (UTC)
The 2nd surname in Spanish has the advantage that the property can use items already available as first surname (of the mother). Here the situation is somewhat different.
A possible solution could be to create a monolingual string property that could be used with (Russian) given names (sample "Ivan") to indicate "Ivanovich" and another property (Russian patronym?) to link to these "given name" items. ----- Jura 10:32, 17 September 2014 (UTC)
As patronymic is by definition father's name it is enough to have link to a name of the father. Infovarius (talk) 12:26, 17 September 2014 (UTC)
@Infovarius : for a Russian, yes, obviously… but this would be very interesting for non-russian speakers, I think :)
Jura1's proposal seems simple, but since there are not many values, many errors, typos and differing transliterations could creep into monolingual string properties. A value based on the original Russian value, and then transliterated into other languages, would provide an easy way to transliterate Russian names into other alphabets... :) --Hsarrazin (talk) 19:03, 5 October 2014 (UTC)

Constantin / Konstantin / Constantine merger at Q7111053

Somehow this merger mixes different given and family names we were trying to keep separate. How to fix it? --- Jura 19:27, 16 September 2014 (UTC)

We ask a sysop (@Ash Crow:) to undo the merge and we ask @Infovarius: to not do that again? --Harmonia Amanda (talk) 06:27, 17 September 2014 (UTC)
Now (with redirects) anyone can undo the merge, but let's discuss first (below). Infovarius (talk) 07:39, 17 September 2014 (UTC)
In the meantime, a bot updated all links pointing to the 2 redirects. This has gotten messy. We will need to do quite a lot of cleanup. --- Jura 10:46, 17 September 2014 (UTC)
Well, in my opinion, all these above should have given name (P735) with ru:Константин in preferred value, and all the transliterations in given name (P735) with devalued value. And we should keep the Latin ones separate. Cases like these mainly concern historical figures like kings or popes; the merge makes no sense at all for contemporary figures. We can deal, albeit cumbersomely, with many different given names for the same person. We can't separate « Wilhelm » and « Guillaume » if they are merged, or worse, what do you suggest about Étienne (Q15727982) and Stéphane (Q3501543)? These are two French given names said to be the same as Στέφανος… I still prefer to separate those given names, and to include all the possible names on disputed items, with the value in the original language as preferred value. --Harmonia Amanda (talk) 20:07, 17 September 2014 (UTC)
I agree with Harmonia Amanda. -Ash Crow (talk) 10:51, 23 September 2014 (UTC)
Preferred values are an interesting idea, but they don't solve any problem. I believe we should think thoroughly about a global plan (including non-Latin languages and future Wiktionary inclusion); otherwise we (mainly Jura1 now) are doing superfluous work at the least, and at worst destroying hard-built interwiki links by separating without thinking. I don't know the solution either, but if you continue with a Harmonia Amanda-like strategy, we will end up with a bunch (up to 6000) of values in P735, each associated with a specific language. It'll be a complete mess and a huge technical problem (like now with the Germany item). --Infovarius (talk) 17:37, 16 October 2014 (UTC)
@Harmonia Amanda, Infovarius: Salut! May I ask why the Russian ru:Константин should be used as the point of reference? The people mentioned would have had Latin, Greek, Armenian, maybe French, English or German as their native language; they would be called Constantinus/Κωνσταντίνος/Constantine/Konstantin/Constantin, but all are called Константин in Russian. My first reflex would be to say that Constantinus (la)/Κωνσταντίνος (el)/Constantine (en)/Konstantin (de)/Constantin (fr)/Константин (ru, bg, sr...) are the same given name, but that does not seem to be the way things are done here. Maybe one should not trust one's first reflex :) Place Clichy (talk) 11:23, 19 November 2015 (UTC)

last part moved to New subject for Discussion

@Place Clichy:, your "first reflex" is my opinion. I consider these spellings to be the same name. The problem is that each Wikipedia can have a different number of articles devoted to this name: from 1 to tens (English Wikipedia likes to create an article for each spelling, whether they are the same or not). And Wikidata needs to place the sitelinks in some items and to link them with some statements. There is some modelling (you can read about it on the project page), but it is problematic, and in the case of Константин it creates more problems than it solves. --Infovarius (talk) 12:39, 20 November 2015 (UTC)
@Infovarius: The situation where a language (even en) has several articles for several transcriptions of the same name seems to be the exception rather than the norm. In some cases, these articles are almost empty and can be redirected without trouble. For this reason, I support having a single Wikidata item for different transcriptions of the same name, until at least one language has at least 2 articles for 2 different transcriptions. That way we can have en:Ivan (name)/de:Iwan (name)/ru:Иван/uk:Іван on the same item, which by the way is the current situation on Ivan (Q830350). Place Clichy (talk) 10:17, 23 November 2015 (UTC)
Unfortunately, for en-wiki it seems to be a rule to have plenty of articles for each variant of a name... So it leads to multiple items... --Infovarius (talk) 21:16, 22 September 2017 (UTC)

I propose to consider another example: Constantine (Q103314). Look at titles of sitelinks and choose one variant :) Infovarius (talk) 21:16, 22 September 2017 (UTC)

But now the current process has led to another solution: please welcome Konstantin (Q31362405), a special item for the Cyrillic name! Now we can forget about all those Q7111053, Q19327451, Q5163687 and other Latin-centric variants and use one item for all Cyrillic names (Russian, Ukrainian, Belarusian, Kazakh, Serbian and many others). As for Latin labels for Q31362405, I propose to use either "Константин" or "Konstantin/Constantin/Constantine..." to avoid ambiguity. I am now moving all relevant uses to this item. --Infovarius (talk) 21:16, 22 September 2017 (UTC)

Are you aware that these "transcriptions" can be made even from one Latin-script language to another? Look at the labels in different languages for Charles VI of France (Q160349)! We have many "Carl/Karl" names among the Swedish kings, and which item is used with the given-name property depends on the language preferences of the user who added the claim. Some say that we should use the same spelling as the person himself did. The problem with that is that it implies that the person could spell and was used to some kind of orthography (Q43091). Orthography was not introduced for the Swedish language until the 19th century, and it didn't become stable until the 1920s. And the Swedish tradition for the names of royalty is that the spelling is changed when they have been dead for some time. That tradition is applied even to foreign royalty. -- Innocent bystander (talk) 07:00, 23 September 2017 (UTC)

Nameguzzler and Beta labellister dead

This really makes it harder to clean up items. Are there any alternatives available ? --- Jura 08:27, 20 September 2014 (UTC)

Split between surname and given name

Some items still seem to mix the two. To make it easier to separate them, I made a property proposal at WD:PP/P#Family_name_identical_to_this_first_name --- Jura 08:27, 20 September 2014 (UTC)

Maiden name = real name = full name?

We have:

  1. birth name, string: P513 (P513)
  2. birth name, monolingual text: birth name (P1477)

Used for:

  1. maiden name: Rodham (maiden name of Hillary Clinton)
  2. real name: Samuel Langhorne Clemens (pseudonym: Mark Twain)
  3. full name: William Jefferson Clinton (nick name: Bill Clinton)

--Kolja21 (talk) 14:30, 20 September 2014 (UTC)

or "full name at birth"? --- Jura 15:34, 20 September 2014 (UTC)
^^ what Jura said is how I have always worked with it, and that covers all the cases above I believe. I just wish for the better explanation of 513 -> 1477 which is a bit of a PITA in the format, and lack of explanation.  — billinghurst sDrewth 13:46, 21 September 2014 (UTC)

I left a note about the project there. --- Jura 09:44, 21 September 2014 (UTC)

Translation

Would you mind if I prepared/marked the page for translation? The idea comes from using {{TranslateThis}} and from adding to Status updates. Matěj Suchánek (talk) 11:28, 21 September 2014 (UTC)

It might be a bit early. The project has just started and I'm not sure whether the current version has been proofread.
The most detailed explanation is still in the blog post (in French). ----- Jura 13:45, 21 September 2014 (UTC)

Cadet branches of noble families

Hello,

I am wondering what would be the best way to link a cadet branch of a noble family to the main tree. Should we use instance of (P31): cadet branch (Q2057658) with an "of (P642)" qualifier, or create a whole new "cadet branch of" property? -Ash Crow (talk) 11:12, 23 September 2014 (UTC)

first name property should be Multilingual text, but not Item

See the following discussion: Property talk:P735#Mess --DixonD (talk) 16:38, 2 October 2014 (UTC)

1. "Multilingual text" would be an other concept. 2. "Multilingual text" would mean you couldn't add Paul (Q4925623) as given name without specifying which language "Paul" is. The name might be of French origin, mostly used in Germany and still used by the Roman pope Paul I (Q103404). --Kolja21 (talk) 19:01, 2 October 2014 (UTC)
"Paul" is (was) used for Paul I (Q103404), but only in some languages, check the labels or Wikipedia links. This is actually a good reason to use multilingual text datatype: He should have "Paul" for English, "Paul" for French, "Paul" for German, but "Paulus" for Latin, "Pavel" for Czech, "Paweł" for Polish etc. Another person with the same name (e.g. Paul Cézanne (Q35548)) should probably have "Paul" in nearly all languages using Latin script (including Czech, Polish etc.) Check the labels and Wikipedia links again. How can this be solved through "item-based concept" without adding hundreds of statements for the pope, one for every language variant of his name?--Shlomo (talk) 06:00, 7 October 2014 (UTC)

Until someone explains better to me when a particular given-name item should be added to a person's item, I'm going to remove all added given-name items I see that do not match, in Ukrainian or any other language I know, the real given name of that person in that language. Like this or this --DixonD (talk) 14:35, 6 November 2014 (UTC)

Mario / Marius

Hello,
I ran into a problem while adding given name (P735) to persons named Marius (Q2159938) and Mario (Q3362622). I inverted the two items: people who have Marius (Q2159938) are actually named Mario and people who have Mario (Q3362622) are actually named Marius. All this because of false labels in French: it is "Mario" on Marius (Q2159938) and "Marius" on Mario (Q3362622).
I have already added ~2000 claims about Mario and Marius names... What could be the best solution in order to solve this problem? Is it possible to set the current 'false' properties to every Mario and Marius and then to switch all the links and label from Marius (Q2159938) to Mario (Q3362622) (and vice versa)? The second solution would be to cancel all my edits about Mario and Marius and then to restart adding claims from the beginning. What do you think about that? Mathieudu68 talk 17:39, 6 October 2014 (UTC)

Hmm .. I was wondering if it was a Latin name version that made you add "Marius" to persons named "Mario". Anyways, don't worry.
You can switch them over by opening two sessions of Autolist2: one for Mario, one for Marius. --- Jura 17:44, 6 October 2014 (UTC)

For "Marius" in "Mario", something like:

http://tools.wmflabs.org/wikidata-todo/autolist2.php?find=Marius%20%&statementlist=&language=en&project=wikipedia&category=&depth=12&wdq=claim[735:3362622]&mode=undefined&chunk_size=360&find_label=1&find_langs=&mode_wdq=and&run=Run

to replace

-P735:Q3362622
P735:Q2159938

should work. --- Jura 17:51, 6 October 2014 (UTC)

I just started moving the Marios over to their item. --- Jura 18:10, 6 October 2014 (UTC)

Thank you so much Jura1 ! :) Mathieudu68 talk 18:19, 6 October 2014 (UTC)
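The fix described above (select people whose label contains a name but who carry the other item, then remove one claim and add the other) can be sketched in Python as follows. This is an illustration only: the function, the record layout and the example labels are hypothetical, and it does not call Autolist or any Wikidata API; only the two QIDs come from the discussion.

```python
MARIUS, MARIO = "Q2159938", "Q3362622"


def retarget_given_name(people, name, wrong_qid, right_qid):
    """For people whose label contains `name` and who carry `wrong_qid`
    as given name (P735), replace it with `right_qid`.

    Returns new records; the input list is left untouched.
    """
    fixed = []
    for person in people:
        values = list(person["P735"])
        if name in person["label"].split() and wrong_qid in values:
            values = [right_qid if v == wrong_qid else v for v in values]
        fixed.append({"label": person["label"], "P735": values})
    return fixed


# Made-up records standing in for person items.
people = [
    {"label": "Marius Example", "P735": [MARIO]},   # wrongly points at Mario
    {"label": "Mario Example", "P735": [MARIUS]},   # wrongly points at Marius
    {"label": "Mario Other", "P735": [MARIO]},      # already correct
]

# One pass per name, mirroring the two Autolist sessions suggested above.
step1 = retarget_given_name(people, "Marius", MARIO, MARIUS)
step2 = retarget_given_name(step1, "Mario", MARIUS, MARIO)
```

Filtering on the label is what keeps the two passes from undoing each other: a claim is only moved when the person's label actually contains the name being fixed.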

AutoList 2 down, requests page to avoid to list information

Hi,

I've created Wikidata:WikiProject Names/To fill to note given name or surname fill requests when AutoList 2 is down.

--Dereckson (talk) 02:52, 13 October 2014 (UTC)

Need for Discussion…

In fact, what Infovarius says above is exactly the concern I stressed while chatting with Harmonia Amanda and other contributors of this project on IRC some days ago… the problem of transliteration is NOT solved by this, interwiki links fail, and it is already developing into a monstrous Heath Robinson machine (usine à gaz in French);

Believe me, as regards databases, sobriety is the best solution… each time you duplicate info, each time you complicate the path, you add complexity, and Murphy's law is the result…

I do not think that we should continue this way, before really examining what are the goals of this project..

  • interwiki-links ?
  • stats about names (spreading, history, etc.)
  • string manipulations (like sorting of names)
  • others ? (please add, I can't think of others for now)

and there could be several technical solutions, the one that's being used now not necessarily the best :

  • maybe a multilingual-type property could be used? (I don't really understand the implications of that type of value; it's just a hypothesis)
  • maybe we could create a "top" value for each "group of names", on which all variants would be linked to, that could then allow interwiki... ? this would allow to fall back to a "1-to-many" relation, instead of a "many-to-many" that is the core of the problem… - choice of how this "top" item should be named and used, to be discussed…
  • maybe we still have to create another solution no-one has thought of, already…

One of the biggest problems for now is that it is somewhat difficult to see how Wikidata will develop technically, as many kinds of linking and querying are not possible yet… and trying to get around that problem leads to duplication of information in many ways… among them, the treatment of names… :( Here is my position: testing is right… to see what works and what doesn't… and we've seen that it raises many problems… For now, I think that we should R E S T, O B S E R V E, then T H I N K… and D I S C U S S… before launching into more action… :| --Hsarrazin (talk) 19:08, 16 October 2014 (UTC)

There seems to be a real need for discussion, now…

Here are my first reflections, after sleepless night :) … feel free to add on this page or we can just move it to sub-page here :) --Hsarrazin (talk) 07:12, 17 October 2014 (UTC)

I also thought about a "top value" for all name cognates; maybe it's a really good idea against this crazy splitting and mess in each person item. Wikipedias which have pages for several variants of the same name (like en-wiki) can also have one common article (like "Constantine and variants") which would be linked inside the top value. --Infovarius (talk) 03:54, 18 October 2014 (UTC)
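The "1-to-many" versus "many-to-many" point can be illustrated with a small count. The Python sketch below compares pairwise links between every two variants with a single link from each variant to a "top" item. The "TOP" identifier is a placeholder, no such property or item is claimed to exist, and the variant list simply reuses spellings quoted elsewhere on this page.

```python
from itertools import combinations

# Spellings quoted in the Constantin / Konstantin discussion on this page.
variants = [
    "Constantinus",
    "Κωνσταντίνος",
    "Constantine",
    "Konstantin",
    "Constantin",
    "Константин",
]

# Many-to-many: every variant linked to every other variant.
pairwise_links = list(combinations(variants, 2))

# 1-to-many: every variant linked once to a hypothetical "top" item.
top_links = {variant: "TOP" for variant in variants}

print(len(pairwise_links), "pairwise links vs", len(top_links), "links to the top item")
```

For n variants the pairwise model needs n(n-1)/2 links while the top-item model needs n, so the gap widens quickly as more spellings are added.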

Nikola

Nikola (Q18247339) and Nikola (Q15501913) were merged by Milicevic01 a few days after I worked with these items and let my bot add given name (P735): Nikola (Q15501913) and given name (P735): Nikola (Q18247339). I think this should be resolved by unmerging, but members of this project could do a better cleanup than I could... Matěj Suchánek (talk) 14:56, 12 November 2014 (UTC)

I'm unmerging the items. Meanwhile, could you explain to Milicevic01 why the merge wasn't the optimal solution? --Dereckson (talk) 15:16, 12 November 2014 (UTC)
Er... could you indicate the exact given names of these two items? I can't distinguish between the two Nikola. --Dereckson (talk) 15:20, 12 November 2014 (UTC)
@Dereckson: I have just realized that my organisation wasn't good either. Here is a summary of my thoughts:
Male given name Female given name Unisex given name Not known
enwiki, bgwiki, srwiki, slwiki plwiki cswiki, skwiki, nowiki, hrwiki, ptwiki
Q18247339 (mostly my bot added backlinks) Q15501913 (mostly my bot added backlinks) to be created ?

Matěj Suchánek (talk) 18:37, 12 November 2014 (UTC)

@Matěj Suchánek: I can tell you that hrwiki says Nikola is a given name, later lists both male and female variations of the name, and points out in which countries it is used as a male name and in which as a female name, so I guess it is unisex? I also added slwiki --Milicevic01 (talk) 20:07, 12 November 2014 (UTC)
well, in this case, personally, I would think only one item, with unisex given name (Q3409032), since it is exactly the same name, so one for male and one for female is one too many :) --Hsarrazin (talk) 20:57, 12 November 2014 (UTC)
Like Hsarrazin, I've also a preference to use one item per first name variant, but keep one item for the use of the same firstname in different languages, so I concur with a merge too. --Dereckson (talk) 23:41, 12 November 2014 (UTC)
"Nikola" seems to be less clear-cut than "Jean", so it seems hard to formulate a way how some may be differentiated. --- Jura 07:47, 13 November 2014 (UTC)
They mention men in pl-article too. --Infovarius (talk) 08:35, 16 November 2014 (UTC)

Congratulations! So we now have a monolingual property for names. I believe that this property will not be as controversial as the previous given name (P735), so I propose to pay more attention to the new property. --Infovarius (talk) 21:32, 16 November 2014 (UTC)


Please see Property_talk:P53#Other_families. --- Jura 07:51, 21 December 2014 (UTC)

Language for given name items

Please see Wikidata:Property_proposal/Generic#Language. --- Jura 07:51, 21 December 2014 (UTC)


Name items

There are some items that link to articles that describe the use of a name both as a given name and as a family name (example: en:Alonso).

These articles are different from articles that just describe a surname and mention that it's being used as a first name too.

Currently, two ways are suggested for these:

(A) See the approach used in the examples Bruno (Q955175) or Patrick (Q4927850):

(B) Alternate solution:

  • (B) doesn't have the advantages and inconveniences of (A).

Maybe there are other ways to solve this. --- Jura 10:30, 21 December 2014 (UTC)

I would use (B) and add has part(s) (P527) on the general item, with the family name and the given name as values, so we can still find these easily. --Harmonia Amanda (talk) 11:57, 21 December 2014 (UTC)
Sounds good. --- Jura 22:23, 29 December 2014 (UTC)

You might want to participate in the discussion, it's about the item "François". --- Jura 22:23, 29 December 2014 (UTC)


WikiProject Names in "Class Instance Analysis"

From Wikidata:Project_chat#Class_Instance_Analysis:

  • There are 40k+20k names: 40038 family name, 10320 given name, 5569 male given name, 4828 female given name.
    • Due to the good efforts of the WikiProject "Wikidata names", these items provide valuable information on names themselves, eg variations, male/female correspondences, etc.
    • This can probably be used for disambiguation or for generating language-specific name variants, but we have not investigated this topic

Nice :) --- Jura 16:40, 28 January 2015 (UTC)

Aliases for non-English characters

Generally, names of languages in Roman script will not have aliases. James (Q677191) would not have the alias "Jim", for example. However, in English it would seem to still make sense to do the normal aliasing to handle accented or other special characters not normally used in English. For example, José (Q2190619) would have the alias "Jose". In these situations, both accented and non-accented usage are widely used in references. Is there any reason this should not be permitted? Josh Baumgartner (talk) 19:11, 23 March 2015 (UTC)

The alias for mere ease of finding the accented version seems fine, but it probably shouldn't replace the un-accented version (item) entirely. --- Jura 19:17, 23 March 2015 (UTC)
Agreed, where there are both accented and non-accented names in use, both should have their own items. Josh Baumgartner (talk) 21:14, 27 March 2015 (UTC)

Aliases for non-Roman script

Q4925477 with the label "John" (in English) has "John" as alias for ru and be.

Based on this, I set a similar alias for the more frequent names (I think I limited it to names with > 1000 uses). Shall I do the same for other languages? Samples: ja or zh? --- Jura 10:34, 15 April 2015 (UTC)


Finally! 1,825,881 compared to 1,821,876.

From the department of pointless statistics ;) --- Jura 17:30, 18 April 2015 (UTC)


Top-down approach

Following a bot request by @Sascha:, I finally tried to write down what could be such an approach (section at Wikidata:WikiProject_Names#How_to_clean_up_given_name_items_.28top-down_approach.29, previously empty).

Please suggest more. --- Jura 17:36, 21 April 2015 (UTC)


Japanese names

Where to start? They are currently filling up the newly created Wikidata:Database reports/Most linked disambiguation page items (mostly surnames) and some of the constraint reports.

Personally, I had been avoiding them as there can be an issue with the sequence of family name/given name. Maybe this isn't so much an issue as English labels might be using the standard order. --- Jura 03:41, 6 May 2015 (UTC)

Looking at some of the names, there seem to be exceptions. Maybe it's mainly for persons who don't have an article on en.wikipedia.org. --- Jura 06:25, 13 May 2015 (UTC)
Among the items, there seem to be quite a few like Q6360446 that are marked as disambiguation, but only link to items for the given name. --- Jura 06:47, 19 May 2015 (UTC)
There are items like Wikimedia disambiguation page (Q4167410) that are disambiguations on Wikipedias, but they really refer to one name (Mochizuki/望月 in this case). So I think these items should have instance of (P31) => Wikimedia disambiguation page (Q4167410) removed. —Wylve (talk) 14:23, 19 May 2015 (UTC)
I fixed a few of those manually, but given the number of them (several hundred used on 12000+ items), it might be easier to replace Wikimedia disambiguation page (Q4167410) with given name (Q202444) for all Japanese names there. --- Jura 09:05, 21 May 2015 (UTC)
Yes, I tried to correct them manually but it seems like it's only names and not "real" disambiguation pages. We will have less work correcting given name (Q202444) to Wikimedia disambiguation page (Q4167410) than Wikimedia disambiguation page (Q4167410) to given name (Q202444), I think. --Harmonia Amanda (talk) 11:42, 21 May 2015 (UTC)
Ok, I went ahead and changed the following ones: Q10312977, Q3814192, Q8060712, Q3194472, Q2875045, Q7827665, Q376265, Q1906740, Q3199149, Q5752764, Q3566615, Q3453970, Q6381801, Q1618042, Q8049964, Q6455143, Q7496332, Q6849888, Q11738555, Q8062787, Q5770234, Q7385588, Q7705328, Q5996341, Q3532641, Q5389041, Q7677975, Q6387191, Q8056124, Q6444650, Q6381390, Q7820325, Q7706920, Q7446720, Q6378245, Q7681648, Q5752722, Q1609557, Q7678557, Q5675603, Q6311404, Q9286649, Q8056012, Q6314440, Q6381549, Q6380666, Q5508575, Q7706863, Q7385648, Q6862794, Q6782534, Q6378087, Q7827685, Q5771368, Q4803426, Q3519203, Q7820113, Q6381477, Q5771436, Q5770644, Q7496317, Q6381402, Q6378214, Q8049934, Q7827748, Q5770464, Q6381748, Q3189975, Q5771423, Q7678610, Q1666942, Q6782731, Q5752312, Q7827655, Q6883309, Q8062778, Q7674305, Q7496343, Q6884537, Q6455275, Q5675731, Q8049958, Q7688335, Q7674394, Q6782538, Q7827733, Q7674265, Q6392056, Q7677045, Q7496310, Q7415457, Q5770589, Q7677432, Q7385640, Q6410131, Q5772304, Q7688273, Q7674762, Q7497855, Q6389407, Q585483, Q8060684, Q6782282, Q6410145, Q8056328, Q6884482, Q6782694, Q6455199, Q5770184, Q7827727, Q7820193, Q6875091, Q6849876, Q6381484, Q4700946, Q5772471, Q6837942, Q6381730, Q8050007, Q7678034, Q7496321, Q5349231, Q6782402, Q8060695, Q8062705, Q5349988, Q7385612, Q8056310, Q8056300, Q7333107, Q7310136, Q6782815, Q5349260, Q6782681, Q8056298, Q8050125, Q7827662, Q5752434, Q8050046, Q5771411, Q8056182, Q7674397, Q6444686, Q8050113, Q7827743, Q925943, Q7681463, Q6381738, Q5102673, Q7677144, Q7674401, Q7506285, Q5530781, Q6883744, Q6383869, Q4701238, Q7505114, Q7336794, Q6418996, Q8061641, Q7960775, Q7827814, Q7705352, Q6378053, Q7638831, Q7428553, Q6883306, Q3856896, Q6413889, Q1741213, Q8062766, Q8062743, Q8049954, Q7695023, Q5560380, Q7678596, Q3482314, Q5508600, Q7402978, Q9071518, Q5771401, Q6419012, Q5752501, Q8049961, Q7827677, Q7705294, Q7688280, Q7688277, Q2365258, Q7496340, Q7397616, Q6381784, Q8050117, Q7850222, Q7678172, 
Q7677968, Q4817756, Q6883198, Q5752454, Q6837934, Q6837904, Q6782546, Q6447424, Q5772430, Q7820328, Q7818898, Q7705310, Q3200459, Q9355297, Q7674398, Q5100103, Q6782395, Q5509874, Q8062648, Q7827680, Q7688265, Q7499483, Q6381422, Q8056132, Q6434171, Q7813589, Q5349770, Q7678655, Q7677437, Q5752491, Q7496350, Q7403211, Q7397505, Q8060763, Q8050037, Q7830674, Q7830637, Q7705317, Q4830936, Q7505006, Q6405993, Q6883384, Q6837949, Q7635744, Q225187, Q248718, Q7379297, Q7335057. About 250 for 9000 items. --- Jura 04:04, 22 May 2015 (UTC)
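For a batch of this size, the change described above (dropping the disambiguation value of P31 and adding given name instead) could also be prepared as a QuickStatements batch. The following is only a sketch: it assumes the tab-separated command format of current QuickStatements, where a leading minus removes a statement, and the function name is made up for illustration.

```python
# Sketch: build QuickStatements-style commands that swap one P31 value for another.
# Assumption: tab-separated "item<TAB>property<TAB>value" lines, and a leading "-"
# to remove an existing statement. swap_p31_commands is a hypothetical helper.

DISAMBIGUATION = "Q4167410"  # Wikimedia disambiguation page
GIVEN_NAME = "Q202444"       # given name


def swap_p31_commands(qids, old=DISAMBIGUATION, new=GIVEN_NAME):
    """Return two command lines per item: remove the old P31 value, add the new one."""
    lines = []
    for qid in qids:
        lines.append(f"-{qid}\tP31\t{old}")
        lines.append(f"{qid}\tP31\t{new}")
    return lines


if __name__ == "__main__":
    # First three items from the list above, used as sample input.
    sample = ["Q10312977", "Q3814192", "Q8060712"]
    print("\n".join(swap_p31_commands(sample)))
```

The output would still need a manual check before being run, since some of the items are marked both "family name" and "disambiguation".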
It's now down to about 2000 (here). The Japanese ones I left out are some that are marked "family name" and "disambiguation". Maybe for these just a second item needs to be created. --- Jura 04:51, 22 May 2015 (UTC)
The problem with some of these items is that they are purely transliterations. For example, Rinko (Q7335057) refers to all possible combinations of kanji that are pronounced "Rinko" when spoken. The en.wp site link on that item already shows that "Rinko" refers to both 凛子 and 倫子. These two are different names with different meanings that coincidentally have the same romanization and pronunciation. If these name items are used on person items then they would lead to inaccuracy. I suggest that we fix these items by looking at the ja label. If the ja label is unavailable then we need to manually separate the romanizations into actual names. —Wylve (talk) 06:07, 22 May 2015 (UTC)
Yes. For Special:Search/Yuriko given name this is nicely done (currently 3 variants). In these cases, the Japanese spelling should probably go into the label, not the description. For Yuriko, we might want to create an undifferentiated item. --- Jura 06:15, 22 May 2015 (UTC)
I'm not so sure about an undifferentiated item. To me, it doesn't have any use. It can be used to link the three differentiated items together, but that would not be a semantic relationship, but a linguistic one. —Wylve (talk) 07:52, 22 May 2015 (UTC)
For Wikidata contributors, it has the advantage that they could use that if they don't know which other one applies (people like me). It may also be that (sample) an American has the given name "Yuriko" without relating it to any Japanese spelled name, or at least none that is publicly known. Most given name pages at English Wikipedia would probably also be linked from such an item. --- Jura 07:59, 22 May 2015 (UTC)
I see. Anyway, I'll try to differentiate some items in the coming days. Which property should I use to link the differentiated items to their undifferentiated counterpart? —Wylve (talk) 08:28, 22 May 2015 (UTC)
Not sure, which one would you use? P:P527/P:P361 and P:P460 come to my mind. What do you think of adding the Japanese spelling to the label (such as "ゆり子 (Yuriko)")? For any person, we could probably add the undifferentiated item in P735 as well. Maybe we should create a P31 value for the undifferentiated ones. --- Jura 08:38, 22 May 2015 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────About the property, I'm reluctant to use said to be the same as (P460), since it begs the question, "if two items are said to be the same, then why aren't the items merged? If they can't be merged, then they couldn't be the same but only similar". I don't feel that has part(s) (P527)/part of (P361) are suitable either, since the items do not form parts of each other. They are merely related in that they have the same pronunciation. As to the label, it would certainly be convenient for editors, but I feel that labels should be confined to their own language. Alternatively, editors can enable ja in their babel boxes. About the new P31 for undifferentiated items, we first need to separate items with disambiguation sitelinks that only disambiguate between names from those that also disambiguate between places with the same spelling. The former can either use transliteration (Q134550) or given name (Q202444), depending on whether the name is also used in other languages besides Japanese. The latter should stick with Wikimedia disambiguation page (Q4167410). —Wylve (talk) 09:06, 22 May 2015 (UTC)

So which property do you plan to use? "different from" could be another option.
In the recent plan for Wiktionary (for 2020?), there is a new datatype. One that would have just one label for all languages. For given names, something along these lines could be an option as well. The labels are already mostly identical. The Romanization would just need to go into a statement. --- Jura 15:10, 22 May 2015 (UTC)
Maybe we should try to add a section about this to the project page. --- Jura 07:03, 23 May 2015 (UTC)
I've given it some thought and maybe we don't need a property linking them. They are merely synonyms and we don't collect those until we roll out Wiktionary support. —Wylve (talk) 03:31, 24 May 2015 (UTC)
I'm afraid if we don't link them together, people are likely to merge them or ignore that they exist. I wouldn't count on Wiktionary support as nothing is really certain about it (if/when/how).
We could create a dedicated property for the Japanese names. If at some point there is a Wiktionary feature for that, it can easily be converted to that, the information already being structured. --- Jura 05:46, 24 May 2015 (UTC)

The report has finally a manageable size .. --- Jura 07:04, 23 May 2015 (UTC)

The bad news is that there are still 4000 items mixing given names with disambiguations and/or family names (list). --- Jura 14:36, 23 May 2015 (UTC)
It seems that most are unused (3200) or rarely used (450), just about 100 to 300 need work. --- Jura 16:01, 23 May 2015 (UTC)

Korean versus Chinese surnames

Hello! I've recently been editing records of Korean persons, adding surname fields and such. I'm finding inconsistency between how different names are handled on Wikidata/Wikipedia, but I don't know enough about names to know why. For instance, the name "Lee" has one record for the Chinese name (Q686223) and one for the Korean name (Q13498149) even though they both use the same Chinese character. However, other names, such as the Korean surname "Heo", are combined with the Chinese name ("Xu") with which they share a character. So if I add a surname field to a Korean person named Heo, it shows their surname is Xu (they are different pronunciations of the same character - like Japanese, Korean uses Chinese characters but with its own pronunciations). I was wondering if it's okay for me to create, say, a new record for the Korean surname "Heo", or will I mess up some existing master plan? Again, I don't know the reasoning behind the way it's currently handled so I don't want to go messing around with things without checking first. So I'd love any advice on how to proceed in a case like this. Thanks so much! Shinyang-i (talk) 01:57, 28 May 2015 (UTC)

The surname part hasn't been worked on much yet.
There may be no reasoning behind it, except that people like me wouldn't see the difference (I can't read either language).
If the names are different, it seems reasonable to have separate items, this even if in some languages, two different names are spelled the same. --- Jura 03:05, 28 May 2015 (UTC)
Thanks for the response! I think I will go ahead and create the Korean surname records, and if later on the records are deemed unnecessary they can be merged or something like that. There aren't actually that many Korean surnames even in existence, and even fewer that are common. I will also make some Korean given name records. I made a few then worried maybe I shouldn't be doing so. Is that okay, too? I can't fill in much info but at least they'd exist. Is there a place where editors should list any name records they create, for the sake of organization? Thanks again! Shinyang-i (talk) 01:04, 29 May 2015 (UTC)
I suppose it depends on how you prefer to work.
I think it would be worth creating separate items for each of the names on en:List of Korean family names. A few may already have distinct items, but some might mix this up with given names and other things. Does the Reasonator: List of people with the Korean family name Lee make any sense? Note you can display it with lang=ko as well. Running Reasonator on the Korean surname Lee obviously does the same.
I would create the items beforehand even if the list was much longer. It takes some time to work out how, but it can be done with QuickStatements. The list could also be copied over to Wikidata. Not sure which Wikipedia on list of Korean surnames (Q5934917) has the most complete one. Depending on how the items are defined "TAB" can generate a live list: List of Korean surnames. If you need additional properties for these names, they can be requested at PP/T (it might take some time though).
For English given names, I started adding properties based on automated queries like this. It doesn't work well for English language surnames as querying for the last part of the label is slow. As Korean surnames come first, using language "ko" and the name might work as well. --- Jura 04:43, 29 May 2015 (UTC)
Thanks for the great response! You've given me a lot to think about. I'm actually primarily working on people records, and the name issue has come up as part of that. For now I'm going through all the Korean surnames on ko-wiki, because I'm finding a lot of them are already on Wikidata, they just have no English labels or descriptions. I'm not in a position to request properties, as I can't add much to the records besides bare bones, ha ha. I'll check out some of the tools you've listed! Thanks again! Shinyang-i (talk) 06:30, 29 May 2015 (UTC)
For storing transliterations, it might be worth creating new properties (similar to P:P1721). --- Jura 10:51, 29 May 2015 (UTC)
One is already about to be created: Wikidata:Property_proposal/Person#McCune-Reischauer. --- Jura 18:51, 29 May 2015 (UTC)
Oh excellent, that's something that never even crossed my mind. The other major romanization system is Revised Romanization. Those are the two "official" romanization systems. However, in reality, there are often a number of other romanizations that are actually used by Koreans for the spellings of names. I've added them to the "also known as" fields to assist in searching, but are those the kinds of things that should be part of the "official record"? Shinyang-i (talk) 16:31, 30 May 2015 (UTC)
Actually, in thinking further, they definitely should be. You can't really have a complete record for a person without including all widespread spellings of his/her name, especially when used professionally. Mc-R is designed to assist people in knowing how to pronounce words correctly, hence the various accent marks and such; it's not really used in other contexts these days. Revised Romanization is "official", but I've rarely seen it used in practice for anything but names of major cities. Person name romanizations are officially whatever they registered with the government shortly after birth, and rarely follow any standardized romanization system. Usage in the media can vary based on circumstance and market. Yet all of these romanizations are relevant for notable persons. Example: 김재중 is the name of a famous singer. McCune-Reischauer: Kim Chaejung (used by no one); Revised Romanization: Gim Jae-jung (used by no one); romanization by the subject himself (on his website, album covers, etc.): Kim Jae-joong/Jaejoong/Jae Joong; typical romanization by Korean media: Kim Jae-jung/Jaejung/Jae Jung; romanization for Japanese market: Kim Jejung; official spelling on birth certificate: who knows. So I think three properties are needed: one for Mc-R, one for RR, and one for "other commonly-used romanizations" or something like that. The record would be very incomplete without the latter. Shinyang-i (talk) 17:28, 30 May 2015 (UTC)
Personally, I prefer them in a formatted way, but currently strings in properties can't be searched easily. We can work around this by bulk-copying all string values into aliases once in a while.
I gave some more input in the Mc-R discussion, maybe it will be made available. If you plan on using the other two properties, I suggest you make proposals for them as well. I can help you convert current aliases into property values. --- Jura 04:38, 31 May 2015 (UTC)
Thanks for the input! Right now, aliases are inconsistent, with some records having many and others none. I was wondering if you know, from a technical point of view, if there is a way to make it possible so that "Jaejoong", "Jae Joong", and "Jae-joong" can all be "seen" the same way, as this is part of what makes the number of romanizations for Korean names spiral out of control so much. Same for "Kim Jaejoong" versus "Jaejoong Kim". You may have already addressed this issue, but I am totally unknowledgeable about the different data types I've seen mentioned, which is another reason I can't make any proposals yet. I have to learn about all that. I'll watch for the availability of the new Mc-R property and see how that works out. Thanks again, I'm learning a lot. Shinyang-i (talk) 05:40, 1 June 2015 (UTC)
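One way to treat such spelling variants as equivalent outside of Wikidata itself is to compare them through a normalized key. The sketch below is only an illustration of that idea in Python; both function names are made up here, and nothing like this is known to be built into Wikidata search.

```python
import re


def variant_key(name: str) -> str:
    """Collapse whitespace, hyphens and case so that spelling variants compare equal."""
    return re.sub(r"[\s\-]+", "", name).casefold()


def name_order_key(full_name: str) -> tuple:
    """Order-insensitive key: normalize each whitespace-separated part, then sort the parts.

    Limitation: a given name written as two separate words ("Jae Joong")
    yields three parts and will not match the two-part forms.
    """
    parts = [variant_key(part) for part in full_name.split()]
    return tuple(sorted(parts))


if __name__ == "__main__":
    for spelling in ("Jaejoong", "Jae Joong", "Jae-joong"):
        print(spelling, "->", variant_key(spelling))
    for spelling in ("Kim Jaejoong", "Jaejoong Kim", "Kim Jae-joong"):
        print(spelling, "->", name_order_key(spelling))
```

Different romanization systems (for example "Jaejoong" versus "Jaejung") are not unified by this kind of key; those would still need to be recorded as separate aliases or property values.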

────────────────────────────────────────────────────────────────────────────────────────────────────Not sure. I wouldn't worry too much about GUI issues. It tends to change and adapt based on our use of features. If you like to bulk edit the aliases, the following might help: http://quarry.wmflabs.org/query/3846 You can edit the output and use it for QuickStatements. I will eventually change it to include names without any aliases. You can run it yourself by creating a new query on "Quarry". --- Jura 07:48, 1 June 2015 (UTC)

McCune-Reischauer romanization (P1942) was finally created. --- Jura 08:54, 16 June 2015 (UTC)

Hispanic surnames

Just noticed that the proposal is still up at: Wikidata:Property_proposal/Person#second_or_maternal_family_name_of_hispanic_name. Maybe some of the active contributors want to comment. It seems to draw comments primarily from people who don't intend to add statements anyways.

Not sure what it would have to do with double barreled English surnames, but go figure .. --- Jura 04:19, 28 May 2015 (UTC)

second family name in Spanish name (P1950) was created. --- Jura 08:55, 16 June 2015 (UTC)

New datatype: monostring item

Notified participants of WikiProject Names

For names, it could be helpful to have a datatype that displays the same label for any language. Currently, we edit items to ensure that items like Q677191 have identical labels. Transliterations could go into statements. Descriptions and aliases could still be different for each language.

The current Wiktionary proposal (3rd or 4th proposal?) has something in this direction, but doesn't allow sitelinks. As the need isn't dependent on Wiktionary and Wiktionary implementation doesn't have a timeline, one could just define this now.

What do you think of it? --- Jura 05:10, 29 May 2015 (UTC)

I think that label will be different anyway. E.g. for Q677191 it can be James, Джеймс, or Ջեյմս. And all are valid. --Infovarius (talk) 21:20, 29 May 2015 (UTC)
In the current model yes, but only one version is the actual name, others are just transliterations. --- Jura 04:32, 30 May 2015 (UTC)


2 million items for people with P735 reached

I just updated the statistics. It's now at 2,000,204. --- Jura 22:45, 31 May 2015 (UTC)

Related projects

The discussions listed below gather all debates held to date, sorted by theme, in order from the most general to the most specific: portal, links with other wikis, articles, article content.

You are all invited to take part in these discussions.

Best regards. --Guy Courtois (talk) 21:03, 8 June 2015 (UTC)

Discussions about the portal itself

  • Projet:Anthroponymie/Débat sur la sensibilisation de la communauté Wikipédia à ce portail
  • Projet:Anthroponymie/Débat sur le périmètre du portail
  • Projet:Anthroponymie/Débat sur le titre du portail
  • Projet:Anthroponymie/Débat sur la palette du portail
  • Projet:Anthroponymie/Débat sur la notification de projet

Discussions about links with other wikis

  • Projet:Anthroponymie/Débat sur les liens avec Wikidata
  • Projet:Anthroponymie/Débat sur les liens avec le Wikitionnaire
  • Projet:Anthroponymie/Débat sur les liens avec les autres projets sur l'anthroponymie des Wikipédia dans d'autres langues

Discussions about the articles

  • Projet:Anthroponymie/Débat sur les possibles regroupements des surnoms, prénoms et noms de famille au sein d'un même article
  • Projet:Anthroponymie/Débat sur le regroupement des variantes
  • Projet:Anthroponymie/Débat sur le choix entre "nom de famille" ou "patronyme"
  • Projet:Anthroponymie/Débat sur les liens entre les articles de "noms de famille" et les articles de "famille"
  • Projet:Anthroponymie/Débat sur les pages d'homonymie
  • Projet:Anthroponymie/Débat sur les titres des articles de famille
  • Projet:Anthroponymie/Débat sur les surnoms

Discussions about article content

  • Projet:Anthroponymie/Débat sur la structure type d'un article
  • Projet:Anthroponymie/Débat sur les infobox
  • Projet:Anthroponymie/Débat sur les possibles contenus encyclopédiques dans les articles
  • Projet:Anthroponymie/Débat sur les sources
  • Projet:Anthroponymie/Débat sur la catégorisation des articles et le choix des portails

Hi,

I tried something: using named after (P138) on Kiefer Sutherland (Q103946) and Kiefer Ravena (Q6405173). Is it the right way?

Regards, VIGNERON (talk) 22:22, 23 August 2015 (UTC)

I had tried the same at Q18002970#P735. --- Jura 06:27, 24 August 2015 (UTC)
Looks good to me except that I would like to see a reference for stuff like this that isn't obvious. Joe Filceolaire (talk) 13:42, 24 August 2015 (UTC)

Language property, replacement of language of work or name (P407) (for names only)

Please see the proposal at Wikidata:Property_proposal/Term#language_of_name. Please help save it from the controversy about what property to apply to works (unrelated to our topic). --- Jura 06:27, 24 August 2015 (UTC)


Given names: plateau at 72%

No P735
Sitelink      Items      Share of total
en            245,188    19%
ru            118,265    40%
ja            116,446    44%
zh            97,169     68%
fr            70,335     15%
de            68,575     11%
pl            42,287     15%
es            41,077     16%
pt            31,559     19%
nl            27,348     15%
ar            25,147
fa            23,881
eo            4,637      12%
any           802,752    28%
all, but *    445,761    17%*
* excluding [ja,zh,ru,uk,ar,fa]

While the quality of the items for given names improves, the coverage of items for people stagnates at 72% (items with P31=Q5 with P735 compared to all items with P31=Q5).

How should we go about taking this further? There are about 800,000 items that still lack P735.
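As a rough cross-check of the figures above: the 72% coverage and the 28% in the "any" row of the table are complements of each other. The sketch below (Python, with a helper name made up for illustration) derives the implied totals from the table. Since the 28% is a rounded value, the resulting totals are only approximate.

```python
def coverage(with_p735: float, total_humans: float) -> float:
    """Share of items with P31=Q5 that also carry a P735 statement."""
    return with_p735 / total_humans


# Figures taken from the "any" row of the table above.
missing = 802_752      # items for people without P735
missing_share = 0.28   # rounded percentage from the table

# Derived values (approximate, because missing_share is rounded).
implied_total = missing / missing_share
implied_with_p735 = implied_total - missing

if __name__ == "__main__":
    print(f"implied total of items for people: {implied_total:,.0f}")
    print(f"implied items with P735: {implied_with_p735:,.0f}")
    print(f"coverage: {coverage(implied_with_p735, implied_total):.0%}")
```

The implied count of items with P735 comes out slightly above 2 million, which is in line with the 2,000,204 reported at the end of May.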

Personally, I skip Japanese names, but even these only account for about 100,000.

As there are some items that won't ever have P735, maybe we are currently way beyond 72%. Some have already been excluded with "novalue" (sample: Q734717#P735).

I will try to reduce the number of items with links to Esperanto, maybe it helps me come up with a solution for the remaining ones. --- Jura 14:34, 2 September 2015 (UTC)

family names/disambiguations messed up

if someone has time, please see Special:Contributions/114.191.246.187, the usual wrong edits by SU: checks/reverts/improvements needed (<5 % correct edits, majority to be reverted). Holger1959 (talk) 10:43, 6 September 2015 (UTC)

nearly all checked now, only a few left, see the ones marked with "current". Holger1959 (talk) 14:29, 6 September 2015 (UTC)
It seems to be the same as the one that is being followed on the admin board. --- Jura 06:16, 9 September 2015 (UTC)

"family name" versus "surname"

Should the description of "instance of family name" be "family name" or "surname"? It seems like no label is standard and it would be nice if it was. "Surname" is ok, but the plain English "family name" conveys the information more clearly, especially where not all readers will have English as first language. The other property is "given name", so they should match. --Richard Arthur Norton (1958- ) (talk) 22:34, 10 September 2015 (UTC)

In some countries the family name is first. Using surname can cause confusion in those cases so I prefer family name. Joe Filceolaire (talk) 22:25, 12 September 2015 (UTC)
Should we migrate all of them from "surname" to "family name"? --Richard Arthur Norton (1958- ) (talk) 21:22, 17 September 2015 (UTC)

Groupings of first-name variants: occurrence statistics; and representation

Notified participants of WikiProject Names

Project members may be interested in these tables, which include occurrence statistics for families of first-name variants (ie François / Franz / Frank / Francis etc), as well as queries for multiple individuals with those names sharing the same dates of birth and death, for investigation.

The lists are derived from this query tinyurl.com/pv64gfg (female names), and tinyurl.com/p6xc6yv (male names) on the new SPARQL query service -- be aware that the query for male names may or may not run within the time limit for queries -- it's right on the limit of what is possible.

The queries count how many people have given names that are said to be the same as (P460) each name.

Note that this means that there are many returns for each first-name family, with the 'best inbound connected' versions of the name occurring first.

It would be nice to produce a query where each different family of first-name variants occurred only once. It may be possible to do this, but the present organisation of items makes it a lot less easy than it could be.
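One possible way to get each family only once, outside of SPARQL, is to export the said to be the same as (P460) pairs and merge them into connected groups. The following Python sketch does this with a simple union-find; the function names and the sample counts are invented for illustration and are not taken from the queries above.

```python
from collections import defaultdict


def name_families(same_as_pairs):
    """Group names into families: two names share a family if a chain of P460 links connects them."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    return list(groups.values())


def family_counts(same_as_pairs, people_per_name):
    """Return (total people, sorted member names) per family, largest family first."""
    rows = []
    for family in name_families(same_as_pairs):
        total = sum(people_per_name.get(name, 0) for name in family)
        rows.append((total, sorted(family)))
    return sorted(rows, reverse=True)


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    pairs = [("François", "Franz"), ("Franz", "Frank"), ("Francis", "Frank"), ("James", "Jacob")]
    counts = {"François": 10, "Franz": 5, "Frank": 7, "Francis": 3, "James": 20, "Jacob": 4}
    for total, members in family_counts(pairs, counts):
        print(total, members)
```

This treats P460 as if it were transitive, which may merge groups that some would prefer to keep apart (for example Jurgen-names and George-names).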

I wonder if it would be worth introducing some new items to represent such families -- for example, a new item like "given name cognate to James or Jacob", or "set of given names cognate to James or Jacob", that could contain the grouping information all in one place, and make the querying much easier.

One would then have either

James (Q677191) .. instance of (P31) .. "given name cognate to James or Jacob" .. subclass of (P279) .. male given name (Q12308941)

or

James (Q677191) .. part of (P361) .. "set of given names cognate to James or Jacob" .. instance of (P31) .. "set of given names" .. subclass of (P279) .. "set of names" .. subclass of (P279) ..

I think the first structure would be neater.

Such name-groups could also be nested, for example

Jørgen (Q13409273) .. instance of (P31) .. "given name cognate to Jurgen" .. subclass of (P279) .. "given name cognate to Georg or Jürgen" .. subclass of (P279) .. male given name (Q12308941)

What do people think: would this be a useful step forward? Jheald (talk) 14:29, 12 September 2015 (UTC)

Hello Jheald,
in fact, you stress a problem that has long ago been already noticed, and discussed, but not acted upon, since there were other priorities.
I don't know exactly what "given name conjugate to James or Jacob" means (English is not my 1st language), but I guess it's something like the 4th solution I had proposed : a top item, that would allow to join all forms, for stats, interwikis, etc. Is it right ? if yes, then, you have my full support, of course :) — (see above discussions about Konstantin, and Need for discussion…). --Hsarrazin (talk) 14:50, 12 September 2015 (UTC)
Sorry, I meant to write "cognate to" rather than "conjugate to", (ie "prénom apparenté à Jacques ou Jacob" ?), as a possible name for a single item relating to Jacobus, Giacomo, Jacques, James, Jakub etc.
It would make stats a lot easier, because one could just count for ?name_family where ?item wdt:P735/wdt:P31 ?name_family and ?name_family wdt:P279+ wd:Q12308941. This would be far more efficient than the query I've given above; and would have the advantage that each family would only show up once.
It wouldn't solve all of the problems for interwiki links (eg there could still be "Bonnie and Clyde" issues), but I think it could certainly help ease some of them. Jheald (talk) 15:38, 12 September 2015 (UTC)
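To make the counting idea above concrete, here is a small in-memory sketch in Python of what the pattern ?item wdt:P735/wdt:P31 ?name_family with ?name_family wdt:P279+ wd:Q12308941 would compute. The family identifiers (F_JAMES and so on) and the person identifiers are hypothetical, since the proposed family items do not exist.

```python
from collections import Counter

MALE_GIVEN_NAME = "Q12308941"


def is_subclass_of(cls, root, subclass_of):
    """True if `cls` reaches `root` through one or more P279 links (like wdt:P279+)."""
    seen = set()
    stack = list(subclass_of.get(cls, ()))
    while stack:
        current = stack.pop()
        if current == root:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return False


def count_by_family(given_names, instance_of, subclass_of, root=MALE_GIVEN_NAME):
    """Count people per name family, following person -> P735 -> P31 -> family."""
    counts = Counter()
    for _person, name in given_names:
        for family in instance_of.get(name, ()):
            if is_subclass_of(family, root, subclass_of):
                counts[family] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical family items; Q677191 is James, Q13409273 is Jørgen.
    instance_of = {"Q677191": ["F_JAMES"], "Q13409273": ["F_JURGEN"]}
    subclass_of = {
        "F_JAMES": [MALE_GIVEN_NAME],
        "F_JURGEN": ["F_GEORG"],
        "F_GEORG": [MALE_GIVEN_NAME],
    }
    given_names = [("person1", "Q677191"), ("person2", "Q677191"), ("person3", "Q13409273")]
    print(count_by_family(given_names, instance_of, subclass_of))
```

Note that, as in the query pattern, a person is counted only under the family their name is a direct instance of; with nested families, totals for a parent family would need an extra step that sums over its subclasses.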
  • That's an interesting set of queries and summaries.
For related given names, currently Reasonator gives an overview. Samples: James or Jørgen.
Given that it took quite some time to simplify the structure to reach the current one, I'd hesitate to make it more complicated again. If you feel a need for grouping names, maybe a new property should be used instead. --- Jura 14:55, 12 September 2015 (UTC)
One of the problems is that said to be the same as (P460) is rather hard to maintain in the current structure -- there is no "master copy" for each group of names, which is why there are different counts for different variants of the same name in the query tinyurl.com/pv64gfg.
The difficult bit has been establishing different items for each different individual name spelling -- which has been a mammoth job, for which everybody in this project deserves a lot of kudos. What I'm suggesting wouldn't undo that, but would build on it; and I think ought to be a fairly simple one-off bot job to create. Jheald (talk) 15:50, 12 September 2015 (UTC)
Well, sitelinks found on these items and the mixup with disambiguations is another issue.
I'm not sure if you can establish a strictly hierarchical link between anything that is currently listed in P460 and I don't think adding it into P31 makes things easier.
As it's merely for statistics, you might want to do an extract of the items and analyze this directly. --- Jura 16:13, 12 September 2015 (UTC)
Here's a query for the families of first-names, sorted starting with the largest number of variants: tinyurl.com/ok5ke4q. (Sorry, don't yet know how to turn the numeric QIDs back into items). I would propose making a new item "given name cognate to ..." for each of these name-families. That should be straightforward enough. Yes there would still be room for discussion as to whether Jurgen-names were or were not a separable sub-class of George-names; but at least we would then have a structure in which we could analyse such questions. Jheald (talk) 16:25, 12 September 2015 (UTC)
@Jheald: I did the translation from QIDs to labels in the query for {{PropertyForThisType}} (see Properties for the class <human> for an example).
Thanks, I'll look into it, but out of time for now. Jheald (talk) 16:44, 12 September 2015 (UTC)
Version of query with labels added tinyurl.com/n9pfqno. Jheald (talk) 10:02, 13 September 2015 (UTC)
And a (slightly baroque) query to return the female given-name families, with their constituent given-names listed by occurrence frequency, tinyurl.com/nvopozw. Times out when I try to run it for male names, so any rewrites to make it more efficient would be very welcome. Jheald (talk) 12:47, 13 September 2015 (UTC)
Also, if anyone (@Jura1: ?) can see how to get only the most frequent name to be printed out for each group, this would be good. Jheald (talk) 13:00, 13 September 2015 (UTC)

@Jheald: This is a case of classification of names, so yes, I'd say it's perfectly appropriate to use instance of (P31) and subclass of (P279). Using part of is not. author  TomT0m / talk page 16:16, 12 September 2015 (UTC)

Would you have some reference to support your POV about the proposed classification? "Conjugate/cognate of James"? --- Jura 16:37, 12 September 2015 (UTC)
en:Jacob (name), en:James (name) ? :-) We aren't exactly over-run with references for all the said to be the same as (P460) claims. Isn't usefulness of the grouping enough? Jheald (talk) 16:44, 12 September 2015 (UTC)
That's two pages at Wikipedia, not one .. --- Jura 16:47, 12 September 2015 (UTC)
So maybe we also create sub-classes "cognate to Jacob" and "cognate to James" for smaller sub-families of the bigger super-family. That's something we can be entirely free to do, or to add at any point, if eg we find we have particular groups of sitelinks we want to place together.
Besides, how often are any other occurrences of subclass of (P279) or instance of (P31) supported by references? (A query could answer that question :-) ) Jheald (talk) 16:59, 12 September 2015 (UTC)
If you want to map items for statistical purposes, why not create a separate property for it? --- Jura 17:06, 12 September 2015 (UTC)
Because dividing things into smaller groups is exactly what subclass of (P279) is for (as per User:TomT0m above); and the names then fit very neatly into an ontological chain. Let me turn it round: Why are you against refining male given name (Q12308941) in this way? Class refinement in this way is a very Wikidata-ish thing to do (it seems to me). Jheald (talk) 17:14, 12 September 2015 (UTC)
What you suggest is very Wiktionary-ish. Things for which Wikidata isn't ready yet.
I think it adds unneeded complexity.
Obviously, I'm not the only participant of this project, if Hsarrazin wants to maintain it ..
Just scroll two topics back on this page to see what awaits you. --- Jura 17:27, 12 September 2015 (UTC)
I certainly wouldn't want to break any maintenance systems.
But presumably it would be very easy to adapt them to look for incidence in the subtree of male given name (Q12308941), rather than a direct instance of (P31) ? Jheald (talk) 17:33, 12 September 2015 (UTC)
Can you clarify your need? Maybe we can find a better approach? For merely finding duplicates, I don't think P735 is of much use. --- Jura 17:37, 12 September 2015 (UTC)
"I think it adds unneeded complexity". Jura, you've added unneeded complexity already! Now with your infinite p460 webs I don't know frequently which name to choose, because tens of them have the same Cyrillic spelling... --Infovarius (talk) 14:27, 14 September 2015 (UTC)

Must read : internationalisation of names

Members of this project will be interested in this mail posted by Nemo bis on the Wikidata mailing list: https://lists.wikimedia.org/pipermail/wikidata/2015-September/007165.html and especially the master's thesis linked in it: http://ulir.ul.ie/handle/10344/3450, which discusses how to model names in international systems such as Interpol (Q8475) and the most efficient models to capture the naming of people in different cultures. As it seems that right now we mostly have procedures for handling Western naming, this seems especially interesting :) TomT0m / talk page 09:03, 19 September 2015 (UTC)

If it's something WMF employees must do, maybe you should send this to a staff only list. --- Jura 11:40, 27 September 2015 (UTC)

Surnames on enwiki

On enwiki there are many pages like this one: en:Adedoyin. The corresponding Wikidata items, e.g. Adedoyin (Q16479299) have in many cases two claims: instance of (P31)=Wikimedia disambiguation page (Q4167410) and instance of (P31)=family name (Q101352). Is it okay in such cases to remove instance of (P31)=Wikimedia disambiguation page (Q4167410) at least when there is no sitelink to another project? --Pasleim (talk) 14:37, 21 September 2015 (UTC)

In ru-wiki surname articles are mostly disambigs too. --Infovarius (talk) 21:29, 21 September 2015 (UTC)
Infovarius, Pasleim: I think the point of a wikiproject is that we can propose policies and I think this would be a very good policy to propose.
Where a wikipedia page lists people with a family name and related articles named after that family name
- people using the family name as a given name, places named after people with that family name etc. -
then wikidata should treat that article as a family name article and the corresponding wikidata item should have
the statement <instance of:family name> and not <instance of:wikimedia disambiguation page>.
 Support as proposer Joe Filceolaire (talk) 22:45, 21 September 2015 (UTC)
 Support --Pasleim (talk) 16:06, 26 September 2015 (UTC)
 Oppose lacks sample. --- Jura 11:39, 27 September 2015 (UTC)
 Support --Hsarrazin (talk) 23:56, 7 October 2015 (UTC)
 Support --Sascha (talk) 03:29, 15 October 2015 (UTC)

Discussion

There's already a category Category:Disambiguation pages with surname-holder lists (Q8379354). Would the corresponding item to create not be "Disambiguation page with surname-holder list" ? Or does that cause problems with sitelinks, if some wikis have an article on the name, other wikis have a list of name-holders, other wikis have a mix of the two?

Maybe what we need is two items: one for the name, one for the list of name-holders -- with some sitelinks linking to appropriate redirects on the client wikis ? Jheald (talk) 22:39, 7 October 2015 (UTC)

en:Adedoyin isn't a disambiguation page. You don't find it in en:Special:DisambiguationPages because it doesn't contain the tag __DISAMBIG__. --ValterVB (talk) 18:17, 8 October 2015 (UTC) PS: It is a list of persons with the same surname. --ValterVB (talk) 18:20, 8 October 2015 (UTC)

Hi everyone. During the past days I tried to clean up this mass of items that mix surnames and disambiguation pages. There were more than 30,000 of this kind; currently there are still 28,000 left (see SPARQL query). To do this, I did the following with these items:

  • If there's only one sitelink, I removed instance of (P31) = Wikimedia disambiguation page (Q4167410) and changed the descriptions to "family name" (and its equivalent in several other languages) according to the policy above.
  • If there's more than one sitelink, I cannot say if the Wikipedia articles are real disambiguation pages or all refer to the name only. Therefore I added a new item with instance of (P31) = family name (Q101352) and removed this claim from the old item. I did not touch any sitelinks to keep the status quo of linked Wikipedia articles. It might be that some of these articles are about the surname only and therefore no real disambiguation pages, but this has to be decided "by hand" and by someone who speaks that language.
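The two rules above can be sketched as a small decision function. This is only a Python illustration, assuming an item is represented by its list of P31 values and its sitelink count; the function name, the returned action strings and the handling of items with zero sitelinks are assumptions, not part of the procedure described in this thread.

```python
DISAMBIGUATION_PAGE = "Q4167410"  # Wikimedia disambiguation page
FAMILY_NAME = "Q101352"           # family name


def plan_cleanup(p31_values, sitelink_count):
    """Return the cleanup action for one item (hypothetical action names)."""
    values = set(p31_values)
    # Only items carrying BOTH claims are part of the mixed set.
    if not {DISAMBIGUATION_PAGE, FAMILY_NAME} <= values:
        return "skip"
    if sitelink_count == 1:
        # Keep the item as the family name; drop the disambiguation claim.
        return "remove-disambiguation-claim"
    if sitelink_count > 1:
        # Keep sitelinks where they are; the family name moves to a new item.
        return "create-new-family-name-item"
    # Zero sitelinks is not covered by the rules above (assumption: review by hand).
    return "manual-review"


print(plan_cleanup(["Q4167410", "Q101352"], 1))
print(plan_cleanup(["Q4167410", "Q101352"], 3))
```

Whether a multi-sitelink item is really a disambiguation page still has to be judged by someone who reads the linked articles; the sketch only encodes the mechanical part.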

This resolved the disambiguation <-> surname mass for these items. However, today Infovarius raised concerns about this, so I want to discuss my solution before proceeding. Yellowcard (talk) 00:24, 12 November 2015 (UTC)

Family names are probably there now where given name have been a year ago. I'm sure you could work in a similar way. A first step could be to implement the result of the proposal above. --- Jura 08:05, 12 November 2015 (UTC)
Hi, Jura1, what do you mean by "Family names are probably there now where given name have been a year ago"? Where have given names been? According to your last sentence: I'm implementing the proposal above for all items with exactly one sitelink. For pages with more sitelinks, the articles in various languages are usually quite different to each other: Some only refer to the name, some also refer to places or terms. Therefore I don't change the sitelinks in items with more than one sitelink unless it is German and English, so I can decide. Yellowcard (talk) 09:16, 12 November 2015 (UTC)
Have a look at the column with "[4]" (mixed given name items) at Wikidata:WikiProject_Names#Statistics. We spent quite some time trying to clean them up one by one, but a more top-down approach might work better.
With what is left after the above (removing P31:disambiguation from family names without disambiguation category at enwiki), you could just create new items with P31:family name and remove that P31 from the old items. Both items could be interlinked with the "different from"-property. --- Jura 14:20, 12 November 2015 (UTC)
@Jura1: Alright, sounds good. However, there seems to be a certain lack of consensus with Infovarius. Example: Roerich (Q2161495), I removed the claim "is a family name" as at least the German article is also about an asteroid and the English one is furthermore related to a museum and the en:Roerich Pact. So I created Roerich (Q21452679) and added "is a family name" as statement. I just got reverted by Infovarius (he merged the items; Roerich (Q2161495) now again contains both claims, which is very obviously wrong). At the beginning he claimed I wasn't willing to discuss; now, however, it seems he does not take part in this discussion and reverts my changes instead. How to proceed? Yellowcard (talk) 16:52, 12 November 2015 (UTC)
I think several of us have had a similar discussion with him. As disambiguations are about titles with the same string sequence, I'm not sure if Cyrillic sitelinks should be on items with non-Cyrillic page titles. Obviously, disambiguation pages can include all sorts of things and we could be tempted to add properties for anything mentioned there. For given names, it's made clear that we won't use disambiguations in P735. Eventually items for disambiguations might even be replaced by a software feature. We do need to find a solution for names in different scripts (see possible solutions further down on this page), but this shouldn't have an impact on handling items for "Smith", "Jones", "Williams" etc. --- Jura 17:07, 12 November 2015 (UTC)
It seems to me that the problem is that ru.wikipedia is partly messy. Unfortunately, this messiness then tends to spill over to Wikidata. --Leyo 22:46, 12 November 2015 (UTC) PS. See ru:Рерих (51 unreviewed changes since April!) or ru:Служебная:Статистика проверок for more evidence.
The problem with unreviewed changes is specific for this surname (it has some problematic persons). --Infovarius (talk) 13:27, 16 November 2015 (UTC)
Given that some of the structure at WD comes from KrBot, I would be inclined to assume that they are highly structured :) .. it might just be that they are structured in a way different from other wikis (possibly preferring one sentence on a disambiguation page over one sentence in a stub). Besides, their context might get them problems other wikis don't have when writing "the sandwich is made of .. and can be eaten" --- Jura 07:33, 13 November 2015 (UTC)
OK, so I will continue my cleanup work, then. The idea to add a statement with different from (P1889) to both items to express the difference seems to be a very good one. @Leyo: This is by the way a problem that occurred to me during the preparation for the automatic adding of family names to soccer players, so I want to solve it as well as possible before starting with the other job. I'm going to get back to you as soon as I'm done with this one. :) Regards, Yellowcard (talk) 12:59, 13 November 2015 (UTC)
Yes, Jura, you've guessed right that a page with one trivial sentence in ru-wiki would rather get "disambig" status (of course, if there are several pages with similar titles) than stub status. So most of the pages about names/surnames have some "disambig-like" template. I see that Template:surname in en-wiki looks the same as the Russian equivalent and is used on pages with the same content, but (formally) it is not a disambig template while the Russian equivalent is. So I simply don't want to separate en/ru articles only because of the formal existence of the __DISAMBIG__ word. @Yellowcard:, look what you've done to Carradine (Q1044840): the de/en/ru pages are all about surnames only, all contain only persons, so why is it not a surname after all?? --Infovarius (talk) 13:27, 16 November 2015 (UTC)
Infovarius, it simply does not make any sense at all to have one item with two claims "is a disambiguation page" and "is a surname". As I have explained various times now, these two things are completely different: A Wikimedia disambiguation page only exists in the Wikimedia world while a surname exists out there in the real world. So it is clear that there have to be two separate items. That's what I did, I separated both, there is now Carradine (Q1044840) for the Wikimedia disambiguation and Carradine (Q21482744) for the surname. The next question is what item the sitelinks better fit to. I left it as it is; if you can decide that all the Wikipedia articles are about the surname only (I cannot), just move the sitelinks from Carradine (Q1044840) to Carradine (Q21482744). Yellowcard (talk) 14:04, 16 November 2015 (UTC)
I don't need the claim "is a disambiguation" - maybe we should go this way and simply remove all such redundant claims? But we should teach some bots not to add it again... In this case, if I move all sitelinks to the second item, what would it mean that an empty item "is a disambig"? --Infovarius (talk) 18:23, 16 November 2015 (UTC)
If all the linked Wikipedia articles are related to the surname, moving the sitelinks to the surname item would be the best solution. But we can only do this if all articles are related to the surname only. With articles in different languages, this becomes difficult to find out. But that would be the next step, then. Regarding the disambiguation items without sitelinks: I think there is no need for them. Probably they should be nominated for deletion (or redirect to the surname item)? Yellowcard (talk) 19:35, 16 November 2015 (UTC)
In such cases I don't understand the creation of new item and just merge them back. Infovarius (talk) 10:05, 17 November 2015 (UTC)
  • @Yellowcard:, if you create a new item for surname then you should change values in existing items for persons with such surname too. Otherwise they will have claims "have surname <disambig>" which you are trying to avoid. --Infovarius (talk) 10:35, 18 November 2015 (UTC)
    • I think it could be done in batches. Sample (1) create all new items, (2) remove P31 from old items, (3) reset descriptions on old items, (4) move all old uses to new items. Constraint reports (or queries) can help identify (4). --- Jura 10:47, 18 November 2015 (UTC)
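Step (4) of the batch plan above, moving existing uses from the old items to the new family name items, could be sketched as follows. This is a hypothetical Python illustration only: it assumes the existing family name (P734) claims are available as a person-to-value mapping and that steps (1) to (3) have produced an old-to-new mapping; the person IDs are placeholders and no Wikidata API is called.

```python
def repoint_family_names(person_claims, old_to_new):
    """Return a copy of the P734 claims with old (disambiguation) targets replaced."""
    return {
        person: old_to_new.get(value, value)  # unchanged if no new item exists
        for person, value in person_claims.items()
    }


# Mapping produced by step (1); the Carradine pair is taken from this discussion.
old_to_new = {"Q1044840": "Q21482744"}

# Placeholder person IDs, for illustration only.
person_claims = {
    "person-A": "Q1044840",  # still points at the disambiguation item
    "person-B": "Q101352",   # unrelated value, left untouched
}

print(repoint_family_names(person_claims, old_to_new))
```

The input for such a pass would come from constraint reports or queries, as suggested above.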
Exactly, and that's what I originally planned. My cleanup work with the disambiguations is just something that popped up during my preparations. Please give me a little bit of time, I'm doing my best. Help is welcome, by the way. We have about 16,000 of these items left. : Yellowcard (talk) 15:50, 18 November 2015 (UTC)
@Yellowcard: Well, in my watchlist I see items that were already mostly cleaned, like Kelly (Q928249), which just needed some descriptions deleted and was already used as a family name, abruptly transformed into a disambiguation page (where Kelly (Q257429) already existed!) to create Q21507150. So right now what? Are we supposed to merge Kelly (Q928249) and Kelly (Q257429)? And move sitelinks and such? From what I see the simplest way would be to delete the new item and finish cleaning Kelly (Q928249). Less work for everyone.
You know I agree with separating disambig/family names. But I really don't agree with creating errors along the way or creating empty new items when existing ones just needed one or two edits to be clean. --Harmonia Amanda (talk) 11:31, 19 November 2015 (UTC)
I agree that the temporary situation might not be optimal, but compared to the earlier one, I think it's minor. It's really easier to fix 20000 descriptions at once, rather than one-by-one. --- Jura 11:58, 19 November 2015 (UTC)
Oh? Thousands of edits to repair the system? I cleaned Kelly (Q928249) (less than 30 seconds of work). I will delete Q21507150, which doesn't bring anything (except problems). Don't break items that are already mostly clean and correctly used. @Yellowcard: should create disambiguation pages and let the old items be family names, because dab pages are rarely, if ever, used in other items, whereas family names can be used many times. That would mean less work. Leaving items with false descriptions creates a huge mess. When s/he "corrects" only the P31 without also correcting the description, s/he ensures the item will be misused. If s/he really wants to continue like that, there should be guidelines: create a new dab page instead of a family name one if the item is already in use, correct the description so as not to confuse people in the meantime, don't ever create a third item when a dab one and a family name one already exist! These empty third items will have to be deleted later; that's just stupid and confuses matters. --Harmonia Amanda (talk) 12:29, 19 November 2015 (UTC)
Well, if there is already a separate item for the family name, there is no need to create a second one (step 1 in my comment of 10:47, 18 November 2015). I don't quite see how step 2 increases the messy situation. --- Jura 12:55, 19 November 2015 (UTC)
@Jura1: I agree with you but it's not what @Yellowcard: is doing! --Harmonia Amanda (talk) 14:34, 19 November 2015 (UTC)
If there are a few duplicates, we can merge them afterwards. Works fairly well with QuickStatements. --- Jura 14:36, 19 November 2015 (UTC)
Hi Harmonia Amanda, thanks for your input. I'm irritated by your statement that I create a new surname item when there is already an existing one; this shouldn't be happening, and I use a SPARQL query for every single item to check if there is already an existing surname item. Can you give me an example where this happened? I want to prevent this in any case. Regarding Kelly (Q928249): There were two disambiguation items existing. This is a rare and even more messy case I had not taken into account until now: I didn't expect that there could be more than one disambiguation item with the same label. In future, I will do another SPARQL query to also look for another disambiguation item. This issue shouldn't pop up again.
Regarding the other point (this discussion became a little messy, I'll try to separate the different arguments): Sometimes it will be more reasonable to keep the existing item as surname item (remove the disambiguation claim) and create the new page for the disambiguation. This is good when the item is used multiple times on Wikidata but could cause problems regarding the Wikipedia sitelinks as the linked articles (intentionally disambiguation pages, not necessarily related to surnames only) are connected to a surname item, then. This could be completely wrong. Anyway, it is much more difficult to decide about the latter situation due to lacking language skills. Therefore I planned to do it my way, keeping the sitelinks with the disambiguation item. The resulting constraint violations will be fixed in a second step: I will take ALL claims that have family name (P734) → disambiguation page and will repair these statements with the correct surname item. It is, as Jura said, much easier to do it in two batches than to check it in parallel. I'm going to be done with all this in less than two weeks. However, I stopped my work for now and wait for your input. My work might cause a few more edits than the other way, but on the other hand it will prevent existing errors after completion of the second step. Your suggestion could cause hard-to-detect errors regarding the sitelinks. Thanks again, Yellowcard (talk) 16:56, 19 November 2015 (UTC)
@Yellowcard: No, the second Kelly item wasn't a disambiguation page, it was a family name page with some wrong descriptions. And you are creating items without the correct descriptions ("nom de famille" in French, "family name" in English, etc.) which would facilitate the work. I assume you create them with a gadget or some script, not manually, so you should add that. And of course it would help with the errors because you can't have two items with the same label/description. When exactly do you plan to correct the sitelinks you destroy? For Russell (Q1158262) you left all the sitelinks on the disambiguation page when they should be with the family name. And you also left the aliases "*** (family name)" on what are now dab items. It's a mess.
You could:
  • Create a new family name, with labels, descriptions and aliases
  • Create a list of pages whose sitelinks should be verified (because I don't want at all to go through several thousand disambig pages we have already mostly cleaned because you couldn't treat the sitelinks)
  • Correct the descriptions and aliases: if now it's a dab page then the description should reflect that. Use DataDrainer to empty them of wrong descriptions and add after the correct ones.
Kelly isn't the exception. You want another one? Russell (Q1158262) was working as a family name and just needed a little cleanup (and has all the correct interwikis for a family name). Q218032 was the disambig page. Russell (Q21507870) is your new, useless family name. If I continue to poke into your edits, how many more items to delete will I find? --Harmonia Amanda (talk) 17:15, 19 November 2015 (UTC)

Per the example below that I wrote before noticing this discussion, if we have different items for the same family name using <instance of:family name>, <instance of:wikimedia disambiguation page> and also <instance of:family>, then I suggest linking them altogether using P:P460. The reasoning behind this is that their topic is... the same family name, the rest being formatting. Place Clichy (talk) 12:23, 19 November 2015 (UTC)

@Harmonia Amanda: Let's work out your points one by another. "No the second Kelly item wasn't a disambiguation page, it was a family name page with some wrong descriptions." Can you give me the specific ID, please? Kelly (Q928249) was the typical disamb-surname-mix, Kelly (Q257429) was a disambiguation page. Where is the item "Kelly" having been a surname? Yellowcard (talk) 17:31, 19 November 2015 (UTC)
Even if you make reasonable efforts, you can't avoid duplicates entirely. Labels and descriptions aren't necessarily complete and up-to-date. You might want to attempt to standardize some of the existing descriptions for family names. For given names, I had done a few queries with quarry. --- Jura 09:50, 21 November 2015 (UTC)
I add standardized descriptions in various languages (I took them from LabelLister) when I clean up the disamb-surname-mixes. However, I'm only looking at the claims the item contains. If there's instance of (P31) = family name (Q101352), I consider it a surname, and if there is instance of (P31) = Wikimedia disambiguation page (Q4167410) then I consider it an item for a disambiguation page. The descriptions are totally unreliable and are often mixed up (e.g. German "Begriffsklärungsseite" → disambiguation page and English "surname").
However, and this bothers me, Infovarius has obviously started to revert some of my work without reacting on his talk page or giving any reason in the summary. So there are items I had cleaned up that are back in the disamb-surname-mix state. Annoying. Yellowcard (talk) 17:12, 21 November 2015 (UTC)
Yellowcard, you haven't answered my reply about merging, so I decided that you agree with that. I am just correcting your redundant creations when there are no actual disambiguations and the item is about a surname only. And it's annoying that Jura, and now you, don't understand that you are creating problems for Cyrillic languages and are continuing to do that. --Infovarius (talk) 12:36, 22 November 2015 (UTC)

etymology

@Jura1: you reverted both my changes to the paragraph on etymology

The discussion on the proposed property referenced in that paragraph makes clear that adding etymologies in a structured way will not happen until Wiktionary is datafied. The current wording is misleading and should be changed. I think my wording was better.

My other change was to show what we can do using named after (P138) and the limits of that property. I think it was accurate and useful. Please don't revert stuff that improves the content. Joe Filceolaire (talk) 21:56, 7 October 2015 (UTC)

The proposal doesn't really exclude anything. New proposals can be made.
Do you have anything to support your claim that in one year (maybe more) Wiktionary will bring this in a structured way?
Do you intend to contribute to this project in one way or the other? We could need help building the items. --- Jura 22:00, 7 October 2015 (UTC)

Cyrillic - values for given name (P735)

Notified participants of WikiProject Names

How about an approach like: Q21103659? --- Jura 17:02, 14 October 2015 (UTC)

Another approach could be: Q21104340 --- Jura 18:36, 14 October 2015 (UTC)

@Jura1: Both appear to have the potential to be an awful lot of work. I'm really sorry I don't have a better idea. Your personal feelings? Jared Preston (talk) 19:05, 14 October 2015 (UTC)
I do not have a complete picture of the P735 system, so I will simply mention some disadvantages. Q21103659 looks strange, because the English and Chinese languages do not use Cyrillic symbols; these languages have standard transliterations for names. The phrase "Don't use as single value in P735 for people born in the United States." from Q21104340 looks strange too, because human names are cross-country in the general case. -- Ivan A. Krestinin (talk) 19:34, 14 October 2015 (UTC)
A small issue we have now is that an Italian could be named "Ivano" and this would probably get rendered as "Иван" in Russian, but in English it's likely to remain "Ivano". Thus we have an item for "Ivano" with the same label in all languages with Latin script.
A Russian named "Иван" might have his name rendered as "Ivan" in English, but as "Iwan" in German. There is some debate about which item to use in these cases. Q830350 isn't entirely suitable.--- Jura 19:49, 14 October 2015 (UTC)
I understand the problem, but it looks wider. Take for example the German name "Hanns". This name has at least 2 variants of Russian transliteration: Ханс and Ганс. Different German persons have the same German name, but different spellings of the name in Russian sources. The claim <given name (P735)> Hanns (Q16276296) (with double Russian naming) is true of course, but it is inaccurate. So I am afraid that the property given name (P735) cannot be fully consistent in principle. -- Ivan A. Krestinin (talk) 20:35, 14 October 2015 (UTC)
You have been warned. Now probably you should create a separate item for every single person's first name and every single person's surname, since the combination of transcriptions and translations in other languages can be different for every single person. Enjoy ;) --Shlomo (talk) 21:03, 14 October 2015 (UTC)
... one for transliteration into each non-Latin language (>1000). --Infovarius (talk) 17:54, 15 October 2015 (UTC)
:) hopefully, the same Hans wasn't described in all those languages --- Jura 14:39, 18 October 2015 (UTC)
  • @Jared Preston:: Q21103659 might be a bit less maintenance (depending on the bot), the datatype suggested above would make it easy. As Ivan mentioned, Q21103659 can look strange in Chinese, but it makes it clear what the name is. Transliterations would still be available. Q21104340 is closer to what is currently done, but has the disadvantages of that (a transliteration is displayed, but that might not be the one to look for in this specific case). With both, P1705 should be accurate. --- Jura 14:39, 18 October 2015 (UTC)

Indeed it's more fair and circumvents the problem with different transliterations when we would use cyrillic names.--Kopiersperre (talk) 14:37, 10 November 2015 (UTC)

The situation where a language (even en) has several articles for several transcriptions of the same name seems to be the exception rather than the norm. In some cases, these articles are almost empty and can be redirected without trouble. For this reason, I support having a single Wikidata item for different transcriptions of the same name, until at least one language has at least 2 articles for 2 different transcriptions. That way we can have en:Ivan (name)/de:Iwan (name)/ru:Иван/uk:Іван on the same item, which by the way is the current situation on Ivan (Q830350).
Remember, the purpose of Wikidata is also (some would say initially) to provide inter-language links between Wikipedia articles, purpose which cannot be served if these articles are linked to standalone Wikidata items (with zero or very few other Wikipedia links), whereas there are other existing Wikipedia articles that could very well be linked. Place Clichy (talk) 10:17, 23 November 2015 (UTC)
We already have Q19688630. So your suggestion is to merge "Etienne" et "Stéphane" and choose no special approach for other scripts, e.g. Cyrillic? --- Jura 15:38, 23 November 2015 (UTC)
  • Just to be sure, do you guys understand that there is more than one language using Cyrillic script (with different transliteration rules), and that there can even be several traditional transliterations of a name and several transliteration rule systems in one language? For now Wikidata looks definitely unusable for Cyrillic languages. Artem.komisarenko (talk) 15:11, 14 January 2017 (UTC)

Atsuko

How to model the given name Atsuko? In Japanese, there are ~13 different ways of writing it.

Should Q9161797 be split into 13 separate items with labels like {ja:敦子, en:Atsuko}, {ja:篤子, en:Atsuko}, {ja:惇子, en:Atsuko}, etc.? We already have different items for {en:Sara, ja:サラ}, {en:Sára, ja:サラ}; and it’s hard to argue why 篤子/惇子 should be modeled different from Sara/Sarah. Thoughts?

If we split names that have different spellings in other languages, what would be the English description string for these items? With Sara/Sarah, the Japanese description indicates the Latin spelling for disambiguation, see for example Q603701. --Sascha (talk) 04:13, 15 October 2015 (UTC)

Merging and unmerging

Apparently, at the other end of Wikidata, Andrawaag and Sebotic have similar issues with items about genes: Mailing list: Data model explanation and protection.

Most participants of WikiProject Names seem philosophical about such occasional hiccups. Maybe there are some things that tend to work that we can share with them. --- Jura 11:41, 29 October 2015 (UTC)

It seems they figured it out in the meantime. --- Jura 00:39, 31 October 2015 (UTC)


Roman names

How shall we map Roman names? praenomen, nomen, cognomen, agnomen are currently partially mapped to P735/P734. Shall we create separate properties for these? --- Jura 11:37, 5 November 2015 (UTC)

Proposals are at Wikidata:Property_proposal/Term#Roman_names. --- Jura 12:13, 10 November 2015 (UTC)

Family/Family names/disambiguation pages

Hello, bonjour,

I would like to have the opinion of the project on what is the best way to link the following three items covering very related subjects:

Currently Q1499207 (dab) and the empty Q21507177 (family name) are connected with a P:P1889 (different from) while Q21000667 (family) only has a P:P910 link to the category item Q10024382.

I believe that in such a case one option could be to link all three using P:P460, because for the reader these items cover pretty much the same subject, the nuance between the three is only the format in which each Wikipedia project would present the same information. Theotokis is just an example, the question happens whenever we have Wikidata items for a family name, the family with this name, and the corresponding dab page, which is often the most frequent type of page found on Wikipedia for a family name.

What do you think? Place Clichy (talk) 11:17, 19 November 2015 (UTC)

The type of the items is different (P31). Personally, I use P460 for items with the same or a similar type. To link disambiguations that tend to get mixed-up, I use different from (P1889). Not sure what the optimal link between family and family name items would be.
BTW, in the above samples en:Theotokis should probably go on Q21507177. If the name was "Dupont", one could imagine having hundreds of items for Q8436. --- Jura 13:01, 19 November 2015 (UTC)
@Jura1: I certainly would not support linking en:Theotokis to Q21507177, which has no other interwiki link. Remember, the purpose of Wikidata is also (some would say initially) to provide inter-language links between Wikipedia articles, purpose which cannot be served if these articles are linked to standalone Wikidata items, whereas there are other existing Wikipedia articles that could very well be linked. Place Clichy (talk) 10:17, 23 November 2015 (UTC)
We can't just add sitelinks to more or less related items just to generate more interwikis at Wikipedia. If links to related items are considered desirable, these need to be defined locally. Template:Interwikis from P460 (Q21529474) can help. --- Jura 11:34, 23 November 2015 (UTC)

Once more: Russian name

What can the project propose in the following case? We have a person Sergey Markin (Q19910005) with the name "Сергей". When I want to add a name property, I see a bunch of items with this label: Сергей, Сергей, Сергей, Сергей, Сергей, Сергей and also the similar Сергий, Szergej (Q20188306). Almost all items are empty (without sitelinks). Which should I choose? This mess was created by participants of the project and is considered by some of them to be an advantage. In my opinion, it's a disaster. --Infovarius (talk) 12:46, 22 November 2015 (UTC)

I think we need to come to a conclusion with Wikidata_talk:WikiProject_Names#Cyrillic_-_values_for_given_name_.28P735.29. Maybe you could support one or the other solutions or propose another one. --- Jura 12:59, 22 November 2015 (UTC)
Unfortunately, I don't see any of these solutions as ideal, and currently I can't propose a better one. I'll try to start thinking about this by creating a list of requirements for the names model in Wikidata. Please give me some days to do it (in parallel with real life). --Infovarius (talk) 13:06, 25 November 2015 (UTC)

Need help with list

Can someone help me with this list? I need a list of names where the Latvian (lv) label is not equal to the English (en) label. Some time ago I and some others added different labels in Latvian to foreign names (our language rules require this). However, this is not how it should be done here in Wikidata. --Papuass (talk) 11:25, 23 November 2015 (UTC)
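The comparison asked for above can be sketched in Python as follows. This is only a minimal illustration, assuming the labels have already been fetched into a dictionary keyed by item ID; the function name `differing_labels`, the placeholder IDs and the sample labels are made up for the example and are not real Wikidata content.

```python
from typing import Dict, List, Tuple


def differing_labels(
    items: Dict[str, Dict[str, str]],
    lang_a: str = "lv",
    lang_b: str = "en",
) -> List[Tuple[str, str, str]]:
    """Return (item id, label in lang_a, label in lang_b) for every item
    that has both labels and where the two labels are not identical."""
    result = []
    for qid, labels in sorted(items.items()):
        a = labels.get(lang_a)
        b = labels.get(lang_b)
        # Items missing one of the two labels are skipped, not reported.
        if a is not None and b is not None and a != b:
            result.append((qid, a, b))
    return result


# Hypothetical sample data: placeholder IDs, illustrative labels only.
sample = {
    "Q_A": {"en": "John", "lv": "Džons"},
    "Q_B": {"en": "Anna", "lv": "Anna"},
    "Q_C": {"en": "Peter"},
}

for row in differing_labels(sample):
    print(row)
```

Only "Q_A" is reported here, since "Q_B" has identical labels and "Q_C" has no Latvian label at all.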


This template at plwiki generates additional interwikis for names based on items linked in said to be the same as (P460): sample at pl:Paweł.

It only adds one interwiki for each wiki. The template uses the "interwiki" function in the module at Module:Patches (Q21529482) by Paweł Ziemian. --- Jura 12:39, 23 November 2015 (UTC)

I imported part of the module to Q21533309 and it can be tested at Wikidata:Sandbox/3 with values at Q15397819#P460. --- Jura 15:56, 23 November 2015 (UTC)
"One interwiki for each wiki" is still not a good solution. In the sandbox there are awful things like bg:Юрий connected with ru:Джордж rather than ru:Юрий. --Infovarius (talk) 13:58, 24 November 2015 (UTC)
The question is if ru:Джордж is a reasonable match to "Sandbox/3", not if it's the closest match to "bg:Юрий". --- Jura 18:29, 24 November 2015 (UTC)
I added an "old-style" interwiki link to the sandbox. The module looks for them before adding automatic links. Paweł Ziemian (talk) 20:24, 24 November 2015 (UTC)
There is also Module:Interwiki (Q20819069) by Innocent bystander. The difference seems to be that it takes interwikis just from one item listed in P460. --- Jura 10:56, 25 November 2015 (UTC)
It takes interwikis from only one single item, yes. It takes them from the first "best statement", i.e. the first preferred-rank statement if one exists, otherwise the first normal-rank statement, but never any "deprecated" statement. It also allows you to choose the item used yourself via the parameter qid. -- Innocent bystander (talk) 11:57, 25 November 2015 (UTC)
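The selection rule described in the last comment can be sketched roughly as below. This is not the code of Module:Interwiki, just a Python illustration of the rule as stated; the statement layout (dictionaries with "rank" and "value" keys), the function name and the placeholder IDs are assumptions made for the example.

```python
from typing import Dict, List, Optional


def pick_source_item(
    statements: List[Dict[str, str]],
    qid: Optional[str] = None,
) -> Optional[str]:
    """Pick the item whose sitelinks would be reused.

    An explicit qid wins. Otherwise take the first preferred-rank
    statement, else the first normal-rank one. Deprecated statements
    are never used.
    """
    if qid is not None:
        return qid
    for rank in ("preferred", "normal"):
        for statement in statements:
            if statement["rank"] == rank:
                return statement["value"]
    return None


# Hypothetical P460 statements with placeholder item IDs.
p460 = [
    {"rank": "deprecated", "value": "Q_X"},
    {"rank": "normal", "value": "Q_Y"},
    {"rank": "preferred", "value": "Q_Z"},
]

print(pick_source_item(p460))              # preferred statement wins
print(pick_source_item(p460, qid="Q_W"))   # explicit override
```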

Wikimania 2016

Only this week left for comments: Wikidata:Wikimania 2016 (Thank you for translating this message). --Tobias1984 (talk) 11:49, 25 November 2015 (UTC)


Project focus for 2016

It seems that coverage and structure for given names has gotten fairly stable.

What shall we focus on in 2016?

Possibilities could be:

  • develop reference lists
  • expand items with additional statements (which properties?)
  • Roman names
  • family names
  • lists of most frequent first names

--- Jura 11:44, 22 December 2015 (UTC)

I'd like to see lists of most frequent given names in a particular language. Ham II (talk) 19:13, 21 January 2016 (UTC)
That would be a good addition indeed. Currently, we have two lists for the Netherlands.
--- Jura 07:31, 29 January 2016 (UTC)

Names as labels in as many languages as possible

Is there a tool for adding a person's name as a label in every language that uses the Latin alphabet? Ham II (talk) 19:17, 21 January 2016 (UTC)

@Ham II: Yes, NameGuzzler. --Harmonia Amanda (talk) 04:27, 22 January 2016 (UTC)
@Harmonia Amanda: Thank you. Ham II (talk) 16:13, 23 January 2016 (UTC)
You could also use Add Names as labels (Q21640602) to copy all first name labels from "en" to another language.
--- Jura 07:31, 29 January 2016 (UTC)

Default label for given names

As we can now (or soon) use the language code "mul" (for multiple languages), we could use the native label property to store the default label for given names.
--- Jura 07:31, 29 January 2016 (UTC)

Revise "How to clean up a given name item"

As the initial cleanup is done, I think we should avoid converting disambiguations to given name items. I'd remove that step from the "How to".
--- Jura 07:31, 29 January 2016 (UTC)

If you mean "don't mix disambiguation page with page on name or surname" I agree. --ValterVB (talk) 19:45, 31 January 2016 (UTC)
It's about Wikidata:WikiProject_Names#How_to_clean_up_a_given_name_item.
When it was written we had many items mixing first names and disambiguations (see the stats: 7000+, now: only <100).
Given that this cleanup is now mostly done, we should avoid converting existing disambiguations to given name items.
This to provide stable QIDs.
So if one or several links on disambiguation items are about first names, these should be moved to other or new items.
Empty disambiguations shouldn't be redirected to first name items.
--- Jura 10:12, 1 February 2016 (UTC)

Roman names (bis)

Notified participants of WikiProject Names,

Wikidata:WikiProject Names#Roman names is still empty; is there still no stable structure for Roman names? Can someone at least add the properties Roman praenomen (P2358), Roman nomen gentilicium (P2359), Roman cognomen (P2365), Roman agnomen (P2366) (if that's ok) and maybe some examples?

(FYI, I need it for Titus Flavius Postuminus (Q3529655) and Lucius Campanius Priscus (Q22813570)).

Regards, VIGNERON (talk) 10:12, 18 February 2016 (UTC)

Just do it :) -Ash Crow (talk) 11:48, 18 February 2016 (UTC)
Hi VIGNERON,
is [1] what you wanted? Or do you want more... if so, don't hesitate, go ahead!! --Hsarrazin (talk) 15:59, 18 February 2016 (UTC)
Thanks Hsarrazin. It's a good start but I'd like some examples too. I'm just beginning with my two items so I'm unsure how best to do it; for instance, is it mandatory to duplicate the Roman praenomen? Like Lucius (Q6697451) and Lucius (Q12382759) (by the way, there is no relation between these two; shouldn't we add said to be the same as (P460) and/or different from (P1889)?). Regards, VIGNERON (talk) 09:26, 19 February 2016 (UTC)
personally, I would swear they are the same, even if one is roman and the other the form used in northern Europe -- but, seems there are 2 enwiki links
the final structure for first names is still under discussion, because of some of the absurdities caused by transliteration -- at least, a said to be the same as (P460) is the norm. I guess some are still missing… - I added it on both items. - If you discover other cases, don't hesitate to boldly add it ;) --Hsarrazin (talk) 08:31, 20 February 2016 (UTC)
for examples on how to use the properties for Roman name parts, I'll work on it later. Mom's visiting ;) --Hsarrazin (talk) 08:31, 20 February 2016 (UTC)
VIGNERON I have just completed the 2 items you gave as examples with Roman praenomen (P2358), Roman nomen gentilicium (P2359) and Roman cognomen (P2365). When the cognomen does not exist yet, an item has to be created: I created Priscus (Q22917052). On the other hand, Postuminus (Q16036395) already existed with the description "disambiguation page", but its only link (frwiki) did describe it as a cognomen, so I changed the item in order to use it ;)
if you take on the Romans, good luck; the advantage: there are very few praenomina… for the rest, there is work to do, and I think the Bulgarians in particular have created quite a lot of them :) --Hsarrazin (talk) 18:55, 22 February 2016 (UTC)

different from (P1889)

Is "different from (P1889)" used anywhere with regard to names, e.g. on identically named "disambiguation pages" or to distinguish "given name" from "family name", or "female given name" from "male given name"? --Harry Canyon (talk) 20:28, 12 March 2016 (UTC)

If the question is where different from (P1889) is being used: it's being used for disambiguation pages, sample: Q16281827#P1889.
--- Jura 07:50, 13 March 2016 (UTC)

Parenthetical suffix under "Also known as"

The parenthetical suffix "Cato (given name)" is often found in the "Also known as" column. What should be done with it: leave it or remove it? --Harry Canyon (talk) 20:39, 12 March 2016 (UTC)

You shouldn't find them in German, please remove them there. It's only used in English. Search makes it sometimes hard to select a name.
--- Jura 07:50, 13 March 2016 (UTC)
@Jura1: You write that the suffix should remain only in English. But here you reverted my edit when I removed the French suffix. Why should the German suffix not stay as well? --Harry Canyon (talk) 07:35, 15 April 2016 (UTC)
" (given name)" is English, not German or French. So you shouldn't find it as an alias in languages other than English.
--- Jura 13:58, 15 April 2016 (UTC)
@Jura1: In German the alias is "Hans (Vorname)". So the aliases in other languages should not be removed. Here I have added an example in German. --Harry Canyon (talk) 23:02, 15 April 2016 (UTC)
If you think such aliases are useful for German, you might want to add them. Here is a list of items without such an alias in German. I think for English I added them to the 1000 most frequent given names. It's fairly easy to do with QuickStatements as it won't add the same alias twice.
--- Jura 11:50, 16 April 2016 (UTC)
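For anyone wanting to try the QuickStatements route mentioned above, a batch of alias commands could be generated along these lines. This is only a sketch: it assumes the tab-separated version 1 command format, in which an alias is written as `A` followed by the language code, and the helper name `alias_commands` is made up for the example. Hanna (Q15729076) is used purely as an illustration.

```python
from typing import Iterable, List, Tuple


def alias_commands(
    names: Iterable[Tuple[str, str]],
    lang: str,
    suffix: str,
) -> List[str]:
    """Build one tab-separated command per item that adds the alias
    '<label> (<suffix>)' in the given language."""
    lines = []
    for qid, label in names:
        # Columns: item ID, "A" + language code, quoted alias string.
        lines.append(f'{qid}\tA{lang}\t"{label} ({suffix})"')
    return lines


# Illustrative input: item ID and the label to build the alias from.
batch = alias_commands([("Q15729076", "Hanna")], lang="de", suffix="Vorname")
print("\n".join(batch))
```

The resulting lines would be pasted into QuickStatements; the list of items to process would come from a query for items lacking such an alias.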


Barbora (Q13537192)

Is it correct that the two items (Barbora and Borbora) were merged? Regards, Harry Canyon (talk) 16:03, 30 March 2016 (UTC)

The same question for Q6207220 and Q18121213. --Harry Canyon (talk) 16:25, 30 March 2016 (UTC)
Hello User:ValterVB, did you see my question before you made this change? I think User:Katzenfrucht has already made such changes frequently, see for example here. --Harry Canyon (talk) 20:18, 1 April 2016 (UTC)
@Harry Canyon: I don't see your question, where is it? Anyway, the problem is that we can't mix disambiguation pages with surnames, given names or lists of persons. I have split Jochen (Q6207220) because de and ja are disambiguations, whereas en is a page on given names, not a disambiguation. Barbora (Q13537192) isn't correct: only en and ja are disambiguations; the others are pages about the given name. --ValterVB (talk) 20:33, 1 April 2016 (UTC)
@ValterVB: That was the beginning of my question (see the one above it): whether this change by User:Katzenfrucht was right. Best regards, Harry Canyon (talk) 20:43, 1 April 2016 (UTC)
No it's a wrong merge. I will fix it. --ValterVB (talk) 21:01, 1 April 2016 (UTC)
✓ Done Barbora (Q15938720), Barbora (Q13537192) and Borbora (Q893357)--ValterVB (talk) 21:11, 1 April 2016 (UTC)

Hanna

Are these two items different: Hanna (Q15729076) and Hanna (Q19664967)? --Harry Canyon (talk) 11:35, 15 April 2016 (UTC)

@Чаховіч Уладзіслаў: What is the difference? --Harry Canyon (talk) 11:48, 15 April 2016 (UTC)
Q15729076 (Ханна) and Q19664967 (Ганна) --Чаховіч Уладзіслаў (talk) 11:52, 15 April 2016 (UTC)
@Harry Canyon, Чаховіч Уладзіслаў, Jura1, Infovarius:, I cleaned up the descriptions of: Anna (Q22713652) (Анна), Anna (Q666578) (Anna), Hanna (Q15729076) (Hanna), Hanna (Q19664967) ((Ганна)), Hannah (Q1554377) (Hannah), Hanne (Q1575808) (Hanne), Ana (Q482671) (Ana). There are probably given names still missing. --Harmonia Amanda (talk) 09:29, 18 October 2016 (UTC)
So you mean "Ганна" is the Russian label for the Belorussian name. Then it would be wrong as a Russian spelling. So we have to use 2 different values as a name for many persons: one for the Russian spelling, a second for the Belorussian one. I suppose we will end up using different values as a name for every language, which will be a very large mess... --Infovarius (talk) 11:19, 18 October 2016 (UTC)
Infovarius, Belarusian, Russian and Ukrainian personal names (with rare exceptions) translate into one another. Russian Анна is Ганна in Belarusian, and Belarusian (and Ukrainian?) Ганна is Анна in Russian. That is, 2 items will have the same label in the East Slavic languages (and in the case of Николай, Мікалай and Микола, 3), but different labels in other languages; here it is important to write the description properly to avoid confusion. A far bigger problem is names that are not separated by language, which mislead users. --Чаховіч Уладзіслаў (talk) 11:54, 18 October 2016 (UTC)
Yes, they translate. Just like all other names: they translate one way or another. And what follows from that? What Russian name should a Belarusian woman with the Belarusian name Ганна have? (Answer: it varies.) Specifically for these two names, we already have more than one item, and be:Ганна and ru:Анна are not connected. Besides, these names are transliterated into Latin script differently. So a conflict is unavoidable. --Infovarius (talk) 14:44, 19 October 2016 (UTC)
Yes, the languages using Cyrillic will have to separate names not using the same exact string, exactly as we did for Latin languages. And yes, that will mean many people will have several given names, with the qualifier language of work or name (P407) when they had several official languages/given names (I think all Belarusian people had official Russian spelling of their names during USSR?). It's already how we do things for all Latin given names, and I see no reason why it wouldn't work for other writing systems. We already do this for Korean people later becoming American; their Korean name isn't the same item than the transliterated name, both are true, so both are listed (with start time (P580)/end time (P582) when we can). It's a mess when several organisational systems coexist. I'm not overly enthusiastic with the approach "a string = an item" but it's the one which was chosen and the one mostly used now. And no better idea was ever proposed, so we can at least try to implement this one properly. --Harmonia Amanda (talk) 12:51, 18 October 2016 (UTC)
I think "Hanna" as edited by Harmonia looks good. I assume the Belorussian name is transliterated correctly otherwise we would potentially have different strings in languages with Latin script. I'm a bit reluctant to include Russian names though. We keep getting problem reports about them and it seems that the current system works for names in Latin script, Japanese and Korean names, even Belorussian, but maybe we just need to see a clearly outlined solution for Russian ones before. Possibly, it's merely a problem about agreeing how to link ruwiki from such items.
--- Jura 11:45, 19 October 2016 (UTC)
The Belorussian treatment is no better than the Russian one. Now User:Чаховіч Уладзіслаў has a tactic of separating Belorussian names, and this creates problems for Soviet people (and all Belorussians, at most). Hieroglyphic names can have the same problems but I am not deep into them. Can anyone say whether there are situations where 1 Latin name translates into several Japanese/Korean strings? And the opposite? --Infovarius (talk) 14:44, 19 October 2016 (UTC)
@Infovarius: There is no entity that automatically translates names using magic unified rules. Usually a person immigrating into a country may take the name people call them, so I'd say there could be one translation per person, depending on how the person wants to be called, how people call them, which language is used in the country ... There are probably many, many examples. Take for example Beijing (Q956). Traditionally in France the city is called "Pékin", in English "Beijing" ... And this is for a major name. Imagine what it is probably like for the zillions of family names ... author  TomT0m / talk page 14:55, 19 October 2016 (UTC)
@Infovarius:, Li (Q686223), Li (Q17008106), Li (Q15283218), Li (Q11983876), Li (Q2233716), Li (Q3447118), Li (Q10910874), Li (Q13588410) are all transliterated "Li" (Li (Q770891)) is that what you seek. Most Latin names will have several Japanese transliterations depending on the pronunciation, etc. --Harmonia Amanda (talk) 16:17, 19 October 2016 (UTC)
Ok, Latin-speakers, how do you feel about surname property in Yao Lee (Q1189731)? --Infovarius (talk) 06:34, 21 October 2016 (UTC)
Well thank you because the family name wasn't 莉 but 姚, and I wouldn't have corrected it without you linking it. And I don't see a problem (except this error). --Harmonia Amanda (talk) 07:38, 21 October 2016 (UTC)
Still I see that "Yao Lee has surname Li". --Infovarius (talk) 14:31, 24 October 2016 (UTC)
Yes. And? --Harmonia Amanda (talk) 16:24, 24 October 2016 (UTC)

Linking names item to their string

Hi people, for a while now I've been using Name in original language Search to identify for sure how to write the name in a language, in a "monolingual text" datatype. Arguably the labels used in languages might be ambiguous. This edit: https://www.wikidata.org/w/index.php?title=Q22713652&diff=321920566&oldid=315798228 prompts me to start a discussion about that, because I can't explain the rationale of the author of the merge, and the resulting merged item seems like a big ball of mess to me. How do we solve this kind of problem in a clear and unambiguous way? author  TomT0m / talk page 10:42, 16 April 2016 (UTC)

  • It's a problem that users had in other fields as well (even biologists merging similar items in genetics). It can't be completely avoided.
Not too long ago, I added different from (P1889) with a qualifier to cut down on problematic merges (sample: Q666578#P1889). The property said to be the same as (P460) should also limit it. Obviously, users can still delete them.
native label (P1705) seems like a good solution. I think most of the time we should set the language code of monolingual string to the recently created "mul" (="Multiple languages").
--- Jura 11:50, 16 April 2016 (UTC)

Bernado

In this case Bernado is not a Russian/Belarusian name: Bernardo (Q15731576) and Bernardo (Q20823800). According to enWP and deWP the name is of Portuguese/Spanish/Italian origin, yet my merge was reverted. Can someone explain to me why the two names are different and should be treated separately? --Harry Canyon (talk) 16:46, 18 April 2016 (UTC)

(Sorry for English) @Harry Canyon: Because they are spelled differently in Cyrillic languages, and keeping them in one item makes a mess of the naming of specific persons. Please feel uncomfortable here, just as all users of Cyrillic-script languages feel all the time with Jura's reforms. --Infovarius (talk) 10:32, 7 October 2016 (UTC)

Add labels from names item

Hi, this request needs input and is related to names : Wikidata:Bot_requests#labels_from_name_properties. author  TomT0m / talk page 16:54, 12 July 2016 (UTC)

Use of key event

Hi @Jura1:, I don't understand why you use significant event (P793) for the last two examples of the table, like "most frequent name in" and "authorizisation". My opinion is that this is better modelled as instance of (P31) statements, especially the first one, with a "begin date" qualifier. The rationale is that this becomes a special kind of name, one that comes to belong to the class of frequent names. But this status can come to an end. It will be easier to query the names that are frequent at the time of the query if the current status has a "preferred" rank than to query the class of all names that have a key event of "entering the most frequent names" with no "no longer a very frequent name". The rank can be changed to "normal" when the name is no longer a member of the class. Did I miss something? author  TomT0m / talk page 10:12, 21 September 2016 (UTC)

I don't think P31 is suitable for statistics nor is the use preferred rank for selection among dozens of statements in P31. I don't think we actually use it that way anywhere. Anything needing a reference is probably not of much use in there either.
--- Jura 17:56, 24 September 2016 (UTC)
Using ranks is no less relevant than in any other case, as it's one of their main uses: it's actually one of the purposes of ranks to highlight current data as preferred and keep outdated data as normal. Statistics very often use or define classes, so I think instance of (P31) could be especially useful in that area. "Anything needing a reference is probably not of much use in there either." ?? author  TomT0m / talk page 18:12, 24 September 2016 (UTC)

Aganippides (Q21548091)

Aganippides (Q21548091) is another name for Muse (Q66016). Which instance of (P31) should it get and how can it be connected to Muse (Q66016)? Thanks, --Marsupium (talk) 11:34, 29 September 2016 (UTC)

Name-gender: can we reach a consensus?

Sorry if (as an outsider) I bring up a topic that has already been resolved or is a non-issue, but while sorting out name objects imported from :slwiki, I inadvertently created some confusion yesterday, and I'd like to have a clear guideline for the future. The issue is labeling names as either male given name (Q12308941) or female given name (Q11879590) where the use depends on the culture. For example, Andrea (Q493293) is a male given name in Italy, but predominantly female in English-speaking countries, so the current solution - labelling it unisex given name (Q3409032) - may not be accurate, because it isn't really gender-neutral, at least not everywhere. One solution could be to have two instance of (P31) statements (male and female) if the goal is to have a single object for identically spelled names. Then, using perhaps language of work or name (P407), we could specify which case applies in which culture.

The last point brings me to the second problem. Judging by the discussion two years ago, people seem to prefer having as few objects as possible. But if we look at the name "Andrea", there's as many as three objects - Andrea (Q493293) as the base, which has part(s) (P527) (1) Andrea (Q18177306) (male) and (2) Andrea (Q18177321) (female). I find this confusing, so I'd like to contribute to a better solution. Thoughts? — Yerpo Eh? 07:45, 18 October 2016 (UTC)
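To make the first proposal above more concrete (a single item with two instance of (P31) values, each qualified by language), here is a rough Python sketch. It is only an illustration of the idea, not an agreed model: the dictionary layout, the function name and the use of plain language codes as stand-ins for language of work or name (P407) values are assumptions made for the example.

```python
from typing import Dict, List

MALE_GIVEN_NAME = "Q12308941"
FEMALE_GIVEN_NAME = "Q11879590"

# Hypothetical P31 statements for one name item, following the
# "Andrea" example: male in Italian, female in English usage.
andrea_p31: List[Dict[str, object]] = [
    {"value": MALE_GIVEN_NAME, "P407": ["it"]},
    {"value": FEMALE_GIVEN_NAME, "P407": ["en"]},
]


def classes_for_language(p31_statements, lang: str) -> List[str]:
    """Return the P31 values whose language qualifier includes lang."""
    return [
        statement["value"]
        for statement in p31_statements
        if lang in statement.get("P407", [])
    ]


print(classes_for_language(andrea_p31, "it"))
print(classes_for_language(andrea_p31, "en"))
```

A language with no qualified statement simply yields an empty list, which a consumer would have to treat as "unknown" rather than as unisex.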

This needs classes of names specialized by language, like "French (mainly) male name". This would occur naturally, however, if we consider that linguistic variants of a name are different (related) names, for example if they are spelled or pronounced differently and are derived from each other or from a common linguistic root. Then each of the name items could have its own property (@Infovarius: because of the discussion above related to Cyrillic names/spelling, this is one argument for having different items).
Naturally this solution requires checking whether we have sources for the derivation of names over time ... author  TomT0m / talk page 11:38, 19 October 2016 (UTC)
  • I don't see why we couldn't have several items for different names that are spelled the same way in Latin script, if there is a way to differentiate them (compare Santos (pt) and Santos (es) and Santos (mul); or Special:Search/Yuriko given name). We would have specialised properties for languages and native label, so we should make use of that.
    --- Jura 11:45, 19 October 2016 (UTC)
    • There's no need to have specialized properties, except maybe in a very few cases. We can use the same properties, as we would already have specialized classes; we can also definitely use generic properties with specialized values. author  TomT0m / talk page 12:24, 19 October 2016 (UTC)
So, if I understand correctly, we should make three objects (generic, male, female) for any name that is used for both genders and put interwikis under the generic one (analogous to the "Andrea" example)? — Yerpo Eh? 09:39, 20 October 2016 (UTC)
Not in general, it depends on the name. Santos has several as they have different origins and pronunciations.
--- Jura 09:57, 20 October 2016 (UTC)
Ok, but three for the same origin and pronunciation (if they are used for both sexes)? — Yerpo Eh? 10:59, 20 October 2016 (UTC)
No, just one. Andrea isn't used the same way in, e.g., Italian as in other languages.
--- Jura 11:03, 20 October 2016 (UTC)
  • Hi, I'm super late to this party, but am working on a project where we are creating items for people, and the question of names and gender has come up and we haven't been able to identify a best practice. Thought I might ask here. We have a female person named Sunny, but the given name Sunny only had an object for a male given name Sunny (Q19968383). The Wikipedia article for the name lists both male and female identifying persons. I consider this name unisex. But the person who set up this name item meant to create a male given name. So, is it best to create a new item for a female given name, create a new item for a unisex given name, create a new item for a plain given name or change the existing male given name to a unisex given name? Or should we be doing something else entirely? As far as I can tell, Sunny is used as a unisex name across languages and cultures. Thoughts or recommendations? Thanks, --Crystal Clements, University of Washington Libraries (talk) 16:04, 8 January 2021 (UTC)
  • Personally, I'd just change it to gender-neutral, or just given name (Q202444) would be sufficient. However, I disagree with the discussion above: in my opinion, having multiple Wikidata items for the same name, just because e.g., it's assigned to males in one place and females in another, isn't very desirable. Ghouston (talk) 23:55, 8 January 2021 (UTC)