Wikidata:Project chat/Archive/2018/02

This page is an archive. Please do not modify it. Use the current page, even to continue an old discussion.

Has quality

Property has characteristic (P1552) specifically states that it refers to "non-material" qualities and requires values that are instance or subclass of "quality (Q1207505): distinguishing feature". However, this property is widely used with values that are instance or subclass of "property (Q937228)" - see water (Q283) for an example. Do we need a new property "has material property" for these items, or should we open up has characteristic (P1552) to material properties? - PKM (talk) 19:49, 23 January 2018 (UTC)

@Micru, Emw, Eurodyn: there is not much to go on regarding the inspiration for this property, so I am pinging the proposer and the voters in case they are still around.

Also pinging the relevant project (Template:Ping Project), as it’s a relevant question. This property seems to be defined for scientific ontologies like EFO (https://www.ebi.ac.uk/efo/faq.html) and Basic Formal Ontology. I’ll try to dig a little. One argument against expanding the domain is « it’s an externally defined property, we should stick with the definition ». One argument in favor would be « we know the class tree to which the object belongs, so we can easily filter out wrong values ». One question is: are those values used, and by whom? Who set them, and why?

Another thing: I use this property in a project with a special dummy value as the object and with the meaning carried by the qualifiers. This might be questioned here. author TomT0m / talk page 20:49, 25 January 2018 (UTC)

@TomT0m: thanks. My own preference at the moment is to add a new property "has material property (has feature)” and then ask for a bot to change has characteristic (P1552) to the new property where the object is subclass of property (Q937228). However, there may be subtle nuances I am not aware of. - PKM (talk) 19:23, 26 January 2018 (UTC)
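The bot step proposed above could be sketched roughly as follows. This is only an illustration in Python over toy in-memory data, not a real bot: the subclass map, the statement list, the placeholder IDs `Q_DENSITY` and `Q_KINDNESS`, and the name `P_NEW` for the not-yet-existing property are all assumptions made for the example.

```python
# Sketch: pick "has characteristic" (P1552) statements whose value is a
# (transitive) subclass of property (Q937228), so that a bot could move them
# to a new property. All data below is hypothetical toy data.

SUBCLASS_OF = {                     # child -> set of direct parents (P279)
    "Q_DENSITY": {"Q937228"},       # placeholder ID, not a real QID
    "Q_KINDNESS": {"Q1207505"},     # placeholder ID, not a real QID
}

def is_subclass_of(item, target, parents=SUBCLASS_OF):
    """Return True if `item` equals `target` or reaches it via subclass links."""
    seen, stack = set(), [item]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False

def statements_to_migrate(statements, target="Q937228"):
    """Yield (subject, 'P_NEW', value) for each P1552 statement to move."""
    for subject, prop, value in statements:
        if prop == "P1552" and is_subclass_of(value, target):
            yield (subject, "P_NEW", value)

statements = [
    ("Q283", "P1552", "Q_DENSITY"),   # water, value under Q937228: move
    ("Q283", "P1552", "Q_KINDNESS"),  # value not under Q937228: leave alone
]
migrated = list(statements_to_migrate(statements))
```

A real run would need the live class tree rather than a hand-written map, and the outcome depends on where property (Q937228) ends up in that tree.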

Different item for physical properties?

Looking further into this, I see that property (Q937228): predominant feature that characterizes a being, a thing, a phenomenon, etc. and which differentiates one being from another, one thing from another is <subclass of> quality (Q1207505): distinguishing feature - we're still in the realm of philosophy and non-material characteristics. It appears that we need a different item for "property AKA attribute, characteristic, feature" = AAT's "attributes and properties" 'inherent characteristics, especially physical characteristics of materials and objects'. Perhaps physical property (Q4373292): attribute of a physical system or body; OR non-chemical property of a material should have a different parent and most of the current subclasses of property (Q937228) should be moved to "physical property" (see list)? - PKM (talk) 01:24, 1 February 2018 (UTC)

I don’t know. I have been wondering about this concept of « property » for some time, and it seems to me that it can be handled using classification. For example, « red object » can be a class defined as « object that, when lit by white light, reflects a red light spectrum ». « Red object » is then a « universal ». My question is: does Wikidata need a concept of « redness » when we can just put the statements that define « redness » in the « red object » item? I have been working for a while on a template, {{Implied instances}}, that (ab?)uses « has quality » for this purpose (the documentation explains it a little). Essentially, I think it’s enough if we’re able to define the « redness » concept in the class of all red things. But this is Wikidata, and there are items about things that do not fit this scheme; we may, for example, have articles about « redness », and this is where « has quality » is useful.

This impression actually seems totally false in the realm of physical objects: take mass (Q11423), for example, which has its own Wikidata property and its own Wikipedia articles.

On the human level, we can explore things like kindness (Q488085), but it seems these can be handled not by a property but by defining them as a subclass of behavior. Maybe a quality of kind people is that they often behave kindly, however?

In a more abstract realm, let’s take maths. Take a property like « transitivity ». In the transitive relation (Q64861) case, it seems clear that it is a subclass of « relation » with an additional logical formula that its instances fulfil. I am not sure we need an additional concept of « transitiveness » in that case, which would be a carrier for a formula like

\forall a,b,c\in X:(aRb\wedge bRc)\Rightarrow aRc

(taken from the enwiki article). Or maybe we would need it to carry the formula in different mathematical theories?

In the end, I think the items for philosophical articles about the « property » concept should be kept out of the tree. I think they live at a different level of abstraction and are more at the « metaclass » level, exactly as a philosophical « universal » is. That is, to take a physical property like « mass », it should be an instance of « physical property ». I see in https://tools.wmflabs.org/reasonator/test/?q=Q4373292&lang=fr that some properties like « mass » are subclasses of « physical property », while some are instances. Some are clearly classes of properties and not properties themselves.

One option for the philosophical notion of « property » is to treat it as a discipline that studies and theorizes the notion, and so to link the philosophical « property » item to the physical one with a statement like

⟨ property (Q937228) ⟩ is the study of (P2578) ⟨ physical property (Q4373292) ⟩

If physics studies the physical objects themselves, philosophy is more interested in the notions scientists use to study them and in the definitions they use. author TomT0m / talk page 12:34, 1 February 2018 (UTC)
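As a small aside on the transitivity formula quoted in the comment above, the condition can be checked mechanically for a finite relation. The following is a minimal Python sketch; the function name `is_transitive` and the two sample relations are made up for illustration.

```python
def is_transitive(relation):
    """Check that (a, b) and (b, c) in the relation always imply (a, c)."""
    pairs = set(relation)
    return all(
        (a, d) in pairs
        for (a, b) in pairs
        for (c, d) in pairs
        if b == c
    )

less_than = {(1, 2), (2, 3), (1, 3)}   # "<" restricted to {1, 2, 3}
successor = {(1, 2), (2, 3)}           # lacks (1, 3), so not transitive
```

This only illustrates that « transitive relation » can be defined by a formula its instances fulfil; it says nothing about how the formula should be stored on Wikidata.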

Krzysztof Machocki, RIP

I must sadly relay the news that my friend, our fellow Wikimedian Krzysztof Machocki, User:Halibutt, died on 31 January, 2018, aged 36, after a couple of weeks of illness. Messages of condolence may be left at pl:Wikipedysta:Halibutt/Księga. Our projects will be poorer for his loss. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:51, 1 February 2018 (UTC)

Mergeproblems

Can someone merge en:Sierra Leone parliamentary election, 2018 (Q26709117) with German de:Parlamentswahl in Sierra Leone 2018? --92.76.111.120 11:02, 1 February 2018 (UTC)

And can someone merge en:Democratic Republic of the Congo general election, 2017 with German de:Präsidentschaftswahl in der Demokratischen Republik Kongo 2018? --92.76.111.120 11:06, 1 February 2018 (UTC)

First one: Sierra Leone parliamentary election, 2018 (Q26709117) is part of (P361) 2018 Sierra Leonean general election (Q30274539), which is why the two shouldn't be merged: Q26709117 is only a part of Q30274539, not the same thing. The second one looks like a plausible merge, but why is there confusion between the 2017 and 2018 elections? --Liuxinyu970226 (talk) 14:36, 1 February 2018 (UTC)

Never mind, that election was originally scheduled for 2016 but has been postponed twice... --Liuxinyu970226 (talk) 14:38, 1 February 2018 (UTC)

Generating citations on Wikidata generated Maps and Graphs on other Wikimedia projects

Hi

I'm looking at creating some maps and graphs using data we are importing into Wikidata from UNESCO. To make them more acceptable to English Wikipedia (and other projects), I want to include citations for the sources of the data. Is this possible, either manually or automatically?

Thanks

--John Cummings (talk) 14:33, 3 January 2018 (UTC)

@Lydia Pintscher (WMDE):, @DTankersley (WMF):, @Yurik:, @Pigsonthewing:, @MartinPoulter:, @Mike Peel: any ideas? Thanks, --John Cummings (talk) 10:48, 8 January 2018 (UTC)

You can definitely make a Lua call and get the reference independent of the graph. I am not sure if there is anything in the graph code itself to do this. Yurik would need to say. --Lydia Pintscher (WMDE) (talk) 15:09, 8 January 2018 (UTC)

Thanks Lydia Pintscher (WMDE), do you know of any examples for this? I'm imagining something like a button underneath the graph that would say 'sources', and when you click on it, it gives you a list of sources. A Lua call is a high technical barrier to being able to make graphs with references.... Hopefully @Yurik: will have some good news :) --John Cummings (talk) 15:26, 8 January 2018 (UTC)

If the data is stored in a .map or .tab page on Commons, you can use a trick I used in the lines graph (and some others): {{#invoke:TNT|msg|I18n/Template:Graphs.tab|source_table|{{#invoke:TNT|link|{{{table}}}}}}} -- this magical bit of code (on all wikis that have the TNT Lua module) shows a localized link to the Commons data. The {{{table}}} param is the name of the data page on Commons. If you are using the graph extension to dynamically query Wikidata, you can use a slightly different one: {{#invoke:TNT|msg|I18n/Template:Graphs.tab|source_wdqs|https://query.wikidata.org/#{{urlencode:{{{query}}}|PATH}}}}. Here, {{{query}}} is a SPARQL query that becomes a link to WDQS. Both of these options are used in the mw graph collection. Note that none of this uses the graph extension itself, just some simple Lua scripting. Hope this helps. --Yurik (talk) 19:04, 8 January 2018 (UTC)
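For readers outside the wiki, the second trick boils down to percent-encoding the SPARQL text and appending it after the `#` of the query service URL. Here is a minimal Python sketch of that idea; the function name `wdqs_link` is made up, and using `quote(..., safe="")` as a stand-in for `{{urlencode:…|PATH}}` is an assumption, not a statement about how MediaWiki implements it.

```python
from urllib.parse import quote

WDQS_BASE = "https://query.wikidata.org/#"

def wdqs_link(sparql: str) -> str:
    """Build a link that opens `sparql` in the Wikidata Query Service UI."""
    # safe="" percent-encodes every reserved character, including "/" and "?"
    return WDQS_BASE + quote(sparql, safe="")

link = wdqs_link("SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 1")
```

Such a link can then be placed as free-form text under a graph when the TNT module is not available.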

Thanks @Yurik:, I don't think it's practical to add the data to Commons every time, but I think we may have found a short-term workaround using the same setup as the first graph on this page. We are also writing a little outline of what a possible inbuilt reference could look like. --John Cummings (talk) 10:41, 10 January 2018 (UTC)

@John Cummings:, you don't need to add any data to Commons to have a citation - I meant you can use a Lua invocation to link to your WDQS query for the reference purposes, and that link will have a standard, autotranslatable text. But sure, a free-form text at the bottom of the box would work too. --Yurik (talk) 16:49, 10 January 2018 (UTC)

@Yurik:, ah thanks, I think I understand now :) --John Cummings (talk) 17:33, 10 January 2018 (UTC)

@John Cummings, Yurik: sorry for the late reply. I can offer a minor hack: the line chart view of the Query Service UI will include a fourth result variable in a “label” area above the chart. I’m not sure what it’s actually intended for, but you can use it to add a list of all the sources to the chart:

# population of New York City, including sources
#defaultView:LineChart
SELECT ?time ?population (" " AS ?dummy) ?sources WITH {
  SELECT ?time ?population ?source WHERE {
    wd:Q60 p:P1082 [
      ps:P1082 ?population;
      pq:P585 ?time;
      prov:wasDerivedFrom ?reference
    ].
    OPTIONAL { ?reference pr:P248 ?statedIn. }
    OPTIONAL { ?reference pr:P854 ?referenceUrl. }
    BIND(COALESCE(?statedIn, ?referenceUrl) AS ?source)
    FILTER(BOUND(?source)) # we don’t want references which have neither “stated in” nor “reference URL”
  }
} AS %results WITH {
  SELECT (GROUP_CONCAT(DISTINCT ?sourceLabel; separator = "; ") AS ?sources_) WHERE {
    INCLUDE %results.
    SERVICE wikibase:label {
      bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
      ?source rdfs:label ?sourceLabel.
    }
  }
} AS %sources WHERE {
  INCLUDE %results.
  INCLUDE %sources.
  # change “source; source; source” to “source; source; and source”, and “source; source” to “source and source”
  BIND(
    IF(REGEX(?sources_, "; .*;"),
       REPLACE(?sources_, "(.*); (.*)", "$1; and $2"), # the first .* is greedy, so this replaces the last ;
       REPLACE(?sources_, "(.*); (.*)", "$1 and $2"))
    AS ?sources)
}


It’s not perfect – just throwing it in the ring as another possibility. (Perhaps we should add dedicated support for something like this to the query service UI?) --Lucas Werkmeister (WMDE) (talk) 18:02, 17 January 2018 (UTC)
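The final BIND in the query relies on the first `.*` being greedy, so that only the last "; " is rewritten. The same trick can be tried outside SPARQL; the following Python sketch mirrors it with the `re` module. The function name `join_sources` is made up, and Python's regex engine stands in for the query service's, which is an assumption for illustration.

```python
import re

def join_sources(sources: str) -> str:
    """Turn "a; b; c" into "a; b; and c", and "a; b" into "a and b"."""
    if re.search(r"; .*;", sources):
        # three or more sources: keep the semicolons, add "and" before the last
        return re.sub(r"(.*); (.*)", r"\1; and \2", sources, count=1)
    # one or two sources: replace the only "; " (if any) with " and "
    return re.sub(r"(.*); (.*)", r"\1 and \2", sources, count=1)
```

A single source passes through unchanged, because the pattern needs at least one "; " to match.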

@Lucas Werkmeister (WMDE):, this is amazing. How would we request that sourcing became part of the query service? My only suggestions for changes would be that the sources were shown at the bottom and that they linked to query results so that people can see the reference URLs and also it could have a maximum number of sources shown if there is a very long list of sources and people click the link to see more. --John Cummings (talk) 10:12, 18 January 2018 (UTC)

@John Cummings: I’ve created phabricator:T185308 to investigate this further. I’m not sure what you mean by making the sources link to the query results – aren’t people already seeing the query results when they see the sources embedded in the results like this? --Lucas Werkmeister (WMDE) (talk) 11:40, 19 January 2018 (UTC)

@Lucas Werkmeister (WMDE):, amazing, thanks. My suggestion to link to the query results are for a couple of reasons:

Where there are more than maybe 5 sources, it is probably not practical to list them all under the graph.
It allows people to see which data comes from which source. This would also be helpful in spotting wrong or partial data, e.g. if there was a graph for climate modelling and the graph had one very unusual point, this could be looked at to make sure the data wasn't coming from climate-change-is-a-global-conspiracy-run-by-the-illuminati.org.uk
It adds another layer of transparency for people who are suspicious of Wikidata (I'm thinking of English Wikipedia especially where there is some push back against Wikidata integration).

I would be very happy to work with you on developing this further, although please be aware I am a level 1 muggle with no coding ability.

Thanks again

--John Cummings (talk) 14:40, 19 January 2018 (UTC)

@John Cummings: ah, I think I forgot that “link” can mean more than just “hyperlink” in English :) you want to have the sources on each individual result, not one overall list of sources, is that correct? In that case the query should actually be easier (example), but unfortunately right now the Chart result views don’t support this very well. --Lucas Werkmeister (WMDE) (talk) 15:47, 19 January 2018 (UTC)

@Lucas Werkmeister (WMDE):, I don't know what the right answer is for how it should display references. I really like the way you had them display in plain English (is this manually added or automatic?), but I also think it needs to link through to the query. Is that possible? One additional thought: is it possible to have a limit on the number of sources shown? If there are tens of different sources, then it is not practical to display them all. Maybe you could say what is possible and then we can consider the different options?

Thanks

--John Cummings (talk) 17:00, 20 January 2018 (UTC)

@John Cummings: displaying the references in English was done “manually”, by using the label service and taking ?sourceLabel instead of ?source. Limiting the number of sources displayed shouldn’t be too hard if we make this a dedicated feature of the query service, though I’d like to have some sort of “…” button that you could click to see the full list. (Without this, we can also limit the number of sources in the query directly, but then it’s not clear which sources to show and which to throw away.) But I still don’t understand what you mean by linking the references to the query, sorry… --Lucas Werkmeister (WMDE) (talk) 15:17, 25 January 2018 (UTC)

Lucas Werkmeister (WMDE), ah, it's done manually? Would it be possible to do this automatically, so the labels on the graph are in plain English? Yes, I agree, having a '...' kind of thing at the end to show there are more sources would be great. I guess the most sensible way to prioritise the sources is by the number of times the same source is used (is there a way to do this for several sources from the same organisation but with slightly different URLs?). I'm sorry I'm not being clear; what I mean is that under the graph you would have a plain English list of sources, e.g. United Nations Population Fund, UNESCO, CIA Factbook, ..., and that it is a blue link that links through to a query result showing all the sources used. I'm still not sure what the best option is, but I hope that makes sense now. Will you be at Wikimedia Conference? Perhaps we can work something out there in person? --John Cummings (talk) 10:39, 26 January 2018 (UTC)

@John Cummings: With what I suggested in phabricator:T185308, it would work pretty much automatically, yes. Sorting references by number of uses should be fairly easy; grouping similar references would be a bit trickier, but perhaps still possible, based on the domain name.

I’m not sure if it’s possible to provide a link to a query for all the sources used… unless it’s really a link to the same query, but with a special instruction to the UI to only show the sources this time. But I’m still not sure this is necessary when we already have all the sources behind the “…” button or similar.

I will not be attending the Wikimedia Conference. I’ll probably be in Berlin while it takes place, but I’m not sure if there’ll be opportunity for some meeting or if you’ll be too busy with the conference :) --Lucas Werkmeister (WMDE) (talk) 12:35, 26 January 2018 (UTC)
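To make the ideas above concrete (sorting references by number of uses, grouping by domain name, and cutting the list off with a "…"), here is a rough Python sketch. It is not what phabricator:T185308 proposes or implements; the function name `summarize_sources`, the default limit of 5 and the treatment of a leading "www." are assumptions for illustration.

```python
from collections import Counter
from urllib.parse import urlparse

def summarize_sources(reference_urls, limit=5):
    """Group reference URLs by domain, most used first, truncated to `limit`."""
    domains = Counter()
    for url in reference_urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]          # treat www.example.org and example.org alike
        if host:
            domains[host] += 1
    ranked = [domain for domain, _ in domains.most_common()]
    shown = ranked[:limit]
    if len(ranked) > limit:
        shown.append("…")            # stand-in for a "show all sources" button
    return "; ".join(shown)
```

Grouping by host name alone would not merge different subdomains of the same organisation, and it yields domains rather than organisation names, so it only covers part of what is being asked for here.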

@Lucas Werkmeister (WMDE):, this all sounds very sensible. Just linking the query results will be fine as long as it includes the reference URLs, and grouping references by domain name sounds like the way to go. How would you get the name of the organisation from the URL? Would it be possible to use official website (P856) to find the name of the organisation? This would also encourage people to make Wikidata better as they go :) Sorry you won't be at Wikimedia Conference; let me know what I can do to help with this. I can't do the technical stuff but am very happy to work on testing, instructions etc. Thanks, --John Cummings (talk) 11:06, 2 February 2018 (UTC)

Day care/ child care/ creches/ etc

Our item day care (Q364005) is vague. I have just changed the en description from "establishment offering care of during the day of children age 1-3" to "establishment offering care of during the day of children of pre-school age" (who says 4-year-olds cannot be there?), but even then it is listed as an instance of school (Q3914), which many day care facilities are not. The relationship (or lack of it) to child care (Q1455871), kindergarten (Q126807) and crèches (for which we do not seem to have an item) is not clear.

Does anyone have a view as to how these should be modelled? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:20, 1 February 2018 (UTC)

As with most educational institutions, the terminology varies from country to country, state to state, and maybe even locally. I think we should indeed have one item for "establishment offering care during the day of children of pre-school age" and then have to define subclasses for each of those different models and what they are called in all those different cultures/jurisdictions (like "Kindergarten" in Germany is usually for age 3-6, "Kinderkrippe" is usually for 0-3). --Anvilaquarius (talk) 10:19, 2 February 2018 (UTC)

There is a Kinderkrippe day care (Q20174673). Danish "Vuggestue" (day care (Q12341317)) would in the Danish Wikipedia be described as for children aged 0–2 years, while "børnehave" (kindergarten (Q126807)) is from "2 years 9 months or 3 years". It seems that "Kinderkrippe" and "Vuggestue" may be mergeable? In Danish, "daginstitution" day care centre (Q12307014) is an institution for children, e.g., vuggestue, børnehave or "fritidshjem", the latter for school children. — Finn Årup Nielsen (fnielsen) (talk) 15:18, 2 February 2018 (UTC)

How would we indicate an external identifier has "issues"

How would we indicate that an external identifier has "issues", e.g., that it does not represent the item correctly? This could, e.g., be an author identifier that in the external database lists the publications of the author but also mixes in publications by another author with the same name. I also have experience with ImageNet (Q24901201) using the WordNet (Q533822) synset (Q1673963) identifier incorrectly. For instance, "Cup", described as "A punch served in a pitcher instead of a punch bowl", is associated in ImageNet (Q24901201) with images that show coffee cups; see http://image-net.org/explore.php?wnid=n07930864. We do not yet have an ImageNet (Q24901201) WordNet (Q533822) synset (Q1673963) identifier, but if we had, how could we best describe the discrepancy? One way would be to mark it as "deprecated". Another way would be to use sourcing circumstances (P1480) as a qualifier with misassociation (Q21097088) as the value. I see that misassociation (Q21097088) has never been used [1]. — Finn Årup Nielsen (fnielsen) (talk) 15:52, 2 February 2018 (UTC)

Problem cases could also be documented on the Talk page of the identifier, and forwarded to a responsible party for correction. I think something like this has been done in some other cases (VIAF?) ArthurPSmith (talk) 16:27, 2 February 2018 (UTC)

You may be thinking of en:Wikipedia:VIAF/errors, though the team at VIAF are also aware of Wikidata's constraint reports. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:30, 2 February 2018 (UTC)

There was a similar discussion last May at Wikidata:Project chat/Archive/2017/06#Reporting errors in external databases, which led to the creation of Template:External reference error reports. By now, I think it would be better to put that in the main database whenever it can be structured, as for duplicates and merged clusters. I'd appreciate any efforts in this domain; there are still things to sort out there. --Marsupium (talk) 16:44, 2 February 2018 (UTC)

Advertising on a user page

User:Oopsy Daisy Art, perhaps thinking this is Wikipedia, has written an article about the company on their user page. Is there a process for nominating it for deletion? StarryGrandma (talk) 02:24, 8 February 2018 (UTC)

Deleted speedily (already reported at WD:AN). Matěj Suchánek (talk) 09:19, 8 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:32, 8 February 2018 (UTC)

Interwiki link bug?

Hi all

I found a weird thing: if you look at Help:Adding open license text to Wikipedia (Q48067413) and click on the es.wiki page, it shows a link to the en.wiki page, but when you go to the en.wiki page, it does not show a link to the es.wiki page. Using the sidebar to try to make the link gives an error, because the page is already linked but not showing.

Thanks

--John Cummings (talk) 09:14, 8 February 2018 (UTC)

There was just a delay in refreshing the page, now it is displayed correctly. Matěj Suchánek (talk) 09:20, 8 February 2018 (UTC)

@Matěj Suchánek:, thanks very much, yes it now displays. --John Cummings (talk) 12:58, 8 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:32, 8 February 2018 (UTC)

Bog bodies

We need to decide something for bog bodies. Most items now contain both human (Q5) and bog body (Q199414) while they have properties covering geographical and personal aspects. This is creating constraint violations, what would be helpful? Sjoerd de Bruin (talk) 09:01, 31 January 2018 (UTC)

Should these be considered human (Q5), since we have absolutely no biographical data for them? Also, bog body (Q199414) is presently a subclass of human (Q5), which means that human (Q5) could/should be removed from those items... :/ --Hsarrazin (talk) 09:17, 31 January 2018 (UTC)

There are several bog bodies for which the biography is known, e.g. Jan Spieker (Q1682240). For others, we may not know who they were, but from the body itself many aspects of their lives can be reconstructed (e.g. sex, age, occupation, health, wealth, ...), so I don't see a reason not to see them as humans. --YMS (talk) 09:39, 31 January 2018 (UTC)

Fix the constraints; if necessary, by adding exceptions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:29, 31 January 2018 (UTC)

Exceptions aren't the solution for everything, Andy. It's also quite maintenance-intensive that way. Sjoerd de Bruin (talk) 12:55, 31 January 2018 (UTC)

That would be why I didn't suggest that "exceptions are the solution for everything", then. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:18, 31 January 2018 (UTC)

That problem also extends to other human remains, such as mummies or skeletons that are of archaeological/paleontological interest, like Ötzi (Q171291) or maybe Lucy (Q245388). I remember discussions on Wikipedia regarding the use of Persondata templates and person categories in such articles. The question is whether you want to treat persons and their remains as two separate entities. Human remains, especially when it comes to archaeological findings, are often viewed as things that are found, owned by someone, put on display, stored, sold, etc. And something like ownership on the item of a person who wasn't a slave would be kinda weird. But if you want to go the route of separate items, properties like place of burial also wouldn't actually belong on the person's item. Personally, I wouldn't separate remains and person. Constraints can be fixed, and statements could be clarified with qualifiers (you'd have to hope that whoever uses the data knows to factor in such cases, but that's the case with many things on Wikidata). --Kam Solusar (talk) 12:39, 31 January 2018 (UTC)

Splitting them up into two items for the person and the body seems natural for me. The person has biographical data (or, if not known, floruit (P1317) estimated from the body), the body has the rest (coordinate location, museum where it now rests, etc.). I’m just not sure what property should be used to link the two items. @Kam Solusar: why do you think that properties like place of burial also wouldn't actually belong on the person's item? —Galaktos (talk) 23:30, 31 January 2018 (UTC)

@Galaktos: Well, I think my train of thought was that it's hard to differentiate what information pertains to the person and what pertains to the remains. Place and time of burial is information that you could argue belongs just as much on the remains' item as e.g. date/location of discovery, ownership, etc. But the longer I think about the whole problem, the less sure I am about the way to handle it. --Kam Solusar (talk) 12:16, 2 February 2018 (UTC)

@Kam Solusar: Well, for other people we have the place and time of burial on the person as well. To me it feels like two fairly separate things: one item is the person during their lifetime, who lived, breathed, talked to other people, might have married and had kids, etc.; the other item is an archaeological find, discovered hundreds or thousands of years after the person’s death, and mostly of interest to researchers and museums.

However, I’m no longer sure if place of burial (P119) is always appropriate to use. If a body was found in a bog, then in general the bog is the person’s place of death (P20), but the person wasn’t necessarily buried, right? (However, some were apparently in fact buried properly, e. g. Jan Spieker (Q1682240) according to the dewiki article.) --Galaktos (talk) 12:55, 3 February 2018 (UTC)

Duplicated geodata - Mountains/Rivers/Lakes - QA list from an external system.

Hi! I am matching Wikidata geodata with the Natural Earth geo-database (github:natural-earth-vector/issues/224), and I would like to extend the Natural Earth data with Wikidata IDs. As a side effect, I have found a lot of data problems, for example multiple duplicated mountains, rivers, etc. with the same or a similar name in the same region (within a short distance).

Example basic list of possibly duplicated mountains: https://www.wikidata.org/wiki/User:ImreSamu/problematic_mountains_201801. I can also create other QA lists: possibly duplicated lakes, rivers, etc. I am using PostGIS with some text similarity checking functions, so I can only create CSV/text/Markdown outputs (sorry, no SPARQL).

My question: Is it useful for the community? What is the best form to report this? What is the best practice? --ImreSamu (talk) 01:07, 1 February 2018 (UTC)
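For anyone who wants to reproduce this kind of QA list without PostGIS, the "similar name and short distance" check can be approximated in plain Python. This is a sketch with swapped-in techniques, not the method used above: `difflib` stands in for the PostGIS text similarity functions, a haversine formula stands in for the spatial distance, and the thresholds, function names and sample records are all made up.

```python
from difflib import SequenceMatcher
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def possible_duplicates(features, max_km=5.0, min_similarity=0.85):
    """Yield ID pairs whose names are similar and whose coordinates are close."""
    for (id_a, name_a, lat_a, lon_a), (id_b, name_b, lat_b, lon_b) in combinations(features, 2):
        similarity = SequenceMatcher(None, name_a.lower(), name_b.lower()).ratio()
        if similarity >= min_similarity and haversine_km(lat_a, lon_a, lat_b, lon_b) <= max_km:
            yield (id_a, id_b)

# Hypothetical records: (id, name, latitude, longitude)
features = [
    ("item-1", "Mount Example", 46.000, 8.000),
    ("item-2", "Mount Exampel", 46.010, 8.005),   # near item-1, similar name
    ("item-3", "Mount Example", 10.000, 20.000),  # same name, far away
]
```

The pairwise loop is quadratic, so it suits a regional subset better than the whole dataset; a spatial index such as the one PostGIS provides avoids that cost.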

From checking a few of those I see that this is another manifestation of the GeoNames crap that some bot spammed into sv and ceb wikipedias from where it came to Wikidata. In order to maintain my sanity, I have chosen to ignore most of those ceb and sv only items (and they should be deleted anyway IMHO), and only merge and edit those from my immediate vicinity. --Anvilaquarius (talk) 10:54, 1 February 2018 (UTC)

I think this is definitely helpful for finding possible duplicates. But I can say from experience that it's unfortunately not always easy to find out whether such seemingly (almost) identical objects are indeed one and the same thing. I remember looking at Canadian lakes for a disambig page and finding several instances where there were two or three lakes of the same name right next to each other. But I wasn't able to find out whether that was due to a mistake in the official database, whether name parts like "Upper" XY lake or "Lower" XY lake were omitted, or whether there are indeed distinct lakes with the exact same name close by. With mountains, there's also the possibility that the name might apply to a mountain or massif as well as to one of its summits, which isn't always handled the same way in different databases. --Kam Solusar (talk) 16:26, 3 February 2018 (UTC)

Primary sources tool URL blacklist: removal of Wikisource

I think this change on Wikidata:Primary sources tool/URL blacklist needs some more discussion, as the given reasons are not supported by everyone (I think). Posting here because larger coverage, also pinging Hjfocs. Sjoerd de Bruin (talk) 15:00, 2 February 2018 (UTC)

If this list is still used, I suppose it depends on the type of page at Wikisource. Things like https://en.wikisource.org/wiki/A_Naval_Biographical_Dictionary/Fitzgerald,_Charles seem fine. Obviously, there are more elegant ways to add it ..
--- Jura 15:23, 2 February 2018 (UTC)

@Sjoerddebruin: thanks for the heads-up. I thought it would be optimal for StrepHit datasets to have fine-grained references, i.e., using reference URL (P854), stated in (P248), retrieved (P813) and source-specific properties for IDs, e.g., Union List of Artist Names ID (P245).

That said, Wikisource seems to only have a QID: Wikisource (Q263). Therefore, I think it would be too generic to just have stated in (P248) Wikisource (Q263) for statements extracted from Wikisource. I also saw items like Lviv Oblast (Q164193) with full Wikisource URLs. That's why I removed Wikisource from the blacklist. What do you think?

Cheers,

--Hjfocs (talk) 15:27, 2 February 2018 (UTC)

I do not agree with the statement that Wikisource would not be a reliable source: the very aim of Wikisource is to provide reliable sources!

Wikisource, contrary to Wikipedia, does not rely on the wording or the opinion of contributors: it edits books, published books! No POV in this!

I agree that stated in (P248) Wikisource (Q263) would be very poor sourcing. But URL sourcing is not that... the change removed Wikisource URLs from the URL blacklist. Wikisource, by definition, edits texts from books, which means that the source of a statement can very well be a page in Wikisource (i.e. a URL).

Of course, it is better to complete the reference with the title of the text and, better still, the ID of the Wikisource edition, but sometimes it is simpler to give a direct URL of the specific page (of a dictionary, for instance) where the info is. Removing a site from a URL blacklist is, in my opinion, not problematic; moreover, it was very problematic that Wikisource was on that blacklist. --Hsarrazin (talk) 16:07, 2 February 2018 (UTC)

A while ago Wikimedia import URL (P4656) was introduced to distinguish between third-party sources and links to internal Wikimedia projects, and it seems that there was a consensus to exclude Wikisource from this scheme (that is: to keep using reference URL (P854) for Wikisource URLs). So it suggests that it is indeed fine to remove this domain from the blacklist (for the reasons explained by Hsarrazin). − Pintoch (talk) 09:33, 3 February 2018 (UTC)

Terminology for Q and P numbers

For some reason coders on Commons (and possibly elsewhere) have a propensity for the word code: searching for “item Q code” reveals many instances of this vocabulary. How is it a code, indeed? Which mathematical operations can they apply to the number after “Q” to gain a few bits of information? It’s no code, just a numbering system written in decimal. And the only place where I can see “q-code” on Wikidata is Module:Wikidata date. Moreover, “Q code” can be easily confused with the unrelated “QR code” which, indeed, is a code.

Could Wikidata make a statement (for external consumption) discouraging “Q code”, “P code”, and similar? Incnis Mrsi (talk) 18:19, 1 February 2018 (UTC)

I wrote Module:Wikidata date, but the term q-code is not my invention, as I have seen other people using it. I liked it because I knew what someone meant the first time I saw it, while outside of Wikidata the term item is not universally recognized the way article or category are. I think I only use that term in my code comments. --Jarekt (talk) 03:04, 2 February 2018 (UTC)

"QID" would seem to be understandable, without the need to refer to either "item" or "code". But the underlying issue appear to be the necessity of making the Wikidata concept of an "item" known to people who want to work with content from Wikidata. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:54, 2 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:56, 9 February 2018 (UTC)

Automatically adding a value.

I'd like to submit a request via autolist to add actor (Q33999) to all items which have television actor (Q10798782), film actor (Q10800557), stage actor (Q2259451) or voice actor (Q2405480), so it can be properly shown in a biographical template where all the other values have been rendered invisible.

Namely, the problem concerns this biographical template used on French Wikipedia, which displayed all the following professions: "film actor, television actor, actor, stage actor" (in no particular order). It was recently decided to show only "actor" and to hide the other ones, because it was redundant to show all of them in a template as if they were different professions.

But it appears that some items did not have "actor" as a profession, just "film actor", "stage actor" or "television actor" (or all of them). So the change leaves them without a profession in the template. Hence, we'd need Q33999 to be added (or added back) to those items so it can be displayed again. However, I'm afraid I don't remember how to use the gadget. Could you tell me how I should do it, or how I should submit the request? Thank you. Jean-Jacques Georges (talk) 16:50, 3 February 2018 (UTC)

This should be fixed in the infobox module, not Wikidata. --Thibaut120094 (talk) 16:53, 3 February 2018 (UTC)

As I already told Jean-Jacques when he posted the same question on the admin noticeboard... Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:55, 3 February 2018 (UTC)

Well, I was just told to ask here, so I did. If the problem can be fixed in the template, that would be great. Jean-Jacques Georges (talk) 17:33, 3 February 2018 (UTC)
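As a rough illustration of what "fixing it in the infobox module" could mean, here is a minimal sketch of the display logic in Python. The real infobox module on French Wikipedia is not shown in this thread and would be written in Lua; the function name `occupations_to_display` and the placeholder `Q_OTHER` are assumptions made only for this example. The idea is to map every actor subtype to actor (Q33999) at display time and then de-duplicate, so no extra statements are needed on the items.

```python
# Item IDs taken from the discussion above.
ACTOR = "Q33999"
ACTOR_SUBTYPES = {
    "Q10798782",  # television actor
    "Q10800557",  # film actor
    "Q2259451",   # stage actor
    "Q2405480",   # voice actor
}


def occupations_to_display(occupations):
    """Collapse actor subtypes into 'actor' and drop duplicates, keeping order.

    Hypothetical helper: a sketch of the display rule, not the real module.
    """
    result = []
    for qid in occupations:
        shown = ACTOR if qid in ACTOR_SUBTYPES else qid
        if shown not in result:
            result.append(shown)
    return result


# An item with only subtypes still ends up showing "actor" once.
print(occupations_to_display(["Q10800557", "Q2259451"]))
# "Q_OTHER" is a placeholder for any unrelated occupation.
print(occupations_to_display(["Q10800557", "Q33999", "Q_OTHER"]))
```

With this approach the template no longer depends on whether actor (Q33999) happens to be stated on the item alongside its more specific subtypes.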

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:57, 9 February 2018 (UTC)

Problem editing ranked statements

On The Fall of the House of Usher (Q16992029), there are three statements listed under BBFC rating (P2629). One is ranked as preferred, one normal, and one deprecated. I can only edit the normal one; the other two are uneditable, and the references are unviewable. Yet if I log out, all three are editable. Is this a glitch, have I been blocked from editing something, something else? Thanks. Trivialist (talk) 18:38, 3 February 2018 (UTC)

I can click edit on any of them. Sometimes purging helps. There is a similar report on fr project chat: Topic:U6tkw1d0kp1bwzdz.
--- Jura 12:29, 4 February 2018 (UTC)
- ~~Purging the page, or my browser cache? I tried both, and nothing changed. Same thing happens using a different browser. Trivialist (talk) 20:19, 4 February 2018 (UTC)~~ Update: It loads fine using safe mode, so it's one of my gadgets or scripts. Trivialist (talk) 20:42, 4 February 2018 (UTC)
  - @Trivialist: I have the same problem. Did you find the offending gadget or script? Tubezlob (🙋) 22:29, 4 February 2018 (UTC)
    - @Tubezlob: Looks like it was User:Soulkeeper/statementSort.js. Trivialist (talk) 22:40, 4 February 2018 (UTC)
      - @Trivialist: Thank you! Tubezlob (🙋) 17:18, 7 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:57, 9 February 2018 (UTC)

Requested merger

Can someone merge the items Yansheng Coin (Q15903165) and Yansheng Coin (Q8048955)? I have no idea how to merge items, and I don't see a button to do that anywhere. -- 徵國單 (討論 🀄) (方孔錢 💴) 13:15, 6 February 2018 (UTC)

@Donald Trung: Done. Thanks for recommending this. ArthurPSmith (talk) 16:09, 6 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:58, 9 February 2018 (UTC)

Non-free image URLs

While we have recently added Commons compatible image available at URL (P4765), and of course have image (P18), we have no way of noting that there is a non-free image available on the web, unless we (ab)use official website (P856) or described at URL (P973). Before I propose a specific property, does anyone have an alternative method? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:40, 6 February 2018 (UTC)

@Pigsonthewing: Perhaps we could expand the scope of full work available at URL (P953) to not just include text? As for a property proposal, you could repropose one. @Strakhov, ChristianKl, Pasleim, ديفيد عادل وهبة خليل 2:, as those involved in the proposal I just linked to. Mahir256 (talk) 18:08, 6 February 2018 (UTC)

I have done that, at least as a temporary measure, for the immediate case I had in mind, which happened to be about paintings. But that won't work for, say, pictures of people or buildings, or stills from movies, or book jackets. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:03, 6 February 2018 (UTC)

@Pigsonthewing: I agree that it is quite frustrating not to be able to link to images outside Commons. There are many fair use images on Wikipedia that we cannot link to, and it seems to me that it is a technical constraint rather than a legal one (as we are technically not importing the image in Wikidata). So I would support the creation of a URL-counterpart to image (P18) (with the appropriate format constraint to ensure that we are not adding Commons images there, for instance). − Pintoch (talk) 18:01, 7 February 2018 (UTC)

Proposed: at Wikidata:Property proposal/external image URL. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:53, 7 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:00, 9 February 2018 (UTC)

chinese entries

Is there a separate entry for Traditional Chinese and Simplified Chinese entries? Artix Kreiger (talk) 20:02, 8 February 2018 (UTC)

Yes, labels can be entered with language code 'zh' (general Chinese, a default), 'zh-hans' for simplified Chinese, 'zh-hant' for traditional Chinese, and there are several other options. You can see the list when selecting languages in the link to "Create a new item" on the left. ArthurPSmith (talk) 20:13, 8 February 2018 (UTC)

This section was archived on a request by: Liuxinyu970226 (talk) 03:26, 9 February 2018 (UTC)

What is the name of the tool in Property Proposal that allows you to type the page name into a box and start the page

Hi all

Does anyone know the name of the tool in Property Proposal that allows you to type the page name into a box and start the page, e.g. here? I tried looking at the code for the page and cannot work it out at all.

Thanks

--John Cummings (talk) 21:48, 8 February 2018 (UTC)

Well, I worked it out through random googling: it's called InputBox. --John Cummings (talk) 22:13, 8 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:01, 9 February 2018 (UTC)

Wikimania 2018 call for submissions now open

On behalf of the program committee of Wikimania 2018 - Cape Town, we are pleased to announce that we are now accepting proposals for workshops, discussions, presentations, or research posters to give during the conference. To read the full instructions visit the event wiki and click on the link provided there to make your proposal:

https://wikimania2018.wikimedia.org/wiki/Submissions

The deadline is 18 March. This is approximately 6 weeks away.
This year, the conference will have an explicit theme based in African philosophy:

Bridging knowledge gaps, the ubuntu (Q213843) way forward.

Read more about this theme, why it was chosen, and what it means for determining the conference program at the Wikimedia blog. Sincerely, Wittylama (talk) 08:18, 5 February 2018 (UTC)

m:Grants:Project/ScienceSource

ScienceSource is the new grant proposal from WikiFactMine. It is quite heavy on technical detail, not all of which could be in the proposal itself: if you wish to raise technical points, or other matters, please contribute to m:Grants talk:Project/ScienceSource.

Since the 2016 proposal that set up WikiFactMine, things have moved on quite a way. Technically, the fact mining has moved over to using SPARQL dictionaries, with hard cases done via PagePile, but in any case all mapped to Wikidata items. There is a conversion tool for that (aaraa). The progress of WikiCite means it is quite realistic to think about routine imports of metadata, that will sit as statements on items about scientific articles. This new proposal would also pay much more attention to the export mechanism to Wikidata.

The latest Facto Post newsletter, by the way, is w:User:Charles Matthews/Facto Post/Issue 9 – 5 February 2018. It is a mass message on English Wikipedia, where you can subscribe. The editorial features the recent Hub tool. Charles Matthews (talk) 12:12, 5 February 2018 (UTC)

Interesting initiative

This is a toothbrush (Q134205)
This is a toothbrush (Q134205)
This is a book (Q571)

You might have tried some of the apps that use a google algorithm to identify objects. Generally, they work quite well and output some text.

At m:Grants:Project/AICAT, there is a proposal to make this work for Commons by suggesting categories.

Categories at Commons are generally linked to items here and some have wordnet/imagenet identifiers (e.g. Q79746#P2888) pointing to image sets that may have been used to train the algorithm.

So the output could effectively be a qid ..
--- Jura 12:51, 4 February 2018 (UTC)

Sorry, why is the above photo of two toothbrushes captioned as a book? —Justin (koavf)❤T☮C☺M☯ 20:07, 4 February 2018 (UTC)

@Koavf: I would presume that's an example of an error in some image-recognition algorithm and not Jura's fault. Mahir256 (talk) 00:24, 5 February 2018 (UTC)

Cool, the Commons app could use this! (to apply Commons categories to pictures that users upload) Syced (talk) 12:49, 6 February 2018 (UTC)

Grant ideas

Hi all,

I proposed a Wikimedia Grant: available here. @Jura1: called my attention to the fact that it was not clear to him how we could contribute to Wikidata. I made some changes to make it clearer, but I would like to hear from you what I could include in the project that would be beneficial for the community.

The idea for Wikidata is the revision and inclusion of descriptions, photos and their categories in items around the theme that we will work on (if approved) this year: Brazilian Natural Domains ("biomes"). Also, in Brazil the biggest encyclopaedia about birds is not Wikipedia but WikiAves, which is a wiki-based website; one idea is to create an identifier for it (and to try to build a bridge between us, so that maybe one day they come back to free culture). If this pilot works, the second step would be to check all Brazilian birds, which is far more challenging but also important.

Any ideas? And if you like it and could give an endorsement, that would be delightful! Cheers. Rodrigo Tetsuo Argenton (talk) 19:42, 5 February 2018 (UTC)

P4812

Hello. Statistical Service of Cyprus Geocode (P4812). In my proposal I wrote that the format must be “[1-9]\d\d\d(-\d\d) . That is correct. But I have noticed that for the 6 districts of Cyprus (Q59136), and only for them, the codes are 1, 2, 3, 4, 5 and 6 (just one digit). I have entered them in Wikidata as 0001, 0002, 0003, 0004, 0005, 0006. I know that is wrong, and I get an error because the value does not start with [1-9]. (The code for each district is also used as the first digit of the four-digit codes of the places in that district.) Xaris333 (talk) 18:17, 9 February 2018 (UTC)

Do you want "0001" as accepted format for districts or "1"? I just changed the regex to the second one. Don't hesitate to delete the regex if it complicates imports. It can be re-added afterwards.
--- Jura 18:23, 9 February 2018 (UTC)

That's what I wanted. Thanks! Xaris333 (talk) 19:36, 9 February 2018 (UTC)
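To make the format question concrete, here is a small Python sketch of a pattern that accepts both the one-digit district codes and the four-digit place codes. The pattern below is an assumption written for illustration: the regex actually stored on P4812 after the change is not quoted in this thread, and treating the "-NN" suffix as optional is also a guess. The name `is_valid_geocode` is hypothetical.

```python
import re

# Hypothetical pattern, NOT the constraint actually stored on P4812:
# one leading digit 1-9 (the district), optionally followed by three more
# digits (a place code), optionally followed by a "-NN" suffix.
GEOCODE_PATTERN = re.compile(r"[1-9](\d{3}(-\d{2})?)?")


def is_valid_geocode(code: str) -> bool:
    """Return True if the whole string matches the assumed format."""
    return GEOCODE_PATTERN.fullmatch(code) is not None


for sample in ["1", "6", "0001", "1000", "1000-01"]:
    print(sample, is_valid_geocode(sample))
```

Under this assumed pattern the zero-padded workaround "0001" is rejected, while the plain district code "1" is accepted, which matches the behaviour asked for in the thread.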

This section was archived on a request by: Matěj Suchánek (talk) 15:44, 12 February 2018 (UTC)

Merge request

I created Q48240082, not realizing that Q48239125 had been created automatically. -- Zanimum (talk)

Done - PKM (talk) 23:35, 9 February 2018 (UTC)

And another

I tried to merge Q48255518 into Q31828687, but it doesn't seem to have worked, someone else who knows what they are doing should take a shot. - Jmabel (talk) 01:22, 10 February 2018 (UTC)

That's odd, I just merged them and it worked fine. Ederporto (talk) 01:39, 10 February 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 09:21, 12 February 2018 (UTC)

In pectore (Italian)

Could an Italian speaker please untangle In pectore (Q1319909) and In pectore (Q21192428) and their it.Wikipedia links? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:02, 12 February 2018 (UTC)

I moved it.wikipedia sitelink in In pectore (Q1319909) to a new item and merged In pectore (Q21192428) and In pectore (Q1319909) --ValterVB (talk) 14:18, 12 February 2018 (UTC)

Thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:42, 12 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:42, 12 February 2018 (UTC)

Definition of items as terms

We don't define association football player (Q937857) as "instance of sporting term".

We don't define vicar (Q193364) as "instance of Christian term".

We don't define locomotive (Q93301) as "instance of transport term".

We don't define bird (Q5113) as "instance of biological term".

We don't, generally, define classes of objects as an "instance of term"

So there appears to be no good reason to define paratype (Q926578) (and other subclasses of type (Q3707858), such as allotype (Q19353437) and ergatotype (Q19353481)) as "instance of taxonomic term". Can colleagues confirm that there is no need to do so? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:28, 2 February 2018 (UTC)

WikiProject Taxonomy has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:43, 2 February 2018 (UTC)

Also: If we wish to describe a specific example of a paratype (in, say, a museum collection) then we would not want to class it as being an "instance of an instance of a term". Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:21, 2 February 2018 (UTC)

It becomes increasingly difficult to imagine in what kind of realm Andy Mabbett's mind moves. In this edit he claims that the zoological term "ergatotype" is a subclass of a "reference sample by which a mineral is defined".

His edit comment is "not what source says"; the source in this case being a semi-humorous listing of published and unpublished terms, often of a very obscure nature. The relevant entry repeats something from a 1939 publication, apparently not used since. Not a reliable source in the first place. And in as far as the definition repeated there says anything, there is no indication whatsoever that this "ergatotype" would be a paratype, as Andy Mabbett states. No reason to assume that it would be, either (why shouldn't it be a holotype? Or something else?). - Brya (talk) 17:56, 2 February 2018 (UTC)

Before posting here (and so long before your comment) I'd already replaced "subclass of type specimen (Q7860915)" (which is actually described as "type material of a mineral species" (emphasis mine), and not the term used above, and was labelled "type specimen"; and which was the only item returned when searching for "type specimen", since type (Q3707858) lacked that alias) with "subclass of paratype (Q926578)", but that issue has nothing to do with the question raised, which you do not address. And once again: cease your ad hominem abuse. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:41, 2 February 2018 (UTC)

By way of a discussion point, I've remodelled paratype (Q926578), in these edits; and type (Q3707858), here. Can anyone suggest improvements to those items? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:01, 2 February 2018 (UTC)

Similarly hapantotype (Q19353478), here, which was also marked as an "instance of zoological nomenclature (Q3343211)" (and from which I also removed the use of Q7860915, added by another editor, and which had been present for many months without attracting comment). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:52, 3 February 2018 (UTC)

I concur with Andy’s questions. They deserve answers. author TomT0m / talk page 11:30, 3 February 2018 (UTC)

See also 196 instances of technical term (Q12812139) and 25 of legal concept (Q16874643). - PKM (talk) 03:27, 4 February 2018 (UTC)

We have the same issue in cases like Marsh Harrier (Q14706270), which is given as an instance of common name (Q502895), where it in fact refers to a group of bird species. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:25, 7 February 2018 (UTC)

ECI Status of Indian political parties

In India a political party (Q7278) can either be at the state level, or the national level. The Election Commission of India (Q2724317) decides which party is at which level and provides an ECI Status. This can be seen in the Indian political party infobox too. I was wondering if this information should be part of Wikidata too, and if so, then how? For example, national party - Indian National Congress (Q10225), and state party - Aam Aadmi Party (Q129844). Prtksxna (talk) 09:42, 4 February 2018 (UTC)

I think the best approach would be legal form (P1454) - you can then create an item for "national level Indian political party" and "state level Indian political party", and use qualifiers for dates in case they've moved from one to another. You could also have a third item for "unrecognised Indian political party" to describe those.

For state parties, you could then use applies to jurisdiction (P1001) to show the state[s] they apply to. (Also, I've learned something today - I never realised the AAP was state not national :-).) Andrew Gray (talk) 15:06, 4 February 2018 (UTC)

@Andrew Gray: I was also going to suggest applies to jurisdiction (P1001), but on reflection I'm now not so sure. To me that would imply that the party only exists / contests elections in that State, but as I understand it parties could stand more widely, but only be successful enough in a small number of states to gain official status there. --Oravrattas (talk) 15:11, 4 February 2018 (UTC)

@Andrew Gray: Thanks a ton ☺! I have added state level Indian political party (Q47717875), and national level Indian political party (Q47717839), and made it the legal form (P1454) of the necessary parties. I was wondering what other details I should add to these two items. Are there items similar to this that could guide me? @Oravrattas:, I agree with you, once elected to the Lok Sabha (Q230003) the ministers make law for the whole country. By the way, in case you're interested, I wanted to add this detail to be able to make the following viz:

#defaultView:Graph
#National political parties in India and their ideologies
SELECT ?party ?partyLabel ?ideology ?ideologyLabel WHERE {
  ?party wdt:P1454 wd:Q47717839.
  ?party wdt:P1142 ?ideology.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

@Prtksxna: Very nice! Other than the usual (leader, headquarters location, inception date, and so on) the thing that would be really useful for India is a way to link the parties to the big alliances/coalitions, particularly for minor parties - so a way to connect Indian National Congress (Q10225) and United Progressive Alliance (Q1323719). @Oravrattas:, could we use parliamentary group (P4100) for this, or is there a better way? part of (P361) doesn't quite seem right... Andrew Gray (talk) 13:51, 5 February 2018 (UTC)

@Andrew Gray: parliamentary group (P4100) is currently scoped to only being used as a qualifier, which I think is probably sensible. I'm not sure I understand this situation well enough, though. Does United Progressive Alliance (Q1323719) appear on ballot papers, or do people stand for the individual parties, but sit as part of the wider group? If it's the latter, our usual approach would be to attach the electoral party to a candidacy in election (P3602) statement, and have the UPA in the parliamentary group (P4100) qualifier to the position held (P39) (see, for example, Angela Merkel (Q567)). In such a case, I'd also suggest changing the instance of (P31) to parliamentary group (Q848197) as well though. As for connecting one to the other, there doesn't appear to be a lot of consistency at the minute. I wouldn't be opposed to part of (P361), though member of (P463) also looks like it's used. --Oravrattas (talk) 21:05, 5 February 2018 (UTC)

@Oravrattas, Andrew Gray: To the best of my knowledge, candidates for election in India stand on behalf of their party and not on behalf of the party's alliance, as parties frequently leave and join alliances; for example, All India Trinamool Congress West Bengal (Q912899) was originally in the NDA, then joined the UPA, and is presently independent. It is highly unlikely to see two parties from the same alliance stand in the same constituency, though I'm sure @Prtksxna: can correct me on this point. As such Oravrattas' idea to use parliamentary group (Q848197) makes perfect sense. Mahir256 (talk) 17:34, 7 February 2018 (UTC)

Merge request

For some reason I cannot merge Patel (Q2056687) and Patel (Q35571410). Can someone look into it? --Hindust@ni^{क्या करें? बातें!} 13:33, 6 February 2018 (UTC)

Because they are two completely different concepts. Sjoerd de Bruin (talk) 13:40, 6 February 2018 (UTC)

Both are about the surname. What am I missing here? --Hindust@ni^{क्या करें? बातें!} 13:43, 6 February 2018 (UTC)

The latter has statement instance of (P31): Wikimedia disambiguation page (Q4167410). So this is not a surname. Matěj Suchánek (talk) 13:51, 6 February 2018 (UTC)

Note, though, that I moved the Gujarati article to Patel (Q35571410) because it doesn't read like a disambiguation page, but rather like an actual article, presumably surname-related. I'm not sure whether the Hindi one is similar or not. --Liuxinyu970226 (talk) 04:17, 7 February 2018 (UTC)

WikiProject India has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead.

The Gujarati article refers to the Jāti (Q2915083) named "Patel": the Hindi article refers to it as a surname but lists the two jatis mentioned in the second sentence of the Gujarati article. To the best of my knowledge there are articles about many (sub)castes on some of the Indian-language Wikipedias—it will take me some time to find them as I don't recall them being linked to anything else–but whether someone wants to split off the Patel jati article from the Patel surname item into its own item is another story.

@Matěj Suchánek, हिंदुस्थान वासी, Liuxinyu970226, Sjoerddebruin: Mahir256 (talk) 17:32, 7 February 2018 (UTC)

Wikidata edits trigger new user message bot edits at sister wikis

Tracked in Phabricator: Task T186945 (Resolved)

I have discovered that an edit at Wikidata triggers the new user message bots that can be active at sister wikis. It is definitely confusing to get such a welcome message when you have no account at, and have never edited, the wiki that notifies you. I am uncertain whether the resolution lies with the edit detection or with the edit notification process. I have created a Phabricator ticket to raise the issue with the respective developers. — billinghurst sDrewth 05:10, 10 February 2018 (UTC)

While your general point is valid; accounts have been "unified" across all projects for some time now, so "when you have no account" is false. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:39, 10 February 2018 (UTC)

Your local account gets created when you visit some project for the first time. Sjoerd de Bruin (talk) 15:43, 10 February 2018 (UTC)

@Pigsonthewing: dear oh dear, if you wished for a clarification then please ask, otherwise sometimes just shut up. Please see Special:CentralAuth/Pigsonthewing; then if you want to quibble about architecture, you have an account per wiki and you have 365 listed all aligned to a username. What you misquoted as an account is a m:single user login, but hey, just keep talking. — billinghurst sDrewth 00:04, 11 February 2018 (UTC)

@billinghurst Thank you for your kind suggestion. No. And since I'm not shutting up, I'll note that the majority of the "365 accounts" you mention are on wikis that I have never visited, much less edited. But - guess what! - if I do visit one of them, I already have an account! You know what all that means? It means that on each and every one of the wikis that I have never visited, and never edited, "when you have no account" is false. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:34, 11 February 2018 (UTC)

Hello @billinghurst, Pigsonthewing, Sjoerddebruin: I have the same problem.

Until December 2017, I had approximately 108 local accounts listed by meta:Special:CentralAuth/NicoScribe: they were the consequences of my visits to 108 Wikimedia projects. But after the deployment of mw:MediaWiki 1.31/wmf.12 I had approximately 404 local accounts (although I had made almost no new visits). Now, after the deployment of mw:MediaWiki 1.31/wmf.20, I have approximately 509 local accounts (although I have made almost no new visits).

In December, I had a talk (in French) with Trizek (WMF): everything is linked to Phab:T181731 and Phab:T179832.

Regards --NicoScribe (talk) 18:10, 10 February 2018 (UTC)

NicoScribe as you will have seen, I have added your text to the ticket as I think that your information is useful. — billinghurst sDrewth 00:04, 11 February 2018 (UTC)

@billinghurst: OK, thanks. Phab:T181731 and Phab:T179832 are complex (I don't understand their technical details) so, in my message above, I should have pointed to the sentence "Actually, it's not just imports. My work account got registered in plenty of wikis, even though it hasn't edited anything that would be imported to many wikis. EBernhardson on IRC figured out that it was because of Wikidata changes being reflected in projects' Recent changes and Watchlist feeds. In my case, this edit to an item for a template used in many different projects probably triggered most of those account creations." by Jon Harald Søby in T181731.

Phab:T181731 and Phab:T179832 were closed yesterday and today by Anomie. But I don't know whether future Wikidata edits will still trigger the creation of local accounts on unvisited projects (which in turn triggers the welcome messages by bots). --NicoScribe (talk) 17:42, 11 February 2018 (UTC)

Done. It was noted that code changes have been made so that local accounts are no longer created at a sister wiki when a Wikidata edit impacts that sister wiki. No change to newusermessage was required. — billinghurst sDrewth 06:49, 13 February 2018 (UTC)

This section was archived on a request by: Matěj Suchánek (talk) 09:19, 13 February 2018 (UTC)

Patrolling

I asked this above, but it may have gone unnoticed. There is the Wikidata vandalism tool, which works using ORES and patrolling: if I click 'mark as patrolled', the edit no longer shows up on the list. As it is quite different from the Flagged Revisions I know from pl.wiki, I'm not sure what I should do in a specific situation: the vandalism or other 'bad' edit was not reverted, but someone simply edited the item and restored the previous version. Can I mark this vandalism as patrolled so as to remove the edit from the tool's list? Or does marking it as patrolled have something to do with ORES, so that by doing this I mark vandalism as a good edit? Wostr (talk) 00:32, 11 February 2018 (UTC)

I've been marking them as patrolled if the problem has been fixed (or when I fix it). It seems to be the only way to get them off the list. Since the list only has a limited number showing, it's the only way to get new items to appear at the bottom. StarryGrandma (talk) 02:21, 11 February 2018 (UTC)

That can be modified using &limit=xxx, e.g. https://tools.wmflabs.org/wdvd/index.php?lang=pl&description=on&labels=on&sitelinks=on&limit=200 – but it's a bit annoying when the first xx edits are checked, but are still on the list. Wostr (talk) 13:42, 11 February 2018 (UTC)
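For anyone who wants to vary the list size without editing the address by hand, the URL above can be assembled from its query parameters. This is only a sketch: the parameter names are copied from the URL quoted in the comment, the helper name `build_tool_url` is made up for the example, and nothing here checks which values the tool actually accepts.

```python
from urllib.parse import urlencode

# Base address taken from the URL quoted in the discussion.
BASE_URL = "https://tools.wmflabs.org/wdvd/index.php"


def build_tool_url(lang="pl", limit=200):
    """Build the tool URL with the given language and list size (hypothetical helper)."""
    params = {
        "lang": lang,
        "description": "on",
        "labels": "on",
        "sitelinks": "on",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


print(build_tool_url(limit=200))
```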

First, yes, you should mark them as patrolled, as far as I am aware nothing treats a patrolled edit as good or bad, it just means the edit was reviewed by a person who did whatever was needed. Second, I had a request in the community wishlist to fix this issue with reverted edits showing up in recent changes - meta:2017 Community Wishlist Survey/Miscellaneous#Allow filtering of recent changes and user contributions by whether they have been reverted or superseded - it didn't make the cut with enough support this year, but maybe one of our vandalism tools can look at implementing that feature somehow? It would be a great help! ArthurPSmith (talk) 15:41, 12 February 2018 (UTC)

Okay, thanks for the answer ;) Wostr (talk) 13:07, 13 February 2018 (UTC)

This section was archived on a request by: Wostr (talk) 13:07, 13 February 2018 (UTC)

Merge

I'm not sure where I could ask this so I'm asking here. Please make Q37755107 a redirect to Q37755106. Both articles are about the same river (I have just removed duplicates from the cebwp). Thanks in advance. --Wolverène (talk) 12:12, 13 February 2018 (UTC)

Merged - See Help:Merge for how you can do this yourself next time. Mbch331 (talk) 12:56, 13 February 2018 (UTC)

This section was archived on a request by: Mbch331 (talk) 12:56, 13 February 2018 (UTC)

Merged items that were the objects of statements

If one item gets merged into another, what happens to statements that it was the object of ?

Is there a bot that updates them to point to the new combined item? Or do they stay pointing at the old item, and therefore not picked up by Reasonator or WDQS queries for statements whose value is the new item? Jheald (talk) 14:58, 6 February 2018 (UTC)

I run a bot (i.e. whenever I have time) that fixes links to redirects that were created at least 7 days ago. Matěj Suchánek (talk) 15:26, 6 February 2018 (UTC)

Thanks! One less thing I need to be worried about. :-) Jheald (talk) 17:38, 6 February 2018 (UTC)

@Matěj Suchánek: Could you wait at least a couple of weeks after the merging? Unfortunately, wrong merges are quite common. So it would be nice to have some extra time to find those wrongly merged items and split them without having to worry about changing statements too. I hope that is not a problem, Andreasm ^{háblame / just talk to me} 21:50, 7 February 2018 (UTC)

It isn't problem to let the bot wait, the problem is the negative effects of redirects: constraint violations cannot deal with them, Lua functions cannot resolve them (ie. no labels are shown in infoboxes), native query service functions don't work etc. So that's why I prefer delay as small as possible. Would it help if I made a ListeriaBot list with all recent merges with links to redirects? Matěj Suchánek (talk) 09:00, 8 February 2018 (UTC)

My impression was that your bot already has some kind of an undo feature that can put back the un-merged item into claims again. There was a case recently where the merge was undone after ~2 weeks, and I saw your bot putting the un-merged item back in many cases (example). Doesn’t this work reliably? —MisterSynergy (talk) 09:26, 8 February 2018 (UTC)

Whatever you call it, it was just another script, not a feature of my bot, and I admit it is unreliable. Matěj Suchánek (talk) 12:05, 9 February 2018 (UTC)
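The redirect-fixing pass described in this thread can be sketched in Python, purely as an illustration. This is not the bot's actual code: the `fix_redirect_targets` function, the tuple-based statement model and the `redirects` mapping are all hypothetical, and the 7-day threshold is simply the figure mentioned above.

```python
from datetime import datetime, timedelta


def fix_redirect_targets(statements, redirects, now, min_age_days=7):
    """Rewrite statement objects that point at sufficiently old redirects.

    statements: list of (subject, property, object) item-ID tuples
                (a hypothetical, simplified model of item statements).
    redirects:  dict mapping a redirected item ID to a tuple of
                (target item ID, datetime the redirect was created).
    """
    threshold = timedelta(days=min_age_days)
    fixed = []
    for subject, prop, obj in statements:
        entry = redirects.get(obj)
        if entry is not None:
            target, created = entry
            # Only follow redirects old enough that a wrong merge
            # has had a chance to be noticed and undone.
            if now - created >= threshold:
                obj = target
        fixed.append((subject, prop, obj))
    return fixed
```

Raising `min_age_days` to 14 would model the longer waiting period requested above, at the cost of leaving statements pointing at redirects for longer.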

Two items for the same person, under different names

Sonia Fisch-Muller (Q47462781) and Sonia Fisch-Muller (Q25368065) are the same person, linked using replaces (P1365)/replaced by (P1366). In view of the discussion here, I'm seeking consensus from the wider community before merging them. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:02, 8 February 2018 (UTC)

Why do you think that it is right for you to stir the pot? You don't think that it is a tad provocative, especially while the person is blocked. I agree that we have that human once, and one item, not versions of them, and they should be merged. I don't think that you doing it is particularly helpful. — billinghurst sDrewth 14:40, 8 February 2018 (UTC)

I think it is right that such bad practices be remedied, and that any editor may do so. It is because of the unusual circumstances that, rather than simply merging the items myself, I opened a discussion. What alternative remedy to the problem would you have preferred? [For the avoidance of doubt, Sonia Fisch-Muller is not blocked, and so far as I know has never edited, here] Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:05, 8 February 2018 (UTC)

Mr. Mabbett: The discussion was opened by Totodu74, and the blocked user is Brya (see your involvement). --Succu (talk) 22:20, 8 February 2018 (UTC)

Hello Andy Mabbett, thanks for your help. object named as (P1932) could be helpful, and spare us such forked items for a single person, though I would like to see how to use this property in concrete terms. For instance, Lithoxus boujardi (Q3764850) currently links to Sonia Fisch-Muller (Q47462781) in the taxon name (P225) section. How would you replace the links to Sonia Fisch-Muller (Q47462781) with Sonia Fisch-Muller (Q25368065) and still get a correct rendering (only "Muller") in the taxobox with the gadget User:FelixReimann/taxobox.js? Totodu74 (talk) 08:28, 9 February 2018 (UTC)

Hello, for authors (writers) who are known under many names, we use object named as (P1932) as a qualifier, so that it is possible to retrieve the name under which the work was published. Why not do the same for taxon authors? The logic is exactly the same. This just needs the case to be handled in the infoboxes --Hsarrazin (talk) 09:53, 9 February 2018 (UTC)

A question about the API search "wbsearchentities"

When you do a search for an item that has a disambiguation suffix, such as "Meta (spider)", it comes up fine in the ordinary Wikidata search. For example, if you search for Meta, you'll find "Meta (spider)", a genus of arachnids, in the fifth position.

However, if you do the same search for Meta using wbsearchentities, Meta (spider) won't appear anywhere in the first 150 results. It will come up if you search for "Meta spider" or "meta (spider)".

If I'm looking for a Q-number for Meta the arachnid, I may not know whether it's a spider, mite, or scorpion. It seems like wbsearchentities should behave like the ordinary search and find "Meta" without specifying "spider". Is there a parameter or an API search method that I'm missing? Edibobb (talk) 04:05, 9 February 2018 (UTC)

Try this. --Edgars2007 (talk) 04:46, 9 February 2018 (UTC)
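The target of the link above is not preserved here, so the following is only a sketch of one way to compare the two kinds of search from a script. It builds request URLs for the `wbsearchentities` module and for the MediaWiki full-text search module (`list=search`), on the assumption that the latter is closer to what the ordinary search box does. The helper names are hypothetical; no request is actually sent.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def wbsearchentities_url(term, language="en", limit=50):
    """URL for the label/alias search provided by wbsearchentities."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "limit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def fulltext_search_url(term, limit=50):
    """URL for the general full-text search module (list=search)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)
```

Fetching both URLs for the term "Meta" and comparing the returned item IDs would show whether the full-text module surfaces the spider genus where `wbsearchentities` does not.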

Possible relations between topic (Q200801) and session (Q932410)

I don't intend to merge the two, as I can't find anything that supports doing so.

But recently a zhwiki user @YFdyh000: told me that they may have the same meaning in some areas, so I'm wondering which property could logically be used to link the two. --Liuxinyu970226 (talk) 03:22, 9 February 2018 (UTC)

Are you sure these are the right items? Reading the Wikipedia articles, one is a « set of data exchanges that took place during a computer connection » (or something like that) and the other is a « relationship between an exchange among several participants and the object of the exchange », and is used in linguistics. If you generalize the computer networking notion, we could create a superclass « generalized session » with the definition « set of message exchanges that took place during a communication between agents ».

I guess that each instance of « generalized session » could then have a property « topic » linking the set of exchanges in question to the topic of the exchange.

I’d go for an indirect relationship (and I edited Wikidata accordingly):

⟨ main subject (P921) ⟩ Wikidata item of this property (P1629) ⟨ topic (Q200801) ⟩
as « topic » is a relationship between the communication and what is exchanged about during the communication. Treating the communication as a kind of work, this is the corresponding property.
and
⟨ generalized session ⟩ properties for this type (P1963) ⟨ Wikidata item of this property (P1629) ⟩
as each communication session instance may have a statement with the property
and/or
⟨ generalized session ⟩ has quality ⟨ topic (Q200801) ⟩
. Not sure about that one, as « has quality » originates from the OBO project and seems not to apply to classes; what is wanted is something that means « any communication session has an associated topic ».
And as a working hypothesis
⟨ session ⟩ subclass of (P279) ⟨ generalized session ⟩
(or communication session, a little dubious as linguistics is not mainly interested in giving a meaning to machine/machine communication). author TomT0m / talk page 11:44, 9 February 2018 (UTC)

Perhaps said to be the same as (P460)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:47, 9 February 2018 (UTC)

We do that when we actually don’t know :/ I think we can do better. author TomT0m / talk page 12:00, 9 February 2018 (UTC)

Heads-up: Mass item creation for Sachsen monuments

I am about to start creating ~60K items for the Denkmalliste Sachsen, based on lists on German Wikipedia (example), which are based on official lists, and have been curated on Wikipedia. They are all Wikidata-notable by default, and will all have LfDS object ID (P1708), so they are easy to find and fix if something is not right. Example item: Am langen Felde 1, 3, Leutzsch (Q48111431). Bot creations. Discussion about the creation. Let me know (fastest via email or Twitter!) if something goes horribly wrong. --Magnus Manske (talk) 13:11, 9 February 2018 (UTC)

Organisation properties

I have just created {{Organisation properties}} - please help to populate it. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:19, 9 February 2018 (UTC)

Note we have Wikidata:WikiProject Organizations, also Wikidata:List of properties/Organization. ArthurPSmith (talk) 16:38, 9 February 2018 (UTC)

Converting strings to numbers in SPARQL

Does anyone know how to use the SPARQL REPLACE function? (I can't find complete documentation.)

What I want to do is:

F21 -> 21
F21a -> 21.01

The leading "F" is the same for all inputs. The trailing letter can be anything from a to z, I think. My purpose is to enhance the sorting of Wikidata:WikiProject sum of all paintings/Catalog/L'Œuvre de Vincent van Gogh, catalogue raisonné . --Zolo (talk) 10:43, 4 February 2018 (UTC)

Would it make sense to use BIND within the query, and use the new variable to sort the results? For example:

SELECT ?item ?catcode ?catnum WHERE { 
    ?item p:P528 [ pq:P972 wd:Q17280421 ; ps:P528 ?catcode].  
    BIND(REPLACE(?catcode, "[a-zA-Z]", "") AS ?catnum)
  } ORDER BY xsd:integer(?catnum)


Sorry if I misunderstood the question. Prtksxna (talk) 10:59, 4 February 2018 (UTC)

Almost, but there is still something that is not quite right: if two items have catalogue codes F21 and F21b, we should make sure that F21b comes after F21, which this query does not do. Actually I have just found a way to do it, but it looks awful:

Query

So maybe there is a cleaner way to convert all those letters into numbers ?--Zolo (talk) 11:42, 4 February 2018 (UTC)

One "solution" would be adding natural sorting in Listeria bot code. Other solution would be some kind of char function, but after a quick googling it doesn't seem to be available for SPARQL. --Edgars2007 (talk) 12:05, 4 February 2018 (UTC)

Maybe using ORDER BY ?catnum ?optionalcatletter
--- Jura 12:09, 4 February 2018 (UTC)

Great! This way, it uses a secondary sort key with the intended result:

SELECT ?item ?catcode WHERE { 
    ?item p:P528 [ pq:P972 wd:Q17280421 ; ps:P528 ?catcode].  
    BIND( xsd:integer(REPLACE(?catcode, "[a-zA-Z]", "")) AS ?catnum)
    BIND(REPLACE(?catcode, "F[0-9]*", "") AS ?catletter)
  }
ORDER BY ?catnum ?catletter


--Zolo (talk) 12:36, 4 February 2018 (UTC)
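For anyone post-processing the results outside SPARQL, the same two-level sort can be sketched in Python. This is a minimal illustration, assuming every catalogue code is an "F", a run of digits and at most one lowercase trailing letter; the function names are hypothetical.

```python
import re

# Assumed shape of a catalogue code: "F", digits, optional lowercase letter.
CODE_RE = re.compile(r"^F(\d+)([a-z]?)$")


def catalogue_sort_key(code):
    """Split a code such as 'F21b' into (21, 'b') for sorting."""
    match = CODE_RE.match(code)
    if match is None:
        raise ValueError(f"unexpected catalogue code: {code!r}")
    return (int(match.group(1)), match.group(2))


def catalogue_decimal(code):
    """Map 'F21' to 21 and 'F21a' to 21.01, as asked in the original question."""
    number, letter = catalogue_sort_key(code)
    offset = ord(letter) - ord("a") + 1 if letter else 0
    return round(number + offset / 100, 2)
```

Sorting with `sorted(codes, key=catalogue_sort_key)` places F21 before F21a and F21b, and all of them before F100, which mirrors `ORDER BY ?catnum ?catletter`.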

The second query was causing horizontal scroll issues on the whole page, so I've replaced it with a link. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:17, 9 February 2018 (UTC)

Reversing an incorrect merge

Is there a simple way to undo a merge? This merge is incorrect but it would be quite tedious to undo by hand. Thanks in advance for good advice! Pichpich (talk) 21:22, 9 February 2018 (UTC)

You can simply use undo: first on the item that was merged into, then on the redirect (the item that was merged from). —MisterSynergy (talk) 21:29, 9 February 2018 (UTC)

I've undone it to make sure I have the steps right. Restore the old versions of each item, starting with the one that got turned into a redirect. Do a diff between the version you want and the newest version, and click restore. The process will tell you which sitelinks you will have to remove from the target version for the restore to work. Remove them and click restore again. Then go to the target version and restore to before the merge. I've been doing quite a few of these lately. StarryGrandma (talk) 21:42, 9 February 2018 (UTC)

If you start with the target item, you do not need to handle the interwikis manually. —MisterSynergy (talk) 21:52, 9 February 2018 (UTC)

Thanks. I will put that into practice. The editor who did the mentioned merge went into rapid merge mode yesterday with the Distributed Game. The first one they did isn't right either so I don't have much hope for the others. StarryGrandma (talk) 22:22, 9 February 2018 (UTC)

It would be useful to have a ready and quick means to prevent future mergers of items, especially when the tools that we have make it easy to do unreasonable merges. If one person can make that mistake, others will follow. If there was a standard and quick measure to stop this in future, that would be brilliant. — billinghurst sDrewth 05:14, 10 February 2018 (UTC)

Link them with said to be the same as (P460) or different from (P1889), as appropriate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:41, 10 February 2018 (UTC)

Yes, this is the best way. Though strictly speaking I think any link will stop a merge, so if you have (eg) a father and son who've been mistakenly merged, you don't need the "different from" if you can add "child of" instead. Andrew Gray (talk) 12:06, 10 February 2018 (UTC)
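The idea that any statement linking two items protects them from being merged can be illustrated with a small Python sketch. It assumes, as suggested above, that a merge is refused whenever one item links to the other; the dictionary-based item model and both function names are hypothetical simplifications.

```python
def links_between(item_a, item_b):
    """Return (subject_id, property_id) pairs where one item points at the other.

    Each item is a hypothetical simplified record:
    {"id": "Q1", "claims": {"P31": ["Q5"], ...}}
    """
    found = []
    for source, target in ((item_a, item_b), (item_b, item_a)):
        for prop, values in source["claims"].items():
            if target["id"] in values:
                found.append((source["id"], prop))
    return found


def merge_would_be_blocked(item_a, item_b):
    """Assumption: any direct link between the two items blocks a merge."""
    return bool(links_between(item_a, item_b))
```

In this model, a son item carrying a father (P22) statement pointing at the father item counts as a link just as a different from (P1889) statement would.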

Citizenship status of people before their country existed

There is currently a discussion here, on the Australian Wikipedians' noticeboard on en.wp, regarding the best way to indicate "citizenship" in Wikidata for Australian politicians before 1901 (the foundation of the current nation-state). Before that date there were a series of British colonies and they were British citizens. How should this be modelled - "subject of", "resident in"? This is presumably an issue for any biographies of people in colonies e.g. Canada, New Zealand, America... The issue of dual-citizenship of Australian politicians is a "hot topic" right now because of the w:2017–18 Australian parliamentary eligibility crisis. Wittylama (talk) 13:32, 7 February 2018 (UTC)

cf the discussion above Wikidata:Project_chat/Archive/2018/01#Country of citizenship is Wales. My view is that in order to deal with cases like "Welsh" or "Flemish" we already need to create a property roughly along the lines of "nationality or regional identity" capable of taking any of the values in eg the National/regional section of the Library of Congress Demographic Group Terms authority file, in ways that country of citizenship (P27) is simply not currently working for.

As used by external sources like the LoC, that would allow such individuals to be identified, per sources, as Australian (or indeed American) in a broader sense, without requiring Australia (or the USA) to have been formally constituted as independent countries; and also eg UK individuals to be identified as Welsh or Scottish or English, or British, without getting into the minutiae of which of its various successive states the UK was then constituted as, as a country. Jheald (talk) 13:58, 7 February 2018 (UTC)

Note that there is residence (P551), as well as place of birth (P19). country of citizenship (P27) is for actual citizenship, as well as for being a subject of a state. For individuals holding a particular position, the area to which the position applies can be filled using applies to jurisdiction (P1001) on the item for the position. --Yair rand (talk) 14:52, 7 February 2018 (UTC)

As I said over at AWNB, I think that country of citizenship (P27) is not appropriate in these cases. "Citizenship" is not a synonym of "nationality", and applying it to anything pre-20th century is problematic on data quality grounds, as the concept simply did not exist then as it exists today. residence (P551) is a better fit, but still not perfect. I like User:Jheald's idea of a "nationality" property; it would still be problematic for pre-modern eras, but it would cover a lot of ground that is not well covered presently. Lankiveil (talk) 23:48, 7 February 2018 (UTC).

I am the person who raised the conversation at AWNB. My concern is that if Wikidata is to be the global go-to place for machine-readable open source data, then we need to be diligent in not entering or allowing to be kept any data which is plainly wrong when read back. It might "feel good" to say that some arbitrary 19th century politician (George Ash (Q21176666) was the one I noticed first) in a Provincial parliament was "an Australian". But to code that up as country of citizenship (P27):Australia (Q408) and read it back as "George Ash was a citizen of the Commonwealth of Australia" is false and does not lead to having confidence in other material sourced from Wikidata. He was clearly a British Subject, so the start of one potential solution is to remove P27:Q408 from anyone who died before 1901 and replace it with whatever property is allowed to have the value British subject (Q4971466). It is possible that he was a citizen of the United Kingdom of Great Britain and Ireland, but that might not apply to his contemporaries who were born outside of the British Isles. Further complexity arises in border cases where people lived across times when states changed, but this is a relatively clear-cut case to work out from (noting the discussions above about Wales and historic predecessors of Germany and Italy). --ScottDavis (talk) 00:16, 8 February 2018 (UTC)

Citizenship is hugely problematic, particularly when it feeds nationalistic sentiments by arbitrarily making people citizens of modern countries. It is like saying that Julius Caesar was an Italian; he was not. The other problem is that people who were citizens of countries that no longer exist are given a personal relation with borders that were not even drawn at the time. Citizens have a relation with a nation that existed at that time, pure and simple. Thanks, GerardM (talk) 11:25, 8 February 2018 (UTC)

I think Wikidata has the potential to be a fantastic resource, but only if the data can be reliable. Wikipedia regularly gets criticised as unreliable as anybody can change it. Wikidata will face the same issues if it becomes well known and popular. This is one of probably many issues in data quality that are going to need to be resolved and monitored. I am generally a Wikipedian, not a Wikidatian. I need to learn the conventions and rules here before I go through and delete a batch of statements that appear to be false or nonsensical. Right now, there are six "citizens of Australia" who died before the first permanent white settlement, and they are all European. (Janszoon, Furneaux, de Vlamingh, van Diemen, Hartog, Carstenszoon). These are either clearly wrong, or "citizen" has a very specialised meaning on Wikidata that is not the same as the legal definition or common usage. --ScottDavis (talk) 11:58, 8 February 2018 (UTC)

This is also of relevance in South East Asian countries that have been colonial entities prior to current status - there has been laxness on the part of category creators and talk page project taggers (myself included) to allow misnomers to arise such as 'Indonesian' items against things that were really related to the Netherlands East Indies, or even earlier states and legal governing entities... JarrahTree (talk) 12:24, 8 February 2018 (UTC)

Wittylama, they were citizens of United Kingdom of Great Britain and Ireland (Q174193) prior to 1901, and one can argue that after 1901 that they were British subject (Q4971466) though that is not a country per se, so it may be more accurate to say Australia (Q408) and qualify it as stated as "British subject" as citizenship is a more mobile concept for nations of the broader Commonwealth. Looking at someone like Deakin who was claiming being a native(-born) Australian, and easily into the UK pre-federation as a citizen and based on his heritage.

Citizenship is just awfully messy and that is reflected here, especially with how a country is identified at a wiki, and many citizenships were imported from Italian WP (so processing issues based on modern Britain without nuance of the political history); then add political complexities of dual citizenship, inherited though not claimed citizenship, statuses of permanent residency, ... One wonders whether we should even track it as it is quite nefarious at times. — billinghurst sDrewth 15:01, 8 February 2018 (UTC)

"Citizen of" "British subject" doesn't have quite the right semantics either. That might be an appropriate use for the "Nationality" property that the Wales discussion above says has been been proposed and rejected. Is there any other property that could appropriately have "British subject" as a value? Looking a little wider, I'd expect that subjects of the Kingdom of Prussia (Q27306) or German Confederation (Q151624) should be treated similarly. As my interest is South Australia, this overlaps as around 10% of the early (mid-19th century) immigrants were "Germans".

My direct questions for Wikidata experts now are:

What is the right property to attach British subject (Q4971466) to a person?
How do I propose a broad removal of Citizen of Australia from pre-1901 (or even pre-1948) people? (I would expect most of them to be tagged British Subject instead, and perhaps the specific province/colony)
Is it too soon to initiate another proposal for a Nationality (or similar) property, which has apparently been proposed and rejected before?--ScottDavis (talk) 21:55, 8 February 2018 (UTC)

I think regardless of whatever else, we need to bot-remove the "citizenship:Australia" combo from anyone who died before 1901 (and I think before 1948, although I expect that to be a little more controversial), since that is flat out incorrect. If "nationality" is a non-flier, how about "resident in" as a property concept? It would do essentially what we want it to do and is also valid for historical states that existed before citizenship adopted its current form. Lankiveil (talk) 23:18, 8 February 2018 (UTC).

I agree with Lankiveil. Which leads to an additional question:

How do we initiate a bot action in Wikidata?

As someone presumably added the erroneous property in good faith (they appear to have been made by FischBot in May 2013), I'd like to replace it with something more accurate if possible. In an attempt to answer my first question, I found social classification (P3716). That requires the object to be an instance of social class (Q187588), which British Subject is presently not. I'm not strong enough on linguistics or anthropology to determine if it should be. --ScottDavis (talk) 03:19, 9 February 2018 (UTC)

I have manually removed the property from the six who only visited the island before any white settlement at all, and have attempted to contact Pyfisch, owner of Fischbot, but neither of them have been active in over a year. --ScottDavis (talk) 03:53, 9 February 2018 (UTC)

@Wittylama, ScottDavis, Lankiveil, JarrahTree, billinghurst, GerardM: Hi all. Some quick thoughts on this -

This is part of a really big, really messy, problem (cf Goethe...) - at the moment P27 is used in two subtly different ways and they don't make consistent sense. Following the last debate, I've been spending the last couple of weeks trying to put together some notes on how we currently interpret it in order to frame a discussion that can establish some universal guidelines - things like the old Nationality proposal ended up unsuccessful because people were trying to solve different problems at the same time and it got quite confused. I'm keen for us to get things sorted out but also don't want to rush into them and have the same problem again :-).
We should probably try and avoid focusing on citizenship in "country of citizenship" - "was associated with somewhere by a concept analogous to citizenship" might be better description, so it covers subject status, etc. To represent the fine details of the legal form of someone's citizenship, a qualifier on the P27 value might be the best approach; however, I don't know if we currently have such a qualifier. It may not be necessary in most cases; we don't generally model it on Wikidata at the moment.
For the time being, I would recommend using P27 and associating a person with the country (or colony) that they would generally have been considered to be "from", in a vaguely defined sense, at the time - so post-1901 this would be Australia, pre-1901 it would be Victoria, South Australia, etc. This still has some conceptual flaws but has the major advantage of a) being consistent with (some) existing data uses, b) not being obviously wrong :-), and c) keeping the data in an accessible form pending the outcome of future discussions (it would be hard to disentangle the Australians at a later date if we made everyone P27:UK). In 1901-1948 I think "Australia" makes sense; there seems to be a clear sense that Australia was a defined place, and people were from there.

Does this sound reasonable? If people are happy with it, I can start setting up some scripts this weekend to migrate people where the data is available, and we can work on the unknown cases (there will probably be a lot of pre-1901 people we'll have to check and correct by hand). Andrew Gray (talk) 13:07, 9 February 2018 (UTC)

Quick followup on this - this query gives everyone identifiable as "probably South Australian", ie is in a Wikipedia category that appears to show them as from SA, and who Wikidata records as Australian and born before 1901. A script could add "P27:South Australia" to all of these, remove P27:Australia for anyone who died before 1901, or add appropriate start/end dates if they were alive in both periods. It would take about an afternoon for me to set this up for all the pre-Federation people. Andrew Gray (talk) 13:25, 9 February 2018 (UTC)
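The per-person decision described above can be sketched in Python as follows. This is only an illustration of the stated rule, not the script that was eventually run: the `plan_p27_edits` function and its tuple output format are hypothetical, and people with no recorded death date are left with their existing Australia statement as a cautious assumption.

```python
FEDERATION_YEAR = 1901


def plan_p27_edits(death_year, colony):
    """Plan P27 changes for someone born before 1901 and matched to a colony.

    death_year: year of death, or None if no death date is recorded.
    Returns a list of (action, value, qualifiers) tuples.
    """
    if death_year is None:
        # Assumption: without a death date, only add the colony.
        return [("add", colony, {})]
    if death_year < FEDERATION_YEAR:
        # Died before Federation: the colony replaces Australia outright.
        return [("add", colony, {}), ("remove", "Australia", {})]
    # Alive in both periods: keep both values, bounded by dates.
    return [
        ("add", colony, {"end": FEDERATION_YEAR}),
        ("qualify", "Australia", {"start": FEDERATION_YEAR}),
    ]
```

For example, `plan_p27_edits(1890, "South Australia")` yields an addition of the colony and a removal of Australia, while a death year of 1930 yields dated statements for both.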

Support: I haven't checked your query for completeness (I suspect membership of category:Colony of South Australia people is not well applied), but the text description sounds pretty much like what I was looking for when I found the problem. I agree that making colonial-born 19th century people P27:UK would be at best misleading, and quite possibly wrong. The text of a naturalisation certificate from that period appears ambiguous as to whether its effect extends beyond the borders of the colony it was issued by anyway (there's a reproduction of one in one of my family history books). It appears to grant the rights and privileges of a British Subject to the person in South Australia. The category query misses the mark slightly, because this is not what it was designed for. My test example false positive is Henry Young (Q1607441) who was governor but probably did not consider himself to be South Australian and is currently tagged as a citizen of the UK which is probably right. He got in the list because Governors of the colony is a subcat of colony people. en:Category:Members of the Parliament of South Australia is not broken into pre- and post-federation, so not included as a subcat of Colony people. Thank you. --ScottDavis (talk) 14:34, 9 February 2018 (UTC)

Excellent, and well-spotted - I'll filter out governors (and possibly other colonial-office-holders as appropriate). At the moment the search basically finds anyone associated with SA (eg "people from Adelaide") and then takes out anyone born after 1900. I should reiterate that this may not be the approach we use long term, but it would certainly avoid the anachronism problem, and help keep things coherent until such time as we get consensus on the overall best approach. Andrew Gray (talk) 16:14, 9 February 2018 (UTC)

I really think this is a two pronged problem. The first (#1) is that we have data that is not correct in there; and I think we all agree that needs to be removed. The second (#2) is that we don't have a way of easily identifying "people associated with colonial South Australia" in our current data model. I don't think that we ought to put off fixing issue #1 until we come up with a solution for #2, if only because we may never come up with a purely acceptable way of dealing with #2. If necessary, we can keep any information we scrub on a project space page somewhere until we decide how we want to represent it. Lankiveil (talk) 00:39, 10 February 2018 (UTC).

@Lankiveil: I think you are probably right: we should remove all statements that are provably wrong (person citizen of place x where the lifetime of the person does not overlap the existence of the place). Whether it is better to completely lose that relation or replace it with another one that would be "less wrong" but may not be true in all cases is something for people more familiar with Wikidata principles than I am. As I think about it, I realise I have noticed that a lot of statements now have a reference whereas when these statements were made, they didn't, so it might be better to just drop the false data and allow true data to be added when it is available with a suitable reference. @ Andrew Gray: What do you think? --ScottDavis (talk) 10:29, 10 February 2018 (UTC)
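The "provably wrong" test mentioned above, a citizenship statement where the person's lifetime does not overlap the existence of the place, amounts to an interval-overlap check. A minimal Python sketch, with a hypothetical function name and years as plain integers (negative for BC):

```python
def lifetime_overlaps(birth_year, death_year, inception_year, dissolution_year=None):
    """Return True if the person was alive at some point while the place existed.

    dissolution_year is None for a place that still exists.
    """
    place_end = float("inf") if dissolution_year is None else dissolution_year
    # Two closed intervals overlap when each starts before the other ends.
    return birth_year <= place_end and death_year >= inception_year
```

With 1901 as the inception year, a lifetime of 1580 to 1621 returns False, so a citizenship statement for that person would be flagged, while a lifetime of 1856 to 1919 returns True and would need a closer look instead.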

@Jarekt: it seems that proposal has been discussed and closed. As I look at it, I can see why, and think I might have voted against it anyway. There appear to only be 20 "nationalities" (sorry - I haven't learned to embed queries neatly yet - the link is a list of instances of nationality) in Wikidata at the moment, and they look more like Ethnic Groups to me anyway. --ScottDavis (talk) 10:29, 10 February 2018 (UTC)

@ Andrew Gray: The more I look at specific examples, I think there are 917 erroneous attributions of Australian citizenship to 19th century people [2] These people have varying claims on being considered citizens of one or more of the colonies, but very few have references. Are you able to remove the citizenship property en masse from the ones with no reference for it please? We can then review a much smaller set with a reference to work out what to do with them. It will later be an exercise to assign referenced properties as appropriate for citizenship, residence, nationality or anything else. Thanks. --ScottDavis (talk) 12:43, 10 February 2018 (UTC)

@ScottDavis: - 850+ have no source on the "Australia" property (and a bit of sampling suggests the others are mostly just "Imported from..."). I can certainly take these out. My concern is that if we remove the data without replacing it, they'll be near impossible to find again when we go to put the "correct" data in... Andrew Gray (talk) 18:23, 10 February 2018 (UTC)

@ Andrew Gray: I also started by thinking that the erroneous attribute should be replaced by something "right" straight away. I don't think there is a simple answer that is "right" for them all that would not also apply to a bunch of other people who do not currently have a citizenship property (e.g. 16 members of the South Australian House of Assembly died before 1901 and don't have any citizenship). Therefore, when we work out what is "right", any bulk actions will pick up people from this set too. An example I found last night included Aboriginal people who were notable for leading the resistance against white colonisation. Pretty insulting to tag them with any kind of Australian citizenship, even of the colony that they were rejecting. --ScottDavis (talk) 22:05, 10 February 2018 (UTC)

@ScottDavis: Hmm, here's a solution - I'll remove all the unsourced ones, and drop the list into a userspace page somewhere. This means we can go back and look at them again later, if needed. Hopefully very few will need manual curation as almost all have enwiki articles we can infer nationality from. I will leave the 40-odd with a sourcing statement in place, pending someone taking a look at them. Andrew Gray (talk) 11:47, 11 February 2018 (UTC)

@ScottDavis: I've done a little more prep work on this (haven't pressed the button yet, though).

904 people are currently marked as Australian, died before 1901 - this list should remain even if we remove them.
373 can be matched to one of the six colonies using English Wikipedia categories, so we could reasonably import a "colonial" P27 value for them (list of individuals). (About a dozen match two colonies, interestingly...)
531 can't be matched (list). Of those, the vast majority have no other nationality given (perhaps 10-25% do, though I'm finding it tricky to get an exact number - the query tools aren't lining up quite right)

At this point, I can see a few ways to proceed -

1. Remove Australia for all these people, put off doing anything else pending a better solution
2. Remove Australia for all these people, add the specific colony where known (sourced as imported from English WP)
3. Remove Australia for the ones with a known colony, add that, leave Australia for the rest
4a. Remove Australia for everyone, leave blank, and place of residence either Australia (Australian continent (Q3960), the continent!) or the specific colony
4b. Remove Australia for everyone, add colony if known, add UK and place of residence Australia for the others; run a separate check on Aboriginal people to avoid setting UK for them.

I am cautious about #1/#2 because, as GerardM notes, this is removing data that if not entirely right at least has some indicative value, and leaving nothing to replace it with. If we don't go with those, which of #3/4a/4b looks better? Andrew Gray (talk) 17:28, 11 February 2018 (UTC)

The problem with (potential) consensus on factual issues like this is that when people are attributed to countries that did not exist, we provide fake facts. This does not mean that we should bulk delete this information, it means that we should find a solution that provides factual information and convert the data. Thanks, GerardM (talk) 11:51, 11 February 2018 (UTC)

@GerardM: I agree, and don't think we are at cross-purposes. The data presently in Wikidata for these people is false, fake facts. It reduces the reliability and accuracy of Wikidata. The current statement needs to be removed/replaced/changed to something true. What Andrew Gray is proposing is the first step - to remove some things we have readily identified as wrong. Whatever statements are added to replace these are probably also applicable to other people who presently don't have a comparable statement in Wikidata. For example, it would be valid to make all pre-1900 members of the parliament of South Australia (and in fact all voters) "Citizens of South Australia", possibly with timing qualifiers for their migration (for British Subjects) or naturalisation (others). That step can be done later (hours/days/months) and doesn't depend on being done only to the ones who had a wrong statement removed first. The six statements I manually removed all had another citizenship which was probably more accurate than claiming they were Australians (Europeans who visited the shores of the land before there was any white settlement). --ScottDavis (talk) 12:50, 11 February 2018 (UTC)

When thinking about the sort of property we might want, and what groups we think it should be able to identify in statement-values, I think it's useful to look over to some of the groups that other institutions identify people to, who we might want to source statements from.

In the "Wales" thread last month I posted a list of nationalities/regional or cultural groups that the British Museum identifies people and object-creators to, via a property called PX_nationality: list.

To complement that, from the Library of Congress, here are its controlled vocabularies for national/regional ("nat") identities: list, and ethnic/cultural ("eth") identities: list, which are two sectors of its Library of Congress Demographic Group Terms thesaurus (list).

To represent sources faithfully, it seems to me we ought to be able to construct statements linking people to any of these groups. The national/regional list clearly goes beyond just country of citizenship, and we need to address that. The fact that the LoC has a different list for ethnic/cultural identities, I think indicates that ethnic group (P172) cannot suffice to make up the gap. Jheald (talk) 14:32, 11 February 2018 (UTC)

You could as well have a peek at this rather elaborate concept my ancestors developed ;) LoC or BM are just two of myriads of such, usually very subjective, systems for sorting humans according to anybody's ideology. Depending on the granularity of the definition of regions/people/folks/ethnicities you will get myriads of competing systems. And sometimes quite loaded systems, if you look for example at Macedonia (Q103251) and the noise that's being made about it in North Macedonia (Q221) and Macedonia (Q81734). Or, for the usual anglocentrics here: have a peek at Ireland, and define it from all the different perspectives of the Troubles. Grüße vom Sänger ♫ ^(talk) 16:23, 11 February 2018 (UTC)

I am finding so many special corner cases and examples that I think the best thing to do is remove the wrong data (citizen of Australia before 1901) and participate in a wider conversation about how to represent citizenship/nationhood and what it means. People at other stages of their lives could be equally problematic. Probing for examples I found one tagged as citizen of Poland and Australia. He was in his forties when he migrated from Prussia (having served in the Prussian army) to New South Wales in 1839, spent time in Victoria, Tasmania (the island) and NSW again before going to England in 1843 and becoming a British subject in 1845. From the en Wikipedia article, he should be ethnic group Polish, and citizen of Prussia and UK. The Polish or German wikipedias might have a different stance. I think doing option 1 and making a list with the information for option 2 to assist humans to review and add information where appropriate might be best. Since you asked for a vote of the other choices only, I'd pick 4a - place of residence only as least likely to introduce a new set of errors that are hard to detect. --ScottDavis (talk) 00:31, 12 February 2018 (UTC)

Hall of Fame

Is it "member_of=Important Hall of Fame" or "award=Important Hall of Fame"? There seems to be a mix of each. Which should we standardize on? Member could be the president of the Hall of Fame or the person awarded an honor. An award is something tangible that you can actually hold. --RAN (talk) 04:00, 11 February 2018 (UTC)

This is a good question as sometimes someone has (e.g.) an exhibit in a Hall of Fame and is thus awarded somehow but isn't a proper member. —Justin (koavf)❤T☮C☺M☯ 08:47, 11 February 2018 (UTC)

Taking Rock and Roll Hall of Fame (Q179191) as a terrible example: it's combining a building, a prize and a museum in one item. We should probably have items like "entry in the Rock and Roll Hall of Fame" (just like star on Hollywood Walk of Fame (Q17985761)). Sjoerd de Bruin (talk) 11:53, 11 February 2018 (UTC)

"Membership of the Foo Hall of Fame" is an award, just as "Membership of the Royal Society" is. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:02, 11 February 2018 (UTC)

I am not sure we have the correct wording yet. I am a member of the Aviation Hall of Fame, I pay dues that go towards the annual awards dinner and the upkeep of the museum, and membership allows me to nominate someone for the award. We need better wording for the honoree to standardize on. "Membership of the Royal Society" is not a good example because gaining membership is the honor bestowed. Please add other possible wording below: --RAN (talk) 02:28, 12 February 2018 (UTC)

entry in the Rock and Roll Hall of Fame
honoree in the Rock and Roll Hall of Fame
inductee into the Rock and Roll Hall of Fame: looking at their website, this is the wording that they use. If I create this, would it be "award=inductee in the Rock and Roll Hall of Fame"?

Auto-populating v:Template:Article_info

Hello all,

Having seen the successful implementation of Template:Wikidata Infobox over on Wikimedia Commons, I was wondering whether it would be possible to do something similar for the Template:Article_info that is used in WikiJournals (e.g. WikiJMed, Q24657325). Currently, the various fields for each published article are entered manually (date of publication, DOI, etc). I'm keen on trying to increase automation where possible, and Wikidata might be able to help with that, particularly if data can be moved from CrossRef→Wikidata. Any thoughts? Evolution and evolvability (talk) 10:21, 11 February 2018 (UTC)

On frwiki we have Template:Cite Q (Q22321052), which populates the journal and article template parameters and so on from a Wikidata item about a book, article and so on. You may also be interested in WikiProject Source MetaData. author TomT0m / talk page 10:42, 11 February 2018 (UTC)

@Mike Peel:. Also, see en:Template:Cite Q. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:15, 11 February 2018 (UTC)

Templates used for substing

Tried to figure out why mw:ContentTranslation fails in some cases, and found that some templates meant for substing are connected to the same item. (I assume CT uses Wikidata for this, but I can't find a page about it.) Thus the real template that usually has mw:TemplateData is confused with the template used for substing, which usually has no TemplateData. That makes the translation process fail.

An example from Norwegian Bokmål is w:no:Template:Cite web which is a "shim" for w:no:Template:Kilde www. At Norwegian Nynorsk both the first one w:nn:Template:Cite web and the second one w:nn:Template:Kjelde www are real templates. Both w:no:Template:Kilde www and w:nn:Template:Kjelde www are connected to the English w:en:Template:Cite web.

During translation an editor would expect CT to use w:no:Template:Kilde www, but if the source uses w:nn:Template:Cite web it is translated into w:no:Template:Cite web and then fails.

Can we create an alternate item for these "shim" templates? It would then be possible to stop the attempts to use them during translation. I.e. the Norwegian Bokmål w:no:Template:Cite web would live in an item "Template:Cite web – Shim for Template:Kilde www", and neither Norwegian Nynorsk w:nn:Template:Cite web nor English w:en:Template:Cite web would be connected to this item. That means we would have two items, one for the real template and one for the shim. To me this seems rather obvious, one is the real thing while the other is a proxy.

That leaves us with the problem of how to tell CT it can translate w:no:Template:Cite web into w:nn:Template:Cite web, as they will now link to two different items, but that can be done by local iw-linking from w:no:Template:Cite web to w:nn:Template:Cite web, and is thus reduced to a local problem.

Note that both w:no:Template:Cite web and w:nn:Template:Cite web lack TemplateData, but they use the same parameter names so CT should be able to figure this out in this specific case.

Note also that they are connected to different items right now, but for a completely different reason. Merely using similar names is not enough for connecting them to the same item. The actual behavior should be similar. ;)

Is there some alternative way to do this? Jeblad (talk) 21:31, 11 February 2018 (UTC)

What if official website (P856) is dead?

Do we already have a consensus how to model the situation if official website (P856) is dead? Still adding/leaving it and adding a qualifier archive URL (P1065)? Setting the rank to deprecated? -- MichaelSchoenitzer (talk) 20:38, 4 February 2018 (UTC)

I have deprecated in the past. It still was the official web site. Sometimes I have seen two (e.g. from a musician and also a page from their record label) and if one is live, continuing using that one and deprecating or removing the other as necessary (in this case, if the record label page is down, no big deal--just remove it. If the official site with an actual domain is down, then deprecate.) —Justin (koavf)❤T☮C☺M☯ 20:53, 4 February 2018 (UTC)

If you can identify an end-time add end time (P582), and if you can identify a new site, perhaps set that to preferred rank, and keep the old one at normal rank.

If you do decide to deprecate the site, it may be useful to add a reason for deprecated rank (P2241) to indicate why.

It would be good to also come up with an agreed way to indicate a dead link -- perhaps sourcing circumstances (P1480) = link rot (Q1193907) -- which in German just has the label "Toter link" (Dead link), and which I see is already included in the controlled vocabulary for P2241. – The preceding unsigned comment was added by Jheald (talk • contribs) at 21:15, 4 February 2018‎ (UTC).

I have been using reason for deprecated rank (P2241) = link rot (Q1193907) as a qualifier but I am happy to change to anything more established. − Pintoch (talk) 23:27, 4 February 2018 (UTC)

And I've been using end time, with an end cause (P1534) qualifier. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:48, 4 February 2018 (UTC)

I would not deprecate them per Jheald. --Infovarius (talk) 14:04, 5 February 2018 (UTC)

Deprecated means it was never true. Use normal rank with appropriate qualifiers for historical websites. --Yair rand (talk) 20:23, 6 February 2018 (UTC)

I have a very similar case, but with external IDs. For example Lucherberger See (Q17280826) lost its status as a protected area sometime around 2010, and thus is not listed anymore in the catalogs referenced by the external IDs. Should I set a novalue at preferred rank, and set the deprecated one at normal rank? Do we have a deprecation reason like "withdrawn identifier" to make clear why the ID is no longer valid? Ahoerstemeier (talk) 14:23, 6 February 2018 (UTC)

@Ahoerstemeier: Just put a qualifier « end date » to the statement that is not true anymore. I’m not sure the « no value » preferred statement is needed, although it’s not incorrect and it may make querying easy as if done in every case we won’t have to check for an end date to exclude no longer valid statements when we want only values that are true now. author TomT0m / talk page 20:57, 6 February 2018 (UTC)
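As an illustration of the querying point above, here is a rough sketch of how a consumer could pick out the statements that are still valid when historical values are kept at normal rank with an end time (P582) qualifier. The statement layout is a deliberately simplified stand-in invented for this example, not the real Wikibase JSON structure, and the URLs are placeholders.

```python
def current_values(statements):
    """Return values that are neither deprecated nor closed by an end time (P582)."""
    return [
        s["value"]
        for s in statements
        if s.get("rank") != "deprecated" and "P582" not in s.get("qualifiers", {})
    ]


# Simplified, hypothetical statements for official website (P856).
websites = [
    {"value": "http://old.example.org/", "rank": "normal",
     "qualifiers": {"P582": "2015-01-01"}},   # historical site, kept with an end time
    {"value": "https://new.example.org/", "rank": "preferred", "qualifiers": {}},
]
print(current_values(websites))
```

With this modelling every consumer has to remember the end-time check, which is the trade-off against adding a preferred-rank statement as mentioned above.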

And what about official website (P856) for pre-Internet people?

I've seen official website (P856) being added to people who have been dead for quite some time or who have lived in a time pre-Internet. I have the impression those links aren't really official websites, although some might be related to an "official" institution dedicated to that person or managed by the dead person's heirs, whereas some might just be commercial sites without any official representation whatsoever. Here some examples added by @Avatar6: for Fyodor Dostoyevsky, for H. P. Lovecraft and for Arthur Rimbaud. Andreasm ^{háblame / just talk to me} 22:30, 7 February 2018 (UTC)

I sometimes see that done with websites that represent the ("official") estate of the deceased - https://www.roalddahl.com/ for example: "The Roald Dahl Story Company Ltd manages the copyrights and trademarks of author Roald Dahl". I suppose we could create an item for the estate, but they don't seem to do much harm otherwise. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:22, 7 February 2018 (UTC)

Yes, some cases appeared to be that; however, others are less clear. For instance, Fyodor Dostoyevsky (Q991) and Arthur Rimbaud (Q493) have been dead for such a long time that all their works are in the public domain, so no copyrights to manage. I wonder what kind of "official sites" are those, maybe a national institution? Anyway, should we consider such sites as official ones? Andreasm ^{háblame / just talk to me} 07:19, 12 February 2018 (UTC)

Wikidata weekly summary #298

Here's your quick overview of what has been happening around Wikidata over the last week.

Discussions
- AICAT grants proposal
- ScienceSource grants renewal
- Open request for adminship: Kostas20142

Events/Press/Blogs
- From the life of Wikidata: with the Wikidata Concepts Monitor we can now begin to discover how our communities use knowledge across the Wikimedia projects, by Goran S. Milovanović
- See also: WDCM Journal, several examples of the use of Wikidata on the Wikimedia projects
- What GLAM can teach us about multimedia metadata on Wikimedia Commons, by Jonathan Morgan and Sandra Fauconnier
- Wikidata and the German handball player nicknames by k-nut

Other Noteworthy Stuff
- We are saddened to report that Polish Wikimedian Krzysztof Machocki (who was also active on Wikidata) died on 31 January 2018, aged 36, after a couple of weeks of illness. Our condolences to his family and friends.
- Notes of the IRC office hour of January 30th
- The call for submissions for Wikimania (Cape Town, July 2018) is now open. Deadline is March 18th. Ideas of submissions related to Wikidata can be discussed here
- Based on community discussions, the ArticlePlaceholder will soon be deployed on Urdu and Estonian Wikipedias.
- Statistics
  - January 2018 brought us 9,770,248 edits, 445,027 new items were created.
  - The number of users that edited Wikidata per day grew in 2017 from 2439 to 2672 users, 9.6% more compared to 2016. The number of edits by them grew by 18% to 190k edits per day. We also get edited by 542 IP addresses per day, 50% more than in 2016.
  - In 2017, Wikidata got edited by 46 various bots per day, executing 334k edits per day (63% more than in 2016). The most active bot in 2017 was Emijrpbot, who added 18 million edits to Wikidata.
  - 284 million statements now contain references, compared to 67 million at the start of 2017. The average number of statements per item grew from 5 to almost 9. 73 million qualifiers are now used to provide more details for statements, 13 million in early 2017.
- New tool based on Wikidata: Random TV episodes

Did you know?
- Newest properties:
  - General datatypes: uses data storage type, commanded by, dam
  - External identifiers: Who's Who UK ID, Basketball-Reference.com euro player ID
- Query examples:
- Newest gadgets and scripts: a script for semi-automated import of information from Commons categories is waiting for feedback

Development
- Diffs now show the entity ID in the page title (phab:T181077)
- Improved handling of translations in the Query Service UI (gerrit:406301, gerrit:406996), thanks to Li Song
- Continued working on diffs for forms on Lexemes (eg. phab:T186317)
- Added summaries for edits on representations or grammatical features of a form (phab:T184702)
- Worked on showing links to Lexemes and statements (phab:T185332)
- Rolling out fine grain usage tracking on more wikis, so only relevant changes are shown in the watchlist and recent changes (phab:T185032)
- Improved scalability of fine grain usage tracking (phab:T185693)

You can see all open tickets related to Wikidata here.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:14, 5 February 2018 (UTC)

Looking at the statistics in the above summary, I was wondering are there any numbers:

for the number of edits, excluding edits induced by changes on client wikis (ie wiki page moves or page merges)
for the average number of statements per item, excluding scientific papers? Or, more generally, broken out by broad category of item?

Thanks, Jheald (talk) 15:02, 6 February 2018 (UTC)

@Addshore: can you provide these? Sjoerd de Bruin (talk) 11:18, 11 February 2018 (UTC)

Hello, we're currently investigating to try to provide such statistics. Lea Lacroix (WMDE) (talk) 09:28, 12 February 2018 (UTC)

Wikidata weekly summary #299

Here's your quick overview of what has been happening around Wikidata over the last week.

Events/Press/Blogs
- Upcoming: Monthly Wikidata workshop in Paris, February 16th
- Upcoming: Wikimedia Community User Group Brasil promotes the 4th Wikidata Lab: How to add a lot of data, 22nd February 2018
- Upcoming: #datatónCervantes, Wikidata workshop in Madrid, February 24th
- Upcoming: Wikidata workshop in Lausanne, February 24th
- Discovering Types for Entity Disambiguation, on OpenAI blog
- The File (Dis)connect, by Magnus Manske
- WDCM Journal: What is Love (Q316)? Accessing Wikidata P279 and P31 paths from WDCM by Goran Milovanović

Other Noteworthy Stuff
- IRC office hour for Structured Data on Commons on Tuesday, 13 February from 18:00-19:00 UTC. More information available on Meta.
- mySociety has a vacancy for a Wikidata-experienced Community Manager for their Democratic Commons project
- The next Weekly Summary (February 19th) will be the 300th edition of the newsletter! To help make it special, you can share your favorite Wikidata tool, so the other readers discover nice tools

Did you know?
- Newest properties:
  - General datatypes: identifiers.org prefix, season starts, make-up artist, sets environment variable, reads environment variable, Technical Element Score, deductions (in figure skating), Program Component Score
  - External identifiers: Basketball-Reference.com referee ID, Basketball-Reference.com NBL player ID, member of the Assembly of Madrid ID, BTO Birds of Britain ID, Rugby Australia ID, EUAP ID, LoC and MARC vocabularies ID, BVPB authority ID, Amtrak station code, Compagnon de la Libération ID, Gaming-History identifier, Fauna Europaea New ID, Royal Academy new identifier, BWSA ID, Statistical Service of Cyprus Geocode, PARES ID, Inventories of American Painting and Sculpture control number, Lemon 64 ID, Panoptikum identifier, Swedish portrait archive, TORA ID, Cour des comptes magistrate ID, La Poste personality ID, American National Biography ID, org-id.guide ID, Swimrankings meet ID, JORFsearch person ID, Swiss Enterprise Identification Number, Landslagsdatabasen ID, Bandysidan player ID, World Sailing regatta ID, Sailboatdata ID, Deutsche Synchronkartei series ID
- Query examples:

Development
- Make grammatical forms persistent (phab:T173742)
- Improve the edit summary of Forms (phab:T184702)
- Handle adding and/or removing forms in lexeme diffs (phab:T186317)
- Improve formatting of the Lexemes (phab:T185332)
- Enable Lua fine grained usage tracking on more wikis (phab:T186645)

You can see all open tickets related to Wikidata here.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:39, 12 February 2018 (UTC)

Something went wrong?

Something went wrong or I'm not getting something. Q48107582 and Q48107583 - both have the same articles. --Edgars2007 (talk) 10:07, 11 February 2018 (UTC)

That sometimes happens. I merged them. :) Sjoerd de Bruin (talk) 10:59, 11 February 2018 (UTC)

Maybe it's time to re-run the report to find them.
--- Jura 18:19, 11 February 2018 (UTC)

This section was archived on a request by: Liuxinyu970226 (talk) 14:58, 18 February 2018 (UTC)

This is a repeated item, but I don't know of which

Greetings. By chance I stumbled onto sculpture (Q3873168) and saw it has only 2 wikipedia links. The automatic Google translation of eo.wiki tells me it's about a sculpture (and not the general topic of the art of sculpture). However, a quick search led me to statue (Q179700) and sculpture (Q860861). It's obvious that Q3873168 should be merged, but I don't know which of the two is more appropriate. Does anyone here know which is better? --Andycyca (talk) 04:06, 12 February 2018 (UTC)

statue (Q179700) is a subclass of sculpture (Q860861), and both w:eo:Skulptaĵo and w:hr:Skulptura seems to be describing general sculpture, so I think sculpture (Q3873168) can be merged with sculpture (Q860861). --Okkn (talk) 06:15, 12 February 2018 (UTC)

Done per okkn's statement above I decided to merge both. --Liuxinyu970226 (talk) 11:55, 13 February 2018 (UTC)

This section was archived on a request by: Liuxinyu970226 (talk) 14:57, 18 February 2018 (UTC)

How to get the cutoff timestamp for a given Wikidata JSON dump?

Since I use Wikidata enriched with other data sources, I must ingest the entire Wikidata JSON dump into a dev graph database of mine. That's easy (yet time-consuming), but once that's done, I want to keep my copy updated by querying the RecentChanges and LogEvents API endpoints to retrieve the changes/deletes/creates that occurred between two timestamps (I'd do so every few minutes) - that's relatively easy too!

How do I get the cutoff timestamp for a given JSON dump? Where is this available, or how can I figure it out, given that the lastmodified timestamps aren't present in JSON dumps? – The preceding unsigned comment was added by Lazharichir (talk • contribs) at 15:32, 12 February 2018 (UTC).

@Lazharichir: I don’t think this info is directly available in the dump (unless you parse all the items and take the latest modified in the whole dump), but perhaps you can use the file modification time on dumps.wikimedia.org? Though I have to admit that I’m not sure what exactly that time means: is it the time when the dump started or when it ended? (For the RDF dump, it seems to be the time when it started – mw:Wikibase/Indexing/RDF Dump Format#Header says that it is guaranteed that no data in the dump is older than this date.) --Lucas Werkmeister (WMDE) (talk) 11:15, 13 February 2018 (UTC)

@Lucas Werkmeister (WMDE): the Wikidata JSON dumps do not have the modified timestamp property, nor do they have the lastrevid property. Therefore, it's impossible to extract this information directly from the JSON dumps. I was thinking of taking the highest Qid, which would allow me to then manually find the timestamp of the last added Wikidata item. However, I have no idea whether or not other items got updated after that last item (in the dump) was added. --Lazharichir 13:17, 13 February 2018 (UTC)

@Lazharichir: Oh, I didn’t know that the JSON dumps are reduced like that… in that case I guess the modification time on dumps.wikimedia.org is the only information available, at least as far as I’m aware. --Lucas Werkmeister (WMDE) (talk) 16:50, 13 February 2018 (UTC)
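Given that uncertainty, one cautious approach is to start polling RecentChanges from some time before the file's modification time. A minimal sketch follows, assuming the modification time has been read as an HTTP Last-Modified header value; the header string in the usage example is made up, and the three-day margin is only a guess, since it is not established here how long a dump run takes or whether the timestamp marks its start or end. Replaying a few days of changes twice should be harmless as long as each update simply re-fetches the current entity.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime


def conservative_cutoff(last_modified: str, margin_days: int = 3) -> str:
    """Turn a Last-Modified header value into an ISO 8601 UTC start timestamp.

    The margin (an assumed value) is subtracted so that edits made while
    the dump was still being written are replayed rather than missed.
    """
    dumped_at = parsedate_to_datetime(last_modified).astimezone(timezone.utc)
    start = dumped_at - timedelta(days=margin_days)
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical header value, for illustration only.
print(conservative_cutoff("Mon, 12 Feb 2018 09:30:00 GMT"))
```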

Property deletion: views needed

In the absence of 'Wikiproject Properties', I'm posting here as there is a discussion at Wikidata:Properties for deletion#eFlora properties where additional views from people familiar with property creation/deletion deliberations are needed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:38, 12 February 2018 (UTC)

Maybe you should start to notify

the user who originally proposed the property
the property creator and
(only out of courtesy) ping the respective WikiProject.

--Succu (talk) 19:54, 12 February 2018 (UTC)

We are discussing two properties, not one. You proposed each of them, and as you notified the WikiProject within three minutes of my posting the deletion proposal, are clearly aware of it. HTH. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:17, 12 February 2018 (UTC)

How was this #distributed-game merge possible?

Tracked in Phabricator
Task T172406

I don't understand how this was possible if there were two different sitelinks, so it shouldn't have been even suggested in the first place. As a result, the wrong merge was only performed partially. In my previous experience, when such an event occurred (a conflict with sitelinks or statements), the merge is automatically stopped and not done even partially. Am I missing something? Andreasm ^{háblame / just talk to me} 02:08, 13 February 2018 (UTC)

I caused the same issue here with an explicit MERGE in QuickStatements. Pretty frightening. Can't recall if it used to work better. -- LaddΩ chat ;) 03:32, 13 February 2018 (UTC)

I thought Magnus fixed this. :( Sjoerd de Bruin (talk) 14:38, 13 February 2018 (UTC)

Change to external id for identifiers.org prefix (P4793)

identifiers.org prefix (P4793) was proposed and created as a string property, but it turns out it is (a) unique in identifying an identifier (generally in the biomedical fields) and (b) has a formatter URL, so it makes sense to convert to external ID format. See the discussion on Property talk:P4793 and a request to the developers here. ArthurPSmith (talk) 16:06, 13 February 2018 (UTC)

New tool: Wikidata Vandalism Dashboard

Hi everyone!

Patrolling edits on Wikidata can be tough: if you use your local wiki’s watchlist or recent changes (with show Wikidata edits enabled), you’ll only see changes that could affect that wiki, and there might be a lot of unrelated changes that aren’t relevant to you. If you use the watchlist or recent changes on Wikidata, you’ll see tons of changes in all kinds of languages that you don’t speak, or whose alphabets you can’t even read. And sometimes you would like to patrol edits for languages which you know, even though you’re not a part of the corresponding wiki’s community.

To help with this situation, Ladsgroup and I built a pet project: the Wikidata Vandalism Dashboard, in short WDVD. It shows you recent unpatrolled label, description, alias, and/or sitelink changes in a language of your choice, along with the user name and ORES score of the edit. For example, since I can read Portuguese, I can go through the Portuguese changes and check if they make sense or not, even though I’m not active on pt.wikipedia.org. And Ladsgroup checks the dashboard twice a day for the Persian language, which is apparently enough to find any problems: as soon as you mark an edit as patrolled, it vanishes from the list.

The source code is public, and you can report issues or add feature requests here.

We hope you find this useful! It could probably be very useful to Wikipedians as well, even if they’re not familiar with Wikidata, since the descriptions and labels can still affect them (for example, the Wikipedia app uses descriptions on the search page) and you don’t need to know a lot about Wikidata to patrol edits to these parts – so feel free to tell Wikipedians you know about this as well :)

Cheers! --Lucas Werkmeister (WMDE) (talk) 15:28, 29 January 2018 (UTC)

@Lucas Werkmeister (WMDE): This is amazing! One thing I can think of off the bat that would be immensely helpful is support for querying multiple languages at once (for example "bn,as,bpy,hi,mr,ne,mai,bho,bh"). Could be helpful for speakers of Arabic varieties ("ar,arz"), Chinese languages ("yue,zh,zh-cn..."), and Norwegian standards ("nb,nn,no"). Mahir256 (talk) 16:40, 29 January 2018 (UTC)

@Mahir256: Good point! I think this might be covered by issue #1, but feel free to open a separate issue too. --Lucas Werkmeister (WMDE) (talk) 17:22, 29 January 2018 (UTC)

@Mahir256:

Done :) (example) --Lucas Werkmeister (WMDE) (talk) 19:12, 29 January 2018 (UTC)

@Lucas Werkmeister (WMDE): I sort of noticed while using this tool that the tag "description ending with punctuation" does not include descriptions ending with । or ॥ (the danda (Q923024)), punctuation marks analogous to periods in various North Indian scripts. Any way to fix that? Mahir256 (talk) 16:54, 29 January 2018 (UTC)

@Mahir256: I have to admit, I have no idea how those tags are added… I can’t find them in the Wikibase code, at least. Perhaps an AbuseFilter? --Lucas Werkmeister (WMDE) (talk) 17:22, 29 January 2018 (UTC)

Yes, AbuseFilter. Tag to revisions are in change_tag --Wargo (talk) 21:00, 29 January 2018 (UTC)
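For reference, a script-independent check is possible by looking at the Unicode general category of the last character instead of listing punctuation marks by hand. The sketch below is only an illustration of that idea in Python and makes no claim about how the existing AbuseFilter rule is written. It treats category "Po" (other punctuation) as sentence punctuation, which covers the full stop as well as । and ॥, while leaving a closing parenthesis alone; note that "Po" also includes a few characters such as quotation marks and %, so it is slightly broader than sentence-ending punctuation.

```python
import unicodedata


def ends_with_punctuation(description: str) -> bool:
    """True if the last non-space character is in Unicode category Po."""
    text = description.rstrip()
    return bool(text) and unicodedata.category(text[-1]) == "Po"


for sample in ["city in India.", "भारत का एक शहर।", "city in Georgia (country)"]:
    print(repr(sample), ends_with_punctuation(sample))
```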

Dumb question. I removed vandalism by two editors on Rangers F.C. (Q19597) by restoring the last good version, but the bad changes are still on the vandalism dashboard. How do I fix that? - PKM (talk) 23:37, 29 January 2018 (UTC)

@PKM: You need to mark the edits as patrolled (see Wikidata:Patrol) – there should be a “mark as patrolled” link in the diff view (and, with the “mark as patrolled” gadget, also in History and Recent changes). If you use the “rollback” function, it automatically marks the rolled back edits as patrolled too. (However, you say that you reverted vandalism by two users, so in that case rollback wouldn’t have been the best solution, since it only rolls back one user’s changes.) —Galaktos (talk) 00:20, 30 January 2018 (UTC)

Notified participants of WikiProject Counter-Vandalism

I just realized I forgot to ping the relevant WikiProject :) --Lucas Werkmeister (WMDE) (talk) 17:42, 30 January 2018 (UTC)

Great tool, but I'm not sure if I understand one thing correctly. I have two edits: vandalism and then correction of vandalism. I have to mark both edits as patrolled in order to remove them from the list — but is marking vandalism a good action (e.g. does marking vandalism as patrolled tell ORES that this was in fact a good edit or something)? Wostr (talk) 00:06, 4 February 2018 (UTC)

@Wostr: sorry I didn’t see this earlier – Ladsgroup is the ORES expert, but as far as I understand, ORES is trained using Wikidata:Edit labels, not whether an edit was marked as patrolled or not. My understanding is that you can mark an edit as patrolled as soon as you’re confident that it no longer requires attention – whether the edit was benign, was already rolled back or reverted, or was fixed manually. Apart from that it seems like you already got some other answers further down :) --Lucas Werkmeister (WMDE) (talk) 18:25, 12 February 2018 (UTC)

@Lucas Werkmeister (WMDE): thank you :) Wostr (talk) 13:10, 13 February 2018 (UTC)

@Lucas Werkmeister (WMDE), Wostr: You should mark all good edits and "bad edits that have been taken care of" as patrolled. In fact, when you roll back an edit or series of edits (if you have the rights), MediaWiki automatically marks the reverted edits as patrolled as well. Amir (talk) 21:32, 13 February 2018 (UTC)

Help:Sources and birth/death, etc. registries

At Wikidata:Property proposal/civil registration district, with Samwilson, we are trying to figure out a good qualifier to use in references. inventory number (P217) and catalog code (P528) have been brought up, but neither seems ideal. The closest on Help:Sources might be section, verse, paragraph, or clause (P958).

Do we need a new one, or can one of the existing ones do?
--- Jura 14:16, 13 February 2018 (UTC)

(Thanks for bringing this to a wider audience @Jura1.) Surely there's some existing property that's appropriate for this? :-) It feels like a pretty generic idea: a document identifier, or accession number, or index number, or somesuch? (Hm, I just notice that inventory number (P217) has alias 'accession number', so maybe that is appropriate after all). There's UN document symbol (P3069) and Federal Register Document Number (P1544) which are pretty similar. Maybe a new 'document number' property is required. Sam Wilson 23:32, 13 February 2018 (UTC)

Most similar properties are for specific registries. article ID (P2322) is similar, but is for online publications. I tend to think of P217 for inventories of physical objects. Unless someone comes up with a good solution, we probably need a new qualifier for that.
--- Jura 04:53, 14 February 2018 (UTC)

Notable work

Should "notable work" field contain fictional characters? or only the names of actual works authored? See Michael Crichton (Q172140). --RAN (talk) 00:48, 15 February 2018 (UTC)

Looking at notable work (P800) I would have said that it is a grey line. I would expect to see Sherlock Holmes on Arthur Conan Doyle's, and the individual works, though not the house at Baker Street. I would expect to see Jurassic Park as the concept, and the individual works as applicable, though to me the characters are not that significant, and are captured by Jurassic Park. Others may disagree in that the names of the characters are universal, so we have that grey line. So I would say yes where there is a significant concept, though you need to guess at what counts as significant. — billinghurst sDrewth 00:57, 15 February 2018 (UTC)

SELECT ?author ?authorLabel ?character ?characterLabel 
WHERE 
{
  ?character wdt:P31 wd:Q15632617.
  ?author wdt:P800 ?character.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}


says no - 71 results only. --Tagishsimon (talk) 01:05, 15 February 2018 (UTC)
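For anyone who wants to rerun the query above outside the query service's web page, here is a minimal Python sketch that only builds the request URL. It assumes the public endpoint at https://query.wikidata.org/sparql and its `format=json` parameter; the helper name `build_wdqs_url` is made up for illustration, and nothing is fetched.

```python
from urllib.parse import urlencode

# Assumption: the public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The query from the discussion: items that are instance of Q15632617
# and are also the value of someone's notable work (P800) statement.
QUERY = """
SELECT ?author ?authorLabel ?character ?characterLabel
WHERE
{
  ?character wdt:P31 wd:Q15632617.
  ?author wdt:P800 ?character.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
"""


def build_wdqs_url(query: str) -> str:
    """Return a GET URL asking the endpoint for JSON results (hypothetical helper)."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query.strip(), "format": "json"})


url = build_wdqs_url(QUERY)
print(url)
```

The printed URL can be pasted into a browser or passed to any HTTP client; the result count will of course differ from the 71 reported in 2018.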

Authors at OCLC

au:Van Winkle, Daniel is the search at OCLC. Are authors given an identifier, or are only books given an identifier? It seems like WorldCat just does a dynamic search on the text string that makes up a name, and does not have a unique identifier ... or do I have it wrong? --RAN (talk) 01:02, 15 February 2018 (UTC)

Well, as far as I know, there is WorldCat Identities data, where the identifier is derived from the LCCN, like https://www.worldcat.org/identities/lccn-n79-091479/, but those are not used for search in the global public WorldCat catalog. I don't know whether they intend to do so or not.

As the ID is completely derived from LCCN, on frws, we just add the link in Authorities, without having to get a specific property (see here for an example). --Hsarrazin (talk) 08:33, 15 February 2018 (UTC)

Familysearch

Can someone with a Familysearch account double check the links? For instance at Vincenzo Capone (Q428446) when I click on his Familysearch link it takes me to my home page, not to his entry. It may have to do with the Familysearch service being down for a full day yesterday, but to see what the problem is I need someone to check from their own account. --RAN (talk) 01:28, 15 February 2018 (UTC)

After logging on the link will take you to the entry. Just checked and have not noticed any problem.--Jklamo (talk) 09:36, 15 February 2018 (UTC)

Weird that it is happening to me, I will reboot my computer and clear my Chrome cache. --RAN (talk) 14:04, 15 February 2018 (UTC)

"no value" for those items needed

I can't add IMDb ID (P345) = <no value> via Quick Statements for those items, but I won't do this by hand. Can anybody get this fixed with a bot? Queryzo (talk) 23:17, 20 February 2018 (UTC)

If nobody takes it, I can do it in the evening. --Edgars2007 (talk) 06:57, 21 February 2018 (UTC)

This section was archived on a request by: --Edgars2007 (talk) 17:04, 21 February 2018 (UTC)

Help: Description

Hi, I recently created an article on Wikipedia. I wanted to make it look more official and more aesthetically pleasing by adding a description under the title. I found out that I could add that by creating a Wikidata page for the topic and adding the description, then linking it to my Wikipedia article. I tried doing this, but it doesn't seem to have worked. Could someone please take a look and see why it hasn't added the description? The Wikidata page is "cheloniology", and the Wikipedia one is "Cheloniology". Thanks so much! 24.238.192.44 23:21, 15 February 2018 (UTC)

Nevermind, I figured it out. Thanks anyway! 24.238.192.44 23:21, 15 February 2018 (UTC)

Structured Data feedback - What gets stored where (Ontology)

This consultation invite from the Structured Data on Commons team is worth a look, because a significant part of the discussion will be what new kinds of information should or should not be stored here on Wikidata:

Greetings,
There is a new feedback request for Structured Data on Commons, regarding what metadata from a file gets stored where. Your participation is appreciated.
Happy editing to you. Keegan (WMF) (talk) 22:58, 15 February 2018 (UTC)

-- Jheald (talk) 23:56, 15 February 2018 (UTC)

Mekniy

Hi! 77.78.89.121 has created Q47320458, Q47319850, Q47319302. None of these items seems to fit our notability criteria. Do you agree? If so, I think we should delete these items. Pamputt (talk) 09:16, 11 February 2018 (UTC)

It all depends on the notability of Q47319850. If that one is notable, the other two are notable due to structural need. I have my doubts about its notability though. Not every micronation (Q188443) is notable. The only source used here is a wiki, and those are not reliable sources. So I would say not notable. Mbch331 (talk) 12:51, 11 February 2018 (UTC)

I deleted the three items. Pamputt (talk) 07:43, 17 February 2018 (UTC)

"Holding Library"

The Biodiversity Heritage Library registers the "holding library" that holds the material that has been digitised and is available on the Internet. The Hoya Handbook is held by the library of the Botanical Library of New York. How do I register this? Thanks, GerardM (talk) 13:50, 16 February 2018 (UTC)

Perhaps with a full work available at URL (P953) statement to link to the scan (that way we can distinguish scans from different copies), and then perhaps collection (P195) or donated by (P1028) as a qualifier? Jheald (talk) 14:13, 16 February 2018 (UTC)

WikiProject Books has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. The Source MetaData WikiProject does not exist. Please correct the name. Jheald (talk) 14:20, 16 February 2018 (UTC)

I would use collection (P195) - works for all GLAM :) --Hsarrazin (talk) 16:58, 16 February 2018 (UTC)

The problem with collection is that it assumes that the work is specific to a GLAM. Here it is the GLAM that digitised the work. That is not necessarily the same as the one that made it available; in this case it is the author who released them from copyright to the BHL. Thanks, GerardM (talk) 17:13, 16 February 2018 (UTC)

sitelinks to Draft namespace

I just noticed this sitelink to the English Wikipedia Draft namespace: Anastasia Vashukevich (Q48641974) Is this OK? I would think that Wikipedia interwikilinks should only be shared among namespace 0 areas, no? Jane023 (talk) 17:51, 22 February 2018 (UTC)

It isn't ok, I have deleted the sitelink because Draft namespace isn't valid, you can see the rule here --ValterVB (talk) 18:49, 22 February 2018 (UTC)

OK great, thx. Jane023 (talk) 19:07, 22 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:27, 23 February 2018 (UTC)

Merge request

Balch Creek: Q4850235 and Q49844960 -Pete F (talk) 20:16, 22 February 2018 (UTC)

Done--Ymblanter (talk) 20:37, 22 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:27, 23 February 2018 (UTC)

Formica

Quite a simple issue, but I'm not sure how to go about rectifying it...
Formica (Q1846692) describes "a 2017 film by michael guedes" - complete with lots of statements and external identifiers - but ALL the Wikipedia sitelinks refer to "a type of laminated composite material". Both are called Formica, both are valid topics for wikidata, but they are clearly mixed up. How should they be "separated" correctly? Wittylama (talk) 20:16, 22 February 2018 (UTC)

Apparently, there was vandalism. I restored the correct version. Matěj Suchánek (talk) 20:22, 22 February 2018 (UTC)

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:27, 23 February 2018 (UTC)

Some tricky properties of a book

Q1366818 (the book Escape to Life) presents an interesting situation on several counts. I'm wondering what, if anything, of the following we can somehow convey.

The book was originally published in 1939 by Houghton Mifflin. We have an existing entity Q390074 for present-day publisher Houghton Mifflin Harcourt, but not for this predecessor. It would be inaccurate to say that the book was published by Houghton Mifflin Harcourt; what should we do?
Klaus and Erika Mann originally wrote the book in German, but it was first published in English translation. A German edition did not come out until 1991. Is there any way to convey that the book was written in German, but first published in English? Is there any way to indicate the first German edition as being just that?

Jmabel (talk) 05:37, 16 February 2018 (UTC)

@Jmabel: - There's quite a lot of prior art at Wikidata:WikiProject Books; they seem to list the pertinent statements for the Work, and for the Edition, as far as I can see from a quick glance. (Sorry I'm pointing you elsewhere rather than answering in detail.) hth --Tagishsimon (talk) 09:02, 16 February 2018 (UTC)

Interesting. As a relatively casual user of Wikidata, how would I be likely to have found that page, other than coming here to ask? - Jmabel (talk) 16:13, 16 February 2018 (UTC)

@Jmabel: I remembered a prior discussion on books / editions on this page, found it & found the pointer to the wikiproject. I /should/ remember that wikiprojects are where we tend to store data models & so point you to Wikidata:WikiProjects. --Tagishsimon (talk) 20:01, 17 February 2018 (UTC)

@Tagishsimon: Even after reading that page, I don't see answers to either of the questions I asked above. Did you read the page and see answers to my questions? Or was this just "there's a lot of stuff about books at Wikidata:WikiProject Books, your questions might be answered there"? I think someone more expert than I on Wikidata would do well to see if this can be expressed with current properties (and if so I'd be interested in learning how). In particular, I'm guessing that for Houghton Mifflin there is some way to do this with custom properties, but I haven't been able to work out how to create one of those. - Jmabel (talk) 16:27, 16 February 2018 (UTC)

@Jmabel:

In fact, Escape to Life (Q1366818) is flawed because it is defined as a work (Q7725634) but contains publication info. If you read the Wikidata:WikiProject Books page, you've seen that works and version, edition or translation (Q3331189) are two different types. The work should contain only info about the authors, the original language (German), the original (German) title, the genre, and links to edition items.

Info about editions, both in English and German, must each go into a version, edition or translation (Q3331189) item, which would then have all the properties about the publisher, the year of publication, the title of the said publication, etc., like a traditional library catalog. For each edition there must be a different item, and it would be preferable if you could add a library ID for the edition, LoC for example, to be able to differentiate editions and have a reference.

Then, each edition is linked to the work item through edition or translation of (P629), and in the work item, you may link to the publications through has edition or translation (P747). Then, you can indicate on the English edition that it was the first edition, like I did with Escape to Life (Q48914392). It should also be done for the first German edition, for which I have no info at all.

This may seem a little complicated, but it is the only way to manage data about the work and data about the different editions without mixing them up.

If you need help, you may seek it on the discussion page of the project.

As for your question about the publisher: on Houghton Mifflin Harcourt (Q390074), I see it was created in 1880, so it is the right publisher. Publishers often change their name over time, and it is written differently on many books, and it is still the same publisher... If it is the actual name in 1939 that bothers you, you can add an object named as (P1932) qualifier to set the exact name of the publisher at the time of publication. :) --Hsarrazin (talk) 16:50, 16 February 2018 (UTC)
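The work/edition split described in this reply can be sketched as plain Python data. This is an illustration only, not how Wikidata stores items: the dictionaries are simplified, P123 (publisher) and P577 (publication date) are assumed here as typical edition-level properties, the publisher claim on the work stands in for the "publication info" said to be misplaced there, and both helper functions are hypothetical.

```python
EDITION_OF = "P629"        # edition or translation of
HAS_EDITION = "P747"       # has edition or translation
PUBLISHER = "P123"         # assumption: publisher
PUBLICATION_DATE = "P577"  # assumption: publication date
NAMED_AS = "P1932"         # object named as (used as a qualifier)

# Statements that describe one printing and so belong on an edition item.
EDITION_ONLY = {PUBLISHER, PUBLICATION_DATE}

work = {
    "id": "Q1366818",  # Escape to Life, the work
    "claims": {
        HAS_EDITION: ["Q48914392"],
        PUBLISHER: ["Q390074"],  # illustrative misplaced publication info
    },
}

edition = {
    "id": "Q48914392",  # the English edition mentioned in the reply
    "claims": {
        EDITION_OF: ["Q1366818"],
        # Publisher item, qualified with the name in use in 1939.
        PUBLISHER: [{"value": "Q390074", "qualifiers": {NAMED_AS: "Houghton Mifflin"}}],
        PUBLICATION_DATE: ["1939"],
    },
}


def misplaced_on_work(item: dict) -> list[str]:
    """List edition-level properties found on a work item."""
    return sorted(p for p in item["claims"] if p in EDITION_ONLY)


def links_are_reciprocal(work_item: dict, edition_item: dict) -> bool:
    """Check that P747 on the work and P629 on the edition point at each other."""
    return (edition_item["id"] in work_item["claims"].get(HAS_EDITION, [])
            and work_item["id"] in edition_item["claims"].get(EDITION_OF, []))


print(misplaced_on_work(work))              # properties to move to an edition item
print(links_are_reciprocal(work, edition))  # whether the two items link to each other
```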

Houghton Mifflin Harcourt doesn't seem to me like just a "change of name" of Houghton Mifflin. It represents a merger with the historically equally important Harcourt Brace Jovanovich (previously Harcourt Brace, then Harcourt, Brace, and World, then Harcourt Brace Jovanovich). Aside: there used to be a joke in the publishing industry that the name was changed because Jovanovich thought he was more important than the world.

I'm clearly out of my depth here. I'll bring it to Wikidata talk:WikiProject Books. - Jmabel (talk) 17:04, 16 February 2018 (UTC)

So split the Houghton Mifflin entity from Houghton Mifflin Harcourt? (See also Help:Handling sitelinks overlapping multiple items.) Alternatively (or as a stop-gap), use object named as (P1932). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:37, 16 February 2018 (UTC)

Just a thought that may or may not be helpful: Wouldn't an unpublished book be a manuscript? It is not really a book until published. I do not see entries for many writings in unpublished forms unless they are in museums or archives, most entries are for ancient writings. We do not have entries for screenplays before they are movies, but we could have, people have received Academy Awards for writing them. --RAN (talk) 01:16, 17 February 2018 (UTC)

Somebody with Toollabs account

Could somebody with a Toollabs account do a query for me? Database: s51434__mixnmatch_large_catalogs_p, SQL: SELECT * FROM viaf WHERE LNB<>"". And dump the results somewhere (Google Sheet, pastebin, wherever). It would be good to also include row headers. The SQL should work (based on this and this). --Edgars2007 (talk) 17:28, 17 February 2018 (UTC)
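As a rough sketch of the "dump with headers" part of this request, the snippet below runs the same filter and writes CSV with a header row. It does not touch the Toollabs database: an in-memory SQLite table with made-up columns and rows stands in for the real `viaf` table, whose actual schema is not given here.

```python
import csv
import io
import sqlite3

# Stand-in for the real database: in-memory SQLite with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE viaf (id INTEGER, name TEXT, LNB TEXT)")
conn.executemany(
    "INSERT INTO viaf VALUES (?, ?, ?)",
    [(1, "Example A", "LNB:0001"), (2, "Example B", ""), (3, "Example C", "LNB:0003")],
)


def dump_with_headers(connection: sqlite3.Connection, sql: str) -> str:
    """Run a query and return the rows as CSV text, header row first."""
    cursor = connection.execute(sql)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # column names
    writer.writerows(cursor)
    return out.getvalue()


# Same filter as the request; single quotes are the portable SQL string literal.
csv_text = dump_with_headers(conn, "SELECT * FROM viaf WHERE LNB <> ''")
print(csv_text)
```

Only rows with a non-empty `LNB` value survive the filter, and the first CSV line carries the column names.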

Pastebin will follow soon. Takes some time as it's a lot of data. Mbch331 (talk) 18:31, 17 February 2018 (UTC)

Pastebin took too long. I've uploaded the data as a gzip to my Google Drive. Mbch331 (talk) 18:48, 17 February 2018 (UTC)

Thanks! --Edgars2007 (talk) 19:23, 17 February 2018 (UTC)

How to qualify Olympians who've represented more than one country?

Looking at Barbara Jezeršek (Q746844), she competed on behalf of Slovenia (Q215) in 2010 Winter Olympics (Q9674) and 2014 Winter Olympics (Q9678), but then for Australia (Q408) in cross-country skiing at the 2018 Winter Olympics – women's 15 kilometre skiathlon (Q47035493). I would expect to be able to tag them with an "on behalf of" or "participating for" or something, but I couldn't find an appropriate Property to use. Thoughts, anyone? — OwenBlacker (talk) 23:32, 13 February 2018 (UTC)

country for sport (P1532).--Jklamo (talk) 09:41, 14 February 2018 (UTC)

As a qualifier of the participant in (P1344) claim, please.

However, in spite of this solution being widely accepted, it is not ideal either. Technically, Olympic participants are members of a team, and this team may or may not represent a country. Typically one team represents one country, but there are many exceptions. Examples are the Russian competitors at the 2018 Winter Olympics and the refugee team at the 2016 Summer Olympics. It would be much cleaner if we had a qualifier member of sports team (P54) with a value item such as "Slovenian team" or "Australian team" instead of country for sport (P1532) with values Slovenia (Q215) or Australia (Q408), and the value item of that P54 qualifier was linked to the country the team actually represented—if there was any. —MisterSynergy (talk) 10:26, 14 February 2018 (UTC)

Yes, I should have been clearer, I did mean for the participant in (P1344) claims. The country for sport (P1532) solution is probably the best we have right now, though I'm inclined to agree with MisterSynergy. That said, arguably Olympic Athletes from Russia (Q28155263) is the country for sport (P1532) value for the Russian athletes competing in Korea right now… — OwenBlacker (talk; please {{Ping}} me in replies) 12:50, 14 February 2018 (UTC)
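To make the accepted approach concrete, here is a small sketch that writes the three participations as QuickStatements commands in the tab-separated version 1 syntax (item, property, value, then qualifier property and value). The helper name is made up for illustration, and the output is only printed, not submitted anywhere.

```python
def quickstatements_line(item: str, prop: str, value: str, qualifiers: dict[str, str]) -> str:
    """Build one tab-separated QuickStatements (version 1 syntax) command."""
    fields = [item, prop, value]
    for qualifier_prop, qualifier_value in qualifiers.items():
        fields += [qualifier_prop, qualifier_value]
    return "\t".join(fields)


# Barbara Jezeršek (Q746844): participant in (P1344), qualified by country for sport (P1532).
participations = [
    ("Q9674", "Q215"),      # 2010 Winter Olympics, for Slovenia
    ("Q9678", "Q215"),      # 2014 Winter Olympics, for Slovenia
    ("Q47035493", "Q408"),  # 2018 women's 15 km skiathlon, for Australia
]

lines = [
    quickstatements_line("Q746844", "P1344", event, {"P1532": country})
    for event, country in participations
]
print("\n".join(lines))
```

Swapping the qualifier for P54 with a team item, as suggested above, would only change the dictionary passed as `qualifiers`.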

How do you set up in Wikidata a medal winner in the Olympics

Example Carl Hellström (Q1805994) he took

see sports-reference.com - Salgo60 (talk) 05:58, 18 February 2018 (UTC)

You can see Sven Kramer (Q111320) how they did it with a Dutch skater, but that method does violate constraints. Mbch331 (talk) 08:41, 18 February 2018 (UTC)

Strictly speaking: the medal itself is not the award, thus it should not be used with award received (P166). I’d just replace that qualifier with a ranking (P1352) qualifier. —MisterSynergy (talk) 09:37, 18 February 2018 (UTC)

New users

Noting that there are an unusual number of new users who have edited Guntur (Q3120966) today and are making other questionable and/or unsourced edits on other items. I haven't sent any of them welcome messages or stuff like that. Jc86035 (talk) 10:33, 17 February 2018 (UTC)

I'm guessing this is an editathon of some sort? Mixture of helpful and unusable edits (contributions links: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20). I found them with Special:RecentChanges since they all seem to have created user pages with Babel tags. Jc86035 (talk) 10:43, 17 February 2018 (UTC)

@Krishna Chaitanya Velaga: What happened here? Mahir256 (talk) 02:55, 18 February 2018 (UTC)

@Mahir256, Jc86035: Yes, they're part of edit-a-thon conducted yesterday. I'll warn them about this. Thanks for the ping. Krishna Chaitanya Velaga (talk) 05:41, 18 February 2018 (UTC)

@Krishna Chaitanya Velaga: Is there a page about this editathon somewhere (even if not in English)? Mahir256 (talk) 05:46, 18 February 2018 (UTC)

@Mahir256: There isn't any specific event page. We conducted it as a part of the training sessions for m:IMLD-ODD 2018 Wikidata India Edit-a-thon. But the activity can be seen at https://outreachdashboard.wmflabs.org/courses/WikiProject_India_on_Wikidata/Wikidata_Workshop_at_VVIT_--_Feb_2018. Krishna Chaitanya Velaga (talk) 05:49, 18 February 2018 (UTC)

@Krishna Chaitanya Velaga: Noting that some of the issues I found were: test edits on live items; removal of correct data points; possibly incorrect changes in data structure (e.g. 1); removal of "imported from Wikipedia" references and other sources; addition of sources without title/date/access date/author; creation of duplicate items; linking to incorrect items (like Paper (Q1402686) instead of paper (Q11472)); lack of awareness of the notability policy; addition of textbooks and various other things without any identifiers like ISBN. Jc86035 (talk) 08:29, 18 February 2018 (UTC)

@Jc86035: Sure, I'll note that. Thank You. Krishna Chaitanya Velaga (talk) 08:32, 18 February 2018 (UTC)

@Krishna Chaitanya Velaga: I hope you will be able to review all of the edits that have provoked @Jc86035:'s concern and revert those that are obviously wrong or have caused the issues he described. I will see if I can do the same at some point, and I wonder if Jc86035 can do the same himself at some point as well. Mahir256 (talk) 09:38, 18 February 2018 (UTC)

@Mahir256: Of course, I'll check the edits, correct the wrong ones. Krishna Chaitanya Velaga (talk) 09:47, 18 February 2018 (UTC)

@Mahir256: I've already reviewed some of the edits (and nominated a couple of items for deletion), and sent two of the editors welcome messages (1 2). Jc86035 (talk) 11:31, 18 February 2018 (UTC)

'blank weapons'

Does anyone know what's going on with Q222405 and Q15407649? I thought 'arme blanche' in French was synonymous with 'bladed weapon' and so I am not sure we're getting it right here. arwiki and ruwiki have both, but I don't read either so I don't know what to think about that. svwiki and fywiki are on the latter, but just eyeballing it they could also fit on the former. 82.31.82.76 11:44, 18 February 2018 (UTC)

In ru.wiki articles Холодное оружие = cold weapons = weapons without explosive/pyrotechnical, pressurised gas or electrical mode of action; mainly, but not necessarily a melee weapon (also includes e.g. throwing weapons). Клинковое оружие = blade weapons = cold weapons (as described earlier) with a blade inseparable from the handle. Wostr (talk) 14:57, 18 February 2018 (UTC)

Not all 'arme blanche' are blades, a baseball bat (Q809910) is an 'arme blanche' but not a blade. Cdlt, VIGNERON (talk) 18:30, 18 February 2018 (UTC)

Series ordinal - P1545

When you read what the "series ordinal" is about, it is "position of an item in its parent series, generally to be used as a qualifier". When you consider its use in combination with USA governors, you will find that there is no obvious position, that it is often arbitrary. Particularly when you consider the number range for the governor of South Carolina, it is not only about governors and it is not only about USA governors. Consequently this property is abused.

The reason why I object is that I have been harassed because I do not value this property as significant. So there are a few scenarios: the first is to be more relaxed and talk/be a lot less aggressive. The second is that another property is considered, one that acknowledges the arbitrariness of what the number is used for. Thanks, GerardM (talk) 18:48, 6 February 2018 (UTC)

Maybe you should talk with others instead of talking about others. Sjoerd de Bruin (talk) 18:53, 6 February 2018 (UTC)

Really? I do not mind talking with people when there is a reasonable tone and a reasonable request. This has been absent in this latest altercation. When the facts are considered it is about a property that is obviously abused. All the more reason to consider an alternative. Thanks, GerardM (talk) 19:39, 6 February 2018 (UTC)

I'm not sure I understand what point you're trying to make. The property seems pretty clear to me. The first person to hold the position is P1545: 1, the second 2, etc. Can you clarify which situations you find arbitrary, or abuse of the qualifier? --Yair rand (talk) 19:59, 6 February 2018 (UTC)

I don't know if this is what Gerard is talking about, but I notice P1545 is used as a qualifier on positions in two different ways, one as you note is if the position is a unique position, to indicate the order of this person in the sequence of holders of the position. But the other meaning is when the position is part of a numbered list (such as members of a state's delegation to the US house of representatives) - so the district 3 representative would have a P1545 = 3 qualifier. Those two uses should probably be separated somehow in future, maybe another property is needed. ArthurPSmith (talk) 20:27, 6 February 2018 (UTC)

This appears to be a continuation of the discussion at Wikidata:Administrators' noticeboard#GerardM (talk • contribs • logs). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:01, 6 February 2018 (UTC)

Maybe you could point out a better method to record that Q5725737 is presented as the 117th governor [4]?
--- Jura 20:25, 6 February 2018 (UTC)

Why should I? The property is abused; it is not a straightforward sequence. I do not care for anyone to be a specific number when there is no method to that madness. Thanks, GerardM (talk) 11:53, 11 February 2018 (UTC)

The use of the qualifier is consistent with the use in the field and relevant constraints. If there were different ways to use such ordinals, we could store them as separate statements, but apparently this isn't even an issue here.
--- Jura 04:51, 16 February 2018 (UTC)

I do not know the specifics of this argument, but I do know that assigning a number is contentious, unless there is a canonical numbering system by that political office at their official website. Some historians do not count a second non-contiguous term as a new holder of the office, and some do. So we have John Smith as the 40th holder of the office, and another historian counting John Smith as the 40th and the 45th holder of the office. Some numbering systems count interim holders of the office, and some systems do not count them. Is this an argument like that? I create lists of mayors and run into this problem all the time. I am currently working with the research librarian for Long Branch, New Jersey to create a canonical list for them. --RAN (talk) 02:58, 19 February 2018 (UTC)
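The numbering disagreement described here can be shown with a toy example. The holder names are invented, and the two functions are only a sketch of two conventions (number every term, or number every person once); real lists add further wrinkles such as interim holders.

```python
# Hypothetical office holders, in chronological order of their terms.
terms = ["Adams", "Baker", "Adams", "Clark"]


def ordinals_per_term(holders: list[str]) -> dict[str, list[int]]:
    """Every term gets its own number, so a returning holder has several."""
    numbering: dict[str, list[int]] = {}
    for position, name in enumerate(holders, start=1):
        numbering.setdefault(name, []).append(position)
    return numbering


def ordinals_per_person(holders: list[str]) -> dict[str, list[int]]:
    """Each person is counted once, at their first term."""
    numbering: dict[str, list[int]] = {}
    for name in holders:
        if name not in numbering:
            numbering[name] = [len(numbering) + 1]
    return numbering


print(ordinals_per_term(terms))
print(ordinals_per_person(terms))
```

Under the first convention Clark is the 4th holder; under the second Clark is the 3rd, which is exactly the kind of mismatch a single series ordinal value cannot express without a stated source.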

Surname

Can we add "surname" to the list of default fields that show up for "instance of=human". "Given name" is one of our default fields to fill in, but oddly not family_name (surname). This is why the field seems to go unused in so many entries. --RAN (talk) 04:40, 16 February 2018 (UTC)

There is no "default", it happens autonomously. If there are enough people with a surname (at least 6.9%), it will show up. The more people have the property, the higher you will see it. At the moment, it's the 17th property in the list (you can test on Apor (Q773907)). Matěj Suchánek (talk) 09:06, 16 February 2018 (UTC)

At least 6.9% of what? Breg Pmt (talk) 10:06, 16 February 2018 (UTC)

There are 4 million people in Wikidata. If at least 6.9% of them have a surname, it will be suggested for the rest of them (unless there's a known interference with external identifiers). Matěj Suchánek (talk) 10:20, 16 February 2018 (UTC)

How many persons have a surname, then? And what are the 10 most popular? Is it worth running a query? :) Breg Pmt (talk) 11:47, 16 February 2018 (UTC)

Smith, Li, Jones, Williams... tinyurl.com/yan6quny Jheald (talk) 11:57, 16 February 2018 (UTC)

@RAN: You may want to add this suggestion to Wikidata:Suggester ranking input. Deryck Chan (talk) 11:28, 16 February 2018 (UTC)

How much could safely be added by bot, eg from the DEFAULTSORT on en-wiki, to push that number of uses up? (Currently 531,563 out of 4,136,741 humans = 12.85%) ? Jheald (talk) 11:51, 16 February 2018 (UTC)
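The percentage quoted here can be checked with a few lines of Python. The two counts and the 6.9% threshold are the figures given in this thread; nothing else is assumed.

```python
humans_with_family_name = 531_563  # figure quoted in the discussion
humans_total = 4_136_741           # figure quoted in the discussion
suggestion_threshold = 0.069       # the 6.9% mentioned earlier in the thread

share = humans_with_family_name / humans_total
print(f"{share:.2%}")                 # share of humans with a family name
print(share >= suggestion_threshold)  # whether the share clears the threshold
```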

@Jheald: Running the query ?surname wdt:P31 wd:Q101352, I find 235932 surnames (using limit 400000). Is that correct? Breg Pmt (talk) 12:13, 16 February 2018 (UTC)

@Pmt: That would be the number of different surname items we have.

To find the number of times the property is used, try

SELECT (COUNT(*) AS ?count) WHERE {
  [] wdt:P734 [].
}


-- or look it up on the property's talk page.

(The number quoted above is slightly smaller, because there I also required ?item wdt:P31 wd:Q5 -- i.e. no surnames of fictional people). Jheald (talk) 12:28, 16 February 2018 (UTC)
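For readers who run such COUNT queries from a script, the sketch below reads the single number out of a result document. The sample response is made up (including the value 123) and merely follows the shape of the W3C SPARQL JSON results format; `read_count` is a hypothetical helper.

```python
import json

# Made-up response in the shape of the W3C SPARQL JSON results format.
sample_response = """
{
  "head": {"vars": ["count"]},
  "results": {"bindings": [
    {"count": {"type": "literal",
               "datatype": "http://www.w3.org/2001/XMLSchema#integer",
               "value": "123"}}
  ]}
}
"""


def read_count(raw: str, variable: str = "count") -> int:
    """Extract a single integer binding from a SPARQL JSON result document."""
    bindings = json.loads(raw)["results"]["bindings"]
    return int(bindings[0][variable]["value"])


print(read_count(sample_response))
```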

Also of interest is the number of family names as yet unused on any person: about 166,000: tinyurl.com/yclf7omz Jheald (talk) 12:32, 16 February 2018 (UTC)

For given name (P735) it took quite some time till it got suggested for items lacking it. When it eventually did, the downside was that people filled it with random items as values, as appropriate values hadn't been made for some names. For surnames, the problem might even be larger. It's likely that frequent surnames (in general) are even more frequent in Wikidata as some additions started out with these. On the opposite end, I think someone made items for all surnames of football players ..
--- Jura 12:31, 16 February 2018 (UTC)

It takes less than 10 seconds to create a new given_name entry for a missing one; it only needs instance_of=family_name. I probably added a dozen last month. --RAN (talk) 03:01, 19 February 2018 (UTC)

The suggestion data will soon be updated, then family name will get a good boost (above alma mater, I think). Sjoerd de Bruin (talk) 14:47, 16 February 2018 (UTC)

With reference to the discussion here, I would like to point to the proposal patronym or matronym (Wikidata:Property proposal/Person#patronym or matronym); i.e., not all persons have surnames. Breg Pmt (talk) 16:09, 16 February 2018 (UTC)

Only one bank (Q2897058) for concrete object (Q17553950)

(from Wikidata talk:WikiProject Rivers#Only one)

How can I state that a concrete object (Q17553950) is located on the right bank (Q27834918)/left bank (Q27834806) of a watercourse (Q355304)?

Thanks in advance. - Kareyac (talk) 08:35, 17 February 2018 (UTC)

@Kareyac: I see a dozen examples like Lawrence County (Q502737):

⟨ Lawrence County (Q502737) ⟩ located in or next to body of water (P206) ⟨ Wheeler Lake (Q3561541) ⟩
direction relative to location (P654) ⟨ north (Q659) ⟩

Is it what you need?

Cheers, VIGNERON (talk) 18:44, 18 February 2018 (UTC)

@VIGNERON: Thanks, but I'm afraid I'm not sure: direction relative to location (P654) shows position according to the compass. In my case I want to say "the Musée d'Orsay (Q23402) is located on the left bank (Q27834806) of Seine (Q1471)". - Kareyac (talk) 19:51, 18 February 2018 (UTC)

@Kareyac: then maybe something like

⟨ Musée d'Orsay (Q23402) ⟩ located in or next to body of water (P206) ⟨ Seine (Q1471) ⟩
applies to part (P518) ⟨ left bank (Q27834806) ⟩

Cheers, VIGNERON (talk) 20:30, 18 February 2018 (UTC)

OK, I'll follow your advice. - Kareyac (talk) 20:49, 18 February 2018 (UTC)
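If statements are modelled as suggested (P206 with an applies to part (P518) qualifier), they can be found again with a query over the statement node. The sketch below only builds the SPARQL text; it assumes the `p:`, `ps:` and `pq:` prefixes that the Wikidata Query Service predefines, and the function name is made up for illustration.

```python
def on_bank_query(river: str, bank: str) -> str:
    """Build a SPARQL query for items on a given bank of a watercourse."""
    return f"""SELECT ?item WHERE {{
  ?item p:P206 ?statement.            # located in or next to body of water
  ?statement ps:P206 wd:{river};      # main value: the watercourse
             pq:P518 wd:{bank}.       # qualifier: applies to part
}}"""


query = on_bank_query("Q1471", "Q27834806")  # Seine, left bank
print(query)
```

Whether the query returns anything depends on how many items actually carry the qualifier.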

Copyright start date

Would there be any interest in having the date that copyright status begins as a field for newspapers and magazines? Currently you have to search here for the publication and it tells you the start date for copyrights (as best can be discerned to date). That information could be pulled into WikiCommons and WikiSource by a template with standardized wording. See for example: here for a hand-written example for the Asbury Park Press, which did not file for renewal, and The Jersey Journal, which did. Some publications had a more extensive copyright clearance search performed and gaps in renewals were found for individual issues; see Time magazine as an example. We would not be able to have a single date for Time magazine. --RAN (talk) 19:28, 18 February 2018 (UTC)

The template reading the Wikidata date would add this statement to the category for the articles in Commons and in Source:--RAN (talk) 02:46, 19 February 2018 (UTC)

Articles published in the Jersey Journal are in the public domain prior to February 9, 1929. All articles starting on that date are currently under active copyright.
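The standardized wording could be generated from a stored date along these lines. This is only a sketch in Python rather than actual template code: the function name is hypothetical, the date is passed in directly instead of being read from Wikidata, and the month name relies on the default (English) locale.

```python
from datetime import date


def copyright_notice(publication: str, copyright_start: date) -> str:
    """Render the standardized wording from a publication name and a start date."""
    # Build the date by hand so the day has no leading zero on any platform.
    when = f"{copyright_start:%B} {copyright_start.day}, {copyright_start.year}"
    return (f"Articles published in the {publication} are in the public domain "
            f"prior to {when}. All articles starting on that date are currently "
            f"under active copyright.")


notice = copyright_notice("Jersey Journal", date(1929, 2, 9))
print(notice)
```

A publication with gaps in its renewals, like the Time magazine case mentioned above, would need more than one date and does not fit this single-date form.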

Fusion problems with German wiki

Can someone merge en:Category:Gynaecology (Q7028220) with German de:Kategorie:Gynäkologie und Geburtshilfe (Q8970781)?

Can someone merge en:Category:Sexual ethics (Q30674430) with German de:Kategorie:Sexualethik (Q17303953)? – The preceding unsigned comment was added by 178.11.14.47 (talk • contribs).

I'm not sure the first pair of categories cover exactly the same range of topics. I don't know much about medicine, but the German category seems to cover not only gynaecology, but also de:Geburtshilfe - which links to en:Midwifery but seems to also cover en:Obstetrics. --Kam Solusar (talk) 23:44, 18 February 2018 (UTC)

Comment: The German Wikipedia (and the Czech also, IIRC) uses a dual category system, with one set of categories using the technical / scientific term, and one set using the vernacular German. So it will not always be possible to merge category data items for the German Wikipedia. --EncycloPetey (talk) 02:53, 19 February 2018 (UTC)

Please can we enable FormWizard on Wikidata?

Hi all

I would really like to use FormWizard on Wikidata to create a very easy to use standardised form for creating entries on the Wikidata:Data Import Hub. Please can it be enabled? Unsure if I make a request here or on Phabricator.

Thanks

--John Cummings (talk) 20:04, 6 February 2018 (UTC)

Support. Seems useful; and harmless. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:15, 6 February 2018 (UTC)
Question: I wonder if this gadget is still getting maintained. The Phabricator project isn't showing much activity and the project page hasn't been updated over almost a year. Sjoerd de Bruin (talk) 21:26, 6 February 2018 (UTC)
- @Sjoerddebruin: A WMF staff member is the maintainer and Wikimedia Foundation are using FormWizard on the new Wikimedia Resource Center (when you want to add a new resource) along with several pages on the WMF grants space, it seems likely that it will be fixed if it breaks. --John Cummings (talk) 22:21, 6 February 2018 (UTC)
  - @Sjoerddebruin: are you able to enable it? --John Cummings (talk) 09:38, 14 February 2018 (UTC)
Support Sounds useful. Richard Nevell (talk) 22:48, 6 February 2018 (UTC)
Support - NavinoEvans (talk) 14:35, 7 February 2018 (UTC)
Support - makes sense to do this. Battleofalma (talk) 15:05, 7 February 2018 (UTC)
Support - seems preferable over your phab proposal for the same. --- Jura 19:55, 7 February 2018 (UTC)

Done, configuration files can be created as subpages of Wikidata:FormWizard/Config. Sjoerd de Bruin (talk) 09:59, 14 February 2018 (UTC)

Thanks so much @Sjoerddebruin: :), John Cummings (talk) 11:25, 19 February 2018 (UTC)

Ships: Copy data from Commons to Wikidata

Images of 17 534 ships are indexed in Wikimedia Commons by the unique IMO number, but only 3976 IMO numbers are set in Wikidata. To manually create an entry for each ship not existing in Wikidata would take several months, but would be easy to do with QuickStatements. However, I don't want to create entries for ships that already have Wikidata entries (without the IMO parameter already set). The questions are then:

How can I find out which of these Commons categories contain images that are used in Wikipedia articles (and then probably will have a Wikidata entry without the IMO parameter set)?
Is there any other way to figure this out? --Cavernia (talk) 14:43, 7 February 2018 (UTC)

From your 13 558 candidates for new items, a first set to exclude would be items that already have a Commons category (P373) mapping to such a ship. From a second-level category scan on Petscan (or an SQL query) you should be able to get the names of the relevant categories on Commons. This could then be compared to the Commons categories for all of the ships on Wikidata with no IMO number, to see whether you can match some more. (Of course, somebody may already have done this -- this may be where the existing IMO numbers come from). Jheald (talk) 15:11, 7 February 2018 (UTC)

The chain Commons category -> image -> Wikipedia article -> Wikidata item may also be possible to extract entirely from SQL, but that would need somebody with better knowledge of the SQL tables and their contents than I've got. The first and last links are definitely possible from SQL, I don't know about the middle one. Jheald (talk) 15:14, 7 February 2018 (UTC)

I hope this query will help you. Interestingly, the amount of such categories is very close to the amount of already set IMO numbers. Matěj Suchánek (talk) 16:31, 7 February 2018 (UTC)

Very nice!

Although seeing a query like that always reminds me just why I like SPARQL so much :-) Jheald (talk) 16:51, 7 February 2018 (UTC)

@Jheald: Thanks, I've already done your first tip, but the opposite way, by extracting all items with Commons Category starting with "IMO" and using QuickStatements to set IMO number for the items without IMO number set. The thing is that several items are not using the IMO category, but the ship name subcategory, i.e. c:Category:Canberra (ship, 1961). It would still be possible to determine by using CatScan to extract the subcategories and then compare to an extract of Commons category (P373) mappings. Still, many items have images in Commons, but the image is not set, nor the Commons Category.

@Matěj Suchánek: Thanks for the query, but it doesn't seem to work as I intended. The query detects some articles containing images from the category, but apparently not the ship articles. Example: For IMO 5059953 the query finds Q48249 (Falklands War) and Q7394859 (STUFT), but not Q1032840, which is the main item of the ship Canberra. --Cavernia (talk) 17:44, 7 February 2018 (UTC)

I see the problem. Some database columns have titles with underscores, some with spaces. I updated the query, the number doubled. Matěj Suchánek (talk) 19:31, 7 February 2018 (UTC)

Thanks, it works better now. --Cavernia (talk) 20:43, 7 February 2018 (UTC)
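A minimal sketch of the underscore/space mismatch mentioned above, in Python rather than SQL. It assumes the titles have already been exported from the database; the function names and the sample rows are hypothetical and only illustrate normalizing titles before comparing them with Commons category (P373) values.

```python
def normalize_title(title: str) -> str:
    """Return a page title with underscores turned into spaces and trimmed."""
    return title.replace("_", " ").strip()


def match_categories(commons_titles, p373_values):
    """Map each normalized Commons category title to the item that already uses it.

    commons_titles: list of titles as stored in a database column
    p373_values: dict of item ID -> Commons category (P373) value
    """
    by_category = {normalize_title(cat): qid for qid, cat in p373_values.items()}
    matched = {}
    for title in commons_titles:
        key = normalize_title(title)
        if key in by_category:
            matched[key] = by_category[key]
    return matched


# Hypothetical sample rows: one title uses underscores, the other spaces.
commons_titles = ["Canberra_(ship,_1961)", "IMO 1001192"]
p373_values = {"Q1032840": "Canberra (ship, 1961)"}  # assumed P373 value

already_on_wikidata = match_categories(commons_titles, p373_values)
print(already_on_wikidata)
```

Categories that end up in the result would be skipped when creating new items; the rest remain candidates.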

Now when I've found the way to determine which entries not to import, it must be decided how to import the entries:

Many ships have several names, should the old names be entered into name (P2561) or be added as aliases?
The year the ship was built (finished) is mostly entered into service entry (P729), but in some cases start time (P580) is also used. Before creating more than 10 000 items, it should be decided which property to use.
The flag state is possible to extract from one of the online ship databases, but I can't find a way to import more than the current or last registered flag state. Should we import the current flag state or leave it open until we find a way to import the historical flag states in the correct order?
It is possible to import gross tonnage, length, width and draft when using QuickStatements, but then a ±0 will be added to indicate the tolerance. I don't like it, but I can't figure out how to get rid of it when using QuickStatements.
Should gross tonnage (P1093) be imported with the unit gross tonnage or without unit?
It would be practical to add labels, descriptions and aliases in the most common languages for shipping, like English, French, German, Dutch, Spanish, Portuguese, Italian, Danish, Swedish, Finnish and Norwegian. There are different standards for how to arrange the name, e.g. SS France is SS «France» in Norwegian and France in German. Could it be that these standards are listed anywhere? --Cavernia (talk) 12:15, 8 February 2018 (UTC)

1) Put all the names in as separate statements. The most recent one should get preferred rank and the older ones normal rank, with the start and end dates, if known, added as statement qualifiers. author TomT0m / talk page 12:28, 8 February 2018 (UTC)

I'm resolving ship names from Commons by extracting the subcategories, which include the ship names (of course not all of them) and the date the ship was built. More names (and years) might be included in the category description, but there is no common standard for this (compare 7229502 and 9238404), it is difficult to harvest with a script, and mostly the names are all upper case letters. Getting the data into Wikidata is actually the easy part here; harvesting and securing the quality of the data to feed into it is the hard part. However, I don't know if it's possible to define rank when importing data by using QuickStatements. --Cavernia (talk) 12:56, 8 February 2018 (UTC)

The script has now been tested: Q48336558 Q48336574 Q48336589 Q48336606 Q48336621 Q48336635 Q48336649 Q48336664 Q48336681 Q48336697. Comments are appreciated before running the rest of it. --Cavernia (talk) 20:26, 10 February 2018 (UTC)

@Cavernia: That's looking really nice. A couple of tweaks might be whether to make the description slightly more detailed, eg "Yacht, built in 1987" or even "61-metre yacht, built in 1987" rather than just "ship", and to add the alternative names to the aliases field, ie "Majestic", "Il Vagabondo Again" and "IMO 1001984"; also perhaps to make it instance of (P31) a more specific class than ship (Q11446). But maybe you were planning to add all of that. The most important thing is to identify the different ships and add items for them, and it looks like your script is doing that really well.

Once it has completed, it may be worth trying to identify what other items for ships we may have on WD, that look as if they could have IMO identifiers, but do not as yet. Do you have a rough idea of how many items we currently have in that state? Jheald (talk) 20:33, 11 February 2018 (UTC)

Regarding the aliases, it would be easy to include "IMO 1001984", as I have this information collected. I'm working on a solution to include alternative names as aliases. When it comes to the description, I have so far not succeeded resolving the ship type, but still trying, then it would be possible to include it in the description. Adding "built in 1987" is a good idea and easy to implement in the script.

When it comes to instance of (P31), for me it seems as natural to use ship (Q11446) for all ships as to use human (Q5) for all people. Instead I miss a property called Ship type. I don't know if this has already been discussed.

My guess is that there are about 1000 ships that have WD items and have an IMO number which is not identified in WD. I think I've added 1000-1500 IMO numbers by template harvesting, scripting and manual entry the last month, and more will be added before running the script to generate as few duplicates as possible. IMO is probably the only unique parameter for ships, and by importing IMO numbers into Wikidata it will make it a lot easier to detect duplicates. --Cavernia (talk) 21:22, 11 February 2018 (UTC)

I'd add (all) names of the ships with official name (P1448). If you are unsure about the language, use "und" as code. Not sure if it's standard that Norwegian labels include brackets («»). Both points could also be addressed after an initial upload.
--- Jura 21:33, 11 February 2018 (UTC)

Very few ships have official name (P1448) set, and where it is set it mostly comes with from and to years. This is information I haven't retrieved (yet); my opinion is that this property should be entered manually, or imported automatically if complete data can be extracted from an external database. In Norway we include the brackets for ship names, mostly also with a prefix describing the ship type, like MS «Granvin», where MS means motorskip. --Cavernia (talk) 22:49, 11 February 2018 (UTC)

Regarding the property Ship type: this property was proposed at Wikidata:Property proposal/Ship type but not done, as consensus is against having any specific type property. Breg Pmt (talk) 21:44, 11 February 2018 (UTC)

It seems like this proposal should be reopened since the decision is based on a misunderstanding. There is a major difference between ship type and ship class (like the difference between occupation (P106) and family name (P734) for people). --Cavernia (talk) 22:49, 11 February 2018 (UTC)

Yes, but surely all ships in a particular ship class (Q559026) will have the same ship type (Q2235308), so just make

<ship> instance of (P31) <ship class> subclass of (P279) <ship type>

So one has, eg:

HMS Cumberland (Q1558665)instance of (P31)Type 22 frigate (Q922727), Type 22 frigate (Q922727)subclass of (P279)frigate (Q161705), frigate (Q161705)subclass of (P279)warship (Q3114762)

-- the same as we do for railway engines, aeroplanes, individual cars, etc. etc. Jheald (talk) 23:08, 11 February 2018 (UTC)

But contra this, see note below regarding vessel class (P289). Jheald (talk) 00:26, 12 February 2018 (UTC)

For extracting ship type, it looked like some information (eg "Yacht") was available on the page the IMO number is linked to. Also, it looked like there might often be a Commons category set to indicate it. Jheald (talk) 21:50, 11 February 2018 (UTC)

Yes, ship type is included in MarineTraffic, but the site doesn't allow me to harvest data by using a script. However, I found another site that allows me to do that, so now the script just has to run for some hours to complete this. --Cavernia (talk) 22:49, 11 February 2018 (UTC)

I was able to extract ship type for about 20 % of the ships with images in Commons. Better than nothing.... --Cavernia (talk) 10:05, 13 February 2018 (UTC)

Found another database which contains most of the ships, it also includes former names, home port, shipyard and class society. Do we need the latter as a WD property or does it already exist? --Cavernia (talk) 13:38, 13 February 2018 (UTC)

To comment on name question: names should be available in a way other than as aliases. Aliases are helpful as they make it searchable, but the names should be available in a structured way as well. I think official name (P1448) is preferable over name (P2561). Start and end dates can be added later, when/if known.
--- Jura 23:59, 11 February 2018 (UTC)
See also Wikidata:WikiProject_Ships/Properties, in particular Wikidata:WikiProject_Ships/Properties#Individual_ships. If there are ambiguities there, (eg your questions above), it may be worth checking in with the talk page there, then updating the page to document what you think is the best way forward.

The page recommends using vessel class (P289) rather than instance of (P31) to indicate 'vessel class'. I don't really understand why this is considered necessary or useful, but it has survived two deletion discussions, in 2013, and again in 2015. Jheald (talk) 00:24, 12 February 2018 (UTC)

At that time, before 'arbitrary access' was available in Lua, there may have been a case for keeping P289 to allow infoboxes to give a ship class (where available), and also a ship type. I don't think that should be a problem any more (pinging @Mike Peel: ?), so it may well now make sense to nominate P289 again. Jheald (talk) 00:35, 12 February 2018 (UTC)

If this was a military register the described solution would be great, but many civil ships don't belong to a specific class. Another challenge is the low number of ships in each class. For the Norwegian monitors there are 2 classes, the first contains three ships, the other only one ship. At the moment we have 2856 ship classes containing 9309 ships, mostly military vessels. Practically, ship classes means little for most users or readers, but they will understand the difference between an oil tanker, a tugboat and a passenger ship. --Cavernia (talk) 09:58, 13 February 2018 (UTC)

Agreeing with Cavernia: there is quite a difference between military vessels, which have vessel class (P289), and civil ships, which do not have classes. Breg Pmt (talk) 10:43, 13 February 2018 (UTC)

@Jheald: Since you pinged me, I thought I should reply, but personally I don't have a strong opinion here at the moment. There are points against it being in P31 - you end up having to navigate a whole tree to figure out what the Wikidata item is fundamentally about (i.e., going from knowing it's a class of something to finding out that it's a ship), and if you try to say 'type of ship: <P31 values here>' then you can sometimes end up with odd results (e.g., "type of ship: ship"). However, it is easier to include that in more general infoboxes such as the Wikimedia Commons one. Mostly I'd say that it's best to be consistent in the approach across all types of things if possible. (and BTW, migrating ship info from Commons to Wikidata is a great idea that should definitely be done!) Thanks. Mike Peel (talk) 22:55, 16 February 2018 (UTC)

Also, looking at @Cavernia's first example above, Bad Girl (Q48336558) links to commons:Category:IMO 1001192, but that also has a subcategory of commons:Category:Bad Girl (ship, 1992). In general, there can be multiple ship-name categories for each IMO (if a ship has been renamed), and that ideally needs to be reflected here in a way that means that each commons category has a sitelink. Any ideas on how to do that? Thanks. Mike Peel (talk) 23:01, 16 February 2018 (UTC)

Yes, I'm aware of this: some of the existing items link to the IMO category, some to the ship subcategory. My preference is to link to the IMO category, as there will always be only one IMO category reflecting one Wikidata item for each ship, independent of how many different names the ship has had. --Cavernia (talk) 09:02, 17 February 2018 (UTC)

Please avoid using the term « category » here, as in the Wikiverse it refers to a wiki category. Wikidata classes are not at all categories in this sense. Please read User:TomT0m/Classification for an introduction to how classes can be (and are) used in ontology projects. Imho we should just avoid creating a « ship class » property, because such classification systems just work. In fact, there is prior art on Wikidata from a few years ago, in the form of an effort to delete such specialized properties. This avoids having to take a community decision like « should we create a property for US military ship class » all the time, when the need to classify things is present in every field of knowledge. In a lot of cases it is enough for queries to find the instances of the subclasses of « ship », and to use a more specialized class as the value of the instance of (P31) statement. Finding all the ships (objects) in Wikidata is as simple as querying select * { ?ship wdt:P31/wdt:P279* wdt:Q11446 } . Simple enough for me, and applicable well outside the ship field: it generalizes easily to any vehicle without worrying too much about how the automobile guys have organized their field, provided they followed the same principles of course. author TomT0m / talk page 11:30, 17 February 2018 (UTC)
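For readers who want to run the query from a script, here is a minimal Python sketch that only builds the request URL. The endpoint address and the format=json parameter are assumptions about the public Wikidata Query Service, and build_query_url is a hypothetical helper name; the query text is the one quoted in the comment above.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# One instance of (P31) step followed by any number of subclass of (P279) steps.
SHIP_QUERY = "SELECT * { ?ship wdt:P31/wdt:P279* wdt:Q11446 }"


def build_query_url(query: str, endpoint: str = WDQS_ENDPOINT) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_query_url(SHIP_QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
print(decoded == SHIP_QUERY)
```

The result set for all ships may be large, so a real script would probably add a LIMIT clause while testing.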

@TomT0m: I'm talking about categories in Commons, not in Wikidata. In your comment and the discussion you are referring to it seems that you don't understand the difference between ship type and ship class. This confusion is why I think we need a new property for ship type. --Cavernia (talk) 09:26, 19 February 2018 (UTC)

@Cavernia: No I don’t. Ship classes are types of ships (the converse is not necessarily true). The plan set out in my classification page is to use a metaclass (« ship class ») to distinguish the ship types that are ship classes by explicitly classifying them as ship classes (

⟨ Nimitz-class aircraft carrier (Q309336)  

 ⟩ instance of (P31) ⟨ ship class (Q559026)  

 ⟩

), while also acknowledging they are a simple type of ship -

⟨ Nimitz-class aircraft carrier (Q309336)  

 ⟩ subclass of (P279) ⟨ ship ⟩

, just as there are types of everything else. This seems to me much like the difference between a car model and a more generic car type (SUV for example; there are many SUV models). author TomT0m / talk page 10:14, 19 February 2018 (UTC)

No, ship classes are not types of ship. I have explained this earlier in the thread. --Cavernia (talk) 11:34, 19 February 2018 (UTC)

But they are both groupings of ships. A ship-type is a sub-group of ship; a ship-class is a narrower sub-group of ship. The way this is expressed, or would be with any other sorts of thing, is to have

<a ship> instance of (P31) <a ship class> subclass of (P279) <a ship type> subclass of (P279) ship (Q11446)

<a ship class> instance of (P31) ship class (Q559026)

<a ship type> instance of (P31) ship type (Q2235308)

ship class (Q559026) metasubclass of (P2445) ship type (Q2235308).

If the ship is not a member of a ship class, one simply has

<a ship> instance of (P31) <a ship type> subclass of (P279) ship (Q11446)

It's easy enough for a query (or an infobox) to go up the chain, and if there is an item in the chain that is instance of (P31) ship class (Q559026) return it as the "ship class", and if there is an item in the chain that is instance of (P31) ship type (Q2235308) return it as the "ship type". Jheald (talk) 12:34, 19 February 2018 (UTC)
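The "go up the chain" idea can be sketched in Python over a small in-memory graph. This is only an illustration: the HMS Cumberland chain is the one given earlier in the thread, while the instance of (P31) values on the class and type items are assumed to follow the proposed pattern, and the function names are hypothetical.

```python
SHIP_CLASS = "Q559026"   # ship class
SHIP_TYPE = "Q2235308"   # ship type

# Toy statements. The P31 values on Q922727 and Q161705 are assumptions
# that follow the modelling proposed in the comment above.
P31 = {
    "Q1558665": ["Q922727"],   # HMS Cumberland -> Type 22 frigate
    "Q922727": [SHIP_CLASS],   # assumed: Type 22 frigate is a ship class
    "Q161705": [SHIP_TYPE],    # assumed: frigate is a ship type
}
P279 = {
    "Q922727": ["Q161705"],    # Type 22 frigate -> frigate
    "Q161705": ["Q3114762"],   # frigate -> warship
}


def classes_above(item):
    """Yield classes reached by one P31 step, then any number of P279 steps."""
    seen = set()
    stack = list(P31.get(item, []))
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        yield cls
        stack.extend(P279.get(cls, []))


def find_in_chain(item, metaclass):
    """Return the classes in the chain that are instances of the metaclass."""
    return [c for c in classes_above(item) if metaclass in P31.get(c, [])]


ship_class = find_in_chain("Q1558665", SHIP_CLASS)
ship_type = find_in_chain("Q1558665", SHIP_TYPE)
print(ship_class, ship_type)
```

A ship without a class would simply return an empty list for the ship class while still returning its ship type.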

Adding property proposal to list of proposals

Hello all, not sure if I did something wrong. I used the "create request page" link to create my property proposal. It's showing up at Wikidata:Property proposal/court but not in the actual list of proposals, and I can't seem to find any guidance on what to do. ohmyerica (talk) 02:36, 19 February 2018 (UTC)

@Ohmyerica: There is usually a large red notice to the effect of "You have not transcluded your proposal...Please do it." when you initially create your proposal--I don't know why this is not showing up. I have added it to Wikidata:Property proposal/Generic. Pasleim's bot will pick up on it around 11:15am EST and add it to Wikidata:Property proposal/Overview. Mahir256 (talk) 02:45, 19 February 2018 (UTC)

It didn't show up due to the selective deletion of several fields of the template by the creator. Sjoerd de Bruin (talk) 11:10, 19 February 2018 (UTC)

Tour not working

The first tour seems not to work. The statements tour looks fine, with a pop-up appearing.

All the best: Rich Farmbrough, 11:21, 19 February 2018 (UTC).

Thanks for noticing. I just tried the first tour, it works for me, the pop-up appears (after at least 5 seconds though). Lea Lacroix (WMDE) (talk) 16:17, 19 February 2018 (UTC)

Wikidata weekly summary #300

Here's your quick overview of what has been happening around Wikidata over the last week.

Welcome to the 300th Weekly Summary!

The weekly newsletter was started by Lydia at the very beginning of the Wikidata project, even before the first deployment, to keep the community informed about the developments, the new projects and tools. More than five years later, the newsletter is still there, its content powered by the community, and sent every week throughout the years. I wanted to say a warm "thank you!" to each person who helped fill the Weekly Summary <3

Over the past years, as you know, Wikidata has grown a lot. More data, more tools, more editors and reusers, more exciting projects led by the community. The Weekly Summary has evolved with us, and the 300th edition seems a good moment to ask you all your suggestions about the newsletter, how it could continue evolving, and how you would like to improve it.

For that purpose, you can find a feedback page to express all your ideas about the Weekly Summary. We're very interested to know more about your reading habits, the parts you're more or less interested in, the new topics you would like to share with the community. Thanks in advance for filling it in.

I remain available anytime to discuss with you; feel free to contact me if you have any questions or concerns! Cheers, Léa

A selection of cool tools on Wikidata

Here are a few tools that are recommended by some Wikidata community members. External websites, gadgets or scripts, they are very useful for Wikidata editors or users!

- The Wikidata Query Service is an infinite source of amazing data and one of the best ways to explore and use Wikidata. (TweetsFactsAndQueries)
- QuickStatements is a powerful tool that can edit or add Wikidata items en masse, via a text editor or by importing a spreadsheet. (Éder Porto via Facebook)
- Mix'n'match (manual), which helps us to interlink Wikidata with the rest of the web and the world :-) (Spinster, Siobhan via Twitter)
- WikiShootMe! allows you to see Wikidata items plotted out on a map and shows you whether they have images or not. (Ham II)
- Yair Rand's WikidataInfo script adds the QID of the equivalent Wikidata item to the page being viewed (on sister projects), along with its Wikidata label and description. (Andy Mabbett)
- Recoin measures the degree of completeness of relevant properties of a Wikidata item and suggests any relevant statements that can be added to the item. (Rachmat04)
- Template:Wikidata list ("Listeria") Self-updating lists on wiki pages, to drive projects and show results. Over 14,000 now live. (Jheald)
- DuplicateReferences gadget adds a link to copy references and add them to other statements on the same item. (PKM)
- checkConstraints gadget adds notifications on the interface to make constraint violations easy to notice and help people fix them. (Léa)
- Resolve authors lists scientific articles with the property author name string (P2093) and groups them on the basis of co-authors and topic, which helps to distinguish people referred to by identical name strings. (Daniel Mietchen)
- The Wiki Loves Monuments map is powered by Wikidata. You can look for a city and find the monuments around. (Stefano Sabatini via Facebook)

Events/Press/Blogs
- Upcoming: Wikidata Lab: How to add a lot of data, São Paulo, February 22nd
- Upcoming: #datatónCervantes, Wikidata workshop in Madrid, February 24th
- Upcoming: Wikidata workshop in Lausanne, February 24th
- Upcoming: Wikidata seminar, Oxford e-Research Centre, February 28th
- Ongoing: Fourth Annual month wide d:Wikidata:Events/Nepal#Datathon_2018
- Past: Wikidata doathon, 14-15 February 2018, Göttingen
- A Reconciliation Recipe for Wikidata by Martin Poulter
- Some ways Wikidata can improve search and discovery by Martin Poulter
- From Wikidata to Scholia: creating structured linked data to generate scholarly profiles
- Querying Wikidata about Vienna tram lines, by Stefan Daschek
- Using wikidata for linked data WordPress indexes, by Phil Barker
- Using Wikidata to build an authority list of Holocaust-era ghettos

Other Noteworthy Stuff
There are now over 100,000 ORCID iDs in Wikidata.
The usage history graph that is being linked to on property talk pages now shows usage since the end of August 2016. This used to be 50 days. Thanks Lockal!
Feedback needed: ontology for structured data on Commons

Did you know?

Development
- Fixed incomplete "Label:", "Description:" and "Statement:" entity usage messages in various places (phab:T178090). Thanks, Matěj!
- Improved violation messages for ranges involving the current date (e. g. “should not be in the future”).
- Continued work on caching constraint check results.
- Enabled Lua fine-grained usage tracking for better performance on several more wikis: hywiki, frwiki, svwiki, itwiki, zhwiki, bewiki, nlwiki, glwiki, and Wikimedia Commons (phab:T187265 phab:T186714)
- Representation and grammatical features of the form can be changed using the UI (WikibaseLexeme) (phab:T173743, phab:T160525)

You can see all open tickets related to Wikidata here.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:35, 19 February 2018 (UTC)

Usage of "member state of ..." items

We have a bunch of "member state of ..." items. Some of them are used in instance of (P31) as duplication of same info in member of (P463) (I think this is a relict of the times without member of (P463)), some of them not. I think that it is better to keep that info just in member of (P463) and instance of (P31) with "member state of ..." item is unnecessary duplication, also it is good to keep number of values in instance of (P31) as low as possible.--Jklamo (talk) 14:48, 18 February 2018 (UTC)

We don't have to get rid of those items. The (sketch of an) approach proposed in Template:Implied instances allows keeping them while not using them in instance of (P31) statements.

Another point : keeping the number of instance statements low is easily achievable by keeping the most specific class in the hierarchy and deleting values that are its parent classes. author TomT0m / talk page 15:17, 18 February 2018 (UTC)

As an example, for member states of Mercosur (Q6814224) this gives {{sparql|query={{Implied instances|Q6814224}}}} (after I added a "has quality" statement in it). The query finds Mexico (Q96) amongst others. I will work in the near future on including all the explicit instances of such classes in the query results (and explicit instances of the parent classes which have the statements defined in their child classes parent to our class of interest), and I think we could get something quite flexible. author TomT0m / talk page 15:44, 18 February 2018 (UTC)

@Jklamo: I agree with you. If a statement like

⟨ Republic of Ireland (Q27)  

 ⟩ member of (P463) ⟨ European Union (Q458)  

 ⟩

exists,

⟨ Republic of Ireland (Q27)  

 ⟩ instance of (P31) ⟨ member state of the European Union (Q185441)  

 ⟩

is redundant and useless (though of course,

⟨ Republic of Ireland (Q27)  

 ⟩ instance of (P31) ⟨ member state of the European Union (Q185441)  

 ⟩

is a valid statement). Is there any good way to prevent "member state of ..." from being used in instance of (P31) and to induce editors to use member of (P463) instead? --Okkn (talk) 15:37, 18 February 2018 (UTC)

This query might help to find redundant statements. --Pasleim (talk) 10:44, 20 February 2018 (UTC)
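The redundancy check discussed in this thread can be illustrated with a small Python sketch over hand-written triples. The mapping from a "member state of ..." class to its organization contains only the European Union pair named above; the function name and the extra sample statement are hypothetical.

```python
# Maps a "member state of ..." class to the organization it implies.
# Only the EU pair appears in the discussion; a real run would need more.
MEMBER_STATE_CLASS_TO_ORG = {"Q185441": "Q458"}


def redundant_p31(statements):
    """Return (item, class) pairs whose P31 value repeats a P463 statement.

    statements: list of (subject, property, value) triples
    """
    memberships = {(s, v) for s, p, v in statements if p == "P463"}
    return [
        (s, v)
        for s, p, v in statements
        if p == "P31"
        and v in MEMBER_STATE_CLASS_TO_ORG
        and (s, MEMBER_STATE_CLASS_TO_ORG[v]) in memberships
    ]


statements = [
    ("Q27", "P463", "Q458"),     # Republic of Ireland member of European Union
    ("Q27", "P31", "Q185441"),   # ... instance of member state of the EU
    ("Q27", "P31", "Q6256"),     # illustrative extra statement, not flagged
]

print(redundant_p31(statements))
```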

Modeling (textile) of (place)

Here's a situation I'd like feedback on: the Wikipedia articles associated with Rajshahi silk (Q7286431) and Thai silk (Q6580701) seem to logically combine three topics:

silk product (Q47469120): textile spun, plied, knitted or woven from silk fiber <made in> (place)
silk (Q37681): fine, lustrous, natural fiber produced by the larvae of various silk moths, especially the species Bombyx mori <of> (place)
sericulture (Q864650): process of silk production <in> (place)

My first thought is that these articles are mostly about the textiles, and I have tentatively modeled them as <subclass of> silk product (Q47469120): textile spun, plied, knitted or woven from silk fiber.

I am wondering if there is a better way to model these - perhaps as <subclass of> sericulture (Q864650), or even using the rarely used Wikipedia article covering multiple topics (Q21484471)?

Does anyone have thoughts on this? - PKM (talk) 22:32, 18 February 2018 (UTC)

This is a somewhat general comment as my knowledge about silk (and notably Thai silk) is rather limited. Looking at w:Thai silk, sericulture (Q864650) or even a more general "silk production in (place)" might be a good fit. That said, ideally Wikidata would have a separate item for all concepts mentioned in that article, notably "Thai silk" and its types mentioned at w:Thai_silk#Types_of_Thai_silk. Depending on the (place), the actual article might just cover these and not the entire process. So the sitelinks on these items might not necessarily be on a single item.
--- Jura 08:45, 20 February 2018 (UTC)

API to find all properties used for a page/list of pages?

For example, for lion (Q140), I'm trying to make the api output all of the properties under "Identifiers", i.e. "Encyclopedia of Life ID" and below. I wasn't able to find this easily in the documentation. — Tom.Reding (talk) 15:02, 19 February 2018 (UTC)

With the MediaWiki API you cannot query for only identifier statements. But you can query for all statements (action=wbgetclaims&entity=Q140) and then filter the output. --Pasleim (talk) 10:34, 20 February 2018 (UTC)
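A minimal Python sketch of the "fetch everything, then filter" approach. It assumes that each statement in the wbgetclaims JSON carries a mainsnak with a datatype field and that identifier properties use the external-id datatype; the helper names and the sample response are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def wbgetclaims_url(entity_id):
    """Build the request URL for all statements of one entity."""
    params = {"action": "wbgetclaims", "entity": entity_id, "format": "json"}
    return API_ENDPOINT + "?" + urlencode(params)


def identifier_claims(response):
    """Keep only the properties whose statements have the external-id datatype."""
    result = {}
    for pid, claims in response.get("claims", {}).items():
        kept = [
            c for c in claims
            if c.get("mainsnak", {}).get("datatype") == "external-id"
        ]
        if kept:
            result[pid] = kept
    return result


# Hand-written stand-in for an API response; the values are placeholders.
sample_response = {
    "claims": {
        "P830": [{"mainsnak": {"property": "P830", "datatype": "external-id"}}],
        "P31": [{"mainsnak": {"property": "P31", "datatype": "wikibase-item"}}],
    }
}

print(wbgetclaims_url("Q140"))
print(list(identifier_claims(sample_response)))
```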

Don't know if it works for your use case, but you could run a SPARQL query to get all identifier properties and then use that data as an array for filtering. --Edgars2007 (talk) 10:42, 20 February 2018 (UTC)
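One way this could look, as a hedged Python sketch: the query text assumes identifier properties are those with the wikibase:ExternalId property type, the result handling assumes the standard SPARQL JSON results layout, and the helper names and sample data are hypothetical. The query is only stored as a string here, not executed.

```python
# Assumed query for all properties with the external identifier type.
IDENTIFIER_PROPERTIES_QUERY = """
SELECT ?property WHERE {
  ?property wikibase:propertyType wikibase:ExternalId .
}
"""

ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def property_ids(sparql_json):
    """Extract property IDs such as 'P830' from SPARQL JSON results."""
    ids = set()
    for row in sparql_json["results"]["bindings"]:
        uri = row["property"]["value"]
        if uri.startswith(ENTITY_PREFIX):
            ids.add(uri[len(ENTITY_PREFIX):])
    return ids


def keep_identifiers(used_properties, identifier_ids):
    """Return the used properties that are identifier properties, in order."""
    return [pid for pid in used_properties if pid in identifier_ids]


# Hand-written stand-in for the query service response.
sample_results = {
    "head": {"vars": ["property"]},
    "results": {"bindings": [
        {"property": {"type": "uri", "value": ENTITY_PREFIX + "P830"}},
    ]},
}

ids = property_ids(sample_results)
identifiers_on_item = keep_identifiers(["P31", "P830"], ids)
print(identifiers_on_item)
```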

A couple of queries

Hi, I have been editing Wikidata for the past three to four months. I have a couple of queries at this point:

While giving input for instance of (P31), I want to know how broad the scope should be. For instance, let's take the example of 2/15th Battalion (Q4597007): should I mention it as a military unit (Q176799) or battalion (Q6382533) or military unit (Q176799)?
What can be considered a valid reference for an image statement? Or can it be considered common knowledge?

--Krishna Chaitanya Velaga (talk) 07:43, 20 February 2018 (UTC)

For the second - I would say it is common knowledge (in most cases). For the first - hmm, it depends :) In most cases, use the most precise one. As you can see, battalion (Q6382533) has subclass of (P279)=military unit (Q176799). But this doesn't work for humans. We don't put celebrity (Q211236) as P31 value, although it has P279=human. --Edgars2007 (talk) 08:16, 20 February 2018 (UTC)
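The "most precise one" advice can be illustrated with a small Python sketch. It assumes the subclass of (P279) links are available as a plain dictionary; the map below is a hand-written toy containing only the single link mentioned above, and `most_specific` is a hypothetical helper, not an existing tool. It does not cover the exception for humans.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from cls via subclass-of links (cls itself excluded)."""
    seen, stack = set(), list(subclass_of.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def most_specific(candidates, subclass_of):
    """Drop every candidate that is already implied by a more precise candidate."""
    implied = set()
    for candidate in candidates:
        implied |= superclasses(candidate, subclass_of)
    return [c for c in candidates if c not in implied]


# Toy P279 map: battalion (Q6382533) -> military unit (Q176799).
subclass_of = {"Q6382533": ["Q176799"]}

print(most_specific(["Q176799", "Q6382533"], subclass_of))
```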

Epidemic

Hoi, is there a good example of an item for an epidemic outbreak? The impact they have can be major for developments (think the Black Death in the Middle Ages, or Zika, or AIDS). What I am seeking is not only that it is an outbreak (of what), but also a start and end date and where, the number of casualties and the percentage that survived. Thanks, GerardM (talk) 09:58, 20 February 2018 (UTC)

Page vs page number

There has been a split of page (Q1069725) into 'page number' ('pagina') and 'page' (as one side of a piece of paper; page (Q49138218)). I'm not sure that all sitelinks are correctly linked to proper items, so I would appreciate some help, especially with non-Latin languages. Wostr (talk) 13:55, 20 February 2018 (UTC)

Working in English, I see that Wikidata pagination (Q783209) is linked to Wikipedia en:Pagination. Wikidata page number (Q11325816) is linked to Wikipedia en:Page numbering. Wikidata page (Q1069725) is not linked to any English Wikipedia article. The English Wikipedia does not have a separate article for "Page number", instead it has a redirect from "Page number" to "Page numbering". I'm not sure what the best way is to reflect this situation in Wikidata.

As for shades of meaning, page numbering could refer to the physical process of applying page numbers, or a system of page numbering (for example, starting at 1 and going to the last page in the book, vs giving the first page of chapter 1 the number 1-1, the first page of chapter 2 the number 2-1, etc.). "Page number" in contrast means the number that has been placed on a particular page. Jc3s5h (talk) 15:25, 20 February 2018 (UTC)

Cleaning up our upper class tree

Starting point of this discussion :

Or in a textual form :

Classification
name	description
Entity	something that exists
Object	technical term in modern philosophy often used in contrast to the term subject
Abstract object	object with no physical referents
Concept	mental representation or an abstract object or an ability
Mathematical concept	abstract entity in mathematics
Mathematical object	abstract object in mathematics
Class	collection of sets in mathematics that can be defined based on a property of its members
Metaclass	in knowledge representation, a class of classes
Type	kind or variety of something
Group	summarizes entities with similar characteristics together

We learn that a group, say the group of people that will talk about this subject, is « an object with no physical referents ». Which is absurd. Assuming good faith, nobody wanted that in the first place. This is the result of a complex series of edits and merges by different people.

But it’s a big problem that we let that happen. Let’s try to explain and dig a little bit. First attempt to explain :

Items such as group (Q16887380) lack a proper external reference and are edited, merged and so on a lot, by a lot of contributors, some of whom have been blocked for toying around. Everybody seems to edit them as he or she wants, in good faith or, for some of them, probably to highlight this problem, without actually starting a discussion to settle the problem with everybody. This creates a big mess. Nobody can actually tell what this item is supposed to mean beyond its label.

I see several problems here :

this item is linked to a Wikipedia article. So far so good. But … this article may lack the precision in defining the concept of group needed to avoid the mess on top of our class tree. In this case, it’s a « simple » article in English that relies on « common sense » to define a group as a physical object whose parts are also physical objects. There is no way this definition fits any kind of abstract object. There may exist groups of abstract objects, however; there is for example a mathematical theory named « group theory » (group (Q83478)), but the relationship between it and a physical group, if any, is rather abstract …
As a consequence, it may not be a good idea to use the Wikipedia article as the main reference for the definition of the « group » concept. But creating more precise items or items with different definitions of « group » puts them at risk of being merged back into this one, or trolled. Back to step one and the mess. For example Fractaler trolled the concept of group to assimilate it to the concept of « class ».
We could decide that the statements about that item give some sort of definition of those items. For example, we could decide that one of the items labelled « group » is about groups of physical objects, by putting in it statements suggesting that groups are physical objects themselves, leading to statements like
⟨ group ⟩ subclass of (P279) ⟨ physical objects ⟩
and others as we extend Wikidata's language, like defining that a group's parts are physical themselves. I tried this approach, but it did not work, as it seems Wikidatians removed some statements, added others, or merged items as they felt right, with the result that nobody knows what the initial intention was, including myself :/

But we actually need different items for different kinds of group. For example, for some reason we have at this point the statement

⟨ sequence (Q20937557) ⟩ subclass of (P279) ⟨ group (Q16887380) ⟩

A series is supposed to be a group. By the current definition this entails that a specific series, say the « Friends » TV show, is a group of physical objects. This is obviously wrong. If a « group » class is to be a superclass of « series », then it should not be the item for groups of physical objects but one for groups of something more abstract. There seems to be a literature about the nature of audiovisual artworks: a random Google search on « film ontology » finds bibliographies on the topic which point to whole books.

To avoid these problems, I think we should study some top-level ontologies and see what we can do with them, so that we have a way to link the definitions of our top-level classes to external and precise definitions that cannot be merged with items with a different definition. By Wikidata’s nature however, I think we cannot commit to one specific upper ontology (Q3882785). There is a neutrality of point of view that must allow us to represent and use different ontologies in Wikidata. This entails, in my opinion, that we have to use higher-level classification tools like metaclass (Q19478619) to classify the different classes themselves and to keep track of which item uses definitions from which ontology.

Following the problem of the diversity of definitions, there is also a potential problem with sitelinks, as Wikipedia has philosophical articles about the history of those concepts, the different ways we can define them and so on. This entails that the philosophical view of ontological realism, in very short the point of view that an ontology is the description of objects of the real world, is hard to apply. Instead, for some concepts in Wikidata, are we committed to take a position of ontological idealism (Q33442)? For example, if we try to describe the concept of luminiferous aether (Q208702), must we take the view that what is real in this concept is the theory our ancestors had in their mind, and not reality itself? The « universality » (the sum of all knowledge) of our projects does not make the task easy. Maybe it’s possible to just deprecate the statements about out-of-fashion or disproved concepts?

One source of the mixing up of our upper class tree, I think, is the confusion between the class concept used in Wikidata and concepts coming from topics such as type theory (Q1056428), a mathematical theory close to set theory (Q12482) that defines some mathematical concepts of « classes » and « types ». Not to mention type system (Q865760) in programming, which can be viewed as applied type theory. Wikidata has the power to give its description of « type theory » or « class » in those domains. We’re also using our own concept of « class », maybe in a more loosely defined way than in mathematical axiomatic theories like Von Neumann–Bernays–Gödel set theory (Q278770). The problem arises when we mix up our class concept with some mathematical concept of class or type. Our concept of class then becomes a subclass of « mathematical object », which messes up our tree a little more … I think this is a big variation of the problem of use–mention distinction (Q2577553) (use/mention). We’re confusing our usage of the concept with our description of these concepts, which creates a kind of self-reference (Q1129622) loop. I don’t think we want that, as this creates a lot of mess and a lot of confusion. That’s why I created items like « Wikidata class » (before being aware that someone had conceptualized the use–mention problem, which is a big help to be taken seriously :) ). They seem to have disappeared (merged?) over time as some contributors did not understand the problem and purpose.

For programmers, I think some concepts from programming-language type systems, like « Generics in Java (Q379273) » or other related concepts of generic programming (Q1051282) in other programming languages like type class (Q1375130) or template (Q1411845), which have proven useful in programming, are of interest to us for solving the problem of the « series » item above, if we find the right inspiration. They allow defining classes of classes, some kind of metaclasses in a loose sense. This should give a hint that our abstract « series » item, which could hypothetically represent a series of abstract and concrete objects, may exist and be useful but should be kept out of the superclasses of « TV series ». « TV series » may be an instance of it, not a subclass. A hint that we’re on such a road is the use of the « of » qualifier.

I think we should not conflate the concepts used for type systems in computing, datatypes, with our own (informal) type system concept to avoid this. I think we should find ways to reflect the complexity of the different views on the world while efficiently reflecting the world without worrying too much about those difficulties, and I hope this text will be of some help for this purpose, and is clear enough. Please feel free to ask anything and share your thoughts.

WikiProject Ontology has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. (could have written that there but I think this should be more visible.) author TomT0m / talk page 15:32, 31 January 2018 (UTC)

I don't think it's as bad a situation as you are conveying here. It sounds like "group" may need some cleanup. But for example, the class list you provide above doesn't seem to match what I see directly in wikidata: "series" is a subclass of "group", which is a subclass of "type", which subclasses "entity". That seems relatively simple, and while I am not certain if the "group"/"type" relation is the best, it makes some sense. Everything with abstract entities (just look at the issues with books & editions etc.) is hard to think about, so I generally encourage people to work on the areas of our ontology that are closest to physical reality, where things are easier. What I think is more concerning is the overuse of instance of (P31) for abstract concepts, when subclass of (P279) is the better relation. ArthurPSmith (talk) 17:01, 31 January 2018 (UTC)

I’m sure « group » is not a subclass of type. Say

⟨ sheep herd ⟩ subclass of (P279) ⟨ herd ⟩

⟨ herd ⟩ subclass of (P279) ⟨ group ⟩

and

⟨ Bob’s herd ⟩ instance of (P31) ⟨ sheep herd ⟩

Then if we also have group subclass of type, « Bob’s herd » is a type. But it’s not, it’s a herd. The type there is « sheep herd », as there are many examples of sheep herds. If we take « http://dbpedia.org/ontology/Group » as the DBpedia concept (as in an informal group of people), we get that it is « An Entity of Type : Class », that is Group rdf:type owl:Class. If we loosely take « class » as a synonym of « type », this means that group is an instance of type, definitely not a subclass (the relationship would be https://www.infowebml.ws/rdf-owl/subClassOf.htm ). Sometimes the use of instance of (P31) for abstract concepts is legitimate, for metaclassification (see User:TomT0m/Classification or

On the other hand a query such as

select distinct ?class  { 
  [] wdt:P31/wdt:P31/wdt:P31+ ?class .
} limit 20


, which searches for classes that have instances that have instances, and so on for at least 3 levels, returns concept (Q151885)  

class (Q5127848)  

metaclass (Q19478619)  

formal ontology concept (Q19868531)  

philosophical concept (Q33104279)  

administrative territorial entity type (Q15617994)  

Q22302160  

second-order class (Q24017414)  

third-order class (Q24017465)  

fourth-order class (Q24027474)  

software category (Q28530532)  

metaclass (Q19361238)  

type of fruit (Q28149961)  

form of government (Q1307214)  

term (Q1969448)  

product lining (Q3084961)  

classification scheme (Q5962346)  

economic concept (Q29028649)  

triad (Q29430681)  

system (Q58778)  

It takes a long time to compute. That may indicate that the query is hard to compute, or that there are not a lot of results, maybe a bit of both (it times out if we want more results). Nothing really scary, even if there is some dubious stuff. author TomT0m / talk page 19:00, 31 January 2018 (UTC)
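To make the shape of that query concrete, here is a small Python sketch of the same idea on a toy set of instance of (P31) statements. The item names are invented for illustration and `third_level_classes` is a hypothetical function; it only mirrors what the path `wdt:P31/wdt:P31/wdt:P31+` asks for, namely classes reached by three or more consecutive P31 links, and says nothing about how the query service evaluates it.

```python
from collections import defaultdict


def third_level_classes(instance_of_pairs):
    """Classes reachable from some item through at least three P31 links."""
    graph = defaultdict(set)
    for item, cls in instance_of_pairs:
        graph[item].add(cls)

    def step(nodes):
        # Follow exactly one P31 link from every node in the set.
        return {cls for node in nodes for cls in graph.get(node, ())}

    # Nodes reached after exactly two links from any starting item.
    frontier = step(step(set(graph)))
    result = set()
    # One or more further links (the "+" in the property path).
    while frontier:
        new = step(frontier) - result
        result |= new
        frontier = new
    return result


# Invented toy statements: (item, class) pairs standing for "item P31 class".
toy_statements = [
    ("Felix", "cat"),
    ("cat", "taxon"),
    ("taxon", "second-order class"),
    ("Paris", "city"),
]

print(third_level_classes(toy_statements))
```

On real data the first two hops already touch a very large part of the P31 table, which fits the long running times reported above.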

instances of instances of instances

Here's a version of User:TomT0m's classes query with some sampled chains included, to make them easier to assess. The query is very big, because P31 is one of the most used properties there is, and we're asking for its instances table to be intersected with its instances table, and the results then again with its instances table. Even though the engine efficiently pipelines that, starts working on the second and third stages while the first stage is still running, and quits once it's found enough instances, that's still a huge request. ~~Any clever ideas as to how to streamline the query very welcome.~~ Query seems to work well enough now. Slightly tweaked to include the count of distinct values of ?x for each ?class Jheald (talk) 02:27, 1 February 2018 (UTC)

SELECT ?n ?x ?xLabel ?c1 ?c1Label ?c2 ?c2Label ?class ?classLabel 
WITH  {
   SELECT ?x ?c1 ?c2 ?class WHERE { 
       ?x wdt:P31 ?c1 .
       ?c1 wdt:P31 ?c2 .
       ?c2 wdt:P31 ?class .
   } LIMIT 40000
} AS %classes
WHERE {
  {
    SELECT (COUNT(DISTINCT (?x)) AS ?n) (MIN(?x) AS ?x) ?class WHERE {
       INCLUDE %classes 
    } GROUP BY ?class 
  }
  INCLUDE %classes .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } .
} ORDER BY DESC(?n) ?x ?c1 ?c2 ?class


-- Jheald (talk) 20:01, 31 January 2018 (UTC)

Note this is quite similar to the metaclass lists I have added in the Problems area of our Ontology project: Wikidata:WikiProject Ontology/Problems/3rd order metaclasses by subclass for example. While some of these are legitimate, many clearly should not be high-order metaclasses. ArthurPSmith (talk) 20:04, 31 January 2018 (UTC)

Clearly there are some oddities, and I think User:ArthurPSmith is quite right that, as a rule of thumb, one should ask whether subclass of (P279) works instead of instance of (P31) for relations between abstract entities, and prefer it if it does make sense. Abstract items as a rule are quite likely to be classes, because one can very often expand an abstraction into a group of narrower abstractions by adding some further distinguishing characteristic. Jheald (talk) 20:21, 31 January 2018 (UTC)

The example of group of living things (Q16334298) just below proves that this rule of thumb may be responsible for some oddities. I’d prefer to point people to material about the « type–token distinction » as a rule of thumb, because it’s clear which objects are eligible for this rule and which are not. A more robust rule of thumb is « take an example of a concrete object or event that is an instance of one of the classes. Is it also an instance of the second one ? ». If one can’t find a concrete example of a concrete token of a class, then one should seek help on the ontology project and do nothing; that’s a rare case :) author TomT0m / talk page 20:34, 31 January 2018 (UTC)

@TomT0m: The "A instance of (P31) B and B subclass of (P279) C requires that A instance of (P31) C" rule is of course fundamental. But what I was meaning, following up the comment of Arthur, is that if B is an abstract thing, then it very often will be a class (because narrower abstractions within it can be devised), and so probably ought to be a subclass of something. Therefore IMO it probably makes sense to start by seeing whether B subclass of (P279) C makes sense, and then consider different possible entities A (concrete if possible), and ask whether they disprove the proposition. I think trying that probably generally makes a better starting point than starting from B instance of (P31) C as an initial hypothesis, and considering what would be its consequences. Jheald (talk) 21:34, 31 January 2018 (UTC)
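The fundamental rule can be checked mechanically. The following Python sketch uses the sheep herd example from earlier in this thread as toy data; `inferred_classes` is a hypothetical helper, and the statement « group subclass of type » is added only at the end to show the consequence TomT0m objects to.

```python
def inferred_classes(item, instance_of, subclass_of):
    """Every class the item is an instance of once P279 links are followed upward."""
    result = set()
    stack = list(instance_of.get(item, ()))
    while stack:
        cls = stack.pop()
        if cls not in result:
            result.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return result


# Toy statements taken from the herd example above.
instance_of = {"Bob's herd": ["sheep herd"]}
subclass_of = {"sheep herd": ["herd"], "herd": ["group"]}

print(sorted(inferred_classes("Bob's herd", instance_of, subclass_of)))

# Adding the disputed statement makes Bob's herd an instance of « type » too.
with_disputed = dict(subclass_of, group=["type"])
print("type" in inferred_classes("Bob's herd", instance_of, with_disputed))
```

Trying a concrete A against a proposed B subclass of (P279) C in this way is one mechanical form of the test both rules of thumb describe.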

Amongst which we can find group of living things (Q16334298), probably because the « group » class is (was) a subclass of type or class. I solved this. We should think about the status of abstract groups like groups of films or TV series, by the way … my point of view is that an abstract artwork, not a painting for example, is a class of experiences. It’s a virtual thing in the Deleuze sense ( « This virtual is a kind of potentiality that becomes fulfilled in the actual. », see https://en.wikipedia.org/wiki/Virtuality_(philosophy) or http://www.oxfordreference.com/view/10.1093/oi/authority.20110803095349177 ) A movie is a potential actual Saturday night experience, a video game is instantiated every time someone plays it. That makes a video game a subclass of experience ( qualia (Q282250) ? « Conscious experience » ? I don’t know) author TomT0m / talk page 20:25, 31 January 2018 (UTC)

@TomT0m: I don't think I agree. There are things which are facet of (P1269) a videogame that are a subclass of experience; but the totality of a videogame I do not think is well described that way. (On the other hand, we probably get there by saying that a videogame subclass of (P279) game and game subclass of (P279) experience, which (curiously) I think I would be happier about -- but that might be a slightly different sense of the word experience.)

That meaning of experience is probably not a subclass of qualia -- to me qualia is quite a narrow technical term, and should be reserved to identify more point-in-time sense-perceptions like taste, the quality of what is seen, and so forth. Jheald (talk) 21:49, 31 January 2018 (UTC)

@Jheald: facet of (P1269) is quite vague. I would not be happy to build something on such a vague definition that is not, as far as I know, used in any external ontology; this is sloppy. :) Take a 3D model of a character. I think it defines a class of experiences as well: what you experience when you watch the model rendered on your screen. Then it definitely makes sense to say

⟨ Gordon Freeman 3D model ⟩ part of (P361) ⟨ Half life ⟩

, meaning the experience of seeing Gordon Freeman is part of the experience of Half Life. I like to read Functional Requirements for Bibliographic Records (Q16388) with that in mind :) (there is a relationship between what they call « item », the DVD of the game you bought, and the abstract work, but which one ?) The view of « video game » as a subclass of game is also interesting: you can see that the logical rules that govern the game are analogous to the code of the video game, « Rule is Law » ~~Lawrence Lessig~~. Also, if a video game is a class of concrete experiences, or if all soccer games are instances of « soccer », then the ability to subclass « soccer » with classes like « 1998 World Cup football games » creates the need for a metaclass to identify precisely which classes are games like football. I’m reading the book « Building Ontologies with Basic Formal Ontology » atm, and I see there that « football » seems to be what Barry Smith and his coauthors call a « universal », whereas « 1998 World Cup football games » is a « defined class », a very enriching and practical criterion. Having a metaclass for games like « soccer » would be a very practical way to differentiate those kinds of classes. I now see that « work » is probably a subclass of universals. We could label each video game a work, but we could not label « 1998 World Cup football games » a « work » (there seem to be videos and slides about BFO on the web which explain universals in BFO, http://ncorwiki.buffalo.edu/index.php/Tutorial_on_Basic_Formal_Ontology video and slide 1).

OK for qualia, just an attempt. author TomT0m / talk page 22:46, 31 January 2018 (UTC)

On Jheald's query: the query times out every time here, except if we set a very low limit such as « 100 » on the subquery. Found something interesting however, see the « concept » subsection. I don’t really think it’s useful to get the full path: once we get the class it becomes easy to find a representative path, and breaking it at the right place potentially breaks many others with the same destination. I can’t find anything about the « with / include » construction in SPARQL; is this a Blazegraph idiom to ease subquery writing ? A link ? author TomT0m / talk page 20:52, 31 January 2018 (UTC)

@TomT0m: Yes, the named subquery syntax is an extension to the approved SPARQL standard, a very useful one that has been implemented by multiple separate vendors. Blazegraph's page on it can be found here. Jheald (talk) 21:56, 31 January 2018 (UTC)

@TomT0m: Just re-ran it and it worked for me, finding 37 cases in 11.3 seconds. Make sure you're using the version with the named sub-query and the limit increased to 40,000 -- this is much more efficient than the one I originally posted. Jheald (talk) 21:19, 31 January 2018 (UTC)

Update I added a column to the query to add a count of the instances ?x for each main ?class at the other end of the tree, and it's really helpful - it puts the results into a much sharper focus. Almost all of the instances belong to the first 4 cases of ?class: second-order class (Q24017414) (29,026); concept (Q151885) (5273); musical term (Q20202269) (4589); and metaclass (Q19361238) (700). There are oddities within these (is a fable (Q693) really an example of a stylistic device (Q182545) ?) but these are top-level classes that do make sense in a list of this kind.

Beyond this, each case accounts for only a handful of instances. Worryingly, a number of these seem to be due to vandalism, eg Thirteenth Amendment to the United States Constitution (Q175613) instance of (P31) Indiana (Q1415), Araceli Gilbert Bolognesi (Q4783519) instance of (P31) love (Q316), CPU-Z (Q1024234) instance of (P31) Azerbaijan (Q227).

apple (Q89) -- I think this should be changed to Reanda (Q1000959) subclass of (P279) apple (Q89) subclass of (P279) pome (Q41274), rather than P31s.

Now done, for all cultivars of apple. Would be good if somebody knowledgeable could sort out the relationship between pome (Q41274) and fruit of Maloideae (Q145150) Jheald (talk) 10:46, 1 February 2018 (UTC)

radio communications (Q872) -- doesn't seem quite right; curious if this is the only class that is P31 industry (Q268592) P31 classification scheme (Q5962346) / economic concept (Q29028649)
ethanol (Q153) and water (Q283) -- I am sure this has been discussed, but I also know I haven't followed those discussions. Should these be subclass of (P279) chemical compound (Q11173) ? Should deuterated ethanol (Q1101193) be subclass of (P279) ethanol (Q153) ?

hydrogenated water (Q11549076) and deuterated ethanol (Q1101193) both changed to be subclass of, per ArthurPSmith below. Jheald (talk) 12:15, 2 February 2018 (UTC)

leap day (Q3852142) -- should this be changed to subclass of (P279) day (Q573) ? Should February 29 (Q2364) be subclass of (P279) leap day (Q3852142) ?

Done. Jheald (talk) 12:15, 2 February 2018 (UTC)

?x instance of (P31) ?human instance of (P31) human (Q5) is a curious one. 44 instances to clean up: tinyurl.com/yc9bz4wy.

Dealt with. Just The Hardbitten Heretic (Q19560964) remaining now, instance of (P31) anonymous (Q4233718) is probably not quite right, but not sure what it should be. Jheald (talk) 12:05, 2 February 2018 (UTC)

Similarly for country: tinyurl.com/y8t59m3b. 13 examples.

Dealt with. Mostly by User:Oravrattas (thanks!). But looks to be a favourite recurrent target for stupid edits. Jheald (talk) 12:05, 2 February 2018 (UTC)

(to be continued) Jheald (talk) 10:08, 1 February 2018 (UTC)

@Jheald: Nice work. I agree with all of your suggestions above except I would note that WikiProject Chemistry has been discussing "chemical compound" a bit and working on a new way to model chemical species, which is definitely not settled yet. But deuterated ethanol (Q1101193) definitely should still be a subclass of ethanol (Q153) (for example). ArthurPSmith (talk) 14:57, 1 February 2018 (UTC)

A simpler query to find instances of instances of physical objects, generalizing the idea a bit, gives a few hints: https://query.wikidata.org/#%0Aselect%20%3Fitem%20%7B%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ223557%20.%0A%7Dlimit%20100%20 universities or colleges that are instances of other universities, for example :) a mistake, a faulty import or a misuse as « part of ». author TomT0m / talk page 20:29, 1 February 2018 (UTC)
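The query is easier to read once the percent-encoding in that link is undone. A short Python sketch using only the standard library (the variable names are arbitrary):

```python
from urllib.parse import unquote, urlsplit

url = (
    "https://query.wikidata.org/#%0Aselect%20%3Fitem%20%7B%0A%20%20%3Fitem%20"
    "wdt%3AP31%2Fwdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ223557%20.%0A%7Dlimit%20100%20"
)

# The SPARQL text is carried in the URL fragment (everything after '#').
query = unquote(urlsplit(url).fragment).strip()
print(query)
```

The decoded path `wdt:P31/wdt:P31/wdt:P279*` reads as two instance of hops followed by zero or more subclass of hops ending at Q223557.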

@TomT0m: A bit less simple, but I think this variant gives quite useful perspective: tinyurl.com/y7t8zyrt. Some of these should definitely be cleaned up, and changed from P31s to P279s. Jheald (talk) 20:58, 1 February 2018 (UTC)

@TomT0m: Actually, more useful is probably this tinyurl.com/ybkpsoxb, counting on ?c1 instead of ?class. Jheald (talk) 21:06, 1 February 2018 (UTC)

Ontological status of « Concept » and « death »

Related to the introductory text, it seems that « death » is an instance of « concept », and that death of Caylee Anthony (Q1056362) is an instance of « death ». It makes sense if we consider that the article is about the event of the death of someone; it makes less sense if this is about a case, as the frwiki article is entitled (« Casey Anthony’s case »). Quite a common problem, though no big deal. The statements about death (Q4) are way more interesting. It’s both

an instance of « concept » (this seems like an example of « ontological idealism », we describe concepts that exist in our heads), but in the end this does not seem a really informative statement: if everything exists in our heads, basically anything we can imagine is a concept
an instance of property (Q937228) (this one puzzles me) and
… an instance of event just a few moments ago https://www.wikidata.org/w/index.php?title=Q4&diff=625305390&oldid=623107991 - it’s the easiest to deal with: if it’s an event it is clearly a subclass of events, as there are many concrete events of death.
a subclass of state, and a subclass of event, and a subclass of process
a subclass of Q23956356, an item with unclear status where, looking at the history, it seems the usual suspects have been toying around

It’s not really surprising, as there seem to be many definitions of death ( http://www.europsy.org/ceemi/defmort.html in French for example). Enwiki also introduces the topic : https://en.wikipedia.org/wiki/Death#Problems_of_definition I think that we should at least have items for the moment of death, which is more like the transition between the « living state » and the « dead state », and for the state of a dead body (cadaver (Q48422), for which we have an item). We even have article sections for https://en.wikipedia.org/wiki/Decomposition#Animal_decomposition decomposition of bodies, death is fascinating it seems …

Thoughts ? author TomT0m / talk page 21:23, 31 January 2018 (UTC)

instances of « Term »

advanced emotion (Q16748888), complex emotion, is an instance of term. And also a part of « theory of emotions » (and « love » is an instance of it). I remember encountering those cases a lot in the beginning of Wikidata; it seems that there were some « idealists » back then. This makes sense if « complex emotion » is considered a conceptual entity that we arbitrarily choose, and if the goal of Wikidata is to describe how we think about it. From a « realist » perspective, « love » is something in the real world and this item is a description of it. The goal of science is to understand these real things by building the best possible descriptions of them. So I’d tend to think the item about love should embrace these descriptions and describe love, and not that our « love » item refers to a concept or term that science uses to model love … This would mean that the only instances of terms are the items about words … for example about the lexical entity « love », a task for Wiktionary.

Also interesting but misguided is the fact that there is a statement « complex emotion » part of « theory of emotions ». This seems like an idealist perspective as well : this is a term that is used by theorists to describe emotions, amongst other terms in emotion theory … I’d tend to think the right property is something like « study of », to link the theory or field that describes the real-world objects in question.

Thoughts ? author TomT0m / talk page 21:57, 31 January 2018 (UTC)

There is a similar issue with Latin phrases such as sine loco (Q11254169). Should these be <instance of> term, <language> Latin, or rather the associated meaning? - PKM (talk) 00:07, 1 February 2018 (UTC)

Objects

I think in a number of cases items are set as <subclass of> object (Q488383): anything that may be observed or acted upon by a subject when they should be <subclass of> concrete object (Q17553950): a particular or specific instance of an entity. To describe tangible or physical objects use Q223557. I mostly see this down the hierarchy (clothing items should be concrete object (Q17553950), no?). But I admit that the higher up the hierarchy we get, the more uncertain I am of my grasp of ontological first principles. - PKM (talk) 00:41, 1 February 2018 (UTC)

Further: Here are comparative hierarchies for "clothing".

Wikidata: entity > object (philosophy) > abstract object > concept > result > goods > product > clothing
Getty AAT: object > furnishings & equipment (hierarchy name) > costume (hierarchy name) > costume (mode of fashion) > clothing

Frankly, the Wikidata hierarchy makes no sense to me whatsoever (aside from the fact that not all clothing items are products). AAT is useful but not definitive - their hierarchies do not always imply that an item is a subclass of its parent; often the relationship can be better modeled as <facet of>. My personal preference would be something like object > physical object > clothing <facet of> costume. Perhaps there's a class between physical object and clothing, to correspond with AAT's "furnishings & equipment", but I wouldn't know what to call it (furnishings (Q31807746) is different, parallel to clothing in the hierarchy). - PKM (talk) 01:02, 1 February 2018 (UTC)

@PKM: Which item precisely do you denote as « clothing » ? There may be many aspects of this notion that are actually covered by items. The « AAT tree » seems to be topical to me, in the sense that it describes a way to classify topics of interest and their « subtopic » relationships, not real-world objects as instance of (P31) does. Compared to Wikipedia, it seems more like a « portal inclusion » representation, or a « parent category » relationship, than a class hierarchy. I’d personally hardly be interested in this, but if we have something like that one day it definitely should not be represented with subclass of (P279), which is not intended to be a generic hierarchical property. Unsurprisingly, looking at Getty AAT http://www.getty.edu/research/tools/vocabularies/aat/, it’s not an ontology but a thesaurus … those are imho in the domain of Wiktionary, not Wikidata. The goal of a thesaurus is to represent the terminology used to describe a domain, not to represent the domain itself. I guess this is an example of « ontological idealism ». The approach of Wikidata is to describe the objects of the domains directly through our items’ statements (aka. ontological realism), rather than to describe the terminology used by experts to represent a domain. If however we have a structured Wiktionary (and there are thesauri in Wiktionary) with a structured thesaurus, it will be possible to link the description of the terms to the description of the real-world objects they are supposed to represent that Wikidata holds … author TomT0m / talk page 15:42, 1 February 2018 (UTC)

We certainly have some items that are <instance of> some subclass of clothing - mostly museum objects - but in general we have a massive class tree of types of clothing (shirt, dress, trousers, kimono) in a structure informed by the AAT and the Europeana Fashion Vocabulary (Q29016777). If structured vocabularies are not appropriate sources for building hierarchies for the objects that make up material culture (and which are heavily represented in Wikipedias), then I can't imagine what is. We're not merely defining terms - clothing items are associated with ceremonial activities, cultures, and time periods; are made of materials using methods and processes; can be named after persons or places, invented by individual designers, and are depicted in works of art. Our current class tree and item set are far from perfect or finished, but we have a WikiProject and an approach. - PKM (talk) 20:04, 1 February 2018 (UTC)

@PKM: This seems like a good start, but there seem to be problems with your class tree related to my points above. For example, while a lot of items are about types of clothes, the tree also contains items like wasp waist (Q1283782), which is not really about the clothes that make the style possible, but about the style itself. This item should, from an ontological perspective rather than a nomenclatural one, be classified as a « silhouette » type, in a subclass tree different from the clothes class tree. There may be a relationship between the silhouette, or the clothing style, and the types of clothes that someone wears to achieve this style, but this is probably a candidate for a property creation, something like « style allowed by cloth », or

⟨ wasp waist (Q1283782) ⟩ produced by (P2849) ⟨ https://en.wikipedia.org/wiki/Corset ⟩

to paraphrase the Wikipedia article. It’s probably a good idea to ping WikiProject Ontology for a review of this approach before starting the work, to ensure the approach is consistent across the whole project … Starting from a thesaurus probably needs a bit of processing to convert it into a consistent ontological perspective across Wikidata. And there are whole ontologies dedicated to consistency between ontologies, see upper ontology (Q3882785), so ontologists take that seriously :) Actually this is part of the whole purpose of having ontologies in the first place. This is the reason we can know a class tree is incorrect and know what to do to clean it :) start from well-defined concepts, what a real taxonomy is, … so I’m happy to have started this discussion and hope we can cooperate in a constructive manner :) author TomT0m / talk page 20:53, 1 February 2018 (UTC)

@TomT0m:Yes, you are absolutely right about "silhouette" (and princess line (Q10638846) belongs there). There is also cut (Q11626671): style or shape of a garment, and the way its structure hangs on the body which may be the same concept. (And making "cut" a subclass of costume component was probably wrong, but at least it's findable.) I'd love help from the Ontology project on costume and fashion. - PKM (talk) 00:51, 2 February 2018 (UTC)

I've started a conversation about making changes to this ontology here - comments encouraged. - PKM (talk) 20:28, 17 February 2018 (UTC)

Index mineral

Top of the charts tinyurl.com/ybkpsoxb for TomT0m's latest query is index mineral (Q12409135). Now it definitely isn't instance of (P31) geographic region (Q82794). But it would be nice to be able to say that it characterises a geographic region (Q82794). I also think that saying it is part of (P361) mining (Q44497) is a bit questionable. To me the relationship ought to be something more like field of application mining (Q44497).

Any suggestions? Jheald (talk) 21:38, 1 February 2018 (UTC)

@Jheald: A Google Translate rendering of an article on one of its instances is at https://translate.google.com/translate?sl=auto&tl=fr&js=y&prev=_t&hl=fr&ie=UTF-8&u=https%3A%2F%2Ffa.wikipedia.org%2Fwiki%2F%25D8%25A7%25D9%2586%25D8%25AF%25DB%258C%25D8%25B3_%25D8%25A8%25D8%25A7%25D8%25A8%25D8%25A7_%25D8%25B1%25D8%25A6%25DB%258C%25D8%25B3&edit-text= .

It’s the Farsi Wikipedia, and from what I can guess the article is about a mineral sample that was used to study the rocks of that area. In that sense I’d say « index mineral » is a subclass of sample (Q485146). I’d say an « index mineral » instance « is used to study » a geological area. In turn, the mining industry uses geology to see which places are of interest to them. But the « index » might be different when specialized for the mining industry. For the mining industry, maybe we can create a property with domain « mine » and range « mineral ».

From what I understand from the enwiki article, I’d say there are several aspects to this : the types of mineral that can be used as index minerals. For example « chlorite zone » is a subclass of « metamorphic zone » which has the index mineral type « chlorite ». I’d say, to be pedantic, that there is a process in geology called « metamorphic zone mapping ». « metamorphic zone mapping » implies finding samples of index mineral types in an area.

If « geology » as a science is the study of the earth, then « metamorphic zone mapping » is a part of geology.

⟨ « metamorphic zone mapping » ⟩ uses (P2283) ⟨ « index mineral » ⟩

A specific index mineral like Q5760003 is probably an instance of some index mineral type that has been used for the process of mapping its metamorphic zone. The metamorphic zone is part of the Earth’s crust under some geographic area, I guess.

instance of (P31) geographic region (Q82794) is obviously wrong indeed author TomT0m / talk page 12:33, 2 February 2018 (UTC)

Interesting material on prospecting and sampling in mining : https://en.wikipedia.org/wiki/Mining_engineering#Mineral_exploration .

mentions :

Notified participants of WikiProject Geology, to w:Wikipedia_talk:WikiProject_Geology and to w:Wikipedia_talk:WikiProject_Mining. author TomT0m / talk page 12:55, 2 February 2018 (UTC)

@TomT0m: I would think that more straightforwardly index mineral (Q12409135) subclass of (P279) mineral (Q7946) -- the item is about the substance itself, not a portion of it.

More interesting is if/how we should use has characteristic (P1552), has use (P366), used by (P1535), (others?) to express what it is about this class that distinguishes its items from more general minerals. Jheald (talk) 13:14, 2 February 2018 (UTC)

« the item is about the substance itself » : Sorry, I don’t understand what you mean. If you look at the Farsi article, the index even has a name, « Daddy boss », so clearly it’s about a rock sample. On the other hand, it’s clearly true that « Daddy boss » is an example of some chemical compound, so an instance of substance. This is consistent with the frwiki definition of chemical substance : « Une substance chimique, ou produit chimique (parfois appelée substance pure), est tout échantillon de matière » (« a chemical substance or […] is any sample of matter … »). But the use for the mine rocks may not be consistent with the definition in the enwiki article.
« what it is about this class that distinguished its items from more general minerals » Do you mean its instances or its subclasses ? This makes a big difference. If you find a sample of an index mineral in a rock, you learn something about this rock’s history, because you know this substance can only be created under certain conditions during the rock’s history. For the mineral, this means that we may have some properties such as « created from <other mineral> at pressure <pressure value> », but this is true for any mineral. I think there is nothing intrinsic in the quality of being an index mineral. They are interesting depending on the context. I’m inclined to think, after a little thinking, that what we are interested in when we define index minerals is the types of minerals one could find in a rock sample, and not the instances of those types by themselves. So « index mineral » may be a metaclass. We would have
⟨ biotite series (Q105794) ⟩ instance of (P31) ⟨ index mineral (Q12409135) ⟩
, not index mineral (Q12409135) subclass of (P279) mineral (Q7946). The definition would then be « mineral type whose instances are searched to determine the degree of metamorphism a rock has experienced » which is definitely consistent with the enwiki article imho. This also does not mess with the subclass tree of minerals by mixing their classification by intrinsic quality with their usage in science.
I don’t think has characteristic (P1552) is relevant here. It’s more a property of « metamorphic zone (Q2690925) » to contain minerals of that type or not. has use (P366) : minerals definitely don’t use anything. A process or a person makes use of something. used by (P1535) : maybe. The problem is « to what end » ? Then we find a process again, like « metamorphic zone characterization ». A part of science. author TomT0m / talk page 15:51, 2 February 2018 (UTC)

I think index mineral (Q12409135) must be subclass of (P279) mineral (Q7946), part of (P361) metamorphic zone (Q2690925) and studied in (P2579) petrology (Q163082) or something similar, never geographic region (Q82794) or mining (Q44497). --PePeEfe (talk) 16:47, 2 February 2018 (UTC)

User:PePeEfe and some others have this right, it is a subclass of minerals. Using index minerals is a geological technique, but that would be something more like "index mineralogy" and not the same as index mineral. It is not connected to geographic region, or mines or mining. Graeme Bartlett (talk) 01:59, 3 February 2018 (UTC)

@Graeme Bartlett: Correct me if I’m wrong, but being an index mineral is not an intrinsic property of the mineral, but rather a feature of « index mineralogy ». In that sense, it seems that if it’s a subclass of mineral, the actual instances of this class are of no interest by themselves. Imagine you are practicing index mineralogy. Knowing you found an instance of index mineral, as a subclass of mineral, does not give you any information. Rather, what you are interested in is « which index mineral did you find ». Meaning « what is the class of mineral you found that we name « index mineral » » ? In that sense, I think it’s more useful to consider « index mineral » as a metaclass, a class of classes (of mineral instances). « which index mineral did you find » can be reformulated as « which instance of (the class) index mineral did you find ? ». The answer is then « I found biotite ». author TomT0m / talk page 11:26, 3 February 2018 (UTC)
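
For illustration, the metaclass reading discussed above could be queried roughly as follows. This is only a sketch: it assumes mineral types have actually been modelled as instance of (P31) index mineral (Q12409135), which is the proposal under discussion and not necessarily the state of the live data, so it may return few or no rows.

```sparql
# Sketch only: assumes the proposed modelling
#   <mineral type> instance of (P31) index mineral (Q12409135)
# has been applied to the data.
SELECT ?mineralType ?mineralTypeLabel WHERE {
  ?mineralType wdt:P31 wd:Q12409135 .   # "which index mineral did you find ?"
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

Under the competing subclass reading, the same question would use `wdt:P279` in place of `wdt:P31`, which is exactly the difference the thread is arguing about.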

Both instance and subclass of the same item

This one is really simple :

# items that are both an instance and a subclass of the same class
SELECT ?item WHERE {
  ?item wdt:P31 ?class ;     # instance of
        wdt:P279 ?class .    # subclass of
}

… and has a scary number of hits : 856741. It was inspired by looking at the results of the last one (thanks Jheald).
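
As a sketch of how such a total could be obtained without listing every row, the same pattern can be wrapped in a count. The figure it returns today will differ from the one quoted above, and on the public endpoint the query may well time out; both are assumptions about the service rather than anything stated in the thread.

```sparql
# Sketch: count the solutions of the query above instead of listing them.
# COUNT(*) counts item/class pairs; an item matching for two different
# classes is counted twice. Use COUNT(DISTINCT ?item) to count items.
SELECT (COUNT(*) AS ?hits) WHERE {
  ?item wdt:P31 ?class ;
        wdt:P279 ?class .
}
```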

A query to find the most problematic maybeclasses is more reassuring

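# count offending items per class, most affected classes first
SELECT (COUNT(?item) AS ?num) ?maybeclass WHERE {
  ?item wdt:P31 ?maybeclass ;
        wdt:P279 ?maybeclass .
  # label service left disabled, as in the original query
  # SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
GROUP BY ?maybeclass
ORDER BY DESC(?num)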

Most of the problem concerns only a few concepts like protein, genes and color.

num	maybeclass
405628	protein (Q8054)
374042	gene (Q7187)
38422	non-coding RNA (Q427087)
35060	pseudogene (Q277338)
1067	transfer RNA (Q201448)
560	small nucleolar RNA (Q284416)
215	single-day road race (Q2912397)
97	fish dish (Q18679149)
88	ribosomal RNA (Q215980)
83	small nuclear RNA (Q284578)

author TomT0m / talk page 16:49, 2 February 2018 (UTC)

@TomT0m: Strangely enough, User:GoranSM asked about almost exactly this in the Wikidata group on Facebook just last night: [6].

I don't think anyone had any definitive reply, but for genes and proteins one possibility is that one might expect families of related similar genes and proteins across evolutionarily similar (or even not-so-similar) species, and maybe that is part of what is going on here. If one only had one example so far, one might be combining the item for the gene and for the gene family. Also there's the question of how one represents a gene that may have variants, even within the population of a single species. Jheald (talk) 17:17, 2 February 2018 (UTC)

That’s weird that the problem is not solved yet ; there are concepts like allele (Q80726) that should be of some help, plus existing gene ontologies. Let’s assume an « allele » is the class of all DNA fragments with the same genetic sequence. A gene is the superclass of all its alleles. Seems quite simple to me.

« Gene » is then the metaclass of all those superclasses. « Allele » is the metaclass of all the classes of DNA fragments defined by a single sequence. author TomT0m / talk page 17:40, 2 February 2018 (UTC)

@TomT0m, Jheald: Note that Wikidata:WikiProject Ontology/Problems/instance and subclass of same class has been running via Listeria for quite a while - I've been working on some of the simpler items on there for a few months, but there are a lot of issues to sort out. The problem with genes, proteins etc. is that the ProteinBoxBot folks have been inserting both instance and subclass relations automatically for everything they add, I don't think there's been much discussion there about which is better, and there are probably some things that depend on one or the other relation that we wouldn't want to break without some community discussion of how to proceed. ArthurPSmith (talk) 17:52, 2 February 2018 (UTC)
- @ArthurPSmith: Community coordination is the key to not making a mess. To make good decisions, they should be based on ontological arguments, as I try to do. Otherwise the result may be like flipping a coin, which is a mess if we flip the coin several times. Any comment on my proposed model, as a consequence ;) ? author TomT0m / talk page 18:06, 2 February 2018 (UTC)
  - Mmm, I forgot this discussion before starting my adventure below … I’d suggest replacing P31 in their queries with P279*, as the only statements I remove are those where there is already a subclass path to « biological process ». Or using Module:PropertyPath to do the same substitution in infoboxes if needed ; the function « path.match » can do the trick (but the module and its dependencies have to be copied to enwiki) author TomT0m / talk page 21:18, 3 February 2018 (UTC)

Colors

Let’s call « red » the subclass of light with a red color spectrum, or a family of spectra. Let’s do the same for blue, white and so on. « red » is then a subclass of « light », as are blue, white and so on.
Let’s call « color » the metaclass for classes of light defined by a certain spectrum. « red », « blue » and so on are all instances of « color ». Red may be subclassed ; this means the family of spectra of the subclass is a restriction of the family of the more general « red ».

Let’s define the « color » property as « the property that takes its values among the instances of the « color » metaclass ». An object is of color « red » if, when it is lit, it reflects light whose spectrum conforms to the spectra defined for « red ». Or if it emits such light directly, if it’s active.

Any questions or problems with modelling stuff that way ? author TomT0m / talk page 18:01, 2 February 2018 (UTC)

All three of us participated in Wikidata:Requests for comment/Are colors instance-of or subclass-of color but that ended up inconclusive. Treating "color" as a metaclass with individual colors as instances of "color", while being subclasses of one another sounds like a good solution. However, how do you handle the colors "white" and "black" with your approach? Or even "brown" or "grey"? ArthurPSmith (talk) 18:53, 2 February 2018 (UTC)

@ArthurPSmith: In the absence of a conclusion from the RfC process, I guess we have to use the best idea we have. I don’t think « white » is a problem. The scientific approach to light decomposition is spectroscopy (Q483666). Call the result a « spectrum ». While a color like red will have just some rays in its spectrum, white will have rays everywhere or so (the « full » spectrum). Spectral analysis allows us to define a class of spectra that maps to the white color. The spectra for « gray » are probably similar to those for white, but with less intensity. Black maps to the empty spectrum. author TomT0m / talk page 12:36, 3 February 2018 (UTC)

So I think you are saying that "black" would be a subclass of every color, and every color would subclass "white" - including "gray" (and in general darker shades would be a subclass of lighter shades?) ArthurPSmith (talk) 16:30, 5 February 2018 (UTC)

@ArthurPSmith: No, as the spectrum of a given red does not fulfil the criteria for being a white : it lacks a lot of other frequencies (at least some primary color). Say we represent colors as RGB ; « white » will be those colors whose 3 components are all above a high threshold. A red would not meet this criterion, as it would be low on at least one component. An instance of a subclass should meet all the criteria of the superclass, so a red is not a white. It would be more like a « part of » relationship between white and the other colors, as you can add a red (light) to some other color (light) to get white (additive color (Q353267)). author TomT0m / talk page 16:44, 5 February 2018 (UTC)

@TomT0m: Ok I think I see what you're getting at. Each specific color (instance of "color", like "red", "white", etc) is associated with a collection of possible RGB values (or some reasonable other mechanism for specifying color). Subsets of one of those collections correspond to more refined color labels, which are subclasses of the more general ones. That is, a color item here does not generally correspond to a single specific RGB value (r, g, b) but some kind of region in rgb-space that "looks like" the color; say for "red" maybe something like {r,g,b|r > 2(g+b)}? But perhaps not so precisely specified, with somewhat fuzzy edges...? Is there anything that you think would qualify as an "instance of" red then, though? A specific RGB value? Or the color of a particular pixel in an image, would that be an instance of red? ArthurPSmith (talk) 19:00, 5 February 2018 (UTC)

@ArthurPSmith: Color instances would be, as said previously, actual light rays. An actual instance of a color is the light emitted by the red pixel on my screen. This pixel has the property of being red as it emits light of that kind. So there should be only a few instances of a color in Wikidata ; something close would be cosmic microwave background (Q15605) (don’t know if there is anything visible in it :) ) or, closer, solar radiation (Q17996169). Note that as the RGB space is not the only space in which we can describe colors, there is not only one possible characterization but several more or less equivalent ones. And we don’t actually have to provide a precise characterization ; we should just make sure there is one. author TomT0m / talk page 21:12, 5 February 2018 (UTC)

ProteinBoxBot and biological processes

@ProteinBoxBot, Andrawaag, Sebotic, Gstupp: Correct me if I’m wrong, but edits of this kind https://www.wikidata.org/w/index.php?title=Q2355306&diff=615506970&oldid=610494690 , especially the ones with instance of (P31), are incorrect. They are sourced to « the gene ontology », but a page such as https://www.ebi.ac.uk/ols/ontologies/go/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FGO_0009888 does not show any « type » statement, only « is a » statements, which are equivalent to « subclass of ». The namespace « biological process » is not an instance-of statement either. I’m cancelling those statements for misuse of the source and because this causes a lot of classes to be instances of one of their own superclasses. This is incorrect. 20:13, 3 February 2018 (UTC) – The preceding unsigned comment was added by TomT0m (talk • contribs).

Hi @TomT0m:, first off, sorry for not responding to this earlier, but I don't think the Mention template is working, as none of us received any notifications. Please use Ping instead. In regards to using "instance of" to indicate the semantic type, or classes being instance of one of their superclasses: This has been discussed many times and as far as I can tell (?), no consensus has been reached. We use the instance of statements to help with queries as they make them simpler and faster. We also use them in applications built on top of Wikidata such as Wikigenomes.org and Wikidatascape. As these are referenced statements, we'd appreciate if you didn't delete them while ontology discussions are still ongoing.

Regarding namespace vs. type: The namespace in this case is the ontology that these terms are defined in. The GO documentation (http://geneontology.org/page/ontology-structure) explains how there are three GO ontologies that are "is a"-disjoint, meaning that no is_a relations operate between terms from the different ontologies. And so all terms that are subclass of biological process are a biological process, and we use instance of to capture this.

On using instance of a second-order class (Q24017414) instead of the root node to indicate type (mentioned here, and probably elsewhere), we are definitely open to doing this if this is what the community decides is the best practice. Gstupp (talk) 02:12, 7 February 2018 (UTC)

@Gstupp: This is not entirely a question of community consensus, as instance of (P31) and subclass of (P279) have analogous properties in external ontologies and come with definitions for their intended use ; Help:BMP, for example, is one of the oldest help pages. It is true that our projects work by consensus, but I think that in Wikidata’s case we have a commitment to definitions, so that items are not ambiguous for example, and the community should be careful with them so as not to make a mess. In the case of proteins, I think it’s not harmful at all to use a newly created biological process type (Q47989961).

Maybe, if we take the metaclass path, to make it useful we may try to restrict its use to some kind of « universal » notion (universal (Q6497530), see the Wikipedia article(s)). For example « Einstein’s digestion processes » is a subclass of « digestion process » but is not a universal, as it refers to a real-world instance. « digestion process » as a whole, on the other hand, is a subclass of process that may be composed of many subprocesses, but that qualifies as a « universal class ». But the class multicellular organismal process (Q22299433), higher in the class tree, may not really be useful, as it is subclassed by many very different processes that qualify as universals themselves. Maybe only the « universal leaves » of the « biological process » class should be classified as instance of biological process type (Q47989961) ?

My personal impression is that there is nothing to lose in making things clearer and relying on practical definitions. Community consensus has its limits here, as the « local consensus » of one wikiproject may contradict the consensus of a different one (here the ontology wikiproject), and the real consensus is gained when both wikiprojects agree. author TomT0m / talk page 11:54, 7 February 2018 (UTC)

@TomT0m: There are two issues here. First, there is this mass deletion of well-referenced and well thought-through statements without ample notice. Our processes are well thought through also because we build projects and use cases on top of Wikidata. To give an example, http://www.wikigenomes.org/ fully relies on similar structured and referenced data in Wikidata. We also maintain a series of example queries actually being used by the community. So even if that data is "a mess" as you call it, cleaning it up indiscriminately can actually break working applications that rely on Wikidata. So I would kindly request you to carefully consider this when you are changing underlying well-referenced data models in Wikidata.

The second issue is indeed the ambiguity in the ontological space. It is amusing to see that, for example, a Wikidata item can have both an rdf:type property and a P31 property.

wd:Q42 a wikibase:Item ;
       wdt:P31 wd:Q5 .

Similarly, subclass of (P279) has an alias "is also a", which suggests some level of synonymy with instance of (P31). This is a simple example that demonstrates that there remain some systemic issues in how ontological relationships are modelled in Wikidata. With this in mind, it is a bit unfair to be so strict on well-intended efforts like ours. To move forward I would like to suggest to make/keep it a community process. This means that we reinstate the statements that have been deleted. We need to figure out whether, and if so where, changing the data model breaks an application or use case, while we start a remodelling discussion/effort to address the ontological issues they allegedly create.

BTW I also didn't receive a notice when you mentioned me. Please use ping if I need to look at some issue. --Andrawaag (talk) 17:24, 7 February 2018 (UTC)

@Andrawaag: Per Template:Reply to, both ping and mention are redirects to it, so I don’t think this is the issue. I also had a notification problem with this thread earlier with the ping template : I was pinged but was not notified. It may be a notification bug. That said, I had second thoughts after removing the first claims, and did not continue the job. I’ve seen the bot has been rerun and is re-adding the statements, so it’s just a matter of time. This is in no way a satisfying solution. I don’t really understand how the whole model could be based on the assumption that instance of (P31) is equivalent to subclass of (P279) when we have had Help:BMP for such a long time, nor why people don’t wonder why there are two properties if they are essentially the same. Note that I still think that the sources, although you did not realize it and acted in good faith, are not correctly used and that the statements are therefore not correct, per my previous argumentation. See for example Help:BMP or User:TomT0m/Classification (this is an essay of mine but it is solid ; I wrote the documented enwiki article on the notion of metaclass in the semantic web and knowledge representation in the process of writing it). I hope my arguments about the sources not supporting the statements have convinced you, so we can agree on a better solution (one has been proposed just above). What do you think about this ?

On

wd:Q42 a wikibase:Item ;
       wdt:P31 wd:Q5 .

That’s because the Wikibase data model is represented in RDF, but the Wikibase data model essentially represents items, which are collections of statements about the subject of the item. If there is a meaning to a statement (say, a P31 statement) at the level of the Wikibase data model, it is « according to [the sources of the statement], the subject of this item is an instance of [the subject of the value item] ». That way, a statement can for example be deprecated when it is not believed to be true anymore but the source still says it (see Denny’s explanation on this : https://blog.wikimedia.de/2013/06/04/on-truths-and-lies/ ). It also deals easily with inconsistencies : if you have two dates of birth for a person, this is a problem for logical reasoning in RDF (principle of explosion (Q60190)) ; if you only have a statement saying that someone says it’s true, no inconsistency follows. It implies that if we want to reason with the data we must not forget the references, however : instead of deducing « the person died at 30 » from their date of death and date of birth, the deduction becomes « according to [the source for birthdate 1 and the source for the date of death], he died at thirty ». This may be completed with other hypotheses according to other sources or combinations of non-deprecated sources. This does not imply that we as a community can’t try to make sense of the content of the collection of statements that Wikidata is. In that sense, P31 only means what we as a community decide ; it was defined as an analog of « rdf:type » only because someone thought we needed one, but there is no formal link between rdf:type and P31. And there won’t be any ; this ensures that we don’t mix up the « collection of statements » level that the Wikibase data model defines and the level of the meaning of the statements - of course the Eiffel Tower is not a collection of statements, and the subject of the statements we have on the Eiffel Tower is the Eiffel Tower, not its item.
We decided we’d use P31 in Wikidata the same way rdf:type is used in rdf. The same way we decided that our « date of birth » property would carry statements about the date of birth of a person. This is a problem however if actually the reference does not support the statement. author TomT0m / talk page 18:45, 7 February 2018 (UTC)

TomT0m: Our use case here is only to simplify queries and uses of items in external applications by having a statement on each item indicating the item's type, without having to perform a query through P279*. We are happy to do this any way the community decides, whether it is with instance of the root class, or a first-order metaclass, or any other way, as long as it is consistent and widely agreed-upon. Please let us know if this is the case. Thanks Gstupp (talk) 20:35, 7 February 2018 (UTC)
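
To make the trade-off concrete, here is a minimal sketch of what "a query through P279*" looks like for the item from the diff cited at the top of this section (Q2355306). It only follows subclass of (P279) and assumes nothing about which root class ends up being returned.

```sparql
# Sketch: recover the "type" of a class by walking its subclass path,
# rather than reading a direct instance of (P31) statement.
SELECT ?superclass ?superclassLabel WHERE {
  wd:Q2355306 wdt:P279+ ?superclass .   # one or more subclass-of steps upwards
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

In an existing query, the corresponding substitution would be to replace a pattern of the form `?x wdt:P31 <root>` with `?x wdt:P279* <root>`, at the cost of a path traversal that is slower than a direct statement lookup.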

@Andrawaag, Gstupp: my message on wikiproject molecular biology does not seem to attract attention. I don’t know what that means ; is this project actually inactive ? That would mean the consensus can be decided by the three of us, but I don’t think that’s the case. Where do you think is the right place to start a discussion ? And if you have no idea, can we consider this to be a tacit consensus ? author TomT0m / talk page 13:53, 10 February 2018 (UTC)

@Gstupp, TomT0m: I support the first-order metaclass solution, as I mentioned here. Wikidata doesn't have an explicit semantic type, like in Unified Medical Language System (Q455338), but we can represent this relation by using instance of (P31) with some class or metaclass. This is very useful not only to simplify queries but also for humans to quickly understand what an item is. I don't like to trace the subclass of (P279)* chain one by one. --Okkn (talk) 06:33, 12 February 2018 (UTC)
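
As a rough sketch of how the first-order metaclass solution would look from the query side, the following lists process classes typed with biological process type (Q47989961). It assumes such instance of (P31) statements have actually been added, which the thread has not yet agreed on, so the result may be empty.

```sparql
# Sketch only: assumes process classes carry
#   instance of (P31) biological process type (Q47989961)
SELECT ?processType ?processTypeLabel WHERE {
  ?processType wdt:P31 wd:Q47989961 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
LIMIT 100
```

This keeps the one-statement lookup that downstream applications rely on, while leaving instance of (P31) on the root class free for real-world instances.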

@Okkn, TomT0m: I just want to reiterate that this decision affects many (all?) aspects of how items are represented in Wikidata and it should be decided on a system-wide basis instead of being applied to specific types of items. We want to avoid flipping things back and forth and having inconsistencies. As far as I can tell, this issue has not been resolved and is used differently across Wikidata. Is red an instance of or subclass of color? Is British Sign Language (Q33000) a language or "language class" ? Is atheism (Q7066) a religion or religion class? Is human p53 a protein or a protein class? Right now, we manage a large quantity of many types of items (diseases, genes, drugs, chemicals, RNAs, sequence variants, etc.) and we want to keep them structured in a consistent manner. I don't think that this decision should be made with only "biological processes" in mind or only within wikiproject molecular biology, and should take into account everything that would be affected. Gstupp (talk) 22:54, 12 February 2018 (UTC)

@Gstupp, TomT0m: The problem is that the usages of some root items (eg. disease, color, language, etc...) are still ambiguous and controversial ("instance of (P31) X" vs "subclass of (P279) X"). However, "what can be a value of instance of (P31) in one item" (class?) and "what can be a value of subclass of (P279) in the same item" (metaclass?) are clearly and logically different, so once we separate these two distinct concepts in the root concepts, most of the troubles may not happen. I know ProteinBoxBot team is maintaining a huge number of items, and especially because of that, I think it is important to show a model. If you adopt metaclass solution, is there any inconvenient point? --Okkn (talk) 23:35, 12 February 2018 (UTC)

@Gstupp, Okkn: Some cases are more difficult than others, so I hardly support the idea that we must only make progress in project-wide steps that settle every case at once, or we’ll stay stuck in the current situation forever. However, in this case I’m pretty confident that we’re on a topic in which the type–token distinction (Q175928) (please read the enwiki article if you have not done so yet) is pretty easy to apply. We clearly have a real-world object level and a type of real-world object (/events/processes) level. « proteins » are a type of real-world object. Such types can be subclasses of each other if relevant, and we have a clear criterion to decide : if every real-world object of one « real world object type » is also a real-world object of the other one, we’re in such a case. However, making sense of such a « real world object type » being an instance of another « real world object type » is a problem, as according to the type–token distinction (Q175928) only real-world objects are instances of a « real world object type ». The confusion can only increase if we have chains of « instance of » between « real world object types ». Note that the databases that ProteinBoxBot imports from do not do that. The metaclass solution solves this by adding a « real world object type type » level. It removes a problem by adding clarity, and nobody has been able to raise a problem with this. We’ll be confident that if we find an instance of the class « protein » or « biological process », it will actually refer to a real-world molecule or a real-world process, and never to a type of protein or process. However unlikely it is that we have an article about a molecule instance, we’ll appreciate never having to ask ourselves whether that is the case ; we have a general principle to decide independently of this question. While there is a case for doing things this way, nobody seems to have raised a significant argument for keeping things the way they are. author TomT0m / talk page 11:06, 13 February 2018 (UTC)

@TomT0m, Gstupp: Not only protein (Q8054), but also other chemical substances may have some problems. What do you think of

- ⟨ hydrogen (Q556) ⟩ instance of (P31) ⟨ chemical element (Q11344) ⟩,
- ⟨ water (Q283) ⟩ instance of (P31) ⟨ chemical compound (Q11173) ⟩,
- ⟨ potassium cation (Q21091971) ⟩ instance of (P31) ⟨ cation (Q326277) ⟩,
- ⟨ acetic acid (Q47512) ⟩ instance of (P31) ⟨ carboxylic acid (Q134856) ⟩, and
- ⟨ naphthalene (Q179724) ⟩ instance of (P31) ⟨ polycyclic aromatic hydrocarbons (Q407212) ⟩?

What's the difference between ⟨ tumor protein p53 (Q283350) ⟩ instance of (P31) ⟨ protein (Q8054) ⟩ and them? --Okkn (talk) 14:52, 13 February 2018 (UTC)

@Okkn: See Wikidata:WikiProject Chemistry/Proposal:Models and the talk page for some discussion of these issues. Many of the current relationships of this sort in chemistry are wrong, as you suggest. However note chemical element (Q11344) is defined in wikidata as a second-order class (Q24017414) so it is an example of the metaclass approach. ArthurPSmith (talk) 15:43, 13 February 2018 (UTC)

@TomT0m, Okkn: Just to answer two questions above (what is the inconvenient point of changing, and what's the argument for keeping things the way they are), I think Greg is raising a few points here. First, we previously discussed this with the community and the current solution represented the consensus at the time. Second, we've invested time and effort into implementing that model, both in terms of our bots and our downstream applications. Third (and most importantly), we are happy to make all the changes necessary, but only if there is reasonable Wikidata-wide consensus that there is a better solution. Without evidence of that consensus, we run the risk of someone else arguing for yet another solution in six months, which would result in us spending even more time modifying our bots and applications instead of pursuing our team's broader mission (loading high quality biomedical datasets, demonstrating the integrative queries that Wikidata enables, promoting its use in the biomedical community, attracting contributions from domain experts through applications built on Wikidata, building bot automation infrastructure, exploring data modeling and detection of constraint violations using ShEx, etc.). I hope that clarifies our perspective here... Best, Andrew Su (talk) 17:16, 20 February 2018 (UTC)

@Andrew Su, Okkn, Gstupp: Requiring a whole-community consensus to fix a bug in the model seems way out of proportion to the way the initial consensus was achieved. Do you have a link to the initial discussions? A real problem in Wikidata is the risk of fossilizing bad initial design decisions for data that becomes widely used by external tools, which makes bugs in the model really hard to fix. But I don’t think this should stop us from fixing design bugs, or let us keep making sources lie by asserting that they support claims they actually don’t! This is a serious bug. Several users say they are OK with making the change we suggest here, which is quite simple. I guess you are in a much better position than I am to attract the attention of the data users, because I just don’t know who they are, nor whether they would really be inconvenienced by the change (substituting a couple of QIDs in a couple of places does not seem like a revolution). RfCs in Wikidata usually fail to attract attention, and discussions on the molecular biology project did not exactly attract a crowd. I fail to see a way out here when the only blocker to making the change seems to be your answer, and it’s not even clear what you mean. I don’t really know what question could be posed to the whole community for you to be satisfied. author TomT0m / talk page 09:43, 21 February 2018 (UTC)

Looking for examples of alternative business models for organisations considering open licensing

Hi all

I'm compiling a guide for UN agencies on the steps to implement open licensing.

The piece I'm really missing is alternative business models to those which require traditional copyright. This includes publishing books, licensing images and other multimedia, and also data.

I'm finding data the hardest of these to find information for.

If you know of any existing compilations of information and/or any examples of organisations which have working business models please brain dump below and I will organise it.

Thanks very much

John Cummings (talk) 16:15, 17 February 2018 (UTC)

Many books get written to create a reputation for the author. A good reputation can lead to consulting gigs and speaking engagements. Or selling flamethrowers ;)

Uber's decision to release data under an open license is noteworthy: https://www.engadget.com/2017/08/31/uber-movement-traffic-data-website-launch/ . It's Creative Commons Attribution Non-Commercial but even that's still open data. ChristianKl ❪✉❫ 11:59, 18 February 2018 (UTC)

Theoretically the cost would drop of software programs used by corporations to translate text if the text translations were free. I find it scandalous that the WHO charges royalties for translations of their medical terms. Jane023 (talk) 09:16, 21 February 2018 (UTC)

how the Wikidata works

Hi, I am new here and don't know what Wikidata is, but I have an interest in it. Can anyone explain how it works? – The preceding unsigned comment was added by Mnish pal (talk • contribs) at 03:42, 21 February 2018 (UTC).

Welcome. You may want to read Wikidata:Introduction first. For any questions not found there or on pages linked from there, feel free to ask again here. --Anvilaquarius (talk) 10:16, 21 February 2018 (UTC)

Help:Pages without elements

Does anyone know any good automated ways to merge pages (without elements) with already existing wikidata items? I have about 1000 categories in uk.wiki, which can be added to the elements like Category:2018 in Switzerland (Q27992176) and Category:Foreign relations of Cameroon (Q7304141) ---Andrew J.Kurbiko (talk) 03:47, 21 February 2018 (UTC).

If you give me the list and show me some patterns (like that category type can be matched to that category type for enwiki/ruwiki: "X in Switzerland" and "[the same in Ukrainian]") I can do it when I have some free time (in evening or in holidays). --Edgars2007 (talk) 06:57, 21 February 2018 (UTC)

They are all fragmented, e.g. 20 "X in Switzerland" and 10 "X in Asia"; that's why I need a tool. It would be impractical to edit them manually, and it would take a lot of time ---Andrew J.Kurbiko (talk) 13:47, 21 February 2018 (UTC).

Quick Statements version 1 can do a list of merges, see Help:QuickStatements#Item_merging -- but be very careful that you are indeed giving it the right pairs of items, and that these are indeed appropriate to merge; unlike QS2, QS1 does not give you a preview. Jheald (talk) 10:19, 21 February 2018 (UTC)
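
As a rough illustration of preparing such a batch, the sketch below turns a list of (duplicate, target) pairs into merge lines. It assumes the QS1 merge command is a tab-separated `MERGE` line as described at Help:QuickStatements#Item_merging; the function name `merge_commands` and the source QIDs used in any example call are hypothetical. It only validates the shape of the QIDs and drops repeated pairs; it cannot tell whether two items really describe the same thing, so the pairs still need a manual check before they are pasted into the tool.

```python
import re

# A QID is "Q" followed by a positive integer without leading zeros.
QID = re.compile(r"^Q[1-9]\d*$")


def merge_commands(pairs):
    """Build tab-separated MERGE lines from (source, target) QID pairs.

    Assumption: QuickStatements v1 accepts lines of the form
    "MERGE<TAB>Q1<TAB>Q2". Malformed pairs raise ValueError so that a bad
    list is rejected before anything is submitted.
    """
    lines = []
    seen = set()
    for source, target in pairs:
        if not (QID.match(source) and QID.match(target)):
            raise ValueError(f"not a valid QID pair: {source!r}, {target!r}")
        if source == target:
            raise ValueError(f"cannot merge {source} into itself")
        if (source, target) in seen:
            continue  # skip exact repeats of the same pair
        seen.add((source, target))
        lines.append(f"MERGE\t{source}\t{target}")
    return "\n".join(lines)
```

Calling `print(merge_commands(pairs))` with a reviewed list of pairs gives text that can be pasted into the QS1 input box.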

Thanks! ---Andrew J.Kurbiko (talk) 13:47, 21 February 2018 (UTC).

Remove Obsolete Getty Vocabularies IDs

Wikidata:WikiProject_Authority_control#Remove_Obsolete_Getty_Vocabularies_IDs: Any takers to replace old values of AAT ID (P1014), TGN ID (P1667), ULAN ID (P245) with the new values from the respective files cited there?

If you'd like to help, please comment there and ping me. Thanks! --Vladimir Alexiev (talk) 15:41, 21 February 2018 (UTC)

Bit confusing to have discussions on the normal page of the WikiProject instead of the talk page. Sjoerd de Bruin (talk) 15:44, 21 February 2018 (UTC)

python script works from the linux command prompt not from a browser

Hi,

I am new to wikidata and a novice with python.

I am running apache on a digital ocean droplet using lampp

The following script works fine when I run it from a linux terminal but when I try to run it on a browser I get an error

"Error occurred: <urlopen error unknown url type: https>"

Any help would be great

Regards

Paul

Not answering to your error problem, but to the question in general. Take a look at this thread. But do you really want to do this with Python? For running in web browser php would be more suited, imho. --Edgars2007 (talk) 06:30, 22 February 2018 (UTC)
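
On the error itself: one common cause of "unknown url type: https" is that the Python interpreter started by the web server cannot load its `ssl` module, in which case `urllib` never registers an HTTPS handler. Whether that is what happens on this particular lampp setup is an assumption. A minimal Python 3 diagnostic sketch (the function name `https_diagnostics` is hypothetical) that can be run both from the terminal and from the web server to compare the two environments:

```python
import sys


def https_diagnostics():
    """Collect facts that explain whether urllib can open https:// URLs."""
    info = {
        "executable": sys.executable,        # which interpreter is running
        "version": sys.version.split()[0],   # its version string
    }
    try:
        import ssl
        info["ssl"] = ssl.OPENSSL_VERSION    # ssl imported successfully
    except ImportError as exc:
        info["ssl"] = None
        info["ssl_error"] = str(exc)         # why the import failed
    import urllib.request
    # urllib.request only defines HTTPSHandler when HTTPS support exists.
    info["https_handler"] = hasattr(urllib.request, "HTTPSHandler")
    return info


if __name__ == "__main__":
    for key, value in https_diagnostics().items():
        print(f"{key}: {value}")
```

If the terminal run and the web server run report different executables, or only the web server run reports a missing `ssl` module, the two are not using the same Python environment.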

You can give feedback about the Weekly Summary

Hello all,

If you read the weekly Wikidata newsletter, written by volunteers and the Wikidata team, we would be very interested in your feedback. The newsletter reached its 300th edition and we are wondering how together we could improve it and make it even more useful for the community.

You can find a page with a few questions. You don't have to answer all of them.

If you're not reading the Weekly Summary, or used to but stopped reading it, there is a section just for you!

And as a reminder, the Weekly Summary is collaborative and you can add information for the next edition on this page. Your participation would be very appreciated :)

Thanks in advance for your help, Lea Lacroix (WMDE) (talk) 12:46, 22 February 2018 (UTC)

You can participate in Google Summer of Code and help improve Wikidata

Hello all,

As every year, Google Summer of Code supports student developers all around the world in working on projects. Wikimedia takes part as a mentoring organization.

A list of projects is already available, you can also add your own. One of these projects is signed statements for Wikidata: developing the technical feature that will allow institutions to donate verified and sourced information and increase the quality of the data.

If you want to participate or become a mentor, feel free to check the information on the pages linked above.

Thanks, Lea Lacroix (WMDE) (talk) 11:48, 14 February 2018 (UTC)

Is "signed statements" the only one for Wikidata? Wonder how that would work with ranks.
--- Jura 09:00, 15 February 2018 (UTC)
- There is phab:T138708 with details. —MisterSynergy (talk) 09:34, 15 February 2018 (UTC)
- I read that, but I don't think it addresses it. statement disputed by (P1310) on a signed statement would even be stranger.
  It would be good if I/we came up with some other Wikidata-related proposals. Maybe something that offers a basic step that isn't too complicated to do and some potential to go beyond that.
  --- Jura 09:50, 15 February 2018 (UTC)
  - To my understanding of the phab, "signed statements" as they are presented there are actually "signed claims" (i.e. the signature verifies mainsnak and qualifiers including some related information about the subject and the object item, but not the references and ranks of the statement). This means you should be able to change ranks of a "signed statement" without breaking the signature. —MisterSynergy (talk) 09:57, 15 February 2018 (UTC)
    - So ranking would be disconnected from the snak. I suppose it could work, but I'm not sure if ranks are easily understood.
      --- Jura 10:01, 15 February 2018 (UTC)
      - AFAIR there was some research lately which found that concepts like ranks and snaktypes are indeed poorly understood. Another problem with the "signed statments" approach could be that we use qualifiers to qualify the mainsnak value (this is how they were supposed to be used; they are something that is intrinsic to the entity described), but also to qualify the statement as a whole (think of statement disputed by (P1310), or reason for deprecated rank (P2241)). The latter doesn’t really fit into the "signed statements" concept, but it is somewhat disputed anyway. —MisterSynergy (talk) 10:10, 15 February 2018 (UTC)
        The proposal does mention that changing the qualifiers would break the signed snak. So this wouldn't be a problem.
        reason for deprecated rank (P2241) could be moved to the reference section if it would be visible by default. statement disputed by (P1310) seems in need of a different approach.
        --- Jura 10:32, 15 February 2018 (UTC)
- Yes signed statements is the only idea I added because mentoring takes significant time and effort. Ranks indeed do need to be improved but this is independent of signed statements and requires more design research than coding so is unfortunately not suitable for GSoC. --Lydia Pintscher (WMDE) (talk) 17:34, 15 February 2018 (UTC)
  - I was a bit worried about the impact on ranking, but if it's distinct it might not matter. Most people eventually figure it out. It's just not usual in Wikipedia infoboxes.
    As an additional one: maybe a (more) Wikibase-specific diff view to compare two items or two versions of the same item, for use in https://www.wikidata.org/w/index.php?diff= . Obviously the coding effort on that might be significant.
    --- Jura 17:49, 15 February 2018 (UTC)
    - Are you thinking about something like http://tools.dicare.org/wikidata-diff/ or something else? --Lydia Pintscher (WMDE) (talk) 19:43, 15 February 2018 (UTC)
      - A feature that compares the two, showing either a compact version with the differences only, or the identical content plus the differences. The current diff function seems closer to MediaWiki text than to a comparison of statements (+labels/etc.).
        --- Jura 04:45, 16 February 2018 (UTC)
- I added it to the list. Two others that might be interesting are https://phabricator.wikimedia.org/T149905 and https://phabricator.wikimedia.org/T139912 . All three would need to have the technical steps presented in further detail.
  --- Jura 05:53, 21 February 2018 (UTC)
  - @Lydia Pintscher (WMDE): the addition got reverted. Would you have a look at whether one or the other of these three could be added? For WMDE development, I suppose an interesting factor could also be that it provides a check of whether the codebase is sufficiently accessible to new contributors.
    --- Jura 10:59, 22 February 2018 (UTC)
    - I fear the first one is not large enough for a GSoC project. With the second one my fear is that it is not well defined and a newcomer would have a hard time figuring out a solution that'd work :/ --Lydia Pintscher (WMDE) (talk) 07:27, 23 February 2018 (UTC)
      - @Lydia Pintscher (WMDE): Well, the second one has it spelled out pretty much in detail (on a page that has been archived since). BTW, what do you think about the diff/comparison view? A fourth one could be https://phabricator.wikimedia.org/T69659 . Ideally the mapping could be checked on Wikidata Query service.
        --- Jura 08:17, 23 February 2018 (UTC)
        Ok. I'll see what the team says. --Lydia Pintscher (WMDE) (talk) 18:15, 23 February 2018 (UTC)
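
The point made in the thread above, that a signature covering only the mainsnak and qualifiers leaves ranks free to change, can be illustrated with a small sketch. This is not the design from phab:T138708: a plain SHA-256 digest stands in for a real signature, the function name `claim_digest` is hypothetical, and the statement dictionaries only loosely follow the Wikibase JSON layout.

```python
import hashlib
import json


def claim_digest(statement):
    """Digest over the 'claim' part of a statement only.

    Only the mainsnak and qualifiers are hashed; rank and references are
    deliberately left out, mirroring the reading of the proposal given in
    the discussion. A hash is used purely as a stand-in for a signature.
    """
    signed_part = {
        "mainsnak": statement.get("mainsnak"),
        "qualifiers": statement.get("qualifiers", {}),
    }
    # Canonical serialisation: sorted keys, no insignificant whitespace.
    canonical = json.dumps(signed_part, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, two copies of a statement that differ only in `rank` or `references` produce the same digest, while adding a qualifier such as statement disputed by (P1310) changes it, which is the tension discussed in the thread.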

Merger request

Can anyone please merge falling and rising factorial (Q2339261) with Pochhammer symbol (Q132335)? Thanks. --Mhhossein (talk) 20:09, 23 February 2018 (UTC)

There are conflicting sitelinks. Sjoerd de Bruin (talk) 20:35, 23 February 2018 (UTC)

Pochhammer symbol (Q132335) is, as far as I understand it, a special version of falling and rising factorial (Q2339261), see here: de:Fallende_und_steigende_Faktorielle#Verallgemeinerung Grüße vom Sänger ♫ ^(talk) 20:52, 23 February 2018 (UTC)

Is it possible to see the complete history of an individual statement from an Item?

You can get a full record of all the changes made to a Wikidata item by simply clicking "View history". From that history, I guess it is possible to reconstruct the full history of an individual statement of that item. However, doing this for a batch of statements quickly becomes impractical as more statements are involved. I have looked through the API calls (https://www.wikidata.org/w/api.php), but couldn't find a usable call for this. My guess is that it should be possible, given that it is tracked in the full history of an item. Long question short, does anyone have an idea how to track changes on an individual statement from a Wikidata item? When was it created, how often did it change, by whom, and how?

--Andrawaag (talk) 13:43, 22 February 2018 (UTC)

No other than querying change by change and comparing them. Matěj Suchánek (talk) 09:58, 24 February 2018 (UTC)
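
A minimal sketch of that change-by-change comparison, assuming the revisions of the item have already been downloaded (for example with `action=query&prop=revisions`) and parsed into dictionaries. The function names and the keys `revid`, `timestamp`, `user` and `entity` are hypothetical conventions of this sketch; the statement is followed through the history by its GUID, the `id` field of a statement in the entity JSON.

```python
def find_statement(entity, guid):
    """Return the statement with the given GUID from entity JSON, or None."""
    claims = entity.get("claims") or {}  # empty claims may be serialised as []
    for statements in claims.values():
        for statement in statements:
            if statement.get("id") == guid:
                return statement
    return None


def statement_history(revisions, guid):
    """List the revisions in which one statement was created, changed or removed.

    `revisions` must be ordered oldest first; each element is a dict with
    the keys 'revid', 'timestamp', 'user' and 'entity' (parsed item JSON).
    """
    events = []
    previous = None
    for rev in revisions:
        current = find_statement(rev["entity"], guid)
        if current != previous:
            if previous is None:
                kind = "created"
            elif current is None:
                kind = "removed"
            else:
                kind = "changed"
            events.append({
                "kind": kind,
                "revid": rev["revid"],
                "timestamp": rev["timestamp"],
                "user": rev["user"],
            })
        previous = current
    return events
```

This still requires fetching every revision of the item once, but a single pass can then be reused for any number of statements on that item by calling `statement_history` with different GUIDs.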

Old RFBOTs

I was looking at WD:RFBOT, and noticed that there's a large number of requests that have been inactive for months or even years. There was one three-month-old RFBOT that didn't even have any contents. I used common sense and closed that one on my own, but I'm hesitant to deal with the other ones before bringing it up here. Since these old RFBOTs just clutter things up, I propose the following:

An RFBOT may be procedurally closed by any user if any of the following conditions occurs:
1. It has no meaningful contents and is more than 48 hours old.
2. No user has edited the page in more than six months.
3. The bot's operator (or one of its operators if there are multiple) has not edited the page in more than a year.
Any RFBOT thus closed may be reopened by the operator at any point.

Or we could be more casual about it and just close any RFBOT that seems "too old". Thoughts? — PinkAmpers&^{(Je vous invite à me parler)} 23:13, 23 February 2018 (UTC)

Your proposal looks sensible. I am doing the same for property proposals and I have the feeling that common sense works reasonably well - I do not feel the need for particular new policies. As you said requests can be reopened afterwards so it's not like you are doing something irreversible. My own goal is just to make sure property proposals look a bit more welcoming: newcomers should not be daunted by a backlog of half-supported proposals gathering dust. I think we just need more people volunteering for these bureaucratic tasks rather than more policies about them. I really like the fact that wiki-lawyering is basically non-existent in Wikidata: let's keep it like that! − Pintoch (talk) 10:11, 24 February 2018 (UTC)

I cleaned up these kinds of pages in the past. I usually leave a note about lack of activity and that it will be closed in a week unless it becomes active (including a ping for the proposer). Next weekend you can close all the requests that didn't have any replies. No need for more bureaucracy. Be nice, treat the proposers the way you would like to be treated. Multichill (talk) 13:58, 24 February 2018 (UTC)

TED profile

How do I correctly add the Ted Profile for Ayana Elizabeth Johnson? Thanks, GerardM (talk) 15:32, 24 February 2018 (UTC)

described at URL (P973). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:28, 24 February 2018 (UTC)

Help with subspecies

My sources indicate that eri silk (Q5385913) is produced by Samia cynthia ricini, a subspecies of Samia cynthia (Q1462214). How do I indicate this on the this taxon is source of (P1672) statement? Do I need a separate item for the subspecies? I haven't done much work with biology items. Thanks! - PKM (talk) 22:01, 17 February 2018 (UTC)

I created Samia cynthia ricini (Q49125191). --Succu (talk) 11:47, 18 February 2018 (UTC)

Thank you! I’ll hook everything together. - PKM (talk) 20:22, 18 February 2018 (UTC)

...And it looks like Samia cynthia ricini is a synonym for Samia ricini (Q1995352). In Commons, UniProt, Encyclopedia of life. I'll merge them. - PKM (talk) 21:06, 18 February 2018 (UTC)

I reverted your changes at Samia cynthia ricini (Q49125191) and added the original combination. --Succu (talk) 21:11, 22 February 2018 (UTC)

Oh, okay, now I see how that works. :-) - PKM (talk) 22:09, 24 February 2018 (UTC)

Good to read that. ;) --Succu (talk) 22:10, 24 February 2018 (UTC)

Q49000000

another cebwiki ..
--- Jura 05:33, 21 February 2018 (UTC)

Wasn't there a plan to stop the import of these bot-created articles until we find a way to deal with this low-quality, high-volume data? They pop up faster than we can merge them. 13:13, 21 February 2018 (UTC) – The preceding unsigned comment was added by Ahoerstemeier (talk • contribs) at 13:13, 21 February 2018 (UTC).

Maybe @GZWDer: has a plan. Personally, I somehow figured it might be easier to ignore duplicates. Most people don't seem to care that they are created, so why should I?
--- Jura 06:39, 22 February 2018 (UTC)
- It would be great if we could stop automatically creating items for cebwiki pages. This is a real issue for many OpenRefine users, as it degrades the quality of reconciliation results for geographical subjects. − Pintoch (talk) 12:33, 22 February 2018 (UTC)
  - See also m:Proposals for closing projects/Closure of Cebuano Wikipedia, this problem should be mentioned carefully by Philippine users e.g. @Sky Harbor:. While this is beyond the control areas of WD, I personally suggest a local community ban on cebwiki against Lsj. --Liuxinyu970226 (talk) 15:11, 23 February 2018 (UTC)

However, there is not much of a community in cebwiki. Of the four admins besides the bot operator himself, only one has made more than four edits in 2018 so far. ceb:Espesyal:ActiveUsers lists 133 users currently (incl. bots), but the vast majority just came for a couple of edits, mostly triggered from other projects (like replacing deleted Commons images). Maybe we do want to think about an exclusion rule in WD:N, where items with only a cebwiki sitelink are to be excluded? --YMS (talk) 15:26, 23 February 2018 (UTC)

I think excluding cebwiki is the wrong approach: 1. the pages are being maintained; 2. even if we choose to exclude the sitelink, most of them still refer to an instance of a clearly identifiable entity described by reliable sources (Geonames may not be one, but GEONET is), and are thus notable.--GZWDer (talk) 16:22, 23 February 2018 (UTC)

I don't think we should ban cebwiki sitelinks - I would just stop mass-creating items for cebwiki pages without a corresponding item. Not because the corresponding items would not be notable - just because we don't have the technical and human means to ensure that no duplicates are created. People can still create items and add cebwiki sitelinks on an individual basis. − Pintoch (talk) 10:38, 24 February 2018 (UTC)

There are two kinds of duplicates: the duplicates of already existing items, which can be found relatively easily (especially since they are all geographical items), and the duplicates between administrative units and populated places imported from geonames, which are annoying but which one can argue are correct. But the issue which bugs me most is the WRONG statements being added. Trying to get an elevation for a hill by looking up only its inaccurate geographical location in radar data, which also has some inaccuracies, is problematic; importing these without any error margin and often even without imported from Wikimedia project (P143) severely harms our data quality. See e.g. Osterloh (Q31294773) where the difference in elevation to a real map was just 15m, I had cases where it was >200m! Ahoerstemeier (talk) 23:25, 24 February 2018 (UTC)

Q50000000

And now El Refugio (Q50000000), also by User:GZWDer. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:20, 23 February 2018 (UTC)

User:GZWDer has a long history of indiscriminately mass-importing articles from projects, without precautions, without any statements, which caused a lot of problems for wikisources (fr and others), see User_talk:GZWDer/2015#wikisource_mass_import and others. This user claims to be "semi-retired", but User:GZWDer (flood) is certainly not :( --Hsarrazin (talk) 16:29, 23 February 2018 (UTC)

For the ceb and sr wiki cases, I plan to import many statements for them (like this). However, for personal reasons I can only edit Wikidata in certain months each year, and I will continue this in June.--GZWDer (talk) 16:34, 23 February 2018 (UTC)

Wikidata:Requests for comment/Privacy and Living People

There has been a request for comment about developing a policy about privacy and items about living people. Your participation is encouraged above. --MediaWiki message delivery (talk) 20:48, 24 February 2018 (UTC) (from Rschen7754)

Property:P2291, and other questions about music

I asked several of these questions at Property talk:P2291, but since the only person who replied was Mbch331 and he ignored my last two pings, I'll assume he doesn't know the answers (or they don't exist) and ask the questions again here. The property currently has just 73 uses, almost or more than half of which violate its constraint values, so I assume it would be easy to change its constraints and the qualifiers' meanings in the context.

The only qualifiers presently allowed for this property, charted in (P2291), are series ordinal (P1545), end time (P582), and start time (P580).
- (a) Why mandate series ordinal (P1545) (34 uses) instead of ranking (P1352) (7)?
- (b) Many charts have a "chart date", often at a different time to the chart's actual publication date. How should this be indicated? Is point in time (P585) (7) sufficient for this, given that a chart is technically current on its chart date regardless of when the data was collected? Should a new property be created to specify this? It would make more sense to me personally to have just the chart date on data points, particularly for the Billboard Hot 100 (Q180072) since its data collection rules are unnecessarily complicated (see next point), with the chart's rules on the item for the chart. This would also make referencing for the qualifiers much more reasonable.
- (c) What do end time (P582) (23) and start time (P580) (18) actually mean in the context? Should they define the period that the data is collected (two different periods for the Billboard Hot 100, since airplay data is Monday–Sunday and rest is Friday–Thursday – no existing schema for entering this information at present), or the period that the chart is current (i.e. from its publication time to the time of the succeeding chart's publication), or the period that the chart was the current chart according to its "chart date" (i.e. from its "chart date" to the next chart's "chart date")?
- (d) If a song is on a chart for multiple weeks at the same position, should each week be a separate data point, or should they be combined?
Many charts, like the Hot 100, combine the data for the original version of a song with the remixed or covered version of a song, usually when the original artist has participated in the recording of the other version.
- (e) How should this be indicated? Should all the data just be put on the "main" item for the song, or should it be placed on the remix version's item (one of which probably doesn't exist; the data model is largely incomplete here because Wikipedia articles usually don't exist for both concepts) if it was the one released as a single? Or should the data be placed on whichever version the chart position appears to be credited to? (The credits for Perfect (Q29051557) changed in its last week on the Hot 100, being credited to just Ed Sheeran instead of Sheeran and Beyoncé.)
- (f) Should different versions of a song, even if released by the same artist(s) as the original version, have different items? Is the live performance of a song on a TV show or an officially-sanctioned podcast notable? What if it wasn't released officially (including fan recordings)?
Before mid-1998, the Hot 100 was a singles chart rather than a songs chart. I was informed by Mbch331 that a single release should have its own item, and each song on it should also have its own item.
- (g) If a single was released in a different format/packaging/whatever in different regions, does each of those releases qualify for its own item? (On the UK Singles Chart, at least one song charted twice in the same week because it was released in four formats instead of the maximum mandated three. This suggests to me that they would.)
- (h) Do digital singles and CD singles also qualify for their own item [if they have an entry in a database]? Does the version of a song sent to radio get its own item? The clean version? The lyric video(s) and music video(s)? The (official) version uploaded to Spotify/YouTube/SoundCloud/Dropbox/Facebook/Wikimedia Commons/the Wikimedia bug tracker (the line's somewhere there)? Is everything released on iTunes/YouTube/etc. Wikidata-notable?
- (i) Should the data be added (in addition, or instead) to the items of the individual songs? What about the B-sides if the B-side(s) is/are the less important song(s)?
- (j) If an album is released with e.g. a different album cover in two countries, do both of them qualify for individual items if they have different identifiers? What if the versions are identical aside from their identifiers? Are they classified as the same thing? If a random assortment of some of a set of objects is included in the release of an album, does every permutation get its own item?
- (k) Which release of the single or album – if there are multiple items – gets the chart position data? All of them? The ones from which the chart position was calculated?
- (l) Which item gets the sitelinks to Wikipedia?

Jc86035 (talk) 09:06, 22 February 2018 (UTC)

Since no one replied, I changed the qualifier from series ordinal (P1545) to ranking (P1352), and started a property proposal for "chart date". Jc86035 (talk) 09:36, 25 February 2018 (UTC)

Import from China Biographical Database (Q13407958)

I'm doing some bulk addition of basic biographical data from this database, to enrich our entries about historical figures of China. There's a project page at Wikidata:WikiProject East Asia/China Biographical Database import. I'm just notifying the community here in case anyone else is using this database (which is already mostly reconciled via Mix'n'Match).

Ontological issue: we often use the same item for a "Chinese dynasty" in the sense of a state within the Imperial era of China and in the sense of the family of rulers of those states. See for example Yuan dynasty (Q7313) or any item which is instance of (P31) both Chinese dynasty (Q12857432) and historical Chinese state (Q50068795). For an encyclopedia article, it makes sense to combine the history of the country with that of the people who ruled that country, and for my purposes I'm not encountering any problems. Still, maybe it would make more sense to separately represent the family and the state? Just raising the question in case someone more ontologically-inclined wants to examine it. MartinPoulter (talk) 14:03, 24 February 2018 (UTC)

I think Yuan dynasty (Q7313) is just an empire (state) and cannot refer to the imperial family in that era. --Okkn (talk) 15:11, 24 February 2018 (UTC)

Thanks User:Okkn. If that's the case, then please go ahead and edit the item to remove the incorrect statements. Cheers, MartinPoulter (talk) 17:18, 24 February 2018 (UTC)

@MartinPoulter: Do you need items about "family"? --Okkn (talk) 19:53, 24 February 2018 (UTC)

@Okkn: Not for my purposes, but others might. I'm interested in linking people and artefacts to states of the Chinese imperial era. I don't expect to do queries about the families. I've just flagged up the issue in case using the same items for the empire and the family causes problems elsewhere. Cheers, MartinPoulter (talk) 12:05, 25 February 2018 (UTC)

@Fantasticfears:, as the original importer of the CBDB data. Mahir256 (talk) 01:36, 25 February 2018 (UTC)

Mentorship

I just added two statements on Anna Naklab (Q20061746) and before continuing, I would like to ask an experienced user whether this constitues an acceptable edit. Who is willing to verify this for me ? Thanks in advance. -- Juergen 95.223.151.37 14:40, 25 February 2018 (UTC)

I'd say those look fine, but would be improved if you could add references. Also, please consider signing up for an account - it will make it possible to notify you of replies to your questions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:45, 25 February 2018 (UTC)

Thank you. -- Juergen 95.223.151.37 16:30, 25 February 2018 (UTC)

API: action=compare

With action=compare, one can retrieve diffs via API. Does anyone know how to create the diff of an item creation, i.e. when there is no parent revision id to hand over to parameter fromrev? A value of fromrev=0 does not work. For Wikipedia, an empty fromtext= parameter instead of fromrev does the job (example), but that does not work for Wikidata items (example with error). I suspect that I have to provide parameters fromcontentformat=, fromcontentmodel=wikibase-item, and maybe frompst as well (example), but it does not work and I am running out of ideas. Unfortunately, there has been little use of action=compare at Wikidata so far, so I can’t read other code… Thanks for help and input, MisterSynergy (talk) 09:35, 22 February 2018 (UTC)

@MisterSynergy: fromtext={ "type": "item", "id": "…" } + fromcontentformat=application/json + fromcontentmodel=wikibase-item seems to work: example. (Hint: if you use Special:ApiSandbox (example), fromcontentformat and fromcontentmodel get much easier to figure out since it offers you a selection of valid values.) --Lucas Werkmeister (WMDE) (talk) 22:29, 25 February 2018 (UTC)
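A minimal Python sketch of the parameter combination described above, for anyone who wants to script it. It only builds the request URL and sends nothing. The item ID `Q42`, the revision number and the helper name `creation_diff_url` are hypothetical placeholders; the use of `torev` for the first revision of the item is an assumption, as the reply above only names the `from…` parameters.

```python
import json
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def creation_diff_url(item_id: str, first_revision: int) -> str:
    """Build an action=compare URL diffing an empty item against a revision."""
    # An "empty" item serialised as JSON stands in for the missing parent revision.
    empty_item = json.dumps({"type": "item", "id": item_id})
    params = {
        "action": "compare",
        "format": "json",
        "fromtext": empty_item,
        "fromcontentformat": "application/json",
        "fromcontentmodel": "wikibase-item",
        "torev": first_revision,  # assumption: first revision of the item
    }
    return API_URL + "?" + urlencode(params)


# Hypothetical item ID and revision number, for illustration only.
url = creation_diff_url("Q42", 123456)
print(url)
```

As noted in the reply, Special:ApiSandbox lists the valid values for fromcontentformat and fromcontentmodel, so it is worth checking there before relying on the strings used in this sketch.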

SQL error

Maybe somebody sees what's wrong with this query? The hyphen in use lv-wiki_p; is intentional. The query was still running after 30 minutes, so I had to kill it. What does the query do? It selects all lvwiki articles that have a Commons category link but no Commons category (P373) in Wikidata. --Edgars2007 (talk) 13:59, 26 February 2018 (UTC)

Wikidata weekly summary #301

Here's your quick overview of what has been happening around Wikidata over the last week.

Events/Press/Blogs
- Upcoming: IMLD-ODD 2018 Wikidata India Edit-a-thon, February 21st to March 3rd
- Upcoming: presentation of the paper "Knowledge Graphs and Pluralism on Wikidata", February 27th, Luxembourg
- Upcoming: Wikidata workshop in Hyderabad, India, March 2nd

Other Noteworthy Stuff
- Stewards election is running until February 28th
- Structured data on Commons ontology discussion continues until March 1st.
- Decision about the licensing of Lexeme namespace
- It's been possible for a while (but not previously reported here) to include Wikidata IDs in Wikivoyage listings, like this example edit.
- Q50000000 was created on February 23rd.
- You can give feedback about this newsletter
- Constraint checks will be integrated in the interface of Wikidata

Did you know?

Development
- Make it possible to link to Lexemes and Statements (phab:T1854997)
- Disabling senses for the first release of Lexemes (phab:T186995)
- Caching for constraints check

You can see all open tickets related to Wikidata here.

Monthly Tasks
- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals
- Suggested and open tasks!
- Contribute to a Showcase item.
- Help translate or proofread the interface and documentation pages, in your own language!
- Help merge identical items across Wikimedia projects.
- Help write the next summary!

Read the full report · Unsubscribe · Lea Lacroix (WMDE) 15:48, 26 February 2018 (UTC)

Manuscripts without language or without collection

I have a favour to ask of the community. On Tuesday there will be a meeting in Oxford University to discuss ways to make collections more discoverable, and there will be a demo of a couple of Wikidata-driven sites. If the demo impresses, this might mean more formal work with Wikidata and more data being released. I'm rushing to make the demo as impressive as I can, and it would help me if there were more rich records about manuscripts. I've improved a lot of records but am concentrating on coding. It would help if more manuscript (Q87167) items had collection (P195) and language of work or name (P407) properties. Here are the manuscripts without a collection (P195):

SELECT DISTINCT ?q ?qLabel ?qDescription ?enwp
WHERE {
  # instances of manuscript (Q87167) or of any of its subclasses
  ?q (wdt:P31/wdt:P279*) wd:Q87167 .
  # drop items that already have a collection (P195)
  MINUS { ?q wdt:P195 [] }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
  OPTIONAL { ?enwp schema:about ?q ; schema:isPartOf <https://en.wikipedia.org/> }
} ORDER BY DESC(?qLabel)

Try it!

and here are manuscripts with a collection (P195) but no language of work or name (P407):

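SELECT DISTINCT ?q ?qLabel ?qDescription ?enwp
WHERE {
  # instances of manuscript (Q87167) or of any of its subclasses
  ?q (wdt:P31/wdt:P279*) wd:Q87167 .
  # keep only items that have a collection (P195) ...
  ?q wdt:P195 [] .
  # ... but no language of work or name (P407)
  MINUS { ?q wdt:P407 [] }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
  OPTIONAL { ?enwp schema:about ?q ; schema:isPartOf <https://en.wikipedia.org/> }
} ORDER BY DESC(?qLabel)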

Try it!

Any work done to improve these records between now and Tuesday is much appreciated, but of course is beneficial to Wikidata and the humanities irrespective of the Oxford situation. Thanks in advance for any help, MartinPoulter (talk) 12:36, 25 February 2018 (UTC)
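The two queries above differ only in which property must be present and which must be missing, so the pattern can be generated for other property pairs. The following Python sketch is one way to do that; the function name `missing_property_query` is hypothetical, and the optional English Wikipedia link from the original queries is left out for brevity.

```python
from typing import Sequence


def missing_property_query(class_id: str, required: Sequence[str], missing: str) -> str:
    """Return SPARQL for items of a class that lack one property."""
    lines = [
        "SELECT DISTINCT ?q ?qLabel ?qDescription WHERE {",
        # instances of the class or of any of its subclasses
        f"  ?q wdt:P31/wdt:P279* wd:{class_id} .",
    ]
    for prop in required:
        # properties the item must already have
        lines.append(f"  ?q wdt:{prop} [] .")
    # the property the item must not have
    lines.append(f"  MINUS {{ ?q wdt:{missing} [] }}")
    lines.append(
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }'
    )
    lines.append("} ORDER BY DESC(?qLabel)")
    return "\n".join(lines)


# Manuscripts with a collection (P195) but no language of work or name (P407).
query = missing_property_query("Q87167", ["P195"], "P407")
print(query)
```

Passing an empty list for `required` reproduces the shape of the first query (manuscripts without a collection) when `missing` is set to `"P195"`.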

I see there are 105 French manuscripts without a collection that are also Historical Monument (Q916475). For these, I'm not sure we can find a collection (P195) (I'm pretty sure there is no collection for them; they are mostly lone items left at the back of churches :/ ) but I will try to add a language of work or name (P407). Cdlt, VIGNERON (talk) 14:03, 25 February 2018 (UTC)

Is there a P407 for manuscripts comprising illustrations only? There seem to be many of them. --Tagishsimon (talk) 20:09, 25 February 2018 (UTC)

I can help with these. - PKM (talk) 20:29, 25 February 2018 (UTC)

@MartinPoulter: Just some remarks about the current bad classification in WD:

- you mention manuscripts, but what about exemplars of printed books, like the ones from the first edition of the Bible by Gutenberg?

- how can we integrate manuscripts in the FRBR classification currently used for work and edition ? Snipre (talk) 21:48, 25 February 2018 (UTC)

I create a work for the intellectual content and an edition for each manuscript, but there may be a better way to do this. - PKM (talk) 04:05, 26 February 2018 (UTC)

Can someone check my work disentangling the four manuscripts of Cantigas de Santa Maria (Q1033824)? It could use another set of eyes. - PKM (talk) 04:05, 26 February 2018 (UTC)

Ah, I see we have property exemplar of (P1574) which is better than "edition or translation of" (which I suppose should be used for printed editions). I'll fix my work. - PKM (talk) 20:48, 26 February 2018 (UTC)

This query can help you quickly find items that belong to a particular collection and do not have a P195 yet. Just look at the "image" column; it often looks like "commons:Papyrus 44 - Metropolitan Museum of Art 14.1.527.jpg" or similar. Syced (talk) 09:56, 26 February 2018 (UTC)

Merge noel (Q28135297) and Christmas carol (Q503354) ?

Any Russian speaker could decide whether to merge or not?

The Russian article seems centered on France, which might be incidental. Thanks! Syced (talk) 08:30, 26 February 2018 (UTC)

Have you contacted the creator of the item about this? Sjoerd de Bruin (talk) 09:42, 26 February 2018 (UTC)

I would say they could be merged, though it is indeed good to discuss with the creator.--Ymblanter (talk) 10:33, 26 February 2018 (UTC)

If you look at the data on these two items, it seems that Christmas carol (Q503354) is a Christmas carol in general and that noel (Q28135297) is the subclass for French Christmas carols. I'm not sure what Christmas carols in France have that makes them different from Christmas carols in general (as far as I know, there is nothing different between the two concepts).

@Infovarius: can you explain a little bit?

Cheers, VIGNERON (talk) 13:24, 26 February 2018 (UTC)

I happen to be a native Russian speaker.--Ymblanter (talk) 16:01, 26 February 2018 (UTC)

It is worth noting that the Russian Wikipedia does not seem to have an article about Christmas carols, just this article which is mostly about French Christmas carols (which as a French person surprises me). Maybe information for other countries has just not been added yet? Maybe the article was imported from an old Russian encyclopedia from the times when French trends mattered in Russia? Syced (talk) 02:02, 27 February 2018 (UTC)

How do I know which extensions are installed on Wikidata?

Hi all

I'm really interested in using the Dynamic Page List extension, how do I find out what extensions are already installed on Wikidata?

Thanks

--John Cummings (talk) 18:48, 26 February 2018 (UTC)

Special:Version. Sjoerd de Bruin (talk) 18:53, 26 February 2018 (UTC)

@Sjoerddebruin:, thanks :) --John Cummings (talk) 18:58, 26 February 2018 (UTC)

Also note that further rollout of this extension is blocked due to performance issues. Sjoerd de Bruin (talk) 18:59, 26 February 2018 (UTC)

Global preferences available for testing

Please help translate to your language.

Greetings,

Global preferences, a highly requested feature in the 2016 Community Wishlist, is available for testing.

Read over the help page, it is brief and has screenshots
Login or register an account on Beta English Wikipedia
Visit Global Preferences and try enabling and disabling some settings
Visit some other language and project test wikis such as English Wikivoyage, the Hebrew Wikipedia and test the settings
Report your findings, experience, bugs, and other observations

Once the team has feedback on design issues, bugs, and other things that might need to be worked out, the problems will be addressed and global preferences will be sent to the wikis.

Please let me know if you have any questions. Thanks! --Keegan (WMF) (talk) 00:24, 27 February 2018 (UTC)

History book and related items

Earlier on there was a request for deletion at Wikidata:Requests_for_deletions/Archive/2018/02/23#Q30277550 concerning historiography (Q30277550) (Spanish only). The only participants in the discussion - @Valentina.Anitnelav:, @Mnnlaxer:, and me - agreed to merge it with historical non-fiction (Q1517777) (German only) and history book (Q10916116) (East Asian languages only).

After I merged the items and marked the deletion discussion as resolved without requiring administrator action, Andreasmperu unmerged them, saying that what I did amounted to "delete valid labels, descriptions, and statements to force a merge". When I raised on his talk page that there was actually an RfD consensus to merge the items, Andreas simply restated his position in the RfD archive page and is adamant that he is right. Since the RfD is now archived, I'm opening up the discussion here to attract further discussion. Deryck Chan (talk) 15:46, 25 February 2018 (UTC)

I think history book (Q10916116) is not a genre but a subclass of book. historical non-fiction (Q1517777) and history book (Q10916116) should be used in instance of (P31) statements, while historiography (Q30277550) can be used in genre (P136) claim. history book (Q10916116) is a subclass of historical non-fiction (Q1517777) because not all historical non-fiction (Q1517777) is created in book form. In conclusion, historiography (Q30277550), historical non-fiction (Q1517777) and history book (Q10916116) can't be merged. --Okkn (talk) 00:56, 26 February 2018 (UTC)

This is typical conflict in WD classification: use of properties or use of instance/subclass for the classification ? Snipre (talk) 11:06, 26 February 2018 (UTC)

@Okkn: Are you looking from the perspective of the labels, statements, or the sitelinks? When I looked at the content of the Wikipedia articles, I felt that they are basically three cultures' interpretations of the same thing - written non-fiction about history. Deryck Chan (talk) 11:21, 26 February 2018 (UTC)

@Deryck Chan: Items of Wikidata are usually more specific than corresponding Wikipedia articles, and in my opinion historiography (Q30277550), historical non-fiction (Q1517777) and history book (Q10916116) are ontologically different concepts. If the contents of the Wikipedia articles are the same, you can gather the sitelinks into one item without merging items. --Okkn (talk) 12:08, 26 February 2018 (UTC)

@Okkn:

Comment That's the reason I requested removal of that user's sysop access; sadly, that discussion was simply closed as no consensus without the necessary voting. For now I will monitor his contributions regularly; if such situations still occur within the following year, I would consider submitting a second request with more complex but also more pointed reasons. --Liuxinyu970226 (talk) 13:35, 26 February 2018 (UTC)

I'm just talking about whether or not the history book and related items can be merged. I think historiography (Q30277550) is to history book (Q10916116) what Chinese cuisine (Q27477249) is to Chinese restaurant (Q868580). So I think they ought to be left as they are. Of course, I don't mind if all of the sitelinks are gathered into historiography (Q30277550) (or historical non-fiction (Q1517777)), as long as you don't merge the items. --Okkn (talk) 17:29, 26 February 2018 (UTC)

@Okkn: The articles do seem similar enough in scope that collecting all the sitelinks to one place would help readers find equivalent concepts in other cultures. But yes, I understand your point that we can create three ontologically different concepts here. On the other hand, if we choose to collect all the sitelinks, the other two items might cease to have sufficient WD:N because they are arbitrary subdivisions of historical non-fiction (Q1517777) and don't really serve a structural purpose with anything else. Deryck Chan (talk) 12:29, 27 February 2018 (UTC)

@Deryck Chan: historical non-fiction (Q1517777) and history book (Q10916116) have corresponding category items Category:Historical works (Q13357529) and Category:History books (Q5648742), so historical non-fiction (Q1517777) and history book (Q10916116) are needed to describe the relationship between Category:Historical works (Q13357529) and Category:History books (Q5648742).

historiography (Q30277550) and 【historical non-fiction (Q1517777) and history book (Q10916116)】 are completely different, as mentioned above, so what can be the value of genre (P136) is only historiography (Q30277550) I think. Now historical non-fiction (Q1517777) is set as an instance of genre (Q483394), but I think it is wrong. historical non-fiction (Q1517777) is a subclass of literary work (Q7725634) so it can't be a genre, but a superclass or metaclass of individual/specific historical work. --Okkn (talk) 13:02, 27 February 2018 (UTC)

@Okkn: Thanks. Can you edit the items? To be honest, I don't quite understand the difference between the three items, which is why this discussion started in the first place. Deryck Chan (talk) 13:35, 27 February 2018 (UTC)

Contributor and donator

I want to state that a person was a forced contributor (Q20204892) to a specific tax and also that a person was a voluntary patron of the arts (Q15472169) who gave money to start a public item. What property can be used? Can anyone help? Breg Pmt (talk) 21:59, 26 February 2018 (UTC)

@Pmt Maybe contributor to the creative work or subject (P767) and sponsor (P859)/P1962 (P1962)? - Kareyac (talk) 05:54, 27 February 2018 (UTC)

I was thinking of the opposite direction: who donated or contributed. For example, item "John Smits" contributed 150 dollars to poll tax (Q721368), or "John Smits" donated 150 dollars to University of Tromsø – The Arctic University of Norway (Q279724). Breg Pmt (talk) 16:28, 27 February 2018 (UTC)

@Pmt participant in (P1344) donation (Q1124860) budget (P2769) 150 dollars determination method (P459) poll tax (Q721368) looks not much better. - Kareyac (talk) 17:54, 27 February 2018 (UTC)

Names of people

What is our obligation for naming people at Wikidata in entry titles? Do we go with the full name, the most common name, the English Wikipedia name? Some language Wikipedias always go by the full name, English Wikipedia goes with the most common name and there are a lot of edit wars over names ... do we have a policy? – The preceding unsigned comment was added by Richard Arthur Norton (1958- ) (talk • contribs) at 15:00, 27 February 2018‎ (UTC).

For English, see Help:Label. In general, that may be the enWP names, but these frequently contain some disambiguation.
--- Jura 16:59, 27 February 2018 (UTC)

Mixed architectural styles

At Benjamin Cornelius Jr. House (Q50163547), a historic house notable for its architecture, I have a source that describes the architecture as primarily style A, with some details in styles B and C. I represented this situation with architectural style (P149) by stating style A with preferred rank, and stating styles B and C with normal rank and qualified by determination method (P459):detail (Q1201081). Anyone have any opinions on whether this is a good/bad approach? — Ipoellet (talk) 05:40, 27 February 2018 (UTC)

You may use applies to part (P518) as qualifier as well, if it is identifiable part of house (like façade (Q183061) etc.).--Jklamo (talk) 09:11, 27 February 2018 (UTC)

Great idea. I did that elsewhere in the same item with regard to the date a veranda was added to the house. In the absence of such distinct parts, maybe applies to part (P518):detail (Q1201081) is a better qualifier than determination method (P459):detail (Q1201081)? — Ipoellet (talk) 16:47, 28 February 2018 (UTC)
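One practical consequence of the rank choice discussed above: the query service's truthy `wdt:` prefix returns only best-rank statements, so once style A is preferred, a plain `wdt:P149` lookup no longer returns styles B and C. The Python sketch below illustrates that selection rule on a simplified statement model. The `Statement` class and the placeholder values "style A/B/C" are hypothetical and are not the Wikibase data model itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Statement:
    value: str
    rank: str = "normal"  # "preferred", "normal" or "deprecated"
    qualifiers: Dict[str, str] = field(default_factory=dict)


def best_rank(statements: List[Statement]) -> List[Statement]:
    """Return preferred statements if any exist, otherwise the normal ones."""
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    # Deprecated statements are never part of the best-rank set.
    return [s for s in statements if s.rank == "normal"]


# architectural style (P149) statements as described in the thread,
# with applies to part (P518): detail (Q1201081) on the minor styles.
styles = [
    Statement("style A", rank="preferred"),
    Statement("style B", qualifiers={"P518": "Q1201081"}),
    Statement("style C", qualifiers={"P518": "Q1201081"}),
]

print([s.value for s in best_rank(styles)])
```

Queries that need the detail styles as well would have to go through the full statement nodes (`p:`/`ps:`) rather than `wdt:`.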

Please can the Dynamic Page List extension be installed on Wikidata?

Hi

Please can the Dynamic Page List extension be installed on Wikidata? I would like to use it as part of an improved Wikidata:Data Import Hub that allows each data import to have its own page, categorised by topic, data type, import status etc. Dynamic Page Lists would allow the automation of to-do lists for different kinds of tasks and remove almost all of the manual work.

Thanks

--John Cummings (talk) 17:32, 27 February 2018 (UTC)

Support - NavinoEvans (talk) 17:43, 27 February 2018 (UTC)
Comment: further rollout of this extension is blocked due to performance issues, as noted above. The Phabricator task also notes "this extensions might go from being hard to being impossible to deploy for our big wikis, so people should keep that in mind". Sjoerd de Bruin (talk) 18:55, 27 February 2018 (UTC)
@Sjoerddebruin: thanks very much for the information, so it's been an issue for two years with no movement..... can you suggest some ways to get it moving? I'm really quite stuck without it I think. --John Cummings (talk) 09:58, 28 February 2018 (UTC)
You could try to find some volunteer who is willing to work on this, as noted in the task. I don't know if they would be motivated to work on something that might not be even deployed in the future. Sjoerd de Bruin (talk) 10:02, 28 February 2018 (UTC)

Searching recent changes in only one language

Have we any tool to find recent changes of labels, descriptions and aliases in only one language, ex: hy? Have we any tool to find recent changes filled in language I need (ex: name in native language (P1559))? - Kareyac (talk) 18:02, 27 February 2018 (UTC)

For finding recent changes of labels, descriptions and aliases in a specific language, there is this tool. From 'type of edits', select 'terms' and input the language code of the language you want to use. Note, that the tool only shows edits that have not been patrolled. Shinnin (talk) 18:18, 27 February 2018 (UTC)

Thank you. - Kareyac (talk) 18:53, 27 February 2018 (UTC)

@Kareyac: the recently announced WDVD tool can also do this :) for example: https://tools.wmflabs.org/wdvd/index.php?lang=hy&description=on&labels=on (note that it takes a while to load since apparently there aren’t many unpatrolled hy changes) --Lucas Werkmeister (WMDE) (talk) 19:33, 27 February 2018 (UTC)

@Lucas Werkmeister (WMDE): I know and even use that good tool :) I was asked by hywiki users to find all changes in Armenian in WD, even with low ORES damaging score. Thank you. - Kareyac (talk) 22:29, 27 February 2018 (UTC)

@Kareyac: You may find this query helpful. It's not as clean as you or I may like it, but it does what you want. Mahir256 (talk) 23:19, 27 February 2018 (UTC)

@Kareyac: User:Yair rand/DiffLists.js allows for filtering recent changes, along with watchlists and other change feeds. Options include only including certain languages/properties/projects and/or filtering out particular languages/properties/projects. (The script currently doesn't work with the most recent RC redesign, so fiddling with preferences may be required to make it work.) --Yair rand (talk) 23:33, 27 February 2018 (UTC)

@Mahir256: Thanks, I can fork it in hy.

@Yair rand: Thanks, I’ll introduce it to more understanding local users.

I think this collection of tools and methods should not be lost in page history. Have we a page like en.wikibooks:Help:Tracking changes or meta:Help:Watching pages to list them? - Kareyac (talk) 04:53, 28 February 2018 (UTC)
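For scripted filtering, one further option is to match the automatic edit summaries that Wikibase writes for term edits. The sketch below assumes summaries of the form `/* wbsetlabel-add:1|hy */ …`, with the language code after the `|`. That format is an assumption based on commonly seen summaries, not something stated in this thread, and edits made through other API modules may use different summaries. The function name and the sample summaries are hypothetical.

```python
import re
from typing import Iterable, List

# Assumed auto-summary shape: /* wbset<term>-<action>:<count>|<language> */
TERM_SUMMARY = re.compile(
    r"/\*\s*wbset(label|description|aliases)-[a-z]+:\d+\|([a-z-]+)\s*\*/"
)


def term_edits_in_language(summaries: Iterable[str], lang: str) -> List[str]:
    """Keep only the summaries of term edits made in the given language."""
    hits = []
    for summary in summaries:
        match = TERM_SUMMARY.search(summary)
        if match and match.group(2) == lang:
            hits.append(summary)
    return hits


# Hypothetical sample summaries for illustration.
sample = [
    "/* wbsetlabel-add:1|hy */ example label",
    "/* wbsetdescription-set:1|en */ English writer",
    "/* wbsetaliases-add:2|hy */ alias one, alias two",
    "/* wbcreateclaim-create:1| */ [[Property:P31]]: [[Q5]]",
]

print(term_edits_in_language(sample, "hy"))
```

The same pattern could be applied to comments fetched from a recent-changes feed; statement edits such as name in native language (P1559) would need a separate check, since the language there sits in the value rather than in the summary prefix.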

Protection policy: highly used items

We're trialling some templates in Wikispecies, which internationalise common terms by pulling labels from Wikidata. For example, the template species:Template:Botanist displays the word "botanist" in the user's preferred language, using the labels from botanist (Q2374149). I envisage around 50-100 such templates will be required, using an equal number of Wikidata items.

I propose that we semi-protect the items concerned, permanently, so that new and unregistered users cannot edit them; and to adopt a policy that any item used for labelling on a sister project may be so protected. Are there any good reasons not to do this? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:20, 27 February 2018 (UTC)

Yes, there are good reasons. Most of the editors in the projects are not autoconfirmed on Wikidata, and they will not be able to add articles to the existing items and to move articles. Whereas this is obviously not needed for large established projects, this is needed for small projects, where these pages have not been created yet.--Ymblanter (talk) 21:31, 27 February 2018 (UTC)

Agreed with Ymblanter. I think that unregistered users are really useful contributors to these items in "small" languages. For "main" languages, any vandalism will be detected rather quickly. Pamputt (talk) 06:43, 28 February 2018 (UTC)

I don't see good reason not to do this but I don't see good reason either. I'm mostly confused: what is the reason for protection exactly? « item used for labelling on a sister project may be so protected » so you want to protect almost all items on Wikidata? (or I am missing a subtlety?) I don't believe the risk of vandalism of label being missed is very high, a label is often used on multiple places, if it's not seen on a small project it can be seen elsewhere (and ultimately, there is still the Wikidata community that check the recent changes). For instance, the label in french of botanist (Q2374149) is only used on a few pages on the French Wikisource but it's used on thousands of pages on the French Wikipedia. PS: why 50-100 templates, why not make only one general template? VIGNERON (talk) 07:33, 28 February 2018 (UTC)

The reason for protection would be that the label would appear via template on thousands pages in the projects, and vandalism here would immediately become visible there without even showing up on any watchlists in the projects.--Ymblanter (talk) 08:01, 28 February 2018 (UTC)

Ok, so it would not be for all labels, only highly used labels. That makes a bit more sense but I'm still confused: if an item is highly used, then it's probably highly watched too. Good point about it not being seen in watchlists, though. But if this is highly visible, wouldn't it be quickly seen and corrected? And if we semi-protect the items, a Wikispecies editor who is not autoconfirmed on Wikidata will not be able to correct a mistake (which can be vandalism but can also be made in good faith by a confirmed user), thus making corrections harder. Isn't that contrary to the intention here? Cdlt, VIGNERON (talk) 08:46, 28 February 2018 (UTC)

Items tour update request

Hi, all. I've just completed the Items tour in the tutorial, linked from the homepage. Looks like we've updated Wikidata to say "publish" instead of "save," but the last step in the item tour still instructs new users to click 'save'. Not sure where that content lives or who can edit it, but should be a quick fix! Jami430 (talk) 16:51, 28 February 2018 (UTC)

Done Now this must be fixed in the translations. Matěj Suchánek (talk) 17:09, 28 February 2018 (UTC)

cancelled films

I found Larrikins (Q22906279) which is a cancelled film from DreamWorks Animation, see [7]. instance of (P31) film project (Q18011172) seems to be suitable for this, but what about publication date (P577), maybe <no value>? Maybe we can say somehow that it was set for a February 16, 2018 release? And what about the other statements, especially part of the series (P179)? Queryzo (talk) 08:53, 22 February 2018 (UTC)

Maybe <no value> as preferred rank, and February 16, 2018 as deprecated rank with optional qualifier "reason":"cancelled film". --Edgars2007 (talk) 09:49, 23 February 2018 (UTC)

reason for deprecated rank (P2241) seems to be nice, but which Q should I take??? Queryzo (talk) 20:44, 28 February 2018 (UTC)