Wikidata talk:WikiProject Tennis


Doubles record in template

In nl:sjabloon:Infobox tennisspeler I have added a piece of code to include the doubles record from WikiData if the field dubbelrecord is blank. When this works out well, we can introduce maybe more fields and update the data in a centralised way (or hopefully a bot that retrieves data from WTA/ATP). Edoderoo (talk) 15:47, 14 June 2013 (UTC)[reply]

Very good! I looked into it and it seems to work like a charm. We can then start to introduce more Wikidata properties into the infobox templates using your code snippet. At some later point, once we're sure that the data from Wikidata is the same as or even more accurate than on the local wiki, we could possibly even remove the "if"-clause. Many properties are still missing on Wikidata, however, partly because there's no number datatype yet. Regarding the data import, you're right, we'll certainly need a bot to update the data from ATP/WTA/? websites on a regular basis. As I understand, user:Hawk-Eye operates a rather sophisticated one on the French wiki; I'll ask him if he's interested in running one here, too.--Kompakt (talk) 02:35, 15 June 2013 (UTC)
On the Dutch wiki they reverted all changes; they do not want to use data from Wikidata yet. Obviously they prefer old data over more recent data. But I have added the doubles/singles record to the Danish, Spanish, Italian, Russian, Macedonian and Hungarian templates. When the data comes from Wikidata, it is printed in italics instead of regular script. I have also been very active in creating the data (updated from the WTA/ATP websites) for all players active at Roland Garros and now at Wimbledon. I've seen more users are active on this, so Wikidata starts to pay off. Hopefully the other hyper-dynamic field related to tennis (earned $$$) is going to be implemented soon too. Edoderoo (talk) 11:57, 24 June 2013 (UTC)
Thank you for your contributions. I agree that Wikidata is getting started by now, although we'd hope some things would move a little faster. I think it's best if we focus on having the data up-to-date and reliable on Wikidata. Once we're sure about this, it should be an easy job to convince people in the local wikis to use Wikidata as a source. I mean, storing this data on Wikidata, instead of updating it on each local wiki, is obviously the right thing to do; there's no doubt about that. For a couple of properties including the prize money, we lack the number datatype, which unfortunately won't come until August afaik. We also need some way to express the career grand slam results that are shown in the infoboxes (e.g. Wimbledon singles: Final (2003), etc.).
In the meantime, I've written a little bot to import the ITF/WTA-ids along with some other basic properties for all tennis players. I'll start it as soon as I get an approval for the task.--Kompakt (talk) 11:07, 26 June 2013 (UTC)[reply]
Do you think that a bot can also update the doubles record and singles record from the WTA/ATP websites? That is what I hope for.
I've seen many players having their IDs filled in, and I've started to use them on the {{ATP}}/{{WTA}} templates as well in some languages.
If we get the prize money property in August, that would be great, as it's only one month from now ;-) In the meantime we'll be busy updating the other fields. Edoderoo (talk) 11:17, 26 June 2013 (UTC)
In general, the bot could import data from the atp/wta websites, like singles/doubles record as well, but I haven't implemented that functionality as yet. I can see if I can add this, too. When the bot is run, you can be sure that atp/wta/itf-ids for all tennis players on English, French and German Wiki is available on Wikidata. It's a good idea to integrate this into the {{ATP}}/{{WTA}} templates!--Kompakt (talk) 11:25, 26 June 2013 (UTC)[reply]
Please let me know when you can update those variable fields automatically, and if I can be of any help. Until then I will do some manual work, right now that is creating the properties on the players that are active on current tournaments. Edoderoo (talk) 11:30, 26 June 2013 (UTC)[reply]
Alright, I'll tell you when the bot is able to import those fields.--Kompakt (talk) 11:44, 26 June 2013 (UTC)[reply]
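For readers who do not read template syntax, the fallback discussed in this thread (use the local infobox field if it is filled in, otherwise show the Wikidata value in italics) can be sketched in Python. This is only an illustration of the logic: the function name is hypothetical, and the real implementations are written in each wiki's own template code.

```python
def doubles_record_cell(local_value, wikidata_value):
    """Return the wikitext for the infobox cell, or '' if nothing is known.

    local_value: what the editor typed into the infobox field (may be blank).
    wikidata_value: the value stored on Wikidata (may be missing).
    """
    local = (local_value or "").strip()
    if local:
        # The local field wins whenever it is not blank.
        return local
    remote = (wikidata_value or "").strip()
    if remote:
        # Two apostrophes on each side are wikitext italics, so readers
        # can tell that the value came from Wikidata.
        return "''" + remote + "''"
    return ""
```

Removing the "if"-clause later, as suggested above, would amount to always taking the second branch.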

Ok, so my bot user:Kompakt-bot is up and running now, and importing several basic properties (nationality, place of birth, WTA/ATP/ITF id etc.) for tennis players into Wikidata. It's also updating the fields for doubles and singles record from the ATP and WTA website. Everything seems to work fine so far. I plan to run it regularly from now on in order to keep those dynamic fields up-to-date, so it's not necessary to edit them manually. Please let me know if you have any questions or remarks.--Kompakt (talk) 18:36, 1 July 2013 (UTC)[reply]

Great news! Wonderful. You used the word updating on the doubles and singles record. Does your bot also add the field when it's missing? Edoderoo (talk) 21:37, 1 July 2013 (UTC)[reply]
Yes, actually, in the first run, that's only what it's doing. It only creates the values if they don't exist as I didn't want to overwrite existing data during the first run. In future runs, I'd change the code for the dynamic fields, so it will also change existing data.--Kompakt (talk) 04:27, 2 July 2013 (UTC)[reply]
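The difference between the first run (create only) and later runs (also change existing data) could be expressed roughly like this. It is a hedged sketch of the policy described above, not the actual code of Kompakt-bot; `plan_edit` and its `overwrite` flag are hypothetical names.

```python
def plan_edit(existing, fetched, overwrite=False):
    """Decide what to do with one dynamic field of one player.

    existing: value currently on Wikidata, or None if the statement is missing.
    fetched:  value read from the ATP/WTA website, or None if unavailable.
    overwrite: False mimics the first run, True mimics later runs.
    """
    if fetched is None:
        return "skip"      # nothing to write
    if existing is None:
        return "create"    # both modes add missing values
    if overwrite and existing != fetched:
        return "update"    # only later runs touch existing data
    return "skip"
```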

About tennis rankings on Wikidata + something else

Hi! Does anyone know whether any of our tennis-related properties are now in use on Wikipedia? At least, I personally would like to see tennis rankings (highest [higher priority] + weekly) on Wikipedias coming from Wikidata. However, this needs a lot of work...

  1. First I would like to see https://www.wikidata.org/wiki/Wikidata:Properties_for_deletion (P1119 (P1119)) finally archived. The result seems to be that the ranking properties (singles & doubles) will be deleted and we will use a property called "ranking".
  2. When #1 has been done, we can move to #2, which is adding rankings to tennis player items. User:Kompakt already imported a lot of useful information into tennis player items last year using his bot. He is not running his bot anymore, so now we need to find a new bot, or do all the work manually. At least I will try to help as much as I can.

I collected some statistics:

So it seems that we have now over 5,300 tennis players on Wikidata. English Wikipedia contains almost 5,000 tennis players. --Stryn (talk) 19:04, 3 August 2014 (UTC)[reply]

A question

If we use Wikidata's data on Wikipedia, for example, rankings, then if the ranking coming from Wikidata is different than the ranking on Wikipedia, how to make those pages shown in a category, like "Category:Tennis players with different ranking on Wikidata"? This would be useful. Also, when data is coming from Wikidata, should we delete the old rankings from Wikipedia? I don't see any reason to keep outdated information on Wikipedia. --Stryn (talk) 19:04, 3 August 2014 (UTC)[reply]

I have been adding the data from Wikidata to the infoboxes of some languages, like Russian and Spanish. When there was data from Wikidata, that was shown, or in some languages, if the field in the infobox was blank, it would take Wikidata (if available). As you can show data only once, it will come either from one source or from the other. It's hard to tell which one is newer. I can imagine someone could write a bot script to find and list the differences. But my suggestion would be: keep Wikidata up to date, and use it wherever possible. Unfortunately, not all languages supported Wikidata. On :de they declined my edits in the infobox, only because the win-loss balance was shown with a dash instead of a colon (40:37 is common on :de, and other languages usually use 40-37). On the Dutch wiki they declined use of Wikidata completely, although some users now try to push it through in some cases, a year after the latest discussions. Edoderoo (talk) 22:39, 23 August 2014 (UTC)
Thanks for your reply! On fi-wiki we have a Category:Articles where the IMDb value is not the same as on Wikidata (currently there are no such articles). Yep, I would also suggest keeping Wikidata up to date, but it's easier said than done. I don't know where to start. I think that there must be some way to solve the de-wiki issue (maybe using Lua?). I'm going to add some properties to the tennis player infobox template on the Finnish Wikipedia in the near future, but we haven't had the win-loss balances etc., so I need to think about whether we should add them on fi-wiki also. --Stryn (talk) 18:22, 25 August 2014 (UTC)
Exactly a year ago there was one user who used a bot to update Wikidata from the WTA-website. It would be good if such a bot would run more frequently, but I do not have the knowledge to do that :-( Now I update players "randomly" by hand, and so do others, but this is not the easiest nor the best way to my idea. Edoderoo (talk) 08:09, 26 August 2014 (UTC)[reply]
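The tracking-category idea from the question above boils down to a small comparison. The following Python sketch only illustrates the logic; on a wiki it would live in template or Lua code, and the function name is hypothetical. The category name is the one proposed in the question.

```python
TRACKING_CATEGORY = "Category:Tennis players with different ranking on Wikidata"


def tracking_category(local_ranking, wikidata_ranking):
    """Return a category link if both rankings exist and disagree, else ''."""
    local = (local_ranking or "").strip()
    remote = (wikidata_ranking or "").strip()
    if local and remote and local != remote:
        return "[[" + TRACKING_CATEGORY + "]]"
    # Equal values, or one side missing: nothing to report.
    return ""
```

Note that this only reports a difference; as said above, it cannot tell which of the two values is newer.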

In the German Wikipedia, I tried to include the values from Wikidata. But there are two problems:

  1. Wikidata isn't up to date. (Maybe it's possible that users of the German Wikipedia update the German, Austrian and Swiss tennis players, while users of the English Wikipedia update the players of Great Britain, the USA, and so on.)
  2. Some important properties do not exist in Wikidata (prize money, height, weight)

By the way, I solved the de-wiki issue. The German Wikipedia has a pattern that can replace the dash with a colon. --Korrektor123 (talk) 15:19, 23 September 2014 (UTC)
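The de-wiki formatting fix mentioned here (40-37 becomes 40:37) is a simple substitution. A minimal Python sketch, assuming the record is stored as two numbers separated by a hyphen or en dash; the function name is hypothetical and this is not the actual de-wiki code:

```python
import re

# Two groups of digits separated by a hyphen or an en dash.
_RECORD = re.compile(r"^\s*(\d+)\s*[-–]\s*(\d+)\s*$")


def to_dewiki_record(record):
    """Turn '40-37' into '40:37'; leave anything unexpected unchanged."""
    match = _RECORD.match(record)
    if not match:
        return record
    return match.group(1) + ":" + match.group(2)
```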

Some insight from me (German wikipedia) as well:

  1. I have already called out for a respective bot, however, there are legal restrictions that from my point of view disallow such a regular bot, please see Wikidata:Bot_requests/Archive/2014/03#Tennis-statistics, [[1]] and especially [[2]].
  2. I tried to update Roger Federer's wikidata information in the best possible manner, but this takes just forever on an older pc. If there were the possibility to import data via an interface, that would be good.

As a conclusion, we might have the following solution: an externally hosted Excel document (Google Docs?) in the format Q...|P....|New Value|Source (e.g. ATP/WTA website)|Date of validity, which we all update manually each week after the new rankings have been published and which is then put into WD by a bot. Note that if there is no source, lots of Wikipedias will not allow Wikidata to feed their infoboxes (dewp being one of them). This would allow quick copy & paste in the spreadsheet by people from all the different tennis projects on WD and the various WPs, followed by a batch import of this data into WD.

Hope this helps/gives ideas --Mad melone (talk) 13:16, 15 October 2014 (UTC)[reply]
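To make the proposed row format concrete, here is a hedged Python sketch of how a bot could validate one row before importing it. The class and function names are hypothetical, the ISO date format (YYYY-MM-DD) is an assumption, and the example values in the usage note are made up for illustration, not real statistics.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass
class Update:
    item: str       # e.g. "Q10132"
    prop: str       # e.g. "P564"
    value: str      # new value as typed into the sheet
    source: str     # e.g. "ATP website"
    valid_on: date  # date of validity


def parse_row(row):
    """Parse 'Q...|P...|New Value|Source|Date of validity' into an Update."""
    parts = [part.strip() for part in row.split("|")]
    if len(parts) != 5:
        raise ValueError("expected 5 fields, got %d" % len(parts))
    item, prop, value, source, valid_on = parts
    if not re.fullmatch(r"Q[1-9]\d*", item):
        raise ValueError("bad item id: " + item)
    if not re.fullmatch(r"P[1-9]\d*", prop):
        raise ValueError("bad property id: " + prop)
    if not source:
        # Rows without a source are rejected, since unsourced values
        # would not be accepted by several Wikipedias.
        raise ValueError("missing source")
    return Update(item, prop, value, source, date.fromisoformat(valid_on))
```

For example, `parse_row("Q10132|P564|700-140|ATP website|2014-10-13")` would yield an `Update` for item Q10132 and property P564, while a row with an empty source field would be refused.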

I see your point now. We can type over the info, but when we use a bot to crawl the same data, it is suddenly seen as a copyright violation. Obviously this goes against my common-sense gut feeling of justice. But if we could create the above-mentioned workaround, it would ease the pain. Manually crawling the data is indeed a pain in the butter, because the ATP/WTA sites are often slow and require a lot of tabs to open. Creating an automated list of updates outside Wikipedia, manually typing them into Google Docs and importing that into Wikidata would be another (quicker) way. Asking ATP/WTA via OTRS to be allowed to crawl their data would also be a good way to get past this issue. Edoderoo (talk) 13:36, 15 October 2014 (UTC)
Yeah, thanks for pointing this out, Mad melone. I agree with Edoderoo that it goes against common sense, but of course we must respect the rules. At least the current rankings are easy to take and put into an Excel document. Other data (win-loss balance etc.) are harder. And it would be nice if someone could contact ATP/WTA and ask if they could have some easier solution for us. --Stryn (talk) 14:24, 15 October 2014 (UTC)
I agree that it would be perfect if WTA/ATP may give us some workaround for that. I am right now in the process of trying (sic!) to get accredited for the ATP World Tour Finals - if that happens, I will try to talk someone there one on one, which is always better than emails. However, I would like to point out some points in that regard that have held me back to contact them back when I had investigated the issue:
  1. WTA/ATP clearly hold a copyright on the data and put some effort (equals money) in their systems, so I don't know if they would be too happy to give it away, because...
  2. all information published on any Wikimedia project is basically for free use, i.e. generally (!) published under a Creative Commons or similar licence. Thereby, if ATP/WTA wanted to allow us to automatically crawl their data, they would allow this in the sense that their copyright wouldn't be worth that much, because everyone could automatically crawl WD to obtain the information. There may be a way around this, but I am not sure. As far as my legal understanding goes, Wikimedia projects need something more than a permission to use data: they need someone to publish it under a CC or similar licence, i.e. information we use can never be limited to our own free usage, but must always ensure free usage by anyone else.
  3. However, I don't exactly know how this plays in this field, because data points from data bases can be used freely by everyone, as long as manually obtained, i.e. eventually we wouldn't require WTA/ATP to publish their data bases under CC/alike, but a permission to crawl their websites/interface their database may be enough, because the information would already be free to be used in its existence.

Hope this makes sense; I am not a native English speaker, and trying to show differences in legal statements is really difficult in another language. Basically it comes down to (just in order to avoid misunderstandings): Can ATP/WTA grant us permission to get their data automatically (this should be ok from a legal POV), and is this sufficient for us to enable free use of our databases for all end users (that's where I am not sure)? I would like to get this sorted by the legal team before we approach WTA/ATP, otherwise we would look really unprofessional, and in the worst case they would give us access and everything and we would have to turn it down for legal reasons regarding free usage...--Mad melone (talk) 14:45, 15 October 2014 (UTC)

You should not ask to license their whole website under ccbysa ;-) You should ask if we would be allowed to crawl very specific data items, like the win/loss balance, earned $, and highest position/current position details. Just that, so no descriptions, no images, and only in order to keep the data we already take from their site more in sync with reality. If possible we can link back to them like we already do, either as a source on Wikidata, or to the profiles from the tennis players' articles. I'm still not sure how much copyright they could ever claim on such data, as facts cannot be copyrighted. Edoderoo (talk) 19:20, 15 October 2014 (UTC)

@Edoderoo: Sorry if I wasn't clear about that, but I am indeed only referring to the database. Please see the discussion to which I linked above. ATP/WTA reserve a copyright to their database contents. Still, we may use such data if not obtained automatically. The kicker for me basically is: if ATP/WTA allows us to crawl their website/database for these specific data points, is that enough to fulfill the free usability requirements for end users that our licences demand? This is a known problem with OTRS tickets in e.g. photographers give user rights to Wikimedia, but don't publish their photos under CC/alike licence. Simply put: we can't act on a "don't ask, don't tell" basis ;) @Lydia Pintscher (WMDE):: Lydia, some advice or direction you may provide on this? --Mad melone (talk) 08:23, 16 October 2014 (UTC)[reply]

The WMF legal team has provided meta:Wikilegal/Database Rights as an overview. I am happy to help talk to specific organisations about their licenses if you wish. --Lydia Pintscher (WMDE) (talk) 10:10, 16 October 2014 (UTC)[reply]
I'm an OTRS-agent myself, I recognise the problems with images. When a photographer grants rights like "Wikipedia only" we will not upload the image to Commons, but we'll ask for a more specific release of rights and explain the full consequences of ccbysa. We should do the same if it comes that WTA/ATP will allow us to crawl their data, they should have a statement that will take away all the issues, and they should be fully aware of the consequences. I'm not sure how they will react on it anyways, as we already have the data, it's only not as up to date as we would like ourselves. Either they will like that we are always behind, and then they will simply not give permission (then it's good to know for us, we can forget about it then), or they are willing to help getting us more up to date, and then such a statement is only for legal purposes. I believe the ATP/WTA do not have "generating web traffic" as their main goal, so maybe they will like to help us when we ask. They facilitate rankings and tournaments for tennis players as their main goal, and maybe having Wikipedia up to date is somehow in their interest too. I will read Lydia's link after lunch too, I guess it's interesting! Edoderoo (talk) 10:25, 16 October 2014 (UTC)[reply]
Hi Lydia, thanks for the link, but please note that this was also included in the links above (not to say that you should have read all of that, just to say that I am aware of it already). Perhaps we could have the legal team take a quick look at this; from my perspective it is a special question that isn't covered in the Database Rights summary by the legal team. Would that be possible? Further, thanks for the offer of contacting ATP/WTA for us, we might come back to that.--Mad melone (talk) 10:30, 16 October 2014 (UTC)
Ah sorry. I did indeed miss that. Yeah if you have things that should be covered in that document it is best to ask the question on the talk page there. That should get an answer soonish. If not I can poke. --Lydia Pintscher (WMDE) (talk) 11:29, 16 October 2014 (UTC)[reply]

FYI: m:Talk:Wikilegal/Database_Rights#Crawling_websites_to_obtain_data_points --Mad melone (talk) 12:49, 16 October 2014 (UTC)[reply]

FYI2: In the meantime, I have reached out to the legal team via email and they have responded that they will have a look at it.--Mad melone (talk) 11:33, 21 October 2014 (UTC)[reply]
Splendid! Edoderoo (talk) 13:09, 21 October 2014 (UTC)[reply]
@Lydia Pintscher (WMDE):: Sorry for the second call for help in a day ;) Could you please poke them as you have offered above? So far, I have sent them a friendly reminder just to ask about the status and have not gotten a reply since. Thanks a lot, much appreciated.--Mad melone (talk) 09:35, 28 October 2014 (UTC)

Update: Two more reminders sent out to the legal team dated 31 October and 11 November with no response - I am getting really frustrated about this. --Mad melone (talk) 09:41, 13 November 2014 (UTC)[reply]

Ranking

PfD for P1119 (P1119) --ValterVB (talk) 19:39, 26 September 2014 (UTC)[reply]

Tournament draws

Another idea I came up over the weekend, but I want to see whether this is possible first, that's why I put it out on the larger audience. Wikidata:Project_chat#Tournament_draws --Mad melone (talk) 11:09, 29 October 2014 (UTC)[reply]

Infoboxes for doubles/singles records

Many players exist in many languages, and now that we often have the singles/doubles record statistics up to date, I have added that information to many local infoboxes:

  • :nl - reverted by anonymous user
  • :dk - formatted italics when from Wikidata
  • :en - only when local infobox field is blank
  • :fr - not yet implemented
  • :es - formatted italics when from Wikidata
  • :it - only when local infobox field is blank
  • :sv - formatted italics when from Wikidata
  • :pl - not yet implemented
  • :ru - formatted italics when from Wikidata
  • :pt - suddenly got reverted by an ip-address??? twice
  • :mk - formatted italics when from Wikidata
  • :hu - doesn't work, tricky language
  • :de - changes reverted in 2013
  • :sr - changes reverted jul 2014
  • :sl - formatted italics when from Wikidata
  • :cs - formatted italics when from Wikidata
  • :bg - formatted italics when from Wikidata
  • :tr - formatted italics when from Wikidata
  • :ro - formatted italics when from Wikidata
  • :ca - formatted italics when from Wikidata
  • :sk - formatted italics when from Wikidata
  • :hr - formatted italics when from Wikidata
  • :fa - formatted italics when from Wikidata
  • :zh - formatted italics when from Wikidata

This basically means that when the WikiData properties P564 and P555 are updated, most of these articles are updated as well. Hopefully this will also lead to more languages coming to update statistics. I know that the Dutch volunteers of the tennis project already do so, and the Dutch infobox isn't even supporting WikiData yet (they add the property as field in the infobox, for wiki-political reasons). There are quite some infoboxes not changed yet, mostly of the "smaller" projects, with all respect. I personally believe this is a great move for the international tennis articles! Please give a shout when you have good ideas to make this even better/bigger. Edoderoo (talk) 20:16, 4 November 2014 (UTC)[reply]

Just an unofficial word as a member of the German WikiProject:Tennis: we are seriously looking into bringing data from Wikidata to our infoboxes, but the general baseline is that we more or less want to have a bot populating Wikidata on a regular basis before we make the move. --Mad melone (talk) 17:45, 7 November 2014 (UTC)

Probably I can do it, but the problem is the license. It must be CC0. --ValterVB (talk) 20:30, 7 November 2014 (UTC)
Updating right now is done manually, which is better for the license until we have written permission. But the above list shows that manually updating Wikidata makes sense; for many projects it is even mandatory, because when Wikidata values exist, the local values are often neglected. For smaller projects that is the best setting (my own opinion) anyway. For the few projects where my changes to the infobox were reverted, I leave the revert untouched (like the German infobox); I only want to show that it's possible, and when a user community wants to do it differently, so be it. If they want the code at a later stage, they can put it back, or ask me to inject the code again. Right now it's just fun for me to be able to make the Japanese infobox work with all the scribble that actually makes no sense to me ;-) But this list can be used as a good example to advertise keeping the properties updated. Many language articles will benefit from it. Edoderoo (talk) 22:51, 7 November 2014 (UTC)
But updating Wikidata manually is really painful: you have to enter not only the singles rank, singles w/l, doubles rank and doubles w/l, but also the respective qualifiers for source and date for each of those statements. Therefore, I would like to suggest a "semi-automated" approach in the meantime: an externally hosted Excel file (Google Docs?) in which you only have to enter the new data by player, and all the rest (date and source) would be entered automatically (by a lookup), and then we have a bot entering those data into Wikidata, let's say each Friday, which gives us Monday to Friday to update the list. If agreed, I would be happy to build such a spreadsheet. But we must ensure to contact all the local WikiProjects to gather as much (wo)man power behind this as possible. --Mad melone (talk) 09:40, 13 November 2014 (UTC)
I don't think that a spreadsheet can solve the problem. Point 7 of the Terms of Use Agreement on atpworldtour.com: «Systematic retrieval of data or other Content from the Website, including but not limited to scores, statistics, and/or rankings, whether to create or compile, directly or indirectly, a collection, compilation, database, or directory, is prohibited absent prior express written permission from ATP.» This is also a problem for Wikipedia. --ValterVB (talk) 21:57, 14 November 2014 (UTC)
It would be best if we could get written permission (via OTRS, let me know if we need assistance there) from the ATP/WTA/ITF organisations, stating that we are allowed some kind of use of some of their stats. Using the facts in the infobox is probably outside these terms, but systematic retrieval might not be. Storing the stats in Wikidata, however, might already be a collection or a database; then the question is whether manually updating Wikidata is legally a "systematic retrieval" or just ad hoc retrieval. Edoderoo (talk) 22:44, 14 November 2014 (UTC)
I wonder if tennisabstract.com has some rights to get those rankings weekly. --Stryn (talk) 12:51, 15 November 2014 (UTC)[reply]
In the German Wikipedia I wanted to include data from Wikidata, but there is one big problem: why is there no property for the prize money? I think it makes no sense to update all statistics on Wikidata if the prize money still has to be updated on the local Wikipedia projects. --Korrektor123 (talk) 09:48, 15 November 2014 (UTC)
It's waiting for number datatype, see this and Property for career prize money? --Stryn (talk) 11:06, 15 November 2014 (UTC)[reply]
The creation of new properties is for some reason awfully slow. There is a number property, although it has an annoying extra feature, adding ±1 to every number used. This $-property would be really helpful, so if someone knows how to speed this up, please do. Edoderoo (talk) 17:02, 15 November 2014 (UTC)[reply]
I could create just now if we would have a number datatype which is needed for this property. We have a quantity datatype, but that's not the same. --Stryn (talk) 19:25, 15 November 2014 (UTC)[reply]
Ah, oh ok, I see the difference now. They've added a new dimension to numbers then ;-) Maybe if we call the $ property qty of dollars earned it could be used ;-) Edoderoo (talk) 20:41, 15 November 2014 (UTC)[reply]

Tournament victories

We have P1131 (P1131) and P1130 (P1130), but how should we use those? I think that we should include both WTA/ATP and ITF tournament victories. Of course we can't combine those victories, so my suggestion is following:

Tournament winnings (Property: Item datatype)

WTA Tour (Q2537906) (Item)
P1131 (P1131) (Property: Quantity)
P1130 (P1130) (Property: Quantity)
ITF Women's Circuit (Q2701085) (Item)
P1131 (P1131) (Property: Quantity)
P1130 (P1130) (Property: Quantity)

So we have to propose a "tournament winnings" property, and then we can do as I suggest. Any comments? --Stryn (talk) 17:53, 9 December 2014 (UTC)

Agree, with ATP World Tour and ATP Challenger Tour plus their historic equivalents for men.--Mad melone (talk) 23:23, 10 December 2014 (UTC)
Proposed at Wikidata:Property_proposal/Unsorted. --Stryn (talk) 19:45, 23 December 2014 (UTC)[reply]

Occupation of wheelchair tennis players?

Should we use occupation (P106): tennis player (Q10833314) or make a new item called wheelchair tennis player? --Stryn (talk) 19:42, 23 December 2014 (UTC)

I think it would be an advantage if there is a new item called wheelchair tennis player. For example: If a Wikipedia project wants something different in the infobox to wheelchair tennis players than to the other tennis players, the infobox template could use the new wikidata-item. --Korrektor123 (talk) 17:10, 19 January 2015 (UTC)[reply]
Agreed. Done at wheelchair tennis player (Q18814798). --Stryn (talk) 19:08, 19 January 2015 (UTC)[reply]
Thank you. Something else I'd like to ask: is there any tool for wikidata which works similar to the catscan for the wikipedia? When I want to know which wikidata items are using occupation (P106) with the value wheelchair tennis player (Q18814798) I can't find such a list except the category of a wikipedia project but there are maybe more items in wikidata which are using that property than in the local category. --Korrektor123 (talk) 15:28, 20 January 2015 (UTC)[reply]
There is. The tool is called Autolist, and here is the query for occupation (P106): wheelchair tennis player (Q18814798). --Stryn (talk) 16:19, 20 January 2015 (UTC)[reply]
Thanks. I've seen that nearly every wheelchair tennis player has no itf-id. Now I'm busy for a while:-) --Korrektor123 (talk) 16:53, 20 January 2015 (UTC)[reply]
I've seen that the ITF player ID before 2020 (archived) (P599) has a formatter URL (P1630) with http://www.itftennis.com/procircuit/players/player/profile.aspx?playerid=$1. so the wheelchair tennis players cannot be found cause they need /wheelchair/ instead of /procircuit/. Is there any possibility to check if the player is a tennis player (Q10833314) or a wheelchair tennis player (Q18814798) and then to create a new link dependent on the request? --Korrektor123 (talk) 14:55, 30 January 2015 (UTC)[reply]
It doesn't matter what is written to formatter URL (P1630). It matters what's at MediaWiki:Gadget-AuthorityControl.js. Does there exist any link where the ID would always go to the profile, no matter is it a wheelchair tennis player or pro circuit tennis player? If not, then I think that we have to make new property for wheelchair ITF ID. --Stryn (talk) 22:46, 31 January 2015 (UTC)[reply]
I didn't find a way to call both kind of players with the same URL. What we also could do is to write the whole URL instead of the ID but then we had to reprogram the templates using the ID from wikidata. So I think it's the best way if we make a new property. --Korrektor123 (talk) 17:30, 3 February 2015 (UTC)[reply]
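If a template or module did want to pick the link from the occupation instead of using a second property, the check asked about above might look like the following Python sketch. The formatter URL and the item ID are the ones quoted in this thread; the assumption that the wheelchair profile URL differs only in the /wheelchair/ path segment comes from the comment above and is not verified here, and the function name is hypothetical.

```python
from urllib.parse import quote

# Formatter URL of P599 as quoted above; $1 is the placeholder for the ID.
FORMATTER_URL = (
    "http://www.itftennis.com/procircuit/players/player/"
    "profile.aspx?playerid=$1"
)
WHEELCHAIR_TENNIS_PLAYER = "Q18814798"


def itf_profile_url(player_id, occupations):
    """Build the profile link from the ITF ID and the item's P106 values."""
    if WHEELCHAIR_TENNIS_PLAYER in occupations:
        section = "wheelchair"
    else:
        section = "procircuit"
    url = FORMATTER_URL.replace("/procircuit/", "/" + section + "/")
    # Percent-encode the ID so that odd characters cannot break the URL.
    return url.replace("$1", quote(str(player_id), safe=""))
```

This only helps on the template side; as noted above, the link shown on Wikidata itself depends on the gadget, which is one reason a separate property is the simpler route.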

What is the correct instance of (P31) for items about "career statistics"

For the items seen here? Any suggestions? --Stryn (talk) 12:21, 27 December 2014 (UTC)

I found two items to which some statements had already been added (Andy Murray career statistics (Q4761079) and Sabine Lisicki career statistics (Q16224644)), but IMO they are wrong and should be deleted. --Stryn (talk) 12:23, 27 December 2014 (UTC)

Seedlists on Wikidata

I had the idea to save the seed lists of tennis tournaments in Wikidata. I think we could use participant (P710) for the seed list. Then we can list the names of all seeded players with position played on team / speciality (P413) as the seed number. What do you think? --Korrektor123 (talk) 17:45, 20 January 2015 (UTC)

I'm not sure what you mean... Maybe you can add a small example, either on an old tournament, or maybe on the Australian Open that just started? Edoderoo (talk) 21:28, 20 January 2015 (UTC)[reply]
So I'll try again. (Example at 1966 Australian Championships – men's singles (Q1751631)) I wanted to save the lists of seeds in wikidata. The property where the seed-list could be saved is participant (P710), so I thought. Furthermore the seed-position of the players could be saved with the property position played on team / speciality (P413). Now I just want a feedback if it's a good or rather a bad idea. --Korrektor123 (talk) 15:35, 22 January 2015 (UTC)[reply]
Right, I see what you mean. It's a bit of work, but if we can create the tournament layout created out of that, it will be worth the effort. The next question will then be, how will we create a nl/Dutch tournament schedule out of this (the nl.wiki article is right now missing). If someone knows how to do just that, it will be extremely powerful I think! Edoderoo (talk) 16:22, 22 January 2015 (UTC)[reply]
You mean, a template should use the seeds-list from wikidata to bring the list from wikidata into the nl-wikipedia, don't you? Well, I don't know if the nl-wikipedia has the same syntax as the de-wiki, but I'll try to develop such a template. --Korrektor123 (talk) 16:30, 22 January 2015 (UTC)[reply]
I will try to convert a German template into a Dutch one ... I speak a bit of German, and I speak a bit of template as well ;-) Edoderoo (talk) 17:42, 22 January 2015 (UTC)[reply]
Oh, I think there is a little problem. The de-wiki has a template for the seeds-list but at the moment we aren't using data from wikidata. --Korrektor123 (talk) 13:52, 23 January 2015 (UTC)[reply]
Where does it get its data from? Or do you have any idea/direction how to set up such a template? Then maybe I can puzzle it out myself. Edoderoo (talk) 09:13, 24 January 2015 (UTC)[reply]

In the de-wiki we have to type in the data in the article. For example: If we want to have a seeds list in any article we have to type in:

{{Setzliste
| Anzahl =
| Modus =
| 1A =
| 1R =  
| 2A =
| 2R =  
...
}}

1A is the player seeded at position 1. 1R is the round in which the player seeded at position 1 lost. The same goes for 2A and 2R. So I think we have to create a new template or reprogram the existing one. I know that the de-wiki has a Lua module which can read data from Wikidata (Module:Wikidata (Q12069631)). But I don't know if the module is able to sort the properties. --Korrektor123 (talk) 12:02, 24 January 2015 (UTC)[reply]
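
As a rough illustration of the sorting question, here is a minimal Python sketch. It assumes the seeds have already been read from Wikidata into plain tuples of (seed number, player, round lost); the function name `render_setzliste`, the input shape, and the sample values are hypothetical, and `Anzahl` and `Modus` are left empty as in the skeleton above.

```python
def render_setzliste(seeds):
    """Render a {{Setzliste}} call from (seed, player, round_lost) tuples.

    The input shape is an assumption for illustration; it is not what
    Module:Wikidata returns.
    """
    lines = ["{{Setzliste", "| Anzahl =", "| Modus ="]
    # Sort by seed number so the order of the stored statements does not matter.
    for seed, player, round_lost in sorted(seeds, key=lambda s: s[0]):
        lines.append(f"| {seed}A = {player}")
        lines.append(f"| {seed}R = {round_lost}")
    lines.append("}}")
    return "\n".join(lines)


# Placeholder players and rounds, deliberately given out of order.
example = render_setzliste([(2, "Player B", "Semifinal"), (1, "Player A", "Final")])
print(example)
```

Sorting happens in the rendering step, so the statements on the item would not need to be stored in seed order.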

Interesting articles about data[edit]

Just interesting reading:

I wonder if we could use some data from tennisabstract.com as they are now available as .csv files at github (atp and wta). --Stryn (talk) 13:16, 3 April 2015 (UTC)[reply]

I love the heavytopspin databases; we should be able to extract at least the historic rankings for each player - anyone?--Mad melone (talk) 11:49, 22 August 2015 (UTC)[reply]
I can have a look, once it is clear what we exactly want to do. Edoderoo (talk) 08:52, 31 July 2016 (UTC)[reply]

A few things[edit]

So, a few ideas I have for this WikiProject, and what I think we need to do:

  1. We need a userbox for the participants, this is easy to do, but it'd be better if #2 were completed before we do it.
  2. We need a logo, kinda like these.
  3. An idea: the tennis entries about people need to be updated often, like once a month at least, and for that, it'd be good if we could volunteer ourselves for updating around 12 items (six guys, six girls maybe) for the ranking, records, etc. once a month on a set day. If we do this, I call dibs on Rafael Nadal (Q10132), Sabine Lisicki (Q76338), and Maria Sharapova (Q11666)!
  4. We need a separate page for participants (like this), and that can be transcluded to the main project page. ✓ Done

Thoughts? --AmaryllisGardener talk 18:30, 20 May 2015 (UTC)[reply]

Perhaps this may be of interest to Stryn or Edoderoo? --AmaryllisGardener talk 19:26, 21 May 2015 (UTC)[reply]
It is; I actually started creating a logo yesterday, but then I gave up, since the logo that I created was too boring IMO, and not even .svg; I only have GIMP. I would like to get P1119 at Wikidata:Properties for deletion resolved before going on to #3. Also, I'm still not sure whether we should add the latest ranking, the best ranking, or both. --Stryn (talk) 13:20, 22 May 2015 (UTC)[reply]
Wow, finally the PfD regarding singles and doubles rankings has been closed (permalink). Now I would like to know what kind of rankings we should add, I mean only the latest or the best or all what they have... for example Serena Williams (Q11459) is quite random at the moment. --Stryn (talk) 18:04, 10 June 2015 (UTC)[reply]
@Stryn: Well, at least the PfD is over. I think we definitely need to only cover the current and highest rankings right now. The highest ranking could be done with the ranking property with a qualifier (instance of: highest ranking?) or we could create a "highest ranking" property and rename the "ranking" property "current ranking" (but, seeing the pain we had to go through getting those other properties deleted, having the latter done would probably be a pain to get done too.) --AmaryllisGardener talk 20:19, 10 June 2015 (UTC)[reply]
P.S. and IMO if we stick to my plan above, then we'd need to put "point in time" instead of "start date" as the qualifier for the rankings. --AmaryllisGardener talk 15:31, 11 June 2015 (UTC)[reply]
Another thing, how will we separate men's and women's tennis rankings, create items titled "men's tennis singles", "women's tennis doubles", etc.? --AmaryllisGardener talk 16:01, 11 June 2015 (UTC)[reply]
Renaming the "ranking" property would be too difficult as there are already many other items than tennis players (or table tennis players) using that item. Well, this is messy, more discussion and users to discuss is needed. --Stryn (talk) 19:27, 11 June 2015 (UTC)[reply]
My opinion is: let's restrict ourselves to the highest ranking; it moves only slowly, and only one way: UP. (Much the same as the win/loss-balance.) The current ranking position is too volatile, takes up a lot of maintenance effort, offering little relevance in any future. Whenever I encounter a 'current ranking' documented, I replace it by the highest ranking. Vinkje83 (talk) 19:04, 14 September 2015 (UTC)[reply]
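
The "highest ranking with a point in time" idea from this thread can be sketched in a few lines of Python. This is only an illustration: the (rank, date) pairs are made-up sample data, and `highest_ranking` is a hypothetical helper, not an existing bot function.

```python
from datetime import date


def highest_ranking(rankings):
    """Return the best (numerically lowest) ranking and the earliest date it was held."""
    # Ties on rank are broken by the earlier date, i.e. when the peak was first reached.
    return min(rankings, key=lambda r: (r[0], r[1]))


# Made-up sample data: (ranking, point in time).
rankings = [
    (15, date(2013, 6, 10)),
    (12, date(2014, 1, 27)),
    (12, date(2013, 9, 9)),
    (20, date(2015, 5, 18)),
]
best = highest_ranking(rankings)
print(best)
```

Because the highest ranking only ever improves, such a value would need updating far less often than the current ranking.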

ATP links are broken after the new layout[edit]

FYI: https://en.wikipedia.org/wiki/Wikipedia_talk:WikiProject_Tennis#New_ATP_website --Stryn (talk) 08:11, 13 June 2015 (UTC)[reply]

Tournaments[edit]

I've been hard at work adding winner (P1346) to tennis tournaments. Maybe we should get a runner-up property. Also, any more thoughts about how to handle rankings? Maybe we should start a discussion on PC or even start a RfC. --AmaryllisGardener talk 20:36, 7 November 2015 (UTC)[reply]

+1 for a runner-up property and +1 also for a discussion about rankings somewhere else than on this talk page. --Stryn (talk) 13:11, 8 November 2015 (UTC)[reply]
@Stryn: I've made a proposal for a "runner-up" property here. --AmaryllisGardener talk 16:27, 10 November 2015 (UTC)[reply]
Should be do-able with qualifiers, shouldn't it? I think we should limit the number of different more-or-less single purpose properties to a minimum.--Mad melone (talk) 10:52, 12 November 2015 (UTC)[reply]

Wikimania 2016[edit]

Only this week left for comments: Wikidata:Wikimania 2016 (Thank you for translating this message). --Tobias1984 (talk) 12:05, 25 November 2015 (UTC)[reply]

Tennis infobox[edit]

Hello! Do you know yet any Wikipedias that use tennis infobox with data from Wikidata? I just created some start at fi:Malline:Tennispelaaja2 (in use on fi:Brian Teacher). We would still need many more properties on Wikidata. --Stryn (talk) 14:19, 5 July 2016 (UTC)[reply]

Two years ago I added some properties to the infoboxes of the Spanish (es) and Russian (ru) wikis, along with quite a few others. I may have missed the Finnish one back then, sorry for that. At that time we could only use the win/loss balances, so most languages use just these two properties. On some wikis my edits were reverted (on the German wiki because I used a slash--- instead of a colon::: between the numbers ;-), but on quite a few my edits stayed and were even improved later on. Now that these properties have existed for more than two years, I see that they are maintained pretty well, better than they are maintained in the infoboxes themselves. We might need a few more properties, especially the ones that change over time (the win/loss balances and prize money are there, though). Edoderoo (talk) 13:24, 18 August 2016 (UTC)[reply]

Grand Slam results[edit]

How do we show the players results in Grand Slam tournaments? I think we should store the best results in Aus/Fra/Wim/Us for singles/doubles/mixed tournaments. Do we have to propose some new properties? --Stryn (talk) 13:34, 12 July 2016 (UTC)[reply]

Anyone? These are used in tennis infoboxes and would be nice to get the data from Wikidata to Wikipedias. Stryn (talk) 15:16, 23 June 2017 (UTC)[reply]

FedCup[edit]

Oioioi. It seems that the FedCup changed their website, but on top of that, they also changed the id's for their players. They used to have the same id as the ITF-id, but now they changed. First of all they seem to start with an 8 instead of a 1, but the other digits also changed, and the old link/id doesn't work anymore. We need to create a plan to fix this. Maybe I can create something in Python to do this automagically. Edoderoo (talk) 13:26, 18 August 2016 (UTC)[reply]

Such a bad move by the ITF to change those IDs and break thousands of links, not only on Wikipedia. Maybe someone could contact them and ask if they can set up redirects from the old links? --Stryn (talk) 13:28, 22 August 2016 (UTC)[reply]
Someone else and I asked by mail for a translation table, but we didn't get any response at all, not even a simple "no, you won't get that" reply. I'm afraid we will have to update all players manually. Edoderoo (talk) 20:00, 27 August 2016 (UTC)[reply]
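
If a translation table is ever collected by hand, the bookkeeping part could look like the Python sketch below. Everything in it is hypothetical: the function name `plan_id_updates`, the item names, and the ID values are placeholders, since no official old-to-new mapping exists.

```python
def plan_id_updates(current_ids, translation):
    """Split items into those with a known new ID and those needing manual lookup.

    current_ids: {item: old_id} as currently stored
    translation: {old_id: new_id}, collected by hand (no official table exists)
    """
    updates, manual = {}, []
    for item, old_id in sorted(current_ids.items()):
        new_id = translation.get(old_id)
        if new_id is None:
            manual.append(item)      # no known new ID yet
        else:
            updates[item] = new_id   # safe to update automatically
    return updates, manual


# Placeholder items and IDs, purely for illustration.
current = {"item-A": "100000001", "item-B": "100000002"}
known = {"100000001": "800000042"}
updates, manual = plan_id_updates(current, known)
print(updates, manual)
```

The second list is the remaining manual work, which shrinks as the hand-made table grows.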

Data source available[edit]

Hi,

thought I just let this here: https://github.com/JeffSackmann --195.145.215.1 12:33, 2 January 2017 (UTC)[reply]

I don't think it is usable: "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License" --ValterVB (talk) 12:36, 2 January 2017 (UTC)[reply]

Davis Cup record[edit]

Can the singles record (P564) and doubles record (P555) properties be used to indicate a player's record in the Davis Cup (or Fed Cup), like at István Gulyás (Q51066)?--Wolbo (talk) 17:32, 31 March 2017 (UTC)[reply]

I think it's ok. Stryn (talk) 15:14, 23 June 2017 (UTC)[reply]

Good news!

I discovered a different entry path to the player profiles at the WTA website:

http://www.wtatennis.com/player-profile/$1

This one accepts the numeric id without further addition. This allows us to get rid of the (ugly) alphabetic title extension to WTA player ID (P597), and stick to the numeric id proper.

Kompakt
Stryn
AmaryllisGardener
Edo de Roo
Wolbo
Matlab1985
Soundwaweserb
Pommée
Mad melone
Kacir
A.Gust14
wallerstein-WD
Sakhalinio
Somnifuguist
See the bright light (talk)
Juanman
DonPedro71

Notified participants of WikiProject Tennis

I have already introduced this entry path and started editing players' Wikidata records. That works. The links of players not yet simplified still work, so we have time to convert.

Kind regards, Vinkje83 (talk) 20:18, 19 October 2017 (UTC)[reply]

@Vinkje83: I can convert them with bot... --Edgars2007 (talk) 10:59, 21 October 2017 (UTC)[reply]
@Edgars2007: Great! On this list I converted the first 500 players already manually. I will gladly leave the remainder to your bot. Thanks and greetz. Vinkje83 (talk) 11:09, 21 October 2017 (UTC)[reply]
@Vinkje83: done. --Edgars2007 (talk) 16:38, 23 October 2017 (UTC)[reply]
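
For reference, the conversion itself is small. The Python sketch below assumes that an old WTA player ID (P597) value starts with the numeric id and that the alphabetic extension follows it; that layout and the sample value are assumptions, and this is not the code the bot actually ran.

```python
import re

PROFILE_URL = "http://www.wtatennis.com/player-profile/{}"


def numeric_wta_id(value):
    """Return the leading numeric part of a WTA player ID value, or None."""
    match = re.match(r"\d+", value.strip())
    return match.group(0) if match else None


def profile_url(value):
    """Build the profile link from the numeric id, or None if there is none."""
    numeric = numeric_wta_id(value)
    return PROFILE_URL.format(numeric) if numeric else None


# Made-up sample value with an alphabetic extension.
print(numeric_wta_id("12345/title/some-player"))
print(profile_url("12345"))
```

Values that are already purely numeric pass through unchanged, so running the conversion twice does no harm.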

WTA draw PDFs[edit]

Good news, again.

I discovered where the dearly missed WTA draw PDF files went.

Old address:

http://www.wtatennis.com/SEWTATour-Archive/Archive/Draws/yyyy/nnn

Now to be found at:

https://wtafiles.blob.core.windows.net/pdf/draws/archive/yyyy/nnn

Notified participants of WikiProject Tennis

Vinkje83 (talk) 00:29, 3 November 2017 (UTC)[reply]
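
Existing links could be rewritten mechanically by swapping the address prefix. A minimal Python sketch, assuming the part after the prefix (yyyy/nnn) is unchanged between the old and new locations, as the two addresses above suggest; `rewrite_draw_url` is a hypothetical helper name.

```python
OLD_PREFIX = "http://www.wtatennis.com/SEWTATour-Archive/Archive/Draws/"
NEW_PREFIX = "https://wtafiles.blob.core.windows.net/pdf/draws/archive/"


def rewrite_draw_url(url):
    """Replace the old draw-archive prefix with the new one; leave other URLs alone."""
    if url.startswith(OLD_PREFIX):
        return NEW_PREFIX + url[len(OLD_PREFIX):]
    return url


# "yyyy/nnn" is kept as the placeholder used in the addresses above.
print(rewrite_draw_url(OLD_PREFIX + "yyyy/nnn"))
```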

Problem with ~3700 tennis tournament items (+6440 items)[edit]

Hey WikiProject Tennis, there is a fundamental problem with ~3700 tennis tournament items: they use instance of (P31) with one of these 9 items (“women's doubles” and similar) as values (example: 1996 Japan Open Tennis Championships – women's doubles (Q3807213)). However, the nine items do not subclass sports competition (Q13406554) or tennis tournament (Q13219666), so the ~3700 tournament items do not have event-character at all. The items about events are instances of sport (Q349), which is in fact plain wrong as well.

I propose to implement this solution:

Any objections? Would we have to consider use of these items by any modules in Wikipedias? If not, I offer to implement the necessary changes. —MisterSynergy (talk) 12:34, 7 December 2017 (UTC)[reply]

Notified participants of WikiProject Tennis: since I did not hear any complaints about my suggestion, may I assume silent consent? So this is a final call: I am going to start this job tomorrow or on Monday, if nobody objects until then. Regards, —MisterSynergy (talk) 21:59, 9 December 2017 (UTC)[reply]
Hi MisterSynergy, I was not aware that this question had been raised; hence no earlier reaction.
I am not sure that I understand the 'fundamental problem' that you mentioned above. You write: "the ~3700 tournament items do not have event-character". Picking up on your example: 1996 Japan Open Tennis Championships – women's doubles (Q3807213) is part of (P361) 1996 Japan Open Tennis Championships (Q3807212) which instance of (P31) Japan Open Tennis Championships (Q1025949) which is subclass of (P279) tennis tournament (Q13219666), which I consider to have event-character.
Applying instance of (P31) tennis tournament (Q13219666) to these ~3700 tournament items would be fundamentally redundant, since they are all linked to that item already through the chain that I specified above.
In case the part of (P361)-property is missing on any of these tournament items, then *that* is the defect to be repaired.
I cannot understand why competition class (P2094) "women's doubles" would be any better than instance of (P31) "women's doubles".
The current construction is the result of previous discussion, quite some time ago. Unfortunately I have not been able to find back where that discussion took place. I only remember that it pointed out to me that the subclass of (P279)-property that I used before that, was apparently theoretically wrong – I was specifically instructed to use the instance of (P31)-property instead.
Vinkje83 (talk) 22:46, 9 December 2017 (UTC)[reply]
Thanks for the reply. The part-of relations are good to have, however only P31 and/or P279 claims define which kind of character the items have and here the items are lacking a suitable value. For instance in the example item 1996 Japan Open Tennis Championships – women's doubles (Q3807213) you have given above, winner (P1346) claims create constraint violations since the example item is not of event character. Wikidata:Database reports/Constraint violations/P1346 is flooded with those items, and the use of many other sports-related properties has the same problem. —MisterSynergy (talk) 22:57, 9 December 2017 (UTC)[reply]
I found the previous discussion at User talk:Laddo/Archive/2016/1#Structure of tennis items (thus ping @Laddo here as well). It was indeed initiated by me, and addressed the same problem. I have no idea why I lost track back then so that the problem persisted… —MisterSynergy (talk) 08:32, 10 December 2017 (UTC)[reply]
It seems that the constraint violations program that you mention, does not look 'deep enough'. If one item is 'part of' another item, it obviously inherits all its properties. It seems like the program ignores this inheritance. Ping @Stryn, who originally designed the structure of tennis tournaments. Vinkje83 (talk) 09:25, 10 December 2017 (UTC)[reply]
part-of relationships do not imply any inheritance, thus the constraint violation software intentionally “ignores” this information for the reports. This is not a bug or a software limitation. —MisterSynergy (talk) 11:02, 10 December 2017 (UTC)[reply]
Let's take 2017 Kremlin Cup (Q41165574) as an example (because the 1996 Japan Open Tennis Championships (Q3807212) that you selected, is a badly structured specimen). This combined male/female tournament has, as it should have, six items below it (two directly, four indirectly): "men", "women", "men's singles", "men's doubles", "women's singles" and "women's doubles". Are you seriously implying that properties such as country (P17), located in the administrative territorial entity (P131), location (P276) and several others, do not apply to the six constituents? That would put the door wide open for those who repeat higher-level properties on lower-level items, in flagrant violation of basic database design principles (remember: Third Normal Form, in plain English: place item properties just there where they belong, not higher, not lower), obscuring the properties that are really significant at that level by a plethora of copies of higher-level properties. For example: the fact that 2017 Kremlin Cup (women) (Q41885198) has organizer (P664)=Women's Tennis Association (Q948442) is surely *not to be repeated* on its parts 2017 Kremlin Cup – women's singles (Q42296798) and 2017 Kremlin Cup – women's doubles (Q42034292). Same objection to repeating point in time (P585)=2017 as already defined by 2017 WTA Tour (Q24908533) to which it belongs. Same for many, many other properties that implicitly 'drip down'. Vinkje83 (talk) 11:43, 10 December 2017 (UTC)[reply]
Formally, no inheritance of claims is implied anywhere when property paths along different properties are followed, so all the claims you mention would be valid on the lower-ranking item(s) even if they were redundant to claims of the higher-ranking item. However, it is of course reasonable to work efficiently and use claims from higher-ranking items as you describe in certain areas, as long as this model is documented somewhere. It is worth noting at this point that many users intentionally add such redundant content to the lower-ranking items as well, since it simplifies Wikipedia module coding for several reasons (I don’t like this either).
Here, however, we talk about P31. This property and P279 organize the content here, thus redundancy is often inevitable to get things done properly. The issue that I reported here has in fact two consequences: according to the P31 claims in the reported items, tennis tournament items instantiate sports discipline (Q2312410) (which they do not describe), and at the same time they do not instantiate tennis tournament (which they do describe). —MisterSynergy (talk) 13:11, 10 December 2017 (UTC)[reply]
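
The type check described in this exchange can be illustrated with a toy graph in Python. This is a sketch of the rule as stated here (one instance of (P31) step, then any number of subclass of (P279) steps, with part of (P361) ignored), not the actual constraint-report software; the statements are the ones quoted in this thread.

```python
# Toy statement tables built from the example discussed above.
P31 = {  # instance of
    "Q3807213": ["Q17299348"],  # 1996 Japan Open – women's doubles -> women's doubles
    "Q3807212": ["Q1025949"],   # 1996 Japan Open -> Japan Open Tennis Championships
}
P279 = {  # subclass of
    "Q1025949": ["Q13219666"],  # Japan Open Tennis Championships -> tennis tournament
}
P361 = {  # part of; present in the data but deliberately never followed below
    "Q3807213": ["Q3807212"],
}


def is_instance_of(item, target):
    """True if item reaches target via one P31 step followed by zero or more P279 steps."""
    todo = list(P31.get(item, []))
    seen = set()
    while todo:
        cls = todo.pop()
        if cls == target:
            return True
        if cls in seen:
            continue
        seen.add(cls)
        todo.extend(P279.get(cls, []))
    return False


print(is_instance_of("Q3807212", "Q13219666"))  # the edition
print(is_instance_of("Q3807213", "Q13219666"))  # the women's doubles part
```

Under this rule the edition item passes and the women's doubles item does not, even though the latter is part of the former.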
Can you explain the difference between competition class (P2094) and competition class (Q22936940)? Vinkje83 (talk) 14:04, 10 December 2017 (UTC)[reply]
The first is a property, the latter an item. Both about the same concept, but properties and items have very different roles in Wikidata and thus there are two entities. They are now also linked to each other, which was still missing for some reason. —MisterSynergy (talk) 14:11, 10 December 2017 (UTC)[reply]
If I read you correctly, you propose to apply the property (competition class (P2094)) to tournament parts. I looked at this list, and I saw only persons and teams, no tournaments. Also I read the descriptions of this property, in English, French and German – they express themselves in terms of a 'subject' (which I read as a person). Examples given are 'weight class' and 'age class', obviously geared towards persons only. I do not see its relevance to tournaments or tournament parts. Vinkje83 (talk) 16:27, 10 December 2017 (UTC)[reply]
Please have a look at the property proposal of P2094: it explicitly mentions “events, teams, participants, equipment”. As often here, “subject” refers to the item on which the property is used (as opposed to “predicates”/properties and “objects”/values of data triples). Unfortunately there were no property constraints in place until today, which I try to implement at the moment. A subject type constraint (Q21503250) will likely follow, and it will of course include sports competitions. —MisterSynergy (talk) 17:14, 10 December 2017 (UTC)[reply]
Notified participants of WikiProject Tennis again: The issue is too far-reaching to be decided by only two people. We need more opinions participating in this discussion. Also @Kacir, @Dooom84, @Mac460, @Openbk, @Mac6v5, @MisterGB, @A.Gust14, @Siebenschläferchen. Vinkje83 (talk) 08:27, 11 December 2017 (UTC)[reply]
Thanks for the ping. At first sight it seems that I would agree with Vinkje83, given the 2017 Kremlin Cup (Q41165574) example he showed; nevertheless, it's an explicitly technical problem, so my opinion here is not relevant. Side note: from category tree organization we know that there should be no duplication between lower and higher levels (with some exceptions, of course).--Kacir (talk) 16:01, 11 December 2017 (UTC) / @Matěj Suchánek ping. --Kacir (talk) 16:16, 11 December 2017 (UTC)[reply]

I've been pinged 3 times of this discussion, but I don't know what to say. And no, I didn't "originally design the structure of tennis tournaments". I only remember that I instructed some users to not mix up different topics (e.g. 2012 Sony Ericsson Open (women) (Q2302482) vs. 2012 Sony Ericsson Open – women's singles (Q305437), FYI User:Stryn/Tennis) to the same item. I have not worked with the statements on the tennis tournament items. Stryn (talk) 16:39, 11 December 2017 (UTC)[reply]

Responding to the ping. I don't have lengthy experience with Wikidata yet, so some of the concepts are a bit tricky to grasp. One of the proposals is to "replace instance of (P31) women's doubles (Q17299348) (and similar) with instance of (P31) tennis tournament (Q13219666)", and I have recently been doing the exact opposite, with the reasoning that e.g. a "women's doubles" is not a tennis tournament but a tennis tournament event or, more specifically, a tennis tournament edition event. At the English Wikipedia, where I mainly edit, tennis tournaments have three hierarchical levels:
  1. tournament (e.g. Japan Open Tennis Championships (Q1025949))
  2. tournament edition (e.g. 1996 Japan Open Tennis Championships (Q3807212))
  3. tournament edition event (e.g. 1996 Japan Open Tennis Championships – women's doubles (Q3807213)).
On some other language Wikipedias there is an intermediate level between 2. and 3. to indicate if it is a men's (ATP) or women's (WTA) tournament, and I believe Wikidata is structured like that as well. So if we use instance of (P31) at the level of the event (e.g. 1996 Japan Open Tennis Championships – women's doubles (Q3807213)), should the item not be "tennis tournament event" instead of "tennis tournament" to be able to distinguish between these levels? --Wolbo (talk) 00:02, 12 December 2017 (UTC)[reply]
Good point. The distinction between tournaments and individual events at a tournament with different instance of (P31) values is also done for a couple of other types of sport. I would not oppose doing so here as well, although this would add extra complexity. Should we elaborate this a little further? —MisterSynergy (talk) 05:58, 12 December 2017 (UTC)[reply]
It might actually make it easier to maintain. Besides, you won't have the problem with attempts to merge level 2 with level 3.
--- Jura 06:19, 12 December 2017 (UTC)[reply]

In order not to end up in a stalled situation with the many constraint violations for a second time, I’d like to ask: how do we proceed here? There is my original proposal, and the modification proposed by User:Wolbo. I can live with both, but I am also open to completely different suggestions (maybe by Vinkje83?), if that would be better for the tennis items and fixes the related constraint violations, particularly on Wikidata:Database reports/Constraint violations/P1346. —MisterSynergy (talk) 13:37, 15 December 2017 (UTC)[reply]

In the meantime, I repaired the structure of 1996 Japan Open Tennis Championships (Q3807212) (because MisterSynergy used its grandchild 1996 Japan Open Tennis Championships – women's doubles (Q3807213) in his first communication). All who are concerned with editing wikidata items for tennis tournaments should be aware that combined male/female tournaments (editions, actually) should be structured on wikidata as follows (making Wolbo's structure more complete):
                        +---------------------+
                        | combined tournament |
                        +-+-----------------+-+
                          |                 |
      +-------------------+-+             +-+-------------------+
      |    ATP tournament   |             |    WTA tournament   |
      +-+-----------------+-+             +-+-----------------+-+
        |                 |                 |                 |
+-------+-------+ +-------+-------+ +-------+-------+ +-------+-------+
| men's singles | | men's doubles | |women's singles| |women's doubles|
+---------------+ +---------------+ +---------------+ +---------------+
Note that I did not include the 'general tournament' here – this is a different animal (being of 'class' type). The edition-type items shown above are not 'class' but 'individual' (or something like that).
I know of no language wikipedia having articles on all three levels. Some (such as en, es, it) have articles on top and bottom levels. Some (such as de, nl) have articles on top and middle levels. Some (such as fr, pl) have articles on the middle level only.
The matter under discussion here (initiated by MisterSynergy) relates to the bottom level only. Its current specification instance of (P31) women's doubles (Q17299348) apparently violates some constraint, because women's doubles (Q17299348) is not an event, but of 'class' type. (My opinion that the item inherits event-nature from its parent 1996 Japan Open Tennis Championships (women) (Q3353988) is apparently not usable in practice.) What value should we give instance of (P31) then? MisterSynergy suggested tennis tournament (Q13219666). My objection to this value is that it is too high-level. The categorization principle applied on all wikis is to choose the most detailed applicable value. In the example, that would be Japan Women's Open (Q1032130) as copied from its immediate parent. This value is an (indirect) subclass of tennis tournament (Q13219666), and should (so I hope) satisfy the constraint(?). Even better would be a value which is not (yet) defined, but which I could circumscribe in English as "Japan Women's Open – Doubles".
In case this not-yet-existing item is not chosen for implementation, I vote for copying the instance of (P31) Japan Women's Open (Q1032130) from the immediate parent, through the part of (P361) property. In that case, the item needs to receive an additional property to have the value women's doubles (Q17299348). The competition class (P2094) property, as suggested by MisterSynergy, still feels to me as having person-nature – but I can live with it, provided its description is edited such as to take the impression of person-nature away.
Vinkje83 (talk) 17:29, 16 December 2017 (UTC)[reply]

Thanks, Vinkje83, for the detailed comment. I agree to a large extent to the describing part (there are some inaccuracies which do not really matter here). Comments on your proposal:

  1. If we added “Japan Women's Open – Doubles” or Japan Women's Open (Q1032130) as values for instance of (P31) in the items of the third layer, the constraint violations would be gone since those items subclass tennis tournament (Q13219666). So, technically I am fine with both approaches. However, I’d like to point out that this even increases redundancy (part of (P361) and instance of (P31) would both carry specific “Japan Women’s Open” items as values), and the number of required structural items that serve as P31 values would be rather big and thus expensive to manage, with no advantages for queries. Efficient repairs would probably not be possible either, i.e. we would have to repair a large number of the constraint violations manually. The same applies to Jura1’s “sport:tennis”-only worklist. An approach with tennis tournament (Q13219666) (or likewise Wolbo’s suggestion “tennis event” – in German „Tenniswettbewerb“ – which I really like as well) would be much simpler without disadvantages.
  2. Regarding the description of competition class (P2094): I have made it more precise according to the property proposal, and I also added a type constraint which includes all types provided during property proposal (including sports competitions).

MisterSynergy (talk) 17:09, 17 December 2017 (UTC)[reply]

Thank you for clarifying competition class (P2094).
@Dooom84 @J. N. Squire Veuillez améliorer la description Française de competition class (P2094), s'il vous plaît. Merci.
@Matlab1985 @Beta16 Please adapt the Italian description of competition class (P2094). Thanks.
Having read your most recent contribution, I withdraw my suggestion to use Japan Women's Open (Q1032130) (or lower). For the practical reasons that you mentioned, I agree with a high-level value, such as tennis tournament (Q13219666) or else "tennis tournament event" (or: "tennis tournament edition"). Could you create an item like "tennis tournament event/edition"? Subsequently we can discuss that in more detail (where needed) in a language-independent manner.
If we start working with a concept like that, I am in favour of changing the name of tennis tournament (Q13219666) (in English currently "tennis tournament") to "tennis tournament class", in order to make the distinction with "tennis tournament event/edition" more clear to those editors who are less familiar with the subject matter.
Vinkje83 (talk) 16:38, 19 December 2017 (UTC)[reply]
Okay. To better disambiguate level 1 and level 3 items, I suggest going with Wolbo’s suggestion, and I have created tennis event (Q46190676). It is already suitable for use as a value in P31 of level 3 items, but more labels/descriptions would be great of course. I have not yet used it in any other item. The model would now look like this:
This way it would work for level 1 and 3, but I am not sure about level 2 yet. Should it be put in between 1 and 3 or not in the P361 hierarchy?
tennis tournament (Q13219666) does not need to have another label. Its role depends on the property you use (P31 or P279). —MisterSynergy (talk) 08:52, 20 December 2017 (UTC)[reply]
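
One possible reading of the change for a single level 3 item, sketched in Python: the discipline value moves from instance of (P31) to competition class (P2094), and tennis event (Q46190676) becomes the P31 value. The claims are modelled as a plain dict of property to list of values, which is an assumption for illustration and not the Wikidata API format; `migrate` is a hypothetical name, and only one of the nine discipline items is listed.

```python
TENNIS_EVENT = "Q46190676"
# women's doubles; the other eight discipline items are not listed here.
DISCIPLINE_CLASSES = {"Q17299348"}


def migrate(claims):
    """Return a copy of claims with discipline values moved from P31 to P2094."""
    new = {prop: list(values) for prop, values in claims.items()}
    moved = [v for v in new.get("P31", []) if v in DISCIPLINE_CLASSES]
    if not moved:
        return new  # nothing to do for this item
    kept = [v for v in new["P31"] if v not in DISCIPLINE_CLASSES]
    if TENNIS_EVENT not in kept:
        kept.append(TENNIS_EVENT)
    new["P31"] = kept
    classes = new.setdefault("P2094", [])
    for value in moved:
        if value not in classes:
            classes.append(value)
    return new


# 1996 Japan Open – women's doubles, reduced to the two relevant statements.
before = {"P31": ["Q17299348"], "P361": ["Q3353988"]}
after = migrate(before)
print(after)
```

The part of (P361) statement is left untouched, so the existing hierarchy stays as it is.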
Higher-up in this discussion, the words 'event' and 'edition' have often been used interchangeably. In the context of tennis tournaments: are the words 'event' and 'edition' synonymous? If not, we need to find out what the difference is. Vinkje83 (talk) 23:20, 20 December 2017 (UTC)[reply]
No, they are distinct, at least in the way they are used in the English Wikipedia. Basically a tournament (Japan (Women's) Open) has specific editions (e.g. 1996 Japan Open Tennis Championships), each of which consists of one or more events (Women's Singles, Women's Doubles). If we follow this distinction, the Dutch text for tennis event (Q46190676) is currently not correct, as it refers to an edition. It should IMO be something like 'onderdeel' or 'discipline'.--Wolbo (talk) 23:58, 20 December 2017 (UTC)[reply]
Yes, correct.
  • It might be worth mentioning that we typically use the term “instance” instead of “edition” at Wikidata, likely due to the English label “instance of” for instance of (P31). So “1996 Japan Open Tennis Championships is an instance of tennis tournament”, as it is (indirectly) claimed by one of the statements outlined above.
  • Independently of that it is indeed difficult to deal with the various meanings of the English word “event”, since in its generic meaning it applies to all levels here. However, the level 1 (and 2) items of the model above (“tournaments”) follow pretty much event (Q1656682), thus they have properties such as "point in time", "location", "organizer", "website", "social media channel", "brand", "inception date", etc. In that sense they are technically not so different from items about non-sport-related entities, such as instances of music festivals, conferences, trade fairs, etc. On the other hand, level 3 items are very sport-specific (key item sporting event (Q16510064)) with properties such as "participant", "overall winner", "full results", "progression system", etc. We have such a scheme in many types of sport here at Wikidata, although it is not yet properly implemented in all fields.
  • There are typically one or a few tennis events in each tennis tournament. The special case of a tournament with only one event is somewhat difficult: level 1 to 3 basically collapse into only one item due to the way how Wikipedia articles are written in such cases, and our model does not really work. This case happens anyway, regardless of our considerations here, so it is not a blocker.
  • Maybe it is also worth to mention here that there is in fact also a "level 4" for individual tennis matches, but we do not have a significant amount of items here (one example would be Isner–Mahut match at the 2010 Wimbledon Championships (Q30801)). It would not be difficult to systematically model those later as well without being in trouble due the outcome of the current discussion.
MisterSynergy (talk) 07:44, 21 December 2017 (UTC)[reply]
Please read the intro paragraph of en:2013 U.S. National Indoor Tennis Championships (and hundreds of others) and you will find that tennis tournament articles on en-wiki use the word 'event' only for the complete tournament, not for the 'discipline' parts. I cannot recall any article using 'event' for a discipline part. All the time you were writing about 'event', I have understood that to mean a yearly tournament occurrence. The distinction between 'tournament class' and 'tournament occurrence/event/edition' is a vital one. For that reason, I am still in favour of renaming tennis tournament (Q13219666) to 'tennis tournament class'. The recently created item tennis event (Q46190676) stands for the other mode: event/occurrence/edition/instance. Vinkje83 (talk) 22:25, 21 December 2017 (UTC)[reply]
Yes, Wikipedia articles are written in natural language, of course, and thus they are often surprisingly imprecise. Humans understand what’s being communicated, particularly if they are familiar with the topic (here: tennis), but nevertheless it is imprecise. In this case we find formally wrong statements, and try to set up a formal structure for items of similar character; similarly this structure is found for items of many other types of sport. We thus use somewhat more generic terms than usual in natural language, but we can of course add aliases. The actual definition of the items however is not in the labels anyway, it comes from the statements.
That said, I have to express that “class” or “edition” postfixes on labels of the discussed items would be absolutely at odds with practically all other similar items. I am sure that this should not be done, in order to avoid confusion for other editors. —MisterSynergy (talk) 13:35, 22 December 2017 (UTC)[reply]
  •  Options reading the proposals (and the current practice), I'm not entirely sure how to visualize the optional intermediary steps. (Level 1) seems fine, (Level 3) works out.
    Ideally the links between (Level 1) and (Level 3) would be there whatever number of (Level 2) are present or not. Maybe part of the series (P179) could be used for that.
    In fields with a simpler structure (limited to Level 1), the first part is considered the instance of a reoccurring event and the second part an instance of a reoccurring event edition. This worked out fairly well to replace 3 or 4 different structures.
    Just a minor thing, not that it matters necessarily to understand the scheme, but the enwiki article linked to < Japan Women's Open (Q1032130)> is about the time from 2009.
    --- Jura 16:59, 21 December 2017 (UTC)[reply]
    • I share your worries about the second level, thus I originally proposed to link level 3 directly to level 1 and left it pretty much open what to do with level 2. However, these items are there and they deserve to be modeled as well, of course. The problem with an optional second layer is that data lookup can be quite complicated due to many if-then-else decisions to be made. One thing I did not understand in your comment is the proposal to have a look at part of the series (P179). How’d you use that, can you please give an example? The other things are correct, yet not so important for the model itself. —MisterSynergy (talk) 13:39, 22 December 2017 (UTC)[reply]

@MisterSynergy: maybe something like the above. Given that we can't know beforehand if level 2 is present (and/or if/when it will be created), I think it would be good to have a structure that works independently of its presence. This makes it compatible with the incremental way of Wikidata's growth.
--- Jura 14:32, 3 January 2018 (UTC)[reply]

Thanks, @Jura1. Two days ago I looked a little deeper into the existing items, particularly tournament class items and found two things:
  • There has to be a “level 0” of tennis tours (aka tennis circuit); see tennis tour (Q7700500). Tours are seasoned sets of tournaments; I have only seen “tour class items” such as ATP World Tour 250 series (Q300017) until now (i.e. no instance items), and tournament class items are related to tour class items with another P361 layer (although part of the series (P179) would fit much better in this case); start/end date qualifiers are typically missing, in spite of seemingly significant changes over the time. For whatever reason, tours were classified as tournaments like level 1/2 items until very recently, but this should already be fixed now.
  • On level 1/2: on tournament class level, there are only ~30 to 35 level 1-items, and twice as many level-2 items. All other tournament classes are much more like combined level 1 + level 2 items. No idea about tournament instances, there will probably be more items.
I will continue to look into this, and I would like to hear more input from other editors. —MisterSynergy (talk) 15:04, 3 January 2018 (UTC)[reply]
I tried to do a quantitative summary at Wikidata:WikiProject Tennis/numbers/pyramid. Level 2 is still missing. Queries might need adjusting later on. --- Jura 12:20, 14 January 2018 (UTC)[reply]
Maybe we don't need level 2 anyways. The sample above could easily be handled as level 1
--- Jura 18:28, 14 January 2018 (UTC)[reply]
How to model these cases without level 2? Btw. Wikidata:WikiProject Tennis/numbers/pyramid does appear pretty broken for me, it is not really useful in this shape. —MisterSynergy (talk) 20:39, 14 January 2018 (UTC)[reply]
The current format isn't ideal for general consumption. One needs to scroll past the blue lines.
Level 2 A would be included in Level 1 A.
BTW, where would items about qualifiers and finals go? (Level 4?) --- Jura 21:15, 14 January 2018 (UTC)[reply]
Given the relative number of level 2 A items, we can re-visit the question later.
Among the items with only 1 statement, there were some 4440 items about events. Wikidata:WikiProject_Tennis/reports/P641_only is now much shorter. Among them, there are many Fed Cup/Davis Cup items and a few items with broken labels ([3]).
--- Jura 07:50, 15 January 2018 (UTC)[reply]
After doing some more edits, I don't think part of the series (P179) would work out as the parts aren't uniform. The current part of (P361)-approach works better, at least for parts of a specific tournament edition.
--- Jura 16:51, 8 February 2018 (UTC)[reply]
  • In the meantime, there are some 6440 additional ones. I identified most of them among items that didn't have any statements but P641. I also converted a few items that used "sport event" or "event". There are also a few "tennis event" in the items that use P31="tennis tournament". These would need to be included as well.
    I updated the pyramid with layers by gender, but didn't identify many that might fit there.
    There were also many tennis tournament editions among the items without any statements and I added p31=Q47345468 to these. It should be fairly easy to add more statements to these and integrate them into whatever model is chosen.
    --- Jura 18:39, 26 January 2018 (UTC)[reply]
    • I can't find the 6440 additional ones, they apparently do not violate constraints, right? Can you point to an example item so that I can see how they currently look like? I would also like to hear whether there is still objection to the proposed repair, particularly by @Vinkje83 … —MisterSynergy (talk) 20:10, 26 January 2018 (UTC)[reply]
I stopped participating in this discussion, because it became too technical for me to comprehend. This was even worsened since many contributions are now being formulated in English only, instead of using the multi-language constructs {{P|...}} and {{Q|...}} – even plain P-numbers are being used, which I don't store in my brain.
Please summarize the proposal (as it stands right now) for people who are not extremely technical. After all, when the change is implemented, I will have to do my future edits in conformity with it, so I need to understand. Please clarify how you see the levels now; it seems to me that your use of level 1 and level 2 is not the same as I originally introduced them in the drawing. Vinkje83 (talk) 20:42, 26 January 2018 (UTC)[reply]
Okay thanks, I’ll write a (hopefully comprehensive) update this weekend. —MisterSynergy (talk) 21:27, 26 January 2018 (UTC)[reply]
@MisterSynergy: Here: [4].
--- Jura 21:16, 26 January 2018 (UTC)[reply]
Thanks, good work! —MisterSynergy (talk) 21:27, 26 January 2018 (UTC)[reply]

Summary: we have been discussing at least two more or less separate issues in this thread:

  1. I originally opened it due to the many constraints which are violated by tennis items (on “level 3” of the scheme above only). The reason is that in those items about occurrences the values of the instance of (P31) claims are subclass of (P279): sports discipline (Q2312410). I proposed a different model, which (including Wolbo's suggestion) would remove the violations and follow more closely the approach from other types of sports: “level 3” items should have instance of (P31): tennis event (Q46190676), and a competition class (P2094) claim with the value that was formerly the instance of (P31) value (example: competition class (P2094): women's doubles (Q17299348)). The connection to tournaments via part of (P361) would not be affected. Most of these changes could be applied automatically.
  2. While discussing the issue, we broadened our view on all tennis items in general, and we found that the situation is quite complex and not yet well-organized (as in many other types of sport in Wikidata). For occurrence-type items, a hierarchy of four or five layers seems applicable (0—tennis tour (Q7700500), 1/2—tennis tournament (Q13219666), 3—tennis event (Q46190676), 4—tennis match (Q47459169)). We do have proper class items for all meanwhile, and additionally tennis tournament edition (Q47345468), tennis tour edition (Q47358534), recurring tennis tournament (Q47443726), tennis tournament edition by gender (Q47403752), and maybe even more (Jura1 may help out here). Most of the layers are pretty straightforward to set up, only 1/2 is more difficult due to the sitelink situation.

Since both issues are independent, I recommend and offer to fix #1 already as outlined above, and continue with #2 then. —MisterSynergy (talk) 09:14, 30 January 2018 (UTC)[reply]
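
The repair proposed under #1 can be sketched on a simplified in-memory representation. This is a minimal Python sketch, not the batch that was actually run: the dict-of-lists item layout, the function name `repair_event_item` and the placeholder ID `Q_TOURNAMENT_EDITION` are assumptions made for illustration; only the property IDs and the two real item IDs are taken from the summary above.

```python
# Hypothetical, simplified item layout: {property ID: [value item IDs]}.
TENNIS_EVENT = "Q46190676"    # tennis event
INSTANCE_OF = "P31"           # instance of
COMPETITION_CLASS = "P2094"   # competition class


def repair_event_item(claims):
    """Return a repaired copy of a "level 3" item's claims.

    Former P31 values (e.g. women's doubles) become P2094 values and
    P31 is set to tennis event. Other claims, such as P361, are kept.
    """
    repaired = {prop: list(values) for prop, values in claims.items()}
    old_classes = [v for v in repaired.get(INSTANCE_OF, []) if v != TENNIS_EVENT]
    if old_classes:
        classes = repaired.setdefault(COMPETITION_CLASS, [])
        for value in old_classes:
            if value not in classes:
                classes.append(value)
    repaired[INSTANCE_OF] = [TENNIS_EVENT]
    return repaired


# Example: an event item that used women's doubles (Q17299348) as its P31 value.
# "Q_TOURNAMENT_EDITION" is a placeholder, not a real item ID.
before = {"P31": ["Q17299348"], "P361": ["Q_TOURNAMENT_EDITION"]}
after = repair_event_item(before)
```

Applying the function a second time leaves the item unchanged, which is the property one would want from a batch that may be re-run.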

Update: I spent some effort into this task this morning. Not yet the actual repair, but this was about adding competition class (P2094) to tennis event items which did not have this property yet – at least when I was able to identify the competition class. That was in fact the case for ~1500 items. Summary:

  • The tennis competition class items have been improved. They now contain information about who’s eligible to participate.
  • There are six new competition class items for wheelchair tennis. For an overview, see this query.
  • There are 2943 doubles tennis events where I can’t add competition class (P2094), because there is no information about the participants' gender (men's, women's, or mixed doubles, query). Any idea how to identify that?
  • There are 749 singles tennis events where I can’t add competition class (P2094), because there is no information about the participants' gender (men's or women's singles, query). Any idea how to identify that?
  • There are fewer than 100 tennis events with rather untypical competition classes like "legends" or so (query). We’d need competition class items for them, but I don’t know enough about that right now to do it by myself.

I am now waiting some days to have a look at the constraint reports. Then I will continue with the actual repair as proposed before. —MisterSynergy (talk) 09:23, 27 February 2018 (UTC)[reply]

✓ Done; Finally the batch has finished now, all items have been repaired:

Any questions, or comments? —MisterSynergy (talk) 11:05, 28 February 2018 (UTC)[reply]

  • Good news, thanks.
    Maybe once "part of" is added, the P2094 can be determined for remaining ones. For an overview, I added Wikidata:WikiProject Tennis/reports/P2094 missing.
    I'm a bit hesitant about the qualifier approach. P2094 doesn't seem to be laid out for that and qualifiers are much more complicated if not impossible to handle.
    We could obviously move the display of the P2094-statements further up on pages (by changing the default sort order of properties).
    Also, I don't think P641 should be removed when one of these is present. It makes it much easier to identify all items related to tennis and thanks to its presence the above additions were made possible. Now that the above have been implemented, I can try to add tennis tournament edition (Q47345468) where needed.
    --- Jura 11:30, 28 February 2018 (UTC)[reply]
    • I don’t have a strong opinion on the qualifier question. However, right now I cannot see that the information held in competition class (P2094) is any closer related to the instance of (P31): tennis event (Q46190676) claim than any (or at least most of the) other information in the tennis event items. I mean, we could also add point in time (P585) or winner (P1346) as a qualifier, but we don’t do this. Since qualifiers are (unfortunately) very much misunderstood here, I have a weak preference for the independent statement approach; in my opinion it is at least equally correct as a qualifier is, and it pretty much follows typical Wikidata style. Yet, I am open for arguments that I have missed until now. —MisterSynergy (talk) 17:18, 28 February 2018 (UTC)[reply]

tennis tournament --> tennis event[edit]

tennis singles/doubles --> tennis event (done)[edit]

tennis qualification event[edit]

Talk about “edition” classes again …[edit]

Models for tennis tournaments in Wikidata

I somehow don’t really get along with the (relatively) new “edition” classes, particularly the items tennis tour edition (Q47358534), and tennis tournament edition (Q47345468)/tennis tournament edition by gender (Q47403752). Until now I have a complete overview about tennis tournament edition (Q47345468) only, but the matter about the other two is basically the same. Do we really need those items? This approach is not often used in Wikidata.

For visualization purposes, I have created an image displaying the two options. The “editions” items follow Model B from the sketch, but we typically use Model A, which does not require the edition items at all. Is there anything I have missed here? I would otherwise suggest to use Model A only, and delete the edition items again after we have moved all current use. If we continue to have these items, there would be a larger amount of items to be fixed in order to be fully related according to Model B.

Another issue is about recurring tennis tournament (Q47443726), which also appears superfluous to me. We should use tennis tournament (Q13219666) directly instead. —MisterSynergy (talk) 20:42, 9 March 2018 (UTC)[reply]

The badminton model can illustrate this: any of (e.g.) US Open, US Open 2013, US Open 2013 Men's Singles, or some part of that could be seen as a subclass of "tournament" Q13219666 (or "competition"). The badminton model accepts that and adds the equivalent with P279 to any of them. I think it would have been an improvement over some (not all) of the fairly recent situation for tennis, but it wouldn't have the clarity of tennis tournament edition (Q47345468) and requires an additional step to identify the layer when querying. We could avoid that ambiguity by not using Q13219666 directly.
I think we need to find an approach that works with a continuous inflow of items, allows to gradually expand these items and doesn't need redoing statements.
It's fairly common that similar items have the same P31-value, e.g. editions in a given year of different Opens. It also means that these items are likely to get statements with the same properties or (when used at Wikipedia) the same infobox. Using tennis tournament edition (Q47345468) can work for that, just as tennis event (Q46190676) works for specific competition classes, or Q5 for tennis players and other sportspeople. How we link them together is just a detail. P364 and P31 or P3450 can do.
BTW, it seems we reached 22000+ tennis event (Q46190676) most with competition class (P2094). Congrats! Obviously, one could argue that we don't need tennis event (Q46190676) once competition-class and part-of are there, but let's not get into that.
--- Jura 14:00, 10 March 2018 (UTC)[reply]

Fed Cup / Davis Cup[edit]

On Wikidata:WikiProject_Tennis/reports/P641_only, there were many items about some aspects of these. I used facet of (P1269) to link these to the item for the cup's year. They can now be found on 2 maintenance reports:

There are a few items that use instance of (P31), subclass of (P279) or part of (P361) to do the same. If you feel like improving them further, don't hesitate.
--- Jura 07:46, 16 January 2018 (UTC)[reply]

Missing P31/P279/part of/etc.[edit]

Looks like there are also plenty of items that have P641, some other statements, but not these properties:

They may have "has part".
--- Jura 16:29, 18 January 2018 (UTC)[reply]


New tab: Numbers[edit]

Wikidata:WikiProject Tennis/numbers provides a partial overview of available items.
--- Jura 07:31, 22 January 2018 (UTC)[reply]

@Jura1 Interesting. FYI the Items section is blank (only shows a small broken image icon) when I click on the tab. Also the Events part of the When? section does show a table but it has a lot of double blue lines above it and a "Wikidata:WikiProject Tennis/numbers/pyramid/level 0.1" redlink.--Wolbo (talk) 19:19, 21 July 2019 (UTC)[reply]
Problem is that charts & graphs do not work any longer for some time now. Watch phab:T226250. —MisterSynergy (talk) 20:06, 21 July 2019 (UTC)[reply]
It works in preview. There is also some talk on Wikidata:Contact_the_development_team#Graphes_no_longer_working. --- Jura 23:18, 21 July 2019 (UTC)[reply]

Name of sport[edit]

At Wikidata:WikiProject Tennis/Lists/name of sport, I added a short list. I tried to fix some of the caps ("Tennis" > "tennis" for some languages that don't use caps). More terms could be added.
--- Jura 12:37, 28 January 2018 (UTC)[reply]

How?[edit]

Kompakt
Stryn
AmaryllisGardener
Edo de Roo
Wolbo
Matlab1985
Soundwaweserb
Pommée
Mad melone
Kacir
A.Gust14
wallerstein-WD
Sakhalinio
Somnifuguist
See the bright light (talk)
Juanman
DonPedro71

Notified participants of WikiProject Tennis

Wikipedias have information about how the game is played included in the main article and a few related ones.

From the discussion at WikiProject_Sports#A_"How?"-gap_?, it appears that no systematic way to include this has been attempted. In contrast, we have lots of items about who? where? when?.

At Wikidata:WikiProject Tennis/Lists/tennis, I started a list. The idea is to display the label and the Wikidata description in English and the user's interface language. It should enable people to complete them fairly easily. Descriptions here are generally shorter than in the Wikipedia page mentioned. Let's try to avoid importing that.

The best way to link these together is still to be determined, but I will update the query if needed. You can also hard-code items directly in query. Eventually I will try to include them in the query otherwise. I tried to cross-reference entries in the Wikipedia list at Q47514942#P361 with Wikimedia import URL (P4656).

Eventually, I might learn more about tennis by doing this ..

What are your thoughts on this?
--- Jura 12:13, 2 February 2018 (UTC)[reply]

  • As a first step, I added most concepts from the glossary related to equipment, roles and playing technique. Still to do is much about scoring and match/tournament organization.
    --- Jura 11:08, 10 February 2018 (UTC)[reply]


Wikipedia categories, tennis clubs/tennis venues and players[edit]

The above list items for categories used at Wikipedia.

From a Wikidata perspective, these items aren't that important, but they can help find more items, e.g. they helped identify some 500 additional tennis players. (Oddly, all Wikipedias together have some 650 categories for tennis players, but Wikidata has just 8200 players.) I tried to add "sports=tennis" to items in these categories that still lacked it.

The categories also show some differences in approaches, e.g.:

  • tennis clubs in enwiki are mostly in tennis venue categories while some other wikis have them in club specific categories (e.g. svwiki, dewiki). It might be worth sorting them out in a more detailed way. Currently, most should have ended up on Wikidata:WikiProject Tennis/reports/organizations. BTW, there are a few clubs called "tennis clubs" that don't offer any tennis. I added "sports=tennis" with deprecated rank to one or the other.
  • some languages use "tennis in <place>" while others use "tennis tournament in <place>". However, I don't think that checking these helped find tournaments otherwise not categorized (by year/circuit/etc).

If you need help to make use of some of these categories, don't hesitate to ask me.
--- Jura 08:44, 15 February 2018 (UTC)[reply]


Wikipedia badges[edit]

Two new lists:

Spanish labels[edit]

When creating pages, some tools strip content in parentheses ("()"). This leads to many tennis items with incomplete Spanish labels. I left a note about it on Spanish project chat and I'm now completing these. For items that aren't considered lists, I also remove "Anexo:" from the labels.

Sample change: [8].
--- Jura 07:18, 5 March 2018 (UTC)[reply]

  • @MisterSynergy: looks like you found another 1000 tennis items in blank items with eswiki sitelinks. I fixed labels for these too.
    --- Jura 11:45, 9 March 2018 (UTC)[reply]
    • Yes, yesterday I crawled all items that have a sitelink categorized in any of (Wikipedia) category trees in Category:Tennis (Q5325416). ~1700 items were without any claims (thus I added sport (P641):tennis (Q847)), and a reasonable number of them was about tennis events which I immediately processed further. Right now I try to connect the event and the tournament edition layers with part of (P361) relations, in order to get more structure into the vast amount of items. The situation is pretty fluid right now, but I am going to report here as soon as this is possible again. —MisterSynergy (talk) 11:52, 9 March 2018 (UTC)[reply]
      • Sounds good. I tried to add P361 based on itwiki categories, at least for event-items that were in categories linked to tournament editions. It did generate a few links. Probably the "facet of" statements for Davis/Fed Cup mentioned further up should be converted as well.
        --- Jura 11:59, 9 March 2018 (UTC)[reply]


Date property on tennis tournament edition[edit]

To follow up on a discussion with Vinkje83 on a talk page of an item that seems to have been deleted since and their subsequent request here, there is the question of whether items for tournament editions should include point in time (P585) or not. For editions that have their tour specified, the year can be determined from the year of the tour as these don't systematically overlap two calendar years (as seasons in some other sports do).

List of items: https://query.wikidata.org/#SELECT%3Fitem%7B%3Fitem%20wdt%3AP31%20wd%3AQ47345468%7D

Personally, I think the advantage of being able to query the year directly outweighs other downsides (e.g. readability of the item on the current GUI). This also allows to query the items independently of their degree of completeness.
--- Jura 10:10, 27 March 2018 (UTC)[reply]
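
For readers who want to see what the list link above actually asks for: the query service link carries its SPARQL text percent-encoded in the URL fragment. A minimal sketch using only the Python standard library follows; the variable names are illustrative, and the URL is copied verbatim from the comment above.

```python
from urllib.parse import unquote, urlsplit

LIST_URL = (
    "https://query.wikidata.org/"
    "#SELECT%3Fitem%7B%3Fitem%20wdt%3AP31%20wd%3AQ47345468%7D"
)

# The query text travels in the fragment, i.e. the part after "#".
sparql = unquote(urlsplit(LIST_URL).fragment)
print(sparql)  # SELECT?item{?item wdt:P31 wd:Q47345468}
```

In other words, the list is simply every item with instance of (P31) set to tennis tournament edition (Q47345468), regardless of which other statements the item already has.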

I have mixed feelings and a very incoherent “opinion” in this matter.
  • “Preliminary claims” such as sport (P641): tennis (Q847) or “redundant/transitive” point in time (P585) data are extremely useful when a set of similar items is being organized, which is still the case for tennis events. I recently used this information to merge more than one thousand tennis event item pairs. However, once the items are organized, those claims, particularly sport (P641), seem indeed redundant and data users are often tempted to access data in poor ways.
  • It is worth to mention that we don’t really work with “preliminary claims” at Wikidata. What’s in an item is typically there forever, at least as long as it is somehow correct.
  • It is also worth to mention that most properties do not imply transitivity. Naturally it appears legit to assume that the value of point in time (P585) in an item about a tennis tournament edition also applies to all events which are connected via has part(s) (P527)/part of (P361) relations, but technically our data model does not include such a deduction.
So, no recommendation by me.
Maybe it is also worth mentioning that I still have plans to improve tennis competition items. Next thing to do is to “clear” Wikidata:WikiProject Tennis/reports/P641 only by adding instance of (P31) values (and whatever else is necessary; you can help of course). Subsequently, the above mentioned distinction between tournaments and tournaments by gender needs attention (still very incoherently solved by now), and after that is done the tournaments get has part(s) (P527) claims inverse to already existing part of (P361) claims in tennis event items. Finally, I also plan to perform a large-scale import of winner (P1346) data from the sitelinks, and to correct data related to tennis tours. —MisterSynergy (talk) 10:49, 27 March 2018 (UTC)[reply]
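
The transitivity point can be illustrated with a small sketch. Since the data model does not deduce a date for an event from its tournament edition, a data user who wants that behaviour has to walk the part of (P361) chain explicitly. The flat dict layout (one value per property), the placeholder item IDs and the function name `effective_point_in_time` below are assumptions for illustration only.

```python
def effective_point_in_time(item_id, items, max_depth=5):
    """Return the item's own P585 value, or the nearest one found via P361.

    `items` maps an item ID to a dict of single-valued claims. The walk is
    bounded by `max_depth` so that a P361 cycle cannot loop forever.
    """
    current = item_id
    for _ in range(max_depth):
        claims = items.get(current, {})
        if "P585" in claims:
            return claims["P585"]
        parent = claims.get("P361")
        if parent is None:
            return None
        current = parent
    return None


# Placeholder IDs: an event without its own date, inside a dated edition.
items = {
    "Q_EVENT": {"P361": "Q_EDITION"},
    "Q_EDITION": {"P585": "2017"},
}
print(effective_point_in_time("Q_EVENT", items))  # 2017
```

Keeping P585 directly on each item makes this lookup unnecessary, which is the trade-off discussed above.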
Quite impressive all the imports and additions you did. Thanks a lot. Compared to earlier, I think "P641 only" is greatly reduced (some numbers: https://www.wikidata.org/w/index.php?&diff=654136502&oldid=620197716), even if many more tennis items have been identified since.
Eventually I was hoping to complete #How? above, but it appears that the new features that would simplify this exclude imports.
--- Jura 07:16, 29 March 2018 (UTC)[reply]


Pie chart[edit]

At Wikidata:WikiProject Tennis/numbers, there is now a summary chart of some of the main types.
--- Jura 12:40, 16 May 2018 (UTC)[reply]

Tennis tournament edition and tennis events[edit]

Some questions on tennis tournament editions and tennis events:

Tennis tournament edition.

Example: 2017 ABN AMRO World Tennis Tournament (Q28606333)

1. Should 'point in time' not be replaced by 'start date' and 'end date'?
2. Is there a field where the total prize money (with currency indicator) of the tournament edition can be listed?
3. The 2017 edition lists the city where the tournament edition was held in the field 'located in the administrative territorial entity (P131)' whereas the 2018 edition (Q47227238) lists it in the field 'location (P276)'. Which is correct?
4. Is there a field to indicate the surface (e.g. Hard court, Clay court, Grass court) on which the tournament edition was played?
5. Is there a field to indicate if the tournament edition was held indoors or outdoors?

Tennis event.

Example: 2017 ABN AMRO World Tennis Tournament – singles (Q28743399)

1. Should tennis events also list 'follows (P155)' and 'followed by (P156)'?
2. Is there a field to indicate the runner-up / finalist?
3. Is there a field to indicate the draw size (number of competitors)?
4. The 'winner (P1346)' field has an inverse constraint. Is it indeed necessary that the winner should have the name of the tennis event listed in the 'victory (P2522)' field? If so, can this be automated?

@MisterSynergy @Jura1

--Wolbo (talk) 23:47, 11 March 2019 (UTC)[reply]

On tournament editions:
  1. start time (P580) and end time (P582) can (and should) be additionally added to the items. point in time (P585) was added last year relatively systematically, but it was inferred from the labels and added with year precision. Quite valuable for maintenance purposes, but also having exact start and end dates would be much better of course.
  2. prize money (P2121); can also be added to tennis event items; use currency as unit
  3. Regarding the location, the optimal solution would be to use location (P276) with a tennis venue item as value (i.e. not just a city or so). Thus, 2018 ABN AMRO World Tennis Tournament (Q47227238) --> location (P276) --> Rotterdam Ahoy (Q179426). Therein, Rotterdam Ahoy (Q179426) does know by itself where it is located (in Rotterdam (Q2680952)). located in the administrative territorial entity (P131) should not be used in tennis tournament editions at all.
    @MisterSynergy Agree on the use of location (P276) to indicate the venue where a tournament edition is held. Frequently see items where the city is entered as location so those need to be updated. However, the venue of a particular edition of a tournament may not always be known. The further back in time the more difficult it becomes to establish a venue for certain tournaments. In contrast, the city where a tournament is held is always known so would that not be an argument for the use of located in the administrative territorial entity (P131) on the tournament edition level? --Wolbo (talk) 10:34, 22 July 2019 (UTC)[reply]
    I would always and only use location (P276), regardless of whether the value is a tennis venue or a city (or something else if the venue is not known more precisely). Many data users verify the value item type (e.g. "tennis venue" or "city") anyways before they display it anywhere. With all values in P276 they don't have to look for different properties. --MisterSynergy (talk) 10:51, 22 July 2019 (UTC)[reply]
  4. surface played on (P765)
  5. I do not think that anyone defined a way to indicate whether a tournament was played "indoors" and "outdoors". As far as I know, there are meanwhile some modern venues with a roof that can be closed in bad weather conditions, thus it might be a somewhat hybrid situation anyways.
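
The single-property approach to locations from point 3 can be sketched from the data user's side: read location (P276) only, then branch on the type of the value item. The lookup table below stands in for a real instance of (P31) lookup and is a hypothetical simplification, as is the function name `describe_location`; the two item IDs are the ones mentioned in point 3.

```python
# Hypothetical lookup table: item ID -> type label (stands in for a P31 lookup).
ITEM_TYPES = {
    "Q179426": "tennis venue",  # Rotterdam Ahoy
    "Q2680952": "city",         # Rotterdam
}


def describe_location(claims, item_types=ITEM_TYPES):
    """Pair each P276 value with its type label; unknown items become 'other'."""
    return [
        (value, item_types.get(value, "other"))
        for value in claims.get("P276", [])
    ]


# An edition with a known venue, and one where only the city is known.
print(describe_location({"P276": ["Q179426"]}))   # [('Q179426', 'tennis venue')]
print(describe_location({"P276": ["Q2680952"]}))  # [('Q2680952', 'city')]
```

Both cases go through the same property, so the consumer never has to check P131 as a fallback.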
On tennis events:
  1. One can use follows (P155) and followed by (P156) on tennis events as well, linking to the previous/next edition of the same event. Could be useful to automatically retrieve defending champions and so on.
  2. You can add all participants of the event with participant (P710); this statement should have qualifiers stage reached (P2443) for all players, and for finalists additionally ranking (P1352) with values "1" (winner) and "2" (runner-up).
  3. number of participants (P1132)
  4. Yes, that could be automated, but I would not worry about that now. winner (P1346) and victory (P2522) are basically short versions of participant (P710) and participant in (P1344), so per #2 of this section one would probably make sure that the latter pair is symmetrically used, not just winner/victory. Still, such automation is not something to worry about at this point.
MisterSynergy (talk) 07:03, 12 March 2019 (UTC)[reply]
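
To make point 2 of the tennis event answers concrete, here is a minimal sketch of how winner and runner-up could be read back from participant (P710) statements that carry ranking (P1352) qualifiers. The statement layout, the placeholder player IDs and the plain-text stage values are assumptions for illustration (on Wikidata, stage reached (P2443) takes items as values), and the sketch covers singles events only.

```python
# Hypothetical simplified P710 statements, one dict per participant, with
# qualifier values keyed by property ID. All player IDs are placeholders.
participants = [
    {"P710": "Q_PLAYER_A", "qualifiers": {"P2443": "final", "P1352": 1}},
    {"P710": "Q_PLAYER_B", "qualifiers": {"P2443": "final", "P1352": 2}},
    {"P710": "Q_PLAYER_C", "qualifiers": {"P2443": "semifinal"}},
]


def finalists(statements):
    """Return (winner, runner_up) based on the ranking (P1352) qualifier."""
    by_rank = {}
    for statement in statements:
        rank = statement.get("qualifiers", {}).get("P1352")
        if rank in (1, 2):
            by_rank[rank] = statement["P710"]
    return by_rank.get(1), by_rank.get(2)


winner, runner_up = finalists(participants)
print(winner, runner_up)  # Q_PLAYER_A Q_PLAYER_B
```

For doubles events two participants would share each ranking, so the function would have to collect lists per rank instead of a single value.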
  • Agree with most of it. Please add start time (P580) and end time (P582) when known, but avoid removing P585. I didn't recall victory (P2522). In any case, I wouldn't worry about it.
    Obviously, if you work on a series of items, not all properties mentioned have equal importance. Thanks to MisterSynergy last year, we could get some basic statements on a lot of items (some previously without statements), but earlier other participants developed some items in a thorough way. --- Jura 09:54, 16 March 2019 (UTC)[reply]
If we keep point in time (P585) in addition to start time (P580) and end time (P582) how should these be added to tournament edition items? Should all three be separate statements (see tennis at the 1988 Summer Olympics (Q562043)) or should start time (P580) and end time (P582) be added as qualifiers to the point in time (P585) statement (see 2017 ABN AMRO World Tennis Tournament (Q28606333))? What are the pros and cons of either approach? --Wolbo (talk) 11:30, 21 July 2019 (UTC)[reply]
@MisterSynergy, Jura1, Stryn

Followed by / follows[edit]

Should follows (P155) and followed by (P156) be added as separate statements to a tennis tournament edition (see 1983 Dutch Open (Q3716626)) or as qualifiers to sports season of league or competition (P3450) (see 1983 Torneo Godó (Q2366035))? I have always used the first method but see the second popping up more frequently. Seems that MatSuBot is converting the first method into the second.--Wolbo (talk) 10:45, 10 August 2019 (UTC)[reply]

Someone has placed conflicts-with constraint (Q21502838) constraints for follows (P155) and followed by (P156) on sports season of league or competition (P3450), thus the qualifier solution fits better to the constraints situation of that property. However, I know that there are some editors unhappy about these constraints. If the bot moves main statements to qualifiers, it does not really matter where to put them; in case we switch the situation again in the future, it would be rather simple to move the qualifiers back to main statements. —MisterSynergy (talk) 10:58, 10 August 2019 (UTC)[reply]
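As long as both modelling styles coexist, a consumer of the data could simply read follows (P155) and followed by (P156) from either place. A minimal sketch, assuming the item is available as a dict in the standard Wikidata entity JSON layout; the function names and the QIDs Q100 and Q200 in the usage lines are made up for illustration:

```python
def snak_item_id(snak):
    """Return the QID held by an item-valued snak, or None."""
    if snak.get("snaktype") != "value":
        return None
    value = snak.get("datavalue", {}).get("value")
    return value.get("id") if isinstance(value, dict) else None


def linked_editions(entity, prop):
    """Collect QIDs for prop (P155 or P156), whether stored as main
    statements or as qualifiers on a P3450 statement."""
    claims = entity.get("claims", {})
    found = []
    # Style 1: P155/P156 as main statements on the edition item.
    for claim in claims.get(prop, []):
        qid = snak_item_id(claim.get("mainsnak", {}))
        if qid and qid not in found:
            found.append(qid)
    # Style 2: P155/P156 as qualifiers on the P3450 statement.
    for claim in claims.get("P3450", []):
        for snak in claim.get("qualifiers", {}).get(prop, []):
            qid = snak_item_id(snak)
            if qid and qid not in found:
                found.append(qid)
    return found


def item_snak(qid):
    """Build an item-valued snak (helper for the usage lines below)."""
    return {"snaktype": "value", "datavalue": {"value": {"id": qid}}}


# Hypothetical items: the same link modelled in the two styles.
as_statement = {"claims": {"P155": [{"mainsnak": item_snak("Q100")}]}}
as_qualifier = {"claims": {"P3450": [{
    "mainsnak": item_snak("Q200"),
    "qualifiers": {"P155": [item_snak("Q100")]},
}]}}
```

With this, `linked_editions(as_statement, "P155")` and `linked_editions(as_qualifier, "P155")` give the same list, so a template would not need to care which style an item currently uses.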
Thx. My main concern was that by using the wrong method all the update efforts would be in vain so it is good to know that a bot can convert the methods in either direction. Also understand the constraint situation if sports season of league or competition (P3450) is used but I am not sure why sports season of league or competition (P3450) needs to be added to a tournament edition item in the first place. It makes perfect sense to say that the 2019 ATP Tour (Q52702738) is a season of the ATP Tour (Q300008) but in plain English it is doubtful anyone would say that the 1983 Torneo Godó (Q2366035) is a season of the Barcelona Open (Q299163). A season implies a number of events / games / tournaments, not a single event / game / tournament. Using sports season of league or competition (P3450) also seems superfluous as we already use instance of (P31) to store this information.--Wolbo (talk) 12:47, 10 August 2019 (UTC)[reply]
Yes, I share your concern. sports season of league or competition (P3450) is supposed to be used for league seasons and things like tennis tours, but not for individual tournaments etc. Many editors get that wrong, unfortunately. Shall we try to fix it somehow automatically? I would need to finish another task first, and query a bit to see whether this looks doable… —MisterSynergy (talk) 20:09, 10 August 2019 (UTC)[reply]
We should probably start by having a look at this query result. After investigating some cases, it seems to me that earlier this year User:MatSuBot by User:Matěj Suchánek added quite a few sports season of league or competition (P3450) statements that shouldn't be there. —MisterSynergy (talk) 21:48, 10 August 2019 (UTC)[reply]
Agree with that conclusion. Can we check if the bot has been updated to prevent the addition of sports season of league or competition (P3450) in these situations? The query looks good but I would exclude the Billie Jean King Cup (Q206984) and Davis Cup (Q132377) editions, e.g. 1966 Federation Cup (Q1400176), as these can arguably be seen as a season. If we remove sports season of league or competition (P3450) from the other items will the follows (P155) and followed by (P156) qualifiers be converted to main statements so no info is lost? --Wolbo (talk) 12:03, 11 August 2019 (UTC)[reply]
MisterSynergy, can we pick this up again? --Wolbo (talk) 13:39, 30 October 2021 (UTC)[reply]
Sure, which topic exactly?
Anyway, Wikidata:Requests for comment/P155/P156 as qualifiers only, rather than as main statements seems related, so I am dropping it here in case you are not aware of it. —MisterSynergy (talk) 21:56, 30 October 2021 (UTC)[reply]
Kompakt
Stryn
AmaryllisGardener
Edo de Roo
Wolbo
Matlab1985
Soundwaweserb
Pommée
Mad melone
Kacir
A.Gust14
wallerstein-WD
Sakhalinio
Somnifuguist
See the bright light (talk)
Juanman
DonPedro71

Notified participants of WikiProject Tennis: You cannot assign new ID numbers to the old www address used by this property. Can it be fixed somehow? 89.73.160.53 00:35, 17 April 2020 (UTC)[reply]

Property proposal ITF player ID 2020[edit]

Notified: @MisterSynergy:, @Jura1:.
Please see proposal for an identifier for a tennis player at the International Tennis Federation (ITF) website as of 2020.--Wolbo (talk) 21:13, 26 August 2020 (UTC)[reply]

List of tournament wins[edit]

Dear all,

We had a look into this before, but I wanted to address one of the most unsatisfactory tasks at local Wikipedias and one of the main quality issues at the same time: keeping up with the list of tournament wins, especially on the ITF level.

I want to propose a way to provide this information via Wikidata and subsequently include the required information by way of templates. A nice example that I am aware of is the results from cycling races, see for example de:Tour_de_France_2020#Etappenliste, en:2020_Tour_de_France#Route_and_stages, es:Tour_de_Francia_2020#Etapas and additional language versions using the same source data from Wikidata.

Having analyzed some major language versions (at least in the Western languages), I would propose the following structure:

Would you be fine with this approach? And can you help with the missing pieces, especially the result?

Thanks, --Mad melone (talk) 14:52, 26 September 2020 (UTC)[reply]

Keep it simple. Some examples, pretty much in line with your proposal above (use participant in (P1344) with qualifiers ranking (P1352) for finalists or stage reached (P2443) for non-finalists):
  1. Finalists
  2. Non-finalists (means that there is no final "ranking", they just left the tournament after a certain stage):
It is not trivial to add information about the opponent in the last stage (or final), or to add the result of the final match. However, for infoboxes, this is usually not very important.
The value item of participant in (P1344), in this example 2020 US Open – women's singles (Q66399871), should have:
The corresponding Lua template on Wikipedia would be added in tennis player articles and it would need to have a look at the Wikidata item (easy), loop over all participant in (P1344) claims (easy), and if the value item has competition class (P2094) set with a whitelisted tennis-specific value (easy), it can use the qualifier ranking (P1352) or stage reached (P2443) to indicate the result of the player in that event (also easy). It can of course also use other claims from the value item. Relatively simple logic, and it is also relatively simple to make maintenance lists to track missing claims or oversee the used values. —MisterSynergy (talk) 16:59, 26 September 2020 (UTC)[reply]
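The template logic described above could look roughly like the sketch below. It is written in Python rather than Lua purely for illustration and is not an existing module. It assumes that the player item and the event items are already available as dicts in the standard Wikidata entity JSON layout; the whitelist entries are hypothetical placeholders, not real competition class QIDs.

```python
# Hypothetical placeholders; a real whitelist would hold the QIDs of the
# tennis-specific competition class items.
TENNIS_CLASSES = {"Q_MENS_SINGLES", "Q_WOMENS_SINGLES"}


def item_id(snak):
    """Return the QID held by an item-valued snak, or None."""
    if snak.get("snaktype") != "value":
        return None
    value = snak.get("datavalue", {}).get("value")
    return value.get("id") if isinstance(value, dict) else None


def quantity(snak):
    """Return the integer amount of a quantity snak (e.g. '+1' gives 1)."""
    if snak.get("snaktype") != "value":
        return None
    return int(snak["datavalue"]["value"]["amount"])


def tennis_results(player, events, classes=TENNIS_CLASSES):
    """Loop over participant in (P1344) claims and keep tennis events.

    events maps event QIDs to their (pre-fetched) entity dicts.
    """
    results = []
    for claim in player.get("claims", {}).get("P1344", []):
        event_qid = item_id(claim.get("mainsnak", {}))
        event = events.get(event_qid)
        if event is None:
            continue
        # competition class (P2094) values of the event item
        event_classes = {
            item_id(c.get("mainsnak", {}))
            for c in event.get("claims", {}).get("P2094", [])
        }
        if not event_classes & classes:
            continue  # not a whitelisted tennis event
        quals = claim.get("qualifiers", {})
        rank = next((quantity(s) for s in quals.get("P1352", [])), None)
        stage = next((item_id(s) for s in quals.get("P2443", [])), None)
        results.append({"event": event_qid, "ranking": rank, "stage": stage})
    return results
```

A finalist then comes back with a ranking of 1 or 2 and no stage, while a non-finalist comes back with a stage item and no ranking, so the template can branch on whichever qualifier is present.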

Thanks for your feedback, but I guess that's a little bit too easy.

  • First, I think that the opponent and the result are very important. Check out, for example, de:Chanel Simmonds and the list of her tournament wins; more examples can be found via the interwikis.
  • Also, providing only the venue would not be feasible, as different Wikipedias use different information here (see list above). Even though I think that we should use some standard, at the end of the day each Wikipedia is independent.
  • I will include the information as stated above in two or three players that are not that much in the public eye and test around them, if that is ok with you. However, results are still important.

More feedback always welcome!--Mad melone (talk) 17:10, 26 September 2020 (UTC)[reply]

  • The opponent in the latest stage (or final only) of an event where the player participated would be relatively simple to add with a suitable property qualifier for the participant in (P1344) claim; either we already have something which I don't know of, or we can propose a new one which should not be too complicated to get approved as we can outline a direct need for it. Results are much more complicated as Wikidata as a graph database is not really the best technical solution for that problem; if I remember correctly, there has already been some discussion regarding tennis results in the past, but I don't know where to find it.
  • Re. venues: you cannot cover everything that any Wikipedia is currently doing. Some include the venue, others the city or country. Since usually the venue should know in which city it is located, this sort of information can theoretically be derived from the venue item. Start simple here.
  • Sure it is okay, you don't need to ask me or anyone else :-)
MisterSynergy (talk) 18:09, 26 September 2020 (UTC)[reply]

Tournament IDs[edit]

I just noticed that URLs using the old format for ITF tournament result pages, after having been broken for about a year [9], now redirect to the right pages. I recently went through and changed these on all Grand Slam draw pages on enwiki, so it's made me wonder—why don't we have properties for these as well as for ATP and WTA tournament IDs, like we have for player IDs? If we did, and the links were to change again, we'd only have to worry about fixing them here. The relevant parts of the URLs seem to be (e.g. for the 2020 Italian Open):

  • ATP: /rome/416/2020/ [10]
  • WTA: /709/rome/2020/ [11]
  • ITF (men): /rome/ita/2020/m-1000-ita-01a-2020/ [12]
  • ITF (women): /rome/ita/2020/w-p5-ita-01a-2020/ [13]

Thoughts? --Somnifuguist (talk) 22:12, 3 October 2020 (UTC)[reply]
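To make the URL parts listed above concrete, here is a minimal sketch that pulls the candidate identifier components out of such paths. The regular expressions are assumptions derived only from the four 2020 Italian Open examples and may not hold for every tournament; the function and group names are made up for illustration.

```python
import re

# Patterns inferred from the example paths above (an assumption, not a
# documented URL scheme of any of the three sites).
PATTERNS = {
    "ATP": re.compile(r"^/(?P<slug>[^/]+)/(?P<id>\d+)/(?P<year>\d{4})/$"),
    "WTA": re.compile(r"^/(?P<id>\d+)/(?P<slug>[^/]+)/(?P<year>\d{4})/$"),
    "ITF": re.compile(
        r"^/(?P<slug>[^/]+)/(?P<country>[a-z]{3})/(?P<year>\d{4})/(?P<key>[^/]+)/$"
    ),
}


def parse_tournament_path(site, path):
    """Split a tournament URL path into its named parts, or return None
    if the path does not match the assumed pattern for that site."""
    match = PATTERNS[site].match(path)
    return match.groupdict() if match else None


atp = parse_tournament_path("ATP", "/rome/416/2020/")
wta = parse_tournament_path("WTA", "/709/rome/2020/")
itf_men = parse_tournament_path("ITF", "/rome/ita/2020/m-1000-ita-01a-2020/")
itf_women = parse_tournament_path("ITF", "/rome/ita/2020/w-p5-ita-01a-2020/")
```

For ATP and WTA the numeric `id` looks like the stable part across years, whereas for the ITF the whole `key` segment appears to identify a single edition.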

  • Makes sense, just go ahead and propose the new Property. --Mad melone (talk) 06:17, 7 October 2020 (UTC)[reply]
  • On second thought, I think just having IDs for the ITF's site would suffice—they have results for tennis at all levels since 1968 (and further back as well, but those were hidden several years ago for some reason). The WTA's site is currently pretty useless for draws (they don't even have Grand Slam results from 8 years ago [14]), and the ATP's site is missing most qualifying draws. I'm new to Wikidata, so if somebody more familiar with its processes wants to propose the new Property please be my guest. One thing to keep in mind is that the ITF splits tennis into 6 tours: Men's, Women's, Juniors, Senior, Wheelchair and Beach. Where 2+ tours intersect at the same tournament, e.g. as happens at the Grand Slams, could that tournament's ID have multiple values, or would we need separate ID Properties for each tour, e.g. ITF Men's tournament ID, ITF Women's tournament ID, etc.? Somnifuguist (talk) 21:53, 2 November 2020 (UTC)[reply]

Turns out I didn't do my due diligence, and properties already exist for these IDs: WTA tennis tournament ID (P3469), ATP tennis tournament edition ID (P6880), ATP tennis tournament ID (P3456) and ITF tournament ID (P6841). There are several issues with them (outdated ITF format, confusion between the two ATP properties, etc.) that I will discuss on the relevant talk pages so that these properties can become useful on Wikipedia. --Somnifuguist (talk) 17:01, 26 April 2021 (UTC)[reply]