Topic on User talk:Edgars2007

Wesalius (talkcontribs)

Hi Edgars,

can I ask for your help?

I am going to set Elo ratings of chess players. I obtained the player data from https://ratings.fide.com/download.phtml and managed to extract FIDE_ID - RATING pairs. I am still in the process of sorting the data, but I already have the first results here.

Can you help me with getting the PLAYER_ITEM - FIDE_ID pairs? When I have FIDE_ID - PLAYER_ITEM - RATING triplets, I can pass them into QuickStatements and on to wd.

Thank you in advance.

Edgars2007 (talkcontribs)

Yeah, it's easy to get those triplets. I'll do that tomorrow, if everything goes well.

Edgars2007 (talkcontribs)

Here are the triplets.

I was using the 'sept2016' file from the 'standart' folder. If that was the wrong one, sorry - I can regenerate the list if needed.

For these, I (OK, my script, not me) didn't find a FIDE ID in that Excel file you gave me.

As I saw a Python dict in your sandbox, I assume you have some Python knowledge. So here is the script I used to generate those lists. The script is pretty dirty and not very Pythonic, but it does the job. The "chess-tsv.tsv" file is your Excel file converted into a TSV file.
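The core pairing idea can be sketched roughly like this (this is not Edgars' actual script, just a minimal illustration of joining the two ID maps; the sample IDs and item are made up for the example):

```python
# Join FIDE_ID -> RATING pairs with FIDE_ID -> PLAYER_ITEM pairs
# into (PLAYER_ITEM, FIDE_ID, RATING) triplets. In the real run both
# maps would be read from the TSV files mentioned above.

def build_triplets(id_to_rating, id_to_item):
    """Return (item, fide_id, rating) triplets for IDs present in both maps."""
    triplets = []
    for fide_id, rating in id_to_rating.items():
        item = id_to_item.get(fide_id)
        if item is None:
            continue  # no Wikidata item known for this FIDE ID
        triplets.append((item, fide_id, rating))
    return triplets

# Hypothetical sample data: the second FIDE ID has no matching item,
# so it ends up in the "not found" leftovers rather than the triplets.
ratings = {"1503014": 2853, "1126385": 2070}
items = {"1503014": "Q106807"}

print(build_triplets(ratings, items))
```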

And as we have problems with QS (per your post on Project chat), I wanted to let you know that it should be pretty easy to create a Python script for this. Actually, I have had thoughts from time to time about updating those FIDE ratings myself, instead of waiting for that bot approval, which was several months ago :)

Wesalius (talkcontribs)

Thank you, Edgars. I will use your script and get the triplets for all of the Elo data I got from fide.com. Do you think I should import only standard Elo ratings, or rapid and blitz Elos as well? My knowledge of Python is very limited; I don't know how to write a Python script that would write the data to wd directly. That is why I have used QS so far: I just tell Python to print out the right QS output and paste it into QS.
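The print-and-paste approach could look something like this (a rough sketch, not Wesalius' actual code: QuickStatements v1 accepts tab-separated item/property/value lines, and the qualifier/reference syntax is deliberately omitted here; P1087 is the Elo rating property, the sample triplet is hypothetical):

```python
# Turn (item, fide_id, rating) triplets into QuickStatements v1 input
# lines: tab-separated item, property, value. The output is printed and
# then pasted into the QS tool by hand.

def qs_lines(triplets, prop="P1087"):
    return ["\t".join((item, prop, str(rating)))
            for item, fide_id, rating in triplets]

triplets = [("Q106807", "1503014", 2853)]  # hypothetical triplet
for line in qs_lines(triplets):
    print(line)
```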

Wesalius (talkcontribs)

Ha, I just found this tutorial page; it should be easy to write to wd using PWB. I will try to get it working and report back when I have something.

Edgars2007 (talkcontribs)

You can also have a look at Wikidata:Pywikibot - Python 3 Tutorial. And Matěj Suchánek has such a script, which may look a little too complex for a Python newbie, but...

Oh, and about standard/blitz/rapid ratings - I don't have an opinion. Although it may seem from past discussions that I understand chess, I do not :) Of course, I know some basic stuff (I can name some World Champions, for example), but that's all.

Matěj Suchánek (talkcontribs)

By the way, that script was written just for the task of importing population figures; it doesn't really serve as a Python QS rewrite... but if there were demand, I could create one.

Edgars2007 (talkcontribs)

I'm not saying that it's QS for Python, but it is a relatively similar script. I think I saw such an attempt somewhere, actually. I don't remember where, but it should be on GitHub, and maybe it's linked from our 'Requests for permissions' pages somewhere. But there is a probability that it wasn't Python.

Wesalius (talkcontribs)
Edgars2007 (talkcontribs)
Wesalius (talkcontribs)

What do you think about edits like this and this? Shall I proceed with the rest?

Wesalius (talkcontribs)

I did not put the FIDE ID in the reference - do you think it is needed even when the ID is already a standalone statement on the page?

Edgars2007 (talkcontribs)
Edgars2007 (talkcontribs)

Oh, got some.

  1. We can't assume that everybody is an expert in chess. So if some passer-by comes to the Magnus item, sees the Elo rating and wants to see the current situation, he has no easy way of doing that. How can he know that he has to look at the FIDE ID? He has to go to ratings.fide.com and then search for Magnus' profile. On some websites, searching is [censored] :)
  2. With the FIDE ID in the references, one can use this information as a reference on Wikipedia, for example when using the Elo in an infobox.
Wesalius (talkcontribs)

You made your point :-) I will include it in the sources and check back when the edits to be reviewed are made.

Wesalius (talkcontribs)

How do you like this and this? Shall I proceed with the rest?

Edgars2007 (talkcontribs)

Looks good :)

Wesalius (talkcontribs)

You can check the script I used at this GitHub repo: https://github.com/Wesalius/Elobot

I encountered an error after importing around 850 values - when setting a P1087 claim on Q11814034 with the value 2070:

pywikibot.exceptions.NoPage: Page wikidata:Q11814034 doesn't exist.

After that the script crashed.

When checking the external file with the triplets (chess-elo-item-rating-output.txt), this item is paired with FIDE ID "1126385", which is this guy: https://ratings.fide.com/card.phtml?event=1126385

I have not yet found a Wikidata item for this guy, and so far I have no idea how it is possible that the query returned this non-existent item.

Any idea?

Edgars2007 (talkcontribs)

Yeah, the item Q11814034 was deleted, but it still shows up in SPARQL results. That happens from time to time (nobody knows the exact reason why :D ).

The best way to skip such items is to have an exception handler in the code. Currently I'm out of time and don't have such exception handling in any of my scripts, so I can't help you now, but I will try to do that later today or tomorrow.
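The skip-on-exception pattern being described could look roughly like this. Note that pywikibot is not imported here: in the real script the exception would be pywikibot.exceptions.NoPage (raised when an item returned by SPARQL has since been deleted), and MissingItem plus the toy loader stand in for it so the sketch is self-contained:

```python
# Sketch: catch the "page doesn't exist" error for one item, log it,
# and continue with the rest instead of letting the whole run crash.

class MissingItem(Exception):
    """Stand-in for pywikibot.exceptions.NoPage."""

def load_item(qid, existing=frozenset({"Q106807"})):
    """Hypothetical loader: raises MissingItem for deleted items."""
    if qid not in existing:
        raise MissingItem(qid)
    return {"id": qid}

def process(qids):
    done, skipped = [], []
    for qid in qids:
        try:
            item = load_item(qid)
        except MissingItem:
            skipped.append(qid)   # note it and move on
            continue
        done.append(item["id"])   # ...set the Elo claim here...
    return done, skipped

print(process(["Q106807", "Q11814034"]))
```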

To answer some things from your to-do-list:

a) (counter of written items): before the "for" loop add "counter = 0", then inside the loop increment it and add something like

counter += 1
if counter % 50 == 0:
    print('counter: ' + str(counter))

b) (file closing): the right way of doing this is using "with open" (google for "with open file python"). The way I was doing the file opening/writing is not the best, but I don't like the "with open" construction :D AFAIK, files should also get closed correctly my way.
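For reference, the "with open" form being recommended looks like this (the file name and content are invented for the demo; the point is that the file is closed automatically when the block exits, even if an exception occurs mid-write):

```python
# "with open" closes the file for you when the indented block ends,
# so no explicit f.close() call is needed.
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "triplets-demo.tsv")

with open(path, "w", encoding="utf-8") as f:
    f.write("Q106807\t1503014\t2853\n")
# f is already closed here

with open(path, encoding="utf-8") as f:
    print(f.read().strip())
```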

Wesalius (talkcontribs)

Ok, I didn't know that could happen :-O

The counter and opening files with "with open" are things I know how to do right away; I just have not had the time to do them yet, but thank you for the advice :-) On the contrary, the exception for skipping non-existent items and pausing/resuming the script are things I do not yet know how to do, so if you are willing to help, I will be happy to listen to your advice ;-)

Thank you so far.

Wesalius (talkcontribs)

Ok, I added a counter and an exception for pywikibot.NoPage. The current revision of the code is here. Right now, I do not know how to skip claiming when the exact claim already exists, to prevent duplicates. I have been reverting them manually so far, but that is definitely not the way to go. I was thinking of something like

if not player_item.Claims(elo_claim).Qualifier(date_property_qualifier).target_equals(date_property_qualifier):

but that does not work, since the Qualifier method does not exist - I made that snippet up, but I guess you can figure out what it is trying to achieve. Maybe there is an easier way to prevent duplicate claims using pywikibot... Can you or User:Matěj Suchánek kindly help me out?

Matěj Suchánek (talkcontribs)

Try something like this:

datum = ...  # the date value to check for
# ...
for xxx in lines:
    hasUpToDateClaim = False
    for claim in item.claims.get(prop, []):
        for qualifier in claim.qualifiers.get(date_prop, []):
            if qualifier.target_equals(datum):
                hasUpToDateClaim = True
                break
    if hasUpToDateClaim:
        continue
Wesalius (talkcontribs)
Wesalius (talkcontribs)

I am taking it back; I had made a mistake in the indentation. It is almost solved now; I am just tinkering with the output so it is clear what the script is doing at any given time.

Wesalius (talkcontribs)

It is almost done; now there is just one small detail: the script proceeds when the duplicate statement IS found and does not proceed when the duplicate IS NOT found - so, the other way around :D I am exhausted...

Matěj Suchánek (talkcontribs)

It seems that you got caught in the nested loops.

for player in item_list:
    # do some preparations
    hasUpToDateClaim = False
    player_item.get()
    for claim in player_item.claims.get(elo_property, []):
        for qualifier in claim.qualifiers.get(date_property, []):
            if qualifier.target_equals(date_value):
                hasUpToDateClaim = True
                break
    if hasUpToDateClaim:
        # now this is time to skip
        continue
    # and this is where you can start updating the item
    try:
        # ...

This is what I meant and should work.
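The skip logic above can be checked offline by standing in for the pywikibot claim objects with plain dicts (this is only a sketch of the same nested-loop idea, with invented property IDs and data; the real qualifiers are pywikibot Claim objects with a target_equals method, not strings):

```python
# Same skip-if-up-to-date logic as the sketch above, runnable without
# touching the site: dicts stand in for pywikibot claims/qualifiers.

def needs_update(claims, elo_property, date_property, date_value):
    """True unless some Elo claim already carries the target date qualifier."""
    for claim in claims.get(elo_property, []):
        for qualifier in claim["qualifiers"].get(date_property, []):
            if qualifier == date_value:
                return False  # duplicate found: skip this player
    return True

# One existing Elo claim, qualified with the September 2016 date.
claims = {
    "P1087": [
        {"value": 2853, "qualifiers": {"P585": ["2016-09-01"]}},
    ]
}

print(needs_update(claims, "P1087", "P585", "2016-09-01"))  # already imported
print(needs_update(claims, "P1087", "P585", "2016-10-01"))  # a new month
```

The key point is the orientation of the flag: the item is skipped when a matching qualifier is found, and updated otherwise, which is exactly the inversion Wesalius ran into.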

Wesalius (talkcontribs)

What do you mean by:

# now this is time to skip

Does any command replace this comment?

Matěj Suchánek (talkcontribs)

Whatever you want to do there, like printing to the command line that you are skipping this player.

Edgars2007 (talkcontribs)

I like your version better than mine; it looks cleaner. Although you're both Czech, you're still answering in English - nice :)

Matěj Suchánek (talkcontribs)

I modified it; there was an error (now it looks even simpler).

Wesalius (talkcontribs)

Thank you, I will go through it soon. This might be the final step towards a functional script.

Wesalius (talkcontribs)

And the last time I ran the script, it crashed with pywikibot.data.api.APIError: editconflict: Edit conflict.

But that is a lower priority than the previous task, since if the duplicate prevention works, I can just rerun the script: it will skip all of the previous claims and won't hit the edit conflict when it reaches the item...

Edgars2007 (talkcontribs)

That's why, when I know I may need to do reruns, I usually save the list of items to be edited in a file and then simply remove the already edited items from that list after an error. Of course, not perfect, but...
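That rerun bookkeeping can be sketched like this (the file name and item IDs are made up; the idea is just to rewrite the work list minus whatever was edited before the crash, so the next run starts where the last one stopped):

```python
# Keep the work list in a file; after a crash, drop the already-edited
# items and rewrite the file, then rerun the script on what remains.
import os
import tempfile

todo_path = os.path.join(tempfile.gettempdir(), "elobot-todo.txt")

# Initial work list (in the real run this is written before the import).
with open(todo_path, "w", encoding="utf-8") as f:
    f.write("Q106807\nQ131626\nQ183347\n")

done = {"Q106807"}  # items that were edited before the crash

with open(todo_path, encoding="utf-8") as f:
    remaining = [line.strip() for line in f if line.strip() and line.strip() not in done]

with open(todo_path, "w", encoding="utf-8") as f:
    f.write("\n".join(remaining) + "\n")

print(remaining)  # the rerun only processes these
```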

About qualifiers: I wrote a little dirty script. It checks whether an item already has the latest FIFA World Ranking (P2656), but after a little adjustment it should also work for you.

Wesalius (talkcontribs)

That's what I actually tried to do: I took all the Q\d+ from the Elobot contributions page, found FIDE IDs for them, and then deleted those lines from the input CSV. But somehow 200+ lines did not get removed this way. I guess I did something wrong along the way and don't know what, so that's the reason for the question... Thank you for the script; I will try to adjust it.
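The extraction step being described might look like this (a sketch only: the snippet of contributions-page text is invented, and the real page would need to be fetched and may contain Q-IDs that are not edit targets, which could be one source of the leftover lines):

```python
# Pull all item IDs (Q followed by digits) out of contributions-page
# text with a regex, for use as the "already edited" set.
import re

contribs_text = """
(diff | hist) . . Q106807; 10:25 . . (+312)
(diff | hist) . . Q131626; 10:24 . . (+298)
"""

edited_items = set(re.findall(r"Q\d+", contribs_text))
print(sorted(edited_items))
```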

129.13.72.198 (talkcontribs)

Hi guys. I have been waiting a long time for the Elo ratings to be added to Wikidata (see [[:de:Wikipedia:Bots/Anfragen#Elo-Zahlen_in_Wikidata_einfügen|here]]). Just a few things that you may not know: the Elo ratings are currently published monthly, but some years ago it was only every two months, before that every four months, and maybe 20 years ago only once a year. Secondly, FIDE has the ratings only back to 2000. For older ratings back to 1990, you should refer to benoni.de ([http://www.benoni.de/schach/elo/his.html?id=4100026 example]), and for the years 1971 to 1990, olimpbase.org covers the ratings ([https://www.olimpbase.org/Elo/Elo198101e.html example]).

Wesalius (talkcontribs)

Thank you for the information.

I harvested only the data that was available on the FIDE website in XML. That format was only available back to 2012; earlier ratings were stored as plain text, and I have not processed those yet. After this phase I can move on to other sources.

Reply to "Help with ELO ratings pairing"