User:Daniel Mietchen/365 climate edits

This page assists in documenting my contributions to the 365 climate edits initiative.

Scope

  • Start time: January 1, 2023 (Q69306665)
  • End time: December 31, 2023 (Q69307031)
  • Tasks:
    • Make at least one climate-related edit per day, anywhere in the Wikimedia ecosystem
    • Document the edits on an ongoing basis
  • Rules:
    • For the purpose of this activity, I understand a "day" as the time frame from the earliest to the latest point at which the date in question is valid anywhere on Earth.
    • An "edit" is a change to a Wikimedia wiki that is visible in the public version history of the wiki page in question.
    • Edits that are part of an edit batch are eligible, but once an edit from a given batch has been selected as the contribution for a given day, further edits from the same batch are not eligible for later days.
  • Recent changes
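The "day" and batch rules above can be sketched in a few lines of Python. This is only an illustration, assuming that UTC+14 and UTC-12 are the extreme time zone offsets in use; the function names are hypothetical and not part of any Wikimedia tool.

```python
from datetime import date, datetime, time, timedelta, timezone
from typing import Optional, Set, Tuple

# Assumption: UTC+14 is the first zone to enter a date, UTC-12 the last to leave it.
FIRST_ZONE = timezone(timedelta(hours=14))
LAST_ZONE = timezone(timedelta(hours=-12))


def day_window_utc(day: date) -> Tuple[datetime, datetime]:
    """Return the UTC interval [start, end) in which `day` is the date somewhere on Earth."""
    start = datetime.combine(day, time.min, tzinfo=FIRST_ZONE)
    end = datetime.combine(day + timedelta(days=1), time.min, tzinfo=LAST_ZONE)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


def counts_for_day(edit_time: datetime, day: date) -> bool:
    """True if a timezone-aware edit timestamp falls inside the window for `day`."""
    start, end = day_window_utc(day)
    return start <= edit_time < end


def batch_still_eligible(batch_id: Optional[str], used_batches: Set[str]) -> bool:
    """Edits outside any batch always qualify; a batch may supply only one daily edit."""
    return batch_id is None or batch_id not in used_batches


if __name__ == "__main__":
    start, end = day_window_utc(date(2023, 1, 1))
    print(start.isoformat(), end.isoformat(), end - start)
```

Under these assumptions, the window for January 1, 2023 runs from 2022-12-31 10:00 UTC to 2023-01-02 12:00 UTC, so each "day" spans 50 hours and consecutive days overlap.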

Gallery of media files worked on

Daily examples of my #365climateedits contributions

Below, I link one sample edit (which may be a newly created page) for each day of the year 2023 (Q49622). The list of days is generated via this query, which will probably be modified over the course of the year. In many cases, I use the template {{Q'''}}, which provides icon links to WD:Reasonator, SQID and Scholia to facilitate further exploration, including of related content.

Most recent

Keeping this to three days for now.

Upcoming

Nothing for now.

Past

Queries

German nouns containing the string "klima" but having no sense statements

The following query uses these:

  • Properties: item for this sense (P5137)
    SELECT DISTINCT
      (CONCAT("https://ordia.toolforge.org/search?",
              "language=", ENCODE_FOR_URI(LANG(?lemma)),
              "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      # German (Q188) nouns (Q1084)
      ?lexeme dct:language wd:Q188 ;
              wikibase:lexicalCategory wd:Q1084 ;
              wikibase:lemma ?lemma .
      # Keep lemmas containing "klima", ignoring case
      FILTER REGEX(LCASE(?lemma), "klima")
      # Keep lexemes where no sense is linked to an item via P5137
      FILTER NOT EXISTS { ?lexeme ontolex:sense / wdt:P5137 ?item }
    }
    LIMIT 500
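The result column of the query above is an Ordia search link built by percent-encoding the language code and the lemma. A minimal Python sketch of the same construction follows, assuming that `urllib.parse.quote` with `safe=""` is a close enough stand-in for SPARQL's `ENCODE_FOR_URI`; the function names are illustrative only.

```python
from urllib.parse import quote

ORDIA_SEARCH = "https://ordia.toolforge.org/search?"


def ordia_search_url(lemma: str, language: str) -> str:
    """Build the Ordia search link for a lemma, mirroring the CONCAT in the query."""
    return (
        f"{ORDIA_SEARCH}language={quote(language, safe='')}"
        f"&q={quote(lemma, safe='')}"
    )


def contains_needle(lemma: str, needle: str = "klima") -> bool:
    """Case-insensitive substring test, like FILTER REGEX(LCASE(?lemma), "klima")."""
    return needle in lemma.lower()


if __name__ == "__main__":
    for lemma in ["Klimawandel", "Klimaänderung", "Wetter"]:
        if contains_needle(lemma):
            print(ordia_search_url(lemma, "de"))
```

Non-ASCII letters such as "ä" are encoded as UTF-8 bytes, which is why the same pattern also works for lemmas in other scripts.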
    

German nouns containing the string "klima" but having no pronunciation audio file

The missing audio files can be recorded via LinguaLibre's dedicated list.

The following query uses these:

  • Properties: pronunciation audio (P443)
    SELECT DISTINCT
      (CONCAT("https://ordia.toolforge.org/search?",
              "language=", ENCODE_FOR_URI(LANG(?lemma)),
              "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      # German (Q188) nouns (Q1084)
      ?lexeme dct:language wd:Q188 ;
              wikibase:lexicalCategory wd:Q1084 ;
              wikibase:lemma ?lemma .
      # Keep lemmas containing "klima", ignoring case
      FILTER REGEX(LCASE(?lemma), "klima")
      # Keep lexemes that carry no pronunciation audio (P443) statement themselves
      FILTER NOT EXISTS { ?lexeme wdt:P443 ?pronunciation_audio . }
    }
    LIMIT 500
    

Ukrainian nouns containing the string "кліма" but having no sense statements

The following query uses these:

  • Properties: item for this sense (P5137)
    SELECT DISTINCT
      (CONCAT("https://ordia.toolforge.org/search?",
              "language=", ENCODE_FOR_URI(LANG(?lemma)),
              "&q=", ENCODE_FOR_URI(STR(?lemma))) AS ?Url2)
    WHERE {
      # Ukrainian (Q8798) nouns (Q1084)
      ?lexeme dct:language wd:Q8798 ;
              wikibase:lexicalCategory wd:Q1084 ;
              wikibase:lemma ?lemma .
      # Keep lemmas containing "кліма", ignoring case
      FILTER REGEX(LCASE(?lemma), "кліма")
      # Keep lexemes where no sense is linked to an item via P5137
      FILTER NOT EXISTS { ?lexeme ontolex:sense / wdt:P5137 ?item }
    }
    LIMIT 500
    

Common n-grams in titles of works about palaeoclimate reconstructions

The following query uses these:

  • Properties: main subject (P921), title (P1476), KIT Linked Open Numbers ID (P5176), numeric value (P1181)
    # Most frequent n-grams from a random set of 1000 publications on a given topic
    SELECT DISTINCT ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub ?ExamplePubTitle
    
    WITH
    { # Generating a list of entities to be analyzed
      SELECT ?Publication
       { 
          SERVICE bd:sample { ?Publication wdt:P921 wd:Q116146313 . bd:serviceParam bd:sample.limit 1000 }   
       }
    } AS %items 
    WITH
    { # Preprocessing the titles
      SELECT ?Title ?Publication ?Seeds ?ClearTitleLength
       { 
          INCLUDE %items
          ?Publication wdt:P1476 ?Title.
          BIND (REPLACE(STR(?Title),"[\\.:,;\\[\\]\\?()$]","") AS ?ClearTitle) # remove some frequent special characters, including colons and semicolons
          BIND(STRLEN(?ClearTitle) AS ?ClearTitleLength) 
          FILTER(LANG(?Title)="en") 
          # Basic processing of the titles
          BIND ("::: ::: ::: ::: ::: ::: ::: ::: " AS ?StartCodon)
          BIND (" ;;; ;;; ;;; ;;; ;;; ;;; ;;; ;;;" AS ?StopCodon)
          BIND (LCASE(CONCAT(?StartCodon , # add start codon of colons to assist with processing of n-grams at beginning of title
                                ?ClearTitle, 
                                ?StopCodon)) # add stop codon of semicolons to assist with processing of n-grams at end of title
                         AS ?Seeds )
       }
    } AS %titles 
    WITH
    { # Generating a list of regexes to look for the NumericValue-th word in a string     
      # Based on https://w.wiki/KG$ by Jura1
      SELECT ?Regex1 ?Regex2 ?Regex3 ?Regex4 ?NumericValue 
        { 
          ?NumberItem wdt:P5176 []; wdt:P1181 ?NumericValue . 
          FILTER( ?NumericValue > 0 ) 
          FILTER( ?NumericValue < 151)
          BIND("^([^ ]+ ){" AS ?RegexStart)
          BIND("}([^ ]+) .*" AS ?RegexEnd)
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue - 1 ), ?RegexEnd ) AS ?Regex1)
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 1 ), ?RegexEnd ) AS ?Regex2) 
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 3 ), ?RegexEnd ) AS ?Regex3) 
          BIND( CONCAT( ?RegexStart , STR( ?NumericValue + 5 ), ?RegexEnd ) AS ?Regex4) 
        }
    } AS %regexes 
    WITH
    { # Applying the regexes to the titles to extract ngrams (for n <= 8), and counting occurrences of the ngrams across titles
      SELECT 
        DISTINCT ?Ngram 
        ?N
        (COUNT(DISTINCT ?Title) AS ?Count)
        ?Length
        ?Dashes
        (( ?Count * ?Length * ( (?Dashes +1) / ?N) 
         ) AS ?Score)
        (SAMPLE(DISTINCT ?Publication) AS ?ExamplePub)
          { 
            INCLUDE %regexes
            INCLUDE %titles
            BIND( 
              (CONCAT(
                REPLACE(?Seeds, ?Regex1, "$1"), " ", 
                REPLACE(?Seeds, ?Regex1, "$2"), " ", 
                REPLACE(?Seeds, ?Regex2, "$1"), " ", 
                REPLACE(?Seeds, ?Regex2, "$2"), " ", 
                REPLACE(?Seeds, ?Regex3, "$1"), " ", 
                REPLACE(?Seeds, ?Regex3, "$2"), " ", 
                REPLACE(?Seeds, ?Regex4, "$1"), " ", 
                REPLACE(?Seeds, ?Regex4, "$2")
              )
            ) AS ?NgramCandidate) 
                                
            BIND( 
              (REPLACE
               (REPLACE
                (REPLACE
                 (REPLACE
                  (STR(?NgramCandidate),"([;:])",""),
                  "(^\\s+)",""),
                 "(\\s+$)",""),
                "([ ]{2,})"," ")
              ) AS ?Ngram) 
    
            BIND(STRLEN(?Ngram) AS ?Length) 
            FILTER (?Length > 3 )  
            FILTER (?Length <= ?ClearTitleLength )  
    
            BIND(STRLEN(REPLACE(?Ngram, "\\S", "")) + 1 as ?N)
            BIND((STRLEN(?Ngram) - STRLEN(REPLACE(?Ngram, "-", "")))  as ?Dashes)
          }
      GROUP BY ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub
      HAVING(?Count > 1)
    } AS %ngrams 
    WHERE {
      INCLUDE %ngrams 
      # Exclude Ngrams starting or ending with any of a set of blacklisted words
      BIND("(a|and|between|during|for|from|in|of|on|or|the|to|with)" AS ?blacklist)
      BIND( CONCAT( "(^", ?blacklist ,")+( )+") AS ?RegexBlackStart)
      BIND( CONCAT( "( )+(", ?blacklist ,")+$") AS ?RegexBlackEnd)
      FILTER (!REGEX(?Ngram, ?RegexBlackStart))
      FILTER (!REGEX(?Ngram, ?RegexBlackEnd))
    
    #   # Exclude Ngrams too similar to the target
    #   FILTER (!CONTAINS(?Ngram, "climate"))
    #   FILTER (!CONTAINS(?Ngram, "change"))
              
      ?ExamplePub wdt:P1476 ?ExamplePubTitle.
      FILTER(LANG(?ExamplePubTitle)="en") 
    }
    GROUP BY ?Ngram ?N ?Count ?Length ?Dashes ?Score ?ExamplePub ?ExamplePubTitle
    ORDER BY DESC(?Score) DESC(?Count) DESC(?Length)
    LIMIT 200
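The counting and scoring steps of the query above can be approximated in Python on a handful of titles. This is a rough sketch, not a port of the query: it assumes that the regexes amount to sliding an eight-word window over each padded title (so shorter n-grams arise at the start and end of a title), and the function names and sample titles are made up for illustration.

```python
import re
from collections import Counter
from typing import List, Set, Tuple

STOP_WORDS = {"a", "and", "between", "during", "for", "from", "in",
              "of", "on", "or", "the", "to", "with"}
WINDOW = 8  # assumption: the query looks at windows of up to 8 words


def clean(title: str) -> str:
    """Lower-case a title and drop the punctuation that the query removes."""
    return re.sub(r"[.:,;\[\]?()$]", "", title).lower()


def window_ngrams(title: str) -> Set[str]:
    """Slide a WINDOW-word frame over the padded title and collect its words."""
    words = clean(title).split()
    padded = [None] * WINDOW + words + [None] * WINDOW
    grams = set()
    for i in range(len(padded) - WINDOW + 1):
        gram = " ".join(w for w in padded[i:i + WINDOW] if w is not None)
        if gram:
            grams.add(gram)
    return grams


def score(ngram: str, count: int) -> float:
    """Count * Length * (Dashes + 1) / N, as in the ?Score expression."""
    n_words = ngram.count(" ") + 1
    dashes = ngram.count("-")
    return count * len(ngram) * (dashes + 1) / n_words


def rank(titles: List[str]) -> List[Tuple[str, int, float]]:
    """Return (ngram, count, score) rows, best score first."""
    counts = Counter()
    for title in set(titles):  # each distinct title counts once per n-gram
        counts.update(window_ngrams(title))
    rows = []
    for gram, count in counts.items():
        words = gram.split()
        if count <= 1 or len(gram) <= 3:
            continue  # mirrors HAVING(?Count > 1) and FILTER(?Length > 3)
        if len(words) > 1 and (words[0] in STOP_WORDS or words[-1] in STOP_WORDS):
            continue  # drop n-grams that start or end with a stop word
        rows.append((gram, count, score(gram, count)))
    return sorted(rows, key=lambda row: (-row[2], -row[1], -len(row[0])))


if __name__ == "__main__":
    sample = ["Palaeoclimate reconstruction from tree rings",
              "Palaeoclimate reconstruction of the Holocene"]
    for row in rank(sample):
        print(row)
```

The score rewards n-grams that are frequent, long and hyphenated while dividing by the number of words, so a hyphenated term ranks above an unhyphenated phrase of the same length and frequency.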
    

Potential things to work on

See also