User:ProteinBoxBot/SPARQL Examples

From Wikidata
Jump to navigation Jump to search

Query Wikidata with SPARQL[edit]

SPARQL queries can be submitted via a web form at http://query.wikidata.org/, or, for programmatic access: https://query.wikidata.org/sparql. The latter URL allows integrating Wikidata items with external SPARQL endpoints through federated queries or to integrate in data analysis packages such as R or any other platform with a SPARQL plugin.

This page collects biomedical example queries to the SPARQL endpoint of Wikidata. For more general example queries, see here.

Drugs & Diseases[edit]

Find drugs that treat a disease and show a link for each supporting reference[edit]

The following query uses these:

  • Properties: NCI Thesaurus ID (P1748)  View with Reasonator View with SQID, formatter URL (P1630)  View with Reasonator View with SQID, drug or therapy used for treatment (P2176)  View with Reasonator View with SQID, ChEMBL ID (P592)  View with Reasonator View with SQID, NDF-RT ID (P2115)  View with Reasonator View with SQID
    #Find drugs that treat a disease and show a link for each supporting reference
    SELECT ?disease ?diseaseLabel ?diseaseDescription ?drug ?drugLabel ?drugDescription ?link
    WHERE {
     ?disease wdt:P1748 'C3243' . #multiple sclerosis 
     ?disease p:P2176 ?disease_drug .  #statement about drug used for treatment
     ?disease_drug ps:P2176 ?drug . #which drug was it in that statement...  
     ?disease_drug prov:wasDerivedFrom ?reference . #chemblid pr:P592 , #NDF-RT P2115        
     optional { 
       ?reference pr:P592 ?chemblid . 
       wd:P592 wdt:P1630 ?url .
       BIND (replace(?url, "\\$1", ?chemblid) AS ?link)           
     }
     optional {
      ?reference pr:P2115 ?NDF_RT_ID .  
      wd:P2115 wdt:P1630 ?url .
      BIND (replace(?url, "\\$1", ?NDF_RT_ID ) AS ?link) 
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Get all the drug-drug interactions for Methadone[edit]

The following query uses these:

  • Properties: ChEMBL ID (P592)  View with Reasonator View with SQID, significant drug interaction (P769)  View with Reasonator View with SQID
    #Get all the drug-drug interactions for Methadone
      SELECT ?compound ?chembl ?label WHERE {
      ?item wdt:P592 'CHEMBL651' .
      ?item wdt:P769 ?compound .
      ?compound wdt:P592 ?chembl .
      OPTIONAL  {?compound rdfs:label ?label filter (lang(?label) = "en")}
    }
    

Get all drug indications and their source database[edit]

The following query uses these:

  • Properties: medical condition treated (P2175)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID
    SELECT distinct ?drug ?drugLabel ?value ?valueLabel ?reference_stated_inLabel  WHERE {
      ?drug p:P2175 ?statement .
      ?statement ps:P2175 ?value .  # get the value associated with the statement
      ?statement prov:wasDerivedFrom/pr:P248 ?reference_stated_in . #where stated
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Drug Repurposing[edit]

Drug interacts with protein encoded by gene with association to disease. Showing Metformin[edit]

The following query uses these:

  • Properties: physically interacts with (P129)  View with Reasonator View with SQID, encoded by (P702)  View with Reasonator View with SQID, genetic association (P2293)  View with Reasonator View with SQID
    #Drug interacts with protein encoded by gene with association to disease. Showing Metformin
    SELECT ?gene ?geneLabel ?disease ?diseaseLabel WHERE {
      wd:Q19484 wdt:P129 ?gene_product .   # drug (metformin) interacts with a gene_product 
      ?gene_product wdt:P702 ?gene .  # gene_product is encoded by a gene
      ?gene	wdt:P2293 ?disease .    # gene is genetically associated with a disease 
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .	}
    }
    

Find drugs for cancers that target genes related to cell proliferation, where a drug physically interacts with the product of gene known to be genetically associated to a disease[edit]

These cases may show opportunities to repurpose a drug for a new disease See [1] and [2]. An example that was recently validated involved a new link between metformin (Q19484) and cancer survival [3]. Query is currently set up to find drugs for cancers that target genes related to cell proliferation. Adapt by changing constraints (e.g. to 'heart disease' Q190805) or removing them

The following query uses these:

  • Properties: physically interacts with (P129)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, genetic association (P2293)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, biological process (P682)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, drug or therapy used for treatment (P2176)  View with Reasonator View with SQID
    #Find drugs for cancers that target genes related to cell proliferation, where a drug physically interacts with the product of gene known to be genetically associated to a disease
    SELECT ?drugLabel ?geneLabel ?biological_processLabel ?diseaseLabel WHERE {
      ?drug wdt:P129 ?gene_product .   # drug interacts with a gene_product 
      ?gene wdt:P688 ?gene_product .  # gene_product (usually a protein) is a product of a gene (a region of DNA)
      ?disease	wdt:P2293 ?gene .    # genetic association between disease and gene 
      ?disease wdt:P279* wd:Q12078 .  # limit to cancers wd:Q12078 (the * operator runs up a transitive relation..)
      ?gene_product wdt:P682 ?biological_process . #add information about the GO biological processes that the gene is related to  
      #limit to genes related to certain biological processes (and their sub-processes):
      		#apoptosis wd:Q14599311 
      		#cell proliferation wd:Q14818032
      {?biological_process wdt:P279* wd:Q14818032 } # chain down subclass
       UNION 
      {?biological_process wdt:P361* wd:Q14818032 } # chain down part of
      #uncomment the next line to find a subset of the known true positives (there are not a lot of them in here yet)
      #?disease wdt:P2176 ?drug . 	# disease is treated by a drug 
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .	}
    }
    

Variants[edit]

Which variant of which gene predicts a positive prognosis in colorectal cancer[edit]

The following query uses these:

  • Properties: positive prognostic predictor for (P3358)  View with Reasonator View with SQID, biological variant of (P3433)  View with Reasonator View with SQID
    #Which variant of which gene predicts a positive prognosis in colorectal cancer
    SELECT ?geneLabel ?variantLabel WHERE {  
      values ?disease {wd:Q188874}
      ?variant wdt:P3358 ?disease ; # P3358 Positive prognostic predictor
               wdt:P3433 ?gene . # P3433 biological variant of
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

This list is periodically updated by a bot. Manual changes to the list will be removed on the next update!

WDQS | PetScan | TABernacle | Find images | Recent changes | Query: SELECT ?item WHERE { ?item wdt:P3358 wd:Q188874 ; wdt:P3433 ?gene . }
label description biological variant of
KRAS G12D genetic variant KRAS
CDX2 EXPRESSION genetic variant CDX2
MIR218-1 EXPRESSION genetic variant MIR218-1
GNAS c.393T>C genetic variant GNAS
DCC EXPRESSION genetic variant DCC
POLE P286R genetic variant POLE
POLE V411L genetic variant POLE
POLE S459F genetic variant POLE
BRAF Non-V600 genetic variant BRAF
End of automatically generated list.
Given a variant ID, find drugs that may target the corresponding protein[edit]

The following query uses these:

  • Properties: HGVS nomenclature (P3331)  View with Reasonator View with SQID, biological variant of (P3433)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, physically interacts with (P129)  View with Reasonator View with SQID
    # Given a variant ID, find drugs that may target the corresponding protein
    SELECT ?variant ?variantLabel ?gene ?geneLabel ?drug ?drugLabel WHERE {
      values ?hgvsid {"NC_000007.13:g.140453136A>T"}  # currently basing on HGVS ID because we don't have dbsnp IDs yet
      ?variant wdt:P3331 ?hgvsid .
      ?variant wdt:P3433 ?gene .
      ?gene wdt:P688 ?protein .
      ?protein wdt:P129 ?drug .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .	}
    }
    
Given a variant ID, find drugs that may target a protein in the same pathway[edit]

The following query uses these:

  • Properties: HGVS nomenclature (P3331)  View with Reasonator View with SQID, biological variant of (P3433)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, has part(s) (P527)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, physically interacts with (P129)  View with Reasonator View with SQID
    # Given a variant ID, find drugs that may target a protein in the same pathway
    SELECT ?gene ?geneLabel ?pathway ?pathwayLabel ?pathwayProtein ?pathwayProteinLabel ?drug ?drugLabel WHERE {
      values ?hgvsid {"NC_000007.13:g.140453136A>T"}  # currently basing on HGVS ID because we don't have dbsnp IDs yet
      ?variant wdt:P3331 ?hgvsid .
      ?variant wdt:P3433 ?gene .
      ?gene wdt:P688 ?protein .
    
      ?pathway wdt:P527* ?protein .
      ?pathway wdt:P31 wd:Q4915012 .
      ?pathway wdt:P527* ?pathwayProtein .
      ?pathwayProtein wdt:P31 wd:Q8054 .
      ?pathwayProtein wdt:P129 ?drug .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .	}
    }
    

Microbial Queries[edit]

Request all taxa that have at least one gene and are child of the genus Chlamydia[edit]

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, parent taxon (P171)  View with Reasonator View with SQID
    #Request all taxa that have at least one gene and are child of the genus Chlamydia
    SELECT DISTINCT ?taxa ?taxaLabel WHERE {
      ?gene wdt:P31 wd:Q7187 .
      ?gene wdt:P703 ?taxa .
      ?taxa wdt:P171* wd:Q846309 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Request all operons and their genes for Listeria monocytogenes EGD-e[edit]

This query is meant to show the structured version of data published in The Listeria transcriptional landscape from saprophytism to virulence. (Q28131414), and specifically the table starting on page 17 of the supplementary file.

The following query uses these:

  • Properties: NCBI taxonomy ID (P685)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, strand orientation (P2548)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, has part(s) (P527)  View with Reasonator View with SQID, NCBI locus tag (P2393)  View with Reasonator View with SQID, Entrez Gene ID (P351)  View with Reasonator View with SQID
    #Request all operons and their genes for Listeria monocytogenes EGD-e
    SELECT ?operon ?operonLabel (if(?label = "Forward Strand"@en, '+', '-') as ?strand)
     (group_concat(distinct ?locustag; separator=" ") as ?locustagG) WHERE {
      ?strain wdt:P685 "169963". # NCBI Taxonomy ID for Listeria monocytogenes EGD-e
      ?operon wdt:P703 ?strain; # get operon that is found in that genome
              wdt:P2548 ?strand; # get strand orientation
              wdt:P31 wd:Q139677; #instance of operon
              wdt:P527 ?gene. # has part gene (gets all genes in operon)
      ?gene wdt:P2393 ?locustag; # get ncbi locus tag for genes in operon
            wdt:P351 ?entrez.  # get ncbi entrez id for genes in operon
          SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
       ?strand rdfs:label ?label . filter(lang(?label) = 'en')
    } GROUP BY ?operonLabel ?strandLabel ?label ?operon
    ORDER BY ?operonLabel
    

Request all operons, their regulators, and their products[edit]

The following query uses these:

  • Properties: subclass of (P279)  View with Reasonator View with SQID, has part(s) (P527)  View with Reasonator View with SQID, product or material produced or service provided (P1056)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID
    #Request all operons, their regulators, and their products 
    SELECT ?taxa_name ?regulator_name ?operon_name ?go_name ?product_name 
    WHERE {   
    ?operon wdt:P279 wd:Q139677 ;
    rdfs:label ?operon_name ;
       	wdt:P527 ?gene ;
       	wdt:P1056 ?protein .  
    ?regulator wdt:P128 ?operon  ;
       	rdfs:label ?regulator_name .
    ?protein ?function_type ?go_term ;
     	wdt:P1056 ?product .
    ?go_term wdt:P686 ?go_id ;
       	rdfs:label ?go_name .   
    ?product rdfs:label ?product_name . 
    ?gene wdt:P703 ?taxa .
    ?taxa rdfs:label ?taxa_name . 
    FILTER (LANG(?taxa_name) = "en") .
       	FILTER (LANG(?regulator_name) = "en") .
       	FILTER (LANG(?go_name) = "en") 
      	FILTER (LANG(?product_name) = "en") .  
    }
    

Request all organisms that are located in the female urogenital tract and that have a gene with product indole[edit]

The following query uses these:

Return gene counts for each bacterial genome in Wikidata[edit]

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, parent taxon (P171)  View with Reasonator View with SQID
    #Return gene counts for each bacterial genome in Wikidata
    SELECT ?species ?label (count (DISTINCT ?gene) as ?gene_counts) WHERE {
      ?gene wdt:P31 wd:Q7187 .
      ?gene wdt:P703 ?species .
      ?species wdt:P171* wd:Q10876 .
      ?species rdfs:label ?label FILTER (lang(?label) = "en") .
    } GROUP BY ?species ?label ORDER BY DESC(?gene_counts)
    

Return protein counts for each bacterial genome in Wikidata[edit]

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, parent taxon (P171)  View with Reasonator View with SQID
    #Return protein counts for each bacterial genome in Wikidata
    SELECT ?species ?label (count (DISTINCT ?protein) as ?protein_counts) WHERE {
      ?protein wdt:P31 wd:Q8054 ;
        wdt:P703 ?species .
      ?species wdt:P171* wd:Q10876 .
      ?species rdfs:label ?label FILTER (lang(?label) = "en") .
    } GROUP BY ?species ?label ORDER BY DESC(?protein_counts)
    

Queries for microRNAs and extracellular RNAs[edit]

Get all mature miRNAs in Wikidata[edit]

#Get all mature miRNAs in Wikidata 
SELECT ?mirna ?mirnaLabel WHERE {
  ?mirna wdt:P31 wd:Q23838648.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
Try it!

Retrieve all mature miRNAs and their targets (names and NCBI entrez gene ID)[edit]

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID, Entrez Gene ID (P351)  View with Reasonator View with SQID
    #Retrieve all mature miRNAs and their targets (names and NCBI entrez gene ID)
    SELECT DISTINCT ?mirna ?mirnaLabel ?gene ?geneLabel ?entrez WHERE {
      ?mirna wdt:P31 wd:Q23838648 ;
             wdt:P128 ?gene .
      ?gene wdt:P351 ?entrez .
      
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

Retrieve all genes and mature miRNAs which are involved in the 'regulation of immune response' (GO:0050776)[edit]

The following query uses these:

  • Properties: biological process (P682)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID, encoded by (P702)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID
    #Retrieve all genes and mature miRNAs which are involved in the 'regulation of immune response' (GO:0050776)
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel WHERE {
      ?protein wdt:P682 [wdt:P686 'GO:0050776'] . # regulation of immune response
      ?protein wdt:P702 ?gene .
      
      ?mir wdt:P128 ?gene .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}
    }
    

Retrieve all genes that are involved in "regulation of immune response" AND that are regulated by an miRNA expressed in blood plasma or saliva[edit]

The following query uses these:

  • Properties: biological process (P682)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID, encoded by (P702)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID
    #Retrieve all genes that are involved in "regulation of immune response" AND that are regulated by an miRNA expressed in blood plasma or saliva
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel ?fluidLabel WHERE {
      ?protein wdt:P682 [wdt:P686 'GO:0050776'] . # regulation of immune response
      ?protein wdt:P702 ?gene .
      
      ?mir wdt:P128 ?gene .
      ?mir wdt:P361 ?fluid .
      values ?fluid {wd:Q79749 wd:Q155925} .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}
    }
    

For these genes involved in regulation of immune reponse, are there drugs which modulate the immune response, by targeting one of these gene?[edit]

The following query uses these:

Biological processes impacted by hsa-miR-211-5p[edit]

The following query uses these:

  • Properties: regulates (molecular biology) (P128)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, biological process (P682)  View with Reasonator View with SQID
    #Biological processes impacted by hsa-miR-211-5p
    SELECT DISTINCT ?bioProcess ?bioProcessLabel WHERE {
    	?mirna rdfs:label 'hsa-miR-211-5p'@en .
        ?mirna wdt:P128 ?gene .
        ?gene wdt:P688 ?protein .
        ?protein wdt:P682 ?bioProcess .
      
      	SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}
    }
    

Retrieve all human genes which are involved any immune response (as annotated by GO) AND are regulated by hsa-miR-211-5p.[edit]

The following query uses these:

  • Properties: subclass of (P279)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID, encoded by (P702)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID
    #Retrieve all human genes which are involved any immune response (as annotated by GO) AND are regulated by hsa-miR-211-5p.
    SELECT DISTINCT ?geneLabel ?goLabel ?mirLabel WHERE {
      ?go wdt:P279* [wdt:P686 'GO:0006955'] . # immune response
      ?p ?pr ?go .
      ?p wdt:P702 ?gene .
      
      ?mir wdt:P128 ?gene .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}  
    }
    GROUP BY ?goLabel ?mirLabel ?geneLabel
    HAVING(?mirLabel = 'hsa-miR-211-5p'@en)
    

Same query as the previous, but also retrieving small molecules interacting with these gene products.[edit]

The following query uses these:

  • Properties: subclass of (P279)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID, encoded by (P702)  View with Reasonator View with SQID, physically interacts with (P129)  View with Reasonator View with SQID, regulates (molecular biology) (P128)  View with Reasonator View with SQID
    #Same query as the previous, but also retrieving small molecules interacting with these gene products.
    SELECT DISTINCT ?gene ?geneLabel ?mir ?mirLabel ?drug ?drugLabel WHERE {
      ?x wdt:P279* [wdt:P686 'GO:0006955'] . # regulation of immune response
      ?p ?pr ?x .
      ?p wdt:P702 ?gene .
      OPTIONAL { ?p wdt:P129 ?drug . }
      
      ?mir wdt:P128 ?gene . 
      
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" .}
    }
    GROUP BY ?gene ?mir ?mirLabel ?geneLabel ?drug ?drugLabel
    HAVING(?mirLabel = 'hsa-miR-211-5p'@en)
    

BOSC 2017 examples[edit]

Get genes with GWAS association with asthma[edit]

Demonstrates basic retrieval of GWAS catalog data Result (as of 2017-07-22): 39 genes

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID
    #Get genes with GWAS association with asthma
    SELECT DISTINCT ?gene ?geneLabel where {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"
      ?gene wdt:P31 wd:Q7187 .     # gene is subclass of "gene"
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

... and whose gene product is localized to membrane[edit]

Demonstrates basic integration w/ Gene Ontology Result (as of 2017-07-22): 22 genes

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID
    #... and whose gene product is localized to membrane
    SELECT DISTINCT ?gene ?geneLabel where {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"
      
      ?gene wdt:P31 wd:Q7187 .     # gene is subclass of "gene"
    
      ?gene wdt:P688 ?protein .                # gene encodes a protein
      ?protein wdt:P681 ?cc .                  # protein has a cellular component
      ?cc wdt:P279*|wdt:P361* wd:Q14349455 .   # cell component is 'part of' or 'subclass of' membrane
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

... where the GO localization is based on a non-IEA evidence code[edit]

Demonstrates computing on provenance Result (as of 2017-07-22): 15 genes

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #... and whose gene product is localized to membrane
    SELECT DISTINCT ?gene ?geneLabel where {
      ?gene wdt:P2293 wd:Q35869 .  # gene has genetic association to "asthma"
      
      ?gene wdt:P31 wd:Q7187 .     # gene is subclass of "gene"
    
      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

... with GWAS association with any respiratory disease[edit]

Demonstrates leveraging ontology structure Result (as of 2017-07-22): 31 genes over 8 diseases

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #... with GWAS association with any respiratory disease
    SELECT ?diseaseGALabel (count (DISTINCT ?gene) as ?gene_counts) 
    (group_concat(DISTINCT ?geneLabel; separator=", ") as ?geneList) WHERE {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease
      
      ?gene wdt:P31 wd:Q7187 ; wdt:P688 ?protein ;    # gene is subclass of "gene" and encodes protein
            rdfs:label ?geneLabel . 
      FILTER (lang(?geneLabel) = "en")
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } 
    GROUP BY ?diseaseGALabel ?geneList ORDER BY DESC(?gene_counts)
    

... show associated chemical exposures[edit]

Demonstrates opportunistic integration with independent community contribution Result (as of 2017-07-22): 4 diseases / 6 chemical hazards

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, has effect (P1542)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #... show associated chemical exposures
    SELECT DISTINCT ?diseaseGA ?diseaseGALabel ?exposure ?exposureLabel where {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease
      
      ?gene wdt:P31 wd:Q7187 .     # gene is subclass of "gene"
    
      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane
    
      ?exposure wdt:P1542 ?diseaseGA .  # something causes disease
      ?exposure wdt:P279 wd:Q21167512 . # and that something is a chemical hazard
      
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

... and show associated pathways[edit]

Demonstrates opportunistic integration with other community contributions Result (as of 2017-07-22): 59 gene-pathway combos

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, has part(s) (P527)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #... and show associated pathways
    SELECT DISTINCT ?gene ?geneLabel ?pathwayLabel where {
      ?gene wdt:P2293 ?diseaseGA .        # gene has genetic association
      ?diseaseGA wdt:P279* wd:Q3286546 .  # to a type of respiratory system disease
      
      ?gene wdt:P31 wd:Q7187 .     # gene is subclass of "gene"
    
      ?gene wdt:P688 ?protein .                        # gene encodes a protein
      ?protein p:P681 ?s .                             # protein's cell component statement
        ?s ps:P681 ?cp .                               # get statement value
        FILTER NOT EXISTS {?s pq:P459 wd:Q23190881 .}  # determination method is not IEA
        ?cp wdt:P279*|wdt:P361* wd:Q14349455 .         # statement value is 'part of' or 'subclass of' membrane
    
      ?pathway wdt:P31 wd:Q4915012 ;                   # instance of a biological pathway
               wdt:P527 ?gene .
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

NCATS Translator Queries[edit]

What compensatory mutations in FA core genes confer resistance to chemotherapeutic drugs (e.g. cisplatin)?[edit]

See here for more information.

The following query uses these:

  • Properties: HGNC gene symbol (P353)  View with Reasonator View with SQID, biological variant of (P3433)  View with Reasonator View with SQID, PubChem CID (P662)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, negative therapeutic predictor for (P3355)  View with Reasonator View with SQID, reference URL (P854)  View with Reasonator View with SQID, medical condition treated (P2175)  View with Reasonator View with SQID
    #What compensatory mutations in FA core genes confer resistance to chemotherapeutic drugs (e.g. cisplatin)?
    select ?geneLabel ?variantLabel ?variant ?drugLabel ?cid ?diseaseLabel ?ref where {
      values ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc .
      ?variant wdt:P3433 ?gene .
      ?variant p:P3355 ?s .
      ?s ps:P3355 ?drug .
      ?s prov:wasDerivedFrom/pr:P854 ?ref .
      ?drug wdt:P662 ?cid .
      ?s pq:P2175 ?disease .
      ?disease wdt:P279* wd:Q12078 .
      SERVICE wikibase:label {  bd:serviceParam wikibase:language "en" }
    }
    

Get GO Annotations for FA genes[edit]

The following query uses these:

  • Properties: HGNC gene symbol (P353)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, molecular function (P680)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, biological process (P682)  View with Reasonator View with SQID, Gene Ontology ID (P686)  View with Reasonator View with SQID
    #Get GO Annotations for FA genes
    SELECT ?hgnc ?protein ?go ?goLabel ?goId WHERE {
      values ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc .  # get gene items with these HGNC symbols
      ?gene wdt:P688 ?protein . # get the protein
      ?protein wdt:P680|wdt:P681|wdt:P682 ?go . # get GO terms
      ?go wdt:P686 ?goId
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Retrieve orthologs of FA-core genes[edit]

Used for part of the competency question here.

The following query uses these:

  • Properties: HGNC gene symbol (P353)  View with Reasonator View with SQID, ortholog (P684)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID
    #Retrieve orthologs of FA-core genes
    SELECT ?hgnc ?gene ?geneLabel ?ortho ?orthoLabel ?taxonLabel WHERE
    {
      values ?hgnc {"FANCA" "FANCB" "FANCC" "FANCE" "FANCF" "FANCG" "FANCL" "FANCM" "FANCD2" "FANCI" "UBE2T" "BRCA2" "BRIP1" "PALB2" "RAD51C" "SLX4" "ERCC4" "RAD51" "BRCA1" "MAD2L2" "XRCC2" "RFWD3"}
      ?gene wdt:P353 ?hgnc . 
      ?gene wdt:P684 ?ortho .
      ?ortho wdt:P703 ?taxon
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Wikidata FAIR paper[edit]

Consider a pulmonologist who is interested in identifying candidate chemical compounds for testing in disease models. She may start by identifying genes with a genetic association to any respiratory disease, with a particular interest in genes that encode membrane-bound proteins (for ease in cell sorting). She may then look for chemical compounds that either directly inhibit those proteins, or finding none, compounds that inhibit another protein in the same pathway. Because she has collaborators with relevant expertise, she may specifically filter for proteins containing a serine-threonine kinase domain.

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, genetic association (P2293)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, encodes (P688)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, has part(s) (P527)  View with Reasonator View with SQID, physically interacts with (P129)  View with Reasonator View with SQID, has use (P366)  View with Reasonator View with SQID
    SELECT DISTINCT ?compound ?compoundLabel where {
    
      # gene has genetic association with a respiratory disease  
      ?gene       wdt:P31    wd:Q7187 .
      ?gene       wdt:P2293  ?diseaseGA .
      ?diseaseGA  wdt:P279*  wd:Q3286546 .
    
      # gene product is localized to the membrane
      ?gene     wdt:P688             ?protein .
      ?protein  wdt:P681             ?cc .
      ?cc       wdt:P279*|wdt:P361*  wd:Q14349455 .
    
      # gene is involved in a pathway with another gene (gene2)
      ?pathway  wdt:P31   wd:Q4915012 ;
                wdt:P527  ?gene ;
                wdt:P527  ?gene2 .
      ?gene2    wdt:P31   wd:Q7187 . 
    
      # gene2 product has a Ser/Thr protein kinase domain AND known enzyme inhibitor  
      ?gene2     wdt:P688  ?protein2 .
      ?protein2  wdt:P129  ?compound ;
                 wdt:P527  wd:Q24787419 ;
                 p:P129    ?s2 .
      ?s2        ps:P129   ?cp2 .
      ?compound  wdt:P31   wd:Q11173 .
      FILTER EXISTS {?s2 pq:P366 wd:Q427492 .}
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

Other Example Queries[edit]

Retrieve an item by its external ID[edit]

Get the item that has the Disease Ontology ID "DOID:8577"

The following query uses these:

  • Properties: Disease Ontology ID (P699)  View with Reasonator View with SQID
    #Retrieve an item by its external ID
    SELECT ?item ?itemLabel WHERE {
       ?item wdt:P699 "DOID:8577" .
       SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    

Retrieve all items with an external ID[edit]

Get all items with a Disease Ontology ID

The following query uses these:

  • Properties: Disease Ontology ID (P699)  View with Reasonator View with SQID
    #Get all items with a Disease Ontology ID
    SELECT ?item ?doid WHERE {
       ?item wdt:P699 ?doid .
    }
    

Example demonstrating GROUP BY, COUNT and ORDER BY[edit]

Count the number of genes in each taxon by NCBI Tax ID, sort by the count

The following query uses these:

  • Properties: NCBI taxonomy ID (P685)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, Entrez Gene ID (P351)  View with Reasonator View with SQID
    #Count the number of genes in each taxon by NCBI Tax ID
    
    SELECT (COUNT(?gene) as ?count) ?taxon ?taxonLabel ?taxids WHERE {
      values ?taxids {"559292" "6239" "7227" "7955" "10090" "10116" "9606"}
      ?taxon wdt:P685 ?taxids .
      ?gene wdt:P703 ?taxon .
      ?gene wdt:P351 ?en
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } GROUP BY ?taxon ?taxonLabel ?taxids
    ORDER BY DESC(?count)
    

Example demonstrating how to retrieve qualifiers and references for statements[edit]

Get Gene Ontology subcellular localization information, with evidence codes, and references for Reelin

The following query uses these:

  • Properties: UniProt protein ID (P352)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID, retrieved (P813)  View with Reasonator View with SQID, reference URL (P854)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #Get Gene Ontology subcellular localization information, with evidence codes, and references for Reelin
    SELECT distinct ?proteinLabel ?value ?valueLabel ?determination ?determinationLabel ?reference_stated_inLabel ?reference_retrieved ?reference_URL WHERE {
      ?protein wdt:P352 "P78509" . # get a protein by uniprot id 
      ?protein p:P681 ?statement . # get the cell component statements
      ?statement ps:P681 ?value .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      # change ?determination to wd:Q23175558 for ISS (Inferred from Sequence or structural Similarity)
      # or e.g. wd:Q23190881 for IEA (Inferred from Electronic Annotation)
      #add reference links 
      ?statement prov:wasDerivedFrom/pr:P248 ?reference_stated_in . #where stated
      ?statement prov:wasDerivedFrom/pr:P813 ?reference_retrieved . #when retrieved
      ?statement prov:wasDerivedFrom/pr:P854 ?reference_URL 
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    } ORDER BY ?value
    

Example demonstrating how to retrieve determination method qualifiers[edit]

Get all genetic association claims linking gene to disease, along with the determination method (GWAS, TAS, etc).

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    #Get all genetic association claims linking gene to disease, along with the determination method (GWAS, TAS, etc).
    SELECT distinct ?geneLabel ?gene ?diseaseLabel ?disease ?determinationLabel WHERE {
      ?gene p:P2293 ?statement . # all gene disease genetic associations
      ?statement ps:P2293 ?disease .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    }
    

Example demonstrating how to retrieve references for statements that cite journal articles[edit]

Get Gene Ontology subcellular localization information and references for proteins

The following query uses these:

  • Properties: UniProt protein ID (P352)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, PubMed ID (P698)  View with Reasonator View with SQID, PMCID (P932)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID
    #Get Gene Ontology subcellular localization information and references for proteins
    
    SELECT ?proteinLabel ?uniprot ?valueLabel ?paperLabel ?PMID ?PMCID WHERE {
      ?protein wdt:P352 ?uniprot .
      ?protein p:P681 ?statement .
      ?statement ps:P681 ?value . 
      ?statement prov:wasDerivedFrom/pr:P248 ?paper .
      ?paper wdt:P31 wd:Q13442814 .
      OPTIONAL { ?paper wdt:P698 ?PMID . }
      OPTIONAL { ?paper wdt:P932 ?PMCID . }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } limit 10
    

Example demonstrating how to retrieve information about references for statements[edit]

Get all Gene Ontology information and references for proteins, but only show statements where the refernece is a journal article published in Nature, Cell, or Science

The following query uses these:

  • Properties: UniProt protein ID (P352)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, PubMed ID (P698)  View with Reasonator View with SQID, published in (P1433)  View with Reasonator View with SQID, molecular function (P680)  View with Reasonator View with SQID, cell component (P681)  View with Reasonator View with SQID, biological process (P682)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID
    #Get all Gene Ontology information and references for proteins, but only show statements where the refernece is a journal article published in Nature, Cell, or Science
    SELECT ?proteinLabel ?uniprot ?valueLabel ?goTypeLabel ?paperLabel ?PMID ?PMCID WHERE {
      ?protein wdt:P352 ?uniprot .
      ?protein p:P680|p:P681|p:P682 ?statement .
      ?statement ps:P680|ps:P681|ps:P682 ?value . 
      ?value wdt:P31 ?goType .
      ?statement prov:wasDerivedFrom/pr:P248 ?paper .
      ?paper wdt:P698 ?PMID .
      ?paper wdt:P1433 ?journal .
      values ?journal {wd:Q192864 wd:Q180445 wd:Q655814}
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    } limit 1000
    

Example demonstrating how to filter out unreferenced claims[edit]

Get all "drug used for treatment of disease" claims with a column that indicates whether a reference exists for the claim.

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID, drug or therapy used for treatment (P2176)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID
    SELECT DISTINCT ?disease ?diseaseLabel ?drug ?drugLabel ?hasRef ?stated_inLabel WHERE {
      ?disease wdt:P31 wd:Q12136 ;  # find items that are in instance of disease
            p:P2176 ?id .  # get "drug used for treatment" statements
      ?id ?b ?drug .  # get the object used in these statements
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      # FILTER NOT EXISTS { ?id prov:wasDerivedFrom ?provenance }  # filter out statements with no references
      # ?id prov:wasDerivedFrom ?provenance  # only keep statements with a references
      BIND(EXISTS {?id prov:wasDerivedFrom ?provenance } as ?hasRef) # tag statements with whether or not a ref exists
      OPTIONAL {?id prov:wasDerivedFrom ?prov .
                ?prov pr:P248 ?stated_in }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Example demonstrating how to filter out contradictory claims[edit]

Get "positive therapeutic predictor" claims, filtering out claims that are "disputed by" anything

The following query uses these:

  • Properties: CIViC variant ID (P3329)  View with Reasonator View with SQID, positive therapeutic predictor for (P3354)  View with Reasonator View with SQID, statement disputed by (P1310)  View with Reasonator View with SQID
    SELECT DISTINCT ?item ?itemLabel ?civic ?value ?valueLabel ?disputed_by WHERE {
      ?item wdt:P3329 ?civic ;  # find items that have a civic id
            p:P3354 ?id .  # get "positive therapeutic predictor" statements
      ?id ?b ?value .  # get the object used in these statements
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      # FILTER NOT EXISTS {?id pq:P1310 ?disputed_by } # filter out statements that have a disputing qualifier
      BIND(EXISTS {?id pq:P1310 [] } as ?disputed_by)
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Miscellaneous Queries[edit]

Get mapping of Wikipedia articles to WikiData items to Entrez Gene IDs[edit]

The following query uses these:

  • Properties: Entrez Gene ID (P351)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID
    SELECT ?entrez_id ?cid ?article ?label WHERE {
        ?cid wdt:P351 ?entrez_id .
      	?cid wdt:P703 wd:Q15978631 . 
        OPTIONAL { ?cid rdfs:label ?label filter (lang(?label) = "en") . }
        ?article schema:about ?cid .
        ?article schema:inLanguage "en" .
        FILTER (SUBSTR(str(?article), 1, 25) = "https://en.wikipedia.org/") . 
        FILTER (SUBSTR(str(?article), 1, 38) != "https://en.wikipedia.org/wiki/Template")
    } limit 10
    

Count of number of GO annotations on yeast grouped by curator[edit]

The following query uses these:

Retrieve reverse/"what links here" statements (statements where the item is the object)[edit]

The following query uses these:

  • Items: night blindness (Q7758678)  View with Reasonator View with SQID, color blindness (Q133696)  View with Reasonator View with SQID
    SELECT ?item ?itemLabel ?property ?propertyLabel ?value ?valueLabel ?id
    WHERE {
      values ?value {wd:Q7758678 wd:Q133696}
      ?item ?propertyclaim ?id .
      ?property wikibase:propertyType wikibase:WikibaseItem .
      ?property wikibase:claim ?propertyclaim .
      ?id ?b ?value .
      FILTER(regex(str(?b), "http://www.wikidata.org/prop/statement" ))
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Get all proteins with GO annotations that are a subclass of "signaling receptor activity", where the determination method is a type of manual assertion, and the evidence is a scientific article that was published since 2014[edit]

The following query uses these:

  • Properties: Gene Ontology ID (P686)  View with Reasonator View with SQID, subclass of (P279)  View with Reasonator View with SQID, found in taxon (P703)  View with Reasonator View with SQID, UniProt protein ID (P352)  View with Reasonator View with SQID, molecular function (P680)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, publication date (P577)  View with Reasonator View with SQID, PubMed ID (P698)  View with Reasonator View with SQID, stated in (P248)  View with Reasonator View with SQID, curator (P1640)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    SELECT distinct ?uniprot ?determinationLabel ?curatorLabel ?reference_stated_inLabel ?pmid ?publication_date ?go_id ?sig_rec_goLabel WHERE  {
      
      ?sig_rec_go wdt:P686 ?go_id . # get GO IDs
      ?sig_rec_go wdt:P279* wd:Q21109843 . # that are subclasses of "signaling receptor activity"
      
      ?protein wdt:P703 wd:Q15978631 . # get items that are "found in taxon" human
      ?protein wdt:P352 ?uniprot . # and have a uniprot ID
      
      ?protein wdt:P680 ?sig_rec_go . # proteins where the MF a signaling receptor activity subclass
      ?protein p:P680 ?statement . # get statements
      
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      ?determination wdt:P31 wd:Q28955254 . # filter where the determination method is a "manual assertion"
      
      ?statement prov:wasDerivedFrom/pr:P248 ?reference_stated_in . # get the "stated in" reference
      ?reference_stated_in wdt:P31 wd:Q13442814 . # stated in a "scientific article"
      ?reference_stated_in wdt:P577 ?publication_date . # get the publication date
      ?reference_stated_in wdt:P698 ?pmid . # get the pubmed id
      FILTER (?publication_date >= "2014-01-01T00:00:00Z"^^xsd:dateTime) . # filter where publication date is after 2014
      
      ?statement prov:wasDerivedFrom/pr:P1640 ?curator . # get the curator
    
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
      
    } limit 50
    

Get labels and description for Wikidata items[edit]

Get label, synonyms, description for for a selected list of diseases as DOID

The following query uses these:

  • Properties: Disease Ontology ID (P699)  View with Reasonator View with SQID
    SELECT DISTINCT ?item ?doid ?itemLabel (group_concat(distinct ?itemaltLabel; separator="|") as ?altLabel) ?itemDesc WHERE {
      ?item wdt:P699 ?doid .
      values ?doid {"DOID:0050602" "DOID:0060308" "DOID:0060728" "DOID:10595" "DOID:11589" "DOID:2476" "DOID:5212"}
      OPTIONAL{
      ?item skos:altLabel ?itemaltLabel .
        FILTER(LANG(?itemaltLabel) = "en")
      ?item schema:description ?itemDesc .
        FILTER(LANG(?itemDesc) = "en")
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    group by ?item ?doid ?itemLabel ?itemDesc
    

Count language labels for diseases[edit]

The following query uses these:

  • Properties: Disease Ontology ID (P699)  View with Reasonator View with SQID
    SELECT ?disease ?doid ?enLabel (count(?language) as ?languages) WHERE {
    	?disease wdt:P699 ?doid ;
                 rdfs:label ?label ;
                 rdfs:label ?enLabel .
        FILTER (lang(?enLabel) = "en")
        BIND (lang(?label) AS ?language)
    } group by ?disease ?doid ?enLabel
    order by desc(?languages)
    

Example counting statements by their determination method[edit]

Get all "genetic association" (P2293) claims linking gene to disease, how many are from GWAS versus other methods?

The following query uses these:

  • Properties: genetic association (P2293)  View with Reasonator View with SQID, determination method (P459)  View with Reasonator View with SQID
    SELECT distinct (COUNT(*) as ?c) ?determinationLabel WHERE {
      ?gene p:P2293 ?statement . # all gene disease genetic associations
      ?statement ps:P2293 ?disease .  # get the value associated with the statement
      ?statement pq:P459 ?determination . # get 'determination method' qualifiers associated with the statements
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
    } GROUP BY ?determinationLabel
    

Get all properties and their usage counts on items that are diseases[edit]

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID
    SELECT ?propertyLabel ?propertyDescription ?pt ?count WHERE {
      {
        SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
          ?item wdt:P31 wd:Q12136 .
    	  ?item ?propertyclaim [] .
    	} GROUP BY ?propertyclaim
      }
      ?property wikibase:propertyType ?pt .
      ?property wikibase:claim ?propertyclaim .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } ORDER BY DESC (?count)
    

Get all properties and their usage counts for statements where the values are diseases[edit]

i.e. these properties have values that are diseases

The following query uses these:

  • Properties: instance of (P31)  View with Reasonator View with SQID
    SELECT ?propertyLabel ?propertyDescription ?pt ?count WHERE {
      {
        SELECT ?propertyclaim (COUNT(*) AS ?count) WHERE {
          ?id ?b ?item .
          ?item wdt:P31 wd:Q12136 .
    	  [] ?propertyclaim ?id .
    	} GROUP BY ?propertyclaim
      }
      ?property wikibase:propertyType ?pt .
      ?property wikibase:claim ?propertyclaim .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    } ORDER BY DESC (?count)
    

Quality Control (QC) Queries[edit]

See here for more info.

Query Wikidata using SPARQL Through R[edit]

Example Query[edit]

library(SPARQL)
sparql <- "https://query.wikidata.org/bigdata/namespace/wdq/sparql"
query <- "PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P279 wd:Q1049021 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language 'en' . }
}
"
results <- SPARQL(sparql, query)
View(as.matrix(results$results))

Query Wikidata using SPARQL Through Python[edit]

Example Query[edit]

from wikidataintegrator.wdi_core import WDItemEngine
import pandas as pd
r = WDItemEngine.execute_sparql_query("""SELECT ?item ?itemLabel WHERE {
  ?item wdt:P279 wd:Q1049021 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}""")['results']['bindings']
df = pd.DataFrame([{k:v['value'] for k,v in item.items()} for item in r])

Federated Queries[edit]

Wikidata -> Wikipathways[edit]

Wikipathways: Get drugs that act as channel blockers from Wikidata, get the pathways that these drugs are part of from Wikipathways[edit]

The following query uses these:

  • Properties: subclass of (P279)  View with Reasonator View with SQID, part of (P361)  View with Reasonator View with SQID, instance of (P31)  View with Reasonator View with SQID, physically interacts with (P129)  View with Reasonator View with SQID, subject has role (P2868)  View with Reasonator View with SQID
    PREFIX bd: <http://www.bigdata.com/rdf#>
    PREFIX wp:      <http://vocabularies.wikipathways.org/wp#> 
    PREFIX dcterms:  <http://purl.org/dc/terms/>
    PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
    
    SELECT DISTINCT ?metabolite ?wikidatadrug ?wikidatadrugLabel ?title ?wpIdentifier WHERE {
        {?protein wdt:P279* wd:Q422500 .}
         UNION
        {?protein wdt:P361* wd:Q422500 .}
      ?protein wdt:P31 wd:Q8054 .
      ?wikidatadrug wdt:P129 ?protein .
      ?wikidatadrug p:P129/pq:P2868 wd:Q389934 .
      SERVICE <http://sparql.wikipathways.org/> {
      ?metabolite a wp:Metabolite ;
        wp:bdbWikidata ?wikidatadrug ;
        dcterms:isPartOf ?pathway .
       ?pathway a wp:Pathway .
        ?pathway dc:title ?title .
        ?pathway dc:identifier ?wpIdentifier .
      }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    

Get all interaction between metabolites in Pathways from Wikipathways and their individual pKa values[edit]

The following query uses these:

  • Properties: pKa (P1117)  View with Reasonator View with SQID
    #defaultView:Dimensions
    PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
    PREFIX dcterms:  <http://purl.org/dc/terms/>
    PREFIX dc:      <http://purl.org/dc/elements/1.1/> 
    SELECT DISTINCT ?pwTitle ?metabolite1Label ?pKa1 ?pKa2 ?metabolite2Label WHERE {
      ?metabolite2 wdt:P1117 ?pKa2 .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
      {SELECT * WHERE {
        ?metabolite1 wdt:P1117 ?pKa1 .
        {SELECT * WHERE {
           SERVICE <http://sparql.wikipathways.org/> {
             ?pathway dc:identifier ?pw ;
                      dc:title ?pwTitle ;
                       wp:organismName "Homo sapiens"^^xsd:string .
             ?interaction rdf:type wp:Interaction ;
                    wp:participants ?wpmb1, ?wpmb2 ;
                    dcterms:isPartOf ?pathway .
              ?wpmb1 wp:bdbWikidata ?metabolite1 .
              ?wpmb2 wp:bdbWikidata ?metabolite2 .
             FILTER (?wpmb1 != ?wpmb2)}
         }
        }
       }
      }
    }
    

Find genes regulated by an miRNA of interest from Wikidata and retrieve pathways this gene is active in from WikiPathways[edit]

Submit through Wikipathways endpoint

PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?mirna ?gene ?pathway ?pLabel WHERE {
  ?target dc:identifier ?exact .
    ?target dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway .
    ?pathway <http://purl.org/dc/elements/1.1/title> ?pLabel .
  SERVICE <https://query.wikidata.org/sparql> {
    ?mirna rdfs:label 'hsa-miR-211-5p'@en .
    ?mirna wdt:P128 ?gene .
    ?gene wdt:P2888 ?exact filter (?exact = <http://identifiers.org/ncbigene/1234>)
  }
} LIMIT 10

Try it!

Uniprot -> Wikidata[edit]

Retrieve all human membrane proteins annotated for a role in colorectal cancer[edit]

Submit through uniprot endpoint

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?gene ?geneLabel ?wdncbi ?disease_text ?disease_annotation ?sa ?mesh_iri WHERE {    
  SERVICE <https://query.wikidata.org/sparql> {    
    ?gene wdt:P351 ?wdncbi ;
          wdt:P703 wd:Q15978631;
          rdfs:label ?geneLabel ;
          wdt:P688 ?wd_protein .
    ?wd_protein wdt:P352 ?uniprot_id ;
                wdt:P681 ?go_term .
    ?go_term wdt:P686 "GO:0016020" .
    FILTER (LANG(?geneLabel) = "en") .
    ?disease wdt:P31 wd:Q12136 .
    ?disease wdt:P486 ?mesh .
    ?disease wdt:P279* wd:Q188874 .
  }
  BIND(IRI(CONCAT("http://purl.uniprot.org/uniprot/", ?uniprot_id)) as ?protein)
  BIND(IRI(CONCAT("https://id.nlm.nih.gov/mesh/", ?mesh)) as ?mesh_iri)
  ?protein up:annotation ?annotation .
  ?annotation a up:Disease_Annotation .
  ?annotation up:disease ?disease_annotation .
  ?disease_annotation <http://www.w3.org/2004/02/skos/core#prefLabel> ?disease_text .
  ?disease_annotation rdfs:seeAlso ?mesh_iri
}

Try it!

Select all human UniProt entries with a sequence variant that leads to a 'loss of function' and also physically interact with (P129) a drug with a qualifier of "use" (P366) of "enzyme inhibitor" (Q427492)[edit]

Submit through uniprot endpoint

PREFIX keywords:<http://purl.uniprot.org/keywords/> 
PREFIX uniprotkb:<http://purl.uniprot.org/uniprot/> 
PREFIX ec:<http://purl.uniprot.org/enzyme/> 
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX skos:<http://www.w3.org/2004/02/skos/core#> 
PREFIX owl:<http://www.w3.org/2002/07/owl#> 
PREFIX bibo:<http://purl.org/ontology/bibo/> 
PREFIX dc:<http://purl.org/dc/terms/> 
PREFIX xsd:<http://www.w3.org/2001/XMLSchema#> 
PREFIX faldo:<http://biohackathon.org/resource/faldo#> 
PREFIX up:<http://purl.uniprot.org/core/> 
PREFIX taxon:<http://purl.uniprot.org/taxonomy/> 
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX p: <http://www.wikidata.org/prop/>

SELECT DISTINCT ?wd_item ?physically_interacts_with ?interactswithLabel ?typeLabel ?iri ?uniprot ?text  WHERE  {
   {SELECT * WHERE { ?iri a up:Protein ;
		    up:organism taxon:9606 ; 
		    up:annotation ?annotation .
		?annotation a up:Natural_Variant_Annotation ; 
		            rdfs:comment ?text .
		FILTER (CONTAINS(?text, 'loss of function')) }
   }
   SERVICE <https://query.wikidata.org/bigdata/namespace/wdq/sparql> {
      	VALUES ?use {wd:Q427492}
		?wd_item	wdt:P352 ?uniprot ;
             		p:P129 ?physically_interacts_with_node ;     
        			wdt:P2888 ?iri ;
        			wdt:P703 wd:Q15978631 .
    	?phys_interacts_with_node 	ps:P129 ?physically_interacts_with ;
                              		pq:P366 ?use .    
    	?physically_interacts_with 	wdt:P31 ?type ;
                               		rdfs:label ?interactswithLabel .
        ?type rdfs:label ?typeLabel .
    	FILTER (lang(?interactswithLabel) = "en")
        FILTER (lang(?typeLabel) = "en")
    }
}

Try it

Maintenance Queries[edit]

See here: Maintenance Queries