Property talk:P4839


Documentation

Wolfram Language entity code
input form for an entity in Wolfram Language
Represents: Wolfram Alpha (Q207006)
Applicable "stated in" value: Wolfram Language (Q15241057)
Data type: External identifier
Allowed values: Entity\["[A-Z][a-zA-Z]+", \{?"[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*\}?\] and Entity\[.+\]
Usage notes: To retrieve the entity code, type the item into Wolfram Alpha, mouse over the input interpretation, then click on Plain Text (bottom right corner) and select the value under "Wolfram Language code:" (if available).
Example:
Earth (Q2): Entity["Planet", "Earth"]
Riemann hypothesis (Q205966): Entity["FamousMathProblem", "RiemannHypothesis"]
Basshunter (Q383541): Entity["Person", "Basshunter::v4vqn"]
PostScript (Q218170): Entity["FileFormat", "PS-1"]
Washington County (Q507256): Entity["AdministrativeDivision", {"WashingtonCounty", "Maine", "UnitedStates"}]
Clifton Bridge (Q5133183): Entity["Bridge", "CliftonBridge(Nottingham)::j23y6"]
Do Androids Dream of Electric Sheep? (Q605249): Entity["Book", "DoAndroidsDreamOfElectricSheep?1968"]
bit shift (Q10265617): Entity["Word", "bit shift"]
Hawaii Five-0 (Q728443): Entity["TelevisionProgram", "HawaiiFive02010"]
Wolfram Language (Q15241057): Entity["ProgrammingLanguage", "WolframLanguage"]
Stephen Wolfram (Q310798): Entity["Person", "StephenWolfram::j276d"]
Formatter URL: https://www.wolframalpha.com/input/?i=$1
Tracking: usage: Category:Pages using Wikidata property P4839 (Q63179353)
See also: MathWorld ID (P2812), Wolfram Language unit code (P7007), Wolfram Language quantity ID (P7431), Wolfram Language entity type (P7497)
Lists
Proposal discussion
Current uses
Total: 242,348
Main statement: 242,320 out of 16,881,532 (1% complete); >99.9% of uses
Qualifier: 11; <0.1% of uses
Reference: 17; <0.1% of uses
Search for values
Format “Entity\["[A-Z][a-zA-Z]+", \{?"[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*\}?\]”: value must be formatted using this pattern (PCRE syntax). (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4839#Format, SPARQL
Scope is as main value (Q54828448): the property must be used only in the specified way (Help)
List of violations of this constraint: Database reports/Constraint violations/P4839#Scope, hourly updated report, SPARQL
Distinct values: this property likely contains a value that is different from all other items. (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4839#Unique value, SPARQL (every item), SPARQL (by value)
Format “Entity\[.+\]”: value must be formatted using this pattern (PCRE syntax). (Help)
List of violations of this constraint: Database reports/Constraint violations/P4839#Format, hourly updated report, SPARQL
Allowed entity types are Wikibase item (Q29934200): the property may only be used on a certain entity type (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4839#Entity types
Single best value: this property generally contains a single value. If there are several, one would have preferred rank (Help)
Exceptions are possible as rare values may exist. Exceptions can be specified using exception to constraint (P2303).
List of violations of this constraint: Database reports/Constraint violations/P4839#single best value, SPARQL
First part should be a value of Wolfram Language entity type (P7497)
type is currently not a value of Wolfram Language entity type (P7497) (Help)
Violations query: SELECT * { { SELECT ?type (COUNT(*) as ?count) (SAMPLE(?value) as ?samplevalue) (SAMPLE(?item) as ?item) { ?item wdt:P4839 ?value . BIND(strafter(strbefore(?value, "\","),"Entity[\"") as ?type) } GROUP BY ?type } MINUS { [] wdt:P7497 ?type } } ORDER BY DESC(?count)
List of violations of this constraint: Database reports/Complex constraint violations/P4839#First part should be a value of Wolfram Language entity type (P7497)
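
The type extraction in the query above (STRBEFORE followed by STRAFTER) can be tried offline. The following is a minimal Python sketch of that string logic only; the helper names strbefore, strafter and entity_type are made up for illustration, and the sketch does not query Wikidata or check the result against P7497.

```python
def strbefore(text: str, sep: str) -> str:
    """Mimic SPARQL STRBEFORE: text before the first sep, or "" if sep is absent."""
    head, found, _ = text.partition(sep)
    return head if found else ""


def strafter(text: str, sep: str) -> str:
    """Mimic SPARQL STRAFTER: text after the first sep, or "" if sep is absent."""
    _, found, tail = text.partition(sep)
    return tail if found else ""


def entity_type(value: str) -> str:
    """Extract the first argument of Entity[...] the way the violations query does."""
    return strafter(strbefore(value, '",'), 'Entity["')


if __name__ == "__main__":
    for value in (
        'Entity["Planet", "Earth"]',
        'Entity["AdministrativeDivision", {"WashingtonCounty", "Maine", "UnitedStates"}]',
    ):
        print(entity_type(value))
```

A value that does not start with Entity[" or has no closing quote and comma after the type yields an empty string, which is how such values end up grouped under an empty ?type in the report.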
Check Entity name
Check that the entity type is listed in https://reference.wolfram.com/language/guide/EntityTypes.html (Help)
Violations query: SELECT ?item ?value { ?item p:P4839 [ ps:P4839 ?value ]. BIND( REGEX( STR( ?value ), "^(.+(Language|Word|GrammaticalUnit|WritingScript|Alphabet|Character|Concept|WritingDirection|WritingScriptBaseline|WritingScriptType|HistoricalCountry|HistoricalEvent|HistoricalPeriod|HistoricalSite|Shipwreck|MilitaryConflict|Person|PersonTitle|GivenName|Surname|Gender|Emotion|Food|FoodType|BasicFoodGroup|USDAFoodGroup|FoodTypeGroup|FoodAlcoholLabel|FoodCaffeineLabel|FoodCalorieLabel|FoodFiberLabel|FoodFatLabel|FoodIronLabel|FoodSodiumLabel|FoodSugarLabel|FoodBoneContent|FoodSkinContent|FoodSeedContent|FoodCrustType|FoodFatType|FoodGeometryType|FoodPeelingType|FoodProcessingType|FoodServingType|FoodStorageType|FoodSugarType|FoodBeefGrade|FoodMeatCut|FoodMeatQuality|FoodPattyCount|FoodBrandName|FoodSubBrandName|FoodManufacturer|FoodAge|FoodComposition|FoodConcentration|FoodCulture|FoodDataSource|FoodFlavor|FoodIntendedUse|FoodLocation|FoodMoistureLevel|FoodNutritionalSupplement|FoodNutritionalSupplementNotAdded|FoodPackaging|FoodPart|FoodPreparation|FoodSeafoodVariety|FoodSize|FoodState|FoodSugarType|FoodTexture|FoodTrimmingLevel|FoodVariety|FoodVegetablePart|Financial|Company|CurrencyDenomination|SportObject|SportMatch|MusicalInstrument|BoardGame|PopularCurve|YogaPose|YogaPosition|YogaSequence|YogaProp|PilatesExercise|Pokemon|Digimon|Language|Religion|Mythology|Movie|MusicAct|MusicAlbum|MusicAlbumRelease|MusicWork|MusicWorkRecording|BroadcastStation|BroadcastStationClassification|Book|Artwork|Periodical|FictionalCharacter|Museum|LibraryBranch|LibrarySystem|FrequencyAllocation|BroadcastStation|MeasurementDevice|Building|Bridge|Tunnel|Dam|Mine|Aircraft|Airline|Airport|Ship|WeatherStation|TropicalStorm|Cloud|AtmosphericLayer|Earthquake|GeologicalLayer|GeologicalPeriod|Mineral|FamousGem|TidalConstituent|TideStation|Satellite|Rocket|DeepSpaceProbe|MannedSpaceMission|Planet|PlanetaryMoon|MinorPlanet|Comet|SolarSystemFeature|MeteorShower|Exoplanet|Star|Galaxy|StarCluster|Nebula|Supernova|Pulsar|AstronomicalRadioSource|Constellation|Icon|Color|ColorSet|LightColor|FileFormat|DisplayFormat|NotableComputer|InternetDomain|IPAddress|NetworkService|TopLevelDomain|ProgrammingLanguage|WolframLanguageSymbol|Polyhedron|Solid|Lamina|Surface|SpaceCurve|PlaneCurve|Lattice|LatticeSystem|PeriodicTiling|NonperiodicTiling|Graph|Knot|FiniteGroup|MathematicalFunction|IntegerSequence|ContinuedFraction|FunctionSpace|TopologicalSpaceType|FamousMathProblem|FamousMathGame|ComputationalComplexityClass|MathWorld|ContinuedFractionResult|ContinuedFractionSource|FunctionalAnalysisSource|Plant|Species|Dinosaur|DogBreed|CatBreed|AnatomicalStructure|AnimalAnatomicalStructure|Neuron|Disease|MedicalTest|Protein|AnatomicalFunctionalConcept|AnatomicalTemporalConcept|CognitiveTask|ICDNine|ICDTen|Gene|SNP|Protein|Chemical|Element|Isotope|Particle|Mineral|Laser|CrystalFamily|CrystalSystem|CrystallographicSpaceGroup|PhysicalSystem|PhysicalConstant|FamousPhysicsProblem|FamousChemistryProblem|Color|ColorSet|LightColor|MeasurementDevice|Country|AdministrativeDivision|City|Neighborhood|MetropolitanArea|ZIPCode|USCongressionalDistrict|DistrictCourt|Ocean|Island|UnderseaFeature|Reef|Beach|Lake|Mountain|Volcano|River|Glacier|Waterfall|EarthImpact|Desert|Forest|GeographicRegion|Airport|Park|AmusementPark|AmusementParkRide|Stadium|Bridge|Canal|Tunnel|Dam|Mine|Cave|OilField|Building|Castle|Cemetery|HistoricalSite|PreservationStatus|ReserveLand|Shipwreck|University|SchoolDistrict|PublicSchool|PrivateSchool|Museum|LibrarySystem|LibraryBranch|WeatherStation|AstronomicalObservatory|ParticleAccelerator|NuclearReactor|NuclearTestSite|NuclearExplosion|TimeZone).+)$" ) AS ?regexresult ) . FILTER( ?regexresult = false ) . FILTER( ?item NOT IN ( wd:Q4115189, wd:Q13406268, wd:Q15397819 ) ) . }
List of violations of this constraint: Database reports/Complex constraint violations/P4839#Check Entity name

Constraint regex too tight

Just a note: the regular expression doesn't allow for colons within the strings, which are valid in WL. For instance, Entity["Species", "Species:HomoSapiens"] is the entity code for Q5, but it shows a constraint violation. --Hebejebelus (talk) 13:36, 26 January 2019 (UTC)

@Hebejebelus: I made the minimal change to the regex. I don't know where the "third string" is used, so I didn't change that part to accept a single colon. I also added your string to Homo Sapiens. I see you suggested adding it to human (Q5) instead; that separation seems to be debated anyway.
Note that entity properties aren't supported for P4839, so Entity["Species", "Species:HomoSapiens"][EntityProperty["Species", "ScientificName"]] isn't accepted by the regex, and I think it should not be. Jagulin (talk) 05:08, 22 August 2019 (UTC)
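
For anyone who wants to check values outside Wikidata, here is a minimal Python sketch that applies the format pattern from the property documentation to the strings mentioned in this thread. It assumes the constraint is applied to the whole value (hence re.fullmatch), and the name is_valid_entity_code is made up for illustration.

```python
import re

# Format pattern copied from the property documentation (PCRE syntax; this
# subset of the syntax behaves the same way in Python's re module).
ENTITY_CODE = re.compile(
    r'Entity\["[A-Z][a-zA-Z]+", \{?"[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*\}?\]'
)


def is_valid_entity_code(value: str) -> bool:
    """Return True if the whole value matches the format pattern."""
    return ENTITY_CODE.fullmatch(value) is not None


if __name__ == "__main__":
    candidates = [
        'Entity["Species", "Species:HomoSapiens"]',
        'Entity["AdministrativeDivision", {"WashingtonCounty", "Maine", "UnitedStates"}]',
        'Entity["Species", "Species:HomoSapiens"][EntityProperty["Species", "ScientificName"]]',
    ]
    for value in candidates:
        print(is_valid_entity_code(value), value)
```

With the current pattern, the colon inside the second string is accepted, while the trailing EntityProperty lookup is rejected because nothing may follow the closing bracket.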

Data type

Why isn't this an external identifier? --99of9 (talk) 01:21, 1 May 2019 (UTC)

I came with the same question: Wikidata property related to software (Q21126229) seems incorrect for this item. The proposal clearly talks about it as an ID. Conceptually, from Wolfram's point of view, it's a more complex search string, but with the restrictions added for WD, Wolfram Language entity code (P4839) should be reclassified and renamed from "code" to "ID". (Note that Wolfram Language unit code (P7007) is, to me, slightly different: its proposal talks about code rather than an ID. The discussion on that item doesn't exactly change the outcome here.)
@Jura1: Can you take care of it, raising the consensus discussion elsewhere if needed? Jagulin (talk) 04:17, 22 August 2019 (UTC)
I left a note on Project chat [1]. --- Jura 08:52, 22 August 2019 (UTC)
I created a ticket to change the datatype. Lea Lacroix (WMDE) (talk) 14:22, 30 September 2019 (UTC)
Weak oppose (later changed to Support). Hi. Just saw this, so I hope I'm not too late to the discussion. ("weak" oppose because there might be some higher principle that is stronger than the argument I want to make.) The string stored in this property is actual Wolfram Language (Q15241057) code, as can be seen in the following:
In[1]:= Needs["GraphStore`"]

In[2]:= result = SPARQLExecute[
  "https://query.wikidata.org/sparql",
  "select * where { wd:Q937 wdt:P4839 ?entity }"
]

Out[2]= {<|"entity" -> "Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"|>}

In[3]:= entityCode = result[[1, "entity"]]

Out[3]= "Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"

In[4]:= SyntaxQ[entityCode]

Out[4]= True

In[5]:= entity = ToExpression[entityCode]

Out[5]= Entity["Person", "AlbertEinstein::6tb7g"]

The second input queries the Wolfram Language entity code (P4839) for Albert Einstein (Q937). The third input extracts the single result and stores it in the variable entityCode. SyntaxQ is a built-in function that confirms that this is valid code. Note that this "code" is not yet usable in the language to identify anything: it first has to be parsed, which is what ToExpression does. So I think this demonstrates that this is "code", just as with Wolfram Language unit code (P7007). Toni 001 (talk) 11:17, 2 October 2019 (UTC)


Now that I have shown some code, I want to point out that production-quality code should not use the single-argument ToExpression[...], but rather the safer ToExpression[..., HoldComplete]. This is important because this property could be filled with any malicious expression, say Quit. The HoldComplete wrapper prevents evaluation and gives an opportunity to inspect the expression. Luckily, a whitelist of certain patterns is sufficient. Here is a demonstration of how the value of this property can be safely converted to an expression:

(* wrapper that applies a predicate to an unevaluated argument *)
unev[pred_] := Function[Null, pred[Unevaluated[#]], HoldAllComplete];

(* strong patterns for strings and integers *)
$str = _String?(unev[StringQ]);
$int = _Integer?(unev[IntegerQ]);

(* checks whether all leaves of a held expression match a pattern *)
safeExprQ[expr_HoldComplete, patt_] := MatchQ[
	Level[expr, {-1}, HoldComplete, Heads -> True],
	HoldComplete[HoldComplete, patt ..]
];
safeExprQ[_, _] := False;

(* safely convert a string to an expression *)
FromEntityCode[entityCode_String] := Module[
	{res},
	res = Quiet[ToExpression[entityCode, InputForm, HoldComplete]];
	If[
		Or[
			FailureQ[res],
			! safeExprQ[res, $str | $int | Entity | List],
			! MatchQ[res, HoldComplete[Entity[_String, _]]]
		],
		Return[
			Failure["InvalidEntityCode", <|
				"MessageTemplate" -> "The value `EntityCode` for the entity code (P4839) is invalid.",
				"MessageParameters" -> <|"EntityCode" -> entityCode|>
			|>],
			Module
		];
	];
	res = ReleaseHold[res];
	res
];

Example:

In[7]:= FromEntityCode["Entity[\"Person\", \"AlbertEinstein::6tb7g\"]"]

Out[7]= Entity["Person", "AlbertEinstein::6tb7g"]

In[8]:= FromEntityCode["weird"]

Out[8]= Failure["InvalidEntityCode", ...]

Toni 001 (talk) 12:10, 2 October 2019 (UTC)
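
For scripts that cannot call into a Wolfram kernel, the same whitelist idea can be approximated by never evaluating the string at all and only pulling out its parts. The following is a minimal Python sketch, assuming the format pattern from the property documentation is an acceptable whitelist. parse_entity_code is a hypothetical helper name, not part of any Wolfram or Wikidata library, and it returns the name parts as a flat list rather than distinguishing a plain string from a one-element list.

```python
import re
from typing import List, Optional, Tuple, Union

# Same shape as the format pattern in the property documentation, with
# capture groups added for the type, the optional braces and the name parts.
_CODE = re.compile(
    r'Entity\["([A-Z][a-zA-Z]+)", (\{?)("[^"]+"(?:, (?:\d+|"[a-zA-Z]+"))*)(\}?)\]'
)
# One name part: a quoted string or a bare non-negative integer.
_PART = re.compile(r'"([^"]+)"|(\d+)')

Part = Union[str, int]


def parse_entity_code(value: str) -> Optional[Tuple[str, List[Part]]]:
    """Split an entity code into (type, name parts) without evaluating it.

    Returns None for anything outside the whitelisted shape.
    """
    match = _CODE.fullmatch(value)
    if match is None:
        return None
    entity_type, open_brace, body, close_brace = match.groups()
    # The pattern allows an unbalanced brace; reject that case explicitly.
    if bool(open_brace) != bool(close_brace):
        return None
    parts: List[Part] = [
        text if text else int(number) for text, number in _PART.findall(body)
    ]
    return entity_type, parts


if __name__ == "__main__":
    print(parse_entity_code('Entity["Person", "AlbertEinstein::6tb7g"]'))
    print(parse_entity_code("weird"))
```

Because nothing is ever evaluated, an input such as Quit is simply rejected. The trade-off is that any valid Wolfram Language expression outside the pattern is rejected too.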


Thanks, Toni. I don't quite understand what the issue is, though. Are you saying we should be storing something other than what we currently store as the identifier? If so, what? (Sorry, I'm not familiar enough with WA to tell right away.) --Lydia Pintscher (WMDE) (talk) 13:42, 2 October 2019 (UTC)
Some comments (nothing that would prevent making this an ID, but things worth considering):
  • This property is very useful as it is right now, so I'm not proposing any change. (In fact, at Wolfram Research (Q1367937) we are looking at it very carefully and might be using it for upcoming functionality.) My point is: "yes, this is code". My weaker point is that "code and ID might be conflicting concepts", but this might depend on the context. This can be seen when looking at a few examples like "ISBN-10", "CAS Registry Number", ...; putting this property in the list makes it stand out because it is the only one that can't be used as it is, but needs to be "parsed".
  • Another way to look at it is that an entity actually consists of two IDs: A "type" (first argument of the function Entity) and a "canonical name" (second argument); the latter is defined only in the context of the former. But creating individual ID properties for those two parts would not help at the moment because the "canonical name" is not always a string.
  • Finally, Wolfram Alpha (Q207006) accepts natural language input, but as an additional feature it accepts (a subset of) WL code; that's why there can be a formatter URL (P1630) for this property.

Toni 001 (talk) 14:34, 2 October 2019 (UTC)
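
To illustrate the last point, the formatter URL listed in the property documentation can be filled in by hand. The following is a minimal Python sketch, assuming plain percent-encoding of the whole value is acceptable to Wolfram Alpha; Wikidata's own link rendering may escape fewer characters, and wolfram_alpha_url is a made-up helper name.

```python
from urllib.parse import quote

# Formatter URL (P1630) as listed in the property documentation.
FORMATTER_URL = "https://www.wolframalpha.com/input/?i=$1"


def wolfram_alpha_url(entity_code: str) -> str:
    """Substitute a percent-encoded entity code for $1 in the formatter URL."""
    return FORMATTER_URL.replace("$1", quote(entity_code, safe=""))


if __name__ == "__main__":
    print(wolfram_alpha_url('Entity["Planet", "Earth"]'))
```

The resulting link only resolves because Wolfram Alpha's input parser recognizes the fragment as code, as described above; the string is not looked up as an opaque ID.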

No real objection from me, but see also my comment on this proposal, where I make this ID vs. code distinction. Toni 001 (talk) 14:11, 8 October 2019 (UTC)
By the way, I don't want to hold up this change, especially if I'm the only one being a little skeptical. Maybe I'm missing a point in this discussion: Is it just for the values being listed in the "identifiers" section? Is there a technical / philosophical / practical / ... reason? My reasons would fall into the "(very) technical" category. Toni 001 (talk) 15:46, 8 October 2019 (UTC)
Maybe there could be a "computer code" datatype for all properties whose values are valid syntax in some programming language. It could be just like "monolingual text", but instead of the (human) language, an item representing the programming language is specified. Toni 001 (talk) 13:56, 9 October 2019 (UTC)

Ok, so to make sure I understand this correctly: WA doesn't actually have unique IDs for entities? There is only a type plus an ID that uniquely identifies an entity? Is that the root of the issue? --Lydia Pintscher (WMDE) (talk) 10:55, 10 October 2019 (UTC)

@Lydia Pintscher (WMDE): That summary sounds correct to me. To reformulate that in my programmer's mind: There is no ID, say 123abc, which could be fed anywhere to produce the expression/code/function call Entity[type, canonicalName]. The formatter URL works not because the string "Entity[\"type\", ...]" is an ID (well, depending on the precise definition of ID, of course), but because the natural language parser recognizes that fragment as code, finds the head to be Entity, then resolves the two arguments. The first argument, the type, does have official URLs, for instance Planet. Toni 001 (talk) 11:44, 10 October 2019 (UTC)
There are other identifiers that have various parts that form an identifier in combination. Given the above, can we also convert Wolfram Language unit code (P7007)? BTW, a property for Wolfram Language entity types would probably be useful as well. --- Jura 09:40, 13 October 2019 (UTC)
I changed my mind. Being code or a composite identifier does not make this less of an identifier. So I support changing this to an external ID. I would like to keep the current name though, to emphasize the code character. Toni 001 (talk) 10:05, 15 October 2019 (UTC)
I went ahead and made a proposal for types: Wikidata:Property proposal/Wolfram Language entity type. --- Jura 06:47, 19 October 2019 (UTC)

✓ Done, thanks to all involved, notably @Lea Lacroix (WMDE), Lydia Pintscher (WMDE), Ladsgroup: at WMDE. --- Jura 11:40, 15 November 2019 (UTC)

Mixnmatch P2264

This item is getting clogged up with the numerous Mix'n'match catalog ID (P2264) statements added. What is the purpose of those? In the original example, P2264 is used to identify AAT. For this item, P2264 seems to list any query that uses Wolfram. Should they be removed? Jagulin (talk) 05:20, 22 August 2019 (UTC)

How do I find the entity code?

I'd like to add the entity code for Pomona College (Q7227384), but I can't figure out how to find it. Could we please add some instructions here? Courtesy ping IvanP. Sdkb (talk) 23:50, 14 July 2020 (UTC)

@Sdkb: Type Pomona College into Wolfram|Alpha, mouse over the input interpretation, then click on Plain Text (bottom right corner): Entity["University", "PomonaCollege::jycd5"]. -- IvanP (talk) 20:07, 16 July 2020 (UTC)
@IvanP: Thanks! I added usage instructions to this property. Sdkb (talk) 20:28, 16 July 2020 (UTC)

Missing reference for number of entries in the database

The statement at https://www.wikidata.org/wiki/Property:P4839#P4876 is missing a reference. Could someone fix that? --So9q (talk) 09:15, 27 January 2021 (UTC)

Shouldn't wiki entries with instance: Human only accept the Entity["Person"] Wolfram Language code?

[Gaye], for example, has two entries: one as the music act and one as the person entity. LuukH87 (talk) 14:15, 20 January 2022 (UTC)

Canonical and alternate names

Some entities accept alternate names in the second argument. Send it through EntityValue[...] to determine what the canonical name is:

In[3]:= Internal`ClearEntityValueCache[]

In[4]:= EntityValue[Entity["Language", "French"]]

Out[4]= Entity["Language", "French::367gk"]

Toni 001 (talk) 05:56, 6 April 2022 (UTC)

This requires a local Wolfram installation to do the conversion. I have a script that converts between Wolfram code and Wikidata entities, and it breaks when only the canonical form is present. I suggest we keep all forms that the Wolfram Language accepts, but mark the canonical form as the preferred statement. Vincent cloutier (talk) 18:14, 6 April 2022 (UTC)