Wikidata:Property proposal/EIK
BG EIK
Originally proposed at Wikidata:Property proposal/Authority control
Description | BG Unified Identification Code for companies, BULSTAT code for non-commercial orgs. 9-digit for organisations, 13-digit for branches |
---|---|
Represents | Unified Identification code (Q24574607) |
Data type | External identifier |
Domain | organization (Q43229) |
Allowed values | \d{9}|\d{13} |
Example 1 | Fantastico (Q61140834) → 127624992 |
Example 2 | EPay.bg (Q12270635) → 131409398 |
Example 3 | Teximbank (Q7708347) → 040534040 |
Example 4 | Central Cooperative Bank (Q2944755) → 831447150 |
Example 5 | BULATSA (Q25476246) → 000697179 : a government trader org |
Example 6 | Ministry of Economy (Q6866930) → 000695406 : no online page for BULSTAT of government non-trade orgs :-( |
Source | XML dumps of the BG Trade Register at http://data.egov.bg: one XML file per day of messages received by the Trade Register |
Planned use | Import BG Media companies with their EIK number |
Number of IDs in source | 631k to 1.1M, see https://papagal.bg/ |
Expected completeness | always incomplete (Q21873886) |
Formatter URL | https://portal.registryagency.bg/CR/Reports/ActiveConditionTabResult?uic=$1 |
See also | EU VAT number (P3608), Companies House company ID (P2622), OpenCorporates ID (P1320) |
Motivation
Notified participants of WikiProject Companies
All Bulgarian commercial companies have an EIK, and non-commercial organizations have a BULSTAT code; both share the same identifier space.
- In a semantic technologies course at New Bulgarian University (Q370968), we are working on importing data about Media companies and their contracts, and we need the EIK
--Georgi Godev (talk) 16:54, 6 November 2020 (UTC)
Discussion:
- Support --Vladimir Alexiev (talk)
- Support This will be very useful for maintaining up to date information about organizations, such as owners of media brands and news outlets. --Nikola Tulechki (talk) 08:52, 17 November 2020 (UTC)
- Support This will be useful property. --Mariana.peteva.angelova (talk) 19:19, 17 November 2020 (UTC)
- Support This will be helpful for identification of branches of foreign companies in the BG Trade register. --Dana Tomova (talk) 21:09, 17 November 2020 (UTC)
- Support Valuable information. --Elle.Kirilova (talk) 17:31, 20 November 2020 (UTC)
- Support Reasonable. It will be a bit tricky to decide about formatter URL (P1630); we had a similar problem with Czech Registration ID (IČO) (P4156), where some governmental organizations were missing, and decided to use a trade register link. --Jklamo (talk) 16:33, 26 November 2020 (UTC)
- Done, see BG EIK (P8894). I created this in part using a new script to simplify property creation, please let me know if something went wrong --DannyS712 (talk) 00:20, 30 November 2020 (UTC)
Extra Info
- Done Aliases:
- EN: EIK, UIC, BULSTAT
- BG: ЕИК, БУЛСТАТ
- Done Added info to Unified Identification code (Q24574607). Please copy that info to this prop once created
- Done Add "instance of Wikidata property to identify organizations in company registers (Q26235166)" to the prop
- official websites:
- Trade register: https://www.registryagency.bg/bg/registri/targovski-registar/ (in BG and EN, also add qualifier "language of work", and qualifier Title)
- BULSTAT register: https://www.registryagency.bg/bg/registri/registar-bulstat/ (in BG and EN, also add qualifier "language of work", and qualifier Title)
- described at URL: http://trudipravo.bg/mesechni-spisania/mesechno-spisanie-targovsko-i-obligacionno-pravo/menuconttkp1/235-targovski-registar-edinen-identifikatzionen-kod
- third party formatter: https://bird.bg/tr/?v=view&guid=$1 , with qualifier "use: investigative journalism"
- third party formatter: https://papagal.bg/eik/$1/$2 where $2 is some kind of hash of $1, we're trying to find out what it is. Examples:
- I contacted Papagal with this message -- Vladimir Alexiev, 16 November
Hello! Thank you for the wonderful site about companies from the Trade Register! Could you tell me how the last part of the URL is formed, e.g. "958f" in https://papagal.bg/eik/127624992/958f ? We want to include links to your site as a "third party formatter URL" in https://www.wikidata.org/wiki/Wikidata:Property_proposal/EIK. We have included a link to the Trade Register there, but your pages are better and provide additional information. Wikidata is an encyclopedic database that already holds a lot of data about Bulgaria.
Regards! Vladimir Alexiev, Chief Data Architect, Ontotext.com
- I also contacted papagal through the DNS register https://www.register.bg/tld_user_reg/app.pl. It has an unspecified limit on the number of chars, so I could only send this paragraph --Vladimir Alexiev (talk) 13:19, 26 November 2020 (UTC)
Thank you for the wonderful site about companies from the Trade Register! We want to include links to your site as a "third party formatter URL" in Wikidata, see https://www.wikidata.org/wiki/Wikidata:Property_proposal/EIK. We have included a link to the Trade Register there, but your pages are better and provide additional information.
- Another third party formatter: https://e-sme.government.bg/user/preview?bulstat=$1 (only for Small and Medium Enterprises, 71k) --Vladimir Alexiev (talk) 13:07, 26 November 2020 (UTC)
- @Jklamo, ArthurPSmith: If you have prop creation rights, could you guys create this one? I and the SemWeb students at NBU need it --Vladimir Alexiev (talk) 13:21, 26 November 2020 (UTC)
- Once accepted, we should add it as an instance of Wikidata property for an identifier having a checksum (Q48710051), and add a Template:Complex_constraint with a SPARQL query to find items that fail the checksum.
- The checksum algorithm is at https://gist.github.com/vstoykov/4344506 and includes two parts: for EIK-9 and for EIK-13
- An example can be seen at https://www.wikidata.org/wiki/Property_talk:P212 "Invalid ISBN-13", and results at https://www.wikidata.org/wiki/Wikidata:Database_reports/Complex_constraint_violations/P212
- There are 489 props that use "complex constraint" and I wonder how many of them do it for a checksum
- This category https://www.wikidata.org/w/index.php?title=Category:Properties_with_complex_constraints returns pretty much the same list
- Only 17 props have Wikidata property for an identifier having a checksum (Q48710051), see this query https://w.wiki/o6D --Vladimir Alexiev (talk) 13:34, 26 November 2020 (UTC)
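For checking values offline before an import, here is a minimal Python sketch of the checksum. It assumes the commonly published EIK/BULSTAT rule (two-pass weighted sum modulo 11, with weights 1..8 then 3..10 for the 9-digit code, and 2,7,3,5 then 4,9,5,7 over digits 9 to 12 for the 13-digit code). The weights are not copied from the gist linked above, so compare against it before relying on this. The names `is_valid_eik` and `_check_digit` are hypothetical.

```python
import re

# Assumed weights (commonly published EIK/BULSTAT rule, not taken from the gist).
W9_FIRST = tuple(range(1, 9))     # 1..8, applied to digits 1-8
W9_SECOND = tuple(range(3, 11))   # 3..10, fallback when the first remainder is 10
W13_FIRST = (2, 7, 3, 5)          # applied to digits 9-12 of a 13-digit code
W13_SECOND = (4, 9, 5, 7)         # fallback when the first remainder is 10


def _check_digit(digits, first, second):
    """Expected check digit: weighted sum mod 11, second pass if it is 10."""
    remainder = sum(d * w for d, w in zip(digits, first)) % 11
    if remainder == 10:
        remainder = sum(d * w for d, w in zip(digits, second)) % 11
        if remainder == 10:
            remainder = 0
    return remainder


def is_valid_eik(value: str) -> bool:
    """True if value matches \\d{9}|\\d{13} and its check digit(s) agree."""
    if not re.fullmatch(r"\d{9}|\d{13}", value, re.ASCII):
        return False
    d = [int(c) for c in value]
    # Digit 9 is the check digit over digits 1-8.
    if _check_digit(d[:8], W9_FIRST, W9_SECOND) != d[8]:
        return False
    # For branches, digit 13 is the check digit over digits 9-12.
    if len(d) == 13:
        return _check_digit(d[8:12], W13_FIRST, W13_SECOND) == d[12]
    return True


if __name__ == "__main__":
    examples = ["127624992", "131409398", "040534040",
                "831447150", "000697179", "000695406"]
    for eik in examples:
        print(eik, is_valid_eik(eik))
```

Under these assumed weights, all six example values from the proposal table pass the 9-digit check, which gives some confidence in that part; the 13-digit branch could not be checked against a real example from this page.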
- We should use the following query to populate EIK from EU VAT and OpenCorporates ID starting with "BG". But there are only 38 values --Vladimir Alexiev (talk) 15:58, 27 November 2020 (UTC)
- github task https://github.com/NBU-DSCM-2020/dscm006-semtech-group-project/issues/16
- Properties used: EU VAT number (P3608) , OpenCorporates ID (P1320) , BG EIK (P8894)
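# Candidate EIK values derived from EU VAT number (P3608) or OpenCorporates ID (P1320)
select ?x ?xLabel ?eik {
  ?x wdt:P3608|wdt:P1320 ?id .
  filter not exists {?x wdt:P8894 []}          # skip items that already have BG EIK
  filter(regex(?id,"^bg","i"))                 # keep only IDs with a Bulgarian prefix
  bind(replace(?id,"^bg/?","","i") as ?eik)    # strip the "BG" or "bg/" prefix
  SERVICE wikibase:label { bd:serviceParam wikibase:language "bg,en". }
}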
- For VAT-registered companies, EU VAT number (P3608) is the EIK prepended with the country code "BG", and the OpenCorporates ID is the EIK prepended with "bg/". Can we add that as some sort of constraint that compares two external-id fields?
- @DannyS712, Jklamo: guess this can be done with a Complex Constraint? --Vladimir Alexiev (talk) 03:45, 11 December 2020 (UTC)
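Independently of how a constraint is expressed on-wiki, the prefix relationship described above can be checked on exported data. A minimal Python sketch follows, assuming the same prefix handling as the query above (`^bg/?`, case-insensitive) and the allowed-values pattern from the proposal. The helper names `eik_from_id` and `ids_consistent` are hypothetical, and the VAT and OpenCorporates values in the usage lines are constructed from the Fantastico EIK for illustration, not looked up.

```python
import re

EIK_PATTERN = re.compile(r"\d{9}|\d{13}", re.ASCII)   # allowed values from the proposal
BG_PREFIX = re.compile(r"^bg/?", re.IGNORECASE)       # "BG" (VAT) or "bg/" (OpenCorporates)


def eik_from_id(other_id: str):
    """Strip the Bulgarian prefix; return the digits if they look like an EIK, else None."""
    if not BG_PREFIX.match(other_id):
        return None
    candidate = BG_PREFIX.sub("", other_id, count=1)
    return candidate if EIK_PATTERN.fullmatch(candidate) else None


def ids_consistent(eik: str, vat=None, opencorporates=None) -> bool:
    """True when every supplied identifier reduces to the given EIK."""
    for other in (vat, opencorporates):
        if other is not None and eik_from_id(other) != eik:
            return False
    return True


if __name__ == "__main__":
    # Illustrative values built from the EIK 127624992 (hypothetical, not verified).
    print(eik_from_id("BG127624992"))
    print(ids_consistent("127624992", vat="BG127624992",
                         opencorporates="bg/127624992"))
```

Items for which `ids_consistent` returns False would be the ones a complex constraint should report.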