Wikidata:Lexicographical data/Documentation/Languages/ta
Subclass of | Tamil languages |
---|---|
Native label | தமிழ் |
Located in the administrative territorial entity | Tamil Nadu, Singapore, Sri Lanka |
Has tense | present tense, past tense, future tense |
Has grammatical gender | masculine, feminine, neuter |
Writing system | Tamil script, Vatteluttu, Koleluttu, Arabic script |
Ethnologue language status | 2 Provincial |
Studied in | Tamilology |
Related category | Category:Tamil pronunciation |
Time of earliest written record | 4th century BCE |
Stack Exchange tag | https://stackoverflow.com/tags/tamil |
Wikimedia language code | ta |
This page documents Tamil (Q5885) lexemes for the WikiProject Wikidata:Lexicographical data and is intended to assist contributions to Tamil lexeme content. Tamil is a Dravidian language spoken by over 75 million people, mainly in Tamil Nadu, Puducherry, Sri Lanka and Singapore. It is an agglutinative language.
Wikidata:Lexemes aims to provide CC0-licensed structured lexicographical data that anyone can use for any purpose, including Wiktionary, the upcoming Abstract Wikipedia and external projects.
Layout
Every lexeme entry has the following layout:
- Lemma (dictionary form) of the lexeme as title or headword. It is to be written in Tamil script. Every lexeme entry will have a lexeme ID.
- The language of the lexeme should be Tamil (Q5885), and a lexical category must also be specified. The lexical category should be as broad as possible and based on the Tamil linguistic ontology.
- Senses - different meanings of the same word
- Statements for senses. These usually include image, item for this sense, translation, synonym, antonym, usage example, and more (see list). Note that for the translation, antonym and synonym properties, the sense ID (LXXXXX-S1) of the target lexeme has to be copied and pasted, not the lexeme ID.
- Forms - different forms and cases of the lexeme
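The difference between a lexeme ID and a sense ID, mentioned in the layout above, is easy to get wrong when copying values by hand. The following is a minimal sketch, assuming the usual Wikidata ID shapes (`L` plus a number for a lexeme, with `-S<n>` appended for a sense and `-F<n>` for a form). The function name `classify_lexeme_id` is illustrative and not part of any Wikidata tool.

```python
import re

# Lexeme: L123, sense: L123-S1, form: L123-F1 (assumed ID shapes).
_ID_PATTERN = re.compile(r"L[1-9]\d*(?:-([SF])[1-9]\d*)?")


def classify_lexeme_id(value: str) -> str:
    """Return 'lexeme', 'sense', 'form' or 'invalid' for an ID string."""
    match = _ID_PATTERN.fullmatch(value)
    if match is None:
        return "invalid"
    kind = match.group(1)
    if kind is None:
        return "lexeme"
    return "sense" if kind == "S" else "form"


if __name__ == "__main__":
    # Only the sense ID is a valid target for translation, synonym
    # and antonym statements.
    for candidate in ("L12345", "L12345-S1", "L12345-F2", "Q5885"):
        print(candidate, classify_lexeme_id(candidate))
```

A check like this could be run on a batch of values before adding translation, synonym or antonym statements, so that plain lexeme IDs are caught early.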
Structure and properties
Common properties to be added for lexeme entries are given below:
Lemma
Lemma is the dictionary form (base form) of the word/lexeme.
Statements
- grammatical gender (P5185): (masculine (Q499327) / feminine (Q1775415))
- derived from lexeme (P5191)
- usage example (P5831)
- homograph lexeme (P5402)
- combines lexemes (P5238)
Senses
- item for this sense (P5137)
- image (P18)
- translation (P5972)
- synonym (P5973)
- antonym (P5974)
- hyperonym (P6593)
- gloss quote (P8394)
Forms
- Grammatical features
- Grammatical gender: masculine (Q499327) / feminine (Q1775415)
- Grammatical number: singular (Q110786) / plural (Q146786)
- pronunciation audio (P443)
- IPA transcription (P898)
- See Tamil grammar (Q3535154) for cases, tenses and other inflections.
- Tamil nouns are inflected based on number and grammatical case. There are 9 grammatical cases described for Tamil:
case | suffix | transliteration of suffix |
---|---|---|
nominative case (Q131105) | -∅ | |
accusative case (Q146078) | -ஐ | -ai |
instrumental case (Q192997) | -ஆல், -கொண்டு | -āl, -(aik) koṇṭu |
sociative case (Q3773161) | -ஓடு, -உடன் | -ōṭu, -uṭaṉ |
dative case (Q145599) | -(க்)கு, -இன்பொருட்டு, -இந்நிமித்தம் | -(k)ku, -iṉ poruṭṭu, -iṉ nimittam |
ablative case (Q156986) | -இலிருந்து, -இடமிருந்து, -இனின்று | -il(ē) iruntu [irrational], -iṭam iruntu [rational], -iṉiṉṟu |
genitive case (Q146233) | -அது, -ஆது, -உடைய | -atu, -ātu, -uṭaiya |
locative case (Q202142) | -இல், -இடம் | -il(ē) [irrational], -iṭam [rational] |
vocative case (Q185077) | -ஏ | -ē |
See also this Tamil Wikipedia article: w:ta:வேற்றுமை (தமிழ் இலக்கணம்)
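The suffixes in the table are written with independent vowel letters (for example -ஐ), but in a written form the vowel appears as a vowel sign on the stem's final consonant. The sketch below shows only that orthographic step, for a stem ending in a consonant with pulli followed by a vowel-initial suffix. It is not a morphological generator: consonant doubling, oblique stems, glide insertion and other sandhi rules are not handled, so any output should be checked against a dictionary before it is entered as a form. The function name `join_vowel_suffix` is hypothetical.

```python
PULLI = "\u0bcd"  # ் marks a consonant with no vowel

# Independent vowel letter -> dependent vowel sign.
# அ maps to "" because a consonant without pulli already carries "a".
VOWEL_SIGNS = {
    "\u0b85": "",        # அ
    "\u0b86": "\u0bbe",  # ஆ -> ா
    "\u0b87": "\u0bbf",  # இ -> ி
    "\u0b88": "\u0bc0",  # ஈ -> ீ
    "\u0b89": "\u0bc1",  # உ -> ு
    "\u0b8a": "\u0bc2",  # ஊ -> ூ
    "\u0b8e": "\u0bc6",  # எ -> ெ
    "\u0b8f": "\u0bc7",  # ஏ -> ே
    "\u0b90": "\u0bc8",  # ஐ -> ை
    "\u0b92": "\u0bca",  # ஒ -> ொ
    "\u0b93": "\u0bcb",  # ஓ -> ோ
    "\u0b94": "\u0bcc",  # ஔ -> ௌ
}


def join_vowel_suffix(stem: str, suffix: str) -> str:
    """Attach a vowel-initial suffix to a stem that ends in pulli.

    Purely orthographic; raises ValueError for every other combination,
    because those need sandhi rules this sketch does not implement.
    """
    if not stem.endswith(PULLI):
        raise ValueError("stem must end in a consonant with pulli")
    if suffix[:1] not in VOWEL_SIGNS:
        raise ValueError("suffix must start with an independent vowel")
    # Drop the pulli, then write the vowel as a sign on the consonant.
    return stem[:-1] + VOWEL_SIGNS[suffix[0]] + suffix[1:]


if __name__ == "__main__":
    print(join_vowel_suffix("பால்", "ஐ"))    # accusative suffix
    print(join_vowel_suffix("பால்", "இல்"))  # locative suffix
```

Raising an error for unsupported combinations is a deliberate choice here: returning a plain concatenation would silently produce misspelled forms.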
Maintenance
- Recent Changes to Tamil Lexemes
To do
- Wikidata:Wikidata Lexeme Forms - Please file a request for the Tamil language. It is an important tool for adding lexeme forms.
- Add the most frequent missing Tamil forms to Wikidata Lexicographical data.
Lexicographical Coverage
- See also: WD:Lexicographical data/Statistics
- The lexeme forms coverage chart for Tamil is based on corpus data from the Leipzig Corpora Collection (chart not reproduced here).
- Current Tamil lexemes count by lexical category: https://w.wiki/3$cS
Queries
- Main page: WD:Lexicographical data/Ideas of queries
- Tamil Q-id: Q5885
1) Get all existing lexemes in Tamil: results
The following query uses these:
- Items: Tamil (Q5885)
SELECT ?lexeme ?lemma WHERE { ?lexeme dct:language wd:Q5885; wikibase:lemma ?lemma. }
2) Get the count of lexemes in Tamil belonging to different lexical categories: https://w.wiki/3$cS
3) Get all Tamil nouns that have no form in the accusative case: query
The following query uses these:
- Items: Tamil (Q5885) , noun (Q1084) , accusative case (Q146078)
SELECT DISTINCT ?l ?lemma WHERE {
  ?l a ontolex:LexicalEntry ;
     dct:language wd:Q5885 ;
     wikibase:lexicalCategory wd:Q1084 ;
     wikibase:lemma ?lemma ;
     ontolex:lexicalForm ?form .
  ?form ontolex:representation ?word .
  MINUS {
    ?l a ontolex:LexicalEntry ;
       ontolex:lexicalForm/wikibase:grammaticalFeature wd:Q146078 .
  }
}
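The query above excludes nouns that already have a form tagged with accusative case (Q146078). The same pattern can be reused for any of the nine cases listed under Forms. The following is a minimal sketch that builds such a query and a request URL for the public Wikidata Query Service endpoint. The names `CASE_ITEMS`, `missing_case_query` and `build_wdqs_url` are illustrative, the case Q-ids are copied from the table on this page, and the generated query is a simplified variant that omits the unused `?word` variable.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Case name -> item ID, as listed in the case table on this page.
CASE_ITEMS = {
    "nominative": "Q131105",
    "accusative": "Q146078",
    "instrumental": "Q192997",
    "sociative": "Q3773161",
    "dative": "Q145599",
    "ablative": "Q156986",
    "genitive": "Q146233",
    "locative": "Q202142",
    "vocative": "Q185077",
}


def missing_case_query(case_name: str) -> str:
    """Build a query for Tamil nouns with forms, none in the given case."""
    case_qid = CASE_ITEMS[case_name]  # KeyError for an unknown case name
    return f"""SELECT DISTINCT ?l ?lemma WHERE {{
  ?l a ontolex:LexicalEntry ;
     dct:language wd:Q5885 ;
     wikibase:lexicalCategory wd:Q1084 ;
     wikibase:lemma ?lemma ;
     ontolex:lexicalForm ?form .
  MINUS {{ ?l ontolex:lexicalForm/wikibase:grammaticalFeature wd:{case_qid} . }}
}}"""


def build_wdqs_url(query: str) -> str:
    """Return a GET URL that asks the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    print(build_wdqs_url(missing_case_query("locative")))
```

The sketch only builds the URL and sends nothing. Anyone fetching it from a script should follow the query service's usage policy, for example by setting a descriptive User-Agent header.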
Resources
- Commons:Category:Tamil dictionaries - hundreds of scanned Tamil dictionaries
- Category:Tamil lemmas (Q31160964) (Note that Wiktionaries are licensed under CC BY-SA, while WD:Lexemes is licensed under CC0.)
- Tamil Wiktionary
- Wikidata:Lexicographical data - Main project page
Wikisource
- ta:s:பகுப்பு:அகரமுதலிகள் - Tamil Wikisource copy-pastable text version of Public Domain Dictionaries
Citable external resources
- Tamil Lexicon (Q120646844) Searchable Version
- A comprehensive Tamil and English dictionary of high and low Tamil (Q117344581) Searchable Version
- N. Kathiraiver Pillai's Tamil Moli Akarathi: Tamil-Tamil dictionary (Q123027241) Searchable Version
Tools