Wikidata:Lists/lexemes/en

This is about lexemes with language set to English (Q1860).

As of 2018-10-11, covering lexemes up to L30000. All figures are based on lemmas; if a lexeme has multiple lemmas, generally the first one is used.

General statistics

statistic lexemes single-word
total 10298 10199
total char 61075 60305
avg char 5.93 5.91
max char 45 45
avg vowels 2.29
max vowels 18
  • Homograph lexemes: 4129 with 2028 different lemmas
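The averages above are consistent with dividing total characters by the number of lexemes (61075 / 10298 ≈ 5.93 and 60305 / 10199 ≈ 5.91). A minimal sketch of how such figures could be derived from a plain list of lemmas follows; the function name `general_stats` and the sample input are hypothetical, and the page does not state how its own numbers were produced.

```python
from collections import Counter

def general_stats(lemmas):
    """Summarise a list of lemmas (one entry per lexeme)."""
    lengths = [len(lemma) for lemma in lemmas]
    counts = Counter(lemmas)
    # A lemma shared by two or more lexemes is treated as a homograph here (assumption).
    shared = [lemma for lemma, n in counts.items() if n > 1]
    return {
        "total": len(lemmas),
        "total_char": sum(lengths),
        "avg_char": round(sum(lengths) / len(lemmas), 2),
        "max_char": max(lengths),
        "homograph_lexemes": sum(counts[lemma] for lemma in shared),
        "homograph_lemmas": len(shared),
    }

# Hypothetical input, not the real data set.
print(general_stats(["bare", "bare", "rum", "augur"]))
```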

Lexical categories

Letter frequency

letter total %
a 5372 8.9
b 1188 2
c 2725 4.5
d 1803 3
e 6628 11
f 1070 1.8
g 1344 2.2
h 1511 2.5
i 4504 7.5
j 142 0.2
k 732 1.2
l 3759 6.2
m 1814 3
n 3526 5.8
o 3932 6.5
p 2035 3.4
q 128 0.2
r 4400 7.3
s 3315 5.5
t 4494 7.5
u 2282 3.8
v 754 1.3
w 726 1.2
x 233 0.4
y 1701 2.8
z 181 0.3
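A table like the one above could be produced roughly as follows. This is a sketch under assumptions: letters are counted case-insensitively, only a to z are considered, and the percentage is taken relative to all counted letters. The helper name `letter_frequency` is hypothetical.

```python
from collections import Counter
import string

def letter_frequency(lemmas):
    """Return {letter: (count, percent)} for the letters a-z that occur."""
    counts = Counter(
        ch
        for lemma in lemmas
        for ch in lemma.lower()
        if ch in string.ascii_lowercase
    )
    total = sum(counts.values())
    return {
        letter: (counts[letter], round(100 * counts[letter] / total, 1))
        for letter in string.ascii_lowercase
        if counts[letter]
    }

# Hypothetical input, not the real data set.
for letter, (count, percent) in letter_frequency(["bazaar", "bare"]).items():
    print(letter, count, percent)
```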


Word length

Only single-word lexemes are counted (i.e. lemmas without " " or "-").

Word count

length count sample
1 8 F
2 34 CD
3 655 rum
4 2140 bare
5 3335 augur
6 1004 bazaar
7 863 bayside
8 661 bachelor
9 533 bariatric
10 403 basketball
11 249 audiovisual
12 152 availability
13 89 axiomatically
14 45 authentication
15 14 atherosclerosis
16 6 arteriosclerosis
17 1 agrotechnological
18 1 characteristically
20+ 5 pneumonoultramicroscopicsilicovolcanoconiosis
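The length table can be sketched as a histogram over single-word lemmas, using the filter described above (no space, no dash). The names `is_single_word` and `length_table` are hypothetical, and taking the first lemma seen as the sample is an assumption; the page does not say how its samples were chosen.

```python
def is_single_word(lemma):
    """Single-word means the lemma has neither a space nor a dash."""
    return " " not in lemma and "-" not in lemma

def length_table(lemmas):
    """Map word length -> (count, first sample seen), single-word lemmas only."""
    table = {}
    for lemma in lemmas:
        if not is_single_word(lemma):
            continue
        count, sample = table.get(len(lemma), (0, lemma))
        table[len(lemma)] = (count + 1, sample)
    return dict(sorted(table.items()))

# Hypothetical input, not the real data set.
print(length_table(["rum", "bare", "set up", "CD", "bar"]))
```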

Distinct letters

word count of letters
supercalifragilisticexpialidocious 15
misconjugatedly 15
pneumonoultramicroscopicsilicovolcanoconiosis 14
comprehensively 13
discombobulate 12
electromagnetism 12
acknowledgment 12
adventurously 12
providentially 12
accomplishment 12
controversially 12
administratively 12
advantageously 12
agrochemistry 12
antidisestablishmentarianism 12
antidisestablishmentarian 12
filmography 11
relationship 11
postmeridian 11
responsibility 11
consequential 11
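Counting distinct letters amounts to taking the size of the set of letters in a word. The sketch below assumes case-insensitive counting and ignores non-letters; `distinct_letters` is a hypothetical helper name. It reproduces the counts listed above for the sample words checked in the usage lines.

```python
def distinct_letters(word):
    """Number of different letters in a word, ignoring case and non-letters."""
    return len({ch for ch in word.lower() if ch.isalpha()})

for word in ["misconjugatedly", "comprehensively", "discombobulate", "filmography"]:
    print(word, distinct_letters(word))
```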

Vowels

(approx.)

vowels count sample
1 2880 ball
2 3983 batten
3 1688 bazaar
4 1001 bariatrics
5 470 axiomatics
6 132 availability
7 32 axiomatically
8 1 aerodynamically
10 1 antidisestablishmentarian
11 1 antidisestablishmentarianism
15 1 supercalifragilisticexpialidocious
18 1 pneumonoultramicroscopicsilicovolcanoconiosis
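The page marks these counts as approximate and does not define which letters it treats as vowels. The sketch below counts only a, e, i, o and u, which is an assumption; `count_vowels` is a hypothetical helper name, and a different treatment of "y" would change some results.

```python
VOWELS = frozenset("aeiou")  # assumption: "y" is not counted

def count_vowels(word):
    """Number of vowel letters in a word, ignoring case."""
    return sum(ch in VOWELS for ch in word.lower())

for word in ["ball", "batten", "bazaar", "bariatrics"]:
    print(word, count_vowels(word))
```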


Pattern within

Introduction

Aspect count samples
palindrome 52 tenet, radar, madam, kayak, toot, toot, peep, peep, deed, boob, tot, tit, pup, pip, pep
with AEIOUY 9 educationally, revolutionary, equatorially, praseodymium, aeronautically
with A to Z 1 The quick brown fox jumps over the lazy dog /
with ABCD 11 discombobulate, abdicate, backward, background, considerable, broadcast, disturbance, adiabatic, adiabatically
without etaoin shrdlu 21 I, my, by, YDI, gym, gyp, CD, VAT, pygmy, pygmy
with ALL CAPS 15 CPU, UK, CD, VAT
with çgjpqy 1 gyp
with Q, but not QU 0
lemmas with a space (' ') 36
lemmas with a dash ('-') 37
lemmas with a quote (') 0
lemmas with a quote (’) 0
lexemes with multiple lemmas 90
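Several of the aspects above reduce to simple set or string checks. The following is a sketch with hypothetical helper names. The "without" check is written case-sensitively because samples such as "CD" and "VAT" appear in that row, which suggests upper-case letters were not matched; that is an inference from the samples, not something the page states.

```python
import string

def is_palindrome(word):
    """True if the word reads the same in both directions (case-insensitive)."""
    w = word.lower()
    return w == w[::-1]

def contains_all(word, letters):
    """True if every letter in `letters` occurs in the word (case-insensitive)."""
    return set(letters.lower()) <= set(word.lower())

def avoids_all(word, letters):
    """True if none of `letters` occurs in the word (case-sensitive, see lead-in)."""
    return set(word).isdisjoint(letters)

print(is_palindrome("kayak"))
print(contains_all("educationally", "aeiouy"))
print(contains_all("The quick brown fox jumps over the lazy dog", string.ascii_lowercase))
print(avoids_all("gym", "etaoinshrdlu"))
```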

Repeated letters

sequence count samples
aa 1 bazaar,
bb 21 abbey, crabby, robbery, ebb, shabbily, shabby, rabbit, flabbergast, wibble, abbreviate,
cc 50 successful, access, success, occur, accept, succor, accommodate, accord, vaccination, account,
dd 36 wedding, additional, address, addition, middle, odd, suddenly, sudden, add, bikeshedding,
ee 213 queen, proceed, sleep, agree, seek, three, feel, see, seed, thirteen,
ff 94 effect, affect, difficulty, difficult, different, difference, differ, stiff, jiffy, differentiate,
gg 31 ragged, trigger, aggressively, aggressive, aggression, struggle, egg, suggestion, aggregate, suggest,
kk 1 brekkie,
ll 411 follow, tall, allow, poll, recall, parallel, collate, stroll, smell, allocate,
mm 47 immediate, comment, hammer, common, summer, accommodate, commercial, communicate, community, summarize,
nn 45 perennial, annual, announcement, connection, dinner, millennium, tennis, announce, innovate, connect,
oo 210 door, cool, good, moon, tool, look, choose, book, google, coordinate,
pp 72 happen, pepper, apple, appear, approximate, support, disappear, happy, supply, appraise,
rr 91 tomorrow, arrive, correspond, carry, corroborate, array, error, arrange, rearrange, interrogate,
ss 212 pass, assign, glass, assimilate, dissect, chess, reassess, assess, discuss, express,
tt 90 attempt, letter, little, attack, matter, attend, Brittany, etiquette, chatter, attribute,
vv 2 savvy, divvy,
zz 21 dizzy, abuzz, jazz, fuzzily, fuzzy, fuzz, fizzy, fizz, buzz, puzzle,
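Doubled letters can be found with a regular-expression backreference. This sketch assumes case-insensitive matching on a to z and lists each lemma once per doubled sequence it contains; `doubled_letters` is a hypothetical helper name.

```python
import re
from collections import defaultdict

DOUBLE = re.compile(r"([a-z])\1")  # the same letter twice in a row

def doubled_letters(lemmas):
    """Map each doubled sequence (e.g. 'cc') to the lemmas that contain it."""
    table = defaultdict(list)
    for lemma in lemmas:
        found = {match.group(0) for match in DOUBLE.finditer(lemma.lower())}
        for sequence in sorted(found):
            table[sequence].append(lemma)
    return dict(table)

# Hypothetical input, not the real data set.
print(doubled_letters(["bazaar", "accommodate", "rum"]))
```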

Repeated vowel letters

repeated vowel letters count samples
aa 305 alphabetize, metadata, abandon, adapt, database, translate, appraise, arrange, rearrange, amalgamate,
ee 797 interpret, detect, converse, where, erect, implement, strengthen, develop, express, elementary,
ii 359 subdivide, simplify, initiate, finish, pontificate, wiki, prioritize, originate, elicit, supercalifragilisticexpialidocious,
oo 377 follow, control, corroborate, colour, discombobulate, October, compose, etymology, consolidate, propose,
uu 55 cultural, usually, culture, future, autumn, nurture, structure, Zulu, unlucky, August,
yy 10 pygmy, gypsy, dryly, synonymy, spryly, slyly, shyly, coyly, Aberystwyth,
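The page does not define this table. One reading that matches the listed samples is that the same vowel occurs twice in a row once all consonants are removed (for example "alphabetize" has the vowel sequence a, a, e, i, e). The sketch below implements that reading as an assumption, treating "y" as a vowel because the table has a "yy" row; the helper names are hypothetical.

```python
import re

VOWELS = "aeiouy"  # assumption: "y" is included because the table has a "yy" row
REPEAT = re.compile(r"([aeiouy])\1")

def vowel_skeleton(word):
    """The word with everything except vowel letters removed, lower-cased."""
    return "".join(ch for ch in word.lower() if ch in VOWELS)

def repeated_vowels(word):
    """Sorted list of vowel pairs that are adjacent in the vowel skeleton."""
    return sorted({match.group(0) for match in REPEAT.finditer(vowel_skeleton(word))})

for word in ["alphabetize", "wiki", "Aberystwyth", "rum"]:
    print(word, vowel_skeleton(word), repeated_vowels(word))
```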

First and second letter of word

1st / 2nd a b c d e f g h i j k l m n o p q r s t u v w x y z total
a 55 85 93 17 23 46 7 19 2 1 100 44 120 2 55 5 84 70 43 59 22 17 10 2 2 983
b 139 84 50 61 88 90 77 4 593
c 134 1 22 134 20 94 328 1 108 60 14 1 917
d 53 1 161 138 63 63 42 3 8 532
e 21 3 7 17 2 7 4 1 5 2 1 34 31 58 1 8 15 15 15 12 1 27 2 102 4 395
f 83 53 1 77 101 85 65 56 521
g 58 45 3 20 44 7 51 105 44 5 382
h 82 67 40 77 44 15 325
i 1 7 13 2 4 8 45 168 5 12 7 7 3 282
j 28 13 9 30 32 112
k 8 17 2 27 1 18 1 1 2 77
l 90 63 78 1 82 39 8 361
m 128 67 71 108 50 5 429
n 36 46 21 45 13 2 163
o 7 13 11 5 14 2 3 3 2 3 1 9 2 26 30 2 4 15 14 7 5 1 179
p 122 97 26 65 69 1 117 176 4 65 5 747
q 63 63
r 100 309 9 57 85 45 1 606
s 82 102 93 108 77 34 89 37 48 77 124 8 204 102 60 15 1260
t 80 81 76 62 90 115 1 34 28 6 573
u 1 1 1 4 4 80 11 7 11 2 1 123
v 39 40 56 23 1 159
w 70 53 66 73 39 22 323
x 1 1 2
y 17 1 15 3 15 2 5 58
z 2 9 7 6 1 25
total 1379 72 212 132 1355 47 57 435 978 4 39 609 162 509 1420 225 28 893 110 274 850 67 117 117 95 4 10190

Note: Maximum value is 328 in the cell for words starting with 'co'. Sample word: 'colloidal'
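A first-letter by second-letter matrix can be built by counting the two-letter prefix of each lemma. This is a sketch under assumptions: counting is case-insensitive, only ASCII letters are considered, and lemmas shorter than two characters are skipped. `prefix_counts` is a hypothetical helper name.

```python
from collections import Counter

def prefix_counts(lemmas):
    """Count lemmas by their first two letters (lower-cased)."""
    counts = Counter()
    for lemma in lemmas:
        prefix = lemma.lower()[:2]
        if len(prefix) == 2 and prefix.isascii() and prefix.isalpha():
            counts[prefix] += 1
    return counts

# Hypothetical input, not the real data set.
counts = prefix_counts(["colloidal", "cool", "CD", "a", "bare"])
print(counts.most_common(1))  # the prefix of the fullest cell and its count
```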


Comparison with others

Introduction

Aspect count samples
anacyclic 103 tuber=rebut, timer=remit, tide=edit, .. /anacyclic
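A word is anacyclic here when reversing it gives a different word that is also in the list. A minimal sketch, assuming case-insensitive comparison and reporting each pair once; `anacyclic_pairs` is a hypothetical helper name, and whether the page counts pairs or individual words is not stated.

```python
def anacyclic_pairs(lemmas):
    """Pairs (word, reversed word) where both are in the list and they differ."""
    words = {lemma.lower() for lemma in lemmas}
    # `word < word[::-1]` skips palindromes and keeps one ordering per pair.
    return sorted(
        (word, word[::-1])
        for word in words
        if word < word[::-1] and word[::-1] in words
    )

# Hypothetical input, not the real data set.
print(anacyclic_pairs(["tuber", "rebut", "timer", "remit", "tide", "edit", "radar"]))
```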


Edit distance between words of the same length

↓length \ distance → 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 total
1 21 21
2 52 444 496
3 2432 24162 87887 114481
4 4994 49631 287906 775729 1118260
5 2800 24873 179374 892610 2054159 3153816
6 146 1118 6671 34084 122401 206671 371091
7 49 361 1642 7582 33624 104007 138125 285390
8 33 114 467 1931 7549 27783 69617 74009 181503
9 13 84 253 766 2469 7803 23701 49366 40795 125250
10 12 59 158 474 1377 3094 6176 15698 27129 19743 73920
11 4 23 54 171 505 1022 1443 3074 6965 10275 6354 29890
12 0 7 23 46 111 277 434 823 1594 2753 3336 1921 11325
13 0 3 11 27 29 101 145 226 378 692 894 873 362 3741
14 0 5 1 1 4 11 21 45 78 91 136 212 235 63 903
15 0 0 0 0 1 0 2 0 6 10 14 13 21 16 8 91
16 0 0 0 0 0 0 0 0 0 0 1 4 1 4 4 1 15
total 10556 100884 564447 1713421 2222229 350769 239664 143241 76945 33564 10735 3023 619 83 12 1
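The page does not say which edit distance was used. The sketch below uses Levenshtein distance as an assumption (for words of equal length, Hamming distance would be another candidate) and counts each unordered pair of distinct words once. That pair counting is an inference from the row totals: 21 = 7·6/2, 496 = 32·31/2 and 114481 = 479·478/2. The helper names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]

def distance_table(lemmas):
    """Map word length -> Counter of distances over pairs of distinct words."""
    by_length = defaultdict(set)
    for lemma in lemmas:
        by_length[len(lemma)].add(lemma)
    return {
        length: Counter(levenshtein(a, b) for a, b in combinations(sorted(words), 2))
        for length, words in sorted(by_length.items())
    }

# Hypothetical input, not the real data set.
print(distance_table(["rum", "ram", "rat", "bare", "rum"]))
```

Comparing all pairs is quadratic in the number of words per length, which is manageable for a few thousand words per row but would need a smarter approach for much larger lists.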

Anagrams

count words
4 lead, deal, dale, lade,
4 name, mean, amen, mane,
4 trace, react, cater, crate,
4 leap, pale, peal, plea,
4 time, item, emit, mite,
4 large, glare, lager, regal,
4 part, trap, prat, rapt,
4 live, evil, veil, vile,
4 meat, team, mate, tame,
4 rape, pare, pear, reap,
4 ester, reset, steer, terse,
4 merit, mitre, remit, timer,
3 late, tale, teal,
3 edit, diet, tide,
3 list, silt, slit,
3 spot, stop, post,
3 three, there, ether,
3 read, dear, dare,
3 set up, upset, setup,
3 parse, spare, spear,
3 earth, heart, hater,
3 top, pot, opt,
3 serve, verse, sever,
3 art, rat, tar,
3 stele, steel, sleet,
3 meet, mete, teem,
3 seat, east, sate,
3 arm, mar, ram,
3 begin, being, binge,
3 below, elbow, bowel,
3 throw, worth, wroth,
3 tire, rite, tier,
3 salt, last, slat,
3 shut, thus, tush,
3 rate, tear, tare,
3 lake, kale, leak,
3 plane, panel, penal,
3 plate, petal, pleat,
3 beat, abet, bate,
3 brake, break, baker,
3 care, race, acre,
3 early, relay, layer,
3 great, grate, retag,
3 steak, stake, skate,
3 meal, male, lame,
3 angle, angel, glean,
3 fowl, flow, wolf,
3 rail, lair, liar,
3 cruel, lucre, ulcer,
3 sure, user, ruse,
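Anagram groups are commonly found by using the sorted letters of each word as a key. The sketch below does that, ignoring case and non-letters so that "set up" groups with "setup" and "upset" as in the list above; `anagram_groups` is a hypothetical helper name, and the page does not state its own method.

```python
from collections import defaultdict

def anagram_groups(lemmas, min_size=2):
    """Group lemmas that use exactly the same letters, largest groups first."""
    groups = defaultdict(set)
    for lemma in lemmas:
        key = "".join(sorted(ch for ch in lemma.lower() if ch.isalpha()))
        groups[key].add(lemma)
    result = [sorted(group) for group in groups.values() if len(group) >= min_size]
    return sorted(result, key=lambda group: (-len(group), group))

# Hypothetical input, not the real data set.
for group in anagram_groups(["lead", "deal", "dale", "lade", "set up", "upset", "setup", "rum"]):
    print(len(group), ", ".join(group))
```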

Completeness

Lexical categories

  • adv → adj (467 of 570 found)

Multi-string lemmas

  • multi-string → part (14 of 71 with possibly missing parts)
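One way to read the multi-string check is: split each lemma that contains a space or dash into its parts and report the parts that have no lemma of their own. The sketch below implements that reading as an assumption; `missing_parts` is a hypothetical helper name and the page does not describe the actual procedure.

```python
import re

def missing_parts(lemmas):
    """Map each multi-string lemma to the parts that are not lemmas themselves."""
    known = {lemma.lower() for lemma in lemmas}
    report = {}
    for lemma in lemmas:
        parts = [part for part in re.split(r"[ -]", lemma.lower()) if part]
        if len(parts) < 2:
            continue  # single-word lemma, nothing to check
        missing = [part for part in parts if part not in known]
        if missing:
            report[lemma] = missing
    return report

# Hypothetical input, not the real data set.
print(missing_parts(["set up", "set", "basketball", "well-known", "well"]))
```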


Lists