Guidance Rule Statistics

Generated: 2026-05-24 21:12:39 UTC

This page reports how the translation-guidance rules behave as reusable editorial knowledge. Denominators are stated explicitly: scan coverage is computed over headwords with guidance scan evidence, and rule coverage over active (non-retired) rules with matched evidence.

Rules: 163
Scanned headwords: 3,570
Headwords with active rule matches: 605
Active rule incidences: 1,849

Introduction Date Basis

Rules imported from Gabe's spreadsheet can predate their database rows, so the row timestamp is not a reliable introduction date for them. For such a rule, the estimated date is that of the earliest Gabe-linked kappa human translation whose headword the rule matches. Rules without that evidence are kept separate from rules with actual timestamps.

Date basis  Rules
actual      11
estimated   62
unknown     90
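
The estimation rule described above can be sketched in a few lines. This is only an illustration, not the page's actual implementation: the function name `estimate_introduction` and its input (the timestamps of the Gabe-linked kappa human translations whose headwords the rule matches) are assumptions made for the example, and the "unknown" label for rules without evidence is inferred from the table.

```python
from datetime import datetime
from typing import Optional


def estimate_introduction(
    matched_translation_times: list[datetime],
) -> tuple[str, Optional[datetime]]:
    """Return (date_basis, estimated_date) for an imported rule.

    matched_translation_times is a hypothetical input: the timestamps of
    the dated human translations whose headwords the rule matches.
    """
    if not matched_translation_times:
        # No dated evidence, so no estimate is made for this rule.
        return ("unknown", None)
    # The earliest matching translation serves as the estimated date.
    return ("estimated", min(matched_translation_times))


basis, when = estimate_introduction(
    [datetime(2026, 3, 2), datetime(2026, 2, 19)]
)
print(basis, when.date())  # prints: estimated 2026-02-19
```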

Estimated Rules by Introduction Date

Date        New estimated rules  Cumulative estimated rules
2026-02-15  3                    3
2026-02-19  10                   13
2026-02-26  5                    18
2026-02-27  2                    20
2026-03-02  22                   42
2026-03-12  2                    44
2026-03-26  4                    48
2026-04-02  2                    50
2026-04-08  1                    51
2026-04-23  10                   61
2026-05-12  1                    62

Butterfly Collecting Estimate

Here each guidance rule is treated as a species and each scanned headword as a collecting site. Rules that fire in only one or two headwords (singletons and doubletons) drive the estimate of how many additional rules are likely still unseen.

Observed active firing rules: 95
Chao2 unseen rules: 10.56
Chao2 estimated total: 105.56

Chao2 currently estimates 10.56 unseen active firing rules beyond the 95 active rules already observed in guidance-match evidence.
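
As a worked check, the classic Chao2 formula, S_obs + Q1^2 / (2 * Q2), reproduces the figures above from the singleton and doubleton counts reported on this page (13 rules seen in one headword, 8 seen in two). The sketch below assumes the classic form without the small-sample factor (m - 1) / m; with 3,570 scanned headwords that factor does not change the result at two decimals. The function name is illustrative.

```python
def chao2_classic(observed: int, singletons: int, doubletons: int) -> float:
    """Classic Chao2 richness estimate: S_obs + Q1^2 / (2 * Q2)."""
    if doubletons <= 0:
        raise ValueError("the classic form needs at least one doubleton")
    return observed + singletons ** 2 / (2 * doubletons)


total = chao2_classic(observed=95, singletons=13, doubletons=8)
unseen = total - 95
print(f"{total:.2f} {unseen:.2f}")  # prints: 105.56 10.56
```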

Kappa Rule Accumulation

This uses the dated kappa human translations as the collecting sequence. Each step is one Gabe-linked kappa headword in timestamp order. The upper curve shows cumulative distinct active rules first observed in that sample; the lower panel shows new-rule bursts and the number of new rules seen in the trailing 10 sampled entries. The current sample observes 89 distinct active rules across 86 dated kappa entries.
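
The accumulation described above can be sketched as a single pass over the entries in timestamp order. The input shape (one set of rule identifiers per entry) and the function name are assumptions for illustration, as is the choice to count the current entry as part of the trailing window.

```python
def accumulation(entries: list[set[str]], window: int = 10) -> list[tuple[int, int, int]]:
    """Walk entries in timestamp order and return one row per entry:
    (cumulative distinct rules, rules first seen at this entry,
     new rules seen in the trailing `window` entries, current included).
    """
    seen: set[str] = set()
    new_counts: list[int] = []
    rows = []
    for rules in entries:
        new = rules - seen          # rules not observed in any earlier entry
        seen |= new
        new_counts.append(len(new))
        trailing = sum(new_counts[-window:])
        rows.append((len(seen), len(new), trailing))
    return rows


# Toy data, not taken from the kappa sample.
print(accumulation([{"a", "b"}, {"b"}, {"c"}], window=2))
# prints: [(2, 2, 2), (2, 0, 2), (3, 1, 1)]
```

The first element of each row is what the upper curve plots; the second and third correspond to the new-rule bursts and the trailing count in the lower panel.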

Measure                                                     Value
Observed active rules that fire at least once               95
Rule incidences in scanned headwords                        1,849
Rules seen in one headword                                  13
Rules seen in two headwords                                 8
Estimated total rules, Chao2                                105.56
Estimated unseen rules, Chao2                               10.56
Estimated total rules, first-order jackknife                108.00
Estimated total rules, second-order jackknife               113.00
Sample coverage                                             99.3%
Chance that the next rule incidence is a not-yet-seen rule  0.7%
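
The jackknife and coverage figures in the table can be reproduced from the other numbers on this page. The sketch below assumes the standard incidence-based jackknife formulas with m = 3,570 scanned headwords as sampling units, and a Good-Turing style coverage of 1 - Q1 / N, where N is the number of rule incidences. These formula choices are assumptions that happen to match the reported values; the page does not state which variants it uses.

```python
def jackknife1(s_obs: int, q1: int, m: int) -> float:
    """First-order jackknife for incidence data."""
    return s_obs + q1 * (m - 1) / m


def jackknife2(s_obs: int, q1: int, q2: int, m: int) -> float:
    """Second-order jackknife for incidence data."""
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


def sample_coverage(q1: int, incidences: int) -> float:
    """Share of incidences belonging to already-seen rules: 1 - Q1 / N."""
    return 1 - q1 / incidences


S_OBS, Q1, Q2, M, N = 95, 13, 8, 3570, 1849  # values reported on this page

print(f"{jackknife1(S_OBS, Q1, M):.2f}")      # prints: 108.00
print(f"{jackknife2(S_OBS, Q1, Q2, M):.2f}")  # prints: 113.00
print(f"{sample_coverage(Q1, N):.1%}")        # prints: 99.3%
```

The chance that the next incidence belongs to a not-yet-seen rule is the complement of the coverage, 13 / 1,849, or about 0.7%.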

Estimate by Rule Kind

Kind         Active rules  Observed firing rules  Singletons  Doubletons  Chao2 total  Chao2 unseen  Sample coverage
gloss        67            63                     10          7           70.14        7.14          98.9%
formula      31            30                     1           1           30.50        0.50          99.9%
proper_noun  60            2                      2           0           3.00         1.00          0.0%
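
The proper_noun row has no doubletons, so the classic Chao2 ratio Q1^2 / (2 * Q2) is undefined there. The reported totals are consistent with switching to the bias-corrected form, Q1 * (Q1 - 1) / (2 * (Q2 + 1)), only when Q2 is zero. The sketch below encodes that reading as an assumption; it is inferred from the table values, not from any stated method.

```python
def chao2(observed: int, singletons: int, doubletons: int) -> float:
    """Chao2 with a fallback for the zero-doubleton case (assumed behaviour)."""
    if doubletons > 0:
        # Classic form.
        unseen = singletons ** 2 / (2 * doubletons)
    else:
        # Bias-corrected form, which stays defined when Q2 = 0.
        unseen = singletons * (singletons - 1) / (2 * (doubletons + 1))
    return observed + unseen


# (observed firing rules, singletons, doubletons) from the table above.
by_kind = {
    "gloss": (63, 10, 7),
    "formula": (30, 1, 1),
    "proper_noun": (2, 2, 0),
}
for kind, counts in by_kind.items():
    print(f"{kind}: {chao2(*counts):.2f}")
# prints:
# gloss: 70.14
# formula: 30.50
# proper_noun: 3.00
```

With only two observed rules, both singletons, the proper_noun estimate rests on very little evidence, which is also what its 0.0% sample coverage signals.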

Zipf-Like Rank Frequency

Rules are ranked by the number of distinct headwords where they fire. A Zipf-like distribution would appear close to a straight line on the log-log plot, with slope near -1.

Fitted slope: -1.384
Alpha: 1.384
Log-log R²: 0.92
Slope p-value: 5.963e-52
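
A fit of this kind can be sketched as an ordinary least-squares regression of log frequency on log rank. The function below is an illustrative implementation using only the standard library; its name is an assumption, and it omits the slope p-value, which would need a regression t-test (a routine such as `scipy.stats.linregress` reports one).

```python
import math


def loglog_fit(counts: list[float]) -> tuple[float, float]:
    """Fit log10(count) against log10(rank) by least squares.

    Returns (slope, r_squared). Counts must be positive and there must
    be at least two distinct count values.
    """
    ordered = sorted(counts, reverse=True)
    xs = [math.log10(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log10(c) for c in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Matched-headword counts of the ten most frequent rules on this page.
top_ten = [323, 178, 102, 96, 92, 87, 64, 62, 60, 46]
slope, r2 = loglog_fit(top_ten)
print(f"slope={slope:.3f} alpha={-slope:.3f} R^2={r2:.2f}")
```

Fitting only the ten most frequent rules will not reproduce the values reported above, which are presumably computed over all firing rules. The example only shows the mechanics, including alpha as the negated slope.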

Top-N Headword Coverage

This table shows how much of the scanned corpus remains covered if only the most frequently firing rules are kept.

Top N rules  Covered headwords  Coverage of scanned headwords  Coverage of matched headwords  Coverage of rule incidences
1            323                9.0%                           53.4%                          17.5%
3            488                13.7%                          80.7%                          32.6%
5            539                15.1%                          89.1%                          42.8%
10           573                16.1%                          94.7%                          60.0%
20           589                16.5%                          97.4%                          75.8%
50           604                16.9%                          99.8%                          93.2%
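
The three coverage columns use different denominators, which a short sketch makes concrete. The input mapping (rule identifier to the set of headwords it matches) and the function name are assumptions for illustration. An incidence is counted here as one rule-headword pair, which agrees with the first row: 323 / 1,849 is 17.5%.

```python
def top_n_coverage(
    matches: dict[str, set[str]], scanned: int, n: int
) -> dict[str, float]:
    """Coverage achieved by the n rules that match the most headwords.

    Assumes `matches` is non-empty and `scanned` is positive.
    """
    ranked = sorted(matches, key=lambda rule: len(matches[rule]), reverse=True)
    top = ranked[:n]
    covered = set().union(*(matches[rule] for rule in top))
    matched_any = set().union(*matches.values())
    all_incidences = sum(len(heads) for heads in matches.values())
    top_incidences = sum(len(matches[rule]) for rule in top)
    return {
        "covered_headwords": len(covered),
        "of_scanned": len(covered) / scanned,
        "of_matched": len(covered) / len(matched_any),
        "of_incidences": top_incidences / all_incidences,
    }


# Toy data: three rules over ten scanned headwords.
toy = {"r1": {"a", "b", "c"}, "r2": {"c", "d"}, "r3": {"e"}}
print(top_n_coverage(toy, scanned=10, n=2))
```

Because a headword can be matched by several rules, covered headwords are a set union, not a sum, which is why headword coverage rises more slowly than incidence coverage as N grows.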

Top Firing Rules

Rank Kind Rule Preferred Matched headwords Occurrences Date basis Introduced
1 gloss χώρα ἡ region; territory (when belonging to a specific people, tribe, or nation); surrounding territory (when juxtaposed against a πόλις); land (remote or poetical places) 323 429 estimated 2026-02-26
2 gloss ἐθνικόν τό ethnonym 178 190 estimated 2026-02-27
3 formula τὸ ἐθνικὸν + X (nominative ETHNONYM) "The ethnonym is 'X'" 102 112 unknown 2026-04-24
4 formula ὡς + X (AUTHOR NAME) "as per X," 96 114 estimated 2026-02-26
5 formula X (SETTLEMENT) + Y (genitive REGION) "A X in Y" 92 98 unknown 2026-04-24
6 formula X (AUTHOR NAME) + Y (NUMERAL) "X, book Y" 87 89 estimated 2026-03-02
7 gloss ἔθνος τό people 64 65 estimated 2026-02-27
8 formula X (SETTLEMENT) + Y (genitive PEOPLE) "A X of the Y" 62 65 unknown 2026-04-24
9 formula X (nominative) + ὡς + Y (nominative HOMOMORPH) "'X' as in 'Y'" 60 72 unknown 2026-04-24
10 formula X (genitive ETYMON) + Y (nominative DERIVED NOUN) "Y is from 'X'" 46 50 unknown 2026-04-24
11 formula X (nominative DERIVED NOUN) + Y (nominative ETYMON) "'X' is from Y" 44 50 unknown 2026-04-24
12 formula ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) "(just) as 'Y' is from X" 38 46 unknown 2026-04-24
13 gloss πολίτης ὁ citizen 38 38 estimated 2026-02-19
14 formula X (nominative PROPER NOUN) ... + ἀπό + Y (genitive ETYMON) "X... after Y" 36 39 unknown 2026-04-24
15 formula X (AUTHOR NAME) + Y (dative NUMBER) "X, book Y" 26 26 unknown 2026-04-24
16 formula ὁ πολίτης + X (nominative GENTILIC) "A citizen is an 'X'" 25 25 unknown 2026-04-24
17 gloss καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) is/used to be/is/was called/named after (+ ἀπο) 22 30 unknown 2026-04-24
18 formula X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) "Χ, in his *Y*" 22 25 unknown 2026-04-24
19 formula X (nominative PROPER NOUN) + X (genitive PROPER NOUN) "X, X" 22 22 unknown 2026-04-24
20 gloss ὄρος ὁ mountain, mount (X ὄρος = 'Mount X') 19 23 estimated 2026-02-19
21 gloss χωρίον τό locality; point (only in μέσα χωρία: ‘halfway point’) 17 25 estimated 2026-04-08
22 formula ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) "as 'Y' is from the name X" 16 17 unknown 2026-04-24
23 formula X... πλησίον + Y. (genitive) "[X...] near + [Y]" 16 16 unknown 2026-04-24
24 gloss γένος τό descent 16 16 estimated 2026-03-12
25 formula Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) "X, in book Y of his *Z*" 15 20 unknown 2026-04-24
26 gloss οἰκήτωρ ὁ inhabitant, resident, patron (of a brothel...? - κ123) 14 14 estimated 2026-03-12
27 gloss γῆ ἡ land, dirt, Earth 13 19 estimated 2026-03-26
28 formula X (nominative PROPER NOUN)... + ἀπό + Y (genitive ARTICLE + genitive ETYMON) "'X... from the form Y" 13 15 unknown 2026-04-24
29 gloss κτητικόν τό possessive 13 13 estimated 2026-02-19
30 formula [NO ANTECEDENT] + ἀφ' οὗ + Y (DERIVED NOUN) "as per 'Y'" 11 13 estimated 2026-02-19
31 formula X... + πρός + Y (dative) "[X...] near + [Y]" 11 12 unknown 2026-04-24
32 gloss δῆμος ὁ deme 11 11 estimated 2026-03-02
33 gloss τοπικά τά locative term 11 11 estimated 2026-03-02
34 gloss φυλή ἡ tribe 11 11 estimated 2026-03-02
35 formula X (MASCULINE ETYMON) + ἀφ' οὗ + Y (DERIVED NOUN) "X... from/after whom is Y" 10 11 estimated 2026-02-19
36 gloss κώμη ἡ village 10 11 estimated 2026-05-12
37 formula Y (EPITHET) + X (DEITY) "X Y" 10 10 estimated 2026-02-19
38 gloss ἄκρα ἡ promontory 9 11 estimated 2026-03-02
39 gloss ὁμωνύμως/ὁμώνυμος -ον homonymous, having the same name 9 11 estimated 2026-03-02
40 gloss δημότης ὁ deme-member 9 9 estimated 2026-03-02
41 gloss πολίχνιον τό town 9 9 estimated 2026-02-19
42 gloss πόλισμα τό * town 8 11 estimated 2026-02-26
43 formula X (NEUTER ETYMON) + ἀφ' οὗ + Y (DERIVED NOUN) "X... from which is Y", "X... as per 'Y'" 8 9 estimated 2026-02-19
44 gloss λιμή ὁ lake, harbor 8 9 estimated 2026-04-23
45 gloss τόνος ὁ stress (oxytone if + ὀξύς; barytone if + βαρύς) 8 8 estimated 2026-03-02
46 gloss μοῖρα ἡ region or part (in geographic contexts); district (in urban contexts only) 7 11 estimated 2026-04-02
47 formula X (nominative DERIVED NOUN) + παρά + τό + Y (NOUN or INFINITIVE which is an ETYMON of the DERIVED NOUN) "X is deriving/derives from the form 'Y'" 7 9 actual 2026-05-13
48 formula καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN) "Y is also 'X'" 7 9 unknown 2026-04-24
49 gloss ἀκτή ἡ headland 7 8 estimated 2026-03-02
50 formula διὰ τοῦ + «X» (GREEK LETTER) + [FORM OF γράφειν] "Writes/write/wrote/is written with X" 7 7 estimated 2026-02-15