Translation Quality Predictor
This page tests whether current source vocabulary, translation-guidance recogniser matches, and mean v3 translation length can predict which approved-human passages are translated badly by ordinary legacy_scholarly v3. It excludes reasoning profiles and the special-temperature v3 experiment lanes.
Best current model: Greek vocabulary + translation length predicting 2-gram F1 badness, CV R^2 0.394, Spearman r 0.640.
Sentence-level alignment and metric rows exist; passage-level modeling remains the current daily output.
Metric Engines
| Metric | Status |
| BLEU-4 | SacreBLEU sentence BLEU-4 |
| METEOR | NLTK METEOR with WordNet synonyms |
| ROUGE-L | rouge-score ROUGE-L with stemming |
| chrF++ | SacreBLEU chrF++ with word_order=2 |
Approved human passages: 101
Scored v3 passages: 101
Completed v3 runs: 118
Mean runs per passage: 1.17
Predictability By Metric
Targets are badness measures: for score metrics, a larger value means a lower translation score; for length, a larger value means a larger absolute percentage error in word count. Cross-validation uses fixed five-fold splits where possible. The sample is small (101 passages), so a negative CV R^2 should be read as evidence that the feature family is not currently useful for that target.
| Feature family | Target | Status | Passages | Features | CV R^2 | Spearman r | CV MAE | MAE lift | Worst-quartile precision | Ridge alpha |
| Greek vocabulary + translation length | 2-gram F1 badness | ok | 101 | 207 | 0.394 | 0.640 | 0.1123 | 0.0331 | 61.5% | 0.3257 |
| Greek vocabulary + translation length | chrF++ badness | ok | 101 | 207 | 0.372 | 0.679 | 0.0696 | 0.0227 | 61.5% | 0.4924 |
| Greek vocabulary | 2-gram F1 badness | ok | 101 | 206 | 0.370 | 0.596 | 0.1149 | 0.0304 | 50.0% | 0.3257 |
| Greek vocabulary + translation length | 3-gram F1 badness | ok | 101 | 207 | 0.369 | 0.598 | 0.1479 | 0.0374 | 61.5% | 0.3257 |
| Greek vocabulary + translation length | ROUGE-L badness | ok | 101 | 207 | 0.360 | 0.663 | 0.0714 | 0.0207 | 61.5% | 0.3257 |
| Greek vocabulary + translation length | 3-gram Jaccard badness | ok | 101 | 207 | 0.356 | 0.610 | 0.1576 | 0.0359 | 61.5% | 0.4924 |
| Greek vocabulary + translation length | BLEU-4 badness | ok | 101 | 207 | 0.355 | 0.647 | 0.1269 | 0.0390 | 65.4% | 0.4924 |
| Greek vocabulary + translation length | Sentence BLEU badness | ok | 101 | 207 | 0.355 | 0.647 | 0.1269 | 0.0390 | 65.4% | 0.4924 |
| Greek vocabulary + translation length | METEOR badness | ok | 101 | 207 | 0.347 | 0.650 | 0.0782 | 0.0199 | 61.5% | 0.4924 |
| Greek vocabulary | 3-gram F1 badness | ok | 101 | 206 | 0.342 | 0.560 | 0.1506 | 0.0347 | 57.7% | 0.4924 |
| Greek vocabulary | chrF++ badness | ok | 101 | 206 | 0.338 | 0.617 | 0.0723 | 0.0199 | 57.7% | 0.4924 |
| Greek vocabulary | 3-gram Jaccard badness | ok | 101 | 206 | 0.334 | 0.552 | 0.1592 | 0.0343 | 50.0% | 0.4924 |
| Greek vocabulary | METEOR badness | ok | 101 | 206 | 0.330 | 0.594 | 0.0781 | 0.0201 | 50.0% | 0.4924 |
| Greek vocabulary | ROUGE-L badness | ok | 101 | 206 | 0.330 | 0.605 | 0.0733 | 0.0188 | 50.0% | 0.3257 |
| Greek vocabulary | BLEU-4 badness | ok | 101 | 206 | 0.326 | 0.596 | 0.1287 | 0.0373 | 65.4% | 0.4924 |
| Greek vocabulary | Sentence BLEU badness | ok | 101 | 206 | 0.326 | 0.596 | 0.1287 | 0.0373 | 65.4% | 0.4924 |
| Translation length | chrF++ badness | ok | 101 | 1 | 0.164 | 0.442 | 0.0835 | 0.0087 | 44.4% | 13.4340 |
| Translation length | ROUGE-L badness | ok | 101 | 1 | 0.129 | 0.378 | 0.0847 | 0.0075 | 46.2% | 20.3092 |
| Translation length | METEOR badness | ok | 101 | 1 | 0.116 | 0.334 | 0.0914 | 0.0068 | 42.3% | 20.3092 |
| Translation length | BLEU-4 badness | ok | 101 | 1 | 0.102 | 0.358 | 0.1535 | 0.0124 | 33.3% | 13.4340 |
| Translation length | Sentence BLEU badness | ok | 101 | 1 | 0.102 | 0.358 | 0.1535 | 0.0124 | 33.3% | 13.4340 |
| Translation length | 3-gram Jaccard badness | ok | 101 | 1 | 0.075 | 0.249 | 0.1811 | 0.0125 | 37.0% | 8.8862 |
| Translation length | 2-gram F1 badness | ok | 101 | 1 | 0.072 | 0.281 | 0.1364 | 0.0090 | 38.5% | 13.4340 |
| Greek vocabulary | Absolute length percent error | ok | 101 | 206 | 0.068 | 0.129 | 0.0408 | 0.0008 | 30.8% | 1.7013 |
| Greek vocabulary + translation length | Absolute length percent error | ok | 101 | 207 | 0.062 | 0.107 | 0.0409 | 0.0007 | 30.8% | 1.7013 |
| Vocabulary + recognisers + translation length | METEOR badness | ok | 101 | 288 | 0.060 | 0.327 | 0.0950 | 0.0032 | 46.2% | 4375.4794 |
| Recogniser rules + translation length | METEOR badness | ok | 101 | 82 | 0.059 | 0.325 | 0.0950 | 0.0032 | 46.2% | 4375.4794 |
| Vocabulary + recognisers + translation length | ROUGE-L badness | ok | 101 | 288 | 0.059 | 0.523 | 0.0831 | 0.0090 | 61.5% | 1.7013 |
| Vocabulary + recognisers | METEOR badness | ok | 101 | 287 | 0.058 | 0.323 | 0.0951 | 0.0031 | 42.3% | 4375.4794 |
| Recogniser rules | METEOR badness | ok | 101 | 81 | 0.058 | 0.322 | 0.0951 | 0.0031 | 42.3% | 4375.4794 |
| Vocabulary + recognisers + translation length | chrF++ badness | ok | 101 | 288 | 0.058 | 0.339 | 0.0892 | 0.0030 | 50.0% | 2894.2661 |
| Recogniser rules + translation length | chrF++ badness | ok | 101 | 82 | 0.057 | 0.339 | 0.0892 | 0.0030 | 50.0% | 2894.2661 |
| Translation length | 3-gram F1 badness | ok | 101 | 1 | 0.055 | 0.240 | 0.1749 | 0.0104 | 33.3% | 13.4340 |
| Vocabulary + recognisers | chrF++ badness | ok | 101 | 287 | 0.054 | 0.327 | 0.0894 | 0.0028 | 50.0% | 2894.2661 |
| Recogniser rules | chrF++ badness | ok | 101 | 81 | 0.053 | 0.323 | 0.0895 | 0.0028 | 46.2% | 2894.2661 |
| Recogniser rules + translation length | ROUGE-L badness | ok | 101 | 82 | 0.052 | 0.323 | 0.0889 | 0.0032 | 42.3% | 4375.4794 |
| Vocabulary + recognisers | ROUGE-L badness | ok | 101 | 287 | 0.050 | 0.307 | 0.0890 | 0.0031 | 42.3% | 4375.4794 |
| Recogniser rules | ROUGE-L badness | ok | 101 | 81 | 0.050 | 0.305 | 0.0891 | 0.0031 | 42.3% | 4375.4794 |
| Vocabulary + recognisers + translation length | 3-gram Jaccard badness | ok | 101 | 288 | 0.040 | 0.436 | 0.1828 | 0.0108 | 50.0% | 5.8780 |
| Vocabulary + recognisers + translation length | BLEU-4 badness | ok | 101 | 288 | 0.038 | 0.218 | 0.1625 | 0.0035 | 42.3% | 4375.4794 |
| Vocabulary + recognisers + translation length | Sentence BLEU badness | ok | 101 | 288 | 0.038 | 0.218 | 0.1625 | 0.0035 | 42.3% | 4375.4794 |
| Recogniser rules + translation length | BLEU-4 badness | ok | 101 | 82 | 0.038 | 0.216 | 0.1625 | 0.0035 | 42.3% | 4375.4794 |
| Recogniser rules + translation length | Sentence BLEU badness | ok | 101 | 82 | 0.038 | 0.216 | 0.1625 | 0.0035 | 42.3% | 4375.4794 |
| Vocabulary + recognisers | BLEU-4 badness | ok | 101 | 287 | 0.036 | 0.212 | 0.1627 | 0.0033 | 42.3% | 4375.4794 |
| Vocabulary + recognisers | Sentence BLEU badness | ok | 101 | 287 | 0.036 | 0.212 | 0.1627 | 0.0033 | 42.3% | 4375.4794 |
| Recogniser rules | BLEU-4 badness | ok | 101 | 81 | 0.036 | 0.210 | 0.1627 | 0.0032 | 42.3% | 4375.4794 |
| Recogniser rules | Sentence BLEU badness | ok | 101 | 81 | 0.036 | 0.210 | 0.1627 | 0.0032 | 42.3% | 4375.4794 |
| Vocabulary + recognisers + translation length | 2-gram F1 badness | ok | 101 | 288 | 0.033 | 0.242 | 0.1422 | 0.0032 | 42.3% | 4375.4794 |
| Recogniser rules + translation length | 2-gram F1 badness | ok | 101 | 82 | 0.033 | 0.240 | 0.1422 | 0.0032 | 42.3% | 4375.4794 |
| Vocabulary + recognisers | 2-gram F1 badness | ok | 101 | 287 | 0.031 | 0.231 | 0.1424 | 0.0030 | 42.3% | 4375.4794 |
| Recogniser rules | 2-gram F1 badness | ok | 101 | 81 | 0.031 | 0.228 | 0.1424 | 0.0030 | 42.3% | 4375.4794 |
| Vocabulary + recognisers | Absolute length percent error | ok | 101 | 287 | 0.028 | 0.167 | 0.0420 | -0.0004 | 46.2% | 837.6776 |
| Recogniser rules | Absolute length percent error | ok | 101 | 81 | 0.028 | 0.165 | 0.0420 | -0.0004 | 46.2% | 837.6776 |
| Vocabulary + recognisers + translation length | Absolute length percent error | ok | 101 | 288 | 0.028 | 0.164 | 0.0420 | -0.0004 | 46.2% | 837.6776 |
| Recogniser rules + translation length | Absolute length percent error | ok | 101 | 82 | 0.027 | 0.167 | 0.0420 | -0.0004 | 46.2% | 837.6776 |
| Vocabulary + recognisers + translation length | 3-gram F1 badness | ok | 101 | 288 | 0.026 | 0.230 | 0.1813 | 0.0040 | 34.6% | 2894.2661 |
| Recogniser rules + translation length | 3-gram F1 badness | ok | 101 | 82 | 0.026 | 0.229 | 0.1813 | 0.0040 | 34.6% | 2894.2661 |
| Vocabulary + recognisers | 3-gram F1 badness | ok | 101 | 287 | 0.024 | 0.221 | 0.1815 | 0.0038 | 34.6% | 2894.2661 |
| Recogniser rules | 3-gram F1 badness | ok | 101 | 81 | 0.024 | 0.193 | 0.1819 | 0.0034 | 34.6% | 4375.4794 |
| Recogniser rules + translation length | 3-gram Jaccard badness | ok | 101 | 82 | 0.023 | 0.227 | 0.1903 | 0.0033 | 38.5% | 2894.2661 |
| Vocabulary + recognisers | 3-gram Jaccard badness | ok | 101 | 287 | 0.021 | 0.224 | 0.1905 | 0.0030 | 38.5% | 2894.2661 |
| Recogniser rules | 3-gram Jaccard badness | ok | 101 | 81 | 0.021 | 0.222 | 0.1905 | 0.0030 | 38.5% | 2894.2661 |
| Translation length | Absolute length percent error | ok | 101 | 1 | -0.002 | -0.077 | 0.0416 | -0.0001 | 19.2% | 10000.0000 |
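The evaluation loop behind these columns can be sketched for the simplest family, the single translation-length feature. This is a minimal sketch under stated assumptions, not the page's actual pipeline: the function names, the seeded shuffle used to fix the folds, the pooled out-of-fold R^2, and the synthetic data are all hypothetical. Two definitions are inferred rather than stated on this page. MAE lift is taken to be a predict-the-mean baseline MAE minus the model MAE, which fits the table in that CV MAE plus MAE lift comes to roughly the same value for every row sharing a target (about 0.1454 for 2-gram F1 badness). Worst-quartile precision is taken to be the share of the top predicted quartile that also falls in the top observed quartile, which fits the reported percentages all being multiples of 1/26 or 1/27.

```python
import random
from statistics import mean


def fold_indices(n, k=5, seed=0):
    """Fixed k-fold split: shuffle once with a fixed seed, then deal indices round-robin."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, alpha):
    """Single-feature ridge with an unpenalised intercept (closed form)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope towards zero
    return slope, my - slope * mx


def cv_predictions(xs, ys, alpha, k=5, seed=0):
    """Out-of-fold predictions: each passage is scored by a model that never saw it."""
    preds = [0.0] * len(xs)
    for fold in fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_ridge_1d(train_x, train_y, alpha)
        for i in fold:
            preds[i] = intercept + slope * xs[i]
    return preds


def r_squared(ys, preds):
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def mae(ys, preds):
    return mean(abs(y - p) for y, p in zip(ys, preds))


def mae_lift(ys, preds):
    """Assumed definition: MAE of a predict-the-mean baseline minus the model MAE."""
    return mae(ys, [mean(ys)] * len(ys)) - mae(ys, preds)


def worst_quartile_precision(ys, preds):
    """Assumed definition: share of the top predicted quartile also in the top observed quartile."""
    q = -(-len(ys) // 4)  # ceil(n / 4)
    top_pred = set(sorted(range(len(ys)), key=lambda i: preds[i], reverse=True)[:q])
    top_obs = set(sorted(range(len(ys)), key=lambda i: ys[i], reverse=True)[:q])
    return len(top_pred & top_obs) / q


# Synthetic toy data (not taken from this page): badness rises with length plus a small wiggle.
xs = [float(x) for x in range(20)]
ys = [0.1 + 0.02 * x + 0.01 * ((7 * int(x)) % 5 - 2) for x in xs]

useful = cv_predictions(xs, ys, alpha=1.0)
shrunk = cv_predictions(xs, ys, alpha=1e9)  # heavy shrinkage: effectively predicts the fold mean
for name, preds in (("alpha=1", useful), ("alpha=1e9", shrunk)):
    print(name, round(r_squared(ys, preds), 3), round(mae_lift(ys, preds), 4),
          round(worst_quartile_precision(ys, preds), 3))
```

With a very large alpha the slope collapses to zero and each fold is predicted by its training mean, so the out-of-fold R^2 cannot rise above roughly zero. That matches the pattern of the last table row, where translation length predicting absolute length percent error has Ridge alpha 10000.0000 and CV R^2 -0.002.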
Highest Predicted Risk
This list uses the best cross-validated model in this run (Greek vocabulary + translation length) and sorts passages by predicted 2-gram F1 badness, highest first.
| Lemma | ID | v3 runs | Source words | Observed badness | Predicted badness | BLEU-4 | chrF++ | 3-gram F1 | Length error |
| Καρία | 2484 | 1 | 181.0 | 0.4484 | 0.6604 | 42.0% | 65.8% | 42.2% | 12.6% |
| Κασώριον | 2623 | 1 | 14.0 | 0.5556 | 0.6126 | 32.1% | 66.0% | 35.3% | 10.0% |
| Κάρυστος | 2603 | 1 | 132.0 | 0.4327 | 0.5945 | 46.5% | 69.5% | 41.5% | 0.6% |
| Καλάσιρις | 2085 | 1 | 10.0 | 0.6923 | 0.5179 | 32.3% | 69.5% | 8.3% | 15.4% |
| Κάλυτις | 2335 | 2 | 16.0 | 0.6765 | 0.5155 | 31.9% | 60.3% | 14.7% | 5.8% |
| Κριώα | 3530 | 1 | 16.0 | 0.4894 | 0.5146 | 38.0% | 71.1% | 31.1% | 11.5% |
| Καδμεία | 2059 | 1 | 17.0 | 0.5000 | 0.4971 | 41.9% | 68.2% | 36.8% | 10.0% |
| Καρχηδών | 2604 | 1 | 88.0 | 0.4897 | 0.4941 | 35.0% | 63.9% | 35.7% | 9.4% |
| Καππαδοκία | 2470 | 2 | 57.0 | 0.4280 | 0.4923 | 20.0% | 56.8% | 41.7% | 3.7% |
| Κοτιάειον | 3496 | 1 | 46.0 | 0.5373 | 0.4854 | 41.0% | 62.3% | 34.8% | 8.5% |
| Καταονία | 2628 | 1 | 17.0 | 0.5319 | 0.4780 | 36.2% | 67.4% | 31.1% | 4.2% |
| Καλαβρία | 2080 | 1 | 12.0 | 0.6250 | 0.4718 | 23.6% | 66.7% | 13.3% | 11.1% |
| Κύρνος | 7247 | 1 | 34.0 | 0.3895 | 0.4620 | 47.1% | 69.3% | 51.6% | 6.0% |
| Κύτα | 7254 | 1 | 58.0 | 0.4000 | 0.4495 | 51.9% | 73.6% | 47.6% | 4.8% |
| Καπετώλιον | 2468 | 1 | 86.0 | 0.5827 | 0.4493 | 30.1% | 50.3% | 31.0% | 14.5% |
| Κάναστρον | 2455 | 1 | 43.0 | 0.5826 | 0.4492 | 32.0% | 62.9% | 26.5% | 5.0% |
| Καρπασία | 2597 | 1 | 80.0 | 0.3767 | 0.4414 | 54.3% | 75.2% | 51.6% | 6.2% |
| Κώμη | 7266 | 1 | 53.0 | 0.7432 | 0.4163 | 13.8% | 47.6% | 13.7% | 10.1% |
| Κατάνη | 2626 | 1 | 64.0 | 0.4945 | 0.4159 | 43.8% | 64.8% | 36.7% | 6.3% |
| Κυτέριον | 7255 | 1 | 18.0 | 0.4167 | 0.4147 | 53.0% | 76.1% | 43.5% | 27.3% |
| Κάλπη | 2329 | 1 | 36.0 | 0.2000 | 0.4054 | 81.0% | 89.1% | 72.2% | 3.5% |
| Καβασσός | 2055 | 1 | 69.0 | 0.5048 | 0.4053 | 41.6% | 67.5% | 32.7% | 5.5% |
| Καικῖνον | 2074 | 1 | 6.0 | 0.6842 | 0.4048 | 21.4% | 66.4% | 0.0% | 9.1% |
| Κωνώπη | 7267 | 1 | 47.0 | 0.3667 | 0.4033 | 57.6% | 75.7% | 50.8% | 9.4% |
| Κάλλατις | 2119 | 1 | 45.0 | 0.4887 | 0.4018 | 31.8% | 63.9% | 30.5% | 7.7% |
| Κάσος | 2607 | 1 | 44.0 | 0.4464 | 0.3991 | 33.7% | 61.5% | 43.6% | 0.0% |
| Καβελλιών | 2057 | 1 | 23.0 | 0.2836 | 0.3934 | 45.8% | 75.0% | 55.4% | 2.9% |
| Κάληρος | 2116 | 1 | 22.0 | 0.5890 | 0.3854 | 29.3% | 61.2% | 28.2% | 12.5% |
| Κωλιάς | 7264 | 1 | 42.0 | 0.3445 | 0.3853 | 56.9% | 76.7% | 53.0% | 1.6% |
| Κάσιον | 2605 | 1 | 43.0 | 0.3158 | 0.3812 | 53.0% | 75.6% | 55.4% | 7.1% |
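The ordering of this list, and the gap between observed and predicted badness, can be reproduced with a few lines. The sketch below is illustrative only: the `Passage` class and its `residual` property are hypothetical names, and the four rows are copied from the table above rather than loaded from the page's data files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    lemma: str
    passage_id: int
    observed: float   # observed 2-gram F1 badness
    predicted: float  # cross-validated predicted badness

    @property
    def residual(self) -> float:
        """Positive when the passage turned out worse than the model predicted."""
        return self.observed - self.predicted


# Four rows copied from the risk table (a subset, for illustration).
rows = [
    Passage("Κώμη", 7266, 0.7432, 0.4163),
    Passage("Καρία", 2484, 0.4484, 0.6604),
    Passage("Κάλπη", 2329, 0.2000, 0.4054),
    Passage("Καλάσιρις", 2085, 0.6923, 0.5179),
]

# Risk list order: highest predicted badness first.
ranked = sorted(rows, key=lambda p: p.predicted, reverse=True)

# The passage the model under-predicted most in this subset.
most_underpredicted = max(rows, key=lambda p: p.residual)

for p in ranked:
    print(f"{p.lemma:<12} id={p.passage_id} predicted={p.predicted:.4f} "
          f"observed={p.observed:.4f} residual={p.residual:+.4f}")
```

Sorting by prediction and then reading the residual column is a quick way to separate passages the model flags correctly from those, such as Κώμη, whose observed badness is well above the prediction.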
Predictive Features
Translation Length: chrF++ badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| translation_length | Mean v3 translation word count | z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile | 0.04422 | 101 | 0.3137 | 0.1565 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
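The Detail column for translation length describes two transformations: z-scoring the word count and comparing mean badness between the top and bottom length quartiles. A minimal sketch of both follows. The mean (43.1) and SD (36.7) are the values quoted in the table; the function names, the `n // 4` quartile size, and the toy passages are assumptions, and the page does not say whether the SD is a population or sample estimate.

```python
from statistics import mean

# Values quoted in the Detail column of the table.
LENGTH_MEAN = 43.1
LENGTH_SD = 36.7


def z_score(word_count, mu=LENGTH_MEAN, sd=LENGTH_SD):
    """Standardise a translation word count using the quoted mean and SD."""
    return (word_count - mu) / sd


def quartile_means(lengths, badness):
    """Mean badness in the longest and shortest quarter of passages.

    Assumed reading of 'present/high = top quartile, absent/low = bottom quartile'.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    q = max(1, len(order) // 4)
    low = mean(badness[i] for i in order[:q])
    high = mean(badness[i] for i in order[-q:])
    return high, low


# Hypothetical toy passages (not from this page).
lengths = [8, 12, 20, 31, 43, 60, 95, 150]
badness = [0.12, 0.15, 0.18, 0.22, 0.20, 0.27, 0.30, 0.34]

high, low = quartile_means(lengths, badness)
print(round(z_score(150), 2), round(high, 3), round(low, 3))
```

Because the feature is z-scored, the coefficient reads per standard deviation: under this linear model, the coefficient of 0.04422 means a translation about 36.7 words longer than another is predicted to have chrF++ badness roughly 0.044 higher, all else equal.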
Vocabulary Terms: 2-gram F1 badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| vocabulary | χωριον | | 0.25588 | 2 | 0.5835 | 0.3334 |
| vocabulary | οικητωρ | | 0.20685 | 6 | 0.5544 | 0.3247 |
| vocabulary | επι | | 0.17838 | 2 | 0.6113 | 0.3328 |
| vocabulary | τον | | 0.17386 | 10 | 0.4599 | 0.3250 |
| vocabulary | τοις | | 0.17006 | 3 | 0.4885 | 0.3338 |
| vocabulary | ωστε | | 0.16307 | 3 | 0.5518 | 0.3318 |
| vocabulary | και φασι | | 0.15998 | 2 | 0.5012 | 0.3351 |
| vocabulary | οικητωρ και | | 0.15359 | 3 | 0.6190 | 0.3298 |
| vocabulary | καλειται | | 0.15359 | 2 | 0.4723 | 0.3356 |
| vocabulary | και | | 0.14659 | 66 | 0.3828 | 0.2546 |
| vocabulary | ει | | 0.14567 | 4 | 0.5657 | 0.3290 |
| vocabulary | εκαλειτο | | 0.13830 | 7 | 0.4612 | 0.3292 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| vocabulary | τεταρτω | | -0.28307 | 3 | 0.0929 | 0.3459 |
| vocabulary | εθνικον | | -0.21955 | 57 | 0.3037 | 0.3833 |
| vocabulary | εβδομη | | -0.20173 | 2 | 0.2444 | 0.3403 |
| vocabulary | ως εθνικον | | -0.17735 | 5 | 0.1882 | 0.3462 |
| vocabulary | μεταξυ και | | -0.15902 | 5 | 0.2941 | 0.3407 |
| vocabulary | μεταξυ | | -0.15902 | 5 | 0.2941 | 0.3407 |
| vocabulary | πορρω | | -0.15293 | 3 | 0.0962 | 0.3458 |
| vocabulary | ου πορρω | | -0.15293 | 3 | 0.0962 | 0.3458 |
| vocabulary | πολιτης | | -0.13977 | 18 | 0.3404 | 0.3379 |
| vocabulary | παιδος | | -0.13969 | 4 | 0.2105 | 0.3436 |
| vocabulary | απο παιδος | | -0.13969 | 4 | 0.2105 | 0.3436 |
| vocabulary | εν | | -0.13932 | 37 | 0.3444 | 0.3349 |
Vocabulary Terms + Translation Length: 2-gram F1 badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| vocabulary | χωριον | | 0.26148 | 2 | 0.5835 | 0.3334 |
| vocabulary | οικητωρ | | 0.20978 | 6 | 0.5544 | 0.3247 |
| vocabulary | επι | | 0.18030 | 2 | 0.6113 | 0.3328 |
| vocabulary | και φασι | | 0.17593 | 2 | 0.5012 | 0.3351 |
| vocabulary | οικητωρ και | | 0.16831 | 3 | 0.6190 | 0.3298 |
| vocabulary | τοις | | 0.15941 | 3 | 0.4885 | 0.3338 |
| vocabulary | τον | | 0.15662 | 10 | 0.4599 | 0.3250 |
| vocabulary | μοιρα | | 0.15277 | 2 | 0.6121 | 0.3328 |
| vocabulary | καλειται | | 0.15045 | 2 | 0.4723 | 0.3356 |
| vocabulary | ωστε | | 0.14658 | 3 | 0.5518 | 0.3318 |
| vocabulary | και | | 0.14300 | 66 | 0.3828 | 0.2546 |
| vocabulary | δευτερω | | 0.13128 | 3 | 0.5157 | 0.3329 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| vocabulary | τεταρτω | | -0.26840 | 3 | 0.0929 | 0.3459 |
| vocabulary | εθνικον | | -0.20535 | 57 | 0.3037 | 0.3833 |
| vocabulary | εβδομη | | -0.19872 | 2 | 0.2444 | 0.3403 |
| vocabulary | ως εθνικον | | -0.17192 | 5 | 0.1882 | 0.3462 |
| vocabulary | εν | | -0.16332 | 37 | 0.3444 | 0.3349 |
| vocabulary | μεταξυ και | | -0.15855 | 5 | 0.2941 | 0.3407 |
| vocabulary | μεταξυ | | -0.15855 | 5 | 0.2941 | 0.3407 |
| vocabulary | ως εν | | -0.14623 | 7 | 0.3104 | 0.3404 |
| vocabulary | πορρω | | -0.14466 | 3 | 0.0962 | 0.3458 |
| vocabulary | ου πορρω | | -0.14466 | 3 | 0.0962 | 0.3458 |
| vocabulary | παιδος | | -0.13855 | 4 | 0.2105 | 0.3436 |
| vocabulary | απο παιδος | | -0.13855 | 4 | 0.2105 | 0.3436 |
Recogniser Rules: METEOR badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | gloss occurrence count | | 0.00194 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | gloss rule count | | 0.00153 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | matched occurrence count | | 0.00105 | 101 | 0.1957 | N/A |
| recogniser_summary | matched rule count | | 0.00038 | 101 | 0.1957 | N/A |
| recogniser_rule | formula: X (SETTLEMENT) + Y (genitive REGION) | Translate as "a X in Y" | 0.00028 | 63 | 0.2110 | 0.1704 |
| recogniser_rule | formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) | Translate as "'X' is from Y" | 0.00026 | 36 | 0.2116 | 0.1869 |
| recogniser_rule | formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) | Translate as "X, X" | 0.00020 | 20 | 0.2275 | 0.1879 |
| recogniser_rule | gloss: οἰκήτωρ ὁ | inhabitant, resident, patron (of a brothel...? - κ123) | 0.00018 | 6 | 0.3660 | 0.1849 |
| recogniser_rule | gloss: ἄκρον τό | cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) | 0.00018 | 8 | 0.2890 | 0.1877 |
| recogniser_rule | gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) | is/used to be/is/was called/named after (+ ἀπο) | 0.00018 | 14 | 0.2488 | 0.1872 |
| recogniser_rule | gloss: πόλισμα τό * | town | 0.00016 | 5 | 0.2852 | 0.1910 |
| recogniser_rule | formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN) | Translate as "Y is also 'X'" | 0.00012 | 13 | 0.2129 | 0.1932 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | formula rule count | | -0.00120 | 100 | 0.1949 | 0.2718 |
| recogniser_summary | formula occurrence count | | -0.00094 | 100 | 0.1949 | 0.2718 |
| recogniser_rule | formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) | Translate as "the ethnonym is 'X'" | -0.00029 | 62 | 0.1798 | 0.2210 |
| recogniser_rule | formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) | Translate as "'X' as in 'Y'" | -0.00028 | 43 | 0.1737 | 0.2120 |
| recogniser_rule | formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) | Translate as "(just) as 'Y' is from X" | -0.00026 | 35 | 0.1677 | 0.2105 |
| recogniser_rule | formula: X (AUTHOR NAME) + Y (NUMERAL) | Translate as "X, book Y" | -0.00019 | 31 | 0.1637 | 0.2099 |
| recogniser_rule | gloss: ἔθνος τό | people | -0.00019 | 16 | 0.1559 | 0.2032 |
| recogniser_rule | formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) | Translate as "Χ, in his *Y*" | -0.00017 | 17 | 0.1616 | 0.2026 |
| recogniser_rule | formula: ὡς + X (AUTHOR NAME) | Translate as "as per X" | -0.00016 | 41 | 0.1781 | 0.2077 |
| recogniser_rule | formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) | Translate as "X, in book Y of his *Z*" | -0.00012 | 11 | 0.1692 | 0.1989 |
| recogniser_rule | formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) | Translate as "just as 'Y' is from the name X" | -0.00011 | 19 | 0.1781 | 0.1998 |
| recogniser_rule | gloss: ἐθνικόν τό | ethnonym | -0.00009 | 61 | 0.1838 | 0.2138 |
Recogniser Rules + Translation Length: METEOR badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | gloss occurrence count | | 0.00192 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | gloss rule count | | 0.00152 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | matched occurrence count | | 0.00103 | 101 | 0.1957 | N/A |
| translation_length | Mean v3 translation word count | z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile | 0.00063 | 101 | 0.2753 | 0.1464 |
| recogniser_summary | matched rule count | | 0.00037 | 101 | 0.1957 | N/A |
| recogniser_rule | formula: X (SETTLEMENT) + Y (genitive REGION) | Translate as "a X in Y" | 0.00028 | 63 | 0.2110 | 0.1704 |
| recogniser_rule | formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) | Translate as "'X' is from Y" | 0.00026 | 36 | 0.2116 | 0.1869 |
| recogniser_rule | formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) | Translate as "X, X" | 0.00020 | 20 | 0.2275 | 0.1879 |
| recogniser_rule | gloss: οἰκήτωρ ὁ | inhabitant, resident, patron (of a brothel...? - κ123) | 0.00018 | 6 | 0.3660 | 0.1849 |
| recogniser_rule | gloss: ἄκρον τό | cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) | 0.00018 | 8 | 0.2890 | 0.1877 |
| recogniser_rule | gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) | is/used to be/is/was called/named after (+ ἀπο) | 0.00018 | 14 | 0.2488 | 0.1872 |
| recogniser_rule | gloss: πόλισμα τό * | town | 0.00015 | 5 | 0.2852 | 0.1910 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | formula rule count | | -0.00120 | 100 | 0.1949 | 0.2718 |
| recogniser_summary | formula occurrence count | | -0.00094 | 100 | 0.1949 | 0.2718 |
| recogniser_rule | formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) | Translate as "the ethnonym is 'X'" | -0.00029 | 62 | 0.1798 | 0.2210 |
| recogniser_rule | formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) | Translate as "'X' as in 'Y'" | -0.00028 | 43 | 0.1737 | 0.2120 |
| recogniser_rule | formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) | Translate as "(just) as 'Y' is from X" | -0.00026 | 35 | 0.1677 | 0.2105 |
| recogniser_rule | formula: X (AUTHOR NAME) + Y (NUMERAL) | Translate as "X, book Y" | -0.00019 | 31 | 0.1637 | 0.2099 |
| recogniser_rule | gloss: ἔθνος τό | people | -0.00019 | 16 | 0.1559 | 0.2032 |
| recogniser_rule | formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) | Translate as "Χ, in his *Y*" | -0.00017 | 17 | 0.1616 | 0.2026 |
| recogniser_rule | formula: ὡς + X (AUTHOR NAME) | Translate as "as per X" | -0.00016 | 41 | 0.1781 | 0.2077 |
| recogniser_rule | formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) | Translate as "X, in book Y of his *Z*" | -0.00011 | 11 | 0.1692 | 0.1989 |
| recogniser_rule | formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) | Translate as "just as 'Y' is from the name X" | -0.00010 | 19 | 0.1781 | 0.1998 |
| recogniser_rule | gloss: ἐθνικόν τό | ethnonym | -0.00010 | 61 | 0.1838 | 0.2138 |
Combined Model Features: METEOR badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | gloss occurrence count | | 0.00194 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | gloss rule count | | 0.00153 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | matched occurrence count | | 0.00105 | 101 | 0.1957 | N/A |
| recogniser_summary | matched rule count | | 0.00038 | 101 | 0.1957 | N/A |
| recogniser_rule | formula: X (SETTLEMENT) + Y (genitive REGION) | Translate as "a X in Y" | 0.00028 | 63 | 0.2110 | 0.1704 |
| recogniser_rule | formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) | Translate as "'X' is from Y" | 0.00026 | 36 | 0.2116 | 0.1869 |
| recogniser_rule | formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) | Translate as "X, X" | 0.00020 | 20 | 0.2275 | 0.1879 |
| recogniser_rule | gloss: οἰκήτωρ ὁ | inhabitant, resident, patron (of a brothel...? - κ123) | 0.00018 | 6 | 0.3660 | 0.1849 |
| recogniser_rule | gloss: ἄκρον τό | cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) | 0.00018 | 8 | 0.2890 | 0.1877 |
| recogniser_rule | gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) | is/used to be/is/was called/named after (+ ἀπο) | 0.00018 | 14 | 0.2488 | 0.1872 |
| recogniser_rule | gloss: πόλισμα τό * | town | 0.00016 | 5 | 0.2852 | 0.1910 |
| recogniser_rule | formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN) | Translate as "Y is also 'X'" | 0.00012 | 13 | 0.2129 | 0.1932 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | formula rule count | | -0.00120 | 100 | 0.1949 | 0.2718 |
| recogniser_summary | formula occurrence count | | -0.00094 | 100 | 0.1949 | 0.2718 |
| recogniser_rule | formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) | Translate as "the ethnonym is 'X'" | -0.00029 | 62 | 0.1798 | 0.2210 |
| recogniser_rule | formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) | Translate as "'X' as in 'Y'" | -0.00028 | 43 | 0.1737 | 0.2120 |
| recogniser_rule | formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) | Translate as "(just) as 'Y' is from X" | -0.00026 | 35 | 0.1677 | 0.2105 |
| recogniser_rule | formula: X (AUTHOR NAME) + Y (NUMERAL) | Translate as "X, book Y" | -0.00019 | 31 | 0.1637 | 0.2099 |
| recogniser_rule | gloss: ἔθνος τό | people | -0.00019 | 16 | 0.1559 | 0.2032 |
| recogniser_rule | formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) | Translate as "Χ, in his *Y*" | -0.00017 | 17 | 0.1616 | 0.2026 |
| recogniser_rule | formula: ὡς + X (AUTHOR NAME) | Translate as "as per X" | -0.00016 | 41 | 0.1781 | 0.2077 |
| recogniser_rule | formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) | Translate as "X, in book Y of his *Z*" | -0.00012 | 11 | 0.1692 | 0.1989 |
| vocabulary | εθνικον | | -0.00011 | 57 | 0.1742 | 0.2236 |
| recogniser_rule | formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) | Translate as "just as 'Y' is from the name X" | -0.00011 | 19 | 0.1781 | 0.1998 |
Combined Model Features + Translation Length: METEOR badness
Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.
Features associated with worse scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | gloss occurrence count | | 0.00192 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | gloss rule count | | 0.00152 | 96 | 0.1998 | 0.1167 |
| recogniser_summary | matched occurrence count | | 0.00103 | 101 | 0.1957 | N/A |
| translation_length | Mean v3 translation word count | z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile | 0.00063 | 101 | 0.2753 | 0.1464 |
| recogniser_summary | matched rule count | | 0.00037 | 101 | 0.1957 | N/A |
| recogniser_rule | formula: X (SETTLEMENT) + Y (genitive REGION) | Translate as "a X in Y" | 0.00028 | 63 | 0.2110 | 0.1704 |
| recogniser_rule | formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) | Translate as "'X' is from Y" | 0.00026 | 36 | 0.2116 | 0.1869 |
| recogniser_rule | formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) | Translate as "X, X" | 0.00020 | 20 | 0.2275 | 0.1879 |
| recogniser_rule | gloss: οἰκήτωρ ὁ | inhabitant, resident, patron (of a brothel...? - κ123) | 0.00018 | 6 | 0.3660 | 0.1849 |
| recogniser_rule | gloss: ἄκρον τό | cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) | 0.00018 | 8 | 0.2890 | 0.1877 |
| recogniser_rule | gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) | is/used to be/is/was called/named after (+ ἀπο) | 0.00017 | 14 | 0.2488 | 0.1872 |
| recogniser_rule | gloss: πόλισμα τό * | town | 0.00015 | 5 | 0.2852 | 0.1910 |
Features associated with better scores
| Type | Feature | Detail | Coefficient | Passages | Mean badness present | Mean badness absent |
| recogniser_summary | formula rule count | | -0.00120 | 100 | 0.1949 | 0.2718 |
| recogniser_summary | formula occurrence count | | -0.00094 | 100 | 0.1949 | 0.2718 |
| recogniser_rule | formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) | Translate as "the ethnonym is 'X'" | -0.00029 | 62 | 0.1798 | 0.2210 |
| recogniser_rule | formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) | Translate as "'X' as in 'Y'" | -0.00028 | 43 | 0.1737 | 0.2120 |
| recogniser_rule | formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) | Translate as "(just) as 'Y' is from X" | -0.00026 | 35 | 0.1677 | 0.2105 |
| recogniser_rule | formula: X (AUTHOR NAME) + Y (NUMERAL) | Translate as "X, book Y" | -0.00019 | 31 | 0.1637 | 0.2099 |
| recogniser_rule | gloss: ἔθνος τό | people | -0.00019 | 16 | 0.1559 | 0.2032 |
| recogniser_rule | formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) | Translate as "Χ, in his *Y*" | -0.00017 | 17 | 0.1616 | 0.2026 |
| recogniser_rule | formula: ὡς + X (AUTHOR NAME) | Translate as "as per X" | -0.00016 | 41 | 0.1781 | 0.2077 |
| recogniser_rule | formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) | Translate as "X, in book Y of his *Z*" | -0.00011 | 11 | 0.1692 | 0.1989 |
| vocabulary | εθνικον | | -0.00011 | 57 | 0.1742 | 0.2236 |
| recogniser_rule | formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) | Translate as "just as 'Y' is from the name X" | -0.00010 | 19 | 0.1781 | 0.1998 |
Generated: 2026-06-25 12:20:32 UTC. Recogniser detector version: translation_guidance_scan_v4.