Translation Quality Predictor

This page tests whether three feature families (current source vocabulary, translation-guidance recogniser matches, and mean v3 translation length) can predict which approved-human passages ordinary legacy_scholarly v3 translates badly. Reasoning profiles and the special-temperature v3 experiment lanes are excluded.

Best current model: Greek vocabulary + translation length predicting 2-gram F1 badness, CV R^2 0.394, Spearman r 0.640.

Sentence-level alignment and metric rows exist, but the daily output is still modeled at passage level.

Metric Engines

Metric Status
BLEU-4 SacreBLEU sentence BLEU-4
METEOR NLTK METEOR with WordNet synonyms
ROUGE-L rouge-score ROUGE-L with stemming
chrF++ SacreBLEU chrF++ with word_order=2

Approved human passages: 101
Scored v3 passages: 101
Completed v3 runs: 118
Mean runs per passage: 1.17

Predictability By Metric

Targets are badness measures: for score metrics, a larger value means a lower translation score; for length, a larger value means a larger absolute word-count error. Cross-validation uses fixed five-fold splits where possible. The sample is small (101 passages), so a negative CV R^2, which means the model predicts worse than simply predicting the mean badness, should be read as evidence that the feature family is not currently useful for that metric.
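To make the two kinds of target concrete, here is a minimal sketch that computes an n-gram F1 badness and an absolute length percent error for one passage. The report does not state either formula, so two assumptions are made: badness is taken as 1 minus the score, and length error is measured relative to the reference word count. The function names and the example sentences are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f1(candidate, reference, n=2):
    """Clipped n-gram F1 between two whitespace-tokenised strings."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def score_badness(score):
    """Assumed mapping: a higher score gives a lower badness."""
    return 1.0 - score


def abs_length_pct_error(candidate, reference):
    """Assumed definition: |candidate words - reference words| / reference words."""
    ref_len = len(reference.split())
    return abs(len(candidate.split()) - ref_len) / ref_len


# Invented example sentences, not taken from the corpus.
reference = "a city in Caria named after Car"
candidate = "a city in Caria called after Car"

f1 = ngram_f1(candidate, reference, n=2)
badness = score_badness(f1)
length_error = abs_length_pct_error(candidate, reference)
```

In this example four of the six bigrams match, so the 2-gram F1 is 2/3 and the assumed badness is 1/3, while the length error is zero because both sentences have seven words.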

Feature family Target Status Passages Features CV R^2 Spearman r CV MAE MAE lift Worst-quartile precision Ridge alpha
Greek vocabulary + translation length 2-gram F1 badness ok 101 207 0.394 0.640 0.1123 0.0331 61.5% 0.3257
Greek vocabulary + translation length chrF++ badness ok 101 207 0.372 0.679 0.0696 0.0227 61.5% 0.4924
Greek vocabulary 2-gram F1 badness ok 101 206 0.370 0.596 0.1149 0.0304 50.0% 0.3257
Greek vocabulary + translation length 3-gram F1 badness ok 101 207 0.369 0.598 0.1479 0.0374 61.5% 0.3257
Greek vocabulary + translation length ROUGE-L badness ok 101 207 0.360 0.663 0.0714 0.0207 61.5% 0.3257
Greek vocabulary + translation length 3-gram Jaccard badness ok 101 207 0.356 0.610 0.1576 0.0359 61.5% 0.4924
Greek vocabulary + translation length BLEU-4 badness ok 101 207 0.355 0.647 0.1269 0.0390 65.4% 0.4924
Greek vocabulary + translation length Sentence BLEU badness ok 101 207 0.355 0.647 0.1269 0.0390 65.4% 0.4924
Greek vocabulary + translation length METEOR badness ok 101 207 0.347 0.650 0.0782 0.0199 61.5% 0.4924
Greek vocabulary 3-gram F1 badness ok 101 206 0.342 0.560 0.1506 0.0347 57.7% 0.4924
Greek vocabulary chrF++ badness ok 101 206 0.338 0.617 0.0723 0.0199 57.7% 0.4924
Greek vocabulary 3-gram Jaccard badness ok 101 206 0.334 0.552 0.1592 0.0343 50.0% 0.4924
Greek vocabulary METEOR badness ok 101 206 0.330 0.594 0.0781 0.0201 50.0% 0.4924
Greek vocabulary ROUGE-L badness ok 101 206 0.330 0.605 0.0733 0.0188 50.0% 0.3257
Greek vocabulary BLEU-4 badness ok 101 206 0.326 0.596 0.1287 0.0373 65.4% 0.4924
Greek vocabulary Sentence BLEU badness ok 101 206 0.326 0.596 0.1287 0.0373 65.4% 0.4924
Translation length chrF++ badness ok 101 1 0.164 0.442 0.0835 0.0087 44.4% 13.4340
Translation length ROUGE-L badness ok 101 1 0.129 0.378 0.0847 0.0075 46.2% 20.3092
Translation length METEOR badness ok 101 1 0.116 0.334 0.0914 0.0068 42.3% 20.3092
Translation length BLEU-4 badness ok 101 1 0.102 0.358 0.1535 0.0124 33.3% 13.4340
Translation length Sentence BLEU badness ok 101 1 0.102 0.358 0.1535 0.0124 33.3% 13.4340
Translation length 3-gram Jaccard badness ok 101 1 0.075 0.249 0.1811 0.0125 37.0% 8.8862
Translation length 2-gram F1 badness ok 101 1 0.072 0.281 0.1364 0.0090 38.5% 13.4340
Greek vocabulary Absolute length percent error ok 101 206 0.068 0.129 0.0408 0.0008 30.8% 1.7013
Greek vocabulary + translation length Absolute length percent error ok 101 207 0.062 0.107 0.0409 0.0007 30.8% 1.7013
Vocabulary + recognisers + translation length METEOR badness ok 101 288 0.060 0.327 0.0950 0.0032 46.2% 4375.4794
Recogniser rules + translation length METEOR badness ok 101 82 0.059 0.325 0.0950 0.0032 46.2% 4375.4794
Vocabulary + recognisers + translation length ROUGE-L badness ok 101 288 0.059 0.523 0.0831 0.0090 61.5% 1.7013
Vocabulary + recognisers METEOR badness ok 101 287 0.058 0.323 0.0951 0.0031 42.3% 4375.4794
Recogniser rules METEOR badness ok 101 81 0.058 0.322 0.0951 0.0031 42.3% 4375.4794
Vocabulary + recognisers + translation length chrF++ badness ok 101 288 0.058 0.339 0.0892 0.0030 50.0% 2894.2661
Recogniser rules + translation length chrF++ badness ok 101 82 0.057 0.339 0.0892 0.0030 50.0% 2894.2661
Translation length 3-gram F1 badness ok 101 1 0.055 0.240 0.1749 0.0104 33.3% 13.4340
Vocabulary + recognisers chrF++ badness ok 101 287 0.054 0.327 0.0894 0.0028 50.0% 2894.2661
Recogniser rules chrF++ badness ok 101 81 0.053 0.323 0.0895 0.0028 46.2% 2894.2661
Recogniser rules + translation length ROUGE-L badness ok 101 82 0.052 0.323 0.0889 0.0032 42.3% 4375.4794
Vocabulary + recognisers ROUGE-L badness ok 101 287 0.050 0.307 0.0890 0.0031 42.3% 4375.4794
Recogniser rules ROUGE-L badness ok 101 81 0.050 0.305 0.0891 0.0031 42.3% 4375.4794
Vocabulary + recognisers + translation length 3-gram Jaccard badness ok 101 288 0.040 0.436 0.1828 0.0108 50.0% 5.8780
Vocabulary + recognisers + translation length BLEU-4 badness ok 101 288 0.038 0.218 0.1625 0.0035 42.3% 4375.4794
Vocabulary + recognisers + translation length Sentence BLEU badness ok 101 288 0.038 0.218 0.1625 0.0035 42.3% 4375.4794
Recogniser rules + translation length BLEU-4 badness ok 101 82 0.038 0.216 0.1625 0.0035 42.3% 4375.4794
Recogniser rules + translation length Sentence BLEU badness ok 101 82 0.038 0.216 0.1625 0.0035 42.3% 4375.4794
Vocabulary + recognisers BLEU-4 badness ok 101 287 0.036 0.212 0.1627 0.0033 42.3% 4375.4794
Vocabulary + recognisers Sentence BLEU badness ok 101 287 0.036 0.212 0.1627 0.0033 42.3% 4375.4794
Recogniser rules BLEU-4 badness ok 101 81 0.036 0.210 0.1627 0.0032 42.3% 4375.4794
Recogniser rules Sentence BLEU badness ok 101 81 0.036 0.210 0.1627 0.0032 42.3% 4375.4794
Vocabulary + recognisers + translation length 2-gram F1 badness ok 101 288 0.033 0.242 0.1422 0.0032 42.3% 4375.4794
Recogniser rules + translation length 2-gram F1 badness ok 101 82 0.033 0.240 0.1422 0.0032 42.3% 4375.4794
Vocabulary + recognisers 2-gram F1 badness ok 101 287 0.031 0.231 0.1424 0.0030 42.3% 4375.4794
Recogniser rules 2-gram F1 badness ok 101 81 0.031 0.228 0.1424 0.0030 42.3% 4375.4794
Vocabulary + recognisers Absolute length percent error ok 101 287 0.028 0.167 0.0420 -0.0004 46.2% 837.6776
Recogniser rules Absolute length percent error ok 101 81 0.028 0.165 0.0420 -0.0004 46.2% 837.6776
Vocabulary + recognisers + translation length Absolute length percent error ok 101 288 0.028 0.164 0.0420 -0.0004 46.2% 837.6776
Recogniser rules + translation length Absolute length percent error ok 101 82 0.027 0.167 0.0420 -0.0004 46.2% 837.6776
Vocabulary + recognisers + translation length 3-gram F1 badness ok 101 288 0.026 0.230 0.1813 0.0040 34.6% 2894.2661
Recogniser rules + translation length 3-gram F1 badness ok 101 82 0.026 0.229 0.1813 0.0040 34.6% 2894.2661
Vocabulary + recognisers 3-gram F1 badness ok 101 287 0.024 0.221 0.1815 0.0038 34.6% 2894.2661
Recogniser rules 3-gram F1 badness ok 101 81 0.024 0.193 0.1819 0.0034 34.6% 4375.4794
Recogniser rules + translation length 3-gram Jaccard badness ok 101 82 0.023 0.227 0.1903 0.0033 38.5% 2894.2661
Vocabulary + recognisers 3-gram Jaccard badness ok 101 287 0.021 0.224 0.1905 0.0030 38.5% 2894.2661
Recogniser rules 3-gram Jaccard badness ok 101 81 0.021 0.222 0.1905 0.0030 38.5% 2894.2661
Translation length Absolute length percent error ok 101 1 -0.002 -0.077 0.0416 -0.0001 19.2% 10000.0000
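
The columns in the table above can be approximated as follows. This is a minimal sketch rather than the report's pipeline: it fits a one-feature ridge model with fixed five-fold splits and summarises the out-of-fold predictions. Two definitions are assumptions. MAE lift is taken as the MAE of a training-mean baseline minus the model MAE, which is consistent with the table, where CV MAE plus MAE lift is nearly constant for each target. Worst-quartile precision is taken as the share of the top predicted quartile that also falls in the top observed quartile. The fold assignment by index modulo 5, the toy data, and all function names are hypothetical.

```python
def ranks(values):
    """Average 1-based ranks, with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def ridge_1d(x, y, alpha):
    """Closed-form ridge fit for one feature; the intercept is not penalised."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / (sxx + alpha)
    return slope, my - slope * mx


def cross_validate(x, y, alpha, folds=5):
    """Out-of-fold predictions and a training-mean baseline on fixed folds."""
    preds, base = [0.0] * len(y), [0.0] * len(y)
    for f in range(folds):
        train = [i for i in range(len(y)) if i % folds != f]
        test = [i for i in range(len(y)) if i % folds == f]
        slope, intercept = ridge_1d([x[i] for i in train], [y[i] for i in train], alpha)
        mean_y = sum(y[i] for i in train) / len(train)
        for i in test:
            preds[i] = slope * x[i] + intercept
            base[i] = mean_y
    return preds, base


def summarise(y, preds, base):
    n = len(y)
    mean_y = sum(y) / n
    mae = sum(abs(a - p) for a, p in zip(y, preds)) / n
    base_mae = sum(abs(a - b) for a, b in zip(y, base)) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y, preds))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    k = max(1, n // 4)  # size of a quartile
    worst_observed = set(sorted(range(n), key=lambda i: -y[i])[:k])
    worst_predicted = sorted(range(n), key=lambda i: -preds[i])[:k]
    hits = sum(1 for i in worst_predicted if i in worst_observed)
    return {
        "cv_r2": 1 - ss_res / ss_tot,
        "spearman": pearson(ranks(y), ranks(preds)),
        "cv_mae": mae,
        "mae_lift": base_mae - mae,
        "worst_quartile_precision": hits / k,
    }


# Toy data: badness rises with the feature, plus a small deterministic wobble.
x = [float(v) for v in range(20)]
y = [0.1 + 0.02 * v + (0.01 if v % 3 == 0 else 0.0) for v in range(20)]

preds, base = cross_validate(x, y, alpha=1.0)
report = summarise(y, preds, base)
```

Under these definitions a model that does no better than the training mean has an MAE lift near zero and a CV R^2 near or below zero, which matches the pattern in the last row of the table.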

Highest Predicted Risk

This list uses the best cross-validated model in this run (Greek vocabulary + translation length) and sorts passages by predicted 2-gram F1 badness, highest first.

Lemma ID v3 runs Source words Observed badness Predicted badness BLEU-4 chrF++ 3-gram F1 Length error
Καρία 2484 1 181.0 0.4484 0.6604 42.0% 65.8% 42.2% 12.6%
Κασώριον 2623 1 14.0 0.5556 0.6126 32.1% 66.0% 35.3% 10.0%
Κάρυστος 2603 1 132.0 0.4327 0.5945 46.5% 69.5% 41.5% 0.6%
Καλάσιρις 2085 1 10.0 0.6923 0.5179 32.3% 69.5% 8.3% 15.4%
Κάλυτις 2335 2 16.0 0.6765 0.5155 31.9% 60.3% 14.7% 5.8%
Κριώα 3530 1 16.0 0.4894 0.5146 38.0% 71.1% 31.1% 11.5%
Καδμεία 2059 1 17.0 0.5000 0.4971 41.9% 68.2% 36.8% 10.0%
Καρχηδών 2604 1 88.0 0.4897 0.4941 35.0% 63.9% 35.7% 9.4%
Καππαδοκία 2470 2 57.0 0.4280 0.4923 20.0% 56.8% 41.7% 3.7%
Κοτιάειον 3496 1 46.0 0.5373 0.4854 41.0% 62.3% 34.8% 8.5%
Καταονία 2628 1 17.0 0.5319 0.4780 36.2% 67.4% 31.1% 4.2%
Καλαβρία 2080 1 12.0 0.6250 0.4718 23.6% 66.7% 13.3% 11.1%
Κύρνος 7247 1 34.0 0.3895 0.4620 47.1% 69.3% 51.6% 6.0%
Κύτα 7254 1 58.0 0.4000 0.4495 51.9% 73.6% 47.6% 4.8%
Καπετώλιον 2468 1 86.0 0.5827 0.4493 30.1% 50.3% 31.0% 14.5%
Κάναστρον 2455 1 43.0 0.5826 0.4492 32.0% 62.9% 26.5% 5.0%
Καρπασία 2597 1 80.0 0.3767 0.4414 54.3% 75.2% 51.6% 6.2%
Κώμη 7266 1 53.0 0.7432 0.4163 13.8% 47.6% 13.7% 10.1%
Κατάνη 2626 1 64.0 0.4945 0.4159 43.8% 64.8% 36.7% 6.3%
Κυτέριον 7255 1 18.0 0.4167 0.4147 53.0% 76.1% 43.5% 27.3%
Κάλπη 2329 1 36.0 0.2000 0.4054 81.0% 89.1% 72.2% 3.5%
Καβασσός 2055 1 69.0 0.5048 0.4053 41.6% 67.5% 32.7% 5.5%
Καικῖνον 2074 1 6.0 0.6842 0.4048 21.4% 66.4% 0.0% 9.1%
Κωνώπη 7267 1 47.0 0.3667 0.4033 57.6% 75.7% 50.8% 9.4%
Κάλλατις 2119 1 45.0 0.4887 0.4018 31.8% 63.9% 30.5% 7.7%
Κάσος 2607 1 44.0 0.4464 0.3991 33.7% 61.5% 43.6% 0.0%
Καβελλιών 2057 1 23.0 0.2836 0.3934 45.8% 75.0% 55.4% 2.9%
Κάληρος 2116 1 22.0 0.5890 0.3854 29.3% 61.2% 28.2% 12.5%
Κωλιάς 7264 1 42.0 0.3445 0.3853 56.9% 76.7% 53.0% 1.6%
Κάσιον 2605 1 43.0 0.3158 0.3812 53.0% 75.6% 55.4% 7.1%
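
The ordering of the list above can be reproduced with a simple sort. The sketch below copies the observed and predicted badness of four rows from the table and also reports the signed gap between the two; the gap is an illustrative addition, not a column of the report, and the function name is hypothetical.

```python
# (lemma, observed badness, predicted badness), copied from four table rows
rows = [
    ("Κάλπη", 0.2000, 0.4054),
    ("Καρία", 0.4484, 0.6604),
    ("Κώμη", 0.7432, 0.4163),
    ("Κασώριον", 0.5556, 0.6126),
]


def rank_by_predicted(rows, top_n=None):
    """Sort passages by predicted badness, highest first."""
    ranked = sorted(rows, key=lambda r: r[2], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


ranked = rank_by_predicted(rows)

# Signed gap: positive means the model predicted worse than was observed.
gaps = [(lemma, round(pred - obs, 4)) for lemma, obs, pred in ranked]
```

A large positive gap marks a passage the model flags as risky although its observed score was good, and a large negative gap marks a bad passage the model underrates.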

Predictive Features

Translation Length: chrF++ badness

Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
translation_length Mean v3 translation word count z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile 0.04422 101 0.3137 0.1565
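
The translation-length row above can be read as follows. This sketch uses the mean (43.1 words), standard deviation (36.7 words), and coefficient (0.04422) reported in that row, and assumes the coefficient multiplies the z-scored word count directly. The model intercept is not reported, so only the shift relative to a mean-length translation is computed. The example word count of 80 is hypothetical.

```python
MEAN_WORDS = 43.1      # mean v3 translation word count, from the table row
SD_WORDS = 36.7        # standard deviation, from the same row
COEFFICIENT = 0.04422  # ridge coefficient for chrF++ badness, from the same row


def z_score(word_count):
    """Standardise a translation word count with the reported mean and SD."""
    return (word_count - MEAN_WORDS) / SD_WORDS


def length_contribution(word_count):
    """Shift in predicted chrF++ badness relative to a mean-length translation."""
    return COEFFICIENT * z_score(word_count)


example_words = 80  # hypothetical translation length
shift = length_contribution(example_words)
```

A translation about one standard deviation longer than average (roughly 80 words) therefore raises the predicted chrF++ badness by about 0.044 under these assumptions.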

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
(none: this model has a single feature, and its coefficient is positive)

Vocabulary Terms: 2-gram F1 badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
vocabulary χωριον 0.25588 2 0.5835 0.3334
vocabulary οικητωρ 0.20685 6 0.5544 0.3247
vocabulary επι 0.17838 2 0.6113 0.3328
vocabulary τον 0.17386 10 0.4599 0.3250
vocabulary τοις 0.17006 3 0.4885 0.3338
vocabulary ωστε 0.16307 3 0.5518 0.3318
vocabulary και φασι 0.15998 2 0.5012 0.3351
vocabulary οικητωρ και 0.15359 3 0.6190 0.3298
vocabulary καλειται 0.15359 2 0.4723 0.3356
vocabulary και 0.14659 66 0.3828 0.2546
vocabulary ει 0.14567 4 0.5657 0.3290
vocabulary εκαλειτο 0.13830 7 0.4612 0.3292

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
vocabulary τεταρτω -0.28307 3 0.0929 0.3459
vocabulary εθνικον -0.21955 57 0.3037 0.3833
vocabulary εβδομη -0.20173 2 0.2444 0.3403
vocabulary ως εθνικον -0.17735 5 0.1882 0.3462
vocabulary μεταξυ και -0.15902 5 0.2941 0.3407
vocabulary μεταξυ -0.15902 5 0.2941 0.3407
vocabulary πορρω -0.15293 3 0.0962 0.3458
vocabulary ου πορρω -0.15293 3 0.0962 0.3458
vocabulary πολιτης -0.13977 18 0.3404 0.3379
vocabulary παιδος -0.13969 4 0.2105 0.3436
vocabulary απο παιδος -0.13969 4 0.2105 0.3436
vocabulary εν -0.13932 37 0.3444 0.3349
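
The vocabulary terms above appear lowercased and without accents, and they include two-word terms. The sketch below shows one way such features and the "mean badness present/absent" columns could be derived. It rests on several assumptions not confirmed by the report: diacritics are stripped by Unicode decomposition, proper nouns are detected by a crude capital-letter check, and bigrams are formed after proper nouns are dropped. The example passages and badness values are invented.

```python
import unicodedata

PUNCTUATION = ".,·;:'\""


def normalise(token):
    """Lowercase a Greek token and strip accents and breathings."""
    decomposed = unicodedata.normalize("NFD", token)
    stripped = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
    return stripped.lower()


def vocabulary_terms(text):
    """Return the set of unigram and bigram terms for one passage."""
    tokens = []
    for raw in text.split():
        word = raw.strip(PUNCTUATION)
        # Crude proper-noun filter (assumption): skip capitalised words.
        if not word or word[0].isupper():
            continue
        tokens.append(normalise(word))
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


def mean_badness_split(passages, term):
    """Mean badness of passages containing the term versus the rest."""
    present = [b for text, b in passages if term in vocabulary_terms(text)]
    absent = [b for text, b in passages if term not in vocabulary_terms(text)]

    def mean(values):
        return sum(values) / len(values) if values else None

    return mean(present), mean(absent)


# Invented passages with invented badness values.
passages = [
    ("πόλις μεταξὺ Καρίας καὶ Λυκίας", 0.30),
    ("χωρίον Θρᾴκης", 0.60),
    ("πόλις Ἰταλίας, τὸ ἐθνικὸν Καλαβρός", 0.20),
]

terms = vocabulary_terms(passages[0][0])
present_mean, absent_mean = mean_badness_split(passages, "πολις")
```

Dropping proper nouns before forming bigrams would explain why a term such as "μεταξυ και" can occur in exactly the same passages as "μεταξυ", although the report does not confirm that this is how the features are built.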

Vocabulary Terms + Translation Length: 2-gram F1 badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
vocabulary χωριον 0.26148 2 0.5835 0.3334
vocabulary οικητωρ 0.20978 6 0.5544 0.3247
vocabulary επι 0.18030 2 0.6113 0.3328
vocabulary και φασι 0.17593 2 0.5012 0.3351
vocabulary οικητωρ και 0.16831 3 0.6190 0.3298
vocabulary τοις 0.15941 3 0.4885 0.3338
vocabulary τον 0.15662 10 0.4599 0.3250
vocabulary μοιρα 0.15277 2 0.6121 0.3328
vocabulary καλειται 0.15045 2 0.4723 0.3356
vocabulary ωστε 0.14658 3 0.5518 0.3318
vocabulary και 0.14300 66 0.3828 0.2546
vocabulary δευτερω 0.13128 3 0.5157 0.3329

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
vocabulary τεταρτω -0.26840 3 0.0929 0.3459
vocabulary εθνικον -0.20535 57 0.3037 0.3833
vocabulary εβδομη -0.19872 2 0.2444 0.3403
vocabulary ως εθνικον -0.17192 5 0.1882 0.3462
vocabulary εν -0.16332 37 0.3444 0.3349
vocabulary μεταξυ και -0.15855 5 0.2941 0.3407
vocabulary μεταξυ -0.15855 5 0.2941 0.3407
vocabulary ως εν -0.14623 7 0.3104 0.3404
vocabulary πορρω -0.14466 3 0.0962 0.3458
vocabulary ου πορρω -0.14466 3 0.0962 0.3458
vocabulary παιδος -0.13855 4 0.2105 0.3436
vocabulary απο παιδος -0.13855 4 0.2105 0.3436

Recogniser Rules: METEOR badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary gloss occurrence count 0.00194 96 0.1998 0.1167
recogniser_summary gloss rule count 0.00153 96 0.1998 0.1167
recogniser_summary matched occurrence count 0.00105 101 0.1957 N/A
recogniser_summary matched rule count 0.00038 101 0.1957 N/A
recogniser_rule formula: X (SETTLEMENT) + Y (genitive REGION) Translate as "a X in Y" 0.00028 63 0.2110 0.1704
recogniser_rule formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) Translate as "'X' is from Y" 0.00026 36 0.2116 0.1869
recogniser_rule formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) Translate as "X, X" 0.00020 20 0.2275 0.1879
recogniser_rule gloss: οἰκήτωρ ὁ inhabitant, resident, patron (of a brothel...? - κ123) 0.00018 6 0.3660 0.1849
recogniser_rule gloss: ἄκρον τό cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) 0.00018 8 0.2890 0.1877
recogniser_rule gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) is/used to be/is/was called/named after (+ ἀπο) 0.00018 14 0.2488 0.1872
recogniser_rule gloss: πόλισμα τό * town 0.00016 5 0.2852 0.1910
recogniser_rule formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN) Translate as "Y is also 'X'" 0.00012 13 0.2129 0.1932

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary formula rule count -0.00120 100 0.1949 0.2718
recogniser_summary formula occurrence count -0.00094 100 0.1949 0.2718
recogniser_rule formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) Translate as "the ethnonym is 'X'" -0.00029 62 0.1798 0.2210
recogniser_rule formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) Translate as "'X' as in 'Y'" -0.00028 43 0.1737 0.2120
recogniser_rule formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) Translate as "(just) as 'Y' is from X" -0.00026 35 0.1677 0.2105
recogniser_rule formula: X (AUTHOR NAME) + Y (NUMERAL) Translate as "X, book Y" -0.00019 31 0.1637 0.2099
recogniser_rule gloss: ἔθνος τό people -0.00019 16 0.1559 0.2032
recogniser_rule formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) Translate as "Χ, in his *Y*" -0.00017 17 0.1616 0.2026
recogniser_rule formula: ὡς + X (AUTHOR NAME) Translate as "as per X" -0.00016 41 0.1781 0.2077
recogniser_rule formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) Translate as "X, in book Y of his *Z*" -0.00012 11 0.1692 0.1989
recogniser_rule formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) Translate as "just as 'Y' is from the name X" -0.00011 19 0.1781 0.1998
recogniser_rule gloss: ἐθνικόν τό ethnonym -0.00009 61 0.1838 0.2138
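
The recogniser_summary rows above distinguish rule counts from occurrence counts. A plausible reading, sketched below, is that a rule count is the number of distinct rules matched in a passage and an occurrence count is the total number of matches. This interpretation, the match representation, and the rule identifiers are all assumptions; the report does not define them.

```python
from collections import Counter


def recogniser_summary(matches):
    """Summarise recogniser matches for one passage.

    matches: list of (kind, rule_id) pairs, one pair per matched occurrence,
    where kind is "gloss" or "formula".
    """
    occurrences = Counter(kind for kind, _ in matches)
    distinct_rules = Counter(kind for kind, _ in set(matches))
    return {
        "matched occurrence count": len(matches),
        "matched rule count": len(set(matches)),
        "gloss occurrence count": occurrences["gloss"],
        "gloss rule count": distinct_rules["gloss"],
        "formula occurrence count": occurrences["formula"],
        "formula rule count": distinct_rules["formula"],
    }


# Hypothetical matches for one passage; the rule identifiers are made up.
matches = [
    ("formula", "settlement_in_region"),
    ("formula", "author_book"),
    ("formula", "author_book"),
    ("gloss", "ethnos"),
]

summary = recogniser_summary(matches)
```

Here the repeated author_book match adds to the formula occurrence count (3) but not to the formula rule count (2).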

Recogniser Rules + Translation Length: METEOR badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary gloss occurrence count 0.00192 96 0.1998 0.1167
recogniser_summary gloss rule count 0.00152 96 0.1998 0.1167
recogniser_summary matched occurrence count 0.00103 101 0.1957 N/A
translation_length Mean v3 translation word count z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile 0.00063 101 0.2753 0.1464
recogniser_summary matched rule count 0.00037 101 0.1957 N/A
recogniser_rule formula: X (SETTLEMENT) + Y (genitive REGION) Translate as "a X in Y" 0.00028 63 0.2110 0.1704
recogniser_rule formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) Translate as "'X' is from Y" 0.00026 36 0.2116 0.1869
recogniser_rule formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) Translate as "X, X" 0.00020 20 0.2275 0.1879
recogniser_rule gloss: οἰκήτωρ ὁ inhabitant, resident, patron (of a brothel...? - κ123) 0.00018 6 0.3660 0.1849
recogniser_rule gloss: ἄκρον τό cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) 0.00018 8 0.2890 0.1877
recogniser_rule gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) is/used to be/is/was called/named after (+ ἀπο) 0.00018 14 0.2488 0.1872
recogniser_rule gloss: πόλισμα τό * town 0.00015 5 0.2852 0.1910

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary formula rule count -0.00120 100 0.1949 0.2718
recogniser_summary formula occurrence count -0.00094 100 0.1949 0.2718
recogniser_rule formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) Translate as "the ethnonym is 'X'" -0.00029 62 0.1798 0.2210
recogniser_rule formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) Translate as "'X' as in 'Y'" -0.00028 43 0.1737 0.2120
recogniser_rule formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) Translate as "(just) as 'Y' is from X" -0.00026 35 0.1677 0.2105
recogniser_rule formula: X (AUTHOR NAME) + Y (NUMERAL) Translate as "X, book Y" -0.00019 31 0.1637 0.2099
recogniser_rule gloss: ἔθνος τό people -0.00019 16 0.1559 0.2032
recogniser_rule formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) Translate as "Χ, in his *Y*" -0.00017 17 0.1616 0.2026
recogniser_rule formula: ὡς + X (AUTHOR NAME) Translate as "as per X" -0.00016 41 0.1781 0.2077
recogniser_rule formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) Translate as "X, in book Y of his *Z*" -0.00011 11 0.1692 0.1989
recogniser_rule formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) Translate as "just as 'Y' is from the name X" -0.00010 19 0.1781 0.1998
recogniser_rule gloss: ἐθνικόν τό ethnonym -0.00010 61 0.1838 0.2138

Combined Model Features: METEOR badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary gloss occurrence count 0.00194 96 0.1998 0.1167
recogniser_summary gloss rule count 0.00153 96 0.1998 0.1167
recogniser_summary matched occurrence count 0.00105 101 0.1957 N/A
recogniser_summary matched rule count 0.00038 101 0.1957 N/A
recogniser_rule formula: X (SETTLEMENT) + Y (genitive REGION) Translate as "a X in Y" 0.00028 63 0.2110 0.1704
recogniser_rule formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) Translate as "'X' is from Y" 0.00026 36 0.2116 0.1869
recogniser_rule formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) Translate as "X, X" 0.00020 20 0.2275 0.1879
recogniser_rule gloss: οἰκήτωρ ὁ inhabitant, resident, patron (of a brothel...? - κ123) 0.00018 6 0.3660 0.1849
recogniser_rule gloss: ἄκρον τό cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) 0.00018 8 0.2890 0.1877
recogniser_rule gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) is/used to be/is/was called/named after (+ ἀπο) 0.00018 14 0.2488 0.1872
recogniser_rule gloss: πόλισμα τό * town 0.00016 5 0.2852 0.1910
recogniser_rule formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN) Translate as "Y is also 'X'" 0.00012 13 0.2129 0.1932

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary formula rule count -0.00120 100 0.1949 0.2718
recogniser_summary formula occurrence count -0.00094 100 0.1949 0.2718
recogniser_rule formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) Translate as "the ethnonym is 'X'" -0.00029 62 0.1798 0.2210
recogniser_rule formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) Translate as "'X' as in 'Y'" -0.00028 43 0.1737 0.2120
recogniser_rule formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) Translate as "(just) as 'Y' is from X" -0.00026 35 0.1677 0.2105
recogniser_rule formula: X (AUTHOR NAME) + Y (NUMERAL) Translate as "X, book Y" -0.00019 31 0.1637 0.2099
recogniser_rule gloss: ἔθνος τό people -0.00019 16 0.1559 0.2032
recogniser_rule formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) Translate as "Χ, in his *Y*" -0.00017 17 0.1616 0.2026
recogniser_rule formula: ὡς + X (AUTHOR NAME) Translate as "as per X" -0.00016 41 0.1781 0.2077
recogniser_rule formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) Translate as "X, in book Y of his *Z*" -0.00012 11 0.1692 0.1989
vocabulary εθνικον -0.00011 57 0.1742 0.2236
recogniser_rule formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) Translate as "just as 'Y' is from the name X" -0.00011 19 0.1781 0.1998

Combined Model Features + Translation Length: METEOR badness

Features associated with worse scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary gloss occurrence count 0.00192 96 0.1998 0.1167
recogniser_summary gloss rule count 0.00152 96 0.1998 0.1167
recogniser_summary matched occurrence count 0.00103 101 0.1957 N/A
translation_length Mean v3 translation word count z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile 0.00063 101 0.2753 0.1464
recogniser_summary matched rule count 0.00037 101 0.1957 N/A
recogniser_rule formula: X (SETTLEMENT) + Y (genitive REGION) Translate as "a X in Y" 0.00028 63 0.2110 0.1704
recogniser_rule formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON) Translate as "'X' is from Y" 0.00026 36 0.2116 0.1869
recogniser_rule formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN) Translate as "X, X" 0.00020 20 0.2275 0.1879
recogniser_rule gloss: οἰκήτωρ ὁ inhabitant, resident, patron (of a brothel...? - κ123) 0.00018 6 0.3660 0.1849
recogniser_rule gloss: ἄκρον τό cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland) 0.00018 8 0.2890 0.1877
recogniser_rule gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...) is/used to be/is/was called/named after (+ ἀπο) 0.00017 14 0.2488 0.1872
recogniser_rule gloss: πόλισμα τό * town 0.00015 5 0.2852 0.1910

Features associated with better scores

Type Feature Detail Coefficient Passages Mean badness present Mean badness absent
recogniser_summary formula rule count -0.00120 100 0.1949 0.2718
recogniser_summary formula occurrence count -0.00094 100 0.1949 0.2718
recogniser_rule formula: τὸ ἐθνικὸν + X (nominative ETHNONYM) Translate as "the ethnonym is 'X'" -0.00029 62 0.1798 0.2210
recogniser_rule formula: X (nominative) + ὡς + Y (nominative HOMOMORPH) Translate as "'X' as in 'Y'" -0.00028 43 0.1737 0.2120
recogniser_rule formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN) Translate as "(just) as 'Y' is from X" -0.00026 35 0.1677 0.2105
recogniser_rule formula: X (AUTHOR NAME) + Y (NUMERAL) Translate as "X, book Y" -0.00019 31 0.1637 0.2099
recogniser_rule gloss: ἔθνος τό people -0.00019 16 0.1559 0.2032
recogniser_rule formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME) Translate as "Χ, in his *Y*" -0.00017 17 0.1616 0.2026
recogniser_rule formula: ὡς + X (AUTHOR NAME) Translate as "as per X" -0.00016 41 0.1781 0.2077
recogniser_rule formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME) Translate as "X, in book Y of his *Z*" -0.00011 11 0.1692 0.1989
vocabulary εθνικον -0.00011 57 0.1742 0.2236
recogniser_rule formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN) Translate as "just as 'Y' is from the name X" -0.00010 19 0.1781 0.1998

Generated: 2026-06-25 12:20:32 UTC. Recogniser detector version: translation_guidance_scan_v4.