Translation Quality Predictor

This page tests whether current source vocabulary, translation-guidance recogniser matches, and mean v3 translation length can predict which approved-human passages ordinary legacy_scholarly v3 translates badly. Reasoning profiles and the special-temperature v3 experiment lanes are excluded.

Best current model: Greek vocabulary + translation length predicting 2-gram F1 badness, CV R^2 0.394, Spearman r 0.640.

Sentence-level alignment and metric rows exist; passage-level modeling remains the current daily output.

Metric Engines

Metric	Engine
BLEU-4	SacreBLEU sentence BLEU-4
METEOR	NLTK METEOR with WordNet synonyms
ROUGE-L	rouge-score ROUGE-L with stemming
chrF++	SacreBLEU chrF++ with word_order=2

Approved human passages: 101

Scored v3 passages: 101

Completed v3 runs: 118

Mean runs per passage: 1.17

Predictability By Metric

Targets are badness measures: for score metrics, a larger value means a lower translation score; for length, a larger value means a larger absolute word-count error. Cross-validation uses fixed five-fold splits where possible. The sample is small (101 passages), so a negative CV R^2, meaning the model predicts held-out passages worse than a constant mean baseline, should be read as evidence that the feature family is not currently useful for that metric.
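The cross-validated R^2 column can be illustrated with a minimal sketch for the one-feature case (the Translation length family). It assumes details this report does not state: folds are assigned by passage index modulo five, the intercept is not penalised, and R^2 is computed once over the pooled out-of-fold predictions. All function names are hypothetical and are not taken from the pipeline.

```python
from statistics import mean


def fixed_folds(n, k=5):
    """Assign item i to fold i % k, so every rerun uses the same splits."""
    return [[i for i in range(n) if i % k == f] for f in range(k)]


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge for a single feature with an unpenalised intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope towards zero
    return slope, my - slope * mx


def cv_r2(xs, ys, alpha, k=5):
    """Pooled out-of-fold R^2: 1 - SS_res / SS_tot on held-out predictions."""
    preds = [0.0] * len(xs)
    for test_idx in fixed_folds(len(xs), k):
        held_out = set(test_idx)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_ridge_1d(
            [xs[i] for i in train], [ys[i] for i in train], alpha
        )
        for i in test_idx:
            preds[i] = intercept + slope * xs[i]
    overall = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - overall) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data only: an exactly linear target is recovered almost perfectly.
xs = [float(i) for i in range(20)]
linear_target = [0.1 + 0.02 * x for x in xs]
print(cv_r2(xs, linear_target, alpha=0.0))  # very close to 1
```

With a very large alpha the slope shrinks towards zero, each fold predicts roughly its own training mean, and the pooled R^2 can fall slightly below zero. That is one plausible reading of rows with a ridge alpha at or near 10000 and an R^2 just under zero, though the report itself does not confirm the mechanism.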

Feature family	Target	Status	Passages	Features	CV R^2	Spearman r	CV MAE	MAE lift	Worst-quartile precision	Ridge alpha
Greek vocabulary + translation length	2-gram F1 badness	ok	101	207	0.394	0.640	0.1123	0.0331	61.5%	0.3257
Greek vocabulary + translation length	chrF++ badness	ok	101	207	0.372	0.679	0.0696	0.0227	61.5%	0.4924
Greek vocabulary	2-gram F1 badness	ok	101	206	0.370	0.596	0.1149	0.0304	50.0%	0.3257
Greek vocabulary + translation length	3-gram F1 badness	ok	101	207	0.369	0.598	0.1479	0.0374	61.5%	0.3257
Greek vocabulary + translation length	ROUGE-L badness	ok	101	207	0.360	0.663	0.0714	0.0207	61.5%	0.3257
Greek vocabulary + translation length	3-gram Jaccard badness	ok	101	207	0.356	0.610	0.1576	0.0359	61.5%	0.4924
Greek vocabulary + translation length	BLEU-4 badness	ok	101	207	0.355	0.647	0.1269	0.0390	65.4%	0.4924
Greek vocabulary + translation length	Sentence BLEU badness	ok	101	207	0.355	0.647	0.1269	0.0390	65.4%	0.4924
Greek vocabulary + translation length	METEOR badness	ok	101	207	0.347	0.650	0.0782	0.0199	61.5%	0.4924
Greek vocabulary	3-gram F1 badness	ok	101	206	0.342	0.560	0.1506	0.0347	57.7%	0.4924
Greek vocabulary	chrF++ badness	ok	101	206	0.338	0.617	0.0723	0.0199	57.7%	0.4924
Greek vocabulary	3-gram Jaccard badness	ok	101	206	0.334	0.552	0.1592	0.0343	50.0%	0.4924
Greek vocabulary	METEOR badness	ok	101	206	0.330	0.594	0.0781	0.0201	50.0%	0.4924
Greek vocabulary	ROUGE-L badness	ok	101	206	0.330	0.605	0.0733	0.0188	50.0%	0.3257
Greek vocabulary	BLEU-4 badness	ok	101	206	0.326	0.596	0.1287	0.0373	65.4%	0.4924
Greek vocabulary	Sentence BLEU badness	ok	101	206	0.326	0.596	0.1287	0.0373	65.4%	0.4924
Translation length	chrF++ badness	ok	101	1	0.164	0.442	0.0835	0.0087	44.4%	13.4340
Translation length	ROUGE-L badness	ok	101	1	0.129	0.378	0.0847	0.0075	46.2%	20.3092
Translation length	METEOR badness	ok	101	1	0.116	0.334	0.0914	0.0068	42.3%	20.3092
Translation length	BLEU-4 badness	ok	101	1	0.102	0.358	0.1535	0.0124	33.3%	13.4340
Translation length	Sentence BLEU badness	ok	101	1	0.102	0.358	0.1535	0.0124	33.3%	13.4340
Translation length	3-gram Jaccard badness	ok	101	1	0.075	0.249	0.1811	0.0125	37.0%	8.8862
Translation length	2-gram F1 badness	ok	101	1	0.072	0.281	0.1364	0.0090	38.5%	13.4340
Greek vocabulary	Absolute length percent error	ok	101	206	0.068	0.129	0.0408	0.0008	30.8%	1.7013
Greek vocabulary + translation length	Absolute length percent error	ok	101	207	0.062	0.107	0.0409	0.0007	30.8%	1.7013
Vocabulary + recognisers + translation length	METEOR badness	ok	101	288	0.060	0.327	0.0950	0.0032	46.2%	4375.4794
Recogniser rules + translation length	METEOR badness	ok	101	82	0.059	0.325	0.0950	0.0032	46.2%	4375.4794
Vocabulary + recognisers + translation length	ROUGE-L badness	ok	101	288	0.059	0.523	0.0831	0.0090	61.5%	1.7013
Vocabulary + recognisers	METEOR badness	ok	101	287	0.058	0.323	0.0951	0.0031	42.3%	4375.4794
Recogniser rules	METEOR badness	ok	101	81	0.058	0.322	0.0951	0.0031	42.3%	4375.4794
Vocabulary + recognisers + translation length	chrF++ badness	ok	101	288	0.058	0.339	0.0892	0.0030	50.0%	2894.2661
Recogniser rules + translation length	chrF++ badness	ok	101	82	0.057	0.339	0.0892	0.0030	50.0%	2894.2661
Translation length	3-gram F1 badness	ok	101	1	0.055	0.240	0.1749	0.0104	33.3%	13.4340
Vocabulary + recognisers	chrF++ badness	ok	101	287	0.054	0.327	0.0894	0.0028	50.0%	2894.2661
Recogniser rules	chrF++ badness	ok	101	81	0.053	0.323	0.0895	0.0028	46.2%	2894.2661
Recogniser rules + translation length	ROUGE-L badness	ok	101	82	0.052	0.323	0.0889	0.0032	42.3%	4375.4794
Vocabulary + recognisers	ROUGE-L badness	ok	101	287	0.050	0.307	0.0890	0.0031	42.3%	4375.4794
Recogniser rules	ROUGE-L badness	ok	101	81	0.050	0.305	0.0891	0.0031	42.3%	4375.4794
Vocabulary + recognisers + translation length	3-gram Jaccard badness	ok	101	288	0.040	0.436	0.1828	0.0108	50.0%	5.8780
Vocabulary + recognisers + translation length	BLEU-4 badness	ok	101	288	0.038	0.218	0.1625	0.0035	42.3%	4375.4794
Vocabulary + recognisers + translation length	Sentence BLEU badness	ok	101	288	0.038	0.218	0.1625	0.0035	42.3%	4375.4794
Recogniser rules + translation length	BLEU-4 badness	ok	101	82	0.038	0.216	0.1625	0.0035	42.3%	4375.4794
Recogniser rules + translation length	Sentence BLEU badness	ok	101	82	0.038	0.216	0.1625	0.0035	42.3%	4375.4794
Vocabulary + recognisers	BLEU-4 badness	ok	101	287	0.036	0.212	0.1627	0.0033	42.3%	4375.4794
Vocabulary + recognisers	Sentence BLEU badness	ok	101	287	0.036	0.212	0.1627	0.0033	42.3%	4375.4794
Recogniser rules	BLEU-4 badness	ok	101	81	0.036	0.210	0.1627	0.0032	42.3%	4375.4794
Recogniser rules	Sentence BLEU badness	ok	101	81	0.036	0.210	0.1627	0.0032	42.3%	4375.4794
Vocabulary + recognisers + translation length	2-gram F1 badness	ok	101	288	0.033	0.242	0.1422	0.0032	42.3%	4375.4794
Recogniser rules + translation length	2-gram F1 badness	ok	101	82	0.033	0.240	0.1422	0.0032	42.3%	4375.4794
Vocabulary + recognisers	2-gram F1 badness	ok	101	287	0.031	0.231	0.1424	0.0030	42.3%	4375.4794
Recogniser rules	2-gram F1 badness	ok	101	81	0.031	0.228	0.1424	0.0030	42.3%	4375.4794
Vocabulary + recognisers	Absolute length percent error	ok	101	287	0.028	0.167	0.0420	-0.0004	46.2%	837.6776
Recogniser rules	Absolute length percent error	ok	101	81	0.028	0.165	0.0420	-0.0004	46.2%	837.6776
Vocabulary + recognisers + translation length	Absolute length percent error	ok	101	288	0.028	0.164	0.0420	-0.0004	46.2%	837.6776
Recogniser rules + translation length	Absolute length percent error	ok	101	82	0.027	0.167	0.0420	-0.0004	46.2%	837.6776
Vocabulary + recognisers + translation length	3-gram F1 badness	ok	101	288	0.026	0.230	0.1813	0.0040	34.6%	2894.2661
Recogniser rules + translation length	3-gram F1 badness	ok	101	82	0.026	0.229	0.1813	0.0040	34.6%	2894.2661
Vocabulary + recognisers	3-gram F1 badness	ok	101	287	0.024	0.221	0.1815	0.0038	34.6%	2894.2661
Recogniser rules	3-gram F1 badness	ok	101	81	0.024	0.193	0.1819	0.0034	34.6%	4375.4794
Recogniser rules + translation length	3-gram Jaccard badness	ok	101	82	0.023	0.227	0.1903	0.0033	38.5%	2894.2661
Vocabulary + recognisers	3-gram Jaccard badness	ok	101	287	0.021	0.224	0.1905	0.0030	38.5%	2894.2661
Recogniser rules	3-gram Jaccard badness	ok	101	81	0.021	0.222	0.1905	0.0030	38.5%	2894.2661
Translation length	Absolute length percent error	ok	101	1	-0.002	-0.077	0.0416	-0.0001	19.2%	10000.0000
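The MAE lift and worst-quartile precision columns are not defined on this page, so the following is a sketch under assumed definitions: MAE lift is the mean-baseline MAE minus the model MAE (positive means the model helps), and worst-quartile precision is the share of passages in the top quarter by predicted badness that are also in the top quarter by observed badness. The function names are hypothetical, and ties are ignored here.

```python
import math
from statistics import mean


def mae(observed, predicted):
    """Mean absolute error between observed and predicted badness."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def mae_lift(observed, predicted):
    """Assumed definition: baseline MAE (predict the mean) minus model MAE."""
    baseline = mean(observed)
    return mae(observed, [baseline] * len(observed)) - mae(observed, predicted)


def top_quartile(values):
    """Indices of the ceil(n / 4) largest values (ties broken by position)."""
    k = math.ceil(len(values) / 4)
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[:k])


def worst_quartile_precision(observed, predicted):
    """Share of flagged (predicted-worst) items that are truly among the worst."""
    flagged = top_quartile(predicted)
    truly_bad = top_quartile(observed)
    return len(flagged & truly_bad) / len(flagged)


# Illustrative toy values, not taken from the table above.
observed = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
predicted = [0.15, 0.1, 0.35, 0.3, 0.7, 0.5, 0.6, 0.9]
print(mae_lift(observed, predicted))
print(worst_quartile_precision(observed, predicted))
```

Under these assumptions, a precision of 61.5% on 101 passages is consistent with 16 correct out of 26 flagged passages, since ceil(101 / 4) = 26. This is an inference from the printed percentages, not a documented definition; in the real cross-validation the baseline would presumably be the training-fold mean rather than the overall mean.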

Highest Predicted Risk

This list uses the best cross-validated model in this run (Greek vocabulary + translation length) and sorts passages by predicted 2-gram F1 badness, highest first.

Lemma	ID	v3 runs	Source words	Observed badness	Predicted badness	BLEU-4	chrF++	3-gram F1	Length error
Καρία	2484	1	181.0	0.4484	0.6604	42.0%	65.8%	42.2%	12.6%
Κασώριον	2623	1	14.0	0.5556	0.6126	32.1%	66.0%	35.3%	10.0%
Κάρυστος	2603	1	132.0	0.4327	0.5945	46.5%	69.5%	41.5%	0.6%
Καλάσιρις	2085	1	10.0	0.6923	0.5179	32.3%	69.5%	8.3%	15.4%
Κάλυτις	2335	2	16.0	0.6765	0.5155	31.9%	60.3%	14.7%	5.8%
Κριώα	3530	1	16.0	0.4894	0.5146	38.0%	71.1%	31.1%	11.5%
Καδμεία	2059	1	17.0	0.5000	0.4971	41.9%	68.2%	36.8%	10.0%
Καρχηδών	2604	1	88.0	0.4897	0.4941	35.0%	63.9%	35.7%	9.4%
Καππαδοκία	2470	2	57.0	0.4280	0.4923	20.0%	56.8%	41.7%	3.7%
Κοτιάειον	3496	1	46.0	0.5373	0.4854	41.0%	62.3%	34.8%	8.5%
Καταονία	2628	1	17.0	0.5319	0.4780	36.2%	67.4%	31.1%	4.2%
Καλαβρία	2080	1	12.0	0.6250	0.4718	23.6%	66.7%	13.3%	11.1%
Κύρνος	7247	1	34.0	0.3895	0.4620	47.1%	69.3%	51.6%	6.0%
Κύτα	7254	1	58.0	0.4000	0.4495	51.9%	73.6%	47.6%	4.8%
Καπετώλιον	2468	1	86.0	0.5827	0.4493	30.1%	50.3%	31.0%	14.5%
Κάναστρον	2455	1	43.0	0.5826	0.4492	32.0%	62.9%	26.5%	5.0%
Καρπασία	2597	1	80.0	0.3767	0.4414	54.3%	75.2%	51.6%	6.2%
Κώμη	7266	1	53.0	0.7432	0.4163	13.8%	47.6%	13.7%	10.1%
Κατάνη	2626	1	64.0	0.4945	0.4159	43.8%	64.8%	36.7%	6.3%
Κυτέριον	7255	1	18.0	0.4167	0.4147	53.0%	76.1%	43.5%	27.3%
Κάλπη	2329	1	36.0	0.2000	0.4054	81.0%	89.1%	72.2%	3.5%
Καβασσός	2055	1	69.0	0.5048	0.4053	41.6%	67.5%	32.7%	5.5%
Καικῖνον	2074	1	6.0	0.6842	0.4048	21.4%	66.4%	0.0%	9.1%
Κωνώπη	7267	1	47.0	0.3667	0.4033	57.6%	75.7%	50.8%	9.4%
Κάλλατις	2119	1	45.0	0.4887	0.4018	31.8%	63.9%	30.5%	7.7%
Κάσος	2607	1	44.0	0.4464	0.3991	33.7%	61.5%	43.6%	0.0%
Καβελλιών	2057	1	23.0	0.2836	0.3934	45.8%	75.0%	55.4%	2.9%
Κάληρος	2116	1	22.0	0.5890	0.3854	29.3%	61.2%	28.2%	12.5%
Κωλιάς	7264	1	42.0	0.3445	0.3853	56.9%	76.7%	53.0%	1.6%
Κάσιον	2605	1	43.0	0.3158	0.3812	53.0%	75.6%	55.4%	7.1%
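The ordering of the risk list, and the gap between observed and predicted badness, can be reproduced with a short sketch. The four rows below are copied from the table above; the `Row` type and the `residual` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    lemma: str
    observed: float   # observed 2-gram F1 badness
    predicted: float  # predicted 2-gram F1 badness


# Four rows taken from the Highest Predicted Risk table.
rows = [
    Row("Κάλπη", 0.2000, 0.4054),
    Row("Κώμη", 0.7432, 0.4163),
    Row("Καρία", 0.4484, 0.6604),
    Row("Καλάσιρις", 0.6923, 0.5179),
]


def residual(row):
    """Observed minus predicted: positive means the model under-predicted."""
    return row.observed - row.predicted


# Same ordering rule as the table: highest predicted badness first.
by_risk = sorted(rows, key=lambda r: r.predicted, reverse=True)
most_underpredicted = max(rows, key=residual)
most_overpredicted = min(rows, key=residual)

for row in by_risk:
    print(f"{row.lemma}\t{row.predicted:.4f}\t{residual(row):+.4f}")
```

Sorting by predicted rather than observed badness means a passage such as Κώμη, which has the highest observed badness in the list, can still sit well down the ranking; the residual makes such cases easy to spot.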

Predictive Features

Translation Length: chrF++ badness

Positive coefficients predict worse translation scores for the selected target. Negative coefficients predict better scores. Translation length is z-scored; vocabulary features exclude detected proper-noun tokens. These are exploratory ridge coefficients, not causal claims.

Features associated with worse scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
translation_length	Mean v3 translation word count	z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile	0.04422	101	0.3137	0.1565
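As a worked reading of the row above, the sketch below z-scores a word count with the stated mean (43.1) and SD (36.7) and multiplies it by the reported coefficient. It assumes the model is linear in the z-scored feature, so the product is the shift in predicted chrF++ badness relative to a passage of average length; the function names are hypothetical.

```python
MEAN_WORDS = 43.1    # mean v3 translation word count, from the Detail column
SD_WORDS = 36.7      # standard deviation, from the Detail column
COEFFICIENT = 0.04422  # ridge coefficient for chrF++ badness, from the table


def z_score(word_count):
    """Standardise a translation word count with the reported mean and SD."""
    return (word_count - MEAN_WORDS) / SD_WORDS


def length_contribution(word_count, coefficient=COEFFICIENT):
    """Assumed linear effect: coefficient times the z-scored word count."""
    return coefficient * z_score(word_count)


# A translation one SD longer than average (43.1 + 36.7 = 79.8 words)
# shifts predicted chrF++ badness by roughly the coefficient itself.
print(z_score(79.8))
print(length_contribution(79.8))
```

Because the feature is standardised, the coefficient can be read directly as the change in predicted badness per 36.7 additional words, which is an association in this sample rather than a causal effect.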

Features associated with better scores

None. The only feature in this family, mean v3 translation word count, has a positive coefficient.

Vocabulary Terms: 2-gram F1 badness

Features associated with worse scores

Type	Feature	Coefficient	Passages	Mean badness present	Mean badness absent
vocabulary	χωριον	0.25588	2	0.5835	0.3334
vocabulary	οικητωρ	0.20685	6	0.5544	0.3247
vocabulary	επι	0.17838	2	0.6113	0.3328
vocabulary	τον	0.17386	10	0.4599	0.3250
vocabulary	τοις	0.17006	3	0.4885	0.3338
vocabulary	ωστε	0.16307	3	0.5518	0.3318
vocabulary	και φασι	0.15998	2	0.5012	0.3351
vocabulary	οικητωρ και	0.15359	3	0.6190	0.3298
vocabulary	καλειται	0.15359	2	0.4723	0.3356
vocabulary	και	0.14659	66	0.3828	0.2546
vocabulary	ει	0.14567	4	0.5657	0.3290
vocabulary	εκαλειτο	0.13830	7	0.4612	0.3292

Features associated with better scores

Type	Feature	Coefficient	Passages	Mean badness present	Mean badness absent
vocabulary	τεταρτω	-0.28307	3	0.0929	0.3459
vocabulary	εθνικον	-0.21955	57	0.3037	0.3833
vocabulary	εβδομη	-0.20173	2	0.2444	0.3403
vocabulary	ως εθνικον	-0.17735	5	0.1882	0.3462
vocabulary	μεταξυ και	-0.15902	5	0.2941	0.3407
vocabulary	μεταξυ	-0.15902	5	0.2941	0.3407
vocabulary	πορρω	-0.15293	3	0.0962	0.3458
vocabulary	ου πορρω	-0.15293	3	0.0962	0.3458
vocabulary	πολιτης	-0.13977	18	0.3404	0.3379
vocabulary	παιδος	-0.13969	4	0.2105	0.3436
vocabulary	απο παιδος	-0.13969	4	0.2105	0.3436
vocabulary	εν	-0.13932	37	0.3444	0.3349

Vocabulary Terms + Translation Length: 2-gram F1 badness

Features associated with worse scores

Type	Feature	Coefficient	Passages	Mean badness present	Mean badness absent
vocabulary	χωριον	0.26148	2	0.5835	0.3334
vocabulary	οικητωρ	0.20978	6	0.5544	0.3247
vocabulary	επι	0.18030	2	0.6113	0.3328
vocabulary	και φασι	0.17593	2	0.5012	0.3351
vocabulary	οικητωρ και	0.16831	3	0.6190	0.3298
vocabulary	τοις	0.15941	3	0.4885	0.3338
vocabulary	τον	0.15662	10	0.4599	0.3250
vocabulary	μοιρα	0.15277	2	0.6121	0.3328
vocabulary	καλειται	0.15045	2	0.4723	0.3356
vocabulary	ωστε	0.14658	3	0.5518	0.3318
vocabulary	και	0.14300	66	0.3828	0.2546
vocabulary	δευτερω	0.13128	3	0.5157	0.3329

Features associated with better scores

Type	Feature	Coefficient	Passages	Mean badness present	Mean badness absent
vocabulary	τεταρτω	-0.26840	3	0.0929	0.3459
vocabulary	εθνικον	-0.20535	57	0.3037	0.3833
vocabulary	εβδομη	-0.19872	2	0.2444	0.3403
vocabulary	ως εθνικον	-0.17192	5	0.1882	0.3462
vocabulary	εν	-0.16332	37	0.3444	0.3349
vocabulary	μεταξυ και	-0.15855	5	0.2941	0.3407
vocabulary	μεταξυ	-0.15855	5	0.2941	0.3407
vocabulary	ως εν	-0.14623	7	0.3104	0.3404
vocabulary	πορρω	-0.14466	3	0.0962	0.3458
vocabulary	ου πορρω	-0.14466	3	0.0962	0.3458
vocabulary	παιδος	-0.13855	4	0.2105	0.3436
vocabulary	απο παιδος	-0.13855	4	0.2105	0.3436

Recogniser Rules: METEOR badness

Features associated with worse scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	gloss occurrence count		0.00194	96	0.1998	0.1167
recogniser_summary	gloss rule count		0.00153	96	0.1998	0.1167
recogniser_summary	matched occurrence count		0.00105	101	0.1957	N/A
recogniser_summary	matched rule count		0.00038	101	0.1957	N/A
recogniser_rule	formula: X (SETTLEMENT) + Y (genitive REGION)	Translate as "a X in Y"	0.00028	63	0.2110	0.1704
recogniser_rule	formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON)	Translate as "'X' is from Y"	0.00026	36	0.2116	0.1869
recogniser_rule	formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN)	Translate as "X, X"	0.00020	20	0.2275	0.1879
recogniser_rule	gloss: οἰκήτωρ ὁ	inhabitant, resident, patron (of a brothel...? - κ123)	0.00018	6	0.3660	0.1849
recogniser_rule	gloss: ἄκρον τό	cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland)	0.00018	8	0.2890	0.1877
recogniser_rule	gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...)	is/used to be/is/was called/named after (+ ἀπο)	0.00018	14	0.2488	0.1872
recogniser_rule	gloss: πόλισμα τό *	town	0.00016	5	0.2852	0.1910
recogniser_rule	formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN)	Translate as "Y is also 'X'"	0.00012	13	0.2129	0.1932

Features associated with better scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	formula rule count		-0.00120	100	0.1949	0.2718
recogniser_summary	formula occurrence count		-0.00094	100	0.1949	0.2718
recogniser_rule	formula: τὸ ἐθνικὸν + X (nominative ETHNONYM)	Translate as "the ethnonym is 'X'"	-0.00029	62	0.1798	0.2210
recogniser_rule	formula: X (nominative) + ὡς + Y (nominative HOMOMORPH)	Translate as "'X' as in 'Y'"	-0.00028	43	0.1737	0.2120
recogniser_rule	formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN)	Translate as "(just) as 'Y' is from X"	-0.00026	35	0.1677	0.2105
recogniser_rule	formula: X (AUTHOR NAME) + Y (NUMERAL)	Translate as "X, book Y"	-0.00019	31	0.1637	0.2099
recogniser_rule	gloss: ἔθνος τό	people	-0.00019	16	0.1559	0.2032
recogniser_rule	formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME)	Translate as "Χ, in his Y"	-0.00017	17	0.1616	0.2026
recogniser_rule	formula: ὡς + X (AUTHOR NAME)	Translate as "as per X"	-0.00016	41	0.1781	0.2077
recogniser_rule	formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME)	Translate as "X, in book Y of his Z"	-0.00012	11	0.1692	0.1989
recogniser_rule	formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN)	Translate as "just as 'Y' is from the name X"	-0.00011	19	0.1781	0.1998
recogniser_rule	gloss: ἐθνικόν τό	ethnonym	-0.00009	61	0.1838	0.2138

Recogniser Rules + Translation Length: METEOR badness

Features associated with worse scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	gloss occurrence count		0.00192	96	0.1998	0.1167
recogniser_summary	gloss rule count		0.00152	96	0.1998	0.1167
recogniser_summary	matched occurrence count		0.00103	101	0.1957	N/A
translation_length	Mean v3 translation word count	z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile	0.00063	101	0.2753	0.1464
recogniser_summary	matched rule count		0.00037	101	0.1957	N/A
recogniser_rule	formula: X (SETTLEMENT) + Y (genitive REGION)	Translate as "a X in Y"	0.00028	63	0.2110	0.1704
recogniser_rule	formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON)	Translate as "'X' is from Y"	0.00026	36	0.2116	0.1869
recogniser_rule	formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN)	Translate as "X, X"	0.00020	20	0.2275	0.1879
recogniser_rule	gloss: οἰκήτωρ ὁ	inhabitant, resident, patron (of a brothel...? - κ123)	0.00018	6	0.3660	0.1849
recogniser_rule	gloss: ἄκρον τό	cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland)	0.00018	8	0.2890	0.1877
recogniser_rule	gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...)	is/used to be/is/was called/named after (+ ἀπο)	0.00018	14	0.2488	0.1872
recogniser_rule	gloss: πόλισμα τό *	town	0.00015	5	0.2852	0.1910

Features associated with better scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	formula rule count		-0.00120	100	0.1949	0.2718
recogniser_summary	formula occurrence count		-0.00094	100	0.1949	0.2718
recogniser_rule	formula: τὸ ἐθνικὸν + X (nominative ETHNONYM)	Translate as "the ethnonym is 'X'"	-0.00029	62	0.1798	0.2210
recogniser_rule	formula: X (nominative) + ὡς + Y (nominative HOMOMORPH)	Translate as "'X' as in 'Y'"	-0.00028	43	0.1737	0.2120
recogniser_rule	formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN)	Translate as "(just) as 'Y' is from X"	-0.00026	35	0.1677	0.2105
recogniser_rule	formula: X (AUTHOR NAME) + Y (NUMERAL)	Translate as "X, book Y"	-0.00019	31	0.1637	0.2099
recogniser_rule	gloss: ἔθνος τό	people	-0.00019	16	0.1559	0.2032
recogniser_rule	formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME)	Translate as "Χ, in his Y"	-0.00017	17	0.1616	0.2026
recogniser_rule	formula: ὡς + X (AUTHOR NAME)	Translate as "as per X"	-0.00016	41	0.1781	0.2077
recogniser_rule	formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME)	Translate as "X, in book Y of his Z"	-0.00011	11	0.1692	0.1989
recogniser_rule	formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN)	Translate as "just as 'Y' is from the name X"	-0.00010	19	0.1781	0.1998
recogniser_rule	gloss: ἐθνικόν τό	ethnonym	-0.00010	61	0.1838	0.2138

Combined Model Features: METEOR badness

Features associated with worse scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	gloss occurrence count		0.00194	96	0.1998	0.1167
recogniser_summary	gloss rule count		0.00153	96	0.1998	0.1167
recogniser_summary	matched occurrence count		0.00105	101	0.1957	N/A
recogniser_summary	matched rule count		0.00038	101	0.1957	N/A
recogniser_rule	formula: X (SETTLEMENT) + Y (genitive REGION)	Translate as "a X in Y"	0.00028	63	0.2110	0.1704
recogniser_rule	formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON)	Translate as "'X' is from Y"	0.00026	36	0.2116	0.1869
recogniser_rule	formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN)	Translate as "X, X"	0.00020	20	0.2275	0.1879
recogniser_rule	gloss: οἰκήτωρ ὁ	inhabitant, resident, patron (of a brothel...? - κ123)	0.00018	6	0.3660	0.1849
recogniser_rule	gloss: ἄκρον τό	cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland)	0.00018	8	0.2890	0.1877
recogniser_rule	gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...)	is/used to be/is/was called/named after (+ ἀπο)	0.00018	14	0.2488	0.1872
recogniser_rule	gloss: πόλισμα τό *	town	0.00016	5	0.2852	0.1910
recogniser_rule	formula: καί + X (nominative PROPER NOUN) + Y (nominative PROPER NOUN)	Translate as "Y is also 'X'"	0.00012	13	0.2129	0.1932

Features associated with better scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	formula rule count		-0.00120	100	0.1949	0.2718
recogniser_summary	formula occurrence count		-0.00094	100	0.1949	0.2718
recogniser_rule	formula: τὸ ἐθνικὸν + X (nominative ETHNONYM)	Translate as "the ethnonym is 'X'"	-0.00029	62	0.1798	0.2210
recogniser_rule	formula: X (nominative) + ὡς + Y (nominative HOMOMORPH)	Translate as "'X' as in 'Y'"	-0.00028	43	0.1737	0.2120
recogniser_rule	formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN)	Translate as "(just) as 'Y' is from X"	-0.00026	35	0.1677	0.2105
recogniser_rule	formula: X (AUTHOR NAME) + Y (NUMERAL)	Translate as "X, book Y"	-0.00019	31	0.1637	0.2099
recogniser_rule	gloss: ἔθνος τό	people	-0.00019	16	0.1559	0.2032
recogniser_rule	formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME)	Translate as "Χ, in his Y"	-0.00017	17	0.1616	0.2026
recogniser_rule	formula: ὡς + X (AUTHOR NAME)	Translate as "as per X"	-0.00016	41	0.1781	0.2077
recogniser_rule	formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME)	Translate as "X, in book Y of his Z"	-0.00012	11	0.1692	0.1989
vocabulary	εθνικον		-0.00011	57	0.1742	0.2236
recogniser_rule	formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN)	Translate as "just as 'Y' is from the name X"	-0.00011	19	0.1781	0.1998

Combined Model Features + Translation Length: METEOR badness

Features associated with worse scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	gloss occurrence count		0.00192	96	0.1998	0.1167
recogniser_summary	gloss rule count		0.00152	96	0.1998	0.1167
recogniser_summary	matched occurrence count		0.00103	101	0.1957	N/A
translation_length	Mean v3 translation word count	z-scored; mean 43.1, SD 36.7 words; present/high = top quartile, absent/low = bottom quartile	0.00063	101	0.2753	0.1464
recogniser_summary	matched rule count		0.00037	101	0.1957	N/A
recogniser_rule	formula: X (SETTLEMENT) + Y (genitive REGION)	Translate as "a X in Y"	0.00028	63	0.2110	0.1704
recogniser_rule	formula: X (nominative DERIVED NOUN) + Y (nominative ETYMON)	Translate as "'X' is from Y"	0.00026	36	0.2116	0.1869
recogniser_rule	formula: X (nominative PROPER NOUN) + X (genitive PROPER NOUN)	Translate as "X, X"	0.00020	20	0.2275	0.1879
recogniser_rule	gloss: οἰκήτωρ ὁ	inhabitant, resident, patron (of a brothel...? - κ123)	0.00018	6	0.3660	0.1849
recogniser_rule	gloss: ἄκρον τό	cape (when on the coast, sgl.), headlands (when on the coast, plu.); peak (when inland)	0.00018	8	0.2890	0.1877
recogniser_rule	gloss: καλεῖται/ἐκαλεῖτο/κέκληται/ἐκλήθη (ἀπο...)	is/used to be/is/was called/named after (+ ἀπο)	0.00017	14	0.2488	0.1872
recogniser_rule	gloss: πόλισμα τό *	town	0.00015	5	0.2852	0.1910

Features associated with better scores

Type	Feature	Detail	Coefficient	Passages	Mean badness present	Mean badness absent
recogniser_summary	formula rule count		-0.00120	100	0.1949	0.2718
recogniser_summary	formula occurrence count		-0.00094	100	0.1949	0.2718
recogniser_rule	formula: τὸ ἐθνικὸν + X (nominative ETHNONYM)	Translate as "the ethnonym is 'X'"	-0.00029	62	0.1798	0.2210
recogniser_rule	formula: X (nominative) + ὡς + Y (nominative HOMOMORPH)	Translate as "'X' as in 'Y'"	-0.00028	43	0.1737	0.2120
recogniser_rule	formula: ὡς + X (nominative ETYMON) + Y (nominative DERIVED NOUN)	Translate as "(just) as 'Y' is from X"	-0.00026	35	0.1677	0.2105
recogniser_rule	formula: X (AUTHOR NAME) + Y (NUMERAL)	Translate as "X, book Y"	-0.00019	31	0.1637	0.2099
recogniser_rule	gloss: ἔθνος τό	people	-0.00019	16	0.1559	0.2032
recogniser_rule	formula: X (AUTHOR NAME) + ἐν + Y (dative BOOK NAME)	Translate as "Χ, in his Y"	-0.00017	17	0.1616	0.2026
recogniser_rule	formula: ὡς + X (AUTHOR NAME)	Translate as "as per X"	-0.00016	41	0.1781	0.2077
recogniser_rule	formula: Χ (AUTHOR NAME) + ἐν + Y (dative NUMBER) + Z (genitive BOOK NAME)	Translate as "X, in book Y of his Z"	-0.00011	11	0.1692	0.1989
vocabulary	εθνικον		-0.00011	57	0.1742	0.2236
recogniser_rule	formula: ὡς + X (definite ARTICLE + ETYMON) + Y (nominative DERIVED NOUN)	Translate as "just as 'Y' is from the name X"	-0.00010	19	0.1781	0.1998


Generated: 2026-06-25 12:20:32 UTC. Recogniser detector version: translation_guidance_scan_v4.