Why AI Fabricates Greek and How to Catch It
AI's most confident-sounding errors are often in Greek and Hebrew detail. The discipline is to know why and to check.
AI tools used in sermon prep produce, with some regularity, Greek and Hebrew claims that look plausible and are wrong. The errors are not random. They cluster in particular kinds of claims, and once you know the pattern you can catch most of them with a habit of checking.
Why the errors cluster here
A language model generates text by predicting plausible sequences. For most text, plausibility tracks accuracy reasonably well — the model has seen enough examples that the most plausible continuation is also the right one. Greek and Hebrew claims are different in two ways.
First, the training data is much sparser. The model has seen far fewer careful lexicographic discussions of any given Greek word than it has seen discussions of a common English-language topic. Sparse data leaves more room for plausible-but-wrong generations.
Second, the training data contains both careful scholarship on the Greek and a substantial amount of devotional or homiletic material that uses Greek casually and sometimes incorrectly. Both bodies of material discuss Greek words with confidence, and the model does not always distinguish the careful from the casual when generating new claims.
The result is that a model asked about a Greek word will sometimes produce careful, BDAG-grade material, and sometimes produce material that reads like a sermon-circuit etymology — confident, vivid, and not what the lexicon would say.
The patterns to watch for
Four patterns of fabrication recur often enough to be worth naming.
Etymological readings. The model sometimes produces a meaning derived from the word's components rather than from usage. This is the classic root fallacy, presented as if it were a lexical claim. Watch for any claim that says "the word literally means" followed by a parts-based reading.
Spurious overtones. The model produces a list of "shades of meaning" or "nuances" the word carries that go beyond what the lexicon supports. Often these read as if the word carried every related sense at once. Watch for the word "carries" or "connotes" attached to a list that goes beyond the lexicon's range.
Confident classifications. The model claims that the word in a passage is "almost always" or "consistently" used in a particular sense, when in fact the word's usage is varied. Watch for the certainty of the framing.
Invented citations. The model cites a source that does not exist, or a source that exists but does not say what is claimed. Such a citation looks real, with author, work, and page, even when it is hallucinated. Watch for any specific page or section number you have not verified yourself.
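For anyone who keeps AI output in text files, the four watch-phrases above can be turned into a rough first-pass filter. The sketch below is illustrative only: the pattern names, the regular expressions, and the function name flag_claims are assumptions made for this example, and the sample text is invented. A flag is a prompt to open the lexicon, not a verdict that the claim is wrong, and a claim that trips no pattern is not thereby verified.

```python
import re

# Hypothetical watch-list. The trigger phrases come from the four patterns
# described above; the regular expressions are illustrative assumptions.
WATCH_PATTERNS = {
    "etymological reading": re.compile(r"\bliterally means\b", re.IGNORECASE),
    "spurious overtones": re.compile(r"\b(?:carries|connotes)\b", re.IGNORECASE),
    "confident classification": re.compile(
        r"\b(?:almost always|consistently)\b", re.IGNORECASE
    ),
    # Any page or section number counts as "not yet verified".
    "citation to verify": re.compile(
        r"\bpp?\.\s*\d+|\bpage\s+\d+|§\s*\d+", re.IGNORECASE
    ),
}


def flag_claims(text):
    """Return (pattern_name, matched_phrase) pairs, grouped by pattern."""
    flags = []
    for name, pattern in WATCH_PATTERNS.items():
        for match in pattern.finditer(text):
            flags.append((name, match.group(0)))
    return flags


if __name__ == "__main__":
    # Invented sample text, not a real lexical claim about any word.
    sample = (
        "This word literally means 'up-lift'. It carries warmth and duty. "
        "It is almost always used this way (see p. 123)."
    )
    for name, phrase in flag_claims(sample):
        print(f"{name}: {phrase!r}")
```

The filter only narrows where to look. Each flagged sentence still needs the lexicon check described in the next section.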
The check
The check is small. For any AI-surfaced Greek or Hebrew claim that crosses the pulpit threshold, open the actual lexicon. Read the entry. Compare the lexicon's range to the claim. If the claim is inside the range and the construction supports it, the claim is defensible. If the claim is outside the range, or asserts a sense the lexicon does not catalog, the claim is the model's fabrication and should be cut.
This is the same check a careful minister would do for any reference work, including a printed commentary that might have been wrong. AI does not change the discipline; it changes the volume of candidate claims that need the discipline applied.
What to do when you cannot do the check
If you cannot open the lexicon — you do not have access, you are short on time, you are working from memory — soften the claim until it does not depend on lexical detail. "The Greek word here can carry the sense of..." is honest without overclaiming. "The Greek word here means..." with no lexicon check behind it is overclaiming.
The minister who builds this habit catches most of the AI's Greek fabrications before they reach the pulpit. The minister who does not eventually preaches one and has to apologize. Both outcomes are common. The first is the one to aim for.