Epsilon gate

A convergence threshold (eps) used as a hard pass/fail gate, and the silent failure it causes when the same code runs on different hardware.

An epsilon gate is a pass/fail check built around a number that’s supposed to be almost zero. Many iterative algorithms stop when some error measure drops below a tiny threshold called epsilon (often written eps, a value like 0.002): close enough, declare success, hand back the answer. The trouble starts when the code treats that threshold as a hard gate, returning a result only if the error makes it under epsilon. Anything that lands a hair short gets thrown away, even when it’s a perfectly good answer.
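As a minimal sketch of the hard-gate pattern, consider a toy gradient descent on a made-up loss, (x - 3)**2 + floor, where floor stands in for whatever residual error a given machine's arithmetic leaves behind. The function name gated_search, the loss, and the floor parameter are all assumptions chosen for illustration, not anything taken from a real codebase.

```python
from typing import Optional

EPS = 0.002  # hardcoded threshold, tuned once and then forgotten


def gated_search(floor: float, eps: float = EPS, max_steps: int = 200) -> Optional[float]:
    """Toy gradient descent on loss(x) = (x - 3)**2 + floor.

    `floor` is a hypothetical stand-in for the residual error a backend
    cannot get below. The result is returned only if loss < eps.
    """
    x = 0.0
    lr = 0.1
    for _ in range(max_steps):
        loss = (x - 3.0) ** 2 + floor
        if loss < eps:
            return x  # made it under the gate
        x -= lr * 2.0 * (x - 3.0)  # plain gradient step
    return None  # hard gate: the near-miss is thrown away


print(gated_search(floor=0.0019))  # a value close to 3.0
print(gated_search(floor=0.0021))  # None, although x ended up just as close to 3.0
```

Both runs drive x to essentially the same place; only the second one comes back empty, because its best achievable loss sits a hair above eps.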

That’s fine until the arithmetic shifts underneath it. The same computation run on a different backend, CUDA versus MPS versus the CPU, will not produce bit-identical numbers, because floating-point precision and the order of operations differ from one machine to the next. A search that used gradient descent to reach 0.0019 on the GPU it was tuned for might settle at 0.0021 somewhere else: the same quality of answer, missing the gate by a rounding error. If epsilon was hardcoded for one machine, the program can silently produce nothing on another. No crash, no warning, just an empty output folder and a confusing afternoon.
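The order-of-operations effect is easy to reproduce on a single machine. The sketch below adds the same three numbers in two different orders with an explicit loop, which is a simplified stand-in for the way different backends may reduce a sum in different orders. The helper name accumulate is an illustrative assumption.

```python
def accumulate(values):
    """Add values strictly left to right, one float addition at a time."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]
forward = accumulate(values)
backward = accumulate(list(reversed(values)))

print(forward == backward)       # False
print(abs(forward - backward))   # tiny, but not zero
```

The two totals differ only in the last bit, which is harmless on its own. It matters when a hard threshold happens to sit between the two values a computation can land on.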

The fix is to stop treating epsilon as a publication gate and treat it as a stopping hint: keep going until the error is under the threshold or you run out of steps, then always return the best result the run actually found. Epsilon decides when to stop looking, not whether the answer is allowed to exist. A hardcoded convergence threshold is a close cousin of any magic number tuned on one setup, the kind of buried assumption that turns into a silent failure the moment the ground moves: a new GPU, a different floating-point precision, a fresh library version.
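One way the stopping-hint version might look, sketched on a toy problem: gradient descent on a made-up loss, (x - 3)**2 + floor, where floor is a hypothetical stand-in for the residual error a particular backend leaves behind. The names SearchResult and search_with_hint are assumptions for illustration. The point is only the shape of the control flow: eps ends the loop early, and the best candidate is returned either way, with a flag recording whether the threshold was met.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    x: float          # best candidate seen during the run
    loss: float       # loss at that candidate
    converged: bool   # True if loss dropped below eps
    steps: int        # gradient steps taken before stopping


def search_with_hint(floor: float, eps: float = 0.002, max_steps: int = 200) -> SearchResult:
    """Toy gradient descent on loss(x) = (x - 3)**2 + floor.

    eps is only a stopping hint: the best result is always returned.
    """
    x = 0.0
    lr = 0.1
    best_x, best_loss = x, float("inf")
    for step in range(max_steps):
        loss = (x - 3.0) ** 2 + floor
        if loss < best_loss:
            best_x, best_loss = x, loss  # remember the best so far
        if loss < eps:
            return SearchResult(best_x, best_loss, True, step)
        x -= lr * 2.0 * (x - 3.0)  # plain gradient step
    # Out of steps: report the near-miss instead of discarding it.
    return SearchResult(best_x, best_loss, False, max_steps)


result = search_with_hint(floor=0.0021)
if not result.converged:
    print(f"warning: stopped at loss {result.loss:.4f}, above eps")
print(result.x)  # still a usable answer near 3.0
```

Returning the converged flag alongside the result lets the caller decide how loudly to complain, so a near-miss shows up as a warning rather than an empty output folder.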