By the end of long workflows
Yes, this has been known for 10 years.
Huh? The kind of “long workflows” this paper is discussing didn’t exist two years ago, much less 10.
It doesn’t matter. The principle is that if x is the length of your context window, then at 0.4x the chance of hallucinations starts increasing exponentially. We’re now at token windows of 1M, and all that does is push the point where hallucinations set in further out, so the model ‘feels’ stronger because it takes longer before it hallucinates, but eventually it always does.
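To put numbers on that rule of thumb, here is a minimal sketch. The 0.4 factor is just the figure claimed in the comment above, not a measured or established constant, and `claimed_onset` is a made-up helper name for illustration.

```python
def claimed_onset(context_window: int, fraction: float = 0.4) -> int:
    """Token count at which, per the claim above, hallucinations start climbing.

    `fraction` defaults to the 0.4 quoted in the thread; it is an assumption,
    not an empirically validated value.
    """
    return round(context_window * fraction)


if __name__ == "__main__":
    # Example window sizes chosen for illustration only.
    for window in (8_000, 128_000, 1_000_000):
        print(f"window {window:>9,} tokens -> claimed onset at {claimed_onset(window):>7,} tokens")
```

Under that assumption, a 1M-token window would put the onset at 400,000 tokens: later in absolute terms, but at the same relative position in the window.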



