LLMs’ “simulated reasoning” abilities are a “brittle mirage,” researchers find

Catoblepas@piefed.blahaj.zone · 4 months ago

panda_abyss@lemmy.ca · edit-2 4 months ago

Chain of thought is basically garbage.

It works with coding agents because they get an automated hard failure.

The rest of the time it’s just sampling the latent space around a response and should be trimmed out.

That could work with diffusion models but autoregresive models it’s just polluting the context window with the hopes of finding longer tail tokens.