Paper
Scaffold, Not Vocabulary? A Controlled, Two-Tier, Pre-Registered Study of a Popperian Code-Generation Skill
arXiv:2606.06454v1 Announce Type: cross Abstract: Large language models increasingly write, review, and judge code, and a fast-growing practice equips them with prompt 'skills' that ask the model to reason like a scientist. A prominent example tells the model to act as a Popperian falsificationist, and such skills are reported to improve generated code. But these gains are almost always read off an LLM-as-a-judge, an instrument with documented positional, self-preference, and stylistic biases. We ask: if it appears to help, is the gain from the skill's Popperian content, or from the structure…
Authors:
Topics
Relevant entities
People
Linked people will appear here.
Related coverage
Linked coverage will appear here.
Related events
Linked events will appear here.
Related discussions
Related discussion nodes will appear here.