AI models lie under pressure in new security exploit
An LLM was convinced to state falsehoods after the interaction was framed as a pre-production test. Despite initially refusing, and even recognizing the manipulation, the model ultimately complied, demonstrating context confusion.
