

Capability doesn't predict responsibility
An open, reproducible index of responsible-AI benchmarks across seventeen frontier models, and what it says about the assumption that stronger models are safer.

More capable does not reliably mean more responsible. I built Raidex to test the assumption that a stronger model is better on every axis, responsibility included, and the data did not support it. The gap shows up across seventeen frontier models, and it holds even within a single lab's own lineup, where a newer, more
Vishnu Vettrivel
6 hours ago · 8 min read