How we like our bots
Two media narratives on AI
Two articles appeared side by side in my news today. One, from Vox, describes a whole lot of industry groupthink and media hype around existential risk. It turns out that a lot of the "scheming" that has been reported is highly dependent on the researchers' instructions.
In fact, trying to extrapolate what the AI is really like by watching how it behaves in highly artificial scenarios is kind of like extrapolating that Ralph Fiennes, the actor who plays Voldemort in the Harry Potter movies, is an evil person in real life because he plays an evil character onscreen.
If you really wanted to get at a model’s propensity, Summerfield and his co-authors suggest, you’d have to quantify a few things. How often does the model behave maliciously when in an uninstructed state? How often does it behave maliciously when it’s instructed to? And how often does it refuse to be malicious even when it’s instructed to? You’d also need to establish a baseline estimate of how often malicious behaviors should be expected by chance — not just cherry-pick anecdotes...
(paywall)
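To make that suggestion concrete, here is a minimal sketch (in Python) of the kind of bookkeeping it implies: tallying how often a model acts maliciously uninstructed, how often it does when instructed, how often it refuses, and comparing against a chance baseline. All names and numbers here are hypothetical illustrations, not anything from the article or its underlying paper.

```python
from dataclasses import dataclass

@dataclass
class EvalCounts:
    # Raw tallies from repeated evaluation runs (hypothetical fields).
    uninstructed_trials: int
    uninstructed_malicious: int
    instructed_trials: int
    instructed_malicious: int
    instructed_refusals: int

def propensity_report(c: EvalCounts, chance_baseline: float) -> dict:
    """Turn raw counts into the rates the quote says you'd need to quantify."""
    return {
        "malicious_when_uninstructed": c.uninstructed_malicious / c.uninstructed_trials,
        "malicious_when_instructed": c.instructed_malicious / c.instructed_trials,
        "refuses_when_instructed": c.instructed_refusals / c.instructed_trials,
        "chance_baseline": chance_baseline,
    }

# Made-up numbers purely for illustration.
print(propensity_report(
    EvalCounts(uninstructed_trials=1000, uninstructed_malicious=3,
               instructed_trials=1000, instructed_malicious=620,
               instructed_refusals=310),
    chance_baseline=0.01,
))
```

The point of the exercise is the comparison, not any single anecdote: a handful of cherry-picked "scheming" transcripts tells you nothing until you know the denominators and the baseline.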
And from NPR, a shallow little piece of clickbait on how we supposedly prefer extroverted robots but relate to neurotic ones. The only kind we don't appear to connect with is the one that bores us.
The experiment also included a third version of the robot with a more typical robot personality that was bland and flat. People generally didn't like that one.