Discussion about this post

stephen russell:

This series is exceptionally valuable. Your scenarios are a great complement to the more general warnings about misalignment. Misalignment is a concept that requires a strong background in AI development to understand; your scenarios demonstrate it concretely, along with its potential consequences. Thank you.

Daniel Kokotajlo:

Thanks for writing this, Alex! As you say, even if alignment is solved, the question "who to align to" looms large, and the default answer seems pretty grim: not only might a single man get to decide, he might even be able to keep it secret for quite some time. This is just one scenario, but it's good to get something on the table. Other scenarios I'd be keen to see explored:

--Let's think through good security practices against insider threats such as the CEO, and see whether pathways remain for the CEO to insert secret loyalties despite those practices.

--What about a scenario where the loyalties aren't secret, at least to company leadership? Right now companies aren't required to publish their Specs or even have Specs.

--What about a scenario where it's the POTUS instead of a CEO?

--What if it's a small group like the oversight committee, but unlike in the Slowdown ending of AI 2027, there's a power struggle within that group?

--What if there are two or three rival AGI companies of roughly equivalent capability, only one or two of which have power-hungry leaders?

--What about ways of exerting power/control/etc. that are less drastic than inserting full loyalty, but instead amount to e.g. privileged access, or biasing the Spec in some way?

