Thank God SOMEONE is talking some sense about the elephant in the room. I really can't believe the apathy around this topic, and that it isn't feverishly discussed by everyone, everywhere, all the time.
Have you revised your beliefs about the lab wanting to inform the US government in light of recent developments with the administration?
Avoiding any political derailing here, I believe the game theory may have seriously shifted from the lab's perspective on keeping the government in the loop. In my view, it now seems rational for a lab to operate in secrecy and run the risk of the administration finding out, rather than to keep the administration perpetually informed only for it to threaten aggressive action unless granted access to further its agenda in potentially harmful or distracting ways. The administration's recent AI statements also make the feared scenario ("Your AI is really good now but you didn't tell us? Shut it down!") seem much less likely. And I imagine lab employees are much less likely to whistleblow directly to the government, given that they likely recognize this tension.
Given the Trump 1 and Biden terms as a prior, discreetly waking up the government made more sense then than it does now, in my eyes; very curious to hear if you've been thinking about this.
Thanks for laying out these risks so clearly. Right now, AI is all about utility - quick performance and beating the competition - while it’s missing practical wisdom (phronesis), Aristotle’s idea of balancing means and ends as situations change. To me, practical wisdom means thinking ahead and making sound judgments, whereas utility focuses on immediate wins without looking at the bigger picture. From that angle, a secret‑first AGI approach chooses short‑term advantage over the long‑term insight that practical wisdom demands.
We should open up benchmarks so a wider community can spot flaws and make red‑teaming a collaborative effort that rewards practicality and ethics. People who’ve trained their minds - ethical philosophers and experienced meditators - should be invited to participate in the Alignment Olympics. Their insights could catch blind spots that purely technical audits may miss.
To build AGI that’s both powerful and responsible, we need the speed of utility and the foresight of practical wisdom.
With Trump, there is zero doubt in my mind that he would use ASI to take over the US, and subsequently the world, if it were developed during his presidency (which is likely). This is the most likely misuse case now.
The idea of API access to red-team and research alignment on leading-edge models is appealing: democratize the effort to make them safe, and increase the total effort on the task. I wonder if there is a way to incubate a competitive aspect on that side? What if there were a race dynamic toward interpretability or steerability, in tandem with the race to capability?
To me, a key problem is that I don't see clearly agreed-upon figures of merit that would serve as good measures of success to crown a winner in these "Alignment Olympics".
Thanks for sharing your thoughts.
You have laid out some plausible scenarios that really need to be more completely thought out as we proceed.