
What Got My Attention?
On July 5, OpenAI announced the launch of a Superalignment task force to address the critical issue of AI alignment. OpenAI plans to dedicate 20% of their secured compute resources over the next four years to tackle this challenge and iteratively align superintelligence with human intentions. While the announcement acknowledges the importance of the issue and appears to represent a serious effort, many aspects remain unaddressed.
What They Said
According to OpenAI, superintelligence has the potential to be a profoundly positive technology for humanity. But it also presents significant risks, including the possibility of human disempowerment or even extinction.
The task force’s primary goal is to develop a human-level “automated alignment researcher” that can be utilized to align superintelligence in an iterative manner. The process they propose involves the creation of a scalable training method, validation of the resulting model, and comprehensive stress testing of the entire alignment pipeline.
OpenAI has entrusted the leadership of the effort to AI experts Ilya Sutskever and Jan Leike. The team will comprise researchers and engineers from OpenAI’s alignment team, as well as individuals from other teams within the company.
OpenAI acknowledges that their research priorities may evolve as they gain more insights into the problem. Additionally, they note the importance of sharing their work openly and collaborating with interdisciplinary experts to address broader sociotechnical concerns related to AI alignment.
What Are the Paths to Alignment?
The field of AI alignment is intricate, encompassing a range of perspectives. When approaching the topic, it is essential to consider multiple angles:
- Technical Alignment: This approach focuses on developing algorithms and models that can comprehend and adhere to human values. Researchers like Paul Christiano have made significant contributions in this area.
- Philosophical Alignment: Exploring the philosophical aspects of AI alignment, such as the meaning of AI following human intent and resolving ethical dilemmas, is crucial. Scholars like Iason Gabriel have delved into these inquiries.
- Risk Mitigation: Recognizing the potential risks associated with AI, proponents of this perspective, including Jaan Tallinn, advocate for alignment as a way to mitigate those risks effectively.
- Value Alignment: Aligning AI systems with human values, both at the individual and societal levels, is an important consideration. This involves discussions around ethics, morality, and the societal impact of AI.
- Governance and Enforcement: Once the definition of success is established, the focus shifts to governance and enforcement. This entails determining who is responsible for implementing alignment mechanisms, detecting and reporting deviations, and enforcing alignment protocols.
How to Unpack It?
As I considered the OpenAI announcement, certain aspects stood out for further examination.
First, going back to my list of perspectives on the problem, the emphases in the announcement were Technical Alignment and Risk Mitigation. One would not expect a short announcement like this to cover details, but I would have expected more attention to the other three. The brevity of the announcement and its lack of detail made me wonder whether OpenAI had planned to announce the task force this early, or found it necessary to get it out quickly. (See my note in Where It Stands Now.)
The specified four-year timeline for achieving a solution is a welcome commitment, but it feels too short. I question the feasibility of developing the necessary technology and infrastructure before having a means to codify the intent. It feels like building the solution before having a clear problem statement.
An effort of this magnitude will necessarily require a degree of insulation, though the announcement does not discuss it. And I cannot get this thought out of my head: the Superalignment task force will need to actually build a superintelligence in order to test alignment solutions. Let that sink in. Is the byproduct of the Superalignment effort, in fact, a commercially viable superintelligence product?
The issues of governance and enforcement seem to be a conscious omission.
The Nature of Governance and AI Alignment
Once the goals and technical means of alignment have been established, effective governance mechanisms need to be in place to ensure compliance and prevent potential deviations. Governance encompasses the processes, policies, and regulatory frameworks that guide the development, deployment, and operation of AI systems. It involves decision-making structures, accountability frameworks, and mechanisms for ongoing monitoring and evaluation.
AI alignment presents unique governance challenges. Traditional regulatory approaches will struggle to keep pace with the rapid advancements in AI technology. Moreover, global collaboration and coordination are crucial for establishing common standards and best practices in AI alignment.
Where It Stands Now
I realize that it’s very much a work in progress, so I will continue to look for signs of attention to alignment governance. Without a workable approach to governance and the operational aspects of AI alignment, this effort will fall seriously short of preparing for superintelligence.
[2023-07-14: This post was written earlier in the week for publication today. Yesterday (July 13) the FTC announced that it is investigating whether OpenAI has “engaged in unfair or deceptive practices relating to risks of harm to consumers, including reputational harm.” This seems to me very much an alignment issue, and foreknowledge of the investigation may have prompted … no pun intended … an announcement schedule for which the company was not fully prepared.]
One really obvious omission is the role of public engagement. And, to be frank, I do not have the background or skills to address public engagement in any useful manner. This is an aspect that I will leave for others to address.
My attention to AI alignment remains focused primarily on technology, policy, governance, and the means of enforcement. I firmly believe that the industry as a whole, along with governmental bodies responsible for societal well-being, must be active participants as well as serious contributors of skills and financial support.