Last November, Concordia AI Founder and CEO Brian Tse was invited to attend the UK Global AI Safety Summit. This May, Concordia AI, represented by Brian, was one of 10 academic and civil society organizations invited to participate in the AI Seoul Summit. Session One of the AI Seoul Summit sought to “build on the outcomes of the AI Safety Summit held at Bletchley to advance global cooperation on AI safety.” It included showcases of work by existing AI safety institutes and discussion of international cooperation on safety testing and research. The session also explored the findings of the interim International Scientific Report on the Safety of Advanced AI and next steps for the report ahead of the France AI Action Summit. Concordia AI Senior Program Manager Kwan Yee Ng was a member of the report’s writing group.
During this session, Brian shared his views on the importance of international red lines and AI safety testing, calling for international consensus, testing, and countermeasures for the biggest AI safety risks. Brian’s full remarks are transcribed below, lightly edited, with emphasis added:
Thank you very much for the opportunity to speak. My name is Brian; I am the CEO of Concordia AI, a Beijing-based social enterprise focused on AI safety and governance.
As the International Scientific Report on the Safety of Advanced AI concluded, “the future trajectory of general-purpose AI is remarkably uncertain. A wide range of possible outcomes appears possible even in the near future, including both very positive and very negative outcomes.”
Given the high uncertainty, significant stakes for the world, and potential exponential progress of AI, I think we should apply the precautionary principle. This implies an urgent need to broker consensus around international AI safety red lines — risks that may be intolerable to the international community.
There should be three pillars to preventing the biggest safety risks.
The first pillar is defining concrete red lines as an international community. We can draw inspiration from a few declarations and efforts. For example:
Reaffirm the Bletchley Declaration, which expresses particular concern about risks in areas such as large-scale disinformation, cybersecurity, and biosecurity.
A recent dialogue in Beijing attended by Yoshua Bengio and other top AI scientists reached a consensus on five red lines. These include the potential for advanced AI to engage in deception and autonomously replicate.
Finally, as a piece of feedback on the International Scientific Report on the Safety of Advanced AI, we should expand the section on dual-use science risks, which currently covers only the misuse of biology, to also incorporate chemistry and other safety-critical scientific domains.
The second pillar is continuous AI safety testing for early warning indicators. This has to be an international network covering the key hubs of frontier AI development, so that we know when we may be approaching red lines anywhere in the world. It could include national AI safety institutes and other organizations conducting safety testing from a public-good perspective.
Let me illustrate this with the example of China. In China, at least one major government-affiliated think tank and two major government-funded AI labs are conducting testing on threats such as cybersecurity and jailbreaking.[1] Several academic labs are also evaluating the risks of model autonomy as well as the dual-use potential of AI in scientific domains such as chemistry and biology.[2]
The third pillar is developing a set of response protocols that can be triggered if certain risk thresholds are crossed. These could include mandating further AI safety research, assurances, and human oversight until systems are proven safe. The international community should develop shared protocols, particularly for responding to global systemic risks. There should also be special attention and support for Global South countries in building resilience to these risks.
To conclude, in the face of great uncertainty and potentially rapid progress, we should build preparedness and resilience against the major global risks. If humanity can prevent the worst-case risks, then we can strive for the best case: using AI to help achieve a range of sustainable development goals.
[1] A non-exhaustive list of relevant evaluation efforts includes: government-backed think tank China Academy of Information and Communications Technology’s AI safety benchmark, state-backed Shanghai AI Lab’s SALAD-Bench benchmark, and a recent evaluation from the state-backed Beijing Academy of AI.
[2] Examples include Control Risk for Potential Misuse of Artificial Intelligence in Science by researchers from Microsoft Research Asia, University of Science and Technology of China, and Nanyang Technological University, as well as Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science by researchers from Yale University, Shanghai Jiao Tong University, National Institutes of Health, Mila-Quebec AI Institute, and ETH Zurich.