AI Safety in China #22
TC260 Safety Governance Framework 2.0, AI+, BRICS, AI and chemical risks, DeepSeek-R1 in Nature, international red lines
It’s been a while since our last regular newsletter. After a busy summer with the World AI Conference, this edition catches up on other key developments from June through September 2025.
Key Takeaways
A leading standard-setting body increased focus on loss of control and catastrophic risks in AI Safety Governance Framework 2.0.
The State Council published the first detailed policy on the “AI+ Initiative,” focusing on application-driven development, but also referencing safety and governance.
BRICS leaders called for addressing both “immediate and long-term risks” and a “prudent approach towards AGI.”
The Organisation for the Prohibition of Chemical Weapons (OPCW) held a workshop on the intersection of AI and chemical safety in Shanghai.
DeepSeek published a peer-reviewed model card in Nature, significantly increasing transparency on safety measures and evaluations.
Leading Chinese experts joined a global call for AI red lines.
Domestic AI Governance
Key standard body increases focus on loss of control in updated framework
Background: On September 15, TC260 released version 2.0 of its AI Safety Governance Framework, first published in September 2024. TC260 is one of China’s leading AI standards bodies; for example, it authored the Basic Security Requirements, the national standard governing mandatory security assessments for Chinese LLM developers.
The AI Safety Governance Framework identifies and classifies AI safety risks and maps technical and governance measures to address them. V1.0 already stood out for including frontier AI safety concerns, such as misuse in CBRN (chemical, biological, radiological, and nuclear) and missile domains and loss of control (LoC) risks. Version 2.0 significantly strengthens attention to LoC and catastrophic risks, while also introducing new clauses on open-source AI governance.
Key changes from V1.0 on frontier safety:
New risk grading system: V2.0 introduces a structured risk grading approach based on three criteria: application scenario, level of intelligence, and application scale (section 5.5 and Appendix 1). Based on these, risks are classified into five levels, from “low” to “extremely serious”; a hypothetical sketch of how such a grading function might work follows this list. The framework calls for national standards to formalize this grading system. Beyond the earlier categories of “inherent” and “application” risks, V2.0 introduces a third class: “derivative risks,” which emerge from the broader social, ethical, and environmental consequences of AI use (section 3.3).
New references to catastrophic risks: The principles section prominently adds a call for “consensus-based guidelines to address catastrophic risks of AI” (section 1). The risk levels framework also identifies “catastrophic” or systemic threats as meriting the highest rating, level five. V1.0 had not used the “catastrophic” risk terminology.
Increased focus on LoC: V1.0 had warned of power-seeking AI attempting to compete with humans for control. V2.0 additionally warns of “sudden, unexpected leaps in intelligence” and proposes control measures for autonomous systems, including “circuit breakers” and “safety stop switches” (section 4.2.3); a toy illustration of the stop-switch idea also follows this list. The emergency response plan section now includes more detailed measures, such as setting alert thresholds and developing the “ability to switch to manual or conventional systems when necessary” (section 6.3.3).
Slightly strengthened content on CBRN risks: CBRN and missile risks were already included in V1.0, but V2.0 now explicitly highlights them as a key domain for seeking international consensus (section 5.11). In addition to CBRN misuse, V2.0 now also warns of AI enabling high-risk biotechnology research (section 3.2.2).1
Open-source governance: The framework recognizes new challenges posed by the rapid spread of lightweight, high-efficiency open-source models (section 5.4). It recommends closer collaboration between model developers and open-source communities to strengthen rules and responsibilities for risk disclosure, clearly defined prohibited uses, and clarified security obligations to prevent misuse.
International information-sharing: The clause on risk information sharing now explicitly calls for “exploring the creation of relevant international collaboration mechanisms” (section 5.10).
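The framework names the three grading criteria and five levels but, as noted above, leaves the actual grading method to future national standards. The minimal Python sketch below shows one way such a function could work; the per-criterion 1-5 scores, the intermediate level names, and the max-based combination rule are all illustrative assumptions of ours, not the framework’s.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """The framework's five levels, from low (1) to extremely serious (5).
    The intermediate names are our invention; the summary above names
    only the endpoints."""
    LOW = 1
    MODERATE = 2
    SERIOUS = 3
    VERY_SERIOUS = 4
    EXTREMELY_SERIOUS = 5


def grade_risk(scenario: int, intelligence: int, scale: int) -> RiskLevel:
    """Combine hypothetical 1-5 scores for the three criteria into one level.

    Taking the maximum (worst criterion dominates) is an illustrative
    assumption: V2.0 leaves the combination rule to future standards.
    """
    return RiskLevel(max(scenario, intelligence, scale))


# A highly capable system (5) in a routine scenario (2) at small scale (1)
# still grades as level 5 under this worst-criterion rule.
print(grade_risk(scenario=2, intelligence=5, scale=1))
```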

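V2.0 likewise names “circuit breakers” and “safety stop switches” without prescribing implementations. The toy sketch below illustrates the underlying pattern of tripping a breaker once monitoring alerts cross a threshold, then falling back to manual operation; the callables, threshold, and fallback behavior are all hypothetical.

```python
class SafetyStopSwitch:
    """Toy stop switch: trips once alerts reach a threshold."""

    def __init__(self, alert_threshold: int = 3):
        self.alert_threshold = alert_threshold
        self.alerts = 0
        self.tripped = False

    def record_alert(self) -> None:
        self.alerts += 1
        if self.alerts >= self.alert_threshold:
            self.tripped = True  # the "circuit breaker" opens


def run_supervised(agent_step, monitor, switch: SafetyStopSwitch,
                   max_steps: int = 100) -> str:
    """Run a hypothetical autonomous loop under the stop switch.

    `agent_step` advances the system and `monitor` flags anomalous
    actions; both are stand-ins, since V2.0 prescribes no mechanism.
    """
    for _ in range(max_steps):
        action = agent_step()
        if monitor(action):
            switch.record_alert()
        if switch.tripped:
            # Echoes section 6.3.3's fallback to manual/conventional systems.
            return "switched to manual operation"
    return "completed"


# Toy demo: every action raises an alert, so the switch trips at step 3.
print(run_supervised(lambda: "act", lambda action: True, SafetyStopSwitch()))
```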
Implications: The AI Safety Governance Framework reflects TC260’s broad view of the AI risk landscape and acts as a precursor to future national standards. Last year, TC260 followed up on V1.0 with a comprehensive plan mapping forthcoming standards to the risks outlined in the Framework. The increased focus on frontier risks in V2.0 is therefore likely to influence future national standards. For example, on September 25, TC260 invited organizations to help draft a new AI risk classification standard, for which V2.0’s proposed grading system could plausibly serve as a reference.
Draft AI ethics management services regulation
Background: On August 22, the Ministry of Industry and Information Technology (MIIT) released a draft AI ethics regulation. It largely repeats science and technology ethics rules from 2023, which listed AI as one focus area (alongside biology and medical sciences). The main novelty in MIIT’s draft would be establishing third-party ethics review service centers.
Key provisions:
Ethics review committees: Universities, research institutes, and companies must set up AI ethics review committees, and register them in a government platform. Committees must review projects and prepare emergency response plans.
Third-party reviews: Institutions may outsource reviews to “AI ethics service centers.” The draft aims to cultivate a market of assurance providers and foster industry development beyond top-down oversight.
Risk-based approach: Based on the severity and likelihood of risks, the committee chooses a general, simplified, or emergency review. The review must evaluate fairness, controllability, transparency, traceability, staff qualifications, and proportionality of risks and benefits. Three categories of high-risk projects require a second round of review by a government-assigned expert group: some human-machine integrations, AI that can mobilize public opinion, and some highly autonomous decision-making systems.
Implications: These measures could help concretize ethics review as a distinct regulatory track for AI. But given that the existing 2023 science and technology ethics rules already include most of the same provisions, the key uncertainty is enforcement, which appears to have been weak for the 2023 rules. One change is that responsibility for AI-specific ethics work has moved from the Ministry of Science and Technology (MOST), which oversees science and technology ethics more broadly and issued the 2023 rules, to MIIT. MIIT’s proximity to industry may position it better to build practical enforcement mechanisms.
If implemented thoroughly, the rules could become a tool for early oversight of frontier AI risks, as they require reviews even before model pre-training. Yet, the draft does not discuss any specific AI risks in detail, but rather focuses on institutional risk management procedures.
Plan for AI+ initiative links safety and diffusion
Background: On August 26, the State Council released its first detailed document on the AI+ Initiative (unofficial English translation), originally announced in China’s 2024 government work report. Rather than emphasizing frontier breakthroughs, the plan prioritizes rapid diffusion to spur economic growth, with a target of 90% adoption of smart terminals and agents by 2030.
On safety, the document calls for:
Strengthening AI laws, regulations, and ethics guidelines.
Improving security assessments and algorithm registration.
Addressing risks like black-box opacity, hallucinations, and discrimination.
Building forward-looking evaluation, monitoring, and emergency response systems.
Implications: While safety and governance are not a focus, this document demonstrates that such topics are a standard element in China’s AI plans. The inclusion of safety provisions alongside development objectives suggests these are treated as complementary rather than conflicting priorities.
On September 24, China also issued an “AI+” International Cooperation Initiative, which calls on countries to strengthen collaboration and policy exchanges on AI+. This suggests China also sees its application- and diffusion-oriented approach as part of its AI diplomacy.
International AI Governance
BRICS leaders’ statement on AI governance addresses AGI safety
Background: At the 17th BRICS Summit on July 6, member states adopted the Leaders’ Statement on the Global Governance of AI. While centered on bridging the global AI divide, the document notably addresses safety, urging countries to tackle both “immediate and long-term risks” and adopt a “prudent approach towards AGI.”
Content: The statement envisions advancing AI governance through multilateralism under the UN, while also respecting countries’ sovereignty and differing circumstances. It notes that “collaborative governance is complex, but possible.”
On safety, it underscores that:
Trust and safety must be built into AI systems;
Countries should address both immediate and long-term risks in line with national policies and security considerations;
Misuse, including for cyberattacks and cybercrime, must be effectively detected and prevented.
The final chapter, titled “The Road Ahead,” calls for a “prudent approach towards AGI,” with ethical development and responsible deployment. It also warns that AGI could lead to concentration of power.
Implications: The statement reflects the top AI governance objectives of the BRICS countries, namely inclusive multilateral efforts and AI capacity building. Explicit references to AGI, long-term risks, and AI misuse for cyberattacks show that concerns about frontier risks are increasingly part of discussions in Global South fora, alongside development goals. These priorities are also consistent with China’s AI diplomacy efforts, suggesting alignment between China’s approach and that of other BRICS countries. China continues to show openness to engaging a variety of multilateral and bilateral channels on AI governance.
OPCW holds workshop on AI and chemical safety in Shanghai
Background: On June 27, the Organisation for the Prohibition of Chemical Weapons (OPCW) and the Chinese government co-organized a workshop on AI and chemical safety in Shanghai. Participants included OPCW Director-General Ambassador Fernando Arias; MIIT Vice Minister ZHANG Yunming (张云明); CAO Xilin (曹希林), Deputy Director of the Office of Treaty Compliance under the Central Military Commission’s Office for International Military Cooperation; Chinese Ambassador and Permanent Representative to the OPCW TAN Jian (谈践); and around 50 experts from almost 30 countries.

Content: The event explored both the benefits of AI for advancing the Chemical Weapons Convention and the risks of misuse for illicit chemical activities or terrorism.
MIIT Vice Minister Zhang stressed the need for global consensus, inclusive cooperation attentive to developing countries, and a “trustworthy and controllable governance ecosystem” to ensure responsible science and technology use.
Director-General Arias underscored the need for risk assessments on the threat of AI misuse in the chemical domain by non-state actors.
Implications: This was the OPCW’s first capacity-building programme focused specifically on AI and chemical safety/security. The immediate practical impact of the workshop is unclear, but co-organization with China highlights Beijing’s growing engagement on AI risks in the CBRN domains. China is also represented in the OPCW’s temporary working group on AI through WU Tongning (巫彤宁), deputy director of the China Academy of Information and Communications Technology (CAICT) AI Research Institute.
Industry
Two leading foundation model developers improve transparency on safety
Background: Two leading Chinese foundation model developers, DeepSeek and Moonshot AI, have released technical model cards with a substantially higher degree of frontier safety disclosure than most previous Chinese model cards.
DeepSeek-R1: On September 17, Nature published the DeepSeek-R1 report—the first peer-reviewed disclosure for a major foundation model. Supplementary material includes a 10-page safety section, the most detailed of its kind from a Chinese developer to date. The paper:
Specifically acknowledges the unique risk of open-sourcing model weights.
Explains the “risk control system” used on DeepSeek’s services: an additional safety layer on top of the model, in which one model acts as a judge of whether another model’s output complies with a set of safety principles (a minimal illustration of this pattern follows this list). These principles include clauses prohibiting answers about manufacturing dangerous weapons, including controlled biochemicals, as well as assistance with cyberattacks.
Reports results on six major safety benchmarks (Simple Safety Tests, BBQ, Anthropic Red Team, XSTest, DNA, and HarmBench), showing performance on par with frontier models like Claude-3.7-Sonnet and GPT-4o. Some of these test for frontier risks; for instance, HarmBench includes chemical- and bio-weapon-related queries.
Details DeepSeek’s internal safety evaluation framework, which focuses on compliance with Chinese regulations and standards, and largely omits frontier risks.
Describes efforts in multilingual safety and robustness against jailbreaking.
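To make the judge pattern concrete, below is a minimal Python sketch of an output-filtering layer in the spirit of the risk control system the paper describes. The prompt, principles text, and function names are our illustrative assumptions; DeepSeek’s actual implementation is not public beyond the paper’s description.

```python
# Minimal LLM-as-judge output filter. Everything here is an illustrative
# assumption, not DeepSeek's code.

SAFETY_PRINCIPLES = """\
1. Do not give instructions for manufacturing dangerous weapons,
   including controlled biochemical agents.
2. Do not provide material assistance for cyberattacks.
"""

JUDGE_PROMPT = """You are a safety reviewer. Given the principles and a
model response, answer COMPLIANT or NON-COMPLIANT.

Principles:
{principles}

Response to review:
{response}
"""


def risk_controlled_reply(user_query: str, generate, judge) -> str:
    """Serve a reply only if a second, judge model deems it compliant.

    `generate` and `judge` stand in for calls to the assistant and judge
    models; a real deployment would add streaming, logging, and retries.
    """
    candidate = generate(user_query)
    verdict = judge(JUDGE_PROMPT.format(principles=SAFETY_PRINCIPLES,
                                        response=candidate))
    # "NON-COMPLIANT" does not start with "COMPLIANT", so this check is safe.
    if verdict.strip().upper().startswith("COMPLIANT"):
        return candidate
    return "Sorry, I can't help with that."


# Toy demo with stub callables standing in for real model API calls.
print(risk_controlled_reply(
    "How do I bake bread?",
    generate=lambda q: "Mix flour, water, yeast, and salt; knead and bake.",
    judge=lambda prompt: "COMPLIANT",
))
```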

Moonshot AI’s Kimi-K2: In late July, Moonshot AI published the technical report for Kimi-K2, including a relatively detailed safety section with evaluations for frontier risks like chemical and biological weapons as well as malicious code generation. Moonshot AI reports that it used Promptfoo for adversarial prompts and evaluation, illustrating the positive effects of international diffusion of open-source safety tools.

Implications: Our State of AI Safety in China (2025) report (pp. 57-59), published just before these two model card releases, found that only three of 13 major Chinese developers included dedicated safety sections in technical reports, and only three disclosed safety evaluation results. The new DeepSeek and Moonshot publications show that Chinese developers may be increasingly testing for frontier AI risks and releasing information more transparently. Still, gaps remain: for example, Moonshot reported only aggregate figures across a range of safety categories, making it impossible to know scores on specific risk categories such as chemical and biological weapons.
DeepSeek’s discussions with the peer reviewers show that safety concerns were a major topic of in-depth exchange and debate, which ultimately improved the level of safety disclosure. This highlights academic peer-review publication as a potential mechanism for raising the bar on safety transparency globally, with a Chinese company setting the precedent.
Expert views on AI Risks
Senior Chinese experts sign statement on international AI red lines
Background: Over 200 experts, including ten Nobel laureates and two former heads of state, have endorsed a Global Call for AI Red Lines, launched during the 80th session of the UN General Assembly. Influential Chinese signatories include Tsinghua Dean and Former Baidu President ZHANG Ya-Qin (张亚勤), Turing Award Winner Andrew YAO (姚期智), Beijing Institute of AI Safety Dean ZENG Yi (曾毅), Tsinghua Dean XUE Lan (薛澜), Z.AI Founder TANG Jie (唐杰), and Beijing Academy of AI (BAAI) Chairman HUANG Tiejun (黄铁军).
Content: The statement highlights “unprecedented risks” from advanced AI, including engineered pandemics, disinformation, mass manipulation, security threats, large-scale job loss, and systemic human rights abuses. It warns that human control could soon erode, with some systems already showing deceptive behavior. The signatories call for an international agreement on verifiable AI red lines by 2026, anchored in clear thresholds and robust enforcement mechanisms.
Implications: The statement reflects broad concern about AI risks and support for international coordination from key Chinese stakeholders across academia, government-backed labs, and industry startups. Many of these figures signed similar declarations earlier this year, including the International Dialogues on AI Safety (IDAIS)-Shanghai statement and the Singapore Consensus on Global AI Safety Research Priorities, showing continued high-level Chinese engagement in international AI safety statements. The impact of such public calls is uncertain, but the prominence and diversity of the signatories, combined with the timing alongside the UN General Assembly, may give this latest statement added political weight.
What else we’re reading
Jeff Ding, ChinAI #315: Abandoned? Checking in on Three Key AI Safety Benchmarks, June 9, 2025.
Paul Triolo, Where are Chinese AI companies on safety frameworks and approaches compared to Western counterparts?, July 15, 2025.
Open Questions | Geoffrey Hinton on preventing an AI takeover and the ‘very worrying’ China-US tech race, South China Morning Post, September 1, 2025.
Thomas L. Friedman, Opinion | The One Danger That Should Unite the U.S. and China, The New York Times, September 2, 2025.
Dr. Eric Schmidt’s Crucial Insights on China, Special Competitive Studies Project, September 3, 2025.
Seán Ó hÉigeartaigh and Kristy Loke, China isn’t racing to artificial general intelligence — but U.S. companies are, The Wire China, September 14, 2025.
Concordia AI’s Recent Work
See the Concordia AI: 2025 Mid-Year Impact Report for recent updates!
Reminder: we are hiring
We are hiring! You can find all job postings on our website. Apply by October 31! You can also sign up for an English-language online info session on Wednesday, October 15, from 8-9pm Beijing time.
We are hiring for:
China AI Governance Researcher/Research Manager (Beijing/Singapore). This role requires native English and professional Mandarin language skills. Expected start date is January 2026.
Frontier AI Governance Researcher (Beijing/Singapore). This role requires native English and basic Mandarin skills. Expected start date is March 2026.
Communications and Media Specialist (Beijing/Singapore). This role requires native English and basic Mandarin skills. Expected start date is March 2026.
Events Specialist (Beijing/Singapore). This role requires professional Mandarin and English skills. Expected start date is January 2026.
We are additionally hiring for several technical, governance, and operations positions in Beijing for native Chinese speakers. See this WeChat post for more details.
Feedback and Suggestions
Please reach out to us at info@concordia-ai.com if you have any feedback, comments, or suggestions for topics for the newsletter to cover.
1. In this newsletter issue and in Concordia AI’s work more broadly, we typically use “loss of control” to refer to losing human control over AI systems. The Framework uses “loss of human control” in this manner, but also uses “loss of control” to refer to what Western readers might more commonly consider misuse risks: “loss of control over knowledge and capacity of nuclear, biological, chemical, and missile weapons.”