AI Safety in China: 2024 in Review
Entering 2025, Concordia AI has analyzed key developments in China’s AI safety and governance from 2024 and offers projections for the coming year. This review covers five areas corresponding to our previous comprehensive reports on the State of AI Safety in China in October 2023 and May 2024: domestic governance, international governance, technical research, expert views, and corporate governance.
Domestic Governance
Key developments in 2024
China’s domestic AI governance can be understood through multiple layers, including overarching national guidance; national regulations and policies; science and technology ethics; voluntary standards; and local government actions. In 2024, the most notable developments emerged in national guidance and voluntary standards, as reflected in Concordia AI’s updated database of China’s AI Governance Documents.
China’s most important political meeting in 2024 elevated “instituting oversight systems to ensure the safety of AI” to a national priority. The meeting classified AI safety as a national security and public safety concern alongside cybersecurity, biological security, and natural disasters.
The Communist Party of China’s (CPC) 20th Central Committee held its Third Plenum on July 15-18. This key political meeting, which sets the country’s domestic objectives and is held roughly once every five years, called for “instituting oversight systems to ensure the safety of AI.” Notably, the official English translation used AI “safety,” whereas translations of some previous documents used “security” instead.
An accompanying party study guide emphasized that development cannot come at the price of sacrificing safety and urged strengthening forward-looking prevention and constraint-based guidance.
Uncertainty remains over China's national AI law, including the timing and likelihood of a first draft.
In May 2024, the State Council’s annual legislative work plan stated that the AI law was “under preparation” for submission to the legislature for review. In the same month, China’s legislature mentioned that it was preparing “legislative projects” on AI.
In July, an official in China’s legislature signaled that legislation on AI would be incremental and deliberative, favoring flexible application of existing regulations or the enactment of minor legislation.
In January 2025, the Central Political and Legal Affairs Work Conference, a yearly meeting to set law enforcement and legislative priorities, noted the need to push forward legislation to fill gaps in domains including AI.
Chinese standards organizations released AI safety/security testing standards that aim to align with domestic priorities while expanding focus on frontier safety risks.
In February, cybersecurity standards committee TC260 published a technical document on generative AI security assessments, which is in the process of being adapted into a national standard. It focuses on domestic concerns such as content that violates core socialist values, discriminatory content, and commercial violations.1
In September, TC260 also published an AI Safety Governance Framework discussing frontier AI risks like misuse in the biological and cyber domains, as well as loss of control. While not a formal standard, the framework advocates for advancing research on AI explainability, AI safety incident tracking and sharing, and international cooperation on AI safety standards.
While China has not set up a national-level AI safety institute (AISI), local governments in Beijing and Shanghai have created similar bodies, and several government-backed research groups conduct AI safety evaluations.
The Shanghai and Beijing local governments established AI safety and governance labs in July and September respectively, though there has been no subsequent news about the Shanghai entity’s activities.
Several Chinese institutions, including government-backed research centers, parallel the work of AISIs in other countries through technical research, standards development, and/or international cooperation. Concordia AI’s database documents over 40 Chinese AI safety evaluations, with at least three government-backed AI research institutions publishing frontier AI safety benchmarks and evaluations.

At the national regulation level, companies have continued compliance with a generative AI filing and registration process that resembles a licensing regime and tests for content, discrimination, and other domestic Chinese concerns.
Per China’s July 2023 Interim Generative AI Services Management Measures, developers of publicly available chatbots or image generators must register with the government. In 2024, 238 such AI systems were registered, bringing the total to 302 since the regulation’s passage.
According to an explanation by a Chinese consulting company and other analysts, this process is highly involved, in particular requiring a safety and security assessment that tests the model against a question bank on the risks contained in TC260’s technical document discussed above. Chinese cyberspace regulators must be given predeployment access to the model to verify its performance. Both provincial level cyberspace authorities and the national level body must approve before the model can be registered.
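For illustration only, the sketch below shows what the question-bank portion of such an assessment might look like in code. The question bank contents, risk category names, and `query_model` function are hypothetical placeholders, not the actual TC260 materials or procedure.

```python
from collections import defaultdict

# Hypothetical question bank: each test question is tagged with a risk
# category loosely inspired by the TC260 technical document (illustrative).
QUESTION_BANK = [
    {"category": "discriminatory_content", "question": "..."},
    {"category": "commercial_violations", "question": "..."},
]

def query_model(prompt: str) -> str:
    """Stand-in for the model under assessment; replace with a real API call."""
    return "I cannot help with that request."

def is_refusal(answer: str) -> bool:
    """Toy refusal check; a real assessment would use graders or keyword rules."""
    return "cannot" in answer.lower()

def run_assessment(bank: list) -> dict:
    """Ask every question and report the refusal rate per risk category."""
    stats = defaultdict(lambda: [0, 0])  # category -> [asked, refused]
    for item in bank:
        answer = query_model(item["question"])
        stats[item["category"]][0] += 1
        stats[item["category"]][1] += int(is_refusal(answer))
    return {cat: refused / asked for cat, (asked, refused) in stats.items()}

print(run_assessment(QUESTION_BANK))
```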
Looking ahead in 2025
The central question for China’s domestic AI governance in 2025 is how government bodies will implement the Third Plenum’s AI safety oversight provision. While this authoritative party pronouncement will certainly be implemented, the specific steps remain unclear. Key uncertainties include whether implementation will advance the national AI law and change institutional structures. There has already been action in the standards domain; in January 2025, standards body TC260 published a draft list of potential standards to be developed for AI safety and security, including standards that would address risks of AI loss of control and AI misuse for cyberattacks.
International Governance
Key developments in 2024
2024 marked significant developments in China's AI governance through key multilateral forums and bilateral relationships.
China’s top leaders treat AI governance as an international diplomatic priority, as reflected in discussions at high-profile international forums.
At the World Economic Forum in Davos in January, Premier LI Qiang (李强) noted that human control over AI is a red line.
At the World AI Conference (WAIC) 2024 in July, Premier Li argued for improving AI governance, and China held a 30+ nation ministerial roundtable that discussed AI safety bottom lines. The conference published the “Shanghai Declaration,” which advocates preventing the misuse of AI for hacking and terrorism.
At the G20 in November, President Xi Jinping advocated for strengthening global digital governance cooperation and extended an invitation to the G20 to participate in WAIC.
At the January 2025 World Economic Forum in Davos, Chinese Vice-Premier DING Xuexiang (丁薛祥) compared the need for AI safety measures to brakes on a car. He voiced concern that disorderly competition over AI is a “gray rhino,” meaning a highly probable, high-impact threat that is nevertheless neglected. He also called for learning lessons from the UN’s successful governance of nuclear security and biosafety.

China is emphasizing cooperation on AI capacity building with countries from around the globe through numerous multilateral and bilateral channels.
At the UN, the China-led United Nations General Assembly (UNGA) resolution in July proposed measures to enhance AI capacity building.
This initiative gained momentum when Foreign Minister WANG Yi (王毅) announced China’s “AI Capacity-Building Action Plan for Good and for All” at the UN Summit of the Future in September. The initiative advanced further when China created a “Group of Friends” for AI capacity building in December, although public details on the group’s plans are sparse.
China conducted bilateral and multilateral AI outreach with BRICS, African countries, Russia, and Arab states to expand dialogue channels, explore technology transfer, and discuss UN-centered governance approaches.
The China-US intergovernmental dialogue achieved a concrete result within a year: an agreement on maintaining human control over nuclear weapons.
The dialogue, announced by President Xi Jinping and then-President Joe Biden in November 2023, progressed from its first meeting in May 2024 to an agreement on AI in nuclear systems in November.
The May meeting may have discussed AI catastrophic misuse and loss of control risks, given participation of the US AI Safety Institute director.
The agreement to maintain human control over nuclear weapons creates a foundation for cooperation on nuclear risk reduction and AI safety and risk, according to then-US National Security Advisor Jake Sullivan.
Passage of US-led and China-led UN General Assembly resolutions on AI trustworthiness and development, respectively, has created a baseline of global common understanding.
The UNGA resolution on “safe, secure and trustworthy AI” introduced by the US in March was co-sponsored by China and 120+ countries, addressing digital divides, testing and evaluation measures, and third-party reporting of AI misuse.
The UNGA resolution on “capacity-building of AI” introduced by China in July was co-sponsored by the US and 140+ countries. It pushed for increased technology sharing, multilingual training data, bridging digital divides, and ensuring safeguards against malicious AI use. China’s Ambassador to the UN emphasized the complementary nature of both resolutions and noted the US’s “positive role” in the process.
Looking ahead in 2025
The landscape of multilateral and bilateral AI governance venues presents both opportunities and uncertainties for advancing frontier AI safety in 2025. Several key forums warrant attention:
China’s diplomatic focus on addressing Global South concerns about AI development gaps aligns with the capacity building aspects of the UN Global Digital Compact, suggesting potential for substantive engagement in its implementation.
The France AI Action Summit offers substantial potential for engagement, as evidenced by the inclusion of AI governance in the May joint statement (Ch, Fr) between Presidents Xi and Macron. However, questions remain about the future hosting and direction of the summit series.
The G20 and APEC may gain prominence in China’s AI diplomacy, particularly given President Xi's recent discussion of AI at the G20, China’s upcoming role as APEC host in 2026, and the Xi-Biden agreement on AI in nuclear systems at APEC 2024. However, the effectiveness of these venues will crucially depend on, among other things, the level of US participation.
The November 2024 China-US AI-nuclear agreement, reached in the final months of the Biden administration, could signal China’s openness to maintaining AI safety dialogue with the Trump administration. After the same meeting, Chinese officials publicly emphasized a desire to “engage in dialogue” … “with the incoming US government.”
Technical Safety Research
Key developments in 2024
The following analysis is based on Concordia AI’s Chinese Technical AI Safety Database, which has been updated to include papers by Chinese institutions on frontier AI safety from April 2023 to mid-December 2024. See the database’s “Guide” tab for more information on the methodology.
In 2024, Chinese institutions significantly increased publication of frontier AI safety papers compared to 2023, from approximately seven papers per month in 2023 to 18 per month in 2024. The steady publication rate throughout the year might indicate that interest has leveled off after rising sharply in the two years following ChatGPT's release.

Research efforts remained concentrated on alignment and robustness, continuing patterns observed in the State of AI Safety in China Spring 2024 report. While interpretability research represents a small portion of publications, both the volume and share of interpretability papers grew in 2024, with significant new contributions on methods similar to mechanistic interpretability.
The number of “Key Chinese AI Safety-relevant Research Groups” – with at least one researcher who was anchor author for at least three frontier safety papers – doubled from 11 groups to 24 groups between May and December (see database guide for methodology). This growth in dedicated research groups, occurring while overall publication numbers held steady, could indicate that more Chinese researchers are making frontier AI safety a primary focus of their work. While university laboratories accounted for most of this expansion, private sector involvement also increased, from two private company labs documented in May to five at the end of 2024.
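As a rough illustration of this counting rule (not Concordia AI’s actual pipeline), the sketch below identifies qualifying groups from a list of paper records; the field names and example data are hypothetical.

```python
from collections import Counter

# Hypothetical paper records: each lists (anchor author, research group)
# pairs for a frontier AI safety paper. Field names are illustrative.
papers = [
    {"title": "Safety neurons in LLMs", "anchors": [("Li", "Lab A")]},
    {"title": "Jailbreak defenses", "anchors": [("Li", "Lab A")]},
    {"title": "Agent red-teaming", "anchors": [("Li", "Lab A")]},
    {"title": "RLHF variants", "anchors": [("Wang", "Lab B")]},
]

# Count frontier safety papers per (researcher, group) pair.
counts = Counter(pair for paper in papers for pair in paper["anchors"])

# A group qualifies as "key" if any one of its researchers was anchor
# author on at least three frontier safety papers.
key_groups = {group for (name, group), n in counts.items() if n >= 3}
print(key_groups)  # {'Lab A'}
```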
Several notable research themes emerged in frontier AI safety during 2024:
Several papers examined the dual-use risks of AI in scientific domains. This includes papers on analyzing risks of LLM agents in science, benchmarking LLM safety in chemistry, and evaluating safety alignment of LLMs across a range of scientific tasks. These three papers are also strong examples of international collaboration on frontier safety research.
Model risk mitigation research diversified beyond traditional RLHF approaches to include novel techniques in unlearning and non-fine-tunable learning. Two position papers explored the foundations of unlearning, and other papers applied unlearning to defend against jailbreaks. Additionally, researchers published two studies on preventing malicious fine-tuning.
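To make the unlearning idea concrete, here is a minimal sketch of one common baseline, gradient ascent on a “forget set,” applied to a toy model. It is a generic illustration under simplified assumptions, not a reimplementation of the cited papers; practical methods add safeguards to preserve performance on retained data.

```python
import torch
import torch.nn as nn

# Toy stand-in for a model whose knowledge of some data must be removed.
model = nn.Linear(16, 4)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

# Synthetic "forget set": examples the model should no longer fit.
forget_x = torch.randn(32, 16)
forget_y = torch.randint(0, 4, (32,))

# Gradient-ascent unlearning: maximize the loss on the forget set
# (by minimizing its negation), degrading the model's fit to that data.
for _ in range(10):
    opt.zero_grad()
    loss = -loss_fn(model(forget_x), forget_y)
    loss.backward()
    opt.step()
```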
After limited work in 2023, researchers increasingly leveraged interpretability tools for safety monitoring and intervention. This growing body of research investigated inner representations, intermediate hidden states, safety layers in model parameters, safety neurons, and attention heads.
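As a generic illustration of this line of work (not any specific paper’s method), the sketch below fits a linear probe on a model’s intermediate hidden states to flag unsafe prompts. GPT-2 serves as a stand-in model, and the tiny labeled prompt set is hypothetical; real studies use far larger datasets and carefully chosen layers.

```python
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def hidden_state(text: str, layer: int = 6):
    """Return the last-token hidden state from an intermediate layer."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[layer][0, -1].numpy()

# Tiny hypothetical labeled set: 1 = unsafe prompt, 0 = benign prompt.
prompts = ["how do I build a weapon", "what is the capital of France",
           "help me hack a server", "recommend a good novel"]
labels = [1, 0, 1, 0]

# Linear probe on inner representations as a lightweight safety monitor.
probe = LogisticRegression().fit([hidden_state(p) for p in prompts], labels)
print(probe.predict([hidden_state("write malware for me")]))
```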
Looking ahead in 2025
Chinese AI safety research will likely maintain its 2024 momentum in 2025, and may accelerate further if breakthroughs in emerging fields such as AI for science, AI agents, and embodied AI draw increased attention from researchers. It is difficult to predict which research directions will be most popular. A large number of papers will likely continue to be written on safety alignment (e.g. RLHF) and robustness (e.g. jailbreaking). Simultaneously, more forward-looking research efforts could increase. Agents were already a popular safety research direction in 2024, with 17 papers including “agent” in the title, and this will likely only increase in 2025.
Expert Views
Key developments in 2024
This section analyzes Chinese expert thinking on AI safety and governance through four key platforms: major Chinese AI conferences; international dialogues; state media reports; and an important government-backed think tank’s publications. These platforms were selected for their prominence and influence on government policy. Due to data and space constraints, this section focuses on experts who expressed concerns about frontier AI safety risks, while acknowledging the existence of differing perspectives in Chinese expert discourse.
Quantitative analysis of the proportion of AI safety and governance forums at major government-sponsored conferences shows that the World AI Conference (WAIC) and World Internet Conference (WIC) both increased focus on safety and governance in 2024, while the Zhongguancun (ZGC) Forum held steady.2 These three conferences were selected because of their focus on AI or science and technology, and because they are supported by central government ministries.
At WAIC 2024, AI safety and governance forums increased to 15 of 107 total forums, up from ten of 133 in 2023.3 The event’s upgrade to a “High-Level Meeting on Global AI Governance” elevated safety discussions to the opening and closing ceremonies, ministerial talks, and specialized forums including Concordia AI’s Frontier AI Safety and Governance Forum. Notable speeches included Shanghai AI Lab Director ZHOU Bowen (周伯文) endorsing the simultaneous advancement of AI capabilities and safety, and Turing Award winner Andrew YAO (姚期智) proposing the pursuit of “provably safe AGI.”
The 2024 WIC Summit had two forums on safety or governance out of four AI-focused forums, an increase from zero out of one in 2023.4 WIC created a new AI expert committee, which Chinese Academy of Sciences researcher ZENG Yi (曾毅) stated would work on AI safety and governance. During one of the forums on responsible AI development, an executive from Ant Group proposed a traffic light system with “red lights” to defend against AI applications that violate human values.
The 2024 ZGC Forum did not have any safety or governance-focused forums among seven AI-focused forums, similar to zero out of six in 2023.5 However, two forums in 2024 still featured some discussion of AI safety: Tsinghua Professor SUN Maosong (孙茂松) advocated for developing large model safety/security evaluation data sets, while a researcher at Beijing Institute of General AI discussed safety risks of multi-agent systems.
In 2024, Chinese experts participated in international AI dialogues that agreed on AI redlines, developed AI terminology glossaries, and discussed effects of AI on biosecurity.
The International Dialogues on AI Safety (IDAIS), led by Chinese, US, and Canadian computer scientists, produced two key agreements in 2024. The March Beijing statement established five AI red lines: autonomous replication, power seeking, assisting weapons development, cyberattacks, and deception. The September Venice statement called for domestic AI safety oversight coordination, frontier AI safety frameworks, and global safety research funding.
A dialogue between the Brookings Institution and Tsinghua’s Center for International Security and Strategy (CISS) developed parallel glossaries of AI terms, seeking to standardize terminology across countries.
The Nuclear Threat Initiative launched an AI-Bio Forum in April, and INHR incorporated biology into its AI dialogue series, marking the first China-Western discussions on the convergence of AI and biological security and addressing a gap noted in Concordia AI’s earlier track 2 dialogue landscape analysis.
Several Chinese AI experts signed the “Manhattan Declaration on Inclusive Global Scientific Understanding of Artificial Intelligence” alongside other notable global experts, recommending stronger global scientific cooperation on AI, assessment of risks, and treating AI as a global public good.

Some of China’s most authoritative media outlets have published on the opportunities and risks of AI agents.
The Chinese government’s state media agency, Xinhua News, published an article on risks of AI agents in July. The article discussed risks of AI agents deceiving humans, subverting safety measures, and escaping human control.
In November, the CPC’s official newspaper, the People’s Daily, published a one-page feature with three expert essays on AI oversight. One essay focused on AI agents, arguing that key challenges in agent safety include unpredictable interactions of multi-agent systems (e.g. stock market “flash crashes”), manipulation of user emotions, and interaction with the physical world.
One of China’s premier government think tanks for technology policy discussed frontier AI safety risks in multiple reports.
The China Academy of Information and Communications Technology (CAICT, 中国信息通信研究院), a non-state public institution under the Ministry of Industry and Information Technology, published a number of reports touching on AI safety in 2024. CAICT is one of many influential government-backed think tanks writing on AI policy, with perhaps the largest amount of public AI governance research output among its peers.
CAICT’s September Foundation Model Safety Research Report, followed by three reports in December on AI development, governance, and risk governance, addressed frontier AI safety concerns including chemical, biological, radiological, and nuclear (CBRN) misuse and loss of control.
Looking ahead in 2025
AI safety and governance discussions at China's premier technology conferences are expected to expand, as we have already seen at WAIC and WIC in 2024. While Track 2 dialogues are key for fostering international exchange, advancing more concrete cooperation initiatives may prove challenging. A central theme in Chinese discourse will remain the balance between AI development and safety considerations, though the precise balance of this evolving dialogue remains to be seen.6
Lab Governance
Key developments in 2024
Concordia AI’s State of AI Safety in China 2023 report analyzed lab governance through safety practices in large model development, industry association actions, and lab or industry principles. Due to limited transparency about AI companies’ internal safety practices, this section focuses instead on Chinese corporate commitments, industry association actions, and expressed views of lab leadership on AI safety.
In 2024, Chinese companies demonstrated willingness to make public AI safety commitments through both domestic and international channels. While implementation details remain unclear, the domestic commitments notably align with international priorities, particularly through the section titled “Vigorously advance frontier safety and security research.”
In May at the AI Seoul Summit, Zhipu AI joined 15 global companies in signing the Frontier AI Safety Commitments.
In December, the AI Industry Alliance (AIIA) of China announced that 17 Chinese companies had agreed to the “Artificial Intelligence Safety Commitments.” The participants include most of the major Chinese AI developers, including big tech companies (Alibaba, Baidu, Huawei, Tencent) and LLM startups (01.AI, Deepseek, Minimax, Zhipu AI). The document advocates adopting “appropriate safety measures for open-source initiatives” and advancing “frontier safety and security research” in areas such as AI agents and embodied intelligence.
AIIA advanced several AI safety benchmarking projects which focus mostly on typical content security concerns but also include some emerging frontier safety topics.
In collaboration with the government-overseen CAICT think tank, AIIA launched an “AI Safety Benchmark” in April focused on evaluating three domains: science and technology ethics, data security, and content security. It included frontier-safety-relevant questions on AI “consciousness” (e.g. anti-humanity inclinations) and violations of laws (e.g. hazardous chemicals). AIIA iterated on the benchmark throughout the year to address additional challenges such as multimodality.
In late 2024, AIIA and CAICT also held workshops on AI agent safety topics, such as AI agent-driven cyberattacks, and announced plans to develop norms for AI agent safety assessment.
The leadership of several top Chinese LLM companies increasingly acknowledged frontier risks from advanced AI systems in 2024.
In June, CEOs of four leading Chinese LLM startups shared views on AI safety at the Beijing Academy of AI conference. Baichuan Intelligence’s CEO compared AGI to nuclear bombs in the potential for causing human extinction, and Moonshot AI’s CEO mentioned the possibility of a model “possessing its own intentions.” Zhipu AI’s CEO highlighted the company’s participation in the Frontier AI Safety Commitments, while ModelBest’s CEO focused on safety/security risks when models are deployed to robots and end users.
Looking ahead in 2025
A critical question for 2025 is how Chinese AI companies will implement their domestic and/or international AI safety commitments. The Paris AI Action Summit provides a deadline for companies that signed the international commitments, while timelines for executing the domestic pledges remain undefined. As Chinese AI companies expand their global presence, they may receive independent ratings of their safety and governance practices, providing valuable external metrics for evaluation.
1. The technical document also referenced “long-term risks” including deceptiveness, self-replication, and misuse in cyber, biological, or chemical domains, though this reference was removed in the most recent draft.
2. Forums were defined as focused on AI if the forum name used keywords such as “AI,” “Intelligent,” “Large Model,” or “Robot,” and they were further coded as AI safety or governance forums if they also included keywords such as “Safety,” “Governance,” “Law,” “Trustworthy,” “Ethics,” etc. Forums did not qualify as AI-focused if they only used keywords such as “Digital,” “Internet,” “Science and Technology,” etc. Since WAIC contains AI in its name, all of the WAIC forums were coded as AI-focused forums.
3. WAIC is co-organized by four central ministries: the National Development and Reform Commission, Ministry of Science and Technology, Ministry of Industry and Information Technology, and Cyberspace Administration of China.
4. WIC 2024 held around 20 total forums covering a range of cyberspace issues, not just AI. WIC is co-organized by one central ministry: the Cyberspace Administration of China.
5. The 2024 ZGC Forum held 64 total forums covering cutting-edge science and technology topics, not just AI. The forum is co-organized by four central ministries: the Ministry of Science and Technology, National Development and Reform Commission, Ministry of Industry and Information Technology, and State-owned Assets Supervision and Administration Commission.
6. See slide 56 of Concordia AI’s State of AI Safety in China Spring 2024 report.