The CCP’s Two-Track Approach to AI Training

October 19, 2024 Topic: Security Region: Asia Blog Brand: Techland Tags: Artificial Intelligence, China, CCP, Big Data, Surveillance, Dual-Use Systems

To outcompete Chinese AI model developers and deployers, the United States and its allies should bet on openness, private sector investment, and tailored government action to foster an environment conducive to experimentation and innovation.

The Chinese Communist Party (CCP) has made clear its intention to become a world leader in developing and deploying artificial intelligence (AI) models. For now, the United States and American companies remain the leaders in developing the cutting-edge hardware and software that deliver ever-more-powerful AI models. Access to data, however, is a growing concern for American AI developers, as lawsuits allege that analyzing copyrighted data as part of model training infringes on copyright holders’ rights. If the United States is to field competitive AI models and offer an alternative to China’s authoritarian vision, access to data and the freedom to train and improve models within the United States will be paramount.

The Chinese government is taking a two-track approach to AI governance: aggressively regulating and controlling input data and outputs for public-facing generative models while imposing few, if any, restrictions on the development and deployment of models in enterprise, research, and military contexts.

The CCP’s approach to regulating AI training illustrates the dangers of letting China overtake the United States in AI development. In 2023, the Cyberspace Administration of China (CAC) issued guidance outlining restrictions and rules for training generative AI models and providing them to the Chinese public, including guidance on the types of data, such as copyrighted information, that can be used to train a model. The CCP’s National Information Security Standardization Technical Committee (NISSTC) recently released new draft regulations governing the development and use of generative AI. The updated regulations impose additional requirements on the data model providers use, such as requiring express consent to use copyrighted information, ensuring model outputs do not undermine core socialist values, and removing any data that includes obscenities or violence.

However, upon closer inspection, model developers working on AI models to support the CCP’s techno-industrial agenda continue to play by different rules. Both the CAC and NISSTC regulations exempt developers who are not offering generative services to the public from the same restrictions on data access, transparency, and safety testing. If models are developed in or used by “enterprises, research and academic institutions, and other public entities,” they are not bound by such guidance.

Because of this loophole, as well as long-standing national security laws requiring Chinese firms to share information in support of military and intelligence operations, Chinese law supports techno-industrial uses of AI while enforcing strict rules on AI models used by the public. This two-track approach ensures that Chinese firms can move quickly and iterate in areas deemed crucial to national competitiveness while maintaining a strict hold over tools that would give Chinese citizens greater access to information and expression.

The CCP is already leveraging AI to serve its global ambitions through digital surveillance and the repurposing of dual-use technologies to benefit the Chinese military and intelligence agencies. The CCP has worked to export Chinese-made hardware and software throughout the developed and developing world. These technologies have been used to deploy AI-powered surveillance in service of the Chinese state as well as regimes aligned with the CCP. Such technologies are not “public-facing” in that citizens will not use them to generate content. Therefore, firms developing such models would be free to use any data they see fit to train and improve these systems.

Beyond surveillance, AI systems may be used to gain an advantage in armed conflict. Autonomous drones have been an increasing feature of the war in Ukraine and fighting across the Middle East. Chinese companies such as DJI and Autel are world leaders in drone hardware and software. Combining existing enterprise technologies with untold amounts of data and AI capabilities could be a boon to the modern warfighter. Considering the close-knit relationships among Chinese academia, industry, and military, this is an obvious area where the CCP’s approach to AI development and diffusion would benefit it militarily.

While the United States and American firms are developing and deploying cutting-edge AI models, the United States’ AI ecosystem faces significant headwinds in one area: training data. Many leading American AI model developers have been sued by firms and individuals alleging that the use of copyrighted works in model training violates copyright law. If successful, these lawsuits could wipe out leading American model developers and significantly curtail access to the data necessary to build future AI models.

To outcompete Chinese AI model developers and deployers, the United States and its allies should bet on openness, private sector investment, and tailored government action to foster an environment conducive to experimentation and innovation. The first step is ensuring that model developers in the United States and other open, democratic nations can access the data necessary to train their models. Countries such as Japan, Singapore, and Israel have already clarified their laws on text and data mining to promote AI development.

Policymakers and stakeholders within the United States should continue to monitor ongoing litigation and consider frameworks to address legitimate concerns of rights holders without cutting off access to public training data and stifling AI model development. We need to ensure that American model developers are free to train and iterate upon their existing technology. Failing to consider how an overly restrictive interpretation of copyright law would disadvantage American model developers is a strategic error that the United States cannot afford to make. 

Joshua Levine is the Technology Policy Manager at the Foundation for American Innovation.

Image: Dan74 / Shutterstock.com.