How the mandate for federal agencies to test AI systems may impact wider AI model testing requirements
The new Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence requires many federal agencies to develop and deploy guidance, oversight frameworks, and other governance mechanisms aimed at monitoring AI and ensuring that its use and deployment adhere to existing federal law.
The Order, issued on October 30, is an extensive directive aimed at advancing a coordinated federal government-wide approach to the safe and responsible development and use of AI. It takes both a general and a fine-grained approach to achieving its aim by first outlining several overarching principles and goals for the development and deployment of AI and then providing mandates to a number of primary administrative agencies regarding particular actions that must be taken to ensure that AI is developed and deployed responsibly, equitably, and in compliance with federal law.
In this alert, we describe how these new White House-mandated governance mechanisms may impact the use and development of AI, specifically with regard to AI model testing requirements.
The eight principles
The Order begins by describing eight principles that govern the federal government’s development and use of AI and guide the oversight of entities operating in the private sector. The eight principles are:
- Ensuring the Safety and Security of AI Technology
- Promoting Innovation and Competition
- Supporting Workers
- Advancing Equity and Civil Rights
- Protecting Consumers, Patients, Passengers, and Students
- Protecting Privacy
- Advancing Federal Government Use of AI
- Strengthening American Leadership Abroad
The Order then apportions broad directives to a multitude of administrative agencies covering all types of AI. Special attention is given to “dual-use foundation models.” As defined by the Order, a dual-use foundation model is an AI model which “is trained on broad data, generally uses self-supervision, contains at least tens of billions of parameters, is applicable across a wide range of contexts, and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters.”
A large portion of the Order is dedicated to dispensing mandates aimed at mitigating societal harms posed by the use and development of AI, with particular focus on unintended bias and discrimination potentially produced by the use of AI and other algorithmic systems. The Order notes, “AI reflects the principles of the people who build it, the people who use it, and the data upon which it is built.” This means that even though AI holds extraordinary promise to help solve urgent challenges, “AI systems deployed irresponsibly have reproduced and intensified existing inequities, caused new types of harmful discrimination, and exacerbated online and physical harms.”
The Order outlines several deadlines for agencies to complete various tasks related to advancing the development and governance of AI in the United States.
Within 90 days, the Order requires:
- The Secretary of Commerce to require developers of foundation models to report to the federal government on their activities, including on the model’s performance in relevant AI red-team testing
- The Secretary of Health and Human Services to establish an HHS AI Task Force
- The Secretary of Transportation to direct appropriate federal advisory committees of the DOT to provide advice on the safe and responsible use of AI in transportation
- The Assistant Attorney General in charge of the Civil Rights Division to convene a meeting of the heads of federal civil rights offices to discuss, in part, prevention of discrimination in the use of automated systems, including algorithmic discrimination, and to provide guidance on best practices for investigating and prosecuting civil rights violations and discrimination related to automated systems, including AI
Within 180 days, the Order requires:
- The Secretary of Agriculture to issue guidance on use of AI in benefits programs, including analysis of whether algorithmic systems in use by benefit programs achieve equitable outcomes
- The Secretary of Defense and Secretary of Homeland Security to complete an operational pilot project to identify, develop, test, evaluate, and deploy AI capabilities, such as large language models, to aid in the discovery and remediation of vulnerabilities in critical United States government software, systems, and networks
- The Secretary of Health and Human Services to publish a plan addressing the use of algorithmic systems for public benefits and services administered by the Secretary, to promote analysis of whether algorithmic systems in use by benefit programs achieve equitable and just outcomes, and to direct HHS components to develop a strategy to determine whether AI-enabled technologies in the health and human services sector maintain appropriate levels of quality, and to advance the prompt understanding of and compliance with federal nondiscrimination laws by health and human service providers that receive federal financial assistance and how those relate to AI
- The Secretary of Housing and Urban Development and the Consumer Financial Protection Bureau to issue guidance addressing the use of tenant screening systems in ways that may violate federal laws, including how the use of data, such as criminal records, eviction records, and credit information, can lead to discriminatory outcomes in violation of federal law; how federal laws such as the Fair Housing Act apply to the advertising of real estate-related transactions through digital platforms, including those that use algorithms to facilitate advertising delivery; and best practices to avoid violations of federal law
- The Secretary of Labor to develop and publish principles and best practices for employers that could be used to mitigate AI's potential harms to employees' well-being and maximize its potential benefits
Within 240 days, the Order requires the Secretary of Commerce to provide a report to the Director of OMB and the Assistant to the President for National Security Affairs identifying existing and potential techniques for watermarking or otherwise detecting synthetic content, along with methods for testing them.
Within 270 days, the Order requires:
- The Secretary of Commerce, acting through the Director of NIST, to promote consensus industry standards for developing and deploying safe, secure, and trustworthy AI systems and to establish appropriate guidelines enabling developers of AI, especially of dual-use foundation models, to conduct AI red-teaming tests (except for AI used as a component of a national security system)
- The Secretary of Energy, in coordination with the heads of other Sector Risk Management Agencies, to implement a plan for developing the Department of Energy's AI model evaluation tools and AI testbeds
Within 365 days, the Order requires:
- The Secretary of Education to develop resources with relevant stakeholders which address safe, responsible, and nondiscriminatory uses of AI in education, including the impact AI systems have on vulnerable and underserved communities
- The Secretary of Labor to publish guidance for federal contractors regarding nondiscrimination in hiring involving AI and other technology-based hiring systems
- The Secretary of State and the Administrator of the United States Agency for International Development to publish an AI in Global Development Playbook
- The Secretary of Health and Human Services to establish an AI safety program that, among other requirements, establishes a common framework for approaches to identifying and capturing clinical errors resulting from AI deployed in healthcare settings as well as specifications for a central tracking repository for associated incidents that cause harm, including through bias or discrimination, to patients, caregivers, or other parties
Testing AI models and systems
Notably, the Order uses the testing of AI models and systems as a keystone of its AI policy and calls for several agencies to play a role in developing guidance for the testing of both traditional AI and generative AI systems.
The importance of AI system testing to the White House’s AI policy is clear. The Order states, “Testing and evaluations will help ensure that AI systems function as intended, are resilient against misuse or dangerous modifications, are ethically developed and operated in a secure manner, and are compliant with applicable federal laws and policies.” In particular, there is guidance specifically for the Attorney General (AG), the Department of Health and Human Services (HHS), the Consumer Financial Protection Bureau (CFPB), the Department of Commerce and the National Institute of Standards and Technology (NIST), the Department of Labor (DOL), and the Department of Energy (DOE) on creating guidance on and implementing testing of AI systems.
Beyond specific instructions to particular administrative agencies to test AI systems for harmful bias, the AG is tasked with generally addressing “unlawful discrimination and other harms that may be exacerbated by AI,” and has a mandate to “support agencies in their implementation and enforcement of existing Federal laws to address civil rights violations and discrimination related to AI.” Moreover, there is also a sweeping mandate across all federal civil rights offices to “prevent and address discrimination in the use of automated systems, including algorithmic discrimination.”
As noted above, implicit in these White House mandates is the expectation that algorithmic systems be tested in order to expose and mitigate harmful bias and bring them in line with applicable federal laws and policies.
Within the healthcare sector, the Order instructs HHS to foster the responsible use of AI in healthcare and benefits administration. This includes the establishment of an HHS AI Task Force, whose responsibility will be to “develop a strategic plan that includes policies and frameworks […] on responsible deployment and use of AI and AI-enabled technologies in the health and human services sector [including] long-term safety and real-world performance monitoring of AI-enabled technologies.”
The Order also mandates HHS to create an AI safety program to establish “a common framework for approaches to identifying and capturing clinical errors resulting from AI deployed in healthcare settings as well as specifications for a central tracking repository for associated incidents that cause harm, including through bias or discrimination, to patients, caregivers, or other parties.”
The Federal Housing Finance Agency and the CFPB are also “encouraged to consider using their authorities […] to require their respective regulated entities […] to use the appropriate methodologies including AI tools to ensure compliance with federal law.” This includes testing and evaluating underwriting models for bias or disparities that affect protected groups and automating collateral-valuation and appraisal processing in ways that minimize bias.
In addition, the CFPB and the Department of Housing and Urban Development are further directed to provide guidance regarding tenant screening systems and their potential violation of existing federal law, including the Fair Housing Act (FHA) and the Fair Credit Reporting Act (FCRA). They are also to provide guidance addressing how the FHA and other federal laws, such as the Equal Credit Opportunity Act, “apply to the advertising of housing, credit, and other real estate-related transactions through digital platforms.”
The mitigation of bias through the testing of AI systems is also implicated by the Order’s instructions regarding the employment sector. Under the Order, the DOL has a responsibility to “prevent unlawful discrimination from AI used for hiring” and to “publish guidance for Federal contractors regarding nondiscrimination in hiring involving AI and other technology-based hiring systems.” It is expected that testing AI systems for unintended bias will be an integral part of the DOL’s guidance on nondiscriminatory hiring involving AI.
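As a concrete illustration of the kind of quantitative screen such bias testing might include, the sketch below applies the EEOC's long-standing "four-fifths rule" of thumb to selection outcomes from a hypothetical AI hiring tool. The rule flags a group whose selection rate is less than 80 percent of the highest group's rate as warranting further investigation. All names and data here are invented for illustration; an actual testing program would involve many more metrics and legal analysis.

```python
# Hypothetical illustration of the EEOC four-fifths rule as a simple
# disparate-impact screen for an AI-assisted hiring model's outputs.

def selection_rate(outcomes):
    """Fraction of candidates selected (1 = selected, 0 = rejected)."""
    return sum(outcomes) / len(outcomes)

def four_fifths_ratios(group_outcomes):
    """Return each group's selection-rate ratio against the highest rate.

    Under the four-fifths rule of thumb, a ratio below 0.8 for any group
    is a signal that warrants further investigation.
    """
    rates = {g: selection_rate(o) for g, o in group_outcomes.items()}
    top = max(rates.values())
    return {g: r / top for g, r in rates.items()}

# Invented example data: model decisions split by demographic group.
ratios = four_fifths_ratios({
    "group_a": [1, 1, 0, 1, 1, 0, 1, 1],  # 6/8 selected -> rate 0.75
    "group_b": [1, 0, 0, 1, 0, 0, 0, 1],  # 3/8 selected -> rate 0.375
})
flagged = [g for g, r in ratios.items() if r < 0.8]
print(ratios)   # group_a ratio 1.0, group_b ratio 0.5
print(flagged)  # group_b falls below the 0.8 threshold
```

The four-fifths rule is only a screening heuristic, not a legal standard of liability, but it shows how a testing regime can translate a nondiscrimination mandate into a measurable check on a deployed system.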
The testing of generative AI and dual-use foundation models also plays a central role in the AI policy outlined in the Order. The DOE is called out specifically to act on testing generative AI, with the creation of AI testbeds, which are “facilit[ies] or mechanism[s] equipped for conducting rigorous, transparent, and replicable testing of tools and technologies, including AI and privacy-enhancing technologies, to help evaluate the functionality, usability, and performance of those tools or technologies.” These testbeds will be used to assess AI systems’ capabilities and to build foundation models.
The DOE is not alone, as the Department of Commerce is required to collect reports from companies developing foundation models on many aspects of the models and their computing, including the results of the models’ “performance in relevant AI red-team testing guidance developed by NIST […], and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security.”
DLA Piper’s AI testing services
It is clear from the White House Order that testing both generative and traditional AI systems will become an integral part of any entity’s use and development of AI. DLA Piper provides a number of AI model testing services that can help organizations prepare for federal oversight of their use of AI. To learn more, contact any of the authors.
For more information on emerging legal and regulatory standards for AI, visit DLA Piper’s Focus on Artificial Intelligence.
Gain insights and perspectives that will help shape your AI strategy through our newly released AI Chatroom video series.