Problem Statement

There is widespread interest across the DoD in capabilities that address the novel challenges posed by the T&E of AI. On this page, we outline some of the existing problems and the requirements that follow from them.

Previous limitations

While many AI T&E capabilities have been developed by previous DoD AI programs, their use throughout the Department has been limited by several key factors, including:

Lack of functionality for advanced AI T&E functions, such as robustness or bias testing
Difficulty in deployment or use within DoD environments, including deployment into high-side environments
Difficulty in integration with MLOps pipelines other than the particular pipeline the capability was designed for
Difficulty in application to AI use cases other than the particular use case the capability was designed for
Inability to integrate with, or be used alongside, other tools from open-source or industry
Inability of capabilities to evolve with developments in AI/ML research

While many AI T&E capabilities exist within the academic and open-source communities, their use throughout the Department has also been limited by several factors, including:

Difficulty in usage by non-AI/ML experts
Lack of stability, maturity, or security
Lack of scalability
Lack of focus on operationally realistic problems

These limitations, together with other factors such as a lack of best practices for AI T&E, have inhibited the ability of DoD AI programs to perform sufficient T&E of their AI models.

Documented need

The formally documented need for our program comes from the National Artificial Intelligence Test & Evaluation Infrastructure Capability report, issued by CDAO in August 2023.

The National Artificial Intelligence Test & Evaluation Infrastructure Capability

This report was developed in response to the Fiscal Year 23 Program Decision Memorandum, in which the Deputy Secretary of Defense and the Office of Cost Assessment and Program Evaluation requested more information to support data-driven decisions about AI T&E infrastructure investments at the DoD enterprise level.

Quote

Program Decision Memorandum Terms of Reference (TOR)

AI T&E capability is critical for development and fielding of autonomous and artificial intelligence systems. DoD must target investments in this area at key gaps, leverage existing infrastructure in government, industry, and academia where possible, and ensure alignment of T&E investments at all levels—from the programs up to the enterprise—with a coherent vision. This study is essential to inform the decision space for AI T&E investments, and the TOR outlines criteria that will be used to evaluate the roadmap’s utility in this regard.

Quote

There is widespread interest for DoD enterprise-level T&E infrastructure to address the novel and exacerbated challenges posed by the T&E of AI. While programs are currently investing locally in T&E resources … there is still a consistent desire across survey programs for DoD enterprise support.