EVENT

DARPA Explainable AI Program

The 2016–2021 Defense Advanced Research Projects Agency initiative that brought Gary Klein's cognitive psychology framework into the heart of AI research on human-machine trust.
The DARPA Explainable Artificial Intelligence (XAI) program, launched in 2016, addressed what DARPA leadership identified as one of the most significant barriers to effective deployment of AI systems: users did not understand how the systems worked, did not know when to trust them, and did not know how to detect failures. DARPA assembled eleven teams of AI researchers to build more explainable systems, and — in a decision that reveals something important about the program's philosophical sophistication — established a separate team of cognitive psychologists led by Klein, tasked with understanding what explanation actually means from the perspective of the humans who need it. The distinction between technical explainability and effective human oversight became the program's defining intellectual contribution, producing both the AIQ toolkit for user-centered assessment and a research literature on the cognitive requirements for appropriate AI trust.

In The You On AI Encyclopedia

The program's structure reflected an insight that much subsequent AI research has struggled to internalize: explanation and understanding are not the same thing. A system can generate an explanation that is technically accurate and cognitively useless — telling the user which variables contributed to the prediction without helping the user understand why those variables matter, how the system would behave if the situation changed, or what kinds of errors the system is prone to. Klein's team focused on the gap between these two, developing frameworks for what users actually need in order to form accurate mental models of system behavior.

The program's outputs include the AIQ (Artificial Intelligence Quotient) toolkit, a set of non-algorithmic assessment instruments designed to help users identify the boundaries of AI system competence. Klein framed the name deliberately — the goal was not to measure the AI's intelligence but to raise the user's IQ about the AI systems they work with. The toolkit moves beyond local explanations toward global competence mapping, supporting the construction of the mental models that calibrated trust requires.

Trust Calibration (Klein)

The program had mixed influence on the broader AI field. Its technical work on generating explanations influenced subsequent research in interpretable machine learning, but its deeper philosophical contribution — that effective oversight requires cognitive resources different from technical transparency — was largely absorbed into human-factors research rather than becoming central to mainstream AI development. The gap between what DARPA XAI established and what production AI systems actually provide remains a significant structural feature of the field.

Klein's subsequent writing on AI has drawn extensively on lessons from the program, particularly the recognition that AI explanation designed by AI researchers tends to satisfy AI researchers rather than the domain experts who need to oversee AI outputs. The insight connects to his broader research program on expertise: effective oversight depends on experiential foundations that AI explanations alone cannot provide.

Origin

DARPA announced the XAI program in 2016 as a response to the growing recognition that the 'black box' character of deep learning systems was impeding military adoption. Program manager David Gunning led the effort, which ran until approximately 2021 and produced dozens of technical papers, evaluation frameworks, and demonstration systems.

Klein's involvement began at the program's inception and extended through its conclusion, making him one of the few non-AI-researcher principal investigators with sustained influence on program outputs. The institutional setup — a cognitive psychology team alongside AI research teams — was itself an unusual recognition that the problem the program was addressing was not purely technical.

Key Ideas

Interpretability Problem

Explanation versus understanding. The program's central insight was that technical transparency and effective oversight are related but distinct.

User-centered assessment. The AIQ toolkit provided non-algorithmic instruments for users to map AI competence boundaries.

Mental model construction. Effective oversight requires users to build global models of system behavior, not only local explanations of specific outputs.

Failure-mode exposure. Users need experience with system failures to develop calibrated trust.

The program had mixed influence on the broader AI field

Cognitive over technical framing. The program demonstrated that AI oversight problems are substantially human-factors problems, not only engineering problems.

Further Reading

  1. Gunning, D., et al. (2019). XAI—Explainable artificial intelligence. Science Robotics, 4(37).
  2. Hoffman, R. R., Mueller, S. T., Klein, G., & Litman, J. (2018). Metrics for explainable AI: Challenges and prospects. arXiv:1812.04608.
  3. Klein, G., Hoffman, R. R., & Mueller, S. T. (2019). Scorecard for self-explaining capabilities of AI systems. DARPA XAI Technical Report.
  4. Mueller, S. T., Hoffman, R. R., Clancey, W. J., Emrey, A., & Klein, G. (2019). Explanation in human-AI systems. arXiv:1902.01876.