Addressing Uncertainty in LLM Outputs for Trust Calibration Through Visualization and User Interface Design

Helen Armstrong; Ashley Lynne Anderson; Rebecca Planchart; Kweku Baidoo; Matthew Peterson

doi:10.34314/jjw1mf43

The same textual output from a large language model is presented eight times. Each presentation highlights a few portions of the text and utilizes a form of highlighting to indicate uncertain assertions. These highlights vary in color and form.

Published: 2025-08-15

DOI: https://doi.org/10.34314/jjw1mf43

Keywords:

explainable AI, human-machine teaming, intelligence analysis, large language models, trust calibration, uncertainty, user interface design, visual representation

Helen Armstrong

North Carolina State University

Ashley Lynne Anderson

Virginia Tech

Rebecca Planchart

North Carolina State University

Kweku Baidoo

North Carolina State University

Matthew Peterson

North Carolina State University

Abstract

Large language models (LLMs) are becoming ubiquitous in knowledge work. However, the uncertainty inherent to LLM summary generation limits the efficacy of human-machine teaming, especially when users are unable to properly calibrate their trust in automation. Visual conventions for signifying uncertainty and interface design strategies for engaging users are needed to realize the full potential of LLMs. We report on an exploratory interdisciplinary project that resulted in four main contributions to explainable artificial intelligence in and beyond an intelligence analysis context. First, we provide and evaluate eight potential visual conventions for representing uncertainty in LLM summaries. Second, we describe a framework for uncertainty specific to LLM technology. Third, we specify 10 features for a proposed LLM validation system — the Multiple Agent Validation System (MAVS) — that utilizes the visual conventions, the framework, and three virtual agents to aid in language analysis. Fourth, we provide and describe four MAVS prototypes, one as an interactive simulation interface and the others as narrative interface videos. All four utilize a language analysis scenario to educate users on the potential of LLM technology in human-machine teams. To demonstrate applicability of the contributions beyond intelligence analysis, we also consider LLM-derived uncertainty in clinical decision-making in medicine and in climate forecasting. Ultimately, this investigation makes a case for the importance of visual and interface design in shaping the development of LLM technology.

Issue

Vol. 59 No. 2 (2025): August 2025

Section

Research Article

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Author Biographies

Helen Armstrong, North Carolina State University

Helen Armstrong is a professor of graphic and experience design and the director of the MGXD program at NC State University. Her research focuses on digital rights, human-machine teaming, and accessible design. Armstrong authored Graphic Design Theory and Digital Design Theory, and co-authored Participate: Designing with User-Generated Content. Her recent book, Big Data, Big Design: Why Designers Should Care About Artificial Intelligence, demystifies AI (specifically machine learning) while inspiring designers to harness this technology and establish leadership via thoughtful human-centered design. Armstrong is a past member of the AIGA National Board of Directors and of the editorial board of Design and Culture, and a former chair of the AIGA Design Educators Community.

Ashley Lynne Anderson, Virginia Tech

Ashley L. Anderson is an assistant professor of graphic design at Virginia Tech and a candidate for the PhD in Design at NC State University. Her research focuses on human-centered design and visual representation, particularly in the context of mental health and psychological intervention. She examines how design can shape and enhance the theories, processes, and methods used in psychological intervention.

Rebecca Planchart, North Carolina State University

Rebecca Planchart is a product designer at Pendo.io, a software experience management solution, where she supports enterprise platform and conversational AI initiatives. Her past research explored explainability and trust calibration in AI systems through UX and UI strategies. She is particularly interested in leveraging explainable AI to support users in high-stakes decision-making contexts.

Kweku Baidoo, North Carolina State University

Kweku Baidoo is a lecturer in graphic and experience design at NC State University. His work explores trust-centered design and visual strategies that support human understanding of complex AI systems. He is particularly interested in how AI-assisted decision-making can be designed to enhance appropriate user trust and performance in high-stakes domains such as healthcare.

Matthew Peterson, North Carolina State University

Matthew Peterson is an associate professor of graphic and experience design at NC State University. His research focuses on visual representation in user interface design, recently including the facilitation of AI in intelligence analysis workflows through human-machine teaming, text-image integration in immersive user information systems, and the facilitation of scale cognition and numeracy in virtual environments.
