publications
2024
- Promises and Pitfalls: Using Large Language Models to Generate Visualization Items. IEEE VIS, 2024
Visualization items—factual questions about visualizations that ask viewers to accomplish visualization tasks—are regularly used in the field of information visualization as educational and evaluative materials. For example, researchers of visualization literacy require large, diverse banks of items to conduct studies where the same skill is measured repeatedly on the same participants. Yet, generating a large number of high-quality, diverse items requires significant time and expertise. To address the critical need for a large number of diverse visualization items in education and research, this paper investigates the potential for large language models (LLMs) to automate the generation of multiple-choice visualization items. Through an iterative design process, we develop the VILA (Visualization Items Generated by Large LAnguage Models) pipeline, for efficiently generating visualization items that measure people’s ability to accomplish visualization tasks. We use the VILA pipeline to generate 1,404 candidate items across 12 chart types and 13 visualization tasks. In collaboration with 11 visualization experts, we develop an evaluation rulebook which we then use to rate the quality of all candidate items. The result is the VILA bank of ∼1,100 items. From this evaluation, we also identify and classify current limitations of the VILA pipeline, and discuss the role of human oversight in ensuring quality. In addition, we demonstrate an application of our work by creating a visualization literacy test, VILA-VLAT, which measures people’s ability to complete a diverse set of tasks on various types of visualizations; comparing it to the existing VLAT, VILA-VLAT shows moderate to high convergent validity (R = 0.70). Lastly, we discuss the application areas of the VILA pipeline and the VILA bank and provide practical recommendations for their use.
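As a rough sketch of the kind of generation step such a pipeline performs (the prompt wording, model name, and helper function below are hypothetical illustrations, not VILA's actual prompts or code):

```python
# Hypothetical sketch of one generation step in an LLM item pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_candidate_item(chart_type: str, task: str) -> str:
    """Ask an LLM for one multiple-choice visualization item."""
    prompt = (
        f"Write one multiple-choice question testing the ability to "
        f"'{task}' on a {chart_type}. Provide four options (A-D), mark "
        f"the correct answer, and keep the question factual."
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Candidate items would then be screened by humans against an expert
# rulebook, as the paper's quality-evaluation step describes.
candidates = [generate_candidate_item("bar chart", "retrieve value")
              for _ in range(3)]
```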
- Odds and Insights: Decision Quality in Visual Analytics Under Uncertainty. ACM CHI, 2024
Recent studies have shown that users of visual analytics tools can have difficulty distinguishing robust findings in the data from statistical noise, but the true extent of this problem is likely dependent on both the incentive structure motivating their decisions, and the ways that uncertainty and variability are (or are not) represented in visualisations. In this work, we perform a crowd-sourced study measuring decision-making quality in visual analytics, testing both an explicit incentive structure designed to reward cautious decision-making and a variety of designs for communicating uncertainty. We find that, while participants are unable to control for false discoveries as well as idealised statistical models such as the Benjamini-Hochberg procedure, certain forms of uncertainty visualisations can improve the quality of participants’ decisions and lead to fewer false discoveries than not correcting for multiple comparisons. We conclude with a call for researchers to further explore visual analytics decision quality under different decision-making contexts, and for designers to directly present uncertainty and reliability information to users of visual analytics tools. This paper and the associated analysis materials are available at: https://osf.io/xtsfz/
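The Benjamini-Hochberg procedure referenced above is simple to state; here is a minimal, self-contained implementation of the standard step-up algorithm (illustrative code, not from the paper's analysis materials):

```python
# Benjamini-Hochberg step-up procedure: given p-values and a target
# false discovery rate q, report which hypotheses to reject.
def benjamini_hochberg(p_values, q=0.05):
    m = len(p_values)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-indexed) with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose p-value ranks at or below k_max.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            reject[idx] = True
    return reject

# Example: only the two smallest p-values survive correction at q = 0.05.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]))
```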
2023
- Adaptive Assessment of Visualization Literacy. IEEE VIS, 2023
Visualization literacy is an essential skill for accurately interpreting data to inform critical decisions. Consequently, it is vital to understand the evolution of this ability and devise targeted interventions to enhance it, requiring concise and repeatable assessments of visualization literacy for individuals. However, current assessments, such as the Visualization Literacy Assessment Test (VLAT), are time-consuming due to their fixed, lengthy format. To address this limitation, we develop two streamlined computerized adaptive tests (CATs) for visualization literacy, A-VLAT and A-CALVI, which measure the same set of skills as their original versions in half the number of questions. Specifically, we (1) employ item response theory (IRT) and non-psychometric constraints to construct adaptive versions of the assessments, (2) finalize the configurations of adaptation through simulation, (3) refine the composition of test items of A-CALVI via a qualitative study, and (4) demonstrate the test-retest reliability (ICC: 0.98 and 0.98) and convergent validity (correlation: 0.81 and 0.66) of both CATs via four online studies. We discuss practical recommendations for using our CATs and opportunities for further customization to leverage the full potential of adaptive assessments.
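The core step of a computerized adaptive test is choosing the next item to maximize information at the current ability estimate. Below is a minimal sketch under a two-parameter logistic (2PL) IRT model; the item parameters are made up for illustration and are not A-VLAT's or A-CALVI's actual configuration:

```python
import math

# 2PL item response model: probability of a correct response given
# ability theta, discrimination a, and difficulty b.
def p_correct(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Fisher information of a 2PL item at ability theta.
def item_information(theta, a, b):
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Pick the unadministered item that is most informative at the
# current ability estimate -- the selection step of a CAT.
def select_next_item(theta_hat, item_bank, administered):
    remaining = [i for i in range(len(item_bank)) if i not in administered]
    return max(remaining,
               key=lambda i: item_information(theta_hat, *item_bank[i]))

# (a, b) pairs are invented parameters for illustration.
bank = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7), (1.0, 1.2)]
print(select_next_item(theta_hat=0.6, item_bank=bank, administered={0}))
```

A real CAT would interleave this selection step with re-estimating theta after each response and would also enforce non-psychometric constraints (e.g., chart-type coverage), as the paper describes.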
- CALVI: Critical Thinking Assessment for Literacy in Visualizations. Lily W. Ge, Yuan Cui, and Matthew Kay. ACM CHI, 2023
Visualization misinformation is a prevalent problem, and combating it requires understanding people’s ability to read, interpret, and reason about erroneous or potentially misleading visualizations, which lacks a reliable measurement: existing visualization literacy tests focus on well-formed visualizations. We systematically develop an assessment for this ability by: (1) developing a precise definition of misleaders (decisions made in the construction of visualizations that can lead to conclusions not supported by the data), (2) constructing initial test items using a design space of misleaders and chart types, (3) trying out the provisional test on 497 participants, and (4) analyzing the test tryout results and refining the items using Item Response Theory, qualitative analysis, a wrong-due-to-misleader score, and the content validity index. Our final bank of 45 items shows high reliability, and we provide item bank usage recommendations for future tests and different use cases.
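One of the screening statistics mentioned above, the content validity index, has a simple item-level form: the proportion of expert raters who judge an item relevant. A minimal sketch with invented ratings (not the study's data):

```python
# Item-level content validity index (I-CVI): the proportion of expert
# raters who judge an item relevant (e.g., 3 or 4 on a 4-point scale).
def item_cvi(ratings, relevant_threshold=3):
    relevant = sum(1 for r in ratings if r >= relevant_threshold)
    return relevant / len(ratings)

# Ratings below are invented for illustration; a common rule of thumb
# retains items with I-CVI >= 0.78 when there are 6-10 raters.
expert_ratings = [4, 4, 3, 2, 4, 3]
print(round(item_cvi(expert_ratings), 2))  # 0.83
```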
2021
- Can an Algorithm Be My Healthcare Proxy? Duncan McElfresh, Samuel Dooley, Yuan Cui, Kendra Griesman, Weiqin Wang, Tyler Will, Neil Sehgal, and John Dickerson. Explainable AI in Healthcare and Medicine, 2021
Planning for death is not a process in which everyone participates. Yet a lack of planning can severely impact a patient’s well-being, the well-being of her family, and the medical community as a whole. Advance Care Planning (ACP) has been a field in the United States for half a century, often using short surveys or questionnaires to help patients consider future end-of-life (EOL) care decisions. Recent web-based tools promise to increase ACP participation rates; modern techniques from artificial intelligence (AI) could further improve and personalize these tools. We discuss two hypothetical AI-based apps and their potential implications. We hope that this paper will encourage thought about appropriate applications of AI in ACP, and about implementing AI in ways that ensure patient intentions are honored.