
LabelVizier: Interactive Validation and Relabeling for Technical Text Annotations

Author(s)

Xiaoyu Zhang, Xiwei Xuan, Rachael Sexton, Alden A. Dima

Abstract

With the rapid accumulation of text data driven by advances in data-driven techniques, the task of extracting "data annotations" (concise, high-quality summaries of unstructured raw text) has become increasingly important. Researchers in the Technical Language Processing (TLP) and Machine Learning (ML) domains have developed weak supervision techniques to efficiently create annotations (labels) for large-scale unlabeled data. However, weak supervision typically trades annotation quality against speed. Annotations generated by state-of-the-art weak supervision techniques may still fail in practice because of conflicts between user requirements, application scenarios, and modeling goals. There is a pressing need for efficient validation and relabeling of the output of weak supervision techniques that incorporates human knowledge and domain-specific requirements. Inspired by the practice of debugging in software engineering, we address this problem by presenting LabelVizier, a human-in-the-loop workflow that provides actionable insights into annotation flaws in large-scale multi-label datasets. We present our workflow as an interactive notebook with editable code cells for flexible data processing and a seamlessly integrated visual interface, which facilitates annotation validation for multiple error types and relabeling suggestions at different data scales. We evaluated the efficiency and generalizability of LabelVizier for improving the quality of technical text annotations with two use cases and five expert reviews. Our findings indicate that our workflow can be smoothly adapted to various application scenarios and is appreciated by domain experts with different levels of computer science background as a practical tool for improving annotation quality.
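The abstract describes the validate-then-relabel loop only at a high level, and the paper's actual code is not reproduced here. As a rough illustration of that loop, the following is a minimal Python sketch in the notebook style the abstract mentions. All names in it (Record, validate, apply_relabels, and the two example error rules) are hypothetical illustrations, not LabelVizier's API.

# Minimal sketch of a human-in-the-loop validation/relabeling pass, in the
# spirit of the workflow described above. Names and rules are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Record:
    text: str
    labels: set[str] = field(default_factory=set)

# Example weakly-supervised, multi-label annotations over technical text.
records = [
    Record("replaced hydraulic pump seal", {"hydraulic", "replace"}),
    Record("coolant leak near spindle", set()),                   # no labels assigned
    Record("inspected belt; no action", {"replace", "inspect"}),  # contradictory labels
]

def validate(recs):
    """Flag records by error type (validation for multiple error types)."""
    flags = {}
    for i, r in enumerate(recs):
        errors = []
        if not r.labels:
            errors.append("missing")
        if {"replace", "inspect"} <= r.labels:
            # Hypothetical domain rule: these labels are mutually exclusive.
            errors.append("conflict")
        if errors:
            flags[i] = errors
    return flags

def apply_relabels(recs, corrections):
    """Apply expert-supplied corrections (the human-in-the-loop step)."""
    for i, new_labels in corrections.items():
        recs[i].labels = set(new_labels)

flags = validate(records)
print(flags)  # {1: ['missing'], 2: ['conflict']}

# A domain expert reviews the flagged records and supplies relabels.
apply_relabels(records, {1: {"coolant", "leak"}, 2: {"inspect"}})
print(validate(records))  # {} -- no remaining flags

In LabelVizier itself, the flagging and correction steps are mediated by a visual interface rather than hand-written rules, but the data flow sketched here is the same: flag annotation errors at scale, then fold expert corrections back into the dataset.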
Conference Dates
April 18-21, 2023
Conference Location
Seoul, KR
Conference Title
2023 IEEE 16th Pacific Visualization Symposium (PacificVis)

Keywords

Workflow Design, Technical Language Processing, Model Interpretation, Weak Supervision

Citation

Zhang, X., Xuan, X., Sexton, R. and Dima, A. (2023), LabelVizier: Interactive Validation and Relabeling for Technical Text Annotations, 2023 IEEE 16th Pacific Visualization Symposium (PacificVis), Seoul, KR, [online], https://doi.org/10.1109/PacificVis56936.2023.00026, https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934878 (Accessed December 21, 2024)

Issues

If you have any questions about this publication or are having problems accessing it, please contact reflib@nist.gov.

Created March 30, 2023, Updated August 21, 2023