2025 NIST GenAI (Pilot) Evaluation Plan for Image Generators

George Awad; Hariharan Iyer; Seungmin Seo; Peter Fontana; Yooyoung Lee

Published

March 14, 2025

Abstract

The NIST Generative AI (GenAI) program invites teams from academia, industry, and other research labs to take part in supporting research in generative AI. GenAI is an evaluation series that provides a platform for testing and evaluation to measure the performance of AI content generators (e.g., allies/adversaries) and AI content discriminators (e.g., detectors/defenders). The platform is intended to support multiple modalities and technologies on both sides of the generative spectrum, "generators" and "discriminators."

Generator (G) teams will be tested on their system's ability to generate content that is indistinguishable from human-generated content. For the pilot study, the evaluation will help determine strengths and weaknesses in their approaches, including insights about how and when humans and/or AI can detect AI-generated content. Discriminator (D) teams will be tested on their system's ability to differentiate between AI-generated content and human-generated content. Lessons learned from both generator and discriminator teams should benefit future research directions and approaches to understanding cutting-edge technologies, and should inform recommendations and guidance for the responsible and safe use of digital content.

The 2025 GenAI evaluation pilot study, discussed in this document, focuses on the image modality. In the pilot generator task, the objective of Image Generators (Image-G) is to automatically generate realistic images from a textual description. The textual descriptions are expected to span a diverse set of attributes across many categories representing real-world scenarios. In the pilot discriminator task, the objective of Image Discriminators (Image-D) is to detect whether a target image was generated by a generative AI system. The evaluation assumes completely AI-generated content; cases where humans use AI tools to enhance content without semantic changes (e.g., resizing or adjusting contrast or sharpness) are not included.

Citation

NIST GenAI Program

Pub Weblink

https://ai-challenges.nist.gov/genai

Pub Type

Websites

Keywords

Artificial Intelligence (AI), Generative AI, Measurement, Evaluation, Deepfake, Synthetic Content

AI measurement and evaluation

Citation

Awad, G., Iyer, H., Seo, S., Fontana, P. and Lee, Y. (2025), 2025 NIST GenAI (Pilot) Evaluation Plan for Image Generators, NIST GenAI Program, [online], https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=959544, https://ai-challenges.nist.gov/genai (Accessed April 27, 2025)

Issues

If you have any questions about this publication or are having problems accessing it, please contact reflib@nist.gov.

Created March 14, 2025, Updated March 31, 2025