10/02/25 - 1:00 PM - 2:00 PM EDT
Location
Virtual
As AI/ML datasets grow in scale and complexity, there is an urgent need for standardized metadata to describe them. In this talk, Elena Simperl, Director of Research at the Open Data Institute, will introduce Croissant, an open metadata vocabulary designed to make AI/ML datasets more discoverable, understandable, and reusable. Croissant provides a flexible framework for describing crucial information about a dataset's provenance, structure, licensing, and intended use. We will walk through Croissant's design and attributes and highlight how it can improve AI/ML dataset documentation, compliance, and reproducibility in real-world applications.