Short answer
What is model evaluation?
Model evaluation is the process of checking how well an AI system follows instructions, reasons, avoids unsafe claims, and handles domain-specific tasks. Human reviewers often judge individual model outputs against written rubrics.
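As a rough illustration of rubric-based review, the sketch below turns one reviewer's ratings into a single score. Everything in it is an assumption made for the example: the criterion names (taken loosely from the four areas mentioned above), the equal weights, the 1 to 5 rating scale, and the `rubric_score` helper are hypothetical and do not describe any particular evaluation program.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One line of a rubric: what is judged and how much it counts."""
    name: str
    weight: float


# Hypothetical rubric; names and equal weights are assumptions for illustration.
RUBRIC = [
    Criterion("instruction_following", 1.0),
    Criterion("reasoning", 1.0),
    Criterion("safety", 1.0),
    Criterion("domain_accuracy", 1.0),
]


def rubric_score(ratings: dict[str, int], rubric=RUBRIC, max_rating: int = 5) -> float:
    """Return the weighted mean of the ratings, rescaled to the range 0..1.

    Ratings are assumed to be integers from 1 (worst) to max_rating (best).
    """
    total_weight = sum(c.weight for c in rubric)
    score = 0.0
    for c in rubric:
        rating = ratings[c.name]  # KeyError if the reviewer skipped a criterion
        if not 1 <= rating <= max_rating:
            raise ValueError(f"{c.name}: rating {rating} outside 1..{max_rating}")
        # Map 1..max_rating onto 0..1 before applying the weight.
        score += c.weight * (rating - 1) / (max_rating - 1)
    return score / total_weight


# One reviewer's ratings for a single model response (made-up values).
example = {
    "instruction_following": 5,
    "reasoning": 3,
    "safety": 5,
    "domain_accuracy": 3,
}
print(rubric_score(example))
```

In practice a rubric may use different scales, unequal weights, or pass/fail checks for safety; the weighted mean here is only one simple way to combine ratings.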
Context
Evaluation can be general or specialist. In specialist evaluation, a clinician, lawyer, engineer, finance analyst, researcher, editor, or language specialist reviews the kinds of model behavior that fall within their own field.