Human Evaluation
Having people rate model output for quality, relevance, or safety.
Often combined with automatic metrics for a fuller picture.