Prompt Engineering & Evaluation Flashcards

Hard

Spaced repetition with the SM-2 algorithm: grade each card, and PlayPrepHQ schedules it to resurface right before you would forget it.
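To make the scheduling idea concrete, here is a minimal sketch of one SM-2 review step in its commonly published form. It is not PlayPrepHQ's actual code: the names `CardState` and `sm2_review` are hypothetical, and the sketch assumes a 0-5 grade and that the ease factor is updated on every review, a detail on which implementations differ.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    """Hypothetical per-card scheduling state (names are illustrative)."""
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    ease: float = 2.5         # SM-2 ease factor, never below 1.3


def sm2_review(state: CardState, quality: int) -> CardState:
    """Return the next state after grading a card from 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality >= 3:
        # Successful recall: 1 day, then 6 days, then grow by the ease factor.
        if state.repetitions == 0:
            interval = 1
        elif state.repetitions == 1:
            interval = 6
        else:
            interval = round(state.interval_days * state.ease)
        repetitions = state.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        repetitions = 0
        interval = 1

    # Ease update from the published SM-2 formula, floored at 1.3.
    # Assumption: applied on every review, including failed ones.
    miss = 5 - quality
    ease = max(1.3, state.ease + (0.1 - miss * (0.08 + miss * 0.02)))

    return CardState(repetitions, interval, ease)


# Usage: three perfect reviews, then a lapse.
state = CardState()
for grade in (5, 5, 5, 2):
    state = sm2_review(state, grade)
    print(grade, state)
```

Easy cards drift toward longer gaps because each high grade raises the ease factor, while a low grade both lowers it and resets the interval to one day.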


Terms in this set

Prompt Engineering: The practice of designing prompts to get reliable, high-quality model output.
Zero-shot: Prompting a model to perform a task with no examples provided.
Few-shot: Prompting a model with a few examples that demonstrate the desired output.
Chain-of-Thought: Prompting a model to show step-by-step reasoning before its final answer.
System Prompt: A high-level instruction that sets a model's role, tone, and rules for a conversation.
Prompt Injection: An attack that smuggles instructions into input to override the model's intended behavior.
Jailbreaking: Crafting prompts that bypass a model's safety guardrails to elicit disallowed output.
ROUGE: A recall-oriented family of metrics that scores a generated summary by its n-gram or subsequence overlap with reference summaries.
BLEU: A precision-oriented metric that scores machine translation output by its n-gram overlap with reference translations.
BERTScore: An evaluation metric that compares texts by the similarity of their contextual token embeddings rather than by exact word overlap.
Human Evaluation: Having people rate model output for quality, relevance, or safety.
Agent: An LLM-driven system that plans steps and calls tools or APIs to complete tasks.
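
The difference between the recall-oriented ROUGE card and the precision-oriented BLEU card can be shown with a toy unigram overlap. This is only a sketch of the underlying idea, not either metric: real BLEU combines several n-gram orders with a brevity penalty, and ROUGE has several variants. The function name `unigram_overlap` is hypothetical, and whitespace tokenization is a simplifying assumption.

```python
from collections import Counter


def unigram_overlap(candidate: str, reference: str) -> tuple[float, float]:
    """Return (precision, recall) of clipped unigram overlap.

    Precision asks how much of the candidate appears in the reference
    (the BLEU-style view); recall asks how much of the reference the
    candidate covers (the ROUGE-style view).
    """
    # Assumption: lowercase whitespace tokens stand in for real tokenization.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token, so a word
    # repeated in the candidate is credited at most as often as it
    # occurs in the reference.
    overlap = sum((cand & ref).values())

    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


precision, recall = unigram_overlap("the cat sat", "the cat sat on the mat")
print(precision, recall)  # 1.0 0.5
```

A short candidate can score perfect precision while missing half the reference, which is why summarization work tends to watch recall and translation work pairs precision with a length penalty.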