Inference & Customization Flashcards

Medium

Spaced repetition with the SM-2 algorithm — grade each card and PlayPrepHQ schedules it to resurface right before you'd forget it. Progress saves in this browser.

Terms in this set

Temperature: An inference parameter that controls randomness. Higher values make output more varied; lower values make it more predictable.
Top-K: A sampling setting that limits token choice to the K most likely options.
Top-P: Nucleus sampling, which chooses from the smallest set of most likely tokens whose cumulative probability reaches P.
Max Tokens: An inference parameter that caps the length of the model's response, measured in tokens.
RAG: Retrieval-Augmented Generation, which grounds model answers in relevant retrieved documents.
Vector Database: A store of embeddings that supports fast similarity search for retrieval.
Knowledge Base: Amazon Bedrock Knowledge Bases, a managed RAG capability that connects FMs to your data sources.
OpenSearch: Amazon OpenSearch Service, a search and analytics engine usable as a vector store for RAG.
pgvector: A PostgreSQL extension, available on Amazon Aurora PostgreSQL, that stores and searches embedding vectors.
Fine-tuning: Customizing a foundation model by further training it on labeled, task-specific data.
In-Context Learning: Guiding a model's behavior by including instructions or examples directly in the prompt.
Continued Pre-training: Further training an FM on large amounts of unlabeled domain data to build domain knowledge.
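
To make the Temperature, Top-K, and Top-P cards concrete, here is a minimal sketch of how the three settings might combine when picking one next token. It is an illustration only, not how any particular model or service implements sampling. The function names, the toy logits, and the order of the filters (Top-K first, then Top-P) are all assumptions made for this example.

```python
import math
import random


def candidate_probs(logits, temperature=1.0, top_k=None, top_p=None):
    """Return the (token, probability) pairs that remain after filtering.

    `logits` is a hypothetical {token: score} dict standing in for a
    model's raw output; real models produce scores over a full vocabulary.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if top_k is not None and top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Temperature: divide scores before softmax. Higher values flatten
    # the distribution; lower values sharpen it.
    peak = max(score / temperature for score in logits.values())
    weights = {tok: math.exp(score / temperature - peak)
               for tok, score in logits.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda pair: pair[1], reverse=True)

    # Top-K: keep only the K most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top-P: keep the smallest prefix whose cumulative probability
    # reaches P (assumed here to be measured before renormalizing).
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, p in ranked:
            kept.append((tok, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(p for _, p in ranked)
    return [(tok, p / kept_total) for tok, p in ranked]


def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Draw one token at random from the filtered distribution."""
    candidates = candidate_probs(logits, temperature, top_k, top_p)
    tokens = [tok for tok, _ in candidates]
    probs = [p for _, p in candidates]
    return rng.choices(tokens, weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = {"cat": 2.0, "dog": 1.0, "fish": 0.0}  # made-up scores
    print(candidate_probs(toy_logits, temperature=1.0, top_k=2))
    print(sample_token(toy_logits, temperature=0.7, top_p=0.8))
```

Max Tokens would sit outside this function: a generation loop would call something like `sample_token` repeatedly and stop once the response reaches the token cap.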