Top-P
Nucleus sampling: choosing the next token from the smallest set of most-probable tokens whose cumulative probability reaches at least P.
A Top-P of 0.9 keeps the most probable tokens covering 90% of the mass.
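
To make the definition concrete, here is a minimal sketch in plain Python. The function names `top_p_filter` and `sample_top_p` and the toy distribution are illustrative assumptions, not part of any particular library. It also assumes the kept probabilities are renormalized before sampling, which is the usual practice.

```python
import random


def top_p_filter(probs, p):
    """Return the nucleus for a {token: probability} mapping.

    Hypothetical helper: keeps the most probable tokens until their
    cumulative probability reaches at least p, then renormalizes.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")

    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # smallest prefix that covers at least p of the mass

    # Renormalize so the kept probabilities sum to 1 (assumed convention).
    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}


def sample_top_p(probs, p, rng=random):
    """Draw one token from the renormalized nucleus."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution, invented for illustration only.
toy = {"the": 0.5, "a": 0.3, "cat": 0.15, "dog": 0.05}
kept = top_p_filter(toy, 0.9)
```

With this toy distribution and a Top-P of 0.9, the first two tokens cover only 0.8 of the mass, so the third is added and the fourth, least likely token is excluded from sampling.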