Max Tokens

An inference parameter that caps the number of tokens the model may generate in its response.

It limits output length only, not the input context size. Generation stops once the cap is reached, so a response may be cut off mid-sentence.
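As a rough illustration, the sketch below mimics a generation loop that honors such a cap. It is a toy under stated assumptions, not any provider's API: the `generate` function, the `model_stream` of pre-made tokens, the `<eos>` end marker, and the `"length"` / `"stop"` finish labels are all hypothetical names chosen for this example.

```python
from typing import Iterable, List, Tuple


def generate(prompt_tokens: List[str], model_stream: Iterable[str],
             max_tokens: int, end_token: str = "<eos>") -> Tuple[List[str], str]:
    """Collect generated tokens until the model ends or the cap is hit.

    prompt_tokens is accepted only to show that the input is NOT counted
    against max_tokens; the cap applies to generated tokens alone.
    """
    output: List[str] = []
    for token in model_stream:
        if token == end_token:
            return output, "stop"      # the model finished on its own
        if len(output) >= max_tokens:
            return output, "length"    # the cap cut the response short
        output.append(token)
    return output, "stop"              # stream ran out without an end marker


# Hypothetical input and a stand-in for tokens the model would produce.
prompt = "Summarize the history of the Roman Empire".split()
stream = ["Rome", "began", "as", "a", "small", "city", "<eos>"]

capped, capped_reason = generate(prompt, stream, max_tokens=3)
full, full_reason = generate(prompt, stream, max_tokens=10)

print(capped, capped_reason)  # ['Rome', 'began', 'as'] length
print(full, full_reason)
```

The prompt here is seven tokens long, yet the capped call still returns three generated tokens, which shows that the limit is counted against the output alone. A `"length"` result is the signal that the response was truncated rather than finished.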
