Overview
Generation functions accept configuration options as flat top-level parameters on the options object. The core options temperature, maxTokens, and topP are available on every generation call. Additional parameters like stopSequences, frequencyPenalty, and presencePenalty are provider-specific and passed via providerOptions.
Core options
These options are available on BaseGenerateOptions and apply to generate(), stream(), generateObject(), and streamObject():
type BaseGenerateOptions = {
messages: Message[];
temperature?: number;
maxTokens?: number;
topP?: number;
reasoning?: ReasoningConfig;
providerOptions?: GenerateProviderOptions;
signal?: AbortSignal;
};
temperature
Controls randomness in the output. Higher values make output more creative and random; lower values make it more focused and deterministic.
Type: number
Range: 0.0 to 2.0 (provider-dependent)
Default: Usually 1.0
import { generate } from '@core-ai/core-ai';
const creative = await generate({
model,
messages: [{ role: 'user', content: 'Write a fantasy story opening' }],
temperature: 1.5,
});
const factual = await generate({
model,
messages: [{ role: 'user', content: 'What is the capital of Germany?' }],
temperature: 0.2,
});
Use low temperature (0.0-0.3) for factual tasks, code generation, and consistency. Use high temperature (1.0-2.0) for creative writing, brainstorming, and varied outputs.
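Mechanically, temperature rescales the model's logits before sampling. A small illustrative sketch (the actual sampling happens inside the provider; this is not the library's code):

```typescript
// Illustrative only: temperature divides logits before softmax.
// Low temperature sharpens the distribution (top token dominates);
// high temperature flattens it (probabilities even out).
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const logits = [2.0, 1.0, 0.1];
const sharp = softmaxWithTemperature(logits, 0.2); // top token dominates
const flat = softmaxWithTemperature(logits, 1.5);  // distribution evens out
```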
maxTokens
Maximum number of tokens to generate in the response.
Type: number
Range: Varies by model and provider
const result = await generate({
model,
messages: [{ role: 'user', content: 'Explain quantum physics' }],
maxTokens: 150,
});
Some providers (like Anthropic) require maxTokens to be set. The provider wrapper may set a default value if not specified.
Token Estimation:
- 1 token ≈ 0.75 words (English)
- 100 tokens ≈ 75 words
- 1000 tokens ≈ 750 words
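These ratios can be wrapped in a rough planning helper. This is a rule-of-thumb sketch for English text only; real counts come from the model's tokenizer and can differ substantially:

```typescript
// Rough rule of thumb for English: 1 token ≈ 0.75 words.
// Ballpark planning only; actual counts depend on the tokenizer.
function estimateTokens(text: string): number {
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return Math.ceil(words / 0.75);
}

estimateTokens('The quick brown fox jumps over the lazy dog'); // 9 words → ~12 tokens
```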
topP
Nucleus sampling: samples only from the smallest set of tokens whose cumulative probability reaches this threshold.
Type: number
Range: 0.0 to 1.0
Default: Usually 1.0
const result = await generate({
model,
messages: [{ role: 'user', content: 'Generate product names' }],
temperature: 1.0,
topP: 0.9,
});
Don’t use both high temperature and low topP together. They serve similar purposes and can conflict. Choose one approach.
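What topP does can be sketched as a filtering step (illustrative only; the provider performs the actual sampling):

```typescript
// Illustrative only: nucleus (top-p) sampling keeps the smallest set of
// tokens whose cumulative probability reaches the threshold, then samples
// from that set.
function nucleus(probs: Map<string, number>, topP: number): string[] {
  const sorted = [...probs.entries()].sort((a, b) => b[1] - a[1]);
  const kept: string[] = [];
  let cumulative = 0;
  for (const [token, p] of sorted) {
    kept.push(token);
    cumulative += p;
    if (cumulative >= topP) break; // threshold reached: drop the long tail
  }
  return kept;
}

const probs = new Map([['the', 0.5], ['a', 0.3], ['an', 0.15], ['xyzzy', 0.05]]);
nucleus(probs, 0.9); // keeps ['the', 'a', 'an']; the 5% tail is dropped
```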
Provider-specific options
Options like stopSequences, frequencyPenalty, and presencePenalty are not part of the core options. They are passed through providerOptions, namespaced by provider:
type GenerateProviderOptions = {
[provider: string]: Record<string, unknown> | undefined;
};
Each provider defines and validates its own set of options. See the provider pages for the full schema:
- OpenAI — Responses API (createOpenAI): store, serviceTier, include, parallelToolCalls, user. Chat Completions API (createOpenAICompat): store, serviceTier, parallelToolCalls, user, stopSequences, frequencyPenalty, presencePenalty, seed
- Anthropic — topK, stopSequences, betas, outputConfig, cacheControl
- Google GenAI — stopSequences, frequencyPenalty, presencePenalty, seed, topK
- Mistral — stopSequences, frequencyPenalty, presencePenalty, randomSeed, parallelToolCalls, promptMode, safePrompt
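Because namespaces are independent, one options object can carry settings for several providers at once. This sketch assumes each provider reads only its own key and ignores the others, which the schema above suggests but does not explicitly state:

```typescript
// Assumption: a provider only reads its own namespace, so the same
// providerOptions object can be reused across Google and Mistral models.
// Option names are taken from the provider lists above.
const providerOptions = {
  google: { stopSequences: ['END'], topK: 40 },
  mistral: { stopSequences: ['END'], randomSeed: 42 },
};
```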
stopSequences
Array of sequences that stop generation when encountered. Passed via providerOptions:
const result = await generate({
model,
messages: [{ role: 'user', content: 'Count from 1 to 100' }],
providerOptions: {
google: { stopSequences: ['10'] },
},
});
console.log(result.content);
// "1, 2, 3, 4, 5, 6, 7, 8, 9, 10"
stopSequences support varies by provider. For OpenAI, it’s only available with createOpenAICompat (Chat Completions API). The default createOpenAI (Responses API) does not support it.
frequencyPenalty
Reduces likelihood of repeating tokens based on how often they’ve appeared.
Range: -2.0 to 2.0 (provider-dependent)
const result = await generate({
model,
messages: [{ role: 'user', content: 'List creative product features' }],
providerOptions: {
google: { frequencyPenalty: 0.7 },
},
});
Use a frequencyPenalty between 0.5 and 1.0 for creative writing or lists where you want diverse output without repetitive phrases.
presencePenalty
Reduces likelihood of tokens that have already appeared at least once.
Range: -2.0 to 2.0 (provider-dependent)
const result = await generate({
model,
messages: [{ role: 'user', content: 'Suggest unique vacation destinations' }],
providerOptions: {
google: { presencePenalty: 1.0 },
},
});
Difference from frequencyPenalty:
- presencePenalty: binary — penalizes any token that has appeared at least once
- frequencyPenalty: proportional — penalizes based on how many times a token has appeared
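The distinction can be sketched as a toy logit adjustment (illustrative only; providers apply these penalties internally during sampling):

```typescript
// Illustrative only: presence penalty subtracts a flat amount once a token
// has appeared at all; frequency penalty subtracts proportionally to how
// many times it has appeared.
function penalizedLogit(
  logit: number,
  count: number, // how many times the token has already appeared
  presencePenalty: number,
  frequencyPenalty: number,
): number {
  const presence = count > 0 ? presencePenalty : 0; // binary
  const frequency = count * frequencyPenalty;       // proportional
  return logit - presence - frequency;
}

penalizedLogit(2.0, 1, 1.0, 0.5); // 2.0 - 1.0 - 0.5 = 0.5
penalizedLogit(2.0, 3, 1.0, 0.5); // 2.0 - 1.0 - 1.5 = -0.5 (frequency term keeps growing)
```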
Complete configuration example
import { generate } from '@core-ai/core-ai';
import { createOpenAICompat } from '@core-ai/openai/compat';
const openai = createOpenAICompat();
const model = openai.chatModel('gpt-5-mini');
const result = await generate({
model,
messages: [
{ role: 'system', content: 'You are a creative writing assistant.' },
{ role: 'user', content: 'Write an engaging story opening' },
],
temperature: 1.2,
maxTokens: 500,
topP: 0.95,
providerOptions: {
openai: {
frequencyPenalty: 0.6,
presencePenalty: 0.3,
stopSequences: ['---'],
},
},
});
console.log(result.content);
console.log('Tokens used:', result.usage.outputTokens);
Reasoning configuration
For models that support extended thinking:
type ReasoningConfig = {
effort: ReasoningEffort;
};
type ReasoningEffort =
| 'minimal'
| 'low'
| 'medium'
| 'high'
| 'max';
Usage:
const result = await generate({
model: anthropic.chatModel('claude-sonnet-4-6'),
messages: [
{ role: 'user', content: 'Solve this complex logic puzzle...' },
],
reasoning: {
effort: 'high',
},
});
if (result.reasoning) {
console.log('Reasoning:', result.reasoning);
}
console.log('Answer:', result.content);
Reasoning configuration is provider-dependent. Check if your model supports extended thinking before using this option.
Providers interpret reasoning differently. Anthropic and OpenAI enforce model-specific restrictions, Google maps effort to thinking level or budget, and Mistral accepts the option but does not send effort to the API.
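As a purely illustrative sketch of the kind of translation a Google adapter might perform, an effort level could be mapped to a thinking budget. The budget numbers below are invented for illustration and are not the wrapper's real values:

```typescript
type ReasoningEffort = 'minimal' | 'low' | 'medium' | 'high' | 'max';

// Hypothetical mapping: the real adapter's budgets may differ.
function effortToThinkingBudget(effort: ReasoningEffort): number {
  const budgets: Record<ReasoningEffort, number> = {
    minimal: 0,
    low: 1024,
    medium: 4096,
    high: 16384,
    max: 32768,
  };
  return budgets[effort];
}
```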
Configuration best practices
For different tasks
Code Generation:
const result = await generate({
model,
messages,
temperature: 0.2,
maxTokens: 2000,
providerOptions: {
google: { stopSequences: ['```\n\n'] },
},
});
Creative Writing:
const result = await generate({
model,
messages,
temperature: 1.3,
providerOptions: {
google: { frequencyPenalty: 0.7, presencePenalty: 0.4 },
},
});
Question Answering:
const result = await generate({
model,
messages,
temperature: 0.3,
maxTokens: 300,
});
Brainstorming/Ideas:
const result = await generate({
model,
messages,
temperature: 1.5,
providerOptions: {
google: { presencePenalty: 1.0 },
},
});
Testing configurations
Start with default settings and adjust one parameter at a time. Temperature is usually the most impactful setting to tune first.
const baseline = await generate({ model, messages });
const temps = [0.3, 0.7, 1.0, 1.5];
for (const temp of temps) {
const result = await generate({
model,
messages,
temperature: temp,
});
console.log(`Temperature ${temp}:`, result.content);
}
Usage tracking
All generation results include token usage information:
type ChatUsage = {
inputTokens: number;
outputTokens: number;
inputTokenDetails: ChatInputTokenDetails;
outputTokenDetails: ChatOutputTokenDetails;
};
type ChatInputTokenDetails = {
cacheReadTokens: number;
cacheWriteTokens: number;
};
type ChatOutputTokenDetails = {
reasoningTokens?: number;
};
Example:
const result = await generate({ model, messages });
console.log('Input tokens:', result.usage.inputTokens);
console.log('Output tokens:', result.usage.outputTokens);
console.log('Cache read:', result.usage.inputTokenDetails.cacheReadTokens);
if (result.usage.outputTokenDetails.reasoningTokens) {
console.log('Reasoning tokens:', result.usage.outputTokenDetails.reasoningTokens);
}
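These fields compose naturally into a small accounting helper (a sketch; the ChatUsage shape is copied from the type above):

```typescript
type ChatUsage = {
  inputTokens: number;
  outputTokens: number;
  inputTokenDetails: { cacheReadTokens: number; cacheWriteTokens: number };
  outputTokenDetails: { reasoningTokens?: number };
};

// Total tokens across a batch of calls.
function totalTokens(usages: ChatUsage[]): number {
  return usages.reduce((sum, u) => sum + u.inputTokens + u.outputTokens, 0);
}

// Reasoning tokens are optional, so default missing values to zero.
function totalReasoningTokens(usages: ChatUsage[]): number {
  return usages.reduce((sum, u) => sum + (u.outputTokenDetails.reasoningTokens ?? 0), 0);
}
```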
Abort signal
Cancel long-running requests with AbortSignal:
import { CoreAIError } from '@core-ai/core-ai';
const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);
try {
const result = await generate({
model,
messages: [{ role: 'user', content: 'Write a long essay...' }],
signal: controller.signal,
});
} catch (error) {
if (error instanceof CoreAIError) {
console.log('Request was cancelled:', error.message);
}
}
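The timeout pattern above can be packaged into a small helper. Note that modern runtimes already provide AbortSignal.timeout(ms) with the same behavior:

```typescript
// Creates a signal that aborts after `ms` milliseconds.
// Equivalent to AbortSignal.timeout(ms) in modern runtimes.
function timeoutSignal(ms: number): AbortSignal {
  const controller = new AbortController();
  setTimeout(() => controller.abort(), ms);
  return controller.signal;
}

const signal = timeoutSignal(5000); // pass as { signal } to generate()
```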
Next Steps