The generate() function performs a non-streaming chat completion: it awaits the language model's full response and returns it as a single result object.

Basic Usage

Generate a simple chat completion:
import { generate } from '@core-ai/core-ai';
import { createOpenAI } from '@core-ai/openai';

const openai = createOpenAI({ apiKey: process.env.OPENAI_API_KEY });
const model = openai.chatModel('gpt-5-mini');

const result = await generate({
  model,
  messages: [
    {
      role: 'system',
      content: 'You are a concise technical assistant.',
    },
    {
      role: 'user',
      content: 'Explain what an embedding vector is in one short paragraph.',
    },
  ],
});

console.log('Response:', result.content);
console.log('Usage:', result.usage);

Using Different Providers

Core AI exposes the same generate() API across providers; switching providers only means swapping in a different model factory. With OpenAI, for example:
import { generate } from '@core-ai/core-ai';
import { createOpenAI } from '@core-ai/openai';

const openai = createOpenAI({ 
  apiKey: process.env.OPENAI_API_KEY 
});
const model = openai.chatModel('gpt-5-mini');

const result = await generate({
  model,
  messages: [{ role: 'user', content: 'Hello!' }],
});

Configuration Options

Customize model behavior with configuration parameters:
const result = await generate({
  model,
  messages: [{ role: 'user', content: 'Tell me a story.' }],
  config: {
    temperature: 0.7,        // Control randomness (0-2)
    maxTokens: 500,          // Limit response length
    topP: 0.9,               // Nucleus sampling
    stopSequences: ['\n\n'], // Stop generation at these sequences
    frequencyPenalty: 0.5,   // Reduce repetition
    presencePenalty: 0.3,    // Encourage topic diversity
  },
});

Response Structure

The generate() function returns a GenerateResult object:
type GenerateResult = {
  parts: AssistantContentPart[]; // Raw response parts
  content: string | null;        // Text content (null if only tool calls)
  reasoning: string | null;      // Reasoning content (if available)
  toolCalls: ToolCall[];         // Tool calls made by the model
  finishReason: FinishReason;    // Why generation stopped
  usage: ChatUsage;              // Token usage information
};
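As an illustration, a small helper can branch on these fields to decide what to do with a result. The types below are local stand-ins sketched from the shape above, not the library's actual exports:

```typescript
// Local stand-ins for the documented result shape, for illustration only.
type ToolCall = { id: string; name: string; arguments: string };
type FinishReason = 'stop' | 'length' | 'tool-calls' | 'content-filter';
type ChatUsage = { inputTokens: number; outputTokens: number };

type GenerateResultLike = {
  content: string | null;
  toolCalls: ToolCall[];
  finishReason: FinishReason;
  usage: ChatUsage;
};

// Summarize a result: tool calls take priority, then text, then neither.
function describeResult(result: GenerateResultLike): string {
  if (result.toolCalls.length > 0) {
    return `model requested ${result.toolCalls.length} tool call(s)`;
  }
  if (result.content !== null) {
    return `model replied with ${result.content.length} characters`;
  }
  return `no content (finishReason: ${result.finishReason})`;
}
```

The key point is that content and toolCalls are not both guaranteed: check toolCalls before assuming a text reply.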

Understanding Token Usage

const result = await generate({ model, messages });

console.log('Input tokens:', result.usage.inputTokens);
console.log('Output tokens:', result.usage.outputTokens);

// Token details breakdown
console.log('Cache read:', result.usage.inputTokenDetails.cacheReadTokens);
console.log('Cache write:', result.usage.inputTokenDetails.cacheWriteTokens);

if (result.usage.outputTokenDetails.reasoningTokens) {
  console.log('Reasoning tokens:', result.usage.outputTokenDetails.reasoningTokens);
}
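One common use of the usage object is cost estimation. The sketch below assumes hypothetical per-million-token rates; substitute your provider's actual pricing:

```typescript
// Hypothetical pricing structure; real rates vary by provider and model.
type Rates = { inputPerMillion: number; outputPerMillion: number };
type UsageLike = { inputTokens: number; outputTokens: number };

// Estimate the dollar cost of one call from its token usage.
function estimateCost(usage: UsageLike, rates: Rates): number {
  const inputCost = (usage.inputTokens / 1_000_000) * rates.inputPerMillion;
  const outputCost = (usage.outputTokens / 1_000_000) * rates.outputPerMillion;
  return inputCost + outputCost;
}
```

For example, 1M input tokens at $1/M plus 500K output tokens at $2/M comes to $2.00.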

Multi-Turn Conversations

Build conversations by including previous messages:
const messages = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is TypeScript?' },
];

const firstResponse = await generate({ model, messages });

// Add the assistant response and a follow-up question.
// content is null when the model only returned tool calls, so guard first.
if (firstResponse.content !== null) {
  messages.push({
    role: 'assistant',
    content: firstResponse.content,
  });
}

messages.push({
  role: 'user',
  content: 'How is it different from JavaScript?',
});

const secondResponse = await generate({ model, messages });
console.log(secondResponse.content);
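Because content can be null, appending an assistant turn is worth wrapping in a helper. The message type below is a simplified local stand-in for illustration:

```typescript
// Simplified message shape for illustration; the library's own type
// may carry richer content parts.
type Message = { role: 'system' | 'user' | 'assistant'; content: string };
type ResultLike = { content: string | null };

// Return a new message list with the assistant's reply appended,
// failing loudly on a tool-call-only turn instead of pushing null.
function appendAssistantTurn(messages: Message[], result: ResultLike): Message[] {
  if (result.content === null) {
    throw new Error('Cannot append a text turn: result has no text content');
  }
  return [...messages, { role: 'assistant', content: result.content }];
}
```

Returning a new array (rather than mutating in place) also makes it easier to keep per-turn history snapshots.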

Error Handling

Handle errors gracefully:
import { LLMError, ProviderError } from '@core-ai/core-ai';

try {
  const result = await generate({ model, messages });
  console.log(result.content);
} catch (error) {
  if (error instanceof ProviderError) {
    console.error('Provider error:', error.message);
    console.error('Status code:', error.statusCode);
  } else if (error instanceof LLMError) {
    console.error('LLM error:', error.message);
  } else {
    console.error('Unknown error:', error);
  }
}
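Transient provider errors (rate limits, timeouts) are often worth retrying. A generic retry wrapper with exponential backoff might look like the sketch below; which errors are actually retryable is an assumption you should adjust per provider:

```typescript
// Retry an async operation with exponential backoff.
// Delays grow as baseDelayMs, 2x, 4x, ... between attempts.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

Usage: `const result = await withRetry(() => generate({ model, messages }));` — in practice you would inspect the caught error (e.g. a ProviderError status code) and only retry when it looks transient.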

Best Practices

System messages set the assistant’s behavior and context:
const result = await generate({
  model,
  messages: [
    {
      role: 'system',
      content: 'You are a concise technical writer. Always use examples.',
    },
    { role: 'user', content: 'Explain async/await' },
  ],
});
Control costs and response length with maxTokens:
const result = await generate({
  model,
  messages,
  config: {
    maxTokens: 200, // Limit response to 200 tokens
  },
});

if (result.finishReason === 'length') {
  console.warn('Response was truncated');
}
Check why generation stopped:
const result = await generate({ model, messages });

switch (result.finishReason) {
  case 'stop':
    // Normal completion
    break;
  case 'length':
    // Hit token limit
    console.warn('Response truncated');
    break;
  case 'tool-calls':
    // Model wants to use tools
    break;
  case 'content-filter':
    // Content filtered by provider
    console.warn('Content filtered');
    break;
}

Next Steps