Structured Outputs
LLMs are good at generating free-form text, but many applications need structured data instead. The AI SDK provides a generate_object function for this: you supply a schema and a prompt, and the LLM generates an object that matches the schema.
The generate_object function chooses the best way to generate the object based on what the provider you are using supports. The options are:
- Explicit JSON support: OpenAI supports passing in a JSON schema that constrains the output of the model.
- Tool calls: Tool calls require arguments in a specific format, which can be repurposed to generate structured outputs.
- Text mode: If neither of the above is supported, the SDK falls back to prompting the LLM to answer in a structured way. This is the least reliable method and can often lead to parsing errors.
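The fallback order in the list above can be pictured as a small selection function. This is only a sketch of the idea, not the SDK's actual implementation: `ProviderCapabilities`, `choose_mode`, and the mode names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProviderCapabilities:
    # Hypothetical capability flags; the real SDK may track these differently.
    supports_json_schema: bool = False
    supports_tool_calls: bool = False


def choose_mode(caps: ProviderCapabilities) -> str:
    """Pick the most reliable generation strategy the provider supports."""
    if caps.supports_json_schema:
        return "json_schema"  # output is constrained by the schema itself
    if caps.supports_tool_calls:
        return "tool_call"  # schema is supplied as the tool's argument format
    return "text"  # prompt-only fallback, the least reliable option


print(choose_mode(ProviderCapabilities(supports_json_schema=True)))
print(choose_mode(ProviderCapabilities(supports_tool_calls=True)))
print(choose_mode(ProviderCapabilities()))
```

The point of the ordering is that each step down trades a stronger guarantee for wider provider support.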
Basic Usage
To generate structured output, you need to:
- Define your output structure using Pydantic models
- Call the generate_object function with your Pydantic model
Here’s a simple example:
from pydantic import BaseModel
from ai_sdk import generate_object
from ai_sdk.openai import openai

class WeatherResponse(BaseModel):
    temperature: float
    condition: str
    humidity: int
    wind_speed: float

response = generate_object(
    model=openai("gpt-4"),
    schema=WeatherResponse,
    prompt="What's the weather like in Paris today?"
)
print(response.object.temperature)  # e.g., 22.5
print(response.object.condition)  # e.g., "sunny"
Understanding the Response
The generate_object function returns an ObjectResult that contains:
class ObjectResult:
    object: BaseModel  # Your structured data
    finish_reason: FinishReason  # Why generation stopped
    usage: Usage  # Token usage statistics
    request: RequestMetadata  # Request information
    response: ResponseMetadata  # Response information
    warnings: List[Warning]  # Any warnings generated
Basic Usage
class MovieReview(BaseModel):
    title: str
    rating: int
    review: str

response = generate_object(
    model=openai("gpt-4"),
    output_model=MovieReview,
    prompt="Review the movie 'Inception'"
)
print(f"Rating: {response.object.rating}/10")
print(response.usage.total_tokens)  # Token usage
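Because every result carries a usage field, token consumption can be tallied across several calls. The sketch below uses stand-in dataclasses instead of the SDK's own classes so that it runs on its own. Only the `usage.total_tokens` attribute comes from the example above; `FakeUsage`, `FakeResult`, and `total_tokens_used` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FakeUsage:
    # Stand-in for the SDK's Usage type; only total_tokens is modeled.
    total_tokens: int


@dataclass
class FakeResult:
    # Stand-in for ObjectResult; only the usage field is modeled.
    usage: FakeUsage


def total_tokens_used(results: Iterable[FakeResult]) -> int:
    """Sum total_tokens over a batch of results."""
    return sum(r.usage.total_tokens for r in results)


results = [FakeResult(FakeUsage(120)), FakeResult(FakeUsage(85))]
print(total_tokens_used(results))  # 205
```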
Advanced Features
Using System Messages
You can provide system messages to guide the structured output generation:
from typing import List

class ProductReview(BaseModel):
    pros: List[str]
    cons: List[str]
    verdict: str

response = generate_object(
    model=openai("gpt-4"),
    schema=ProductReview,
    system="You are a critical product reviewer. Be honest about both positives and negatives.",
    prompt="Review the latest iPhone"
)
Nested Models
The AI SDK supports complex nested models with full validation:
from typing import Optional, Dict

class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: str

class Contact(BaseModel):
    email: str
    phone: Optional[str]
    address: Address
    metadata: Dict[str, str] = {}

response = generate_object(
    model=openai("gpt-4"),
    schema=Contact,
    prompt="Generate contact info for John Doe"
)
The LLM will automatically format its response to match your model’s structure, including all required fields and proper types.
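The validation of nested fields is done by Pydantic, so it can be tried without calling the SDK at all. The sketch below feeds two hand-written payloads (made-up data, not real model output) into the same Contact model to show what passes and what fails. It assumes Pydantic is installed.

```python
from typing import Dict, Optional

from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: str


class Contact(BaseModel):
    email: str
    phone: Optional[str]
    address: Address
    metadata: Dict[str, str] = {}


# A complete payload: the nested dict is converted into an Address instance.
good = {
    "email": "john.doe@example.com",
    "phone": None,
    "address": {
        "street": "1 Main St",
        "city": "Springfield",
        "country": "US",
        "postal_code": "12345",
    },
}
contact = Contact(**good)
print(contact.address.city)

# An incomplete payload: required address fields are missing.
bad = {
    "email": "john.doe@example.com",
    "phone": None,
    "address": {"city": "Springfield"},
}
try:
    Contact(**bad)
    failed = False
except ValidationError as exc:
    failed = True
    print(f"Validation failed with {len(exc.errors())} error(s)")
```

Each missing nested field is reported separately, which is what makes validation messages useful when a model omits part of the structure.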
Error Handling
When using structured outputs, you might encounter validation errors when the LLM generates data that doesn’t match your model’s requirements. The SDK provides the AI_ObjectValidationError class for handling these cases:
# AI_APICallError is assumed to be exported from the same module
from ai_sdk.core.errors import AI_APICallError, AI_ObjectValidationError

try:
    response = generate_object(
        model=openai("gpt-4"),
        output_model=YourModel,
        prompt="Your prompt"
    )
except AI_ObjectValidationError as e:
    print("Validation error:", e.message)
except AI_APICallError as e:
    print(f"API error (status {e.status_code}):", str(e))
    if e.is_retryable:
        print("This error can be retried")
The AI_ObjectValidationError provides a descriptive message about which fields failed validation and why.
Always implement error handling when using structured outputs in production. The LLM might occasionally generate invalid data that needs to be handled gracefully.
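One way to handle invalid output gracefully is to ask again a limited number of times before giving up. The sketch below shows that pattern with a stand-in exception so it runs without the SDK. `FakeValidationError`, `generate_with_retries`, and `fake_generate` are hypothetical; in real code the except clause would catch AI_ObjectValidationError and the callable would wrap generate_object.

```python
class FakeValidationError(Exception):
    """Stand-in for AI_ObjectValidationError so the sketch runs on its own."""


def generate_with_retries(generate, max_attempts=3):
    """Call generate() until it returns an object or attempts run out."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return generate()
        except FakeValidationError as exc:
            last_error = exc
            print(f"Attempt {attempt} failed validation: {exc}")
    raise RuntimeError(
        f"No valid object after {max_attempts} attempts"
    ) from last_error


# Simulated generator: fails validation twice, then succeeds.
outcomes = iter([
    FakeValidationError("rating missing"),
    FakeValidationError("rating is not an integer"),
    {"rating": 9},
])


def fake_generate():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = generate_with_retries(fake_generate)
print(result)
```

Capping the number of attempts matters because each retry is a new, billed request.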
You might also encounter API-level errors when making requests. The AI_APICallError provides detailed information about the failed request:
- Status code
- Whether the error is retryable
- Response body
- Request details
try:
    response = generate_object(
        model=openai("gpt-4"),
        output_model=YourModel,
        prompt="Your prompt"
    )
except AI_APICallError as e:
    print("API call failed:")
    print(f"  URL: {e.url}")
    print(f"  Status: {e.status_code}")
    print(f"  Response: {e.response_body}")
    print(f"  Retryable: {e.is_retryable}")
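The is_retryable flag lends itself to a retry loop with exponential backoff. The following is a minimal sketch under stated assumptions: `FakeAPICallError` is a stand-in that mirrors only the status_code and is_retryable attributes shown above, and `call_with_backoff` and `flaky_call` are hypothetical helpers, not part of the SDK.

```python
import time


class FakeAPICallError(Exception):
    """Stand-in for AI_APICallError; models only the attributes used here."""

    def __init__(self, message, status_code, is_retryable):
        super().__init__(message)
        self.status_code = status_code
        self.is_retryable = is_retryable


def call_with_backoff(call, max_attempts=4, base_delay=0.01, sleep=time.sleep):
    """Retry call() on retryable errors, doubling the delay each time."""
    for attempt in range(max_attempts):
        try:
            return call()
        except FakeAPICallError as exc:
            is_last_attempt = attempt == max_attempts - 1
            if not exc.is_retryable or is_last_attempt:
                raise  # non-retryable, or out of attempts
            sleep(base_delay * 2 ** attempt)


# Simulated request: rate limited twice, then successful.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeAPICallError("rate limited", status_code=429, is_retryable=True)
    return "ok"


print(call_with_backoff(flaky_call))
```

The delays here are kept tiny so the example finishes quickly; a production value for base_delay would typically be on the order of a second.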