Structured Outputs

While LLMs are great at generating text, it can be useful to get structured outputs instead. The AI SDK provides a generate_object function that allows you to do just that, you provide a schema and a prompt and the LLM will generate an object that matches the schema.

The generate_object function will choose the best way to generate the object based on what is supported by the provider you are using. This includes:

Explicit JSON support: OpenAI supports passing in a JSON schema that constrains the output of the model.
Tool calls: Tool calls rely on specific argument formats, this can be abused to generate structured outputs.
Text mode: If none of the above are supported, we will default back to prompting the LLM to answer the prompts in a structured way. This is the least reliable method and can often lead to parsing errors.

Basic Usage

To generate structured output, you need to:

Define your output structure using Pydantic models
Use the generate_object function with your model

Here’s a simple example:


from pydantic import BaseModel
from ai_sdk import generate_object
from ai_sdk.openai import openai
 
class WeatherResponse(BaseModel):
    temperature: float
    condition: str
    humidity: int
    wind_speed: float
 
response = generate_object(
    model=openai("gpt-4"),
    schema=WeatherResponse,
    prompt="What's the weather like in Paris today?"
)
 
print(response.object.temperature)  # e.g., 22.5
print(response.object.condition)    # e.g., "sunny"

Understanding the Response

The generate_object function returns an ObjectResult that contains:


class ObjectResult:
    object: BaseModel           # Your structured data
    finish_reason: FinishReason # Why generation stopped
    usage: Usage                # Token usage statistics
    request: RequestMetadata    # Request information
    response: ResponseMetadata  # Response information
    warnings: List[Warning]     # Any warnings generated

Basic Usage


class MovieReview(BaseModel):
    title: str
    rating: int
    review: str
 
response = generate_object(
    model=openai("gpt-4"),
    output_model=MovieReview,
    prompt="Review the movie 'Inception'"
)
print(f"Rating: {response.object.rating}/10")
print(response.usage.total_tokens)  # Token usage

With Validation


from pydantic import BaseModel, Field
 
class Rating(BaseModel):
    score: int = Field(ge=1, le=10)  # Between 1-10
    review: str = Field(min_length=10)
 
response = generate_object(
    model=openai("gpt-4"),
    schema=Rating,
    prompt="Rate this product"
)
# Will automatically validate score is 1-10
# and review is at least 10 chars

Complex Model


from typing import List
 
class Character(BaseModel):
    name: str
    age: int
    skills: List[str]
 
class Story(BaseModel):
    title: str
    genre: str
    characters: List[Character]
    plot_summary: str
 
response = generate_object(
    model=openai("gpt-4"),
    schema=Story,
    prompt="Create a sci-fi story"
)
for char in response.object.characters:
    print(f"{char.name}: {', '.join(char.skills)}")

Advanced Features

Using System Messages

You can provide system messages to guide the structured output generation:


class ProductReview(BaseModel):
    pros: List[str]
    cons: List[str]
    verdict: str
 
response = generate_object(
    model=openai("gpt-4"),
    schema=ProductReview,
    system="You are a critical product reviewer. Be honest about both positives and negatives.",
    prompt="Review the latest iPhone"
)

Nested Models

The AI SDK supports complex nested models with full validation:


from typing import Optional, Dict
 
class Address(BaseModel):
    street: str
    city: str
    country: str
    postal_code: str
 
class Contact(BaseModel):
    email: str
    phone: Optional[str]
    address: Address
    metadata: Dict[str, str] = {}
 
response = generate_object(
    model=openai("gpt-4"),
    schema=Contact,
    prompt="Generate contact info for John Doe"
)

The LLM will automatically format its response to match your model’s structure, including all required fields and proper types.

Error Handling

When using structured outputs, you might encounter validation errors when the LLM generates data that doesn’t match your model’s requirements. The SDK provides the AI_ObjectValidationError class for handling these cases:


from ai_sdk.core.errors import AI_ObjectValidationError
 
try:
    response = generate_object(
        model=openai("gpt-4"),
        output_model=YourModel,
        prompt="Your prompt"
    )
except AI_ObjectValidationError as e:
    print("Validation error:", e.message)
except AI_APICallError as e:
    print(f"API error (status {e.status_code}):", str(e))
    if e.is_retryable:
        print("This error can be retried")

The AI_ObjectValidationError provides a descriptive message about what fields failed validation and why.

⚠️

Always implement error handling when using structured outputs in production. The LLM might occasionally generate invalid data that needs to be handled gracefully.

You might also encounter API-level errors when making requests. The AI_APICallError provides detailed information about the failed request:

Status code
Whether the error is retryable
Response body
Request details


try:
    response = generate_object(
        model=openai("gpt-4"),
        output_model=YourModel,
        prompt="Your prompt"
    )
except AI_APICallError as e:
    print(f"API call failed:")
    print(f"  URL: {e.url}")
    print(f"  Status: {e.status_code}")
    print(f"  Response: {e.response_body}")
    print(f"  Retryable: {e.is_retryable}")