Structured Output

Generate validated Pydantic model instances from any supported LLM provider.

Basic Usage

from pydantic import BaseModel
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm_addons import generate_pydantic_json_model

class MovieReview(BaseModel):
    title: str
    rating: float
    summary: str
    recommended: bool

llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

review = generate_pydantic_json_model(
    model_class=MovieReview,
    prompt="Write a review for the movie Inception",
    llm_instance=llm
)

print(review.title)       # "Inception"
print(review.rating)      # 4.8
print(review.recommended) # True

The result is a validated MovieReview instance, not raw JSON.
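
Because the return value is a standard Pydantic (v2) model, the usual model methods work on it:

review_dict = review.model_dump()        # convert to a plain dict
review_json = review.model_dump_json()   # serialize back to a JSON string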

Parameters

Parameter         Type             Default   Description
model_class       Type[BaseModel]  Required  Pydantic model class to generate
prompt            str              Required  The input prompt
llm_instance      LLM              Required  LLM instance to use
max_retries       int              3         Retries on validation failure
max_tokens        int              4096      Maximum output tokens
temperature       float            0.7       Sampling temperature
top_p             float            1.0       Nucleus sampling parameter
full_response     bool             False     Return LLMFullResponse with metadata
images            list             None      Image URLs or local file paths for vision input
detail            str              "auto"    Image detail level: "low", "high", or "auto"
web_search        bool             False     Enable web search before generation
reasoning_effort  str              None      "low", "medium", or "high" (OpenAI reasoning models)

Note: Validation failures trigger automatic retries with exponential backoff.
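
The retry and sampling knobs in the table above can be tuned per call. For example, a lower temperature and a higher max_retries can help when a model struggles to emit valid JSON:

review = generate_pydantic_json_model(
    model_class=MovieReview,
    prompt="Write a review for the movie Inception",
    llm_instance=llm,
    max_retries=5,     # more attempts before giving up
    temperature=0.2    # lower temperature for more deterministic output
)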

Nested Models

from pydantic import BaseModel
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm_addons import generate_pydantic_json_model

class Author(BaseModel):
    name: str
    bio: str

class BlogPost(BaseModel):
    title: str
    author: Author
    tags: list[str]
    word_count: int

llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

post = generate_pydantic_json_model(
    model_class=BlogPost,
    prompt="Generate a blog post about async Python programming",
    llm_instance=llm
)

print(post.title)
print(post.author.name)
print(post.tags)

Nested models are handled automatically; the LLM output is validated against the full schema, including every nested level.
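
Pydantic constraints declared with Field are enforced on nested models as well, and per the retry note above, a violation counts as a validation failure. A sketch of the same schema with constraints added:

from pydantic import BaseModel, Field

class Author(BaseModel):
    name: str
    bio: str = Field(max_length=280)   # enforced at validation time

class BlogPost(BaseModel):
    title: str
    author: Author
    tags: list[str] = Field(min_length=1, max_length=5)  # 1 to 5 tags
    word_count: int = Field(gt=0)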

Generating Lists

Use RootModel to generate JSON arrays directly.

from typing import List
from pydantic import BaseModel, RootModel
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm_addons import generate_pydantic_json_model

class FAQItem(BaseModel):
    question: str
    answer: str

class FAQList(RootModel[List[FAQItem]]):
    pass

llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

faqs = generate_pydantic_json_model(
    model_class=FAQList,
    prompt="Generate 3 FAQs about machine learning",
    llm_instance=llm
)

for item in faqs.root:
    print(f"Q: {item.question}")
    print(f"A: {item.answer}")

Note: RootModel requires Pydantic v2. It lets you generate arrays directly instead of wrapping them in an object.
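
The alternative without RootModel is to wrap the list in a container model, which makes the LLM return a JSON object rather than a bare array (FAQResponse below is just an illustrative name):

from typing import List
from pydantic import BaseModel

class FAQResponse(BaseModel):
    faqs: List[FAQItem]   # FAQItem as defined above

# Passing model_class=FAQResponse returns an object whose .faqs holds the list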

Full Response with Metadata

Set full_response=True to get token counts and timing:

response = generate_pydantic_json_model(
    model_class=MovieReview,
    prompt="Review the movie Interstellar",
    llm_instance=llm,
    full_response=True
)

print(response.model_object.title)
print(f"Input tokens: {response.input_token_count}")
print(f"Output tokens: {response.output_token_count}")
print(f"Time: {response.process_time:.2f}s")

Vision / Image Input

Pass images via the images parameter to extract structured data from visual content:

from pydantic import BaseModel
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm_addons import generate_pydantic_json_model

class ImageDescription(BaseModel):
    objects: list[str]
    scene: str
    mood: str

llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

# Using a URL
result = generate_pydantic_json_model(
    model_class=ImageDescription,
    prompt="Describe this image in detail",
    llm_instance=llm,
    images=["https://example.com/photo.jpg"]
)

print(result.scene)
print(result.objects)

You can also use local file paths and pass multiple images:

result = generate_pydantic_json_model(
    model_class=ImageDescription,
    prompt="Describe what you see",
    llm_instance=llm,
    images=["path/to/image1.png", "path/to/image2.jpg"],
    detail="high"  # "low", "high", or "auto"
)

Provider   Vision Support
OpenAI     Supported
Anthropic  Supported
Gemini     Supported
Ollama     Supported (llava, llama3.2-vision)
Cohere     Supported
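
The same call also works against a local vision-capable model. A minimal sketch, assuming a running Ollama server with the llava model pulled and that the provider enum exposes OLLAMA:

llm = LLM.create(provider=LLMProvider.OLLAMA, model_name="llava")

result = generate_pydantic_json_model(
    model_class=ImageDescription,
    prompt="Describe this image",
    llm_instance=llm,
    images=["path/to/photo.png"]
)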

Web Search

Enable web_search=True to ground output in real-time data:

from pydantic import BaseModel
from SimplerLLM.language.llm import LLM, LLMProvider
from SimplerLLM.language.llm_addons import generate_pydantic_json_model

class ResearchSummary(BaseModel):
    topic: str
    key_findings: list[str]
    sources_used: int

llm = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")

response = generate_pydantic_json_model(
    model_class=ResearchSummary,
    prompt="Summarize recent advances in quantum computing",
    llm_instance=llm,
    web_search=True,
    full_response=True
)

print(response.model_object.key_findings)

for source in response.web_sources:
    print(f"{source['title']}: {source['url']}")

Provider    Web Search
OpenAI      Supported
Anthropic   Supported
Gemini      Supported
Perplexity  Always enabled
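
Whether web_sources is None or an empty list when no search ran is provider-dependent, so defensive access is safest when code paths vary (a sketch, not library-guaranteed behavior):

sources = getattr(response, "web_sources", None) or []
for source in sources:
    print(f"{source['title']}: {source['url']}")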

ReliableLLM Fallback

Use generate_pydantic_json_model_reliable to automatically fall back to a secondary provider:

from SimplerLLM.language.llm import LLM, LLMProvider, ReliableLLM
from SimplerLLM.language.llm_addons import generate_pydantic_json_model_reliable

primary = LLM.create(provider=LLMProvider.OPENAI, model_name="gpt-4o")
secondary = LLM.create(provider=LLMProvider.ANTHROPIC, model_name="claude-sonnet-4-20250514")
reliable = ReliableLLM(primary_llm=primary, secondary_llm=secondary)

review, provider, model_name = generate_pydantic_json_model_reliable(
    model_class=MovieReview,
    prompt="Review the movie The Matrix",
    reliable_llm=reliable
)

print(f"{review.title} (from {provider.name} / {model_name})")

If the primary provider fails, the secondary is used automatically.

Async Usage

import asyncio
from SimplerLLM.language.llm_addons import generate_pydantic_json_model_async

async def main():
    review = await generate_pydantic_json_model_async(
        model_class=MovieReview,
        prompt="Review the movie Dune",
        llm_instance=llm
    )
    print(review.title)

asyncio.run(main())

Note: The async reliable variant is generate_pydantic_json_model_reliable_async.
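
Assuming it mirrors the synchronous signature (check the package for the exact return shape), usage with the reliable instance and MovieReview model from above looks like:

import asyncio
from SimplerLLM.language.llm_addons import generate_pydantic_json_model_reliable_async

async def main():
    review, provider, model_name = await generate_pydantic_json_model_reliable_async(
        model_class=MovieReview,
        prompt="Review the movie Dune",
        reliable_llm=reliable
    )
    print(f"{review.title} (from {provider.name})")

asyncio.run(main())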

Error Handling

If generation still fails after all retries, the function returns an error string instead of a model instance, so check the type before using the result.

result = generate_pydantic_json_model(
    model_class=MovieReview,
    prompt="Review a movie",
    llm_instance=llm
)

if isinstance(result, str):
    print(f"Error: {result}")
else:
    print(result.title)
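
If you prefer exceptions over string checks, a small wrapper (not part of the library) can convert the error string into a raised exception:

def require_model(result):
    """Raise instead of returning an error string. Illustrative helper only."""
    if isinstance(result, str):
        raise RuntimeError(f"Structured generation failed: {result}")
    return result

review = require_model(generate_pydantic_json_model(
    model_class=MovieReview,
    prompt="Review a movie",
    llm_instance=llm
))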