PDF tool
pdf analyzes one or more PDF documents and returns text.
Quick behavior:
- Native provider mode for Anthropic and Google model providers.
- Extraction fallback mode for other providers (extract text first, then page images when needed).
- Supports single (pdf) or multi (pdfs) input, max 10 PDFs per call.
Availability
The tool is only registered when OpenClaw can resolve a PDF-capable model config for the agent:
- agents.defaults.pdfModel
- fallback to agents.defaults.imageModel
- fallback to best effort provider defaults based on available auth
If no model can be resolved, the pdf tool is not exposed.
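The resolution order above can be read as a first-match chain. The following is a minimal sketch, not OpenClaw's implementation: the AgentDefaults shape, the function names, and the providerDefault parameter are hypothetical, and model refs are assumed to be plain provider/model strings. Only the key names pdfModel and imageModel come from this page.

```typescript
// Hypothetical shape: only the key names pdfModel and imageModel come from this page.
interface AgentDefaults {
  pdfModel?: string; // assumed to be a "provider/model" string
  imageModel?: string; // assumed to be a "provider/model" string
}

// Returns the first configured model ref, or undefined when nothing resolves.
// providerDefault stands in for "best effort provider defaults based on available auth".
function resolvePdfModel(
  defaults: AgentDefaults,
  providerDefault?: string,
): string | undefined {
  return defaults.pdfModel ?? defaults.imageModel ?? providerDefault;
}

// Under this sketch, the tool is registered only when some model resolves.
function shouldRegisterPdfTool(
  defaults: AgentDefaults,
  providerDefault?: string,
): boolean {
  return resolvePdfModel(defaults, providerDefault) !== undefined;
}
```

With an empty defaults object and no provider default, shouldRegisterPdfTool returns false, which corresponds to the tool not being exposed.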
Input reference
- pdf (string): one PDF path or URL
- pdfs (string[]): multiple PDF paths or URLs, up to 10 total
- prompt (string): analysis prompt, default "Analyze this PDF document."
- pages (string): page filter like 1-5 or 1,3,7-9
- model (string): optional model override (provider/model)
- maxBytesMb (number): per-PDF size cap in MB
Input notes:
- pdf and pdfs are merged and deduplicated before loading.
- If no PDF input is provided, the tool errors.
- pages is parsed as 1-based page numbers, deduped, sorted, and clamped to the configured max pages.
- maxBytesMb defaults to agents.defaults.pdfMaxBytesMb or 10.
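The merge and page-filter rules above can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual parser: the function names are hypothetical, invalid tokens are assumed to throw, and "clamped to the configured max pages" is interpreted as keeping at most maxPages entries.

```typescript
// Merge the single and multi inputs, dropping duplicates while keeping first-seen order.
function mergePdfInputs(pdf?: string, pdfs: string[] = []): string[] {
  const all = pdf === undefined ? pdfs : [pdf, ...pdfs];
  return Array.from(new Set(all));
}

// Parse a filter such as "1-5" or "1,3,7-9" into sorted, unique, 1-based page numbers.
// Assumption: clamping means keeping at most maxPages entries.
function parsePages(filter: string, maxPages: number): number[] {
  const pages = new Set<number>();
  for (const part of filter.split(",")) {
    const token = part.trim();
    if (token === "") continue;
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) throw new Error(`invalid page filter: ${token}`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start) throw new Error(`invalid page range: ${token}`);
    for (let p = start; p <= end; p++) pages.add(p);
  }
  return Array.from(pages)
    .sort((a, b) => a - b)
    .slice(0, maxPages);
}
```

For example, parsePages("1,3,7-9", 20) yields pages 1, 3, 7, 8, and 9, and mergePdfInputs("a.pdf", ["b.pdf", "a.pdf"]) yields two entries.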
Supported PDF references
- local file path (including ~ expansion)
- file:// URL
- http:// and https:// URL
Reference notes:
- Other URI schemes (for example ftp://) are rejected with unsupported_pdf_reference.
- In sandbox mode, remote http(s) URLs are rejected.
- With workspace-only file policy enabled, local file paths outside allowed roots are rejected.
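The reference rules above amount to a scheme check followed by policy checks. A rough sketch, assuming a hypothetical classifyPdfReference function and a hypothetical sandbox flag: the unsupported_pdf_reference code comes from this page, while the sandbox error string is invented for illustration. The workspace-only root check is left out because it depends on configuration not described here.

```typescript
type PdfRefKind = "local" | "file-url" | "remote";

type RefResult =
  | { ok: true; kind: PdfRefKind }
  | { ok: false; error: string };

// sandbox is a hypothetical flag standing in for "sandbox mode".
function classifyPdfReference(ref: string, sandbox: boolean): RefResult {
  // Match a URI scheme followed by "://", for example "https://" or "ftp://".
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):\/\//.exec(ref);
  if (match === null) return { ok: true, kind: "local" }; // plain path, "~" included
  const scheme = match[1].toLowerCase();
  if (scheme === "file") return { ok: true, kind: "file-url" };
  if (scheme === "http" || scheme === "https") {
    // Invented error code: this page does not name the sandbox rejection error.
    if (sandbox) return { ok: false, error: "remote_url_blocked_in_sandbox" };
    return { ok: true, kind: "remote" };
  }
  return { ok: false, error: "unsupported_pdf_reference" };
}
```

Returning a result object rather than throwing mirrors the structured details.error style used for this error elsewhere on the page.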
Execution modes
Native provider mode
Native mode is used for the anthropic and google providers.
The tool sends raw PDF bytes directly to provider APIs.
Native mode limits:
- pages is not supported. If set, the tool returns an error.
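Mode selection and the pages restriction can be sketched as below. The selectMode function is hypothetical and assumes the model ref has the provider/model form described in the input reference; the provider list and the error message are taken from this page.

```typescript
const NATIVE_PROVIDERS = ["anthropic", "google"];

type PdfMode = "native" | "fallback";

// modelRef is "provider/model"; the provider is the part before the first "/".
function selectMode(modelRef: string, pages?: string): PdfMode {
  const provider = modelRef.split("/")[0];
  const mode: PdfMode = NATIVE_PROVIDERS.includes(provider) ? "native" : "fallback";
  if (mode === "native" && pages !== undefined) {
    throw new Error("pages is not supported with native PDF providers");
  }
  return mode;
}
```

Under this sketch, a page filter combined with a non-native provider selects fallback mode without error.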
Extraction fallback mode
Fallback mode is used for non-native providers.
Flow:
- Extract text from selected pages (up to agents.defaults.pdfMaxPages, default 20).
- If extracted text length is below 200 chars, render selected pages to PNG images and include them.
- Send extracted content plus prompt to the selected model.
Fallback details:
- Page image extraction uses a pixel budget of 4,000,000.
- If the target model does not support image input and there is no extractable text, the tool errors.
- Extraction fallback requires pdfjs-dist (and @napi-rs/canvas for image rendering).
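The decision logic in the fallback flow can be sketched without any PDF library. Several points here are assumptions rather than documented behavior: the function names and the modelSupportsImages flag are hypothetical, short but non-empty text is assumed to be sent alone when the model lacks image input, and the pixel budget is assumed to apply per page.

```typescript
const MIN_TEXT_CHARS = 200; // threshold from the flow above
const PIXEL_BUDGET = 4_000_000; // pixel budget from the details above

interface FallbackPlan {
  text: string;
  includeImages: boolean;
}

// Decide what to send to the model once text extraction has run.
// modelSupportsImages is a hypothetical capability flag.
function planFallback(extractedText: string, modelSupportsImages: boolean): FallbackPlan {
  const needsImages = extractedText.length < MIN_TEXT_CHARS;
  if (needsImages && !modelSupportsImages) {
    if (extractedText.length === 0) {
      throw new Error("no extractable text and the model does not support image input");
    }
    // Assumption: with some text but no image support, the text is sent alone.
    return { text: extractedText, includeImages: false };
  }
  return { text: extractedText, includeImages: needsImages };
}

// Scale factor that keeps one rendered page within the pixel budget
// (assumption: the budget applies per page; pages are never upscaled).
function renderScale(widthPx: number, heightPx: number): number {
  return Math.min(1, Math.sqrt(PIXEL_BUDGET / (widthPx * heightPx)));
}
```

For a 4000 by 4000 pixel page, renderScale returns 0.5, which brings the rendered page to exactly 4,000,000 pixels.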
Config
Output details
The tool returns text in content[0].text and structured metadata in details.
Common details fields:
- model: resolved model ref (provider/model)
- native: true for native provider mode, false for fallback
- attempts: fallback attempts that failed before success
Path fields:
- single PDF input: details.pdf
- multiple PDF inputs: details.pdfs[] with pdf entries
- sandbox path rewrite metadata (when applicable): rewrittenFrom
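A consumer might type the result roughly as follows. The field names come from the lists above, but the value types are assumptions: the element type of attempts, the shape of pdfs entries beyond their pdf field, and where rewrittenFrom appears are not specified on this page. The listAnalyzedPdfs helper is hypothetical.

```typescript
// Assumed entry shape: only the pdf field is named on this page.
interface PdfEntry {
  pdf: string;
  rewrittenFrom?: string; // placement is an assumption
}

interface PdfToolDetails {
  model: string; // resolved "provider/model" ref
  native: boolean; // true for native provider mode, false for fallback
  attempts: unknown[]; // element shape is not documented here
  pdf?: string; // present for single PDF input
  pdfs?: PdfEntry[]; // present for multiple PDF inputs
  rewrittenFrom?: string; // placement is an assumption
}

// Collect the analyzed PDF paths regardless of single or multi input.
function listAnalyzedPdfs(details: PdfToolDetails): string[] {
  if (details.pdfs !== undefined) return details.pdfs.map((entry) => entry.pdf);
  return details.pdf === undefined ? [] : [details.pdf];
}
```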
Error behavior
- Missing PDF input: throws "pdf required: provide a path or URL to a PDF document"
- Too many PDFs: returns structured error in details.error = "too_many_pdfs"
- Unsupported reference scheme: returns details.error = "unsupported_pdf_reference"
- Native mode with pages: throws a clear "pages is not supported with native PDF providers" error
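Because some failures are thrown and others are returned in details.error, a caller may want to normalize both paths. A minimal sketch, assuming a hypothetical synchronous invoke callback that stands in for however the tool is actually called (the real call is likely asynchronous):

```typescript
interface ToolResultLike {
  content: { text: string }[];
  details: { error?: string };
}

type Outcome =
  | { status: "ok"; text: string }
  | { status: "structured_error"; code: string }
  | { status: "thrown"; message: string };

// invoke is a hypothetical stand-in for the actual tool call.
function runPdfTool(invoke: () => ToolResultLike): Outcome {
  try {
    const result = invoke();
    if (result.details.error !== undefined) {
      // For example "too_many_pdfs" or "unsupported_pdf_reference".
      return { status: "structured_error", code: result.details.error };
    }
    return { status: "ok", text: result.content[0]?.text ?? "" };
  } catch (err) {
    // For example the missing-input or native-mode pages errors.
    const message = err instanceof Error ? err.message : String(err);
    return { status: "thrown", message };
  }
}
```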