What can the Anthropic PDF skill do?

It lets Claude read PDFs and extract their text, tables, and images; merge, split, rotate, and watermark pages; create new PDFs; encrypt or decrypt files; fill PDF forms; and run OCR on scanned documents.

Can it extract tables from a PDF?

Yes. It uses pdfplumber's extract_tables() to pull tabular data and can convert the rows into pandas DataFrames and export them to Excel.
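As a rough illustration of that flow, here is a minimal sketch. The function names rows_to_records and tables_to_excel are hypothetical, and the code assumes the first row of each table is a header with unique column names, which will not hold for every PDF. Writing Excel files with pandas also needs an engine such as openpyxl.

```python
def rows_to_records(table):
    """Turn one extracted table (a list of rows) into a list of dicts.

    Assumption: the first row is the header and its names are unique.
    """
    if not table:
        return []
    header = [
        cell if cell is not None else f"col{i}"
        for i, cell in enumerate(table[0])
    ]
    return [dict(zip(header, row)) for row in table[1:]]


def tables_to_excel(pdf_path, xlsx_path):
    """Write every table found in pdf_path to its own sheet in xlsx_path."""
    # Third-party imports are local so the pure helper above can be
    # reused without pdfplumber or pandas installed.
    import pandas as pd
    import pdfplumber

    frames = {}
    with pdfplumber.open(pdf_path) as pdf:
        for page_number, page in enumerate(pdf.pages, start=1):
            for table_number, table in enumerate(page.extract_tables(), start=1):
                sheet = f"p{page_number}_t{table_number}"
                frames[sheet] = pd.DataFrame(rows_to_records(table))

    # Only create the workbook if at least one table was found.
    if frames:
        with pd.ExcelWriter(xlsx_path) as writer:
            for sheet, frame in frames.items():
                frame.to_excel(writer, sheet_name=sheet, index=False)
    return len(frames)
```

Calling tables_to_excel("report.pdf", "report.xlsx") would return the number of sheets written; the file names are placeholders.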

Does it work on scanned PDFs with no text layer?

Yes. It converts scanned pages to images with pdf2image and runs OCR using pytesseract to produce searchable text.
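A minimal sketch of that pipeline might look like the following. The helper names are hypothetical, and the code assumes the Tesseract binary and Poppler are installed on the system. Note that this version returns plain text rather than embedding a text layer back into the PDF.

```python
def join_page_texts(texts):
    """Label each page's OCR output and join the pages into one string."""
    return "\n\n".join(
        f"--- Page {number} ---\n{text.strip()}"
        for number, text in enumerate(texts, start=1)
    )


def ocr_pdf(pdf_path):
    """Render each page to an image and run Tesseract OCR on it."""
    # Third-party imports are local so join_page_texts works on its own.
    import pytesseract
    from pdf2image import convert_from_path

    images = convert_from_path(pdf_path)  # one PIL image per page
    return join_page_texts([pytesseract.image_to_string(image) for image in images])
```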

Can it merge or split multiple PDFs?

Yes. It merges PDFs by adding pages with pypdf or qpdf, and splits a PDF into one file per page or into custom page ranges.
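The sketch below shows one way this could be done with pypdf. The function names and the "1-3,5" range syntax are assumptions made for illustration; they are not taken from the skill itself.

```python
def parse_page_ranges(spec):
    """Convert a 1-based spec such as "1-3,5" into zero-based page indices."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        first = int(start)
        last = int(end) if end else first
        indices.extend(range(first - 1, last))
    return indices


def merge_pdfs(input_paths, output_path):
    """Append every page of every input file to a single output file."""
    from pypdf import PdfReader, PdfWriter  # local third-party import

    writer = PdfWriter()
    for path in input_paths:
        for page in PdfReader(path).pages:
            writer.add_page(page)
    with open(output_path, "wb") as handle:
        writer.write(handle)


def extract_pages(input_path, spec, output_path):
    """Copy the pages named by spec (for example "1-3,5") into a new file."""
    from pypdf import PdfReader, PdfWriter  # local third-party import

    reader = PdfReader(input_path)
    writer = PdfWriter()
    for index in parse_page_ranges(spec):
        writer.add_page(reader.pages[index])
    with open(output_path, "wb") as handle:
        writer.write(handle)
```

Splitting into one file per page is the same idea: call extract_pages once per page number.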

Can it create brand-new PDFs?

Yes. It generates PDFs with reportlab, including multi-page reports, and uses XML markup tags for subscripts and superscripts instead of Unicode glyphs.
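For example, a small report using reportlab's Paragraph markup could be sketched like this. The helper names and the sample sentences are made up for illustration.

```python
def with_subscript(base, sub):
    """Return reportlab Paragraph markup for base followed by a subscript."""
    return f"{base}<sub>{sub}</sub>"


def with_superscript(base, sup):
    """Return reportlab Paragraph markup for base followed by a superscript."""
    return f"{base}<super>{sup}</super>"


def build_report(output_path):
    """Write a short PDF whose text uses markup tags, not Unicode glyphs."""
    # Third-party imports are local so the markup helpers work on their own.
    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate

    styles = getSampleStyleSheet()
    story = [
        Paragraph("Sample report", styles["Title"]),
        Paragraph("Water is " + with_subscript("H", "2") + "O.", styles["Normal"]),
        Paragraph("Area is given in " + with_superscript("m", "2") + ".", styles["Normal"]),
    ]
    SimpleDocTemplate(output_path, pagesize=letter).build(story)
```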

Does it support password protection?

Yes. It encrypts PDFs with user and owner passwords using pypdf, and removes passwords from encrypted files using qpdf.
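A minimal sketch of both directions, assuming pypdf is installed and the qpdf binary is on the PATH. The function names are hypothetical.

```python
import subprocess


def encrypt_pdf(input_path, output_path, user_password, owner_password):
    """Copy a PDF and protect the copy with user and owner passwords."""
    from pypdf import PdfReader, PdfWriter  # local third-party import

    writer = PdfWriter()
    for page in PdfReader(input_path).pages:
        writer.add_page(page)
    writer.encrypt(user_password, owner_password)
    with open(output_path, "wb") as handle:
        writer.write(handle)


def qpdf_decrypt_command(input_path, output_path, password):
    """Build the qpdf argument list that writes a decrypted copy."""
    return ["qpdf", f"--password={password}", "--decrypt", input_path, output_path]


def decrypt_pdf(input_path, output_path, password):
    """Run qpdf; raises CalledProcessError if qpdf reports a failure."""
    subprocess.run(qpdf_decrypt_command(input_path, output_path, password), check=True)
```

Passing the command as a list rather than a shell string keeps passwords with spaces or quotes from being misparsed.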

Is the PDF skill free and open source?

The skill is free to use but source-available, not open source. It is licensed as Proprietary (Anthropic); see LICENSE.txt for the full terms.

Where can I use this skill?

It works anywhere Claude has a file environment: Claude.ai paid plans, where it is preinstalled; Claude Code, via the document-skills plugin; and the Claude API.

Claude | Beginner | Free | Editorial

PDF Document Skill

Read, merge, split, watermark, fill forms, encrypt, and OCR scanned PDF files — directly from Claude.

Anthropic

Overview

Anthropic's official PDF skill teaches Claude to do almost anything with PDF files — extracting text and tables, merging, splitting, rotating, watermarking, creating new PDFs, filling forms, encrypting, extracting images, and running OCR on scans.

What is the PDF skill?

The PDF skill is one of the document skills that power Claude's file capabilities. It packages the right Python libraries and command-line tools for each PDF task, plus the workflows to use them correctly. The core toolkit is pypdf for page operations, pdfplumber for text and table extraction, reportlab for creating PDFs, qpdf and pdftk for command-line manipulation, and pytesseract with pdf2image for OCR. It also points to FORMS.md for filling PDF forms and to REFERENCE.md for advanced libraries such as pdf-lib and pypdfium2.

What it does

Extracts text and tables from PDFs (pdfplumber's extract_text() and extract_tables()), including export to Excel
Merges multiple PDFs into one and splits a PDF into individual pages
Rotates pages, adds watermarks, and extracts images with pdfimages
Creates new PDFs with reportlab, including multi-page reports and proper subscripts/superscripts
Fills PDF forms and encrypts/decrypts files with user and owner passwords
Runs OCR on scanned PDFs with pytesseract to make them searchable

How it works

Claude loads the skill's SKILL.md, which maps each task to the best tool. For reading, it opens the file with pypdf or pdfplumber and iterates over the pages; for tables it converts extracted rows into pandas DataFrames. For merging and splitting it uses pypdf's PdfWriter, or qpdf --empty --pages on the command line. Watermarks are applied by merging a watermark page onto every page. Creation uses reportlab's canvas or the Platypus document model, and the skill warns against Unicode subscript/superscript glyphs (which render as black boxes) in favor of <sub> and <super> markup tags. Scanned documents are converted to images with pdf2image and read with pytesseract. Encryption uses pypdf's writer.encrypt(); decryption uses qpdf, given the file's password.
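The watermark step described above can be sketched with pypdf roughly as follows. The function names and the "_watermarked" output naming are assumptions for illustration, and the watermark file is assumed to be a one-page PDF.

```python
from pathlib import Path


def watermarked_name(input_path):
    """Return a sibling file name such as report_watermarked.pdf."""
    path = Path(input_path)
    return str(path.with_name(f"{path.stem}_watermarked{path.suffix}"))


def add_watermark(input_path, watermark_path):
    """Merge the first page of watermark_path onto every page of input_path."""
    from pypdf import PdfReader, PdfWriter  # local third-party import

    watermark_page = PdfReader(watermark_path).pages[0]
    writer = PdfWriter()
    for page in PdfReader(input_path).pages:
        page.merge_page(watermark_page)  # draws the watermark over the page
        writer.add_page(page)

    output_path = watermarked_name(input_path)
    with open(output_path, "wb") as handle:
        writer.write(handle)
    return output_path
```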

Who it's for

Anyone who uses Claude to work with PDFs: analysts pulling tables out of financial statements, ops teams merging and splitting document bundles, legal teams encrypting or redacting files, and people who need to OCR scanned contracts or receipts into searchable text. It runs wherever Claude has a code/file environment — Claude.ai (paid plans), Claude Code, and the Claude API.

What you can build

Automated pipelines that extract tables from PDFs into Excel
Document assembly that merges, splits, and rotates pages on demand
Watermarked or password-encrypted versions of sensitive files
Searchable archives built by OCR-ing scanned PDFs
Generated reports and filled PDF forms from structured data

Why it matters

PDFs are the universal exchange format, but they are notoriously hard to parse and edit reliably — scanned pages have no text layer, tables lose their structure, and forms and encryption add friction. This skill encodes the tool-for-the-job choices Anthropic uses in production, so Claude reaches for pdfplumber for tables, pytesseract for scans, and qpdf for encryption instead of guessing. It turns messy PDF work into dependable, repeatable operations.

What's Included

SKILL.md mapping every PDF task to the right Python library or CLI tool
pypdf recipes for merging, splitting, rotating, and encrypting PDFs
pdfplumber workflows for extracting text and tables, including export to Excel
reportlab patterns for creating multi-page PDFs with correct subscripts and superscripts
OCR pipeline using pytesseract and pdf2image to make scanned PDFs searchable
FORMS.md for filling PDF forms and REFERENCE.md for pdf-lib and pypdfium2

Installation

1. Install in Claude Code

/plugin marketplace add anthropics/skills
/plugin install document-skills@anthropic-agent-skills

Or install just this skill with the skills CLI:

npx skills add anthropics/skills --skill pdf

2. Use it

On Claude.ai paid plans these document skills are already available. Just ask:

Use the PDF skill to extract the tables from report.pdf into Excel.

Or: "OCR scanned-contract.pdf so the text is searchable, then merge it with appendix.pdf."

Requirements

Claude.ai paid plan, Claude Code, or Claude API access
A code/file execution environment for running the skill scripts
Python libraries (pypdf, pdfplumber, reportlab, pytesseract, pdf2image) and command-line tools such as qpdf and Poppler for full functionality

Changelog

v1.0.0 (2026-06-01)

Initial listing of the official Anthropic pdf skill.

Related Skills

DOCX Document Skill

Create, read, and edit Microsoft Word (.docx) documents with formatting, tables of contents, tracked changes, and images — directly from Claude.

PPTX Presentation Skill

Create, read, and edit PowerPoint (.pptx) decks — slides, speaker notes, templates, and well-designed layouts — directly from Claude.
