Home
cd ../playbooks
Knowledge ManagementIntermediate

Book-to-Skill Converter

Convert any technical PDF or EPUB into a structured Claude Code skill — extracting named frameworks, principles, techniques, and anti-patterns into a reusable knowledge base with on-demand chapter files.

15 minutes
By virgiliojr94Source
#skill#pdf#epub#knowledge-base#frameworks#study#reference

You bought the book, highlighted half of it, and forgot the frameworks two weeks later. Re-reading isn't the answer — turning the book into a skill Claude can call up while you work is. Frameworks become callable, glossary becomes searchable, chapters load on demand.

Who it's for: knowledge workers building personal knowledge bases, researchers, lifelong learners, engineers who study reference books, anyone who reads non-fiction and wants the ideas to stick

Example

"Turn this Designing Data-Intensive Applications PDF into a skill" → SKILL.md with core frameworks (replication, partitioning, consistency models), 12 chapter summaries loaded on demand, glossary, patterns, and cheatsheet — total ~25K tokens, callable as /ddia replication

CLAUDE.md Template

New here? 3-minute setup guide → | Already set up? Copy the template below.

---
name: book-to-skill
description: Converts a technical book (PDF or EPUB) into a structured Claude Code skill — extracting frameworks, mental models, principles, techniques, and anti-patterns the author crystallized. Use when the user wants to study a book through Claude, apply an author's frameworks while working, or build a reusable knowledge base from any PDF or EPUB.
when_to_use: Trigger phrases — "turn this book into a skill", "create a skill from this PDF", "create a skill from this EPUB", "I want to study X book", "add this book to my skills", "convert PDF to skill", "convert EPUB to skill", "analyze this book", "extract frameworks from this book". Accepts a path to a PDF or EPUB and optional skill name slug.
disable-model-invocation: true
context: fork
agent: general-purpose
allowed-tools: Bash(python3 *) Bash(pdftotext *) Bash(mkdir *) Bash(cp *) Bash(find *) Bash(wc *) Bash(echo *) Bash(cat *) Bash(date *) Read Write Glob Grep
argument-hint: <path-to-pdf-or-epub> [skill-name-slug]
arguments: [book_path, skill_name]
effort: high
---

# Book-to-Skill Converter

Transform written knowledge into actionable Claude Code skills by extracting structure — not producing summaries.

## Philosophy

Books contain crystallized expertise: frameworks, principles, and techniques that took years to develop. This skill extracts that knowledge into a format Claude can leverage repeatedly.

**Extract structure, not summaries.** A skill isn't a book report. It's a toolkit of:
- Named frameworks (mental models with clear application)
- Actionable principles (rules that guide decisions)
- Techniques (step-by-step methods)
- Anti-patterns (what to avoid and why)
- Voice calibration (how the author thinks and communicates)

**Preserve the author's precision.** Frameworks often have specific names for reasons. "The 5 Whys" isn't interchangeable with "ask why multiple times." Capture the exact formulation.

**Layer depth appropriately.** Simple books → simple skills. Complex books with 10+ frameworks → skills with reference files and on-demand chapters.

---

## Modes of Operation

Three paths available. Route based on what the user asks:

### 1. Full Conversion (Default)
**Trigger:** User provides a PDF path without special instructions
**Action:** Run all steps below (Steps 0–9)
**Output:** Complete skill with SKILL.md, chapters/, glossary, patterns, cheatsheet

### 2. Analyze Only
**Trigger:** User says "analyze", "just extract", or "I want to review before generating"
**Action:** Run Steps 0–3, then produce a structured extraction report (frameworks, principles, techniques found). Stop — do NOT generate skill files.
**Output:** Analysis report for user review

### 3. Generate from Prior Analysis
**Trigger:** User has existing analysis notes or previously ran analyze-only
**Action:** Skip Steps 0–3, use the provided analysis as input, run Steps 4–9
**Output:** Skill files from the provided analysis

---

## Step 0 — Out-of-scope check

If the argument is NOT a path to a PDF or EPUB file, stop and respond:
> "book-to-skill requires a PDF or EPUB path. Usage: `/book-to-skill /path/to/book.pdf [skill-name]` or `/book-to-skill /path/to/book.epub [skill-name]`"

---

## Step 1 — Validate input

```bash
test -f "$0" && echo "FILE_OK" || echo "FILE_NOT_FOUND: $0"
file "$0" | grep -iE "pdf|epub|zip" && echo "FORMAT_OK" || echo "FORMAT_UNKNOWN"
```

Check the file extension (`.pdf` or `.epub`) or magic bytes (`%PDF` or `PK` zip header).

If the file is not found or the format is not supported, stop with a clear error message listing supported formats.

---

## Step 2 — Extract text from PDF or EPUB

Run the extraction script:

```bash
python3 ~/.claude/skills/book-to-skill/scripts/extract.py "$0"
```

This creates:
- `/tmp/book_skill_work/full_text.txt` — full extracted text
- `/tmp/book_skill_work/metadata.json` — title, estimated pages, token count, size

Read `/tmp/book_skill_work/metadata.json` to understand what was extracted.

---

## Step 2.5 — Pre-flight cost estimate

Read `/tmp/book_skill_work/metadata.json` and present the user with an estimate **before doing any generation**:

```
Book detected: <filename> (<format: PDF or EPUB>)
Pages/Spine items: ~<N> | Words: ~<N> | Source tokens: ~<N>K

Estimated token cost (Full Conversion):
   Input  (book reading + prompts): ~<N>K tokens
   Output (skill files generated):  ~<N>K tokens
   Total:                           ~<N>K tokens

   Reference prices (as of 2025):
   Claude Sonnet 4.5 → ~$<X> USD
   Claude Haiku 4.5  → ~$<X> USD

   Estimated time: ~<N> minutes

Files to be generated:
   SKILL.md + <N> chapter files + glossary + patterns + cheatsheet

Proceed with Full Conversion? (or type "analyze only" to preview first)
```

**How to estimate:**
- Input tokens ≈ `estimated_tokens` from metadata × 1.3 (prompts overhead per chapter pass)
- Output tokens ≈ chapters × 1,000 + 4,000 (SKILL.md) + 4,500 (glossary + patterns + cheatsheet)
- Price: Sonnet input=$3/MTok output=$15/MTok — Haiku input=$0.80/MTok output=$4/MTok

Wait for the user to confirm before proceeding. If they say "analyze only", switch to Mode 2.

---

## Step 3 — Analyze book structure

Read the first 8,000 characters of `/tmp/book_skill_work/full_text.txt` to identify:
- Book **title** and **author(s)**
- **Chapter structure** (look for "Chapter N", "PART I", numbered headings, table of contents)
- **Core themes** and subject domain
- Approximate number of chapters

Then read the Table of Contents section if present to map all chapters.

**If mode is "Analyze Only":** produce the extraction report now and stop. Structure:
```
## Extraction Report — <Title>

### Author's Core Frameworks
- **<Framework Name>**: <what it is and when to apply>

### Key Principles
- <Principle>: <actionable rule>

### Techniques & Methods
- <Technique>: <step-by-step or how-to>

### Anti-patterns
- <What to avoid>: <why>

### Suggested Skill Name
`{author-lastname}-{core-concept}` — e.g. `cialdini-influence`

### Chapters Detected
| # | Title | Main Frameworks |
```

---

## Step 4 — Ask purpose (Full Conversion only)

Before generating, ask the user:

> "What should this skill help you do? (Pick one or more)
> 1. Apply the author's frameworks while working
> 2. Think with the author's mental models
> 3. Reference specific chapters and concepts
> 4. All of the above"

Use the answer to weight what gets highlighted in the SKILL.md Core section.

---

## Step 5 — Determine skill name

If `$1` was provided, use it as the skill slug.
Otherwise, propose two options and let the user choose:
- **By author-concept**: `{author-lastname}-{core-concept}` (e.g. `cialdini-influence`, `meadows-systems`)
- **By title**: lowercase hyphens from book title (e.g. `designing-data-intensive-apps`)

Default to author-concept format if the book has a strong methodological identity.

Check that `~/.claude/skills/<skill_name>/` does NOT already exist.
If it does, append `-2` or ask the user before overwriting.

---

## Step 6 — Create skill directory structure

```bash
mkdir -p ~/.claude/skills/<skill_name>/chapters
```

---

## Step 7 — Generate chapter summaries

**TOKEN BUDGET RULE — CRITICAL:**
- Each chapter summary file: **800–1,200 tokens** (dense, not verbose)
- Files are loaded on-demand — they are NOT capped per se, but keep them useful and tight

For EACH chapter/major section identified in Step 3:

Read the corresponding section of `/tmp/book_skill_work/full_text.txt` (use character offsets or grep for chapter headings).

Create `~/.claude/skills/<skill_name>/chapters/ch<NN>-<slug>.md` with this structure:

```markdown
# Chapter N: <Full Title>

## Core Idea
<1–2 sentences: the single most important thing this chapter teaches>

## Frameworks Introduced
- **<Framework Name>**: <exact formulation — preserve the author's naming>
  - When to use: <specific situation>
  - How: <steps or criteria>

## Key Concepts
- **<Term>**: <precise definition in 1 sentence>
(5–10 most important terms from this chapter)

## Mental Models
<2–4 frameworks or thinking tools. Write as "Use X when Y" or "Think of X as Y">

## Anti-patterns
- **<What to avoid>**: <why it fails>

## Key Takeaways
1. <Actionable insight>
2. <Actionable insight>
3. <Actionable insight>
(3–7 takeaways a practitioner must remember)

## Connects To
- **Ch N**: <why this chapter relates>
- **<Concept>**: <external concept or standard it connects with>
```

---

## Step 8 — Generate supporting files

### glossary.md
Create `~/.claude/skills/<skill_name>/glossary.md`:
- Every significant term from the book, alphabetically sorted
- Format: `**Term** — definition (Ch N)`
- Max 1,500 tokens

### patterns.md
Create `~/.claude/skills/<skill_name>/patterns.md`:
- All concrete techniques, design patterns, algorithms from the book
- Format: `## Pattern Name\n**When to use**: ...\n**How**: ...\n**Trade-offs**: ...`
- Max 2,000 tokens

### cheatsheet.md
Create `~/.claude/skills/<skill_name>/cheatsheet.md`:
- Decision tables, comparison matrices, quick-reference rules
- The content you'd want on a single printed page
- Max 1,000 tokens

---

## Step 9 — Generate the master SKILL.md

**CRITICAL TOKEN BUDGET: Keep SKILL.md body under 4,000 tokens.**
Compaction truncates from the END — put the most important content FIRST.

Create `~/.claude/skills/<skill_name>/SKILL.md`:

```markdown
---
name: <skill_name>
description: Knowledge base from "<Full Title>" by <Author(s)>. Use when applying <author>'s frameworks for <key topics, 3–6 terms>.
when_to_use: <10–15 trigger phrases based on book topics and terms. Comma-separated.>
allowed-tools: Read Grep
argument-hint: [topic, framework name, or chapter number]
---

# <Full Title>
**Author**: <Author(s)> | **Pages**: ~<N> | **Chapters**: <N> | **Generated**: <YYYY-MM-DD>

## How to Use This Skill

- **Without arguments** — `/skill-name` loads core frameworks for reference
- **With a topic** — `/skill-name replication` → I find and read the relevant chapter
- **With chapter** — `/skill-name ch05` → I load that specific chapter
- **Browse** — ask "what chapters do you have?" to see the full index

When you ask about a topic not covered in Core Frameworks below, I will read
the relevant chapter file before answering.

---

## Core Frameworks & Mental Models
<!-- ~2,000 tokens: the author's most important named frameworks and principles.
     Preserve exact names. Write as "Use X when Y", "Prefer X over Y because Z".
     This is a toolkit, not a summary. -->

<generate 2,000 tokens of the most critical frameworks and insights here>

---

## Chapter Index

| # | Title | Key Frameworks |
|---|-------|----------------|
| [ch01](chapters/ch01-<slug>.md) | <Title> | <framework1>, <framework2> |
| [ch02](chapters/ch02-<slug>.md) | <Title> | <framework1>, <framework2> |
...

## Topic Index

<!-- Alphabetical. Major terms/frameworks → chapter(s) that cover them. -->
- **<Term>** → ch<N>[, ch<N>]
- **<Term>** → ch<N>

## Supporting Files

- [glossary.md](glossary.md) — all key terms with definitions
- [patterns.md](patterns.md) — all techniques and design patterns
- [cheatsheet.md](cheatsheet.md) — quick reference tables and decision guides

---

## Scope & Limits

This skill covers the book content only. For hands-on implementation in your codebase,
combine with project-specific tools. For topics beyond this book, check related skills
or ask Claude directly.
```

---

## Step 10 — Cleanup and report

```bash
rm -rf /tmp/book_skill_work
```

Then report to the user:

```
Skill created: ~/.claude/skills/<skill_name>/

Book: <Full Title> — <Author>
Pages: ~<N> | Chapters: <N>

Files generated:
  SKILL.md         — core frameworks + index   (~X tokens)
  chapters/        — <N> chapter summaries     (~X tokens each, ~X total)
  glossary.md      — key terms                 (~X tokens)
  patterns.md      — techniques & patterns     (~X tokens)
  cheatsheet.md    — quick reference           (~X tokens)
  ─────────────────────────────────────────────────────
  Total skill size: ~X tokens (loaded on-demand, not all at once)

Tip: run /cost in Claude Code to see the actual token usage for this session.

Usage:
  /<skill_name>                    → load core frameworks
  /<skill_name> <topic>            → find and explain a topic
  /<skill_name> ch<N>              → dive into a specific chapter
```

---

## Quality Rules

1. **Extract structure, not summaries** — capture named frameworks, exact formulations, anti-patterns; not chapter recaps
2. **Preserve the author's precision** — "The 5 Whys" ≠ "ask why multiple times"; keep exact naming
3. **Density over completeness** — a 1,000-token summary beats a 10,000-token excerpt
4. **Practitioner voice** — write "Use X when Y", not "The book explains X"
5. **Front-load SKILL.md** — compaction keeps the first 5,000 tokens; most important content comes first
6. **Chapter files are on-demand** — they don't count against skill budget until loaded
7. **Never copy raw book text** — always synthesize, summarize, extract signal
8. **Topic index is critical** — it's how Claude navigates to the right chapter file
README.md

What This Does

Turns a PDF or EPUB book into a Claude Code skill: a structured, on-demand knowledge base instead of a one-shot summary. The skill extracts the author's named frameworks, mental models, principles, techniques, and anti-patterns — preserving exact terminology — and packages them as a SKILL.md plus per-chapter files, a glossary, a patterns reference, and a cheatsheet.

The point is not to replace reading. The point is to make the structure callable while you work, so frameworks like "The 5 Whys" or "PICO" stay one slash command away.

The Problem

Books crystallize years of expertise into named frameworks, but the format fights retention. Linear chapters bury reusable structure under prose. Highlights pile up in apps you never reopen. Two weeks after finishing, you remember the vibe but not the framework — and definitely not the precise formulation that made it useful.

Generic AI summaries make this worse. They flatten "The 5 Whys" into "ask why multiple times" and lose the precision that made the framework worth learning. You end up with a book report instead of a toolkit.

The Fix

This playbook converts a book into a layered skill structure designed for retrieval, not narrative:

  • SKILL.md — front-loaded with the author's most important frameworks (~2,000 tokens). Compaction-safe.
  • chapters/ch<NN>-<slug>.md — one file per chapter, 800–1,200 tokens each, loaded only when the topic is queried.
  • glossary.md — every significant term, alphabetical, with chapter back-references.
  • patterns.md — concrete techniques with when-to-use, how, and trade-offs.
  • cheatsheet.md — decision tables and quick-reference rules.

Three modes: full conversion (default), analyze-only (preview frameworks before generating), and generate-from-prior-analysis. A pre-flight cost estimate runs before any generation so you know the token spend up front.

Quick Start

  1. Install the skill into your Claude home:
mkdir -p ~/.claude/skills/book-to-skill/scripts
  1. Download the template above and save it as ~/.claude/skills/book-to-skill/SKILL.md.

  2. Add the extraction script at ~/.claude/skills/book-to-skill/scripts/extract.py (a small Python script that reads the PDF or EPUB into /tmp/book_skill_work/full_text.txt and writes metadata.json). The skill expects pdftotext available for PDFs and standard EPUB unzip for EPUBs.

  3. Launch Claude Code in any working directory and invoke:

/book-to-skill /path/to/book.pdf
  1. Confirm the cost estimate, answer the purpose question, pick a skill name, and let the skill generate the full knowledge base under ~/.claude/skills/<your-book-skill>/.

  2. Use the new skill any time:

/<your-book-skill>                 → load core frameworks
/<your-book-skill> <topic>         → find and explain a topic
/<your-book-skill> ch05            → dive into a specific chapter

Example Commands

"/book-to-skill ~/Downloads/influence-cialdini.pdf cialdini-influence"
→ Full conversion with explicit skill slug

"/book-to-skill ~/books/thinking-in-systems.epub"
→ Skill proposes name options (meadows-systems vs thinking-in-systems)

"/book-to-skill ~/papers/ddia.pdf — analyze only"
→ Extraction report only: frameworks, principles, techniques. No files written.

"/cialdini-influence reciprocity"
→ Loads ch02 on reciprocity and answers from the chapter

"/ddia ch07"
→ Loads chapter 7 (transactions) on demand

Tips

  • Use analyze-only first for unfamiliar books. It costs a fraction of full conversion and lets you sanity-check whether the extracted frameworks match what you remember.
  • Pick author-concept slugs over title slugs when the book has a strong methodological identity (cialdini-influence, meadows-systems). They are easier to recall as commands.
  • Front-load SKILL.md. Compaction truncates from the end, so the most important frameworks must come first. The template enforces this with a 4,000-token body cap and Core Frameworks at the top.
  • Keep chapter files dense. 800–1,200 tokens forces synthesis. A 5,000-token chapter file that copies the book is worse than a 1,000-token file that captures the structure.
  • Preserve exact framework names. "The 5 Whys" is not interchangeable with "ask why repeatedly." The whole point of a skill is precise retrieval.
  • Topic index does the routing. When you ask /book-skill <topic>, Claude uses the topic index in SKILL.md to find the right chapter file. A weak topic index breaks the skill.

Limitations

  • Extraction quality depends on the source. Scanned PDFs without OCR will fail. Heavily formatted EPUBs (lots of footnotes, sidebars) may produce noisy text that needs cleanup.
  • Token cost scales with book length. A 600-page reference book generates 25K–40K output tokens for the skill files. Use the pre-flight estimate before committing.
  • Books without strong structure (memoirs, narrative non-fiction) convert poorly. The skill is optimized for books with named frameworks — technical books, business frameworks, methodology guides.
  • Never copy raw book text. The skill is designed to synthesize, not transcribe. If your output looks like quoted excerpts, the chapter files are too long.
  • Disable model invocation is on by default (disable-model-invocation: true). The skill runs deterministic extraction first; the model only synthesizes the structure. Do not turn this off unless you understand the cost implications.

Troubleshooting

Problem: pdftotext produces garbled or empty text

Solution: The PDF likely needs OCR. Run ocrmypdf input.pdf output.pdf first, then point /book-to-skill at the OCR output. For locked or DRM-protected PDFs, the skill cannot proceed — that is by design.

Problem: The chapter index in SKILL.md only shows 2–3 chapters for a long book

Solution: Step 3 chapter detection failed. The book likely uses a non-standard structure (Part I / Part II rather than numbered chapters, or all <h1> elements without numbers). Re-run in analyze-only mode and feed the analysis back via Mode 3 with explicit chapter boundaries.

Problem: SKILL.md is over 4,000 tokens and gets truncated by compaction

Solution: Move detail out of Core Frameworks into chapter files. The Core section is for the 6–10 most important frameworks only. Everything else lives in chapters and is loaded on demand.

Problem: Topic queries don't find the right chapter

Solution: The topic index in SKILL.md is too sparse. Add aliases — for example, **Eventual consistency** → ch05, ch09 and **BASE** → ch05 if the book uses both terms. The index is alphabetical and supports multiple chapter pointers per term.

$Related Playbooks