# Story Factory: Transfer Document

## What This Is

This document captures the conclusions from an extended brainstorming session exploring whether an LLM can systematically generate quality fiction. The conversation moved from writing individual scenes to identifying a factory model for story production. A cold instance of Claude should be able to pick this up and continue building.

## The Core Insight

Fiction writing can be treated as an engineering discipline. The creative patterns already exist in the training data — the problem is extraction, formalization, and quality assurance. The analogy is homebuilding: you don't reinvent structural engineering for every house. You have codes, standard practices, and inspection processes. The craft is in applying them well to specific sites.

The key realization: emotional texture and character work come naturally to the LLM. Plot engineering — consistency, plausibility, earned reveals, information pacing — is where it's weakest. This is the inverse of what most people assume. The fix is the same as in software: generate, check, revise, check again. Quality comes from iteration against a formal eval suite, not from better first-draft generation.
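The generate, check, revise, check-again cycle can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the tooling used in the session: `run_loop`, `toy_generate`, `toy_check`, and `toy_revise` are hypothetical names, and the toy functions stand in for what would really be LLM calls and eval-suite entries.

```python
from typing import Callable, List, Tuple

# A check takes a draft and returns a list of failure notes (empty means pass).
Check = Callable[[str], List[str]]


def run_loop(
    generate: Callable[[], str],
    revise: Callable[[str, List[str]], str],
    checks: List[Check],
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Generate a draft, then check and revise until clean or out of rounds."""
    draft = generate()
    failures: List[str] = []
    for _ in range(max_rounds):
        failures = [note for check in checks for note in check(draft)]
        if not failures:
            break
        draft = revise(draft, failures)
    else:
        # Rounds ran out: re-check so the failures describe the returned draft.
        failures = [note for check in checks for note in check(draft)]
    return draft, failures


def toy_generate() -> str:
    # Hypothetical first draft that leans on a planted helper.
    return "A helpful stranger tells Nora where the paper came from."


def toy_check(draft: str) -> List[str]:
    # Crude keyword stand-in for a real plot-integrity check.
    return ["Convenient Information Source"] if "helpful stranger" in draft else []


def toy_revise(draft: str, failures: List[str]) -> str:
    # A real reviser would be an LLM call that is handed the failure notes.
    return "Nora recognizes the paper by touch; she grew up handling it."


final_draft, remaining = run_loop(toy_generate, toy_revise, [toy_check])
```

The loop returns both the draft and whatever failures remain, so a caller can decide whether an imperfect draft is still worth showing to the human inspector.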

## Three Workstreams

### Workstream 1: The Eval & Integrity Document (Test Suite for Fiction)

A formal checking system that runs against generated fiction. Each check has a name, description, failure example, and fix example.
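One possible way to hold a check in code is a small record with the four fields just named. The sketch below is an assumption about shape, not a settled format: `FictionCheck` and `as_prompt` are hypothetical names, and the sample entry condenses the first plot integrity check listed in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FictionCheck:
    """One entry in the eval suite: name, description, failure, fix."""
    name: str
    description: str
    failure_example: str
    fix_example: str

    def as_prompt(self) -> str:
        # Render the check as a block that a reviewing pass could be handed.
        return (
            f"CHECK: {self.name}\n"
            f"QUESTION: {self.description}\n"
            f"FAILURE LOOKS LIKE: {self.failure_example}\n"
            f"FIX LOOKS LIKE: {self.fix_example}"
        )


convenient_source = FictionCheck(
    name="Convenient Information Source",
    description="Does information arrive through a plausible real-world mechanism?",
    failure_example="A post office clerk remembers a random customer from weeks ago.",
    fix_example="The protagonist recognizes her mother's stationery by touch.",
)
print(convenient_source.as_prompt())
```

Keeping the failure and fix examples on the record means every check carries its own demonstration, which matters if the checks are later fed to a model as review instructions.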

**Plot Integrity Checks (discovered through live writing sessions):**

- **Convenient Information Source.** Does information arrive through a plausible real-world mechanism, or did the author plant a helpful stranger? *Failure:* A post office clerk who remembers a random customer from weeks ago and directs the protagonist to a stationery shop where the owner also remembers. *Fix:* The protagonist recognizes her mother's stationery by touch because she grew up handling it. The information comes from the character's own history, not from a planted NPC.

- **Narrator Editorializing Through Character.** Is a character delivering a thesis statement the audience should be assembling themselves? *Failure:* "The paper's the thing. The paper's the thread. Everything else could be faked or found out by someone patient enough." *Fix:* "The handwriting you could fake. The nickname, maybe somebody overheard it once. But where do you get the paper?" — same conclusion arrived at through natural thinking-out-loud rather than summarizing the case.

- **Rushing to Meaningful Exchange.** Does the scene breathe its way to the important moments, or does the author skip past the filler to get to the "good" dialogue? *Failure:* Two characters meet and immediately have a calibrated, efficient exchange where every line advances the plot. *Fix:* Let one character ramble nervously. Let the other be distracted by their own work. Let the meaningful moment arrive through the awkwardness, not despite it.

- **Character as Type (The Marlboro Man).** Is a character performing flawless competence with no seams showing? Genuinely confident characters are boring; falsely confident ones are interesting. *Failure:* A young worker who delivers perfectly timed laconic observations and sees through the older character immediately. *Fix:* The same character is a little uncertain, a little distracted, says "I mean, yeah." The reserve isn't cool, it's just how he is.

- **Tidiness Bias.** Does every subplot thematically mirror the main plot? Does every thread tie off? Real stories are messier. Not everything resolves. Not everything connects.

- **Planted Foreshadowing vs. Honest Setup.** When a reveal lands, was it genuinely set up, or is the author retrofitting earlier material to support it? The test: could a careful reader have seen it coming without the author having planted a neon sign?

**Character Checks:**

- **Asymmetric Attention.** Not everyone in a scene is equally focused on the conversation. People have their own thing going on.

- **Let Discomfort Sit.** When a moment is awkward, don't rescue it. The awkwardness is doing work.

- **Let Characters Be Inefficient.** Real people repeat themselves, trail off, say things that don't advance the plot.

- **Characters Should Be Performing Something.** Everyone has a version of themselves they put forward. The gap between performance and reality is where interesting stuff lives. But the performance should be imperfect — seams should show.

- **Distinct Voices.** The default LLM failure is making everyone sound like Articulate Thoughtful Person. Each character's vocabulary and sentence structure should reflect their actual life, not the author's.

**Structural Checks:**

- **Consistency.** If a character says she went through everything in the house, she can't conveniently forget a storage unit later without a real reason.

- **Information Economy.** Is the reader learning things at the right pace? Too fast = impatient author. Too slow = padding. Withholding must feel honest, not coy.

- **Earned Revelation.** Twists should feel both surprising and inevitable. The sweet spot between predictable and random.

- **Vibe Consistency.** Does this scene match the tonal contract established with the audience? (See Workstream 2.)

### Workstream 2: The Structural Framework (Factory Blueprint)

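Two parallel documents define any story: a systems diagram and a vibe profile. A third piece, the genre pattern library, constrains how both can be set.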

#### A. Systems Diagram (What the story DOES)

**For movies/single stories:**
- Beat sheet format — pure plot architecture, no dialogue, no analysis
- What happens, in what order, how long relative to the whole
- Who knows what and when
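The "who knows what and when" column of a beat sheet lends itself to a mechanical check. The sketch below is one hedged way to do it: `Beat` and `knowledge_violations` are hypothetical names, and the three sample beats are deliberately ordered wrong to trigger a violation. They are invented for the example, not taken from the written scenes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Beat:
    summary: str
    # character -> facts that character learns in this beat
    learns: Dict[str, Set[str]] = field(default_factory=dict)
    # character -> facts that character acts on in this beat
    uses: Dict[str, Set[str]] = field(default_factory=dict)


def knowledge_violations(beats: List[Beat]) -> List[str]:
    """Flag any beat where a character acts on a fact not yet learned."""
    known: Dict[str, Set[str]] = {}
    problems: List[str] = []
    for index, beat in enumerate(beats, start=1):
        # Facts learned in a beat count as available within that same beat.
        for who, facts in beat.learns.items():
            known.setdefault(who, set()).update(facts)
        for who, facts in beat.uses.items():
            for fact in sorted(facts - known.get(who, set())):
                problems.append(f"beat {index}: {who} uses '{fact}' before learning it")
    return problems


beats = [
    Beat("Nora finds the letter", learns={"Nora": {"letter"}}),
    Beat("Dana asks who had access to the stationery", uses={"Dana": {"letter"}}),
    Beat("Nora tells Dana about the letter", learns={"Dana": {"letter"}}),
]
problems = knowledge_violations(beats)
```

A ledger like this only catches ordering errors in what has been tagged, so its usefulness depends on how carefully each beat's `learns` and `uses` are filled in.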

**For series (TV shows, multi-book):**
- **The Engine.** The thing that generates story week after week. (*Friday Night Lights*, FNL hereafter: small Texas town where football organizes civic life.)
- **The Fixed Center.** Characters/relationships that don't fundamentally change. The load-bearing wall. (FNL: Coach and Tami Taylor's marriage and values.)
- **The Rotating Cast Layer.** Characters who move through the engine, get transformed, and exit. This is what makes a series a series, not a movie. (FNL: Saracen, Smash, Riggins, Vince — each gets ~1.5 seasons.)
- **The Season Shape.** Repeating structural scaffold for each season. (FNL: football season provides automatic pacing.)
- **The Series Arc.** The overall trajectory that gives the whole thing shape. Not just "more episodes" but a direction. (FNL: Coach Taylor accepting this isn't a stepping stone — this is where he belongs. He and Tami are gardeners of young people. That's why the ending is satisfying.)

#### B. Vibe Profile (What the story FEELS LIKE)

Seven axes that together define the emotional contract with the audience:

1. **Pacing Feel.** Unhurried vs. propulsive. How much does the story let moments breathe?
2. **Emotional Register.** Broad vs. subtle. Do characters make speeches or look out windows?
3. **World Texture.** How real does the setting feel? Documentary vs. stylized? Does the world exist on Tuesday afternoons when nothing's happening?
4. **Stakes Calibration.** What counts as a crisis? Real-life stakes vs. thriller stakes. What's the ceiling?
5. **Audience Relationship.** Does the show flatter, challenge, or invite? Does it explain itself or trust you to keep up?
6. **Character Orientation.** Do we love them (FNL), are we fascinated by them (Succession), or something complicated in between (Breaking Bad)? This drives whether stakes feel personal or intellectual.
7. **Audience Investment Vector.** What is the audience hoping for? "I hope these people are okay" vs. "I hope this family destroys itself entertainingly" vs. "I hope she finds the answer and it's beautiful." Managing this hope — threatening it, delaying it, delivering it — is the core mechanic of storytelling.

Additional axes to consider:
- **Humor.** Where does it live, what kind is it? Gentle/observational, vicious, absurdist?
- **Scope.** How big is the world? One town vs. a continent. Determines how many simultaneous stories you can run.

**Key principle:** Several vibe axes also function as eval rules. "Did this scene break the pacing feel?" "Did a character suddenly become likeable in a show where they should be fascinating but not warm?" These belong in both the vibe profile AND the eval suite.
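To illustrate how a vibe profile could double as a source of eval rules, the sketch below stores the seven axes as a record and derives one review question per axis. `VibeProfile` and `vibe_eval_questions` are hypothetical names, and the sample values are illustrative picks from the axis descriptions, not a vetted profile of any real show.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VibeProfile:
    pacing_feel: str
    emotional_register: str
    world_texture: str
    stakes_calibration: str
    audience_relationship: str
    character_orientation: str
    audience_investment_vector: str


def vibe_eval_questions(profile: VibeProfile) -> List[str]:
    """Turn each axis setting into a question a reviewing pass can ask."""
    questions = []
    for axis_field in fields(profile):
        axis = axis_field.name.replace("_", " ")
        value = getattr(profile, axis_field.name)
        questions.append(f"Does this scene keep the {axis} ({value})?")
    return questions


# Illustrative values only; a real profile would be set per story.
sample_profile = VibeProfile(
    pacing_feel="unhurried",
    emotional_register="subtle",
    world_texture="documentary",
    stakes_calibration="real-life",
    audience_relationship="trusts the viewer",
    character_orientation="love",
    audience_investment_vector="I hope these people are okay",
)
questions = vibe_eval_questions(sample_profile)
```

Deriving the questions from the record keeps the vibe profile and the eval suite from drifting apart: change an axis value and the matching check changes with it.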

#### C. Genre Pattern Library (Structural Requirements by Type)

Each genre has load-bearing requirements that constrain how the universal parameters can be set:

- **Mystery:** Needs a question that sustains interest, controlled information flow, earned reveal.
- **Coming of Age:** Needs a stuck character, a catalyst, a transformation.
- **Family Drama:** Needs a fixed relational center and pressure that tests it.
- **Thriller:** Needs escalating stakes and a ticking clock.
- (To be built out through empirical analysis of successful stories.)
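The genre requirements above can be held as a simple lookup and used to flag gaps in an outline. This is a minimal sketch: `GENRE_REQUIREMENTS` and `missing_requirements` are hypothetical names, the requirement labels are shortened from the list above, and the sample outline tags are invented for the example.

```python
from typing import Dict, List, Set

# Load-bearing requirements per genre, shortened from the list above.
GENRE_REQUIREMENTS: Dict[str, List[str]] = {
    "mystery": ["sustaining question", "controlled information flow", "earned reveal"],
    "coming of age": ["stuck character", "catalyst", "transformation"],
    "family drama": ["fixed relational center", "pressure that tests it"],
    "thriller": ["escalating stakes", "ticking clock"],
}


def missing_requirements(genre: str, present: Set[str]) -> List[str]:
    """Return the genre's requirements that the outline has not yet covered."""
    return [req for req in GENRE_REQUIREMENTS[genre.lower()] if req not in present]


# Hypothetical tagging of an in-progress mystery outline.
outline_tags = {"sustaining question", "controlled information flow"}
gaps = missing_requirements("Mystery", outline_tags)
```

An unknown genre raises a `KeyError` here, which seems like the right default while the library is still being built out.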

### Workstream 3: The Pattern Library (Empirical Base)

Reverse-engineer successful stories across genres to build the structural knowledge base.

**Method:**
- Generate structural breakdowns of ~20 stories across 4-5 genres
- Use training data for well-known stories (coverage is strong for major films 1970-2020, prestige TV 2000-2023, canonical novels)
- Supplement with web search where training data is thin
- Format: pure architecture — what happens, when, to whom, how long relative to whole — no dialogue, no analysis
- Spot-check against human knowledge and correct errors
- Extract patterns across the set: How long is act one? Where does first complication land? How many characters before inciting incident?
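The pattern-extraction step might look like the sketch below once breakdowns exist. Everything in it is an assumption for illustration: `summarize` is a hypothetical helper, the marker names are made up, and the numbers are placeholders rather than measurements of real stories. Positions are expressed as fractions of total length, matching the "how long relative to whole" format.

```python
from statistics import median
from typing import Dict, List

# Placeholder breakdowns: positions are fractions of total length (0.0 to 1.0).
# These numbers are invented for illustration, not measured from real stories.
breakdowns: List[Dict[str, float]] = [
    {"act_one_end": 0.25, "first_complication": 0.10},
    {"act_one_end": 0.22, "first_complication": 0.12},
    {"act_one_end": 0.30, "first_complication": 0.08},
]


def summarize(marker: str, stories: List[Dict[str, float]]) -> Dict[str, float]:
    """Min, median, and max position of one structural marker across stories."""
    values = sorted(story[marker] for story in stories if marker in story)
    if not values:
        raise ValueError(f"no story records the marker {marker!r}")
    return {
        "min": values[0],
        "median": median(values),
        "max": values[-1],
        "n": len(values),
    }


act_one = summarize("act_one_end", breakdowns)
```

Reporting the range alongside the median guards against reading a loose tendency as a hard rule, which matters with only about 20 stories in the set.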

**Known limitations of training data:**
- Coverage skewed toward critically analyzed work
- Knowledge is built from analysis OF stories, not the raw stories themselves. It reflects consensus interpretation and may miss subtle structural choices nobody wrote about
- Genre fiction (romance, thriller, cozy mystery) weaker than prestige work
- Recent material (last 1-2 years) unreliable without web search

## The Factory Model

The analogy that ties it together:

- **The user acts as the developer.** Picks the lot (genre, premise, broad constraints).
- **Claude acts as architect and general contractor (GC).** Designs the plan (bible, arc, characters). Builds it out scene by scene. Also runs its own inspections (eval suite) and revises before presenting.
- **The user acts as building inspector with override authority.** Catches what the eval suite misses. Every catch becomes a new test. Over time, less to catch.

Key advantages of the LLM in this role:
- **Iteration is cheap.** Generate-check-revise loops cost tokens, not salaries.
- **No diva problem.** The LLM executes within the framework without artistic ego. It doesn't negotiate the blueprint.
- **The eval suite grows.** Every failure mode discovered becomes a permanent check. The factory gets better with each run.
- **Progressive solidification.** Early work is fluid and collaborative. Rules compile down over time. First drafts improve as checks become internalized patterns.
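The idea that every catch becomes a permanent check can be sketched as a registry that only grows. `EvalSuite`, `add_check`, and `run` are hypothetical names, and the keyword test is a crude stand-in for what would realistically be an LLM-judged check. The sample strings are shortened from the Narrator Editorializing example earlier in this document.

```python
from typing import Callable, Dict, List


class EvalSuite:
    """A growing set of named checks; each returns True when a draft passes."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[str], bool]] = {}

    def add_check(self, name: str, passes: Callable[[str], bool]) -> None:
        # Called whenever the human inspector catches something new.
        self._checks[name] = passes

    def run(self, draft: str) -> List[str]:
        # Return the names of every check the draft fails.
        return [name for name, passes in self._checks.items() if not passes(draft)]


suite = EvalSuite()

# The inspector flags a thesis-statement line, so it becomes a check.
suite.add_check(
    "Narrator Editorializing Through Character",
    lambda draft: "The paper's the thread" not in draft,
)

failed = suite.run("The paper's the thing. The paper's the thread.")
passed = suite.run("The handwriting you could fake. But where do you get the paper?")
```

Because checks are keyed by name, re-adding a check under the same name replaces the old version, which allows a rule to be refined without duplicating it.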

## The Philosophy

"People want poetry to rhyme so fucking bad."

Patterns that resolve feel good. The factory model isn't cynical — it's honest about what audiences want. Formulaic doesn't mean bad. It means the load-bearing structure is sound. The craft is in the execution within proven structural forms.

Most bad fiction isn't bad because of poor dialogue or wooden characters (implementation bugs). It's bad because the data structure is wrong. The engine doesn't generate. The character orientation is confused. The stakes are miscalibrated. Get the structure right and even mediocre implementation produces something watchable. Get it wrong and brilliant implementation can't save it.

The showrunner — the person who sees the whole architecture — is the rare commodity. Good executors are abundant and gravitate toward good architects. The LLM doesn't replace the architect. It commoditizes everything below the architecture line, elevating the human to pure structure and taste decisions.

## Sample Story: The Letter (Nora)

A working example developed during the session. Can be used to test the framework.

**Premise:** A woman named Nora receives a handwritten letter from her mother Helen, who died of ovarian cancer six years ago. The letter is in Helen's handwriting, on Helen's personal stationery, and references events that occurred after her death: Nora's job change, her dog, overgrown hydrangeas at the old house. It also contains a childhood nickname ("Beeble") that only Nora and Helen knew, and references a private conversation, held a week before Helen's death, about a back step.

**Characters:**
- **Nora.** Graphic designer, freelance, small life. Her mother's daughter — precise, practical, can't leave things alone. Grieved functionally, never fully reckoned with the loss. Her first reaction to the letter is four seconds of pure receiving before her brain catches up. She puts it in a drawer for three days. Her emotional hook is hunger — for four seconds she felt something she thought was gone permanently.
- **Helen (dead).** Organized, precise, labeled boxes in the attic. Not cold but particular. Handled her own death with lists and plans. The version of Helen that Nora knew — strong, reliable, self-contained — is the cardboard cutout. The investigation reveals a more complicated person.
- **Dana.** Nora's friend, not best-friend-in-a-movie, more like every-couple-weeks dinner. Pragmatic, probably a nurse or paralegal. Asks practical questions, lets silences work. Functions as the audience's common-sense proxy. Will see things Nora can't because Nora's hunger creates a blind spot.
- **Mystery woman.** Someone who showed up during the estate cleanout and said "Helen asked me to help." Nora was too overwhelmed to question it. Surfaces in the kitchen conversation with Dana.

**Foreshadowed players (not yet fully defined):**
- Someone in the mother's past the daughter doesn't know about
- Someone in the daughter's present who knows more than they're showing
- A person at the end of the thread who is not what she expects
- (These may be 3 people, 2 people, or even 1.)

**Narration:** Third person close on Nora. Reliable narrator. No meta tricks, no frame story. Straight Chandler structure — follow the detective, trust the detective. Nora's one blind spot: she wants the letter to be real and may be slow to see things that would kill that hope. Dana will see those things first.

**Structure:** Chandler pattern in a modern/near-future setting. Simple thing (the letter) connects to something bigger. Every answer opens two more questions. The world turns out to be more complicated than the surface suggests.

**Key structural note:** The letter's impossibility sets a very high bar for the reveal. It can't be supernatural. It needs to be strange in a way that earns the setup. This is an identified risk — the ending has to clear a bar the opening set. Strategy is to swing for it rather than dial back the impossibility, but with freedom to backtrack if it stops working.

**Written scenes:**
- First draft of Nora finding the letter (not included — needs revision)
- Kitchen scene: Nora tells Dana about the letter. Dana asks practical questions. The mystery woman surfaces. (Revised version exists and is working well.)

## Open Questions

- Can the eval suite catch problems during generation or only after? (Working theory: doesn't matter. Tokens are cheap. Run generate-check-revise loops the way Claude Code does.)
- How formal can the genre pattern library get before it becomes restrictive rather than useful?
- Does storage/scratchpad (Cowork, Claude Code) meaningfully improve plot consistency vs. in-context improvisation? (Hypothesis: yes, significantly, especially for pre-planned foreshadowing.)
- What's the right format for capturing series structure vs. movie structure? (Working answer: movies get beat sheets, series get systems diagrams.)
- How much of the pattern library can be built from training data vs. requiring web search? (Working answer: major works strong, genre fiction weaker, recent material needs search.)