This repository defines a document operations pipeline — a collection of declarative operator files that orchestrate AI-driven content transformation workflows. Each operator is a lightweight markdown file with YAML frontmatter that describes what to do, where to find inputs, and where to put outputs. Together, they form a composable, extensible system for turning raw ideas into polished, multi-format publications.

The underlying engine is DocProcessor, a frontmatter-driven build system that resolves file dependencies, matches source files via regex transforms, and dispatches work to specialized AI task types. Think of it as a Makefile for AI content generation — but instead of compiling code, it compiles thought.

How It Works

The Core Concept

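Each .md file in the docs/ directory is an operator definition. An operator declares a transform pattern (which source files it reads and where its output goes), a task type, and a markdown body of instructions for the AI.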

When the pipeline runs, DocProcessor scans for source files matching the input patterns, pairs them with their output targets, checks whether outputs are stale, and dispatches AI tasks to generate or update the results.

The Transform Pattern

transforms: (.+)/content\.md -> $1/comic.md

This is the heart of each operator. It says: “For every content.md file found anywhere in the directory tree, produce a sibling comic.md in the same directory.” The regex capture groups ($1, $2, etc.) allow flexible path rewriting, enabling operators to work across any number of content directories without hardcoding paths.
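The path rewriting described above can be sketched in a few lines of Python. This is an illustration only, not DocProcessor's implementation: it assumes the pattern must match the whole path, that `->` does not appear inside the pattern itself, and that `$N` refers to the Nth capture group. The `rewrite` helper is a hypothetical name.

```python
import re
from typing import Optional

TRANSFORM = r"(.+)/content\.md -> $1/comic.md"

def rewrite(spec: str, path: str) -> Optional[str]:
    """Return the output path for `path`, or None if the pattern does not match."""
    # Split "PATTERN -> TEMPLATE" on the first arrow (assumed separator).
    pattern, template = (part.strip() for part in spec.split("->", 1))
    match = re.fullmatch(pattern, path)
    if match is None:
        return None
    # Substitute each $N reference with the text of capture group N.
    return re.sub(r"\$(\d+)", lambda ref: match.group(int(ref.group(1))), template)

for candidate in ["content/topic-name/content.md", "content/topic-name/notes.md"]:
    print(candidate, "->", rewrite(TRANSFORM, candidate))
```

Under these assumptions, content/topic-name/content.md maps to content/topic-name/comic.md, while notes.md does not match and is skipped.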

Task Types

Task types determine the AI’s mode of engagement with the content. They aren’t just prompt templates; each represents a fundamentally different cognitive strategy:

Task Type	Purpose
`IterativeFileModification`	Careful, incremental writing and editing
`ComicBookGeneration`	Visual storytelling adaptation
`NarrativeGeneration`	Dramatic prose and fiction
`SocraticDialogue`	Exploratory question-and-answer format
`DialecticalReasoning`	Thesis-antithesis-synthesis analysis
`GameTheory`	Strategic and decision-theoretic framing
`MultiPerspectiveAnalysis`	Multiple viewpoint examination
`PersuasiveEssay`	Argumentative and rhetorical writing
`ProbabilisticReasoning`	Uncertainty-aware analysis
`FiniteStateMachine`	State-based system modeling
`SoftwareDesignDocument`	Technical architecture documentation
`TutorialGeneration`	Educational step-by-step content
`GenerateImage`	Visual asset creation
`ImageVariation`	Image-to-image transformation
`IllustrateDocument`	Inline illustration of existing documents
`CounterfactualAnalysisTask`	Alternative-outcome scenario analysis
`GeneticOptimizationTask`	Evolutionary optimization framing
`MathematicalReasoningTask`	Formal mathematical analysis
`BrainstormingTask`	Divergent idea generation
`Interactive`	Branching, reader-driven experiences
`Scriptwriting`	Screenplay and stage script adaptation
`TechnicalExplanation`	Precise, in-depth technical breakdowns

The Content Lifecycle

The operators define a multi-stage pipeline that mirrors a real editorial workflow. Content flows through distinct phases, each handled by different operators:

Phase 1: Capture & Distill

Raw material enters the system as unstructured notes — voice transcripts, brainstorm dumps, meeting recordings, or freeform writing — landing in scratch/ directories as notes.* files.

summarize_op picks up these raw notes and produces a thematic summary. This isn’t a chronological recap; it’s a conceptual distillation that extracts key ideas, identifies patterns, and organizes insights into a structured outline. If a content.md already exists, the summarizer compares it against the canonical notes to surface any missing insights.

Phase 2: Plan

instruct_op takes the summary and produces a plan for the piece: a structural blueprint that identifies the target audience, core message, and value proposition. It deliberately stops short of writing the actual piece, creating a reviewable checkpoint where a human can steer direction before committing to a full draft.

Phase 3: Draft

draft_article_op consumes the instructions (or summary, or raw notes) and produces a complete content.md — a polished, coherent article or essay. This is the canonical content artifact that all downstream operators build from.

Phase 4: Analyze & Enrich

Once a content.md exists, a constellation of analytical operators can process it in parallel, each producing a different lens on the same material:

These aren’t just reformattings — they’re genuinely different modes of thinking applied to the same source material. The dialectical analysis might surface contradictions the original author missed. The game theory lens might reveal hidden strategic dynamics. The Socratic dialogue might expose unstated assumptions.

Phase 5: Adapt & Visualize

Phase 6: Iterate & Converge

update_article_op closes the loop. It takes the original content.md along with any of the analytical or creative derivatives — the dialectical analysis, the game theory breakdown, the narrative adaptation, even raw notes — and folds their insights back into the canonical content. This creates a feedback cycle where each analytical lens can improve the source material.

comic_seq_op and narrative_seq_op demonstrate another iteration pattern: they consume outputs from previous generation runs (JSON for comics, Markdown for narratives) to produce sequels, enabling serialized storytelling.

Phase 7: Publish

frontmatter_op generates rich YAML frontmatter for the final content, producing the metadata needed for a dynamic site architecture — SEO tags, content classification, reading difficulty, navigation hints, schema.org structured data, and more. import-posts.js assembles the final Jekyll posts from the content directory. During assembly, it automatically enriches frontmatter with metadata about which content variants were integrated:

This ensures that published posts carry accurate metadata about their content composition without requiring manual bookkeeping.

Architecture Principles

Declarative & Composable

Each operator is a standalone declaration. Operators don’t know about each other — they only know their input pattern and output target. The pipeline emerges from the overlap of these patterns: one operator’s output matches another operator’s input. This makes the system trivially extensible; adding a new analytical lens is just adding a new .md file.
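A small sketch may help show how a pipeline can emerge from overlapping patterns alone. The three transforms below are hypothetical: they are guessed from the file names in the scratch/ directory and simplified to one input each, so the real operator files may differ. The `apply` and `chain` helpers are likewise illustrative names.

```python
import re

# Hypothetical transforms, for illustration only.
OPERATORS = {
    "summarize_op": r"(.+)/notes\.md -> $1/summary.md",
    "instruct_op": r"(.+)/summary\.md -> $1/instruct.md",
    "draft_article_op": r"(.+)/instruct\.md -> $1/content.md",
}

def apply(spec, path):
    """Rewrite `path` with one transform, or return None if it does not match."""
    pattern, template = (part.strip() for part in spec.split("->", 1))
    match = re.fullmatch(pattern, path)
    if match is None:
        return None
    return re.sub(r"\$(\d+)", lambda ref: match.group(int(ref.group(1))), template)

def chain(start):
    """Follow outputs from `start` until no operator matches the newest file."""
    steps, current, seen = [], start, {start}
    while True:
        for name, spec in OPERATORS.items():
            output = apply(spec, current)
            if output is not None and output not in seen:
                steps.append((name, output))
                seen.add(output)
                current = output
                break
        else:
            # No operator consumed the newest file, so the chain ends here.
            return steps

for name, output in chain("scratch/topic-name/notes.md"):
    print(f"{name} -> {output}")
```

No operator in this sketch refers to another by name. The ordering falls out of one transform's output path matching the next transform's input pattern.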

Convention Over Configuration

The directory structure is the configuration. Content lives in topic directories. Each directory accumulates artifacts as operators process its content.md. There’s no central manifest or build file — the regex patterns and file system do the routing.

Human-in-the-Loop

The pipeline is designed for human oversight at key junctures. The instruct_op creates an explicit planning checkpoint. The update_article_op requires a human to decide which analytical outputs to feed back. The DocProcessor’s overwrite modes (SkipExisting, PatchExisting, etc.) give fine-grained control over what gets regenerated versus preserved.

Idempotent & Incremental

Operators can be re-run safely. The PatchToUpdate default mode means outputs are only regenerated when inputs are newer than outputs, and updates are applied as patches rather than full replacements — preserving any manual edits while incorporating new insights.
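The staleness rule can be sketched as a timestamp comparison. This assumes "newer" means a later file modification time and that a missing output always counts as stale; DocProcessor may use a different signal. `is_stale` is a hypothetical helper.

```python
import os
import tempfile

def is_stale(source: str, target: str) -> bool:
    """True when the target is missing or older than its source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(source) > os.path.getmtime(target)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "content.md")
    dst = os.path.join(tmp, "comic.md")

    open(src, "w").close()
    missing = is_stale(src, dst)      # output does not exist yet

    open(dst, "w").close()
    os.utime(src, (1_000, 1_000))     # source written first
    os.utime(dst, (2_000, 2_000))     # output generated later
    fresh = is_stale(src, dst)

    os.utime(src, (3_000, 3_000))     # source edited after generation
    edited = is_stale(src, dst)

print(missing, fresh, edited)
```

Re-running with unchanged inputs finds nothing stale, which is what makes repeated runs safe under this reading.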

Multi-Modal

The pipeline doesn’t just produce text. Image generation, comic book creation, and illustration operators treat visual assets as first-class outputs, enabling rich multimedia publications from a single source of truth.

Directory Structure

scratch/                          # Raw input material
  topic-name/
    notes.md                      # Raw notes, transcripts, brainstorms
    summary.md                    # ← summarize_op
    instruct.md                   # ← instruct_op
    content.md                    # ← draft_article_op (Drafts live here until promoted)

content/                          # Published content (Manually promoted from scratch)
  topic-name/
    content.md                    # Canonical article
    frontmatter.yaml              # ← frontmatter_op
    dialectical.md                # ← dialectical_op
    gametheory.md                 # ← gametheory_op
    perspectives.md               # ← perspectives_op
    persuasive.md                 # ← persuasive_op
    probablistic.md               # ← probablistic_op
    socratic.md                   # ← socratic_op
    statemachine.md               # ← statemachine_op
    design.md                     # ← softwaredesign_op
    tutorial.md                   # ← tutorial_op
    comic.md                      # ← comic_op
    narrative.md                  # ← narrative_op
    counterfactual.md             # ← counterfactual_op
    genetic.md                    # ← genetic_op
    mathematical.md               # ← mathematical_op
    brainstorm.md                 # ← brainstorm_op
    main.png                      # ← icon_op
    main.html                     # ← icon_variant_op

docs/                             # Operator definitions (this directory)
  summarize_op.md
  instruct_op.md
  draft_article_op.md
  update_article_op.md
  frontmatter_op.md
  dialectical_op.md
  ...

Creating New Operators

Adding a new operator is straightforward. Create a markdown file with frontmatter:

---
transforms: (.+)/content\.md -> $1/my_analysis.md
task_type: MyTaskType
---

* Specific instructions for the AI
* What to focus on
* What format to produce

The transform pattern determines routing. The task type determines the AI’s cognitive strategy. The markdown body provides domain-specific guidance. That’s it — the pipeline will automatically pick up any matching content files on the next run.
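To make the three parts of an operator file concrete, here is a minimal Python sketch that splits one into its frontmatter fields and its instruction body. It handles only flat `key: value` lines and is not a YAML parser; a real loader would presumably use a proper YAML library. `parse_operator` is a hypothetical name.

```python
OPERATOR = r"""---
transforms: (.+)/content\.md -> $1/my_analysis.md
task_type: MyTaskType
---

* Specific instructions for the AI
* What to focus on
* What format to produce
"""

def parse_operator(text):
    """Split an operator file into (frontmatter dict, markdown body)."""
    lines = [line.rstrip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        raise ValueError("operator file must start with a frontmatter block")
    # Find the closing delimiter; list.index raises ValueError if it is missing.
    end = lines.index("---", 1)
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body

meta, body = parse_operator(OPERATOR)
print(meta["task_type"])
print(meta["transforms"])
print(body)
```

The `transforms` value drives routing, `task_type` selects the strategy, and the body is passed along as guidance, mirroring the three roles described above.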

Design Philosophy

This system embodies a particular philosophy about AI-assisted content creation:

DocOps: AI-Powered Content Pipeline

Overview