A library of causal models is not a stack of model files. It is the operational form of a discipline — the practice of building, composing, bounding, translating, and auditing the models so that the library remains defensible as it grows. The pages in this cluster are the discipline written down.

Read in order, they take the reader from why this format at all through what belongs in the library, how the pieces compose, where each one stops, how a question reaches the library faithfully, how the library remembers what it knows — and end on the work the library cannot do for itself. The last page matters as much as the first.

For the long-form treatment of the architecture in a single essay rather than across the cluster: The Library as Architecture →

Foundation — what the library is and why

The Library as Architecture → The long-form essay. What the library is, what it contains, how content enters, and why composition is the durability property.

A causal model answers one question. A library of causal models answers questions an organization will ask for the next decade — including ones it hasn’t thought of yet. The transition from model to library is not a matter of scale; it is a matter of architecture. Start here for the foundational treatment.

Why SCMs? → Structural causal models (SCMs) encode what your experts know, in a form that survives the expert leaving.

SCMs are not a replacement for domain experts. They are a durable record of expert knowledge — a different and stronger claim than “AI replaces experts,” and the one that matters in the domains where machine learning and large language models do not reach.

Why SCMs Improve LLM Accuracy → Pure LLM scaling improves statistical interpolation. SCMs improve mechanistic reasoning. These are not the same capability.

The technical accuracy argument for adding SCMs to an LLM system, separate from architectural elegance and separate from AGI speculation. The RAG ceiling, transferable causal topology, sparse-data robustness, and what each component contributes in a hybrid system.

Why Not Use an LLM? → The negative case. Why a language model alone cannot answer causal questions.

A language model is the right interface to a causal answer. It is not the engine that produces one. The page names the failure modes specifically — where the LLM’s answer sounds correct and where it fails — and shows the architecture that closes the gap.

Why Current Tools Fail → Where ML, dashboards, and conventional analytics stop, and why no algorithm closes the remaining gap.

The tools most organizations rely on were built for Rung 1 of the causal ladder, association. The questions boards actually ask require Rung 3, counterfactuals. The gap is not one of degree; it is one of kind. This page traces it.

Tradition & Frontier — where the work sits in the field

Formalizing Expertise → What the literature calls the work, and the lineage your engagement joins.

Converting tacit expert knowledge into explicit computational structure is the central problem at the intersection of AI, cognitive science, and epistemology. Five families of method (rule-based, causal, knowledge graphs, probabilistic, neurosymbolic) form a tradition; the work you are reading sits in the second and fifth of these families. With citations.

LLM-Mediated Libraries → Where the field is heading, and the part of it your engagement builds today.

Language models acting as interfaces to reusable libraries of structural causal models: a five-layer architecture now active in the literature. A vision page, with citation-anchored research and an honest split between what is shippable now and what is still research rather than delivery.

Four Paradigms, One Bet → The positioning page. What this work bets on, and where it stands relative to other methodological commitments in the field.

Four paradigms describe what AI systems bet on when they make causal claims. The methods arc commits to the fourth: human-curated content, with the LLM mediating access rather than generating substance. Different bet, different failure modes.

Operations — how the library works in practice

Design → What belongs in a causal model library, and at what grain.

The question that decides whether such a library has any real-world utility is not how to retrieve from it but what to put in it. The page lays out the categories of artifact and the criteria for deciding what is admissible at each grain.

Composition → Two models that are each internally valid can produce silent nonsense the moment they touch.

Composition is where most academic SCM libraries underinvest, and where an LLM mediator does the most damage. This page sets out the primitives of safe composition: what has to be true of two models for their join to mean what the user thinks it means.
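As a rough illustration of the kind of check this implies, the sketch below compares the variables two models share and reports a problem when their units or state spaces disagree. The `Variable` and `check_join` names, and the two toy models, are hypothetical and are not taken from the library itself.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    """Hypothetical description of one node in a causal model."""
    name: str
    unit: str
    states: tuple[str, ...]


def check_join(left: dict[str, Variable], right: dict[str, Variable]) -> list[str]:
    """Return the reasons two models cannot be joined safely (empty if none)."""
    problems = []
    # Only variables that appear in both models can make the join ambiguous.
    for name in sorted(left.keys() & right.keys()):
        a, b = left[name], right[name]
        if a.unit != b.unit:
            problems.append(f"{name}: unit mismatch ({a.unit} vs {b.unit})")
        if a.states != b.states:
            problems.append(f"{name}: state spaces differ")
    return problems


# Illustrative models only: both have a node called "demand",
# but one measures it per week and the other per month.
pricing = {"demand": Variable("demand", "units/week", ("low", "high"))}
supply = {"demand": Variable("demand", "units/month", ("low", "high"))}

problems = check_join(pricing, supply)
print(problems)
```

Each model here is fine on its own; the problem only appears at the shared node, which is why the check runs on the intersection of variable names rather than on either model separately.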

Scope → Internal validation tells you a model is right within its frame. It cannot tell you whether the question is inside the frame.

Scope declarations are the artifacts that close the gap between “the model is valid” and “this question is one the model can answer.” Without them, a defensible model still produces indefensible answers to questions it was never intended to take.
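A minimal sketch of what a scope declaration could look like in code, assuming scope is expressed as the populations and interventions a model was built for. The `ScopeDeclaration` fields, the `in_scope` helper, and the example model name are assumptions made for illustration, not the library's actual format.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeDeclaration:
    """Hypothetical record of the frame a model was built to answer within."""
    model: str
    populations: frozenset[str]
    interventions: frozenset[str]


def in_scope(scope: ScopeDeclaration, population: str, intervention: str) -> bool:
    """True only if both the population and the intervention were declared."""
    return population in scope.populations and intervention in scope.interventions


# Illustrative declaration: a model built for retail customers and price changes.
scope = ScopeDeclaration(
    model="churn_model",
    populations=frozenset({"retail"}),
    interventions=frozenset({"price_change"}),
)

retail_ok = in_scope(scope, "retail", "price_change")
enterprise_ok = in_scope(scope, "enterprise", "price_change")
print(retail_ok, enterprise_ok)
```

The model's internal validity never enters the check. The second query is refused because the question falls outside the declared frame, not because anything is wrong with the model.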

Translation → Where the question becomes the query.

Composition checks how models join. Scope checks where they apply. Both assume the user’s plain-language question reaches the library faithfully — that what gets checked matches what was asked. That assumption is the system’s largest unguarded surface, and it lives at the translation interface.

Provenance → How the library remembers what it knows.

The three prior pages produce checkable artifacts at query time. Audit is what makes those artifacts checkable later — by a reviewer, a regulator, a domain expert, or the modeler returning to their own work six months on. A library without audit is not a library.

Capabilities → What an SCM library can answer that other systems cannot.

A capability-by-capability accounting of what the library does — intervention reasoning, counterfactual queries, mediation analysis, identification under confounding — and what the alternatives leave on the table.

Stack & Deployment → How the library lives in your environment.

Bayes Server files, version control, API surfaces, LLM interfaces, query logging. The library is a real software artifact your team operates and extends. This page names the components and how they fit your stack.

Caretakers — the work the library cannot do

Caretakers → The work the library cannot do.

Across the prior methods pages, the library has accumulated four categories of work it cannot complete on its own — contestation, decay, naïveté, and the reviewer-in-the-system. The work is not residual. It is constitutive of inhabiting a discipline.

Knowledge Survival → What happens when the expert who built the model leaves.

The library’s deepest claim: it preserves expert reasoning in a form that survives turnover, retirement, and corporate reorganisation. The pages here document how that survival actually works — and what fails when caretakers leave without succession.

After these seven pages, the natural next moves are into the methods clusters: Building for how a model is elicited and assembled in the first place, Querying for how decisions are extracted from a built model, and Causal Modeling for the foundational ideas that this library rests on.