The methods don't reduce human work; they concentrate it. Four pages of primitives have produced four categories of work humans must still do — and the work, examined closely, turns out to be constitutive of what it means to inhabit a discipline at all. The library is infrastructure for that work, not a substitute for it.
This page covers one component of the Library. For the case for the Library itself — why SCMs, not ML or LLMs, are the framework that justifies a library — see Why Structured Causal Models?
The Caretaker Problem
Each of the four methods pages ended at a category of work the library could not complete. Read the closing limits of each in sequence:
Two models can specify conflicting graphs over shared variables. The primitives detect the disagreement. They cannot adjudicate it.
Scope shrinks as the world changes. The library cannot detect misdeclaration without external evidence.
If the user fundamentally does not know what they are asking, translation can surface the gap but cannot supply the missing expertise.
The library can record everything. It cannot interpret what it recorded.
Read together, these are not four unrelated limitations. They are the same kind of work — the work of someone who is of a field — observed from four different methods pages. The library has spent four pages preparing material for a role it cannot fill. This page names the role.
Why "caretaker"
The word carries weight, and the weight is worth defending before it is used. Several alternatives are tempting and each falls short.
Custodian has the right institutional gravity but the wrong frame: a custodian protects what already exists, while the role here also grows the library by drawing in new knowledge. Practitioner is technically correct but too neutral: it names the position without naming the stake. Attestor is precise about the witness operation but reduces a four-operation role to one of its operations. Human-in-the-loop is the genre's default term, and it is exactly what this page is rejecting. Humans here are not in the loop; they are the spine, and the library is the loop attached to them.
Caretaker holds maintenance, growth, accountability, and standing in one frame, and carries the connotation of care for something larger than oneself. That last connotation matters. The work the methods arc has been pointing at is not employment-like; it is closer to the way an ecologist tends to a watershed, or a curator to a collection — the kind of attentive, accountable, ongoing work that a discipline does on itself, in its own name. The page commits to the word because the word does work the alternatives cannot.
With the word settled, the rest of this page names the operations that constitute the role. The methods arc was preparation; this is the page where the preparation lands.
Four Operations
The methods pages each named four primitives — operations the library performs. This page names four operations the caretaker performs, in parallel. The shift in word matters: primitives are mechanisms, operations are deliberate acts. Each one is the human counterpart of a category the library prepared material for and could not complete.
1. Adjudication
When the library surfaces contestation it cannot resolve — two domain experts disagreeing on whether a model applies, two Structural Causal Models (SCMs) with conflicting graphs over shared variables, a refusal one reviewer accepts and another challenges — a caretaker rules. The operation is not voting and not consensus-building. It is the entry of a sustained ruling, with reasoning attached, into the audit trail, where it serves as the basis for future adjudication and is itself revisable.
Adjudication is what disagreement becomes when it is recorded as an institutional act rather than a transient dispute. The Composition page named contestation between SCMs as a category the primitives could not adjudicate. Adjudication is the work that handles it — and that records its handling so the next caretaker has somewhere to start.
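The division of labor described above can be sketched in a few lines. This is a minimal illustration, not the library's API: it assumes each SCM's graph can be represented as a set of directed (cause, effect) edges, and the names `conflicting_edges`, `Ruling`, and `audit_trail`, along with the example variables, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def edges_over(graph: Set[Edge], variables: Set[str]) -> Set[Edge]:
    """Keep only the edges whose endpoints are both in `variables`."""
    return {(c, e) for (c, e) in graph if c in variables and e in variables}


def conflicting_edges(graph_a: Set[Edge], graph_b: Set[Edge]) -> Set[Edge]:
    """Edges over shared variables that appear in exactly one graph.

    Hypothetical stand-in for a detection primitive: it reports the
    disagreement and does nothing to resolve it.
    """
    vars_a = {v for edge in graph_a for v in edge}
    vars_b = {v for edge in graph_b for v in edge}
    shared = vars_a & vars_b
    return edges_over(graph_a, shared) ^ edges_over(graph_b, shared)


@dataclass(frozen=True)
class Ruling:
    """What a caretaker enters into the trail (assumed fields)."""
    caretaker: str
    contested: FrozenSet[Edge]
    decision: str
    reasoning: str
    supersedes: Optional[int] = None  # position of an earlier ruling this one revises


# Made-up graphs: the two models disagree on the direction of one edge.
model_a = {("price", "demand"), ("season", "demand")}
model_b = {("demand", "price"), ("season", "demand"), ("promo", "demand")}

contested = conflicting_edges(model_a, model_b)

audit_trail: List[Ruling] = []
audit_trail.append(Ruling(
    caretaker="(named caretaker)",
    contested=frozenset(contested),
    decision="use model_a for pricing questions",
    reasoning="placeholder for the caretaker's written reasoning",
))
# A later caretaker revises by appending a new ruling, never by overwriting.
audit_trail.append(Ruling(
    caretaker="(later caretaker)",
    contested=frozenset(contested),
    decision="use model_b after new evidence",
    reasoning="placeholder for the revised reasoning",
    supersedes=0,
))
```

The detection step is mechanical. Everything in `decision` and `reasoning` has to come from a person, and the `supersedes` link is one possible way to keep a ruling revisable without erasing its history.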
2. Elicitation
When the library cannot grow on its own — when a model needs scope conditions a modeler did not yet know to declare, when a domain expert holds knowledge the library does not yet hold, when a question is asked that no model can answer faithfully — a caretaker elicits the missing material. Polanyi's formulation is still the cleanest: we can know more than we can tell (Polanyi, 1966). The work of elicitation is the work of helping someone tell what they know.
This is the operation where the LLM mediator most reliably fails. Drawing tacit knowledge out of an expert and converting it into structured, machine-readable form is not paraphrase. It requires asking the questions an expert has not yet been asked, recognizing when an answer has not actually addressed the question, and knowing when the conversation has reached the boundary of what the expert can articulate. Translation §5 named domain naïveté as outside what mechanism can teach. Elicitation is what teaches — not the user, but the library, by drawing into it knowledge that previously lived only in a person.
3. Stewardship
Stewardship is the work of maintaining models and the library across time: detecting when a model's regime has been broken by external events, sponsoring revisions and additions, deprecating models whose training conditions no longer hold, and keeping provenance chains intact across generations of modelers.
The library cannot tell when a model's training regime has been silently broken — only that the model is still nominally in-scope. A steward can, because the steward is paying attention to the world the model is supposed to apply to. Scope §5 named decay as outside what scope checks can detect; Audit §5 named provenance gaps and the reviewer-in-the-system as compounding institutional issues. Stewardship is the work that holds them together over time. Without it, the library accumulates the silent rot of models still nominally available but actually no longer applying.
4. Witness
Witness is the operation of standing behind a refusal, a revision, an adjudication, or an interpretation in a way the library cannot. Provenance records that someone signed off; witness is what the signature means.
Standing — named reputation, accountable position, willingness to be challenged — is the property the library cannot supply on its own. A signature without standing is a clerical entry. A signature with standing is an act of professional accountability that institutions can rely on because someone has staked something they care about on it being correct. Audit §5 named provenance gaps and reviewer drift as compounding problems that audit cannot resolve from inside; witness is the operation that prevents recorded sign-offs from becoming clerical residue.
Witness is also the operation that makes the audit trail matter outside the library. Regulators, modelers, and future caretakers can build on the trail because someone known and accountable has staked their reputation on what the trail contains. The library can record. Only a person can witness.
| Operation | What the caretaker does | Which methods page named the gap |
|---|---|---|
| Adjudication | Rules on contestation; records the ruling and its reasoning | Composition (conflicting graphs, identification under disagreement) |
| Elicitation | Draws out tacit knowledge; converts it to library-readable form | Translation (domain naïveté) |
| Stewardship | Maintains models across time; detects decay; sponsors revision | Scope (decay), Audit (provenance gaps) |
| Witness | Stakes reputation on a recorded sign-off; supplies standing | Audit (reviewer-in-the-system) |
A Worked Example
The methods pages walked the wage-elasticity case through four lenses. By page five it has carried what it can carry. The case here is different: an attribution model whose measurement scope was implicitly conditional on a tracking regime that has been broken by external events. The principal is a caretaker, and the timeline runs across months rather than minutes.
The setup. The library contains a marketing-attribution model fit on user-level identifiers across iOS and Android, vintage 2018–2020, with a measurement scope that assumed reliable cross-app attribution at the device level. In April 2021, Apple ships iOS 14.5 with App Tracking Transparency, and the underlying tracking regime breaks. The model still runs cleanly. The training data is still valid. Scope checks still pass against unchanged declared fields. The library has no way to know that anything has changed.
The caretaker detects the regime break
The caretaker, embedded in the field, sees what the library cannot: the policy change is announced, opt-in rates are landing in the low single digits, attribution data is becoming a different kind of measurement than the model was trained on. The model's declared measurement scope has not changed. Its effective measurement scope has collapsed. The caretaker flags the model for review and posts a deprecation candidate on the library's review queue.
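The gap between declared and effective scope can be made concrete with a small sketch. It assumes, for illustration only, that a scope check compares a query against a model's declared fields; `AttributionModel`, `scope_check`, and the field names are hypothetical and not taken from the library.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class AttributionModel:
    """Hypothetical record of a model's declared measurement scope."""
    name: str
    declared_platforms: FrozenSet[str]
    declared_identifier: str  # e.g. "device-level"
    review_flags: List[str] = field(default_factory=list)


def scope_check(model: AttributionModel, platform: str, identifier: str) -> bool:
    """Pass when the query matches the declared fields.

    The check sees only what was declared. Nothing here can observe
    whether the tracking regime behind those fields still holds.
    """
    return (
        platform in model.declared_platforms
        and identifier == model.declared_identifier
    )


model = AttributionModel(
    name="attribution_2018_2020",
    declared_platforms=frozenset({"ios", "android"}),
    declared_identifier="device-level",
)

# The declared fields did not change in April 2021, so the same query
# passes both before and after the regime break.
passes_after_break = scope_check(model, "ios", "device-level")

# The only signal that anything changed comes from outside the check:
# a caretaker adds a flag to the review queue.
model.review_flags.append("deprecation-candidate: tracking regime changed")
still_passes = scope_check(model, "ios", "device-level")
```

Note that the flag does not alter the outcome of the check. In this sketch the check keeps passing until someone revises the declared scope, which is the situation the caretaker's review is meant to resolve.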
The caretaker draws out the new regime
The caretaker works with attribution specialists at three firms to articulate what attribution-without-IDFA actually looks like. The new regime has different missingness patterns (probabilistic, not random), different effective sample sizes (collapsed for iOS), and new correlations between previously independent variables (privacy-preserving aggregation introduces structural dependencies). Elicitation here is not a quick interview. It is the work of getting the new measurement scope declared in machine-readable form, with worked examples, scope conditions, and provenance from each contributing expert.
The caretaker rules on the disposition
The model's prior users disagree about disposition. Some argue for revision in place — the model still works for the Android-only segment. Some argue for full deprecation — the library should not host a model whose silent-failure mode is this severe. Some argue for a new sibling model with the post-IDFA regime declared from scratch. Each position has merit. The caretaker rules: deprecate the existing model, host the new sibling, and add a decay flag to any future attribution model that depends on platform-level identifiers. The ruling is recorded with reasoning. Future caretakers can revisit it.
The caretaker signs the change into the library
The caretaker signs the deprecation, the new model, and the decay flag, with their name and standing. The signature is what makes the change institutionally durable. Future regulators, modelers, and downstream library users can build on the change because someone known and accountable has staked their reputation on it. The audit trail now contains the regime break, the elicitation transcripts and contributors, the adjudication and its reasoning, and the signed disposition. The library has changed. The discipline has changed with it. The change is signed.
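One way to picture the resulting trail is as an append-only list whose final entry names the signer and the record it covers. This is a sketch under stated assumptions: the `Entry` fields, the `sign_off` helper, and the use of a SHA-256 digest to bind the sign-off to the earlier entries are illustrative choices, not something the page specifies.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    """Hypothetical audit-trail entry."""
    kind: str     # "regime_break", "elicitation", "adjudication", "sign_off"
    author: str
    summary: str


def digest(entries: List[Entry]) -> str:
    """SHA-256 over a canonical JSON dump of the entries."""
    payload = json.dumps([asdict(e) for e in entries], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def sign_off(trail: List[Entry], signer: str) -> List[Entry]:
    """Return a new trail whose last entry names the signer and what it covers."""
    covered = digest(trail)
    return trail + [Entry("sign_off", signer, f"covers sha256:{covered}")]


trail = [
    Entry("regime_break", "(named caretaker)", "tracking regime changed; model flagged"),
    Entry("elicitation", "(contributing experts)", "new measurement scope declared"),
    Entry("adjudication", "(named caretaker)", "deprecate model; host sibling; add decay flag"),
]
signed_trail = sign_off(trail, "(named caretaker)")
```

The digest records which entries the sign-off covers. It says nothing about standing: whether the name attached to the last entry means anything is decided by the discipline, not by the data structure.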
What just happened: a caretaker performed four operations across a six-month timeline in response to one regime change. Multiply by every model in the library, every domain expert who needs eliciting, every contestation that needs adjudicating. The work is not light. It is the work disciplines have always done — now made explicit and tracked rather than left implicit and invisible.
Why a Caretaker, Specifically
The Composition, Scope, and Translation pages each made a version of the same argument: only the library can do this work, because the user and the mediator cannot. The Audit page made a different argument: the library cannot do this work alone, only prepare material for it. This page makes the strongest version of the Audit argument and commits to it. The four operations of caretaking are not separable from inhabiting a discipline. They are constitutive of what it means to inhabit one.
Disciplines are practitioner communities
A discipline is not a body of knowledge. It is a community of practitioners who together adjudicate disputes, elicit knowledge from each other and from the world, steward what has been built, and stand behind it. MacIntyre's account is the closest framework for what this means: a discipline is a practice in his sense — a coherent, complex, socially established cooperative activity with internal standards of excellence that can only be realized by participating in it (MacIntyre, 1981). The standards are constituted by participation; the participation is constituted by accountability to the standards. Neither exists without the other.
Bodies of knowledge can be copied. Disciplines have to be inhabited. That difference is the subject of this page.
Each operation requires being inside
Adjudication requires standing to rule — the recognition by a community that this person's ruling counts, and the corresponding willingness of that person to be challenged within the community when others disagree. Standing is conferred, not assumed; it accumulates through participation, signed work, and acceptance by peers.
Elicitation requires being known to the experts whose knowledge is being drawn out. Tacit knowledge does not transfer through interview protocols alone. It transfers through trust, shared vocabulary, mutual recognition of what a "real" question is in the domain, and the kind of slow conversation that requires the elicitor to have credibility the expert respects. An elicitor without standing in the field gets surface answers; a peer gets the harder things.
Stewardship requires being trusted by colleagues to make calls about what stays, what changes, and what is removed. The trust is not a credential. It is the cumulative product of past calls, made publicly and held to over time, that the field has accepted or contested through its own mechanisms. A steward without that history is making the same decisions, but the decisions do not propagate into institutional acceptance the same way.
Witness requires having something to stake. Standing is the property that makes a signature an act rather than a clerical entry, and standing is the product of inhabiting the discipline long enough to have something a signature can risk.
The implication for AI
The fashionable view is that AI agents can be trained to perform these operations — given enough capability, enough domain training, enough accountability scaffolding. The argument here is that the operations are not skills that can be acquired in isolation. They are positions that can be occupied by entities a discipline recognizes as occupying them. An agent that performs all four operations well, but is not embedded in the discipline whose work it is doing — not held to its peer accountability, not credentialed by its mechanisms of standing, not available to be challenged by its colleagues across decades — is not a caretaker. It is producing outputs that look like caretaking. The library cannot tell the difference. The discipline can.
This is not a claim about AI capabilities. It is a claim about what disciplines are, and what makes their operations meaningful. The Rung-3 cognitive substrate — counterfactual reasoning, imagining alternative worlds, holding multiple causal structures at once — is what makes the operations cognitively possible. But the operations are also social. They require a position in a community, and the community is the entity that recognizes whether the position is rightly held. That recognition is not delegable.
Caretakers are not infrastructure for the library. The library is infrastructure for caretakers: for the people who do the work disciplines have always done, with better materials than disciplines have ever had. The methods arc has been building those materials. This page names what the building is for.
What Caretakers Can't Do
The methods pages' limits sections each pointed forward to the next page's primitives. This page is the last in the arc, and its limits point outward — at institutional questions the framework opens but cannot close.
Selection and authority
Who chooses caretakers? The framework requires people with standing in their disciplines, but cannot specify how that standing is conferred or by whom. Different disciplines have different answers — peer election, credentialing, demonstrated track record, institutional appointment, less formal recognition that accumulates without being named. The library's reach across disciplines means the question of caretaker selection is plural, contested, and outside the framework's scope. This is where the framework hands off to the institutions whose work it is helping.
Disagreement among caretakers
When two caretakers in the same discipline rule differently, the framework records both rulings but does not adjudicate between them. Meta-adjudication — by senior caretakers, by professional bodies, by appellate structures — is institutional infrastructure that disciplines have or have not built, and that the library inherits but does not provide. A field with mature meta-adjudication mechanisms gets a working library; a field without them gets a library that records its own confusion accurately.
Caretaker failure
A caretaker who fails, whether through misdeclaration, capture, drift, or ordinary error, leaves their failures in the audit trail along with their successes. The framework provides material for retrospective accountability; the operation of holding caretakers accountable is itself a discipline-internal function the library cannot perform. Audit produces the record. Discipline produces the consequence. A field that does not hold its caretakers accountable will accumulate signed errors; the library will record them faithfully and expose them to anyone who looks, but it cannot itself be the mechanism of correction.
The framework's own claim
The argument that caretaker operations are constitutive of disciplinary practice is itself a claim the framework cannot prove. It is a position this page commits to, defended in §4 by structural argument. Empirical disconfirmation is possible — sufficiently capable AI agents performing all four operations across many disciplines, with disciplines accepting their outputs as genuine caretaking, would put pressure on the claim. The page does not deny this possibility. It commits to the claim while it stands, and acknowledges that the question is not closed by argument alone. The next century will close it, in one direction or another.
The four limits above are not failures of the framework. They are the boundary at which the framework stops and the disciplines themselves take over. The framework's job has been to clarify what is mechanizable and what is not, and to prepare material for those who handle the rest. From here, the work is theirs.
The framework needs work from inside disciplines, not from outside them. If you can offer that, I want to hear from you.
If you're a caretaker — by the definition this page argues for — in finance, marketing, or pricing, and you have an hour for a conversation about how the four operations land against your actual practice: I'd like that hour. The framework is at the stage where it needs contact with people who already do this work, even if they don't yet call it caretaking.
info@rung3.ai
Polanyi, M. (1966). The Tacit Dimension. Doubleday/Anchor. Reissued 2009, University of Chicago Press, with foreword by Amartya Sen.
MacIntyre, A. (1981). After Virtue: A Study in Moral Theory. University of Notre Dame Press. Chapter 14, "The Nature of the Virtues," develops the account of practice and internal goods drawn on here.