BASIS spec v1.0.0
ATSF Mathematics
The formulas, constants, and decision rules that govern trust scoring in the BASIS standard.
Every value here is exported from packages/basis/src/canonical.ts —
the single source of truth for conforming implementations.
Trust score space
Scores live on a bounded interval. Every agent starts at zero and must pass a qualification course before reaching the first operational tier. There is no trust inheritance — a derived agent does not inherit its parent's score.
- Range: [0, 1000]
- Initial score: 0
- Qualification pass: 200
Tiers
Scores partition into eight tiers. Each tier defines the capability envelope an agent may operate within. Promotion is immediate at boundary crossings for T0–T4 and time-gated for T5–T7 (7, 10, 14 days respectively). Demotion uses per-tier hysteresis buffers to prevent oscillation.
| Tier | Name | Range | Description |
|---|---|---|---|
| T0 | Sandbox | 0–199 | No external effects. |
| T1 | Observed | 200–349 | Read-only, monitored. |
| T2 | Provisional | 350–499 | Limited write, scoped tools. |
| T3 | Monitored | 500–649 | Standard operations, audit. |
| T4 | Standard | 650–799 | Full operational capability. |
| T5 | Trusted | 800–875 | Cross-system operations. |
| T6 | Certified | 876–950 | Multi-agent coordination. |
| T7 | Autonomous | 951–1000 | Full autonomous operation. |
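The tier table can be read as a simple range lookup. The sketch below copies the boundaries from the table; the `Tier` shape and `tierForScore` name are illustrative, not exports of the canonical module, and flooring fractional scores is an assumption.

```typescript
// Tier boundaries copied from the table above. Names are illustrative only.
interface Tier {
  index: number;
  name: string;
  min: number;
  max: number;
}

const TIERS: readonly Tier[] = [
  { index: 0, name: "Sandbox", min: 0, max: 199 },
  { index: 1, name: "Observed", min: 200, max: 349 },
  { index: 2, name: "Provisional", min: 350, max: 499 },
  { index: 3, name: "Monitored", min: 500, max: 649 },
  { index: 4, name: "Standard", min: 650, max: 799 },
  { index: 5, name: "Trusted", min: 800, max: 875 },
  { index: 6, name: "Certified", min: 876, max: 950 },
  { index: 7, name: "Autonomous", min: 951, max: 1000 },
];

function tierForScore(score: number): Tier {
  if (!Number.isFinite(score) || score < 0 || score > 1000) {
    throw new RangeError(`score out of range [0, 1000]: ${score}`);
  }
  // ASSUMPTION: fractional scores belong to the tier of their integer part.
  const s = Math.floor(score);
  const tier = TIERS.find((t) => s >= t.min && s <= t.max);
  if (!tier) {
    throw new Error("unreachable: tiers cover [0, 1000]");
  }
  return tier;
}
```

This lookup covers only the score-to-tier mapping; the time gates for T5–T7 and the demotion hysteresis buffers would sit on top of it.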
Gain formula
A successful action yields trust. Headroom (C − S) throttles
gain logarithmically, so the first points are cheap and the last points expensive. The cube root
on the risk multiplier gives a sub-linear bonus for riskier successes, preventing risk-seeking as a
path to faster promotion.
- C: Observation-tier ceiling
- S: Current trust score
- R: Risk multiplier (1–30)
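A minimal sketch of the shape described above, assuming gain is the product of a scaling constant, ln(1 + (C − S)), and the cube root of R. The `1 +` offset and the `rate` parameter are assumptions made for illustration; the normative expression and constants are whatever canonical.ts exports.

```typescript
// ASSUMPTION: functional form reconstructed from the prose description.
// `rate` is a hypothetical scaling constant, not a value from canonical.ts.
function trustGain(
  score: number,   // S: current trust score
  ceiling: number, // C: observation-tier ceiling
  risk: number,    // R: risk multiplier (1–30)
  rate: number,    // hypothetical scaling constant
): number {
  // Headroom C − S, clamped so a score at or above the ceiling gains nothing.
  const headroom = Math.max(0, ceiling - score);
  // Math.log1p(x) computes ln(1 + x); Math.cbrt is the cube root.
  return rate * Math.log1p(headroom) * Math.cbrt(risk);
}
```

Under this form, gain shrinks toward zero as S approaches C, and a 30× action earns only about 3.1× the gain of a 1× action, which is the sub-linear bonus the text describes.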
Loss formula
A failed action costs trust. The penalty scales with tier (higher tier = more to lose, by design)
and with risk. The fixed midpoint reference ln(1 + C/2)
prevents an agent from parking near the ceiling to minimize loss exposure: magnitude depends on
the ceiling, not the current position.
The penalty ratio is P(T) = 3 + T for tier index T ∈ {0, …, 7}, ranging from 3× (T0) to 10× (T7).
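As a hedged illustration, the penalty ratio and a loss of the described shape can be sketched as below. `tierPenalty` follows the stated P(T) = 3 + T directly. `trustLoss` assumes the loss is a plain product of a scaling constant, P(T), R, and ln(1 + C/2); the product form, the linear dependence on R, and the `rate` parameter are assumptions, not the canonical definition.

```typescript
// P(T) = 3 + T, as stated in the text, for integer tier index T in 0..7.
function tierPenalty(tierIndex: number): number {
  if (!Number.isInteger(tierIndex) || tierIndex < 0 || tierIndex > 7) {
    throw new RangeError(`tier index must be an integer in 0..7: ${tierIndex}`);
  }
  return 3 + tierIndex;
}

// ASSUMPTION: product form reconstructed from the prose description.
// `rate` is a hypothetical scaling constant, not a value from canonical.ts.
function trustLoss(
  tierIndex: number, // T: current tier index
  ceiling: number,   // C: observation-tier ceiling
  risk: number,      // R: risk multiplier (1–30)
  rate: number,      // hypothetical scaling constant
): number {
  // The reference term uses the fixed midpoint C/2, so the current score S
  // does not appear anywhere in the magnitude.
  return rate * tierPenalty(tierIndex) * risk * Math.log1p(ceiling / 2);
}
```

Because S is not an argument, an agent sitting just below its ceiling faces the same loss as one at the bottom of the same tier, which matches the anti-parking rationale above.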
Risk levels
Every action is classified into one of six risk levels. The multiplier enters both gain and loss formulas. The threshold is the minimum trust score required to attempt an action at that risk level.
| Level | Multiplier | Trust threshold | Description |
|---|---|---|---|
| READ | 1× | 0 | Observation only. |
| LOW | 3× | 200 | Minor, reversible. |
| MEDIUM | 5× | 400 | Operational impact. |
| HIGH | 10× | 600 | Significant damage. |
| CRITICAL | 15× | 800 | Severe, hard to reverse. |
| LIFE_CRITICAL | 30× | 951 | Human safety at stake. |
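The threshold column acts as a gate on what an agent may attempt. A small sketch with values copied from the table; the type and function names are illustrative, and treating the threshold as inclusive (score ≥ threshold) is an assumption based on "minimum trust score required".

```typescript
type RiskLevel =
  | "READ"
  | "LOW"
  | "MEDIUM"
  | "HIGH"
  | "CRITICAL"
  | "LIFE_CRITICAL";

// Multipliers and thresholds copied from the table above.
const RISK_LEVELS: Record<RiskLevel, { multiplier: number; threshold: number }> = {
  READ: { multiplier: 1, threshold: 0 },
  LOW: { multiplier: 3, threshold: 200 },
  MEDIUM: { multiplier: 5, threshold: 400 },
  HIGH: { multiplier: 10, threshold: 600 },
  CRITICAL: { multiplier: 15, threshold: 800 },
  LIFE_CRITICAL: { multiplier: 30, threshold: 951 },
};

// ASSUMPTION: the threshold is inclusive.
function canAttempt(score: number, level: RiskLevel): boolean {
  return score >= RISK_LEVELS[level].threshold;
}
```

For example, an agent at 650 may attempt HIGH actions but not CRITICAL ones, and only T7 scores (951 and up) clear the LIFE_CRITICAL gate.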
Observation tiers
An agent's maximum achievable trust score is capped by how much of its internal state is visible. A black-box agent cannot exceed T3 regardless of how well it behaves. This prevents trust from being spent on opacity.
| Visibility | Ceiling | Max tier | Description |
|---|---|---|---|
| BLACK_BOX | 600 | T3 | I/O only. |
| GRAY_BOX | 750 | T4 | I/O + logs. |
| WHITE_BOX | 900 | T6 | Full code access. |
| ATTESTED_BOX | 950 | T6 | TEE-verified execution. |
| VERIFIED_BOX | 1000 | T7 | TEE + interpretability stack. |
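The ceiling can be pictured as a clamp applied to any score update. The sketch below copies the ceilings from the table; `capScore` and the `Visibility` type are illustrative names, and how the canonical module actually enforces the cap is not specified here.

```typescript
type Visibility =
  | "BLACK_BOX"
  | "GRAY_BOX"
  | "WHITE_BOX"
  | "ATTESTED_BOX"
  | "VERIFIED_BOX";

// Ceilings copied from the table above.
const CEILINGS: Record<Visibility, number> = {
  BLACK_BOX: 600,
  GRAY_BOX: 750,
  WHITE_BOX: 900,
  ATTESTED_BOX: 950,
  VERIFIED_BOX: 1000,
};

// Clamp a candidate score to the ceiling for the agent's visibility class.
function capScore(score: number, visibility: Visibility): number {
  return Math.min(score, CEILINGS[visibility]);
}
```

A black-box agent whose raw score would reach 700 is held at 600, which falls inside the T3 range of 500–649.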
Circuit breakers
Two absolute thresholds and one rolling accumulator. The score thresholds latch: an agent that crosses below the tripped line stops operating until it is manually reinstated.
Score thresholds
- Tripped: Trust < 100. Hard stop. Agent cannot operate until reinstated.
- Degraded: Trust < 200. Gains frozen, losses still apply. Observation only.
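One way to model the score thresholds is a small state holder that latches on a trip. This is a sketch under assumptions: only the tripped state is treated as latching, the degraded state is recomputed from the current score, and the class and method names are hypothetical.

```typescript
type BreakerState = "normal" | "degraded" | "tripped";

class ScoreBreaker {
  private latched = false;

  // Report the breaker state for the latest score.
  update(score: number): BreakerState {
    if (score < 100) {
      this.latched = true; // hard stop, stays set until reinstate()
    }
    if (this.latched) {
      return "tripped";
    }
    // ASSUMPTION: degraded does not latch; it clears once score >= 200.
    return score < 200 ? "degraded" : "normal";
  }

  // Manual reinstatement clears the latch.
  reinstate(): void {
    this.latched = false;
  }
}
```

Once tripped, the breaker keeps reporting `"tripped"` even if the score later recovers, until `reinstate()` is called.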
Risk accumulator (24h rolling window)
Each failure contributes P(T) · R to the accumulator.
Thresholds escalate monitoring; crossing the top threshold trips the breaker regardless of current score.
- Warning: 60
- Degraded: 120
- CB trip: 240
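The accumulator can be sketched as a sum of P(T) · R over failures in the last 24 hours, with P(T) = 3 + T. The event shape, the function name, and the choice to treat thresholds as inclusive (≥) are assumptions made for illustration.

```typescript
interface FailureEvent {
  timestampMs: number; // when the failure occurred (epoch milliseconds)
  tierIndex: number;   // T: agent's tier index at the time of failure
  risk: number;        // R: risk multiplier of the failed action
}

type AccumulatorLevel = "ok" | "warning" | "degraded" | "tripped";

const WINDOW_MS = 24 * 60 * 60 * 1000;

function accumulatorLevel(
  failures: readonly FailureEvent[],
  nowMs: number,
): { total: number; level: AccumulatorLevel } {
  let total = 0;
  for (const f of failures) {
    const age = nowMs - f.timestampMs;
    // Count only failures inside the rolling 24h window.
    if (age >= 0 && age < WINDOW_MS) {
      total += (3 + f.tierIndex) * f.risk; // P(T) · R
    }
  }
  // ASSUMPTION: thresholds are inclusive.
  const level: AccumulatorLevel =
    total >= 240 ? "tripped"
    : total >= 120 ? "degraded"
    : total >= 60 ? "warning"
    : "ok";
  return { total, level };
}
```

Under these assumptions, a single HIGH (10×) failure at T4 contributes 7 · 10 = 70 and crosses the warning line, while a single LIFE_CRITICAL (30×) failure at T7 contributes 10 · 30 = 300 and trips the breaker on its own.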
Hybrid fingerprint
A fingerprint is useful only if it is hard to forge. BASIS combines two independent signals into a single attestation: a structural signature (SVD over the model's weight space) and a behavioral signature (responses to a sealed probe set). Either signal alone has known attacks; together they close the sleeper-backdoor gap.
Structural: SVD fingerprint
Singular-value decomposition over the weight tensors, quantized to 4 decimal places, top-64 components, 15% layer sample. Dual-fingerprinted (two independent seeds) to prevent grinding attacks. SVD is a necessary condition for identity but not sufficient — a malicious fine-tune can preserve the top-k spectrum.
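The quantization step can be illustrated in isolation. The sketch below assumes the singular values have already been computed elsewhere and only shows the "top-64 components, 4 decimal places" reduction; layer sampling, dual seeding, and hashing are out of scope, and `quantizeSpectrum` is a hypothetical name.

```typescript
// Reduce a list of singular values to a canonical, comparable form:
// sort descending, keep the top-k, and fix each value to a set number of decimals.
function quantizeSpectrum(
  singularValues: readonly number[],
  topK: number = 64,
  decimals: number = 4,
): string[] {
  return [...singularValues]        // copy so the input is not mutated
    .sort((a, b) => b - a)          // largest singular values first
    .slice(0, topK)                 // keep only the top-k components
    .map((v) => v.toFixed(decimals)); // fixed-width decimal strings
}
```

Emitting fixed-width strings rather than floats makes the result stable to serialize before any downstream hashing step.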
Behavioral: canary probes
A sealed, hashed probe set bound to agent identity. Kept private to prevent adversarial crafting. Minimum 5% probe coverage required. Drift triggers a canary-rate boost. Changes to the reference workload require a governance ceremony.
Adversarial harness
ATSF v1.1 ships with an adversarial harness that runs six evasion attempts against the combined fingerprint (quantization spoofing, sparse perturbation, targeted rotation, fine-tune camouflage, probe dodging, replay). Current evasion rate: 0/6, seed 42.
Reproducibility
Every number on this page is exported from canonical source. Every test run is deterministic under a fixed seed. The audit pipeline is a single command.
Audit locally
npm install @vorionsys/basis
npm run audit
The audit validates tier boundaries, formula invariants, risk-threshold monotonicity, observation-tier ceilings, and circuit-breaker thresholds against the canonical export. Downstream consumers can pin to the canonical module and catch spec drift at CI time.
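Two of the listed checks, tier boundaries and risk-threshold monotonicity, are simple enough to sketch. The example below runs against local copies of the table values rather than the package's exports, whose names are not given here; every identifier in it is hypothetical, and it is not the audit's actual implementation.

```typescript
interface ScoreRange {
  name: string;
  min: number;
  max: number;
}

// Report gaps, overlaps, or uncovered ends in a list of consecutive ranges.
function tierBoundaryErrors(
  tiers: readonly ScoreRange[],
  lo: number = 0,
  hi: number = 1000,
): string[] {
  if (tiers.length === 0) {
    return ["no tiers defined"];
  }
  const errors: string[] = [];
  if (tiers[0].min !== lo) {
    errors.push(`first tier must start at ${lo}`);
  }
  for (let i = 1; i < tiers.length; i++) {
    if (tiers[i].min !== tiers[i - 1].max + 1) {
      errors.push(`gap or overlap between ${tiers[i - 1].name} and ${tiers[i].name}`);
    }
  }
  if (tiers[tiers.length - 1].max !== hi) {
    errors.push(`last tier must end at ${hi}`);
  }
  return errors;
}

// True when each value is strictly greater than the one before it.
function isStrictlyIncreasing(values: readonly number[]): boolean {
  return values.every((v, i) => i === 0 || v > values[i - 1]);
}

// Local copies of the values from the tables on this page.
const SPEC_TIERS: readonly ScoreRange[] = [
  { name: "T0", min: 0, max: 199 },
  { name: "T1", min: 200, max: 349 },
  { name: "T2", min: 350, max: 499 },
  { name: "T3", min: 500, max: 649 },
  { name: "T4", min: 650, max: 799 },
  { name: "T5", min: 800, max: 875 },
  { name: "T6", min: 876, max: 950 },
  { name: "T7", min: 951, max: 1000 },
];
const RISK_THRESHOLDS: readonly number[] = [0, 200, 400, 600, 800, 951];
```

A downstream CI job could run checks of this kind against its pinned copy of the constants and fail the build when `tierBoundaryErrors` returns anything or a monotonicity check is false.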