BASIS spec v1.0.0

ATSF Mathematics

The formulas, constants, and decision rules that govern trust scoring in the BASIS standard. Every value here is exported from packages/basis/src/canonical.ts — the single source of truth for conforming implementations.

Trust score space

Scores live on a bounded interval. Every agent starts at zero and must pass a qualification course before reaching the first operational tier. There is no trust inheritance — a derived agent does not inherit its parent's score.

Range: [0, 1000]
Initial score: 0
Qualification pass: 200

Tiers

Scores partition into eight tiers. Each tier defines the capability envelope an agent may operate within. Promotion is immediate at boundary crossings for T0–T4 and time-gated for T5–T7 (7, 10, 14 days respectively). Demotion uses per-tier hysteresis buffers to prevent oscillation.

Tier  Name         Range     Description
T0    Sandbox      0–199     No external effects.
T1    Observed     200–349   Read-only, monitored.
T2    Provisional  350–499   Limited write, scoped tools.
T3    Monitored    500–649   Standard operations, audit.
T4    Standard     650–799   Full operational capability.
T5    Trusted      800–875   Cross-system operations.
T6    Certified    876–950   Multi-agent coordination.
T7    Autonomous   951–1000  Full autonomous operation.
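
As an illustration of the partition above, the following sketch maps a score to its tier. The names `Tier`, `TIERS`, and `tierForScore` are hypothetical and are not taken from canonical.ts; treating a fractional score as belonging to the tier whose lower bound it has reached is also an assumption.

```typescript
// Tier boundaries copied from the table above. The shape and names are
// illustrative only, not the canonical export.
interface Tier {
  index: number;
  name: string;
  min: number;
  max: number;
}

const TIERS: readonly Tier[] = [
  { index: 0, name: "Sandbox", min: 0, max: 199 },
  { index: 1, name: "Observed", min: 200, max: 349 },
  { index: 2, name: "Provisional", min: 350, max: 499 },
  { index: 3, name: "Monitored", min: 500, max: 649 },
  { index: 4, name: "Standard", min: 650, max: 799 },
  { index: 5, name: "Trusted", min: 800, max: 875 },
  { index: 6, name: "Certified", min: 876, max: 950 },
  { index: 7, name: "Autonomous", min: 951, max: 1000 },
];

function tierForScore(score: number): Tier {
  if (!isFinite(score) || score < 0 || score > 1000) {
    throw new RangeError(`score out of range [0, 1000]: ${score}`);
  }
  // Tiers are listed in ascending order, so the last tier whose lower
  // bound has been reached is the one the score belongs to.
  let found: Tier | undefined;
  for (const tier of TIERS) {
    if (score >= tier.min) found = tier;
  }
  if (found === undefined) {
    throw new Error("no tier matched"); // unreachable for valid scores
  }
  return found;
}
```

This lookup covers only the score-to-tier mapping; the time gates for T5–T7 and the demotion hysteresis buffers are not modeled.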

Gain formula

A successful action yields trust. Gain scales with the logarithm of the headroom C − S, so it shrinks as the score approaches the ceiling: the first points are cheap and the last points expensive. The cube root on the risk multiplier gives a sub-linear bonus for riskier successes, so that taking on risk is not a shortcut to faster promotion.

gain = 0.05 · ln(1 + C − S) · ∛R
C: Observation-tier ceiling
S: Current trust score
R: Risk multiplier (1–30)
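
The gain formula can be sketched as a small function. The name `trustGain` and the parameter names are hypothetical; clamping the headroom at zero when the score is at or above the ceiling is an assumption, since the formula itself is only stated for S ≤ C.

```typescript
// gain = 0.05 * ln(1 + C - S) * cbrt(R)
// Illustrative sketch; not the canonical.ts implementation.
const GAIN_RATE = 0.05;

function trustGain(ceiling: number, score: number, risk: number): number {
  if (risk < 1 || risk > 30) {
    throw new RangeError(`risk multiplier must be in [1, 30]: ${risk}`);
  }
  // Assumption: no headroom left means no gain (never a negative logarithm).
  const headroom = Math.max(0, ceiling - score);
  // Math.pow(risk, 1 / 3) is the cube root for positive risk values.
  return GAIN_RATE * Math.log(1 + headroom) * Math.pow(risk, 1 / 3);
}

// Example inputs: a black-box agent (C = 600) at two different scores.
const gainEarly = trustGain(600, 200, 10);
const gainLate = trustGain(600, 590, 10);
console.log(gainEarly > gainLate); // headroom shrinks, so gain shrinks
```

Raising R from 1 to 30 multiplies the gain by ∛30, roughly 3.1, rather than by 30, which is the sub-linear bonus described above.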

Loss formula

A failed action costs trust. The penalty scales with tier (higher tiers have more to lose, by design) and with risk. The fixed midpoint reference ln(1 + C/2) prevents an agent from parking near the ceiling to minimize loss exposure: the logarithmic term depends on the ceiling, not on the current score.

loss = −P(T) · R · 0.05 · ln(1 + C/2)

where P(T) = 3 + T for integer tier index T ∈ {0, …, 7}, giving a penalty multiplier from 3× (T0) to 10× (T7).
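
A minimal sketch of the loss formula follows. The function names `penaltyMultiplier` and `trustLoss` are hypothetical; only the constants and the formula come from this page.

```typescript
// loss = -P(T) * R * 0.05 * ln(1 + C / 2), with P(T) = 3 + T
// Illustrative sketch; not the canonical.ts implementation.
function penaltyMultiplier(tierIndex: number): number {
  if (tierIndex !== Math.floor(tierIndex) || tierIndex < 0 || tierIndex > 7) {
    throw new RangeError(`tier index must be an integer in 0..7: ${tierIndex}`);
  }
  return 3 + tierIndex;
}

function trustLoss(tierIndex: number, risk: number, ceiling: number): number {
  if (risk < 1 || risk > 30) {
    throw new RangeError(`risk multiplier must be in [1, 30]: ${risk}`);
  }
  // The logarithm uses the fixed midpoint C / 2, so the current score
  // only affects the result through the tier index.
  return -penaltyMultiplier(tierIndex) * risk * 0.05 * Math.log(1 + ceiling / 2);
}

// Example inputs: the same failure at T3 and at T7 under a ceiling of 1000.
console.log(trustLoss(3, 10, 1000), trustLoss(7, 10, 1000));
```

Unlike gain, loss is linear in R, so a failed high-risk action costs proportionally more than a successful one earns.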

Risk levels

Every action is classified into one of six risk levels. The multiplier enters both gain and loss formulas. The threshold is the minimum trust score required to attempt an action at that risk level.

Level          Multiplier  Trust threshold  Description
READ                       0                Observation only.
LOW                        200              Minor, reversible.
MEDIUM                     400              Operational impact.
HIGH           10×         600              Significant damage.
CRITICAL       15×         800              Severe, hard to reverse.
LIFE_CRITICAL  30×         951              Human safety at stake.
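
The threshold column can be read as a simple gate, sketched below. The names `RiskLevel`, `RISK_THRESHOLDS`, and `mayAttempt` are hypothetical. Multipliers are left out of the sketch because the table lists them only for the three highest levels, and treating a score equal to the threshold as sufficient follows from "minimum trust score required".

```typescript
// Trust thresholds copied from the table above. Names are illustrative,
// not the canonical.ts export.
type RiskLevel =
  | "READ"
  | "LOW"
  | "MEDIUM"
  | "HIGH"
  | "CRITICAL"
  | "LIFE_CRITICAL";

const RISK_THRESHOLDS: Record<RiskLevel, number> = {
  READ: 0,
  LOW: 200,
  MEDIUM: 400,
  HIGH: 600,
  CRITICAL: 800,
  LIFE_CRITICAL: 951,
};

// An agent may attempt an action only if its score meets the threshold
// for that action's risk level.
function mayAttempt(score: number, level: RiskLevel): boolean {
  return score >= RISK_THRESHOLDS[level];
}

console.log(mayAttempt(599, "HIGH"));  // false: one point short of 600
console.log(mayAttempt(600, "HIGH"));  // true
```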

Observation tiers

An agent's maximum achievable trust score is capped by how much of its internal state is visible. A black-box agent cannot exceed T3 regardless of how well it behaves. This prevents trust from being spent on opacity.

Visibility    Ceiling  Max tier  Description
BLACK_BOX     600      T3        I/O only.
GRAY_BOX      750      T4        I/O + logs.
WHITE_BOX     900      T6        Full code access.
ATTESTED_BOX  950      T6        TEE-verified execution.
VERIFIED_BOX  1000     T7        TEE + interpretability stack.
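
One way to picture the ceiling is as a clamp on the score, sketched below. The names `Visibility`, `OBSERVATION_CEILINGS`, and `capScore` are hypothetical, and enforcing the ceiling with an explicit clamp is an assumption; in the gain formula the ceiling also acts indirectly, because gain falls to zero as S approaches C.

```typescript
// Ceilings copied from the table above. Names are illustrative,
// not the canonical.ts export.
type Visibility =
  | "BLACK_BOX"
  | "GRAY_BOX"
  | "WHITE_BOX"
  | "ATTESTED_BOX"
  | "VERIFIED_BOX";

const OBSERVATION_CEILINGS: Record<Visibility, number> = {
  BLACK_BOX: 600,
  GRAY_BOX: 750,
  WHITE_BOX: 900,
  ATTESTED_BOX: 950,
  VERIFIED_BOX: 1000,
};

// Assumption: a score above the ceiling is clamped down to it.
function capScore(score: number, visibility: Visibility): number {
  return Math.min(score, OBSERVATION_CEILINGS[visibility]);
}

console.log(capScore(700, "BLACK_BOX")); // 600, which falls in T3 (500–649)
```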

Circuit breakers

Circuit breakers combine two absolute score thresholds with one rolling risk accumulator. The score thresholds latch: an agent that falls below the tripped line stops operating until it is manually reinstated.

Score thresholds

Tripped (Trust < 100): Hard stop. Agent cannot operate until reinstated.

Degraded (Trust < 200): Gains frozen, losses still apply. Observation only.
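
The two score thresholds can be sketched as a small state function. The names `BreakerState`, `nextBreakerState`, and `effectiveDelta` are hypothetical. The sketch assumes that only the tripped state latches and that the degraded state clears once the score is back at 200 or above; the reinstatement procedure itself is not modeled.

```typescript
// Thresholds copied from this section. Names are illustrative,
// not the canonical.ts export.
type BreakerState = "NORMAL" | "DEGRADED" | "TRIPPED";

const TRIPPED_BELOW = 100;
const DEGRADED_BELOW = 200;

function nextBreakerState(current: BreakerState, score: number): BreakerState {
  // Latch: a tripped agent stays tripped no matter what the score does.
  // Clearing it requires reinstatement, which this sketch leaves out.
  if (current === "TRIPPED") return "TRIPPED";
  if (score < TRIPPED_BELOW) return "TRIPPED";
  if (score < DEGRADED_BELOW) return "DEGRADED";
  return "NORMAL";
}

// While degraded, gains are frozen but losses still apply.
// Assumption: a tripped agent receives no score changes at all.
function effectiveDelta(state: BreakerState, delta: number): number {
  if (state === "TRIPPED") return 0;
  if (state === "DEGRADED" && delta > 0) return 0;
  return delta;
}

console.log(nextBreakerState("NORMAL", 150)); // "DEGRADED"
```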

Risk accumulator (24h rolling window)

Each failure contributes P(T) · R to the accumulator. Thresholds escalate monitoring; crossing the top threshold trips the breaker regardless of current score.

Warning: 60
Degraded: 120
CB trip: 240
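
A rolling accumulator along these lines might look like the sketch below. The names `FailureEvent`, `accumulatedRisk`, and `accumulatorLevel` are hypothetical. Treating a total equal to a threshold as having crossed it, and summing all failures whose age is under 24 hours, are assumptions about details this page does not specify.

```typescript
// Illustrative sketch of the 24h risk accumulator; not the canonical
// implementation.
interface FailureEvent {
  timestampMs: number; // when the failure occurred
  tierIndex: number;   // agent tier T at the time of failure (0..7)
  risk: number;        // risk multiplier R of the failed action
}

type AccumulatorLevel = "OK" | "WARNING" | "DEGRADED" | "TRIP";

const WINDOW_MS = 24 * 60 * 60 * 1000;

// Sum P(T) * R over the failures that fall inside the rolling window.
function accumulatedRisk(events: readonly FailureEvent[], nowMs: number): number {
  let total = 0;
  for (const e of events) {
    const age = nowMs - e.timestampMs;
    if (age >= 0 && age < WINDOW_MS) {
      total += (3 + e.tierIndex) * e.risk;
    }
  }
  return total;
}

// Assumption: reaching a threshold exactly counts as crossing it.
function accumulatorLevel(total: number): AccumulatorLevel {
  if (total >= 240) return "TRIP";
  if (total >= 120) return "DEGRADED";
  if (total >= 60) return "WARNING";
  return "OK";
}
```

Under these assumptions a single HIGH-risk failure (R = 10) at T3 contributes (3 + 3) · 10 = 60, which already reaches the warning threshold, and four such failures within 24 hours would trip the breaker.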

Hybrid fingerprint

A fingerprint is useful only if it is hard to forge. BASIS combines two independent signals into a single attestation: a structural signature (SVD over the model's weight space) and a behavioral signature (responses to a sealed probe set). Either signal alone has known attacks; together they close the sleeper-backdoor gap.

Structural: SVD fingerprint

Singular-value decomposition over the weight tensors, quantized to 4 decimal places, top-64 components, 15% layer sample. Dual-fingerprinted (two independent seeds) to prevent grinding attacks. A matching SVD fingerprint is necessary for identity but not sufficient: a malicious fine-tune can preserve the top-k spectrum.
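
The quantization and top-k steps can be illustrated in isolation. The sketch below assumes the singular values for a layer have already been computed elsewhere; the SVD itself, the 15% layer sampling, and the two-seed scheme are not shown. The function names `quantizedSpectrum` and `spectraMatch` are hypothetical, and representing each quantized value as a fixed-precision string is an assumption about encoding.

```typescript
// Illustrative sketch of the "top-64, 4 decimal places" step only.
// Input: singular values of one weight tensor, computed by other means.
function quantizedSpectrum(
  singularValues: readonly number[],
  topK: number = 64,
  decimals: number = 4,
): string[] {
  // Copy before sorting so the caller's array is left untouched,
  // then keep the topK largest values in descending order.
  const largest = singularValues
    .slice()
    .sort((a, b) => b - a)
    .slice(0, topK);
  // toFixed rounds to the requested precision and yields a stable string.
  return largest.map((v) => v.toFixed(decimals));
}

// Two spectra match when every quantized component is identical.
function spectraMatch(a: readonly number[], b: readonly number[]): boolean {
  const qa = quantizedSpectrum(a);
  const qb = quantizedSpectrum(b);
  if (qa.length !== qb.length) return false;
  return qa.join(",") === qb.join(",");
}
```

Because differences below the quantization step disappear, two weight tensors can produce the same quantized spectrum, which is consistent with treating the structural signature as necessary but not sufficient.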

Behavioral: canary probes

A sealed, hashed probe set bound to agent identity. Kept private to prevent adversarial crafting. Minimum 5% probe coverage required. Drift triggers a canary-rate boost. Changes to the reference workload require a governance ceremony.

Adversarial harness

ATSF v1.1 ships with an adversarial harness that runs six evasion attempts against the combined fingerprint (quantization spoofing, sparse perturbation, targeted rotation, fine-tune camouflage, probe dodging, replay). Current evasion rate: 0/6, seed 42.

Reproducibility

Every number on this page is exported from canonical source. Every test run is deterministic under a fixed seed. The audit pipeline is a single command.

Audit locally

npm install @vorionsys/basis
npm run audit

The audit validates tier boundaries, formula invariants, risk-threshold monotonicity, observation-tier ceilings, and circuit-breaker thresholds against the canonical export. Downstream consumers can pin to the canonical module and catch spec drift at CI time.
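
The kind of invariant the audit describes can be approximated with a few lines of checking code. The sketch below does not reproduce `npm run audit` and does not import the package, since its export names are not given on this page; `findSpecDrift` and `ScoreRange` are hypothetical names, and the tables are passed in by the caller.

```typescript
// Illustrative drift check over caller-supplied tables; not the real audit.
interface ScoreRange {
  name: string;
  min: number;
  max: number;
}

function findSpecDrift(
  tiers: readonly ScoreRange[],
  riskThresholds: readonly number[],
  ceilings: readonly number[],
): string[] {
  const problems: string[] = [];

  // Tier boundaries: contiguous, non-overlapping, covering exactly [0, 1000].
  let expectedMin = 0;
  for (const t of tiers) {
    if (t.min !== expectedMin) {
      problems.push(`${t.name}: expected min ${expectedMin}, got ${t.min}`);
    }
    if (t.max < t.min) {
      problems.push(`${t.name}: max ${t.max} is below min ${t.min}`);
    }
    expectedMin = t.max + 1;
  }
  if (expectedMin !== 1001) {
    problems.push(`tiers end at ${expectedMin - 1}, expected 1000`);
  }

  // Risk thresholds and observation ceilings must never decrease.
  const checkMonotonic = (label: string, values: readonly number[]): void => {
    let previous = -Infinity;
    for (const v of values) {
      if (v < previous) problems.push(`${label}: ${v} follows ${previous}`);
      previous = v;
    }
  };
  checkMonotonic("risk threshold", riskThresholds);
  checkMonotonic("observation ceiling", ceilings);

  return problems;
}

// Values as published on this page.
const publishedTiers: ScoreRange[] = [
  { name: "T0", min: 0, max: 199 },
  { name: "T1", min: 200, max: 349 },
  { name: "T2", min: 350, max: 499 },
  { name: "T3", min: 500, max: 649 },
  { name: "T4", min: 650, max: 799 },
  { name: "T5", min: 800, max: 875 },
  { name: "T6", min: 876, max: 950 },
  { name: "T7", min: 951, max: 1000 },
];
const drift = findSpecDrift(
  publishedTiers,
  [0, 200, 400, 600, 800, 951],
  [600, 750, 900, 950, 1000],
);
console.log(drift.length === 0 ? "no drift" : drift.join("\n"));
```

A check like this could run as a CI step, failing the build when the returned list is non-empty.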