BASIS spec v1.0.0

ATSF Mathematics

The formulas, constants, and decision rules that govern trust scoring in the BASIS standard. Every value here is exported from packages/basis/src/canonical.ts — the single source of truth for conforming implementations.

Trust score space

Scores live on a bounded interval. Every agent starts at zero and must pass a qualification course before reaching the first operational tier. There is no trust inheritance — a derived agent does not inherit its parent's score.

Range: [0, 1000]
Initial score: 0
Qualification pass: 200

Tiers

Scores partition into eight tiers. Each tier defines the capability envelope an agent may operate within. Promotion is immediate at boundary crossings for T0–T4 and time-gated for T5–T7 (7, 10, 14 days respectively). Demotion uses per-tier hysteresis buffers to prevent oscillation.

Tier	Name	Range	Description
T0	Sandbox	0–199	No external effects.
T1	Observed	200–349	Read-only, monitored.
T2	Provisional	350–499	Limited write, scoped tools.
T3	Monitored	500–649	Standard operations, audit.
T4	Standard	650–799	Full operational capability.
T5	Trusted	800–875	Cross-system operations.
T6	Certified	876–950	Multi-agent coordination.
T7	Autonomous	951–1000	Full autonomous operation.
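
The tier table above can be read as a lookup from score to tier index. The following is a minimal sketch, not the canonical implementation: the names TIER_FLOORS, tierForScore, and promotionGateDays are hypothetical, and the values are copied by hand from this page rather than imported from the canonical module.

```typescript
// Tier lower bounds copied from the table above; array index = tier number.
const TIER_FLOORS: readonly number[] = [0, 200, 350, 500, 650, 800, 876, 951];

// Promotion time gates in days for T5, T6, and T7 (from the text above).
const PROMOTION_GATE_DAYS = new Map<number, number>([
  [5, 7],
  [6, 10],
  [7, 14],
]);

// Hypothetical helper: map a score in [0, 1000] to its tier index.
function tierForScore(score: number): number {
  if (!Number.isFinite(score) || score < 0 || score > 1000) {
    throw new RangeError(`score out of range: ${score}`);
  }
  let tier = 0;
  // The floors are ascending, so the last floor at or below the score wins.
  TIER_FLOORS.forEach((floor, index) => {
    if (score >= floor) tier = index;
  });
  return tier;
}

// Hypothetical helper: days a promotion into `tier` is gated (0 = immediate).
function promotionGateDays(tier: number): number {
  return PROMOTION_GATE_DAYS.get(tier) ?? 0;
}
```

Under this sketch a score of 875 maps to T5 and a score of 876 maps to T6. Demotion hysteresis is not modeled because the per-tier buffer sizes are not stated on this page.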

Gain formula

A successful action yields trust. Headroom (C − S) throttles gain logarithmically, so the first points are cheap and the last points expensive. The cube root on the risk multiplier gives a sub-linear bonus for riskier successes, preventing risk-seeking as a path to faster promotion.

gain = 0.05 · ln(1 + C − S) · ∛R

C: Observation-tier ceiling
S: Current trust score
R: Risk multiplier (1–30)
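
To make the shape of the gain curve concrete, here is a minimal sketch of the formula in TypeScript. The function name trustGain is hypothetical, and clamping the headroom at zero is an assumption (the formula itself is undefined once S exceeds C + 1).

```typescript
const GAIN_RATE = 0.05; // constant from the gain formula above

// Hypothetical helper: trust gained by one successful action.
// score = current trust S, ceiling = observation-tier ceiling C,
// risk = risk multiplier R (1 to 30).
function trustGain(score: number, ceiling: number, risk: number): number {
  if (risk < 1 || risk > 30) {
    throw new RangeError(`risk multiplier out of range: ${risk}`);
  }
  // Assumption: a score at or above the ceiling has zero headroom and gains nothing.
  const headroom = Math.max(0, ceiling - score);
  return GAIN_RATE * Math.log(1 + headroom) * Math.cbrt(risk);
}

const fresh = trustGain(0, 1000, 1); // about 0.345
const nearCap = trustGain(999, 1000, 1); // about 0.035
```

With these inputs, a READ success at score 0 is worth roughly ten times a READ success one point below a ceiling of 1000. Raising R from 1 to 30 multiplies the gain by only about 3.1, the cube root of 30.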

Loss formula

A failed action costs trust. The penalty scales with tier (higher tier = more to lose, by design) and with risk. The fixed midpoint reference ln(1 + C/2) prevents an agent from parking near the ceiling to minimize loss exposure: magnitude depends on the ceiling, not the current position.

loss = −P(T) · R · 0.05 · ln(1 + C/2)

where P(T) = 3 + T for integer tier index T ∈ {0, 1, …, 7}, giving a penalty ratio from 3× (T0) to 10× (T7).
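
The loss formula can be sketched the same way. The names penaltyRatio and trustLoss are hypothetical; the constants come from the formula above.

```typescript
const LOSS_RATE = 0.05; // constant from the loss formula above

// P(T) = 3 + T for an integer tier index T between 0 and 7.
function penaltyRatio(tier: number): number {
  if (!Number.isInteger(tier) || tier < 0 || tier > 7) {
    throw new RangeError(`tier index out of range: ${tier}`);
  }
  return 3 + tier;
}

// Hypothetical helper: trust change (always negative) for one failed action.
// Note that the current score does not appear: only tier, risk, and ceiling.
function trustLoss(tier: number, ceiling: number, risk: number): number {
  return -penaltyRatio(tier) * risk * LOSS_RATE * Math.log(1 + ceiling / 2);
}

const mildFailure = trustLoss(0, 600, 1); // about -0.86
const severeFailure = trustLoss(7, 1000, 30); // about -93.2
```

The second example shows the asymmetry by design: a single LIFE_CRITICAL failure at T7 under a ceiling of 1000 removes roughly 93 points, far more than any single success can add.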

Risk levels

Every action is classified into one of six risk levels. The multiplier enters both gain and loss formulas. The threshold is the minimum trust score required to attempt an action at that risk level.

Level	Multiplier	Trust threshold	Description
READ	1×	0	Observation only.
LOW	3×	200	Minor, reversible.
MEDIUM	5×	400	Operational impact.
HIGH	10×	600	Significant damage.
CRITICAL	15×	800	Severe, hard to reverse.
LIFE_CRITICAL	30×	951	Human safety at stake.
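
The threshold column acts as a gate in front of each action. A minimal sketch follows, assuming the comparison is inclusive (a score equal to the threshold qualifies); the names RISK_LEVELS and canAttempt are hypothetical and the values are copied from the table above.

```typescript
type RiskLevel = "READ" | "LOW" | "MEDIUM" | "HIGH" | "CRITICAL" | "LIFE_CRITICAL";

interface RiskSpec {
  multiplier: number; // R in the gain and loss formulas
  threshold: number; // minimum trust score required to attempt the action
}

// Values copied from the risk-level table above.
const RISK_LEVELS: Readonly<Record<RiskLevel, RiskSpec>> = {
  READ: { multiplier: 1, threshold: 0 },
  LOW: { multiplier: 3, threshold: 200 },
  MEDIUM: { multiplier: 5, threshold: 400 },
  HIGH: { multiplier: 10, threshold: 600 },
  CRITICAL: { multiplier: 15, threshold: 800 },
  LIFE_CRITICAL: { multiplier: 30, threshold: 951 },
};

// Assumption: the threshold is inclusive.
function canAttempt(score: number, level: RiskLevel): boolean {
  return score >= RISK_LEVELS[level].threshold;
}
```

For example, an agent at 650 may attempt a HIGH action (threshold 600) but not a CRITICAL one (threshold 800).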

Observation tiers

An agent's maximum achievable trust score is capped by how much of its internal state is visible. A black-box agent cannot exceed T3 regardless of how well it behaves. The cap ensures that a strong behavioral record cannot substitute for visibility into the agent.

Visibility	Ceiling	Max tier	Description
BLACK_BOX	600	T3	I/O only.
GRAY_BOX	750	T4	I/O + logs.
WHITE_BOX	900	T6	Full code access.
ATTESTED_BOX	950	T6	TEE-verified execution.
VERIFIED_BOX	1000	T7	TEE + interpretability stack.
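
The "Max tier" column follows from the ceiling and the tier boundaries, which the sketch below checks. The names CEILINGS, maxTierFor, and capScore are hypothetical, and clamping a score to the ceiling is an assumption about enforcement; the page only states that the ceiling is the maximum achievable score.

```typescript
type Visibility = "BLACK_BOX" | "GRAY_BOX" | "WHITE_BOX" | "ATTESTED_BOX" | "VERIFIED_BOX";

// Ceilings copied from the observation-tier table above.
const CEILINGS: Readonly<Record<Visibility, number>> = {
  BLACK_BOX: 600,
  GRAY_BOX: 750,
  WHITE_BOX: 900,
  ATTESTED_BOX: 950,
  VERIFIED_BOX: 1000,
};

// Tier lower bounds from the tier table; array index = tier number.
const TIER_LOWER_BOUNDS: readonly number[] = [0, 200, 350, 500, 650, 800, 876, 951];

// Highest tier whose lower bound is at or below the visibility ceiling.
function maxTierFor(visibility: Visibility): number {
  const ceiling = CEILINGS[visibility];
  return TIER_LOWER_BOUNDS.filter((floor) => floor <= ceiling).length - 1;
}

// Assumption: a score is never allowed to sit above the ceiling.
function capScore(score: number, visibility: Visibility): number {
  return Math.min(score, CEILINGS[visibility]);
}
```

Run against the table, maxTierFor returns 3, 4, 6, 6, and 7 for the five visibility levels in order, matching the "Max tier" column.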

Circuit breakers

The breaker combines two absolute score thresholds with one rolling risk accumulator. The score thresholds latch: an agent that falls below the tripped line stops operating and stays stopped until it is manually reinstated.

Score thresholds

Tripped: Trust < 100
Degraded: Trust < 200
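
The latching behavior can be sketched as a small state holder. This is an illustration under assumptions, not the canonical logic: the class name ScoreBreaker and its methods are hypothetical, and only the tripped state is treated as latching because that is the only latch the text describes.

```typescript
type BreakerState = "NORMAL" | "DEGRADED" | "TRIPPED";

const TRIPPED_BELOW = 100; // from "Tripped: Trust < 100"
const DEGRADED_BELOW = 200; // from "Degraded: Trust < 200"

class ScoreBreaker {
  private tripped = false;

  // Feed the latest score; returns the resulting breaker state.
  observe(score: number): BreakerState {
    if (score < TRIPPED_BELOW) this.tripped = true;
    // Once tripped, the state holds even if the score recovers.
    if (this.tripped) return "TRIPPED";
    // Assumption: the degraded state follows the score and does not latch.
    return score < DEGRADED_BELOW ? "DEGRADED" : "NORMAL";
  }

  // Manual reinstatement clears the latch.
  reinstate(): void {
    this.tripped = false;
  }
}
```

A breaker that sees scores 250, 150, 99, and then 500 reports NORMAL, DEGRADED, TRIPPED, and TRIPPED; only a call to reinstate lets the last score read as NORMAL again.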

Risk accumulator (24h rolling window)

Each failure contributes P(T) · R to the accumulator. Thresholds escalate monitoring; crossing the top threshold trips the breaker regardless of current score.

Warning: 60
Degraded: 120
CB trip: 240
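
A rolling accumulator of this kind might look like the sketch below. The class name RiskAccumulator and its methods are hypothetical, timestamps are plain milliseconds supplied by the caller, and treating each threshold as inclusive is an assumption.

```typescript
type AccumulatorLevel = "OK" | "WARNING" | "DEGRADED" | "TRIPPED";

const WINDOW_MS = 24 * 60 * 60 * 1000; // 24h rolling window

interface FailureEntry {
  atMs: number; // when the failure happened
  weight: number; // P(T) * R contribution
}

class RiskAccumulator {
  private entries: FailureEntry[] = [];

  // Each failure contributes P(T) * R, with P(T) = 3 + T.
  recordFailure(atMs: number, tier: number, risk: number): void {
    this.entries.push({ atMs, weight: (3 + tier) * risk });
  }

  // Sum of contributions from failures less than 24h old at `nowMs`.
  total(nowMs: number): number {
    return this.entries
      .filter((entry) => nowMs - entry.atMs < WINDOW_MS)
      .reduce((sum, entry) => sum + entry.weight, 0);
  }

  // Assumption: reaching a threshold exactly counts as crossing it.
  level(nowMs: number): AccumulatorLevel {
    const sum = this.total(nowMs);
    if (sum >= 240) return "TRIPPED";
    if (sum >= 120) return "DEGRADED";
    if (sum >= 60) return "WARNING";
    return "OK";
  }
}
```

Seven LOW-risk failures at T0 contribute 7 × 3 × 3 = 63 and reach the warning level, while a single LIFE_CRITICAL failure at T7 contributes 10 × 30 = 300 and trips the breaker on its own. The sketch never discards old entries; a long-running process would prune them.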

Hybrid fingerprint

A fingerprint is useful only if it is hard to forge. BASIS combines two independent signals into a single attestation: a structural signature (derived from the model's weights) and a behavioral signature (responses to a sealed probe set). Either signal alone has known attacks; together they close the sleeper-backdoor gap. The fingerprinting method is patent-pending, so the descriptions below intentionally omit specifics.

Structural signature

A compact, one-way signature derived from the model's weight space. It is a necessary condition for identity but not sufficient on its own, which is why it is paired with a behavioral signature. It applies to open-weight (white-box) models and has documented sensitivity limits.

Behavioral signature: canary probes

A sealed, hashed probe set bound to agent identity, kept private to prevent adversarial crafting. Drift in the probe responses triggers an elevated monitoring rate. Changes to the reference workload require a governance ceremony.

Adversarial harness

ATSF ships with an adversarial harness that runs a battery of evasion attempts against the combined fingerprint and reports the evasion rate, so the attestation's resistance is measured rather than asserted.

Reproducibility

Every number on this page is exported from canonical source. Every test run is deterministic under a fixed seed. The audit pipeline is a single command.

Audit locally

npm install @vorionsys/basis
npm run audit

The audit validates tier boundaries, formula invariants, risk-threshold monotonicity, observation-tier ceilings, and circuit-breaker thresholds against the canonical export. Downstream consumers can pin to the canonical module and catch spec drift at CI time.
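
Two of the checks named above, tier boundaries and risk-threshold monotonicity, can be illustrated with a short sketch. It does not reproduce the real audit: the tables are local copies of values from this page (a real check would import them from the canonical module, whose export names are not given here), the function names are hypothetical, and reading "monotonicity" as strictly increasing is an assumption.

```typescript
// Local copies of [low, high] tier ranges and risk thresholds from this page.
const TIER_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0, 199], [200, 349], [350, 499], [500, 649],
  [650, 799], [800, 875], [876, 950], [951, 1000],
];
const RISK_THRESHOLDS: readonly number[] = [0, 200, 400, 600, 800, 951];

// True when the ranges cover [min, max] with no gaps and no overlaps.
function tiersAreContiguous(
  ranges: ReadonlyArray<readonly [number, number]>,
  min: number,
  max: number,
): boolean {
  let expectedLow = min;
  for (const [low, high] of ranges) {
    if (low !== expectedLow || high < low) return false;
    expectedLow = high + 1;
  }
  return expectedLow === max + 1;
}

// True when every value is greater than the one before it.
function isStrictlyIncreasing(values: readonly number[]): boolean {
  let previous = Number.NEGATIVE_INFINITY;
  for (const value of values) {
    if (value <= previous) return false;
    previous = value;
  }
  return true;
}

const tablesConsistent =
  tiersAreContiguous(TIER_RANGES, 0, 1000) && isStrictlyIncreasing(RISK_THRESHOLDS);
```

Both checks pass for the values published on this page, so tablesConsistent is true. A CI job could run checks like these and fail the build when a pinned copy no longer matches.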
