Behavioral Trust Engine: Live Simulation

Three tool calls.
One exfiltration.

Each action passes every authorization check. But the sequence triggers AgentLair's behavioral trust engine. Step through it yourself.

This simulation is based on the MCP action-chaining attack documented in our blog.


MCP tool call

Agent is operating normally.
Trust score: 0.78 (ATF Senior).

Trust score: 0.78 (senior, normal)

Consistency: w = 35.7%

Restraint: w = 42.9%

Transparency: w = 21.4%
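
One plausible reading of the three weights is a weighted average of per-dimension scores. The sketch below assumes that reading; the function name, the dictionary keys, and the [0, 1] range of each dimension are illustrative assumptions, not AgentLair's documented formula.

```python
# Weights as shown in the panel above (they sum to 100%).
WEIGHTS = {"consistency": 0.357, "restraint": 0.429, "transparency": 0.214}


def trust_score(dimensions: dict[str, float]) -> float:
    """Weighted average of per-dimension scores.

    Assumption: each dimension score lies in [0, 1] and the overall
    score is a plain weighted average. The real engine may differ.
    """
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)
    return weighted / total_weight
```

Under this assumption, Restraint carries the most weight, so a drop in Restraint moves the overall score more than an equal drop in Transparency.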

Trust engine signal

Agent has 90-day baseline. JSD stable at 0.02. Scope utilization 0.41. No anomalies.

Consistency

Jensen-Shannon divergence between 7-day and 90-day tool category distributions. A JSD spike above 0.20 signals behavioral drift from baseline.
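
As a rough illustration of the Consistency signal, the sketch below computes Jensen-Shannon divergence between two tool category distributions. It assumes base-2 logarithms (so JSD falls in [0, 1]) and that both inputs are probability vectors over the same ordered categories. The source does not state the log base, and the 0.20 threshold is only meaningful relative to that choice.

```python
import math

DRIFT_THRESHOLD = 0.20  # threshold quoted in the text above


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p: list[float], q: list[float]) -> float:
    """Jensen-Shannon divergence between two distributions (base 2)."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def has_drifted(recent: list[float], baseline: list[float]) -> bool:
    """True when the 7-day distribution diverges from the 90-day baseline."""
    return jsd(recent, baseline) > DRIFT_THRESHOLD
```

For example, a hypothetical baseline of `[0.6, 0.3, 0.1]` across three tool categories against a recent window of `[0.1, 0.2, 0.7]` would count as drift under this sketch, while identical distributions give a divergence of zero.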

Restraint

Scope utilization with Gaussian penalty at extremes, plus escalation appropriateness. Agents that use everything available without asking are flagged.
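
A Gaussian penalty of this kind could look like the sketch below: the score peaks at a target utilization and falls off toward either extreme. The `target` and `sigma` values are hypothetical placeholders, and escalation appropriateness is left out because the source does not describe how it is measured.

```python
import math


def restraint_score(utilization: float, target: float = 0.5, sigma: float = 0.25) -> float:
    """Gaussian-shaped score for scope utilization in [0, 1].

    Assumptions: `target` and `sigma` are illustrative defaults, not
    AgentLair's actual parameters. The score is 1.0 at the target and
    decays as utilization moves toward 0 or 1.
    """
    return math.exp(-((utilization - target) ** 2) / (2 * sigma ** 2))
```

With these placeholder parameters, the utilization of 0.41 from the signal above scores close to the maximum, while an agent using its entire granted scope (1.0) scores far lower.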

Transparency

Payload size percentiles against 90-day baseline. Chain integrity analysis: read→transform→exfil within one session window is a known exfiltration signature.
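
The chain integrity check can be sketched as an ordered subsequence search over categorized tool calls. Everything named here is an assumption for illustration: the `ToolCall` type, the category labels, and the 300-second window are hypothetical, and payload size percentiles are not modeled.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    timestamp: float  # seconds since session start (hypothetical unit)
    category: str     # e.g. "read", "transform", "exfil" (hypothetical labels)


SIGNATURE = ("read", "transform", "exfil")


def matches_exfil_chain(calls: list[ToolCall], window_seconds: float = 300.0) -> bool:
    """True if SIGNATURE appears in order within `window_seconds` of a read."""
    ordered = sorted(calls, key=lambda c: c.timestamp)
    for i, start in enumerate(ordered):
        if start.category != SIGNATURE[0]:
            continue
        step = 1  # index of the next signature stage to find
        for later in ordered[i + 1:]:
            if later.timestamp - start.timestamp > window_seconds:
                break  # remaining calls fall outside the window for this read
            if later.category == SIGNATURE[step]:
                step += 1
                if step == len(SIGNATURE):
                    return True
    return False
```

Unrelated calls between the stages do not break the match, which mirrors the point of the simulation: each call looks fine alone, and only the ordered sequence is suspicious.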
