Sydney Harbour Night
← AI News
analysis

AI Agents Need Approval Gates Before They Need Autonomy

Autonomous AI agents are becoming practical, but organisations should design approval gates, permissions and evidence trails before granting action rights.

·TheAICommand

AI agents are one of the most important shifts in enterprise AI. A chatbot answers questions. An agent can plan steps, use tools, call APIs, search systems, write files, send messages, update records and trigger workflows. That makes agents useful, but it also changes the risk profile. The question is no longer only whether AI produced a good answer. The question is whether AI should be allowed to act.

Deloitte's 2026 State of AI in the Enterprise reports that worker access to AI increased by 50 percent in 2025, but only one in five companies has a mature governance model for autonomous AI agents. APRA's April 2026 AI letter also identifies autonomous agent misuse, insecure integrations, prompt injection and data leakage as emerging AI-related threats. Together, these signals point to the same governance principle: agents need approval gates before they need autonomy.

Diagram of an agent workflow with human approval gates inserted at high-risk action points
Approval gates before autonomy

Agents change the control problem

Traditional AI risk often centres on outputs. Did the model hallucinate? Did it reveal sensitive information? Was the summary accurate? Was the recommendation biased? Those questions remain relevant, but agents add action risk. An agent may email the wrong person, update the wrong record, retrieve sensitive data, execute code, create a public post, approve a workflow or pass information to another service.

The OECD definition of an AI system describes a machine-based system that infers from inputs how to generate outputs such as predictions, content, recommendations or decisions that can influence environments. Agents go a step further in practical terms because they may not only influence environments through recommendations. They may interact with tools that change environments directly.

Agent capabilityNew risk question
Read documentsWhat data can it access, and should it see all of it?
Search systemsCan it retrieve confidential or irrelevant information?
Use APIsWhat actions can it perform, and are permissions too broad?
Send messagesWho approves external or sensitive communications?
Write filesAre records versioned, auditable and reversible?
Execute workflowsCan the agent trigger financial, employment or customer-impacting actions?

This is why human approval gates are not a sign of immature automation. They are a design control.

The approval gate pattern

An approval gate is a point in a workflow where an AI agent must stop and obtain human confirmation before proceeding. The gate should be based on risk, not inconvenience. Low-risk actions may be automated. Medium-risk actions may need sampled review or manager approval. High-risk actions should require explicit human approval every time.

Approval gates work best when they are designed into the system rather than added through vague policy. A policy that says users must review important outputs is weaker than a workflow that prevents the agent from sending an external email until a human approves the recipient, content and attachments.

Workflow stageExample approval gate
Data accessHuman authorises connection to sensitive repositories
DraftingHuman approves final version before external use
Tool useAgent can search but cannot update records without approval
EscalationAgent pauses when confidence is low or policy exceptions appear
External actionHuman approves messages, submissions, purchases or system changes
Incident responseAgent suggests containment steps but does not execute high-impact actions alone

The Australian voluntary AI Safety Standard supports this approach through guardrails on accountability, human oversight, risk management, testing and monitoring. For agents, human oversight should be specific enough to define who approves, what they review, what evidence they see and how the approval is recorded.

Permissions should be narrow by default

The most dangerous agent is not necessarily the smartest agent. It is the agent with broad permissions and weak monitoring. If an agent can access every document, call every tool and act without review, a single prompt injection or configuration mistake can become a major incident.

APRA's AI letter calls out prompt injection and exploit injection as emerging threats. An agent connected to external content can be exposed to malicious instructions hidden in webpages, documents or emails. If the agent also has broad action rights, those instructions may lead to unauthorised disclosure or action.

A safer design starts with least privilege. The agent should have only the data access and tool permissions needed for the approved use case. Permissions should be time-limited where possible, separated by environment and logged. Sensitive actions should require a second factor of human approval.

Permission designSafer default
Data accessLimit repositories, fields and records by role and purpose
Tool accessAllow read-only tools before write or execute tools
External communicationBlock direct sending until human approval is recorded
System changesRequire human confirmation and rollback capability
Financial actionsRequire separate authority outside the agent workflow
LoggingRecord prompts, retrieved data, tool calls, outputs and approvals

The strongest control is not telling the agent to be careful. It is preventing the agent from doing things it should never do.

Diagram of an agent evidence trail capturing prompts, retrieval, tool calls, approvals and actions
Evidence trails are part of product design

Evidence trails are part of the product

Agents can create complex sequences of reasoning, retrieval and action. If something goes wrong, the organisation needs to reconstruct what happened. That means evidence trails should be treated as part of product design, not as an afterthought.

A useful evidence trail records the initiating user, system instructions, user prompt, retrieved sources, tool calls, data accessed, outputs generated, approvals obtained, actions taken, timestamps and errors. This record supports quality review, incident response, audit and continuous improvement.

The NIST AI Risk Management Framework encourages organisations to govern, map, measure and manage AI risks. Evidence trails support all four functions. They help governance bodies understand use, help teams map workflow context, help assurance teams measure performance and help owners manage failures.

Evidence itemWhy it matters
User promptShows the task and instruction context
Retrieved sourcesAllows verification of factual basis
Tool callsReveals what the agent attempted to do
Data accessedSupports privacy and security review
Human approvalsProves oversight operated at the right points
Final actionConnects AI output to business impact

Without this evidence, organisations may know that an agent acted, but not why it acted or whether controls operated.

Testing needs to include misuse

Agent testing should include ordinary performance tests and misuse tests. Ordinary testing asks whether the agent completes the intended task. Misuse testing asks what happens when instructions are ambiguous, malicious, conflicting or outside policy. This is especially important when agents read untrusted content, interact with email, access documents or call external tools.

Testing should include prompt injection scenarios, excessive permission checks, incorrect recipient scenarios, sensitive data retrieval attempts, failed approval gates and rollback exercises. The aim is not to prove that the agent will never fail. The aim is to prove that failure is constrained, visible and recoverable.

Start with constrained autonomy

Organisations do not need to choose between no agents and fully autonomous agents. The better starting point is constrained autonomy. An agent may be allowed to read approved sources, draft a response, prepare a checklist or open a ticket. It may not be allowed to send the response, close the ticket, approve the transaction or update the record without human review.

This pattern lets organisations learn safely. Over time, low-risk steps with strong evidence and stable performance can receive more automation. High-risk steps should remain gated.

The bottom line

AI agents will be valuable because they can move from advice to action. That is also why they require stronger controls than ordinary chatbots. The organisations that scale agents safely will design permissions, approval gates, evidence trails and misuse testing before granting autonomy.

The future of agentic AI should not be "let the agent do everything". It should be "let the agent do the right things, with the right permissions, under the right human control".

References

  1. Deloitte, State of AI in the Enterprise 2026
  2. APRA letter to industry on artificial intelligence, 30 April 2026
  3. OECD AI system definition
  4. Australian Government Voluntary AI Safety Standard
  5. NIST AI Risk Management Framework

TheAICommand. Intelligence, At Your Command.

Tags

AI AgentsHuman ApprovalAutomationAI GovernanceEnterprise AI
← Back to AI News