CPS 230 and AI: A Practical Operational Resilience Playbook

CPS 230 has been live since 1 July 2025. Nine months in, the practical question for boards and operational risk teams is no longer whether AI tools fall inside the standard. It is how to evidence it.


Written for compliance, risk, and audit professionals in Australian financial services. General information, not legal or compliance advice.

CPS 230 has been live for nine months. Evidence now matters more than intent.

Context for general readers: CPS 230 is the prudential standard APRA uses to make sure regulated financial institutions can keep operating when something breaks. It pulls together the older rules on outsourcing, business continuity, and operational risk into one framework. The key concepts are critical operations (the things that, if they stopped, would cause real harm to customers or the financial system), tolerance levels (the maximum disruption the entity can absorb before harm occurs), and material service providers (third parties whose failure would prevent the entity from operating). AI tools have entered all three categories.

The intent of CPS 230 is well rehearsed at this point. The implementation question has moved on. The APRA Corporate Plan 2025-26 signalled escalating supervisory focus on technology and operational resilience. Internal audit teams across the major banks are now circulating CPS 230 thematic reviews, and AI tooling is the variable on which most boards have the least line of sight.

This article is a practitioner playbook for closing that gap. It assumes you know the standard. It focuses on the four areas where AI changes the operational risk picture and where supervisory questions are most likely to land in 2026.

What changed when AI entered the operational stack

When CPS 230 was drafted, the dominant third-party operational risk picture was relatively static. A core banking platform. A general ledger. A payments gateway. The vendor list was long but reasonably stable, and the contractual surface area moved slowly.

Enterprise AI has changed three things about that picture.

First, the change cadence. Foundation model providers ship material capability changes monthly. A model that handled a credit decisioning summary acceptably in March may behave differently in May. CPS 230's expectation that material service providers are subject to ongoing monitoring now requires a monitoring cadence that matches the rate of underlying change.

Second, the operational dependency. AI tools have moved from convenience to embedded workflow in many institutions. Compliance monitoring, suspicious matter triage, customer correspondence drafting, and credit memo summarisation are now AI-assisted at scale. The unavailability of an AI service is no longer a productivity issue. It is an operational disruption that needs a tolerance level.

Third, the chain of dependency. The AI tool a regulated entity contracts with often relies on a sub-processor for the underlying model, a separate cloud provider for compute, and another for hosting. CPS 230 paragraph 35 makes the entity responsible for managing material risks across this chain, not just the contracting layer.

The five supervisory pressure points

1. Critical operations identification

CPS 230 requires entities to identify their critical operations and the resources they depend on. The supervisory question now: have you reassessed your critical operations list since AI tooling went into production?

A credit assessment workflow that was historically dependent on a core banking platform and a credit bureau feed may now also depend on a generative AI tool that drafts the assessment narrative. If that tool's unavailability would prevent timely customer outcomes, it sits inside the critical operation. If its outputs would influence the credit decision, the model itself becomes part of the operation's risk profile.

The practical action: re-run your critical operations mapping with AI tooling treated as a first-class component, not a productivity add-on. Document the dependency.
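
To make the dependency documentation concrete, here is a minimal sketch of what a mapping record might look like with AI tooling as a first-class component. Every name, field, and vendor in it is hypothetical, not drawn from the standard:

    from dataclasses import dataclass, field

    @dataclass
    class Dependency:
        name: str       # e.g. "credit bureau feed"
        kind: str       # "platform", "data_feed", or "ai_tool"
        provider: str   # contracting party on the procurement record
        material: bool  # inside material service provider scope?

    @dataclass
    class CriticalOperation:
        name: str
        owner: str
        dependencies: list[Dependency] = field(default_factory=list)

    # AI tooling recorded as a first-class dependency, not a footnote.
    credit_assessment = CriticalOperation(
        name="credit assessment",
        owner="Head of Credit Risk",
        dependencies=[
            Dependency("core banking platform", "platform", "Vendor A", True),
            Dependency("credit bureau feed", "data_feed", "Vendor B", True),
            Dependency("assessment narrative drafting", "ai_tool", "Vendor C", True),
        ],
    )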

2. Tolerance levels and impact tolerance

Paragraph 30 of CPS 230 requires tolerance levels expressed as the maximum disruption the entity is willing to accept. For traditional infrastructure, this is well practised: an RTO of four hours for the payments rail, an RPO of fifteen minutes for transaction data.

AI tooling sits awkwardly in this framework because the disruption can be partial rather than binary. A foundation model may remain available but degrade in quality. A specialist AI tool may produce outputs that are technically valid but materially less useful than the previous version.

The practitioner answer: tolerance levels for AI-supported operations need to express both availability (is the service up) and quality (are the outputs fit for purpose). A monitoring metric that only tracks uptime will miss the quality dimension entirely.
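
As an illustration of the two-dimensional check, a minimal sketch in Python. The metric names, the thresholds, and the idea of a sampled quality score are assumptions for illustration, not anything CPS 230 prescribes:

    def breached_dimensions(uptime_pct: float, quality_score: float,
                            min_uptime: float = 99.5,
                            min_quality: float = 0.90) -> list[str]:
        """Return which tolerance dimensions are breached, if any."""
        breaches = []
        if uptime_pct < min_uptime:
            breaches.append("availability")
        if quality_score < min_quality:  # e.g. human-rated sample of outputs
            breaches.append("quality")
        return breaches

    # A fully available service can still breach tolerance:
    print(breached_dimensions(uptime_pct=100.0, quality_score=0.82))  # ['quality']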

3. Material service provider management

CPS 230 imposes specific obligations on relationships with material service providers, including formal contracts, ongoing performance monitoring, business continuity arrangements, and exit planning. CPG 230 paragraph 64 elaborates on the chain of providers.

For most enterprise AI tools, the contracting party is large (Microsoft, Google, Anthropic, OpenAI, AWS for Bedrock-hosted models). Concentration risk is the supervisory concern. If an institution's compliance monitoring, claims triage, and customer service drafting all depend on one provider's models, the failure of that provider creates a correlated disruption.

The practitioner action: map your AI tool dependencies to underlying model providers. The vendor name on the procurement record may be different from the model provider, and APRA expects you to know both. Where concentration is high, document the substitution path. The exit clause in the contract is necessary but not sufficient; you need a credible operational plan for substitution.
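
A sketch of what the dependency-to-provider mapping and a simple concentration check might look like; the vendors, providers, and the review threshold are all hypothetical:

    from collections import Counter

    # Contracting party versus underlying model provider, per tool.
    ai_tools = {
        "compliance monitoring":     {"contract": "Vendor X", "model_provider": "Provider A"},
        "claims triage":             {"contract": "Vendor Y", "model_provider": "Provider A"},
        "customer service drafting": {"contract": "Vendor Z", "model_provider": "Provider A"},
        "credit memo summarisation": {"contract": "Vendor X", "model_provider": "Provider B"},
    }

    concentration = Counter(tool["model_provider"] for tool in ai_tools.values())
    for provider, count in concentration.items():
        if count >= 3:  # illustrative trigger for a correlated-disruption review
            print(f"{provider}: {count} tools -> document the substitution path")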

4. Incident response and notification

CPS 230 requires incident response capability and timely escalation. AI failure modes do not always look like incidents. A model that begins producing systematically biased outputs is failing operationally, but it does not generate an alert in the way a system outage does.

The practical implication: incident triggers for AI-supported operations must include output quality degradation, not just availability loss. This is not a hypothetical risk. APRA's emerging focus on model risk (covered in our piece on the upcoming thematic review) sits exactly here.
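
One way to operationalise a quality-degradation trigger, as a sketch. The baseline-versus-recent comparison and the assumption that outputs are periodically sampled and human-scored are illustrative, not a prescribed method:

    from statistics import mean

    def quality_incident(baseline: list[float], recent: list[float],
                         max_drop: float = 0.05) -> bool:
        """Trigger when sampled output quality drifts below its baseline."""
        return mean(baseline) - mean(recent) > max_drop

    # The service is "up" throughout; the trigger fires on quality alone.
    if quality_incident(baseline=[0.93, 0.94, 0.92], recent=[0.86, 0.84, 0.85]):
        print("escalate: output quality degradation on an AI-supported operation")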

5. Internal audit and assurance

CPS 230 paragraph 56 expects internal audit to provide independent assurance over the operational risk framework. For AI-related controls, this introduces a capability question. Internal audit teams that were sized and skilled for traditional technology assurance work often lack the in-house expertise to evaluate AI-specific controls.

Three approaches are emerging across the major Australian institutions. The first is upskilling internal audit through formal training, often anchored to the Voluntary AI Safety Standard or international frameworks such as NIST AI RMF. The second is co-sourcing AI-specific audit work with external specialists. The third is establishing a second line of defence AI risk function with sufficient depth that internal audit can audit the function itself, rather than auditing the underlying technology directly.

None of these approaches is wrong. The supervisory question is whether the assurance model produces credible findings, not which staffing structure produces them. What is unacceptable is silence: an internal audit plan that does not address AI controls in any form, given the operational footprint AI now has.

The three documentation gaps that surface most often

In practice, three documentation gaps recur in CPS 230 readiness work for AI tooling.

The first is the threshold test for materiality. Entities have not always documented the threshold at which an AI tool becomes a material service provider. Without a documented threshold, the materiality classification of any individual tool is contestable, and a supervisor reviewing the third-party register can reasonably ask why some AI tools are inside scope and others are not.
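
A documented threshold can be as simple as a repeatable test applied identically to every tool on the register. The criteria and the two-of-three rule below are illustrative assumptions, not drawn from the standard:

    def is_material(supports_critical_operation: bool,
                    no_timely_substitute: bool,
                    handles_sensitive_data: bool) -> bool:
        """Documented threshold test, applied identically to every tool."""
        return sum([supports_critical_operation,
                    no_timely_substitute,
                    handles_sensitive_data]) >= 2

    # Same test, same recorded inputs, for every AI tool on the register.
    print(is_material(True, True, False))  # True -> full CPS 230 treatment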

The second is the change cadence assumption. Standard third-party risk frameworks assume the third-party service does not change unilaterally. AI tools change with each model update. The frequency of supervisory review baked into the standard third-party framework (often annual) does not match the frequency of underlying change in the AI tool. Where this mismatch is undocumented, it operates as a control gap.
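
The mismatch is easy to surface once review dates and provider change events are both recorded. A minimal sketch, with hypothetical tools and dates:

    from datetime import date

    last_review = {"narrative drafting": date(2025, 6, 1),
                   "claims triage": date(2025, 11, 1)}
    last_model_change = {"narrative drafting": date(2025, 9, 15),
                         "claims triage": date(2025, 10, 2)}

    stale = [tool for tool, reviewed in last_review.items()
             if last_model_change[tool] > reviewed]
    print(stale)  # ['narrative drafting']: review cadence lags change cadence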

The third is the human-in-the-loop adequacy claim. Many entities have documented that AI outputs are reviewed by humans before action. Few have documented what that review actually involves: how much time is allocated, what information is available to the reviewer, and how often the reviewer materially modifies or rejects the AI output. CPS 230 expects controls to be designed and operating effectively. A nominal human review without supporting evidence may not satisfy this.
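
Evidence that the review control is operating can be as simple as structured logging of each review event. The schema and outcome labels below are assumptions for illustration:

    from dataclasses import dataclass

    @dataclass
    class ReviewEvent:
        output_id: str
        reviewer: str
        seconds_spent: int
        outcome: str  # "accepted", "modified", or "rejected"

    def modification_rate(events: list[ReviewEvent]) -> float:
        """Share of AI outputs materially changed or rejected on review."""
        changed = sum(e.outcome in ("modified", "rejected") for e in events)
        return changed / len(events)

A modification rate near zero, paired with very short review times, is exactly the pattern that invites the supervisory question of whether the control is operating effectively.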

A short note on tolerance level definition

Because tolerance levels are the area where AI tooling stretches CPS 230 most, it is worth pausing on what good looks like.

A tolerance level for an AI-supported critical operation should answer four questions in writing. What is the maximum tolerable disruption to the service availability dimension? What is the maximum tolerable degradation to the output quality dimension? What is the manual fallback during a disruption, including the throughput rate that fallback can sustain? Who is the named owner of the tolerance level, accountable for both setting it and triggering escalation when it is breached?
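
Written down, the four answers for one hypothetical operation might look like the record below; every value is illustrative:

    tolerance_record = {
        "operation": "credit assessment narrative drafting",
        "max_disruption": "8 business hours of service unavailability",
        "max_quality_degradation": "sampled quality score below 0.90 "
                                   "for two consecutive weeks",
        "manual_fallback": "analyst drafting at roughly 40% of normal throughput",
        "owner": "Head of Credit Risk (sets the level; triggers escalation)",
    }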

The answers will vary by entity, by tool, and by use case. What should not vary is the existence of the answers in writing. A regulated entity that cannot, on supervisory request, produce written tolerance levels for its material AI-supported critical operations is in a weaker position than one that can.

Practical implications this quarter

For GRC, operational risk, and internal audit teams, four actions sit inside the next ninety days:

  1. Re-run the critical operations mapping with AI in scope. The 2024 mapping done before broad enterprise AI rollout is almost certainly stale. Document the AI tool dependencies inside each critical operation.
  2. Inventory the AI tool stack against material service provider criteria. Apply your standard threshold tests for materiality. Where AI tools cross the threshold, they need the full CPS 230 treatment, including formal risk assessment, tested business continuity, and documented exit plans.
  3. Define quality tolerance levels alongside availability tolerance levels. This is new. The work is part technical (defining what quality degradation looks like for each tool) and part governance (deciding who owns the monitoring and what triggers escalation).
  4. Map the model and infrastructure chain. For each material AI tool, document the contracting party, the model provider, the inference infrastructure, and the data residency. The chain matters for both concentration analysis and incident response.

These actions are not optional. APRA's Corporate Plan flagged operational resilience and technology risk as enduring supervisory priorities, and CPS 230 thematic reviews are likely in the second half of 2026.

How the framework intersects with the broader AI risk picture

CPS 230 is one piece of a larger regulatory mosaic. Its operational risk and resilience focus does not, on its own, capture model risk (where APRA's expected thematic review will press), conduct risk (where ASIC has primary supervisory responsibility), AML/CTF risk (AUSTRAC), or privacy risk (OAIC). Each of these regulators reaches AI uses through its own framework.

What CPS 230 does is establish the operational floor. Where AI tools are operationally embedded inside a regulated entity, CPS 230 reaches them. Other risk frameworks layer on top. For GRC and risk teams, the practical implication is that AI governance cannot be designed against any single regulatory regime; it must be designed to satisfy the full set, with CPS 230 as the operational backbone for APRA-regulated entities.

The institutions that have struggled most are those that built AI governance frameworks around a single regulatory lens (often privacy or conduct) and then had to retrofit the operational risk dimension. The institutions doing best have built operational risk into the AI governance design from inception, with CPS 230 as the structural reference point.

Direction of travel

CPS 230 is not the ceiling of AI operational risk regulation in Australia. The Voluntary AI Safety Standard, the upcoming model risk thematic review by APRA, and the cross-regulator coordination between APRA, ASIC, AUSTRAC, and OAIC on AI all sit on top of CPS 230's foundation.

For GRC professionals, the question is not whether AI is inside the regulatory perimeter. It is. The question is whether your governance evidence is operating at the cadence the technology requires. For most regulated entities in Australia, that work is still under way. Practitioners who use the next two supervisory cycles to close the documentation gaps, formalise quality tolerance levels, and tighten the chain mapping will be in a stronger position than those who wait for the supervisory question to arrive first.

The supervisory machinery is not standing still. APRA's Corporate Plan, the upcoming model risk thematic review, and the joint cross-regulator engagement on AI are all moving in the same direction. CPS 230 is the operational anchor; the surrounding context is shifting around it.


Context

CPS 230 is APRA's prudential standard for operational risk management, replacing earlier outsourcing and business continuity standards. It applies to all APRA-regulated entities (banks, insurers, superannuation trustees) and requires them to identify critical operations, set tolerance levels for disruption, and manage material service providers, including the technology stack underneath them.

AI angle

Most enterprise AI tools (Microsoft 365 Copilot, Google Gemini for Workspace, Anthropic Claude, OpenAI ChatGPT Enterprise) sit inside the third-party arrangements CPS 230 governs. Where AI outputs feed material decisions or critical operations, CPS 230 reaches the model lifecycle as well as the contract.


Content disclaimer: This article is for general educational and informational purposes only. It does not constitute legal advice, regulatory guidance, or a substitute for professional compliance judgement. Regulatory obligations vary by entity type, licence, and circumstance. Always refer to primary source guidance from APRA, ASIC, or the relevant regulatory authority.