Summary

  • Agentic AI adoption is accelerating across highly regulated services, yet only 2% of AI use cases operate fully autonomously, creating a widening gap between experimentation and safe deployment.
  • Regulatory expectations, including the EU AI Act’s mandatory logging, DORA’s operational resilience rules and the UK’s impact‑tolerance requirements, now make audit-grade observability a compliance necessity, not an engineering choice.
  • Enterprises require behavior-level visibility into what agents are trying to do, what context they use, which tools they invoke, which controls are applied and what business outcomes result, far beyond traditional monitoring.
  • Open standards such as the OpenTelemetry GenAI conventions and W3C Trace Context provide a portable, vendor-neutral foundation for tracing agent behavior across distributed, multi-model environments.
  • CIOs need a 24-month implementation plan, covering evidence models, provenance tracking, telemetry pipelines, security controls and accountability KPIs to safely scale agentic AI into production. 
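To make the last two bullets concrete, the sketch below shows what a single audit-grade span record for an agent step might look like. It is a minimal stdlib illustration, not the OpenTelemetry SDK itself: the `traceparent` string follows the W3C Trace Context header format, and the attribute names come from the OpenTelemetry GenAI semantic conventions; the agent, model name and token counts are hypothetical.

```python
import json
import secrets


def make_traceparent() -> str:
    """Build a W3C Trace Context traceparent value:
    version-traceid-spanid-flags, i.e. 00-<32 hex>-<16 hex>-01."""
    trace_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
    span_id = secrets.token_hex(8)    # 8 random bytes  -> 16 hex chars
    return f"00-{trace_id}-{span_id}-01"


def agent_span_record(traceparent: str, model: str,
                      input_tokens: int, output_tokens: int) -> dict:
    """A span-like evidence record using OpenTelemetry GenAI
    (gen_ai.*) attribute names for one agent model call."""
    return {
        "traceparent": traceparent,
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }


# Hypothetical claims-triage agent step, serialized for an audit store.
record = agent_span_record(make_traceparent(), "example-model", 512, 128)
print(json.dumps(record, indent=2))
```

Because the `traceparent` value propagates unchanged across service boundaries, every tool call an agent makes can be joined back to the originating request, which is what makes the evidence portable across vendors.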

Introduction

Agentic AI has moved quickly from experimentation to a strategic priority for regulated industries. Across banking, insurance, payments and capital markets, AI adoption has deepened to the point where around three‑quarters of firms now use AI, and more than half of those use cases involve some degree of automated decision‑making. Yet, beneath this progress sits a critical imbalance: only 2% of AI use cases in UK financial services operate in a fully autonomous manner. This “2% Paradox” points to the real constraint facing CIOs today: not model capability, but a lack of audit‑grade visibility into how agentic systems behave in production.

This challenge is no longer theoretical. Regulatory frameworks now define the boundaries within which agentic AI must operate. The EU AI Act (Regulation 2024/1689) introduces mandatory automatic logging for high‑risk AI systems, with explicit record‑keeping obligations that require organizations to track how models function, change and operate over time. At the same time, the Digital Operational Resilience Act (DORA), which became applicable in January 2025, requires financial institutions to demonstrate that ICT systems (including AI‑enabled workflows) can withstand disruption and maintain continuity. In the UK, firms were required by April 2025 to ensure that important business services remain within defined impact tolerances, directly implicating AI as it takes on a greater share of operational decision‑making.

Against this backdrop, traditional monitoring is insufficient. Infrastructure logs cannot explain why decisions were made, what context was used or how controls were applied. Observability must evolve into governance, becoming the system of record for how agents make decisions in production. This paper outlines a practical, regulation‑aligned framework to help CIOs move agentic AI from proof of concept to production: safely, transparently and at scale.

Foreword

There’s a lot of interest and a lot of talk about observability this year.

Because the tech is new, teams don’t know where to start. I’m having three to four client conversations every week that boil down to the same question: “How are other firms approaching this?”

The pattern is consistent. One camp understands the need for observability but needs a roadmap. The other is sprinting into innovation, shipping POCs that never see production because they’re not secure or explainable. This paper is for both camps. It lays out how to build audit-grade control so your agents don’t just demo well, they stand up in production and in audit.

Antanas Daujotis,

Head of AI (UK), Synechron