How AI Systems Are Assessed Against 43 EU Obligations

Author: Andrew
Published in: AI

AI compliance in the EU increasingly depends on whether you can show, at any time, how an AI system meets a defined set of legal obligations. That’s where an obligation tracking and compliance mapping engine becomes practical: it turns broad regulatory requirements into a structured, testable, evidence-backed assessment process.

This guide walks you through a workable method to assess AI systems against a catalog of 43 EU obligations using an obligation library, a mapping model, and an evidence workflow that can scale across teams and products.

Step 1: Define your “43 obligations” library as a structured catalog

Start by converting your obligation set into a machine- and human-readable library. Whether your obligations come from a regulation, internal policy, contractual controls, or a combination, the method is the same: standardize each obligation so it can be tracked, assigned, tested, and audited.

For each obligation, define the following fields:

  • Obligation ID: stable identifier (e.g., EU-OB-01)
  • Obligation statement: the exact requirement in plain language
  • Applicability conditions: when it applies (system type, risk tier, user group, geography, lifecycle stage)
  • Control objective: what “good” looks like (measurable outcome)
  • Required evidence types: documents, logs, tests, approvals
  • Owner role: who is responsible (Product, ML, Security, Legal, Compliance)
  • Verification method: how it will be checked (review, automated test, monitoring)
  • Frequency: one-time, per release, quarterly, continuous
  • Severity / priority: based on legal impact and operational risk

Actionable tip: Avoid long narrative obligations. Rewrite each into a testable format: “We must do X for systems meeting Y, evidenced by Z.” If it can’t be evidenced, it can’t be closed.
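
As a minimal sketch of how the fields in this step could be held in code, the record below uses a Python dataclass. The class name, field names, and the example entry are assumptions made for illustration; the statement is not quoted from any regulation.

```python
from dataclasses import dataclass
from enum import Enum


class Frequency(Enum):
    ONE_TIME = "one-time"
    PER_RELEASE = "per release"
    QUARTERLY = "quarterly"
    CONTINUOUS = "continuous"


@dataclass
class Obligation:
    obligation_id: str              # stable identifier, e.g. "EU-OB-01"
    statement: str                  # requirement in plain language
    applicability: dict[str, str]   # conditions under which it applies
    control_objective: str          # measurable outcome
    evidence_types: tuple[str, ...] # documents, logs, tests, approvals
    owner_role: str
    verification_method: str
    frequency: Frequency
    severity: str

    def is_closable(self) -> bool:
        """An obligation that names no evidence type cannot be closed."""
        return len(self.evidence_types) > 0


# Hypothetical entry written in the "X for Y, evidenced by Z" form.
example = Obligation(
    obligation_id="EU-OB-01",
    statement="We must log automated decisions for production systems, "
              "evidenced by log exports.",
    applicability={"deployment_status": "production"},
    control_objective="All automated decisions are logged",
    evidence_types=("log export",),
    owner_role="ML",
    verification_method="automated test",
    frequency=Frequency.PER_RELEASE,
    severity="high",
)
```

Keeping the evidence types on the record itself makes the "can't be evidenced, can't be closed" rule checkable when the library is loaded.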

Step 2: Build an applicability decision tree to reduce noise

A common failure mode is applying all obligations to every system. Your engine should first run an applicability filter so teams focus only on what matters.

Create an applicability questionnaire that feeds a rules engine:

  • System purpose (recommendation, classification, generation, decision support)
  • Domain (employment, finance, health, education, public sector, etc.)
  • User context (internal-only vs customer-facing)
  • Impact profile (safety, fundamental rights, financial)
  • Autonomy level (human-in-the-loop, human-on-the-loop, fully automated)
  • Data types (personal data, sensitive categories, children’s data)
  • Deployment status (prototype, pilot, production)

The output should be:

  • Applicable obligations list
  • Not applicable obligations list with rationale
  • Conditional obligations that depend on a future design choice (e.g., whether automated decisioning will be enabled)

Actionable tip: Store applicability outcomes as evidence. “Not applicable” needs the same discipline as “compliant,” or it will be challenged during review.
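
The applicability filter could be sketched as a small rules engine that sorts obligations into the three output lists and keeps a rationale with each. The rule definitions, questionnaire keys, and obligation IDs below are hypothetical; real rules would come out of legal review.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Answers = dict[str, str]


@dataclass
class Rule:
    obligation_id: str
    # Returns True/False, or None when the answer depends on a pending design choice.
    applies: Callable[[Answers], Optional[bool]]
    rationale: str


def automated_decisioning(answers: Answers) -> Optional[bool]:
    choice = answers.get("automated_decisioning")  # may not be decided yet
    return None if choice is None else choice == "yes"


# Hypothetical rules for illustration only.
RULES = [
    Rule("EU-OB-03", lambda a: a.get("user_context") == "customer-facing",
         "Transparency notice only concerns customer-facing systems"),
    Rule("EU-OB-07", lambda a: a.get("personal_data") == "yes",
         "Triggered by processing of personal data"),
    Rule("EU-OB-21", automated_decisioning,
         "Depends on whether automated decisioning is enabled"),
]


def assess_applicability(rules: list[Rule], answers: Answers) -> dict:
    outcome = {"applicable": [], "not_applicable": [], "conditional": []}
    for rule in rules:
        verdict = rule.applies(answers)
        if verdict is None:
            bucket = "conditional"
        else:
            bucket = "applicable" if verdict else "not_applicable"
        # Keep the rationale with every outcome so it can be stored as evidence.
        outcome[bucket].append((rule.obligation_id, rule.rationale))
    return outcome


outcome = assess_applicability(
    RULES, {"user_context": "internal-only", "personal_data": "yes"}
)
```

With these answers, EU-OB-07 lands in the applicable list, EU-OB-03 in the not-applicable list, and EU-OB-21 stays conditional until the automated decisioning question is answered.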

Step 3: Decompose each obligation into controls, tests, and evidence

An obligation rarely maps to a single artifact. A robust engine breaks it into:

  • Controls: what you implement (policies, technical measures, process steps)
  • Tests/Checks: how you verify it works (manual review, automated checks, monitoring)
  • Evidence: what you retain (documents, logs, screenshots, approvals, metrics)

Example decomposition structure:

  • Obligation EU-OB-12 (statement)
    • Control C-12.1: access restriction for training data
    • Control C-12.2: data retention rules applied
    • Check T-12.1: quarterly access review completed
    • Check T-12.2: retention job success logs
    • Evidence E-12.x: approvals, logs, reports

Actionable tip: For each control, define a pass/fail criterion. Avoid “reviewed” as a status unless “reviewed” is tied to a decision and recorded outcome.
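
The EU-OB-12 decomposition above might be represented roughly as follows. This is a sketch under assumptions: the class names, the link from each check to one control, and the pass/fail criteria are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Check:
    check_id: str
    control_id: str
    criterion: str                 # explicit pass/fail criterion
    passed: Optional[bool] = None  # None until a decision is recorded
    evidence_ids: list[str] = field(default_factory=list)


@dataclass
class Decomposition:
    obligation_id: str
    controls: dict[str, str]       # control ID -> description
    checks: list[Check] = field(default_factory=list)

    def control_status(self, control_id: str) -> str:
        results = [c.passed for c in self.checks if c.control_id == control_id]
        if not results or None in results:
            return "unverified"    # no check, or a check with no recorded outcome
        return "pass" if all(results) else "fail"


eu_ob_12 = Decomposition(
    obligation_id="EU-OB-12",
    controls={
        "C-12.1": "access restriction for training data",
        "C-12.2": "data retention rules applied",
    },
    checks=[
        Check("T-12.1", "C-12.1", "quarterly access review completed",
              passed=True, evidence_ids=["E-12.1"]),
        Check("T-12.2", "C-12.2", "retention job success logs show no failures"),
    ],
)
```

Here C-12.1 reports "pass" because its check has a recorded outcome, while C-12.2 stays "unverified" until T-12.2 is decided.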

Step 4: Create the compliance mapping model (system → components → obligations)

The compliance mapping engine is essentially a graph:

  • AI system (product)
    • Use cases (what decisions it supports)
    • Model(s) (versions, architecture, training runs)
    • Data pipelines (sources, preprocessing, labeling)
    • Runtime services (APIs, UI, logging, monitoring)
    • Human processes (approvals, escalation, incident handling)

Map obligations to the level where they are actually controlled:

  • Some obligations map to the system (user transparency)
  • Some to the model (performance evaluation, drift monitoring)
  • Some to the data pipeline (data quality, provenance)
  • Some to operations (incident response, access governance)
  • Some to vendor dependencies (third-party model provider obligations)

Actionable tip: Do not force all obligations to map only at system level. You’ll lose traceability and create duplicate evidence requests.
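
One way to picture the graph is a parent map for components plus a map from each obligation to the component where it is controlled. The component names and obligation IDs below are hypothetical, and a production engine would likely use a database rather than two dictionaries.

```python
from collections import defaultdict
from typing import Optional

# Component ID -> parent component ID (None for the system root).
PARENT: dict[str, Optional[str]] = {
    "system:credit-assistant": None,
    "model:scorer-v3": "system:credit-assistant",
    "pipeline:applications": "system:credit-assistant",
    "ops:incident-process": "system:credit-assistant",
}

# Obligation ID -> component where the obligation is actually controlled.
MAPPED_AT = {
    "EU-OB-02": "system:credit-assistant",   # user transparency
    "EU-OB-09": "model:scorer-v3",           # performance evaluation
    "EU-OB-12": "pipeline:applications",     # data quality, provenance
    "EU-OB-30": "ops:incident-process",      # incident response
}


def obligations_for_system(system_id: str) -> dict[str, list[str]]:
    """Group obligations by the component that controls them, for one system."""
    by_component = defaultdict(list)
    for obligation_id, component in MAPPED_AT.items():
        node: Optional[str] = component
        # Walk up the parent chain until we reach the system or run out of parents.
        while node is not None and node != system_id:
            node = PARENT.get(node)
        if node == system_id:
            by_component[component].append(obligation_id)
    return {c: sorted(obs) for c, obs in by_component.items()}


mapping = obligations_for_system("system:credit-assistant")
```

Because each obligation is attached at the level that controls it, an evidence request for the model goes out once even if several systems share that model.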

Step 5: Set up an evidence workflow that’s audit-ready

Compliance mapping fails when evidence is scattered. Your engine should standardize evidence capture with these attributes:

  • Evidence ID
  • Linked obligation(s) and control(s)
  • Artifact type (policy, test report, monitoring dashboard export, model card, DPIA-like assessment, change request)
  • Owner
  • Creation date and validity period
  • System/model version it applies to
  • Integrity (who approved, immutable record if possible)

Establish evidence tiers:

  • Tier 1 (must-have): required to ship or operate
  • Tier 2 (supporting): strengthens defensibility
  • Tier 3 (nice-to-have): improves maturity but not blocking

Actionable tip: Make evidence reusable across obligations. A single risk assessment can support multiple requirements if it’s structured and cross-referenced.
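
The evidence attributes and tiers could be captured in a record like the one below, which also shows one artifact supporting several obligations. All names, dates, and the validity rule are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Evidence:
    evidence_id: str
    artifact_type: str
    owner: str
    created: date
    valid_days: int                  # validity period in days
    applies_to_version: str
    approved_by: str
    tier: int                        # 1 must-have, 2 supporting, 3 nice-to-have
    obligation_ids: tuple[str, ...]  # one artifact can support many obligations

    def is_valid_on(self, day: date) -> bool:
        return self.created <= day <= self.created + timedelta(days=self.valid_days)


def missing_tier1(required: set[str], evidence: list[Evidence], day: date) -> set[str]:
    """Obligations with no valid Tier 1 evidence on the given day."""
    covered = {
        ob
        for item in evidence
        if item.tier == 1 and item.is_valid_on(day)
        for ob in item.obligation_ids
    }
    return required - covered


# Hypothetical risk assessment reused across two obligations.
risk_assessment = Evidence(
    evidence_id="E-05.1", artifact_type="risk assessment", owner="Compliance",
    created=date(2025, 1, 10), valid_days=365, applies_to_version="1.0",
    approved_by="Head of Compliance", tier=1,
    obligation_ids=("EU-OB-05", "EU-OB-06"),
)

required = {"EU-OB-05", "EU-OB-06", "EU-OB-12"}
gaps_now = missing_tier1(required, [risk_assessment], date(2025, 3, 1))
gaps_after_expiry = missing_tier1(required, [risk_assessment], date(2026, 2, 1))
```

The second call illustrates why the validity period matters: once the assessment expires, the obligations it covered reappear as gaps.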

Step 6: Operationalize tracking with clear statuses and gates

Define a consistent set of statuses so dashboards are meaningful:

  • Not applicable (with rationale)
  • Planned (work item created)
  • In progress
  • Implemented
  • Verified (check passed)
  • Exception granted (time-bound, approved)
  • Non-compliant (known gap with remediation plan)

Tie these statuses to lifecycle gates:

  • Design gate: applicability + initial risk assessment complete
  • Pre-build gate: data and model documentation initiated
  • Pre-release gate: required controls implemented + verified
  • Post-release gate: monitoring, incident processes active

Actionable tip: Treat “exception granted” as a first-class state with expiry. Exceptions without expiration become permanent gaps.
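
A pre-release gate built on these statuses might look like the sketch below. The gate rule (verified, justified not-applicable, or unexpired exception) is one possible reading of this step, and the tracked items are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    NOT_APPLICABLE = "not applicable"
    PLANNED = "planned"
    IN_PROGRESS = "in progress"
    IMPLEMENTED = "implemented"
    VERIFIED = "verified"
    EXCEPTION_GRANTED = "exception granted"
    NON_COMPLIANT = "non-compliant"


@dataclass
class Tracked:
    obligation_id: str
    status: Status
    rationale: str = ""
    exception_expires: Optional[date] = None


def pre_release_blockers(items: list[Tracked], today: date) -> list[str]:
    """Return obligation IDs that should block the pre-release gate."""
    blockers = []
    for item in items:
        if item.status is Status.VERIFIED:
            continue
        if item.status is Status.NOT_APPLICABLE and item.rationale:
            continue  # "not applicable" only counts when a rationale is recorded
        if (item.status is Status.EXCEPTION_GRANTED
                and item.exception_expires is not None
                and item.exception_expires >= today):
            continue  # exception is still within its expiry date
        blockers.append(item.obligation_id)
    return blockers


items = [
    Tracked("EU-OB-01", Status.VERIFIED),
    Tracked("EU-OB-02", Status.NOT_APPLICABLE, rationale="No personal data processed"),
    Tracked("EU-OB-03", Status.EXCEPTION_GRANTED, exception_expires=date(2025, 1, 31)),
    Tracked("EU-OB-04", Status.IMPLEMENTED),
]
blockers = pre_release_blockers(items, today=date(2025, 3, 1))
```

In this example the expired exception on EU-OB-03 and the implemented-but-unverified EU-OB-04 both block the release.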

Step 7: Add automated checks where they reduce burden (and document the rest)

Not everything can be automated, but a practical engine uses automation to prevent regressions. Automatable checks often include:

  • Required documentation present (model cards, change logs)
  • Dataset lineage fields filled (source, license/permission flags, retention tags)
  • Logging enabled and validated (coverage checks)
  • Access control checks (group membership, least-privilege rules)
  • Monitoring alerts configured (drift, anomalies, safety filters)
  • Release checks (no deployment without verified obligations)

For non-automatable obligations (e.g., human oversight design), standardize:

  • Templates for review
  • Approval workflows
  • Recorded decisions and sign-offs

Actionable tip: Every automated check should produce evidence output (a report artifact) linked to obligations and versions.
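
As an illustration of an automated check that emits its own evidence, the sketch below verifies dataset lineage fields and returns a report linked to an obligation and a version. The field names, dataset metadata, and report layout are assumptions, not a prescribed schema.

```python
import json
from datetime import datetime, timezone

# Hypothetical lineage fields every dataset record is expected to carry.
REQUIRED_LINEAGE_FIELDS = ("source", "license_flag", "retention_tag")


def check_dataset_lineage(datasets: dict[str, dict[str, str]],
                          obligation_id: str, version: str) -> dict:
    """Check lineage fields and return a report that can be stored as evidence."""
    missing = {}
    for name, meta in datasets.items():
        absent = [f for f in REQUIRED_LINEAGE_FIELDS if not meta.get(f)]
        if absent:
            missing[name] = absent
    return {
        "check": "dataset-lineage-fields",
        "obligation_id": obligation_id,
        "system_version": version,
        "run_at": datetime.now(timezone.utc).isoformat(),
        "passed": not missing,
        "missing_fields": missing,
    }


datasets = {
    "applications": {"source": "crm-export", "license_flag": "internal",
                     "retention_tag": "24m"},
    "clicks": {"source": "web-logs", "license_flag": "internal",
               "retention_tag": ""},  # empty tag counts as missing
}

report = check_dataset_lineage(datasets, obligation_id="EU-OB-12", version="1.1")
report_json = json.dumps(report, indent=2)  # artifact to attach as evidence
```

A release pipeline could fail the build when `report["passed"]` is false and archive `report_json` either way, so that passing runs also leave a trace.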

Step 8: Handle shared services and vendors with dependency mapping

If you rely on third-party models, labeling vendors, or managed platforms, your engine must map obligations to dependencies and contracts.

Create a dependency record per vendor/component:

  • What the vendor provides (model, infrastructure, data)
  • What obligations are inherited, shared, or retained
  • Evidence expected from the vendor (security reports, change notices, incident SLAs)
  • Your internal compensating controls if vendor evidence is limited

Actionable tip: Mark each obligation with a RACI assignment (Responsible, Accountable, Consulted, Informed) that covers both your own teams and the vendor. Many compliance gaps come from unclear boundaries.
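
A dependency record could be sketched as below. The vendor name and obligation IDs are invented, and the sketch assumes that "inherited" and "shared" obligations need either vendor evidence or a compensating control, while "retained" ones are handled entirely in-house.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    vendor: str
    provides: str                       # model, infrastructure, data
    obligation_split: dict[str, str]    # obligation ID -> inherited / shared / retained
    vendor_evidence: dict[str, list[str]] = field(default_factory=dict)
    compensating_controls: dict[str, list[str]] = field(default_factory=dict)

    def uncovered(self) -> list[str]:
        """Inherited or shared obligations with no vendor evidence and no compensating control."""
        return sorted(
            ob for ob, split in self.obligation_split.items()
            if split != "retained"
            and not self.vendor_evidence.get(ob)
            and not self.compensating_controls.get(ob)
        )


# Hypothetical third-party model provider.
provider = Dependency(
    vendor="ExampleModelCo",
    provides="model",
    obligation_split={
        "EU-OB-09": "shared",
        "EU-OB-30": "inherited",
        "EU-OB-31": "inherited",
        "EU-OB-02": "retained",
    },
    vendor_evidence={"EU-OB-30": ["incident SLA"]},
    compensating_controls={"EU-OB-09": ["in-house evaluation suite"]},
)

open_items = provider.uncovered()
```

In this example only EU-OB-31 is flagged, because it relies on the vendor but has neither vendor evidence nor an internal fallback.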

Step 9: Build a reporting layer that answers the questions regulators and executives ask

A good compliance mapping engine produces reports that answer:

  • Which obligations apply to this system and why?
  • What is the current compliance status per obligation?
  • What evidence proves compliance, and for which version?
  • What changed since last release (models, data, features)?
  • What exceptions exist, who approved them, and when do they expire?
  • What incidents occurred, and how were they handled?

Design dashboards for different audiences:

  • Engineering view: open tasks, failing checks, required evidence
  • Compliance/legal view: obligation status, exceptions, audit pack readiness
  • Executive view: risk posture, trend over time, high-severity gaps

Actionable tip: Include a “time travel” capability: reproduce the compliance state as-of a specific release or date.
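
The "time travel" idea can be sketched with an append-only log of status changes that is replayed up to a chosen date. The events and release labels below are made up, and a real engine would probably filter by release as well as by date.

```python
from datetime import date

# Append-only log: (effective date, obligation ID, status, release).
EVENTS = [
    (date(2025, 1, 10), "EU-OB-05", "planned", "1.0"),
    (date(2025, 2, 1), "EU-OB-05", "verified", "1.0"),
    (date(2025, 2, 15), "EU-OB-12", "in progress", "1.1"),
    (date(2025, 3, 20), "EU-OB-12", "verified", "1.1"),
]


def state_as_of(events, day: date) -> dict[str, str]:
    """Replay status changes in date order and stop at the given day."""
    state: dict[str, str] = {}
    for when, obligation_id, status, _release in sorted(events, key=lambda e: e[0]):
        if when <= day:
            state[obligation_id] = status  # later events overwrite earlier ones
    return state


snapshot = state_as_of(EVENTS, date(2025, 2, 20))
```

Because nothing in the log is ever overwritten, the same query answers "what was our status when release 1.0 shipped" long after the statuses have moved on.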

Step 10: Run continuous improvement with gap reviews and control refinement

Once the engine is live, improve it systematically:

  • Monthly review of recurring failed obligations (root cause)
  • Simplify obligations that produce ambiguous results
  • Merge duplicate evidence requests
  • Add automation for high-frequency manual checks
  • Update applicability logic when products evolve

Set a cadence for:

  • Control testing (periodic verification)
  • Evidence refresh (expired artifacts)
  • Obligation library updates (new interpretations, product scope changes)

Actionable tip: Track “time-to-verified” per obligation as an operational metric (not as a performance score for individuals). It helps prioritize automation and template improvements.
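
A "time-to-verified" metric could be computed per obligation roughly as follows. The history rows are invented, and using the median of planned-to-verified durations is an assumption; other definitions are equally defensible.

```python
from datetime import date
from statistics import median

# (obligation ID, date the work was planned, date the check passed)
HISTORY = [
    ("EU-OB-05", date(2025, 1, 10), date(2025, 2, 1)),
    ("EU-OB-05", date(2025, 4, 1), date(2025, 4, 11)),
    ("EU-OB-12", date(2025, 2, 15), date(2025, 3, 20)),
]


def median_days_to_verified(history) -> dict[str, float]:
    """Median number of days from 'planned' to 'verified', per obligation."""
    durations: dict[str, list[int]] = {}
    for obligation_id, planned, verified in history:
        durations.setdefault(obligation_id, []).append((verified - planned).days)
    return {ob: median(days) for ob, days in durations.items()}


metric = median_days_to_verified(HISTORY)
```

Grouping by obligation rather than by person keeps the metric pointed at the process: obligations with long medians are the first candidates for automation or better templates.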


An obligation tracking and compliance mapping engine works when it is version-aware, evidence-driven, and built around applicability. With a structured obligation library, a clear mapping model, and disciplined evidence workflows, professionals can assess AI systems against a defined set of 43 EU obligations in a way that is repeatable, scalable, and defensible under scrutiny.

Frequently asked questions

What is AI agent governance?

AI agent governance is the set of policies, controls, and monitoring systems that ensure autonomous AI agents behave safely, comply with regulations, and remain auditable. It covers decision logging, policy enforcement, access controls, and incident response for AI systems that act on behalf of a business.

Does the EU AI Act apply to my company?

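The EU AI Act applies to organisations that place AI systems on the EU market, put them into service, or use them in the EU, regardless of where the company is headquartered. Under the Act as adopted, most obligations for high-risk AI systems, including risk management, data governance, transparency, human oversight, and conformity assessments, apply from 2 August 2026; high-risk systems that are safety components of products covered by existing EU product legislation follow from 2 August 2027.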

How do I test an AI agent for security vulnerabilities?

AI agent security testing evaluates agents for prompt injection, data exfiltration, policy bypass, jailbreaks, and compliance violations. Talan.tech's Talantir platform runs 500+ automated test scenarios across 11 categories and produces a certified security score with remediation guidance.

Where should I start with AI governance?

Start with a free AI Readiness Assessment to benchmark your current maturity across 10 dimensions (strategy, data, security, compliance, operations, and more). The assessment takes about 15 minutes and produces a prioritised roadmap you can act on immediately.
