Building compliant AI agent workflows: audit trail, approval gates, EU hosting

Multi-agent systems, programs in which several AI agents share tasks, have been production-ready since 2024 and are deployed by more and more mid-market companies. They promise to automate entire business processes: a "CEO" agent plans strategy, an "engineer" agent writes code, a "reviewer" agent reviews it. That sounds elegant, but it is legally demanding. This article shows how to build multi-agent systems that comply with the EU AI Act and GDPR Art. 22.

The compliance double problem

Multi-agent systems fall under two frameworks that together demand more than either alone:

EU AI Act Art. 26: human oversight at critical decision points.
GDPR Art. 22: no solely automated individual decisions with legal or similarly significant effect unless an exception applies, such as explicit consent (opt-in).

Both together: approval gates wherever money, people or customer-facing content is involved. Without an audit trail, no compliance proof. Without rollback, no error correction.

The five pillars of a compliant agent workflow

1. Audit trail per agent run

Every agent run is fully logged: input, model, output, timestamp, tool calls. A flat log file isn't enough — not searchable, not auditable. Structured logging in Postgres with a search UI is the minimum standard:

CREATE TABLE agent_runs (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  agent_name    TEXT NOT NULL,
  parent_run_id UUID REFERENCES agent_runs(id),  -- multi-agent hierarchy
  input         JSONB NOT NULL,
  model         TEXT NOT NULL,
  output        JSONB,
  tool_calls    JSONB,
  status        TEXT NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'running', 'ok', 'error', 'rejected')),
  duration_ms   INTEGER,
  approved_by   UUID REFERENCES auth.users(id),  -- reviewer; assumes an auth.users table (e.g. Supabase)
  created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_agent_runs_agent ON agent_runs (agent_name, created_at DESC);
CREATE INDEX idx_agent_runs_status ON agent_runs (status);
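To show how application code could fill such a row, here is a minimal Python sketch. It assumes a hypothetical helper `audited_run` and a plain list as a stand-in for the database; the model name is a placeholder, and a real setup would INSERT into `agent_runs` instead of appending to a list.

```python
import time
import uuid
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def audited_run(agent_name, model, run_input, sink, parent_run_id=None):
    """Collect one agent run as a dict shaped like an agent_runs row.

    `sink` is a hypothetical stand-in for the database: anything with
    append(). A real setup would INSERT the row into Postgres instead.
    """
    run = {
        "id": str(uuid.uuid4()),
        "agent_name": agent_name,
        "parent_run_id": parent_run_id,
        "input": run_input,
        "model": model,
        "output": None,
        "tool_calls": [],
        "status": "running",
        "duration_ms": None,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    started = time.monotonic()
    try:
        yield run  # the agent fills in output and tool_calls
        run["status"] = "ok"
    except Exception as exc:
        # Failed runs are logged too, then the error propagates.
        run["status"] = "error"
        run["output"] = {"error": str(exc)}
        raise
    finally:
        run["duration_ms"] = int((time.monotonic() - started) * 1000)
        sink.append(run)


rows = []
with audited_run("reviewer", "example-model", {"task": "review diff"}, rows) as run:
    run["tool_calls"].append({"tool": "read_file", "args": {"path": "README.md"}})
    run["output"] = {"verdict": "approve"}
```

The point of the context manager is that a run cannot finish, successfully or not, without leaving a row behind.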

2. Approval gates at critical decisions

Before any action with external impact the system waits for human approval:

Customer-facing content (blog post, email, invoice) → approval queue, not direct send.
Money movement (invoice, refund) → approval queue with amount limit.
HR decision (applicant pre-screening) → GDPR Art. 22 opt-in + manual review.
Code deploy to production → human clicks Deploy.

Rule of thumb: "Would I let an inexperienced employee do this without review?" If not, don't let an AI agent do it either.
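The gates above can be sketched as a small approval queue. This is an illustration under stated assumptions, not a reference implementation: the category names, the class names `PendingAction` and `ApprovalQueue`, and the 500 EUR limit are all hypothetical, and "amount limit" is interpreted here as a hard cap on what an agent may even submit.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical category names derived from the list above.
GATED_KINDS = {"customer_content", "money_movement", "hr_decision", "prod_deploy"}
AMOUNT_LIMIT_EUR = 500.0  # assumed cap for agent-initiated payments


@dataclass
class PendingAction:
    kind: str
    payload: dict
    amount_eur: float = 0.0
    approved_by: Optional[str] = None


class ApprovalQueue:
    def __init__(self):
        self.pending = []

    def submit(self, action, execute):
        """Run ungated actions right away, park gated ones for a human."""
        if action.kind not in GATED_KINDS:
            return execute(action)
        if action.kind == "money_movement" and action.amount_eur > AMOUNT_LIMIT_EUR:
            raise ValueError("amount exceeds the limit for agent-initiated payments")
        self.pending.append((action, execute))
        return None  # nothing has happened yet

    def approve(self, index, reviewer):
        """A human releases one queued action; only now does it execute."""
        action, execute = self.pending.pop(index)
        action.approved_by = reviewer
        return execute(action)


queue = ApprovalQueue()
outbox = []
queue.submit(PendingAction("customer_content", {"title": "Draft post"}), outbox.append)
# outbox is still empty here; the post goes out only after approve().
queue.approve(0, reviewer="admin@example.com")
```

The design choice is that the agent never holds the "send" capability itself: it can only hand an action plus its executor to the queue.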

3. Memory system for continuous learning

Agents learn from their own findings. Best practice: separate memory collections for "insights" (what works) and "improvements" (what to do better next run). Both can be audited by humans:

CREATE EXTENSION IF NOT EXISTS vector;  -- pgvector, needed for the vector column type

CREATE TABLE agent_memory (
  id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  agent_name   TEXT NOT NULL,
  type         TEXT NOT NULL CHECK (type IN ('insight', 'improvement')),
  source_run   UUID REFERENCES agent_runs(id),
  content      TEXT NOT NULL,
  embedding    vector(1536),  -- pgvector for semantic search
  created_at   TIMESTAMPTZ DEFAULT NOW()
);
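To make the retrieval side concrete, here is a minimal Python sketch of semantic recall over such memory entries. It is an in-memory stand-in: the function names `cosine` and `recall` are hypothetical, the three-dimensional embeddings are toy values, and in production the ranking would typically be done by pgvector inside the database.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(memory, agent_name, mem_type, query_embedding, k=3):
    """Return the k entries of one agent and one type closest to the query."""
    candidates = [
        m for m in memory
        if m["agent_name"] == agent_name and m["type"] == mem_type
    ]
    candidates.sort(key=lambda m: cosine(m["embedding"], query_embedding), reverse=True)
    return candidates[:k]


# Toy data: real embeddings would have 1536 dimensions as in the table.
memory = [
    {"agent_name": "engineer", "type": "insight",
     "content": "Small commits pass review faster", "embedding": [1.0, 0.0, 0.0]},
    {"agent_name": "engineer", "type": "improvement",
     "content": "Run tests before committing", "embedding": [1.0, 0.0, 0.0]},
    {"agent_name": "engineer", "type": "insight",
     "content": "Prefer existing helpers over new ones", "embedding": [0.0, 1.0, 0.0]},
    {"agent_name": "reviewer", "type": "insight",
     "content": "Check error handling first", "embedding": [1.0, 0.0, 0.0]},
]

top = recall(memory, "engineer", "insight", [0.9, 0.1, 0.0], k=1)
```

Filtering by agent and type before ranking keeps the two collections separate, so an "improvement" never surfaces when the agent asks for "insights".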

4. Rollback paths

Every agent action must be reversible. Concretely: before a blog post is published, the draft sits in the DB; before an invoice is sent, a cancellation path is built in; before every code deploy, there is a backup branch. Nothing is "irreversible by design".
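One way to enforce this in code is to refuse any action that does not come with its own undo step. The following Python sketch assumes a hypothetical class `ReversibleLog`; the blog post dictionary is toy data standing in for the database.

```python
class ReversibleLog:
    """Pairs every action with an undo callable so a run can be rolled back."""

    def __init__(self):
        self._undo_stack = []

    def perform(self, do, undo):
        """Execute `do` and remember `undo`; an action without undo is rejected."""
        if not callable(undo):
            raise TypeError("every action needs an undo callable")
        result = do()
        self._undo_stack.append(undo)
        return result

    def rollback(self):
        """Undo all recorded actions, most recent first."""
        while self._undo_stack:
            self._undo_stack.pop()()


posts = {"post-1": "draft"}
log = ReversibleLog()
log.perform(
    do=lambda: posts.update({"post-1": "published"}),
    undo=lambda: posts.update({"post-1": "draft"}),
)
log.rollback()  # posts["post-1"] is "draft" again
```

Undoing in reverse order matters when later actions build on earlier ones, for example when a deploy depends on a commit made in the same run.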

5. EU-hosted models by default

Sensitive workloads (HR files, health data, strategy papers) stay in the EU. My default: Scaleway Mistral for text, Pixtral for multimodal tasks. US models (GPT-4, Claude) only where the use case justifies it, and then only with Standard Contractual Clauses and a Transfer Impact Assessment.
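The routing rule can be sketched as a small function. This is an assumption-laden illustration: the `Provider` type, the flag names `has_scc` and `has_tia`, and the provider labels are hypothetical and do not refer to any real SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    name: str
    region: str            # "eu" or "us"
    has_scc: bool = False  # Standard Contractual Clauses signed
    has_tia: bool = False  # Transfer Impact Assessment documented


def pick_provider(sensitive: bool, preferred: Optional[Provider], eu_default: Provider) -> Provider:
    """EU by default; a non-EU provider only for non-sensitive data with SCC and TIA."""
    if preferred is None or preferred.region == "eu":
        return preferred or eu_default
    if sensitive:
        return eu_default
    if preferred.has_scc and preferred.has_tia:
        return preferred
    return eu_default


eu = Provider("eu-text-model", "eu")
us = Provider("us-text-model", "us", has_scc=True, has_tia=True)

choice_hr = pick_provider(sensitive=True, preferred=us, eu_default=eu)      # EU provider
choice_copy = pick_provider(sensitive=False, preferred=us, eu_default=eu)   # US provider
```

Every branch that is not explicitly cleared falls back to the EU provider, so a missing flag can never route data out of the EU by accident.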

Tool rules per agent

Each agent may only do what is explicitly allowed. Example for a code-engineer agent:

agent: engineer
allowed_tools:
  - read_file
  - write_file
  - run_tests
  - create_git_commit
forbidden_tools:
  - delete_file       # only via approval
  - push_to_remote    # only via approval
  - send_email        # not their job
  - create_invoice    # not their job
data_access:
  - source_code: read-write
  - test_data: read
  - production_db: NONE

Tool rules aren't a recommendation — they're a compliance requirement. An agent allowed to do too much is a compliance risk.
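A config like this only helps if something enforces it at call time. Below is a minimal default-deny check in Python. The policy is written as a dict that mirrors the YAML (to avoid a parser dependency), and the names `ToolNotAllowed`, `check_tool_call` and `check_data_access` are hypothetical.

```python
class ToolNotAllowed(PermissionError):
    """Raised when an agent tries something its policy does not permit."""


# Mirrors the engineer policy from the config above.
ENGINEER_POLICY = {
    "agent": "engineer",
    "allowed_tools": {"read_file", "write_file", "run_tests", "create_git_commit"},
    "forbidden_tools": {"delete_file", "push_to_remote", "send_email", "create_invoice"},
    "data_access": {"source_code": "read-write", "test_data": "read", "production_db": "NONE"},
}


def check_tool_call(policy, tool):
    """Default deny: a tool must be explicitly allowed and not forbidden."""
    if tool in policy["forbidden_tools"] or tool not in policy["allowed_tools"]:
        raise ToolNotAllowed(f"{policy['agent']} may not call {tool}")


def check_data_access(policy, resource, mode):
    """`mode` is 'read' or 'write'; unknown resources count as NONE."""
    granted = policy["data_access"].get(resource, "NONE")
    allowed_modes = {"read": {"read"}, "read-write": {"read", "write"}}.get(granted, set())
    if mode not in allowed_modes:
        raise ToolNotAllowed(f"{policy['agent']} has no {mode} access to {resource}")


check_tool_call(ENGINEER_POLICY, "run_tests")                   # passes silently
check_data_access(ENGINEER_POLICY, "source_code", "write")      # passes silently
```

Tools that appear in neither list are rejected as well, so adding a new tool to the system never widens an agent's permissions by default.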

Human oversight dashboard

An admin sees all agent outputs before they reach production and can filter by agent, date and status. Approve or reject with one click. On rejection, the reason goes into memory so the agent can learn from it.
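The backend logic behind such a dashboard is small. The following Python sketch uses in-memory lists in place of the database; the function names `filter_runs` and `reject` are hypothetical, and storing the rejection reason as an "improvement" entry is an assumption about how the memory is organized.

```python
from datetime import date, datetime, timezone


def filter_runs(runs, agent=None, status=None, day=None):
    """Return runs matching every filter that is set (None means 'any')."""
    result = []
    for run in runs:
        if agent is not None and run["agent_name"] != agent:
            continue
        if status is not None and run["status"] != status:
            continue
        if day is not None and run["created_at"].date() != day:
            continue
        result.append(run)
    return result


def reject(run, reason, memory):
    """Mark a run as rejected and store the reason where the agent can recall it."""
    if not reason.strip():
        raise ValueError("a rejection needs a reason")
    run["status"] = "rejected"
    memory.append({
        "agent_name": run["agent_name"],
        "type": "improvement",  # assumption: rejections feed the improvements collection
        "source_run": run["id"],
        "content": reason,
    })


runs = [
    {"id": "r1", "agent_name": "engineer", "status": "pending",
     "created_at": datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)},
    {"id": "r2", "agent_name": "reviewer", "status": "ok",
     "created_at": datetime(2025, 3, 2, 9, 0, tzinfo=timezone.utc)},
]
memory = []

todo = filter_runs(runs, agent="engineer", status="pending")
reject(todo[0], "Tests for the new endpoint are missing", memory)
```

Requiring a non-empty reason is what turns a rejection from a dead end into training material for the next run.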

Retention periods

AI prompts and outputs are deleted after 30 days, unless there's a legal retention requirement (e.g. invoices 8 years, see e-invoice article). Don't forget: the memory system needs retention periods too. Keep insights longer (e.g. 2 years) and improvements shorter (e.g. 6 months).
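These periods can be expressed as a small purge routine. The sketch below is illustrative: the record layout, the `legal_hold` flag and the function names are assumptions, and "2 years" and "6 months" are approximated as 730 and 182 days.

```python
from datetime import datetime, timedelta, timezone

# Retention periods from the text; month and year lengths are approximations.
RETENTION = {
    "prompt_output": timedelta(days=30),
    "insight": timedelta(days=730),
    "improvement": timedelta(days=182),
}


def is_expired(record, now):
    """A record expires once it is older than its period, unless on legal hold."""
    if record.get("legal_hold", False):
        return False  # e.g. invoice-related data with a statutory retention period
    return now - record["created_at"] > RETENTION[record["kind"]]


def purge(records, now):
    """Return only the records that must be kept."""
    return [r for r in records if not is_expired(r, now)]


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "kind": "prompt_output", "created_at": datetime(2025, 11, 1, tzinfo=timezone.utc)},
    {"id": 2, "kind": "prompt_output", "created_at": datetime(2025, 12, 20, tzinfo=timezone.utc)},
    {"id": 3, "kind": "prompt_output", "legal_hold": True,
     "created_at": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"id": 4, "kind": "improvement", "created_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},
    {"id": 5, "kind": "insight", "created_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},
]

kept = purge(records, now)
```

Running such a job on a schedule and logging what it deleted gives you evidence that the retention policy is actually applied.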

What I do concretely

For a multi-agent setup on your project I deliver: architecture with clear agent roles (e.g. CEO/CTO/engineer pattern) and tool rules per role, memory system with separate insights/improvements collections, approval workflow before production actions with admin UI and email notification, audit logging in Postgres with search UI, EU-hosted models by default, AiBadge integration on every AI output, and a dashboard with agent run history.

More at /compliance/ai-agents.

Harald Schwankl
