Annex IV: The Complete Technical Guide for Financial Services Firms
EU AI Act Annex IV: what financial services firms must document, section by section. Includes template structure, compliance evidence requirements, and generation timelines.
What Is Annex IV and Who Must Comply?
Annex IV of the EU AI Act sets out the minimum content requirements for technical documentation that providers of high-risk AI systems must maintain. It is given legal force by Article 11, which states that providers must draw up technical documentation before placing a high-risk AI system on the market and must keep it up to date throughout the system’s lifecycle.
For financial services firms, the scope is broad. Annex III of the EU AI Act designates as high-risk AI systems used to evaluate the creditworthiness of natural persons or establish their credit score, and AI systems used for risk assessment and pricing in life and health insurance. Fraud detection and anti-money laundering models sit closer to the boundary: systems used purely to detect financial fraud are expressly carved out of the credit-scoring category, but models that score customers or inform decisions about natural persons may still be caught. If your firm uses an algorithmic model in any of these functions, that system very likely requires an Annex IV technical file. This applies whether the model was built in-house, procured from a vendor, or integrated via API.
Providers subject to Article 11 include not only technology vendors selling AI products, but also financial institutions that develop and deploy AI systems for their own use: the Act treats any person who develops an AI system, or has one developed, and places it on the market or puts it into service under its own name or trademark as a provider. If your credit risk team built a loan decisioning model and put it into production, your firm is the provider. The Annex IV obligation is yours.
The Seven Sections of Annex IV
Annex IV technical documentation is not a single document — it is a structured evidence file with seven mandatory components. Each section addresses a different dimension of the AI system’s governance, from its intended purpose to its post-deployment monitoring arrangements.
Section 1: General Description of the AI System
This section establishes the identity of the system: its name, version, intended purpose, the category of users it affects, and the hardware and software environment in which it operates. Regulators use this section to understand what the system is and what it does before examining how it was built or validated.
The intended purpose must be stated precisely. “Credit risk model” is insufficient. The documentation must specify the jurisdiction, the population of borrowers, the decision types the model informs, and any conditions under which it must not be used. Auditors will check whether actual deployment matches the stated intended purpose.
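To make this concrete, here is a minimal sketch of an intended-purpose statement captured as structured data rather than free text. The field names are illustrative assumptions, not terms mandated by the Act; what matters is that jurisdiction, population, decision types, and exclusions are all stated explicitly.

```python
# Hypothetical intended-purpose record; field names are illustrative,
# not mandated by the EU AI Act. The Act mandates the content, not the format.
intended_purpose = {
    "system_name": "Retail Credit Decisioning Model",
    "version": "3.2.0",
    "jurisdictions": ["UK", "IE"],
    "affected_population": "Natural persons applying for unsecured personal loans",
    "decisions_informed": ["approve/decline recommendation", "credit limit assignment"],
    "prohibited_uses": [
        "Lending decisions for SME or corporate borrowers",
        "Deployment outside the listed jurisdictions without re-validation",
    ],
}
```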
Section 2: Risk Classification and Business Purpose
Section 2 must establish why this system is classified as high-risk and how that classification maps to the firm’s internal risk framework. It should state the specific Annex III category under which the system falls, reference the business function it supports, and describe the potential consequences of incorrect outputs.
For an AML model, this means documenting that false negatives create financial crime risk and that false positives create customer harm and potential discrimination exposure. Both failure modes must be addressed.
Section 3: Accuracy and Performance Metrics
Firms must document the KPIs used to measure model accuracy, the validation methodology, the datasets used for testing, and the results obtained. Accuracy metrics must be appropriate to the task — an AML model might report precision, recall, and F1 score across different transaction risk segments; a credit scoring model might report Gini coefficient, AUC-ROC, and accuracy across demographic cohorts.
The EU AI Act requires that performance be documented across relevant population subgroups where bias testing is applicable. Firms cannot simply report headline accuracy and move on — they must demonstrate that the model performs equitably across protected characteristics, or justify why testing was not required for their specific use case.
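As an illustration of what Section 3 evidence can look like in practice, the sketch below computes per-cohort metrics with scikit-learn. It assumes a pandas DataFrame holding true labels (y_true), model scores (y_score), and an illustrative cohort column; none of these names come from the Act.

```python
# Minimal sketch of Section 3 metric evidence, assuming scikit-learn and pandas.
# Column names (y_true, y_score, cohort) are illustrative assumptions.
import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

def cohort_metrics(df: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Compute per-cohort performance metrics for the technical file."""
    rows = []
    for cohort, g in df.groupby("cohort"):
        y_pred = (g["y_score"] >= threshold).astype(int)
        auc = roc_auc_score(g["y_true"], g["y_score"])
        rows.append({
            "cohort": cohort,
            "precision": precision_score(g["y_true"], y_pred),
            "recall": recall_score(g["y_true"], y_pred),
            "f1": f1_score(g["y_true"], y_pred),
            "auc_roc": auc,
            "gini": 2 * auc - 1,  # Gini coefficient derived from AUC
        })
    return pd.DataFrame(rows)
```

Serialised alongside a reference to the validation dataset, a table like this becomes direct evidence for the subgroup performance requirement described above.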
Section 4: Training Data and Data Governance
Section 4 requires documentation of the datasets used to train, validate, and test the AI system. This includes the data sources, the period of data collection, any preprocessing or filtering applied, and the data governance policies under which training data was managed.
Firms must also address data quality and potential sources of bias in the training set. If historical training data reflects past discriminatory lending decisions, the model trained on that data may perpetuate those patterns. Annex IV requires this to be identified, documented, and addressed. Bias assessment methodology and remediation steps must both be present in the file.
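The Act does not prescribe a single bias metric. One common, illustrative choice is the disparate impact ratio: the positive-outcome rate of each group relative to the most-favoured group. A minimal sketch, assuming pandas and hypothetical column names:

```python
# Illustrative training-data bias check, assuming pandas.
# group_col and label_col are hypothetical column names, not Act terminology.
import pandas as pd

def disparate_impact(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    rates = df.groupby(group_col)[label_col].mean()  # positive-outcome rate per group
    return rates / rates.max()  # 1.0 = parity; values below ~0.8 are often flagged

# Example: disparate_impact(training_df, group_col="age_band", label_col="approved")
```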
Section 5: Human Oversight Measures
Article 14 of the EU AI Act requires high-risk AI systems to be designed to allow human oversight throughout their operation. Section 5 must document how oversight is implemented in practice: which decisions are reviewed by humans before action is taken, which can be acted upon autonomously, what the escalation path is when the model produces an output below a confidence threshold, and how humans can intervene to override or stop the system.
This section is not satisfied by a generic statement that “a compliance officer reviews alerts.” The documentation must be specific: which role reviews, within what timeframe, under which circumstances, and with what authority to act.
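A minimal sketch of what that specificity can look like in code, assuming a firm-set confidence threshold and a hypothetical reviewer role; the names are illustrative, not drawn from the Act:

```python
# Sketch of threshold-based escalation for Section 5.
# CONFIDENCE_THRESHOLD and the reviewer role are assumptions set by the
# firm's oversight policy, not values prescribed by the EU AI Act.
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.90

@dataclass
class Decision:
    applicant_id: str
    score: float
    route: str                      # "auto" or "human_review"
    reviewer_role: str | None = None

def route_decision(applicant_id: str, score: float) -> Decision:
    if score >= CONFIDENCE_THRESHOLD:
        return Decision(applicant_id, score, route="auto")
    # Below threshold: escalate to the role named in the oversight protocol.
    return Decision(applicant_id, score, route="human_review",
                    reviewer_role="senior_credit_officer")
```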
Section 6: Significant Changes Log
Every material change to a high-risk AI system must be recorded: retraining on new data, changes to model architecture, threshold adjustments, changes to the intended purpose, and changes to human oversight arrangements. Each entry must document what changed, why, when, and who authorised it.
The EU AI Act distinguishes between changes that constitute a “substantial modification” — which may trigger a new conformity assessment — and routine changes that fall within the scope of the existing technical file. Firms must have a change classification policy and apply it consistently.
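A hedged sketch of a change-log entry follows, with illustrative field names; the substance each entry must carry (what, why, when, who, and the classification) comes from the requirements above:

```python
# Hypothetical change-log record for Section 6; field names are illustrative.
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class ChangeEntry:
    changed_on: date
    description: str      # what changed
    rationale: str        # why it changed
    authorised_by: str    # who signed off
    classification: str   # "routine" or "substantial_modification"

entry = ChangeEntry(
    changed_on=date(2026, 3, 14),
    description="Retrained on 2025 H2 application data; threshold 0.52 to 0.55",
    rationale="Quarterly retraining cycle; drift detected in applicant mix",
    authorised_by="Head of Model Risk",
    classification="routine",  # per the firm's change classification policy
)
```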
Section 7: Post-Market Monitoring Plan
Section 7 is often overlooked during initial compliance projects, but it is required from day one. The post-market monitoring plan must describe how the firm will track model performance in production, what thresholds trigger a review or remediation, how incidents are identified and reported, and the schedule for periodic re-validation.
For financial services AI, this plan must integrate with the firm’s existing model risk management framework and incident reporting obligations. A credit scoring model that drifts materially from its validation performance may simultaneously trigger an Annex IV monitoring obligation, a PRA SS1/23 periodic review requirement, and a DORA-related incident classification.
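Drift tracking is one concrete piece of such a plan. The sketch below computes a population stability index (PSI) against the validation baseline, assuming numpy; the 0.1 and 0.25 thresholds are a common industry convention, not an EU AI Act requirement:

```python
# Minimal PSI sketch for the post-market monitoring plan, assuming numpy.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Compare the production score distribution against the validation baseline."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    e_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    a_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    e_pct = np.clip(e_pct, 1e-6, None)  # guard against log(0)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

# Convention: psi < 0.1 stable; 0.1 to 0.25 monitor; > 0.25 trigger re-validation
```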
How Long Does Manual Annex IV Production Take?
For a single high-risk AI system with no pre-existing documentation infrastructure, manual Annex IV production typically takes between six and ten weeks from initiation to sign-off. This estimate assumes experienced compliance and technical writing staff.
The timeline includes: two to three weeks for internal technical interviews with model owners and engineers; one to two weeks for evidence gathering from data science, IT, and risk teams; one week for the first draft; two weeks for legal review and amendment; and one week for senior sign-off. Each model is independent: a firm with twelve high-risk AI systems faces twelve separate productions, each with its own legal review cycle.
| Section | Manual Time | Evidence Type | Audital Auto-Generates |
|---|---|---|---|
| 1. General Description | 3–5 days | System documentation, architecture diagrams | ✓ Yes |
| 2. Risk Classification & Business Purpose | 2–3 days | Risk assessment, use-case documentation | ✓ Yes |
| 3. Accuracy & Performance Metrics | 5–10 days | Validation reports, KPI dashboards, test datasets | ✓ Yes |
| 4. Training Data & Data Governance | 5–10 days | Data lineage records, bias assessments, preprocessing logs | ✓ Yes |
| 5. Human Oversight Measures | 2–3 days | Oversight protocols, intervention logs, escalation procedures | ✓ Yes |
| 6. Significant Changes Log | 3–5 days | Change records, version history, justification memos | ✓ Yes |
| 7. Post-Market Monitoring Plan | 5–7 days | Monitoring schedules, incident escalation procedures | ✓ Yes |
What Audital Generates Automatically
Audital’s Annex IV auto-generation reads the audit trail that has been continuously building throughout the model’s lifecycle — every deployment event, every validation record, every approval, every change, every monitoring check — and uses that structured data to generate the Annex IV technical file.
Because the evidence was captured at the time each event occurred — not assembled retrospectively — the resulting Annex IV is not merely a document: it is a cryptographically anchored record. Each section references specific audit events by their SHA-256 hash and RFC 3161 timestamp. A regulator examining the file can verify independently that the evidence pre-dates the documentation, not the other way around.
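As a sketch of what that independent verification can look like, assume an audit event stored as canonical JSON alongside its published SHA-256 digest. The record format here is a hypothetical illustration, not Audital’s actual schema, and RFC 3161 verification additionally requires the timestamp authority’s signed token, which is omitted:

```python
# Hedged sketch: recompute an event's SHA-256 and compare to the published digest.
# The canonical-JSON record format is an assumption, not Audital's schema.
import hashlib
import json

def verify_event(event: dict, published_sha256: str) -> bool:
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest() == published_sha256
```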
Generation time for a model with an active audit trail is typically under three minutes. For a firm with twelve high-risk AI systems, that is twelve Annex IV technical files in under an hour, compared with six to ten weeks each under manual production.
The 2 August 2026 Deadline
The EU AI Act’s obligations for high-risk AI systems under Annex III apply from 2 August 2026. After that date, providers of high-risk AI systems that do not have compliant Annex IV technical documentation are in breach of Article 11. This is not a grace period; it is the enforcement date.
The consequences of non-compliance are material. For breaches of provider obligations such as the Article 11 documentation requirement, Article 99 provides for administrative fines of up to €15 million, or 3% of total worldwide annual turnover in the preceding financial year, whichever is higher. For a mid-sized financial institution, the turnover-based figure is likely to be the binding constraint. These fines sit alongside, not instead of, any parallel FCA enforcement action for the same underlying governance failure.
Firms that have not begun Annex IV documentation as of March 2026 have approximately five months to produce compliant technical files for every high-risk AI system in production. At six to ten weeks per model with manual processes, that is sufficient time for one or two models, not the inventory most regulated firms carry.
Generate Your Annex IV
See Annex IV auto-generation in the Sandbox
The Audital sandbox contains a pre-built AI model inventory with a running audit trail. Generate a sample Annex IV technical file — all seven sections, timestamped and hashed — without connecting your production systems.
Generate a sample Annex IV in the Sandbox →
Audital Compliance Team
audital.ai