Key Takeaways
- The GDPR and the AI Act address different problems, but they do so in ways that don’t align at a structural level.
- The GDPR expects clarity at the moment of data collection, while the AI Act assumes stability at the point of deployment; neither reflects how AI actually evolves.
- Organisations aren’t just following rules; they’re navigating contradictions between data protection principles and AI system requirements.
- Both frameworks promise visibility into how AI systems use data, yet neither gives individuals meaningful insight into whether their data was actually used.
- As AI systems scale and regulations diverge globally, the biggest challenge is not breaking the rules, but trying to follow two systems that don’t fit together.
Tsugite. That’s the name of a traditional Japanese joinery technique in which each piece of wood is precisely crafted to interlock with the other, creating strength through design alone. That’s how the GDPR and the EU AI Act were meant to work together. But in reality, they don’t.
While both regulatory frameworks promise to deliver safer, more accountable AI, what’s rarely said out loud is that they were built in different moments, for different risks, and on very different assumptions about how technology works. It’s true they overlap in places, but they don’t really connect. So, the real story behind the GDPR and the EU AI Act isn’t alignment. It’s friction.
This friction isn’t abstract, and it isn’t just general tension either; it shows up in very specific places. Because it directly affects how companies operate, let’s work through those places one by one.
Tension 1: Data Minimisation vs What AI Actually Needs
GDPR’s data minimisation principle says organisations must collect only what’s necessary for a specific purpose. That logic holds in a world where a company is, for instance, building a CRM system or processing payroll. But it breaks down in the world of AI, where the whole point is to train on as much data as possible.
That’s because developers don’t know which data will turn out to be useful until after training has ended. That’s not a flaw in the technology; it’s fundamental to how training AI systems actually works.
Simply put, you can’t train an AI model to detect fraud just by showing it labelled fraud examples. To learn properly, the model needs enormous volumes of normal transactions, so it can learn for itself what deviation looks like and how it typically occurs. At the moment of data collection, it’s nearly impossible to say, with the precision the GDPR expects, which data is strictly necessary, because that only becomes clear after the model has processed the data it needs in order to learn a specific thing.
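To make that training dynamic concrete, here is a minimal sketch of the fraud example using an anomaly detector fitted only on normal transactions. The library choice (scikit-learn's IsolationForest), the synthetic data, and the feature names are illustrative assumptions, not a reference implementation:

```python
# Minimal sketch: the model learns what "normal" looks like by ingesting
# large volumes of ordinary transactions; no labelled fraud examples needed.
# Library choice and feature names are illustrative assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

# Thousands of routine transactions: amount, hour of day, merchant category.
# This bulk of "unremarkable" data is exactly what data minimisation is hard
# to square with.
rng = np.random.default_rng(0)
normal_transactions = np.column_stack([
    rng.normal(50, 15, 10_000),       # typical amounts
    rng.integers(8, 22, 10_000),      # typical hours of day
    rng.integers(0, 20, 10_000),      # merchant category codes
])

# Fit only on normal behaviour.
model = IsolationForest(contamination=0.01, random_state=0)
model.fit(normal_transactions)

# A new transaction is flagged because it deviates from the learned baseline,
# not because it matches a known fraud pattern. predict() returns -1 for
# anomalies and 1 for inliers.
suspicious = np.array([[9_500.0, 3, 42]])
print(model.predict(suspicious))
```

The point the sketch makes is the legal one: until the baseline has been learned, there is no principled way to say which of those ten thousand routine records was "strictly necessary".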
Example: A European health tech company wants to build a diagnostic support tool using historical patient records. But to make the tool accurate, it needs a lot of sensitive health data. Under the GDPR, that usually means getting clear consent from patients or relying on a research exemption. In reality, the rule about using only the minimum necessary data doesn’t fit well with how much data the model actually needs. The research option helps, but its conditions (like limiting commercial use) don’t match how most health tech companies operate. So, businesses end up with three choices: compromise on model quality, work in a legal grey area, or relocate the training workload outside the EU.
So, on the one hand, the GDPR requires companies to minimise data use and rely on a valid legal basis, such as consent or research exemptions. But, on the other hand, under the AI Act, high-risk AI systems must be trained on high-quality data that properly reflects real-world situations, and companies need to clearly document how they collect, manage, and use that data. Needless to say, these requirements pull in opposite directions, and there’s no clean legislative answer bridging them.
In May 2025, a German court dismissed a case against Meta over its use of public Facebook and Instagram posts to train an AI model, finding that this could fall under “legitimate interests” under the GDPR. It was an important decision, but it was just one court. At the EU level, it’s still unclear which legal basis applies to large-scale AI training on personal data.
From a GDPR perspective, this raises some important questions: What is the lawful basis for training AI models? Can a sufficiently specific purpose be defined at the training stage? How can the data minimisation requirement be satisfied when relevance is only discovered after training?
If we layer the AI Act on top of this, we see that it mainly classifies systems based on their intended use and risk level. That works well for clearly defined applications like credit scoring or biometric identification. But things get more complicated with general-purpose systems, where the original purpose might not fully match how they’re used later.
So, we end up with a mismatch: while the GDPR struggles with the openness of purpose at the training stage, the AI Act struggles with the unpredictability of how systems are used later. The conclusion? Neither framework fully captures the lifecycle of these models end to end.
Tension 2: Case-by-Case vs Pre-Set Risk Categories
To begin with, the GDPR is built around context, with every case being assessed on its own terms (e.g. who the controller is, what the legal basis is, what risks exist for real people, and whether those risks can be justified or reduced). The Data Protection Impact Assessment (DPIA) under Article 35 requires organisations to look at each specific use case, identify the risks, and then apply appropriate safeguards.
The AI Act takes a fundamentally different route, classifying AI systems into fixed risk categories based on their intended use rather than how they’re deployed in real-world contexts. The tension appears when this logic collides with the GDPR’s.
For instance, a system may be classified as high-risk under the AI Act yet pose minimal actual risk in practice. Conversely, a system that falls outside the Act’s high-risk categories may still raise serious GDPR concerns. The mismatch exists because the AI Act focuses on use-case categories while the GDPR focuses on data processing practices and individual rights. Compliance with one doesn’t guarantee compliance with the other.
Example: If a company uses an automated shortlisting tool internally and follows a transparent process that includes real human review and properly informs candidates, the actual risk may be quite low, given the safeguards in place. But under the AI Act, it’s still high-risk, with all the formal requirements that come with it. As a result, the company may end up running two separate risk assessments that don’t really speak to each other and may reach conflicting conclusions.
There’s also a more subtle mismatch. Under Article 22 of the GDPR, individuals have the right not to be subject to solely automated decisions with legal or similarly significant effects, and where such decisions are made, to obtain meaningful human intervention. The AI Act requires human oversight for high-risk systems, but that’s not the same thing. If a human is simply approving 95% of what the system suggests, that may meet the AI Act’s requirements, but it likely doesn’t amount to meaningful intervention under Article 22.
This shows, once again, that these frameworks weren’t designed to align at this level of detail, and that gap is becoming increasingly visible as more AI systems are deployed.
Tension 3: The Dual-Role Problem
This is the tension that causes the most genuine confusion in practice. That’s because modern AI deployment doesn’t fit into neat legal boxes. Organisations typically wear multiple hats simultaneously, including but not limited to the following:
- A company that builds an AI model on customer data is an AI provider under the Act and a data controller under the GDPR.
- A company that deploys a third-party AI model in its operations is an AI deployer under the Act, but under the GDPR, it could be a data processor or a data controller, depending on how it uses the data.
- A company that white-labels a third-party AI model and distributes it to customers is a distributor or importer under the Act, and the data protection role depends entirely on what data flows are involved.
Both laws attach different compliance obligations to these roles, and the AI Act itself doesn’t clearly resolve this mapping.
Example: Take a European insurer building a claims assessment model using its own data. Under the AI Act, assessing insurance claims is likely to fall within high-risk use cases, as it affects access to financial services. The insurer becomes the provider and has to meet requirements on risk management, data governance, transparency, and registration.
At the same time, under the GDPR, it’s also the data controller, which means it must comply with the conditions for processing sensitive data, carry out a DPIA under Article 35, provide notices under Articles 13 and 14, and ensure the Article 22 automated decision-making provisions are addressed.
Both frameworks require documentation explaining how the system works. But the documentation templates, level of detail, and intended audience are different. There is currently no standardised way to produce one document that satisfies both sets of documentation requirements. In practice, this leads to duplicate documentation that ticks compliance boxes without improving how the system is governed.
The European Commission attempted to address some of these issues through the Digital Omnibus package, proposed in 2025. However, the proposals have been criticised, including by digital rights groups, for weakening protections rather than coherently integrating the frameworks. The proposals have also attracted scrutiny over the potential influence of industry lobbying, which is a separate problem worth watching.
Tension 4: The Training Data Transparency Gap
Within the data minimisation debate, one issue deserves particular attention: what happens when personal data is used to train AI models without people knowing?
Under the GDPR, Articles 13 and 14 say individuals should be informed about how their data is used. But in practice, when personal data ends up in a training set, whether scraped from public web pages, purchased from data brokers, or reused from earlier services, the original data subjects rarely receive notification. Some of them might not even know their data is there, so they can’t exercise their rights.
The AI Act tries to improve transparency. From August 2025, providers of general-purpose AI models have to publish summaries of their training data sources. The Commission released an updated Public Summary Template requiring basic disclosure of training datasets.
This sounds like progress, and to a point, it is. But the level of disclosure is deliberately limited to protect trade secrets. You’ll typically get a high-level description of dataset categories, not the kind of granular information that would allow individuals to determine whether their data was used or to exercise their GDPR rights.
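To see why category-level disclosure doesn’t help individuals, here is a purely hypothetical sketch; the summary keys and values below are invented for illustration and are not the Commission’s Public Summary Template:

```python
# Purely hypothetical, simplified category-level disclosure; NOT the
# Commission's actual template.
public_summary = {
    "web_data": "Publicly available web pages crawled between 2022 and 2024",
    "licensed_data": "Commercially licensed news and forum archives",
    "provider_data": "Content collected through the provider's own services",
}

# What an individual would need in order to exercise GDPR rights, none of
# which appears at this level of disclosure:
needed_to_exercise_rights = [
    "whether a specific account, record, or document was included",
    "which source each record came from and when it was collected",
    "a contact point and process for access, rectification, or erasure",
]

# There is simply nothing in the summary to match a person against.
print("patient-4711" in str(public_summary))  # False: no record-level detail
```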
So, you end up with two transparency regimes, both aiming at the same problem, but working in different ways and producing different results. And neither really solves the core issue. If your medical records, social media activity, or customer service transcripts were used to train an AI model, there’s still no clear, practical way to find out or do much about it.
The US Enters the Scene and Things Get More Complicated
Everything we’ve discussed so far sits within the EU. But in December 2025, the “Ensuring a National Policy Framework for Artificial Intelligence” Executive Order shifted the conversation, and not in a helpful way.
At a high level, the EO pushes towards a single federal AI framework that challenges conflicting state laws. The direction is clear: fewer obligations, less friction, and a “minimally burdensome” approach. That phrasing isn’t accidental; in practice, it means deregulation, in direct contrast to the EU’s approach.
Here’s where things start to clash: a US framework designed to minimise requirements will, by definition, fall short of what EU law expects, ranging from transparency obligations under the GDPR to disclosure rules in the AI Act.
For any company operating on both sides of the Atlantic — which captures essentially every major AI company — this creates an immediate structural conflict. Because in practice, they can’t design a single disclosure system that satisfies both frameworks. If they meet EU standards, they go beyond what US law requires. If they follow the US “minimally burdensome” standard, they risk non-compliance in the EU.
There’s also a deeper issue around data transfers. Under the GDPR, personal data can only move to countries with “adequate” protection. If the US framework positions itself as “minimally burdensome” and begins overriding state laws that provide protections equivalent to what EU residents enjoy, that raises questions about whether the current EU-US Data Privacy Framework can be sustained. We’ve been here before: the original Privacy Shield was invalidated precisely because US surveillance law failed to meet the protection standards EU residents were entitled to, and it didn’t end well.
Another issue for US-based AI companies operating in the EU is that the AI Act applies based on where the AI system is used, not where the provider is based. So, a US company offering AI in the EU has to follow EU rules, regardless of US law. That’s because the EO can override state laws, but it can’t override EU regulation.
The result is a split compliance reality that will be expensive and very complex to maintain. While large companies will likely manage, many mid-sized ones are simply not equipped to do so.
What All This Means in Practice
As of early 2026, most organisations are navigating frameworks that still don’t fully tell them what to do. Key rules aren’t in force yet, proposals are still being debated, and the US approach is more direction than law. But the trajectory is clear, and the tensions described above aren’t going away. In practice, that breaks down into a few key points, as follows:
- The data minimisation tension – The most defensible position is to treat GDPR’s legitimate interests basis as genuinely requiring a well-documented balancing exercise. The German court’s Meta ruling gives you some runway, but relying on one national court ruling for your entire data governance strategy is not a robust approach.
- The dual-role documentation problem – Until there’s clear guidance from the AI Office or DPAs on how DPIA and conformity assessment documentation can be integrated, you’re going to be producing parallel documents. Rather than running two completely separate workstreams that never talk to each other, the practical approach is to design your internal processes so the same underlying information feeds both (see the sketch after this list).
- The US federal framework – If you’re a US company operating in the EU, design for EU requirements, not US federal minimums. The AI Act’s extraterritorial reach is real, the penalties are significant, and US federal law will not protect you in front of a European regulator. Meeting the higher bar is the safer bet.
- The adequacy question – If the US starts scaling back transparency and data protection standards, you need to revisit how you handle data transfers. It’s risky to assume the EU-US Data Privacy Framework will survive a sustained US federal campaign to roll back AI transparency law.
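As a rough illustration of the dual-role documentation point above, here is a hypothetical sketch of a single internal record feeding both a DPIA-style section and an AI Act-style technical documentation section. The field names and rendered sections are assumptions for illustration, not official templates:

```python
# Hypothetical sketch: one internal record, two regulatory views.
# Field names are illustrative, not taken from any official template.
from dataclasses import dataclass, field


@dataclass
class AISystemRecord:
    name: str
    purpose: str
    legal_basis: str                 # GDPR: e.g. consent, legitimate interests
    data_categories: list[str]       # shared: what data is processed
    risk_level: str                  # AI Act: e.g. "high-risk"
    human_oversight: str             # shared: how humans can intervene
    safeguards: list[str] = field(default_factory=list)


def dpia_section(r: AISystemRecord) -> str:
    """Render the GDPR-facing view: purpose, legal basis, data, mitigations."""
    return (
        f"Processing: {r.purpose}\n"
        f"Legal basis: {r.legal_basis}\n"
        f"Data categories: {', '.join(r.data_categories)}\n"
        f"Mitigations: {', '.join(r.safeguards)}"
    )


def ai_act_tech_doc_section(r: AISystemRecord) -> str:
    """Render the AI Act-facing view: intended purpose, risk class, oversight."""
    return (
        f"Intended purpose: {r.purpose}\n"
        f"Risk classification: {r.risk_level}\n"
        f"Human oversight measures: {r.human_oversight}\n"
        f"Data governance: {', '.join(r.data_categories)}"
    )


record = AISystemRecord(
    name="claims-assessment-model",
    purpose="Assess insurance claims for payout eligibility",
    legal_basis="Article 6 basis plus Article 9(2) condition (documented)",
    data_categories=["claims history", "health data", "payment data"],
    risk_level="high-risk (access to essential private services)",
    human_oversight="Claims handler reviews and can override every decision",
    safeguards=["pseudonymisation during training", "access controls"],
)

print(dpia_section(record))
print(ai_act_tech_doc_section(record))
```

The design choice being illustrated is simply that the duplication sits in the rendering, not in the underlying facts: keep one record per system and generate both documents from it, rather than maintaining two diverging sources of truth.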
The tensions described above aren’t isolated issues. They all point to the same underlying problem. And that problem has a clear shape. Think back to the tsugite joint, which is a highly precise interlocking design that creates strength without nails or glue. The same level of precision is required for the GDPR and the EU AI Act to fit together and function as intended.
But that precision can only be achieved when each piece is shaped with the other in mind. In reality, the GDPR and the AI Act “were cut” in different workshops, by different craftsmen working from different blueprints at different times. The joint fails not because the design was wrong, but because no one ever sat down to compare measurements. So, the coordination we’re all waiting for still hasn’t happened. And until it does, the gap between the two frameworks isn’t an abstraction; it’s where compliance risk actually sits.
Extra Sources and Further Reading
- Balancing AI Development and Privacy Compliance – An Analysis of the Principles of ‘Complete Data Sets’ in the AI Act and ‘Data Minimization’ in the GDPR – Lund University
https://lup.lub.lu.se/student-papers/search/publication/9194982
This legal thesis analyses the conflict between GDPR data minimisation and the AI Act’s need for complete datasets.
- AI data governance – overlaps between the AI Act and the GDPR – Taylor & Francis Online
https://www.tandfonline.com/doi/full/10.1080/17579961.2026.2633677
In this academic paper, the authors examine the conceptual differences and overlaps between the GDPR and the EU AI Act frameworks.
- Policy-Driven Data Minimization Techniques in AI Pipelines for Enhanced Privacy-by-Design – ResearchGate
https://www.researchgate.net/publication/397933034
This paper explores the challenge of balancing data minimisation principles with the practical demands of AI development and deployment.
- Machine Learners Should Acknowledge the Legal Implications of Large Language Models as Personal Data – Cornell University (arXiv)
https://arxiv.org/html/2503.01630v2
This article argues that AI models may contain personal data, raising major GDPR implications.

