Model Provenance: 3 Critical Vendor Questions

A US State Department cable on Chinese AI distillation has turned model provenance into an immediate vendor due diligence question for EU deployers. Three concrete asks for every supplier this quarter, and where the EU AI Act helps but does not finish the job.
[Image: a glass distillation apparatus with a row of derivative receiving vessels behind it, an analogy for an original AI system and its distilled copies.]

A US State Department cable dated 24 April 2026 told embassies worldwide to warn partner governments about Chinese firms distilling US AI models. DeepSeek, Moonshot AI and MiniMax are named. Most coverage will treat this as a foreign-policy story; for anyone running AI inside an EU organisation, it is a documentation problem. The cable claims that distilled models can have their security protocols stripped and their neutrality controls undone. If that is even partly true, model provenance is no longer a curiosity. It is now a vendor due diligence question your compliance officer needs to answer this quarter.

Why model provenance is a deployer problem, not a provider problem

Most AI vendor reviews stop at uptime, security accreditations and contract terms. Provenance disputes change the brief. If a model marketed under one name was distilled from a different upstream system, the licensed model may carry inherited behaviours, biases or stripped safeguards. The vendor did not train those properties into the model; you will be answering for them anyway.

The cable explicitly warns that distillation campaigns “deliberately strip security protocols from the resulting models”. Whether you accept the geopolitical framing is beside the point; the technical claim has implications for any deployer relying on a vendor’s safety story.

This is not theoretical. Anthropic stated in February 2026 that DeepSeek, Moonshot and MiniMax used 24,000 fraudulent accounts to conduct 16 million exchanges with its Claude model. OpenAI has made similar accusations. Whatever the truth of any specific claim, the broader pattern is now part of the public record: provenance disputes about widely deployed models are real, contested and unlikely to be resolved through vendor reassurance alone.

What changes for the deployer is the burden of evidence. Telling your board “the vendor said the model is safe” is no longer a defensible position when a major government has formally challenged that vendor’s training story. You need model provenance evidence, not provider assurance. The same shift in burden has already played out in adjacent vendor governance disputes, and it now applies one layer further upstream.

Three model provenance questions to ask vendors this quarter

Three concrete asks should appear in every active AI vendor relationship and in every new procurement. They are simple to ask, and the answers are revealing.

Documented model lineage

Ask the vendor for a written model lineage: which base model the deployed system descends from, what training data sources were used, and whether any distillation, fine-tuning or model merging occurred. The EU AI Act’s Article 53 already requires GPAI providers to publish a public summary of training content; that obligation has been live since August 2025. Treat it as your floor, not your ceiling. If a vendor cannot describe their own model lineage in writing, that is itself the answer.
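To make the ask concrete, a lineage record can be as small as a structured file the vendor completes and attests to. The sketch below is illustrative only: the ModelLineage name and every field are assumptions, not a regulatory standard.

```python
from dataclasses import dataclass, field

@dataclass
class ModelLineage:
    """Hypothetical lineage record a vendor completes and signs off."""
    deployed_model: str             # name and version of what you actually run
    base_model: str                 # upstream system it descends from
    training_data_summary_url: str  # the vendor's Article 53 public summary
    was_distilled: bool             # was knowledge distillation applied?
    was_fine_tuned: bool            # was the base model fine-tuned?
    merged_models: list[str] = field(default_factory=list)  # any model merging
    attested_by: str = ""           # who at the vendor signed, and when
```

A vendor who cannot fill in even a form this short is giving you your answer in a different format.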

Independent safety re-testing

Ask for evidence that the safety mechanisms claimed by the upstream model are still present and functional after any modification. A distilled model is, by design, a different model. Red-team results, refusal-rate tests and bias evaluations should be re-run on the version you are deploying, not inherited from the parent system’s marketing material. If the vendor relies entirely on a third party’s safety report, that report belongs in your file too.
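One way to spot-check the re-test yourself is a small harness run against the deployed version. The sketch below assumes a hypothetical query_model() client and a crude keyword heuristic for detecting refusals; both are placeholders for the vendor's real API and a proper evaluation suite.

```python
# Minimal sketch of a refusal-rate re-test. The prompt set, the query_model()
# client and the keyword heuristic are all assumptions; substitute the
# deployed model's actual API and your own evaluation harness.

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm unable", "i won't")

def query_model(prompt: str) -> str:
    """Placeholder for a call to the model version you actually deploy."""
    raise NotImplementedError("wire this to the deployed model's API")

def refusal_rate(disallowed_prompts: list[str]) -> float:
    """Fraction of disallowed-content prompts the model refuses outright."""
    refusals = sum(
        1
        for prompt in disallowed_prompts
        if any(m in query_model(prompt).lower() for m in REFUSAL_MARKERS)
    )
    return refusals / len(disallowed_prompts)

# Compare the result against the rate claimed for the parent model: a sharp
# drop after distillation or fine-tuning is exactly the signal to look for.
```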

Notification of upstream change

Ask for a contractual notification clause covering any change in upstream model dependencies. If a vendor is repackaging a model and quietly switches the underlying system, you need to know before your users do. The logic is the same as sub-processor notification under the GDPR; treat it as a supplier governance requirement, not a courtesy. Document the response. Then document who in your organisation gets notified when the answer changes.

Where the EU AI Act helps and where it does not

Article 53 is helpful but partial. The transparency obligations on GPAI providers, including the public training-data summary and the technical documentation that downstream providers can request, have been applicable since 2 August 2025. The Commission’s enforcement powers begin on 2 August 2026, roughly three months from now. The instruments are real and the deadline is short.

The limit is structural. Article 53 covers providers, not deployers. The deployer still has to verify the evidence, store it, update it and produce it on request. The Act gives you a right to ask; it does not give you a right to assume the answer is true. For practical purposes, model provenance verification belongs inside the deployer’s own AI inventory and governance system, alongside risk classification and human oversight records.

Provenance records as the new minimum dataset

Until recently, an AI inventory typically held the system name, the vendor, the use case, the risk classification and the human oversight arrangements. From this quarter, model provenance evidence belongs in the same record: the documented lineage, the safety re-test, the notification clause. The State Department cable will not be the last public dispute over a major model’s training story. The deployers who are ready will be those who already know exactly what they have bought, and from whom.
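As a sketch of what that minimum dataset could look like in a structured inventory, the schema below combines the classic fields with the three provenance fields; the record name and every field are assumptions, and neither the AI Act nor any standard mandates this layout.

```python
from dataclasses import dataclass

@dataclass
class AIInventoryRecord:
    """Hypothetical deployer inventory entry; not a mandated schema."""
    # The classic minimum dataset:
    system_name: str
    vendor: str
    use_case: str
    risk_classification: str     # e.g. limited-risk vs high-risk
    human_oversight: str         # who reviews outputs, and how
    # The provenance evidence added this quarter:
    lineage_document: str        # reference to the vendor's written lineage
    safety_retest_report: str    # re-run red-team and refusal-rate results
    upstream_change_clause: str  # contract clause requiring notification
    change_notice_owner: str     # who in your organisation gets notified
```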

Add model provenance to your AI inventory this quarter, before the next public dispute makes it urgent.
