Meeting FDA Cybersecurity Guidance for Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submissions
Summary
The post argues that the FDA's January 2025 AI-Enabled Device guidance identifies seven AI-specific cyber threats, but only one (model evasion) lives at runtime on-device. The other six — data poisoning, model inversion/stealing, data leakage, overfitting, model bias, and performance drift — attack the upstream pipeline: training data, model registries, evaluation logic, and deployment mechanisms. Most testing programs today only cover the deployed model, which means they're addressing one-seventh of what the FDA is explicitly asking about. ELTON's approach expands the digital twin to include the full AI pipeline as an associated system, producing traceable test coverage across all seven threat categories for premarket submissions.
The FDA’s AI Cybersecurity Requirements Go Way Beyond the On-Product Model
The FDA’s January 2025 draft guidance on AI-Enabled Device Software Functions (Artificial Intelligence-Enabled Device Software Functions: Lifecycle Management and Marketing Submission Recommendations) includes a cybersecurity section that should be keeping medical device security teams up at night. Not because it introduces radically new concepts, but because it makes explicit what many manufacturers have been hoping they could avoid: penetration testing and threat mitigation testing for AI-specific vulnerabilities that exist well beyond the trained model at runtime.
This isn’t just “test the model.” This is “test everything that touches the model.”
What the Guidance Actually Says
Section XII (pages 34–36) identifies seven categories of AI-specific cyber threats that sponsors of “cyber devices” under Section 524B(c) of the FD&C Act must address in their marketing submissions:
Data Poisoning — Deliberately injecting inauthentic or maliciously modified data into the pipeline, corrupting outcomes in areas like medical diagnosis. This isn’t a runtime attack on the deployed model. This is an attack on the process that creates the model.
Model Inversion and Stealing — Using forged or altered data to infer details from or replicate models. This threatens continued model performance, intellectual property, and patient privacy simultaneously.
Model Evasion — Intentionally crafting or modifying input samples to deceive models into incorrect classifications. This one targets runtime inference, and it’s the threat most testing programs already think about. It’s also the one that’s insufficient on its own.
Data Leakage — Exploiting vulnerabilities to access sensitive training or inference data. The attack surface here isn’t the model’s prediction endpoint. It’s the storage, transit, and access controls around the data that built the model.
Overfitting — Deliberately driving a model to overfit its training data, exposing AI components to adversarial attack when they encounter modified patient data. This is a subtle, long-game attack that degrades the model’s ability to generalize.
Model Bias — Manipulating training data to introduce or accentuate biases, exploiting known biases with adversarial examples, embedding backdoors during training to trigger biased behaviors later, or leveraging pre-trained models with inherent biases and amplifying them with skewed fine-tuning data.
Performance Drift — Changing the underlying data distribution over time to degrade model performance, causing inaccurate predictions or increased susceptibility to adversarial attacks.
Read that list carefully. Only one of those seven threats, model evasion, is primarily a runtime, on-device concern. The other six attack the pipeline upstream of the deployed model: the data, the training environment, the model registry, the evaluation logic, the deployment mechanisms.
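Performance drift illustrates why runtime-only testing misses these threats: drift shows up as a shift in the input data distribution, which a pipeline-side monitor, not the device, has to catch. A minimal sketch of such a monitor using a two-sample Kolmogorov–Smirnov test (the feature arrays, sample sizes, and threshold are all synthetic and hypothetical, not from the guidance):

```python
# Sketch: flagging performance drift by comparing the distribution of one
# model input feature in production against the training-time baseline.
# All data and the significance threshold here are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
baseline = rng.normal(loc=0.0, scale=1.0, size=5000)  # training-time feature values
deployed = rng.normal(loc=0.4, scale=1.0, size=5000)  # shifted production values

stat, p_value = ks_2samp(baseline, deployed)
DRIFT_ALPHA = 0.01  # illustrative significance threshold

if p_value < DRIFT_ALPHA:
    print(f"drift detected (KS statistic={stat:.3f}, p={p_value:.2e})")
```

A real monitor would run this per feature on a schedule and alert into the quality system, but the core check is this simple: the drift signal lives in the data pipeline, not on the device.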
Why This Changes the Testing Scope
Most cybersecurity testing for AI-enabled devices today focuses on the deployed model: Can we feed it adversarial inputs? Do the input guardrails hold? Does the model produce safe outputs under stress?
That’s necessary. It’s also not sufficient.
The FDA is telling you explicitly, in writing, in a guidance document you will need to address in your premarket submission, that the threat landscape for AI-enabled devices extends across the full AI system lifecycle. Data poisoning doesn’t happen at the inference endpoint. Model bias manipulation doesn’t happen on the device. Overfitting attacks don’t materialize at runtime. These threats live in the preprocessing pipeline, the training environment, the model registry, the artifact storage, the deployment and update mechanisms.
If your penetration test only covers the deployed model on-device, you’re covering one-seventh of the threat categories the FDA has explicitly identified.
How We’re Approaching This at ELTON
To address this for both premarket submissions and interim testing, we’re expanding our scope beyond runtime inference testing to cover the full AI system lifecycle:
The AI pipeline becomes an associated system in the digital twin. Just as we model web applications, APIs, and mobile components as part of the overall attack surface, the AI pipeline, from data ingestion through model deployment, gets the same treatment. This means assessing vulnerabilities across the lifecycle pipeline, particularly those involving authentication, integrity controls, artifact signing, access control, and input validation that could enable manipulation of training data, model artifacts, evaluation logic, or deployment.
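The integrity-control piece of that assessment is concrete: a model artifact should carry a digest recorded at training time that is re-verified before deployment. A minimal sketch (the file paths are hypothetical; a production setup would use asymmetric signatures and a key management system rather than a bare digest):

```python
# Sketch: integrity check on a model artifact before it is deployed.
# A SHA-256 digest recorded at training time is compared against the
# artifact at load time. Paths are hypothetical placeholders.
import hashlib
import hmac

def sha256_file(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifact(path: str, recorded_digest: str) -> bool:
    # Constant-time comparison avoids leaking match position via timing.
    return hmac.compare_digest(sha256_file(path), recorded_digest)
```

A penetration test of the pipeline then asks the adversarial version of the same question: can an attacker swap the artifact, or the recorded digest, without tripping this check?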
Runtime testing remains focused and purpose-built. On-device testing of the deployed model validates mitigations, input guardrails, and protections against adversarial misclassification. This is the evasion-focused testing most teams are already thinking about, and it stays in scope.
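To make the runtime slice concrete, here is the shape of a model-evasion test in miniature: a toy linear classifier and an FGSM-style perturbation that flips its prediction within a small input budget. The weights, input, and budget are entirely hypothetical; real testing targets the actual device model with far richer attack generation.

```python
# Sketch: model evasion against a toy linear classifier. A small,
# sign-aligned (FGSM-style) perturbation flips the predicted class
# without changing the input much. All values are hypothetical.
import numpy as np

w = np.array([2.0, -1.5, 1.0])   # toy model weights
b = 0.1
x = np.array([0.3, -0.2, 0.25])  # benign input, classified positive

def predict(x):
    return 1 if x @ w + b > 0 else 0

eps = 0.5                        # perturbation budget
x_adv = x - eps * np.sign(w)     # step against the decision direction

print(predict(x), predict(x_adv))  # prints "1 0": the class flips
```

This is the category most teams already exercise; the point of the guidance is that six of the seven categories cannot be exercised this way.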
The result is full traceability. Every test case in the penetration test maps back to specific FDA-identified threats, executed not only against the trained model at runtime on-device, but also against upstream pipeline components where the AI-specific threats identified in the guidance can actually materialize.
The specific pipeline components that need coverage:
- Data preprocessing pipeline — where poisoning and bias manipulation originate
- Training environment — where overfitting attacks and backdoor injection occur
- Model evaluation — where compromised evaluation logic can mask degraded performance
- Model registry and artifact storage — where model stealing, tampering, and unauthorized access occur
- Deployment and update mechanisms — where integrity of the production model can be compromised
The Bottom Line
The FDA isn’t just asking manufacturers to test their AI model. They’re asking manufacturers to demonstrate that the entire system that produces, stores, evaluates, and deploys that model has been assessed for the specific threat categories outlined in the guidance.
For premarket submissions, this means your cybersecurity documentation needs to show coverage across all seven threat categories — not just the one that lives on-device.
For testing services, this means expanding the aperture from “can we break the model at inference?” to “can we compromise the pipeline that builds, validates, and ships the model?”
That’s a fundamentally different scope. And it’s what the guidance is asking for.
