Vendor Risk Management for AI Tools: A Security Checklist
Every AI SaaS app can quietly become a supply-chain risk if it sees your prompts, files, or customer data. Does your vendor questionnaire cover data processing agreements, model-training opt-outs, breach-notification SLAs, and the full subprocessor chain?
Vendor Risk Management for AI Tools: A Security Checklist
When Samsung employees pasted source code into ChatGPT in 2023, the company didn’t discover a clever new attack chain; it discovered that a third-party chatbot can become an exfiltration channel the moment someone treats it like Notepad with better autocomplete. That’s the problem with AI SaaS: the risk isn’t the model’s “intelligence,” it’s that your prompts, files, tickets, and customer records may now live in someone else’s retention bucket, under someone else’s subprocessors, with someone else’s incident clock.
The usual vendor questionnaire is still asking the wrong questions. “Do you encrypt data at rest?” is table stakes. The real issue is whether the service stores prompts for training, whether “opt out” means anything beyond a support ticket, whether deleted data is actually purged from backups, and whether the vendor can name every subprocessor that touches your content without sending you a marketing PDF and a prayer.
Treat AI SaaS as a data processor, not a clever widget
If an AI tool can ingest customer chats, HR documents, source code, or incident notes, it is not a productivity toy; it is a processor under whatever privacy regime applies to you. For many teams, that means the contract should reflect GDPR-style processing terms even if the vendor is headquartered in California and the sales rep insists “we’re not a processor, we’re a platform.” That distinction evaporates the second the tool stores your content, logs your prompts, or uses your data to improve a model.
Ask for the data processing agreement, and read the part about retention, deletion, and subprocessors. OpenAI, Microsoft Copilot, Google Gemini, and Anthropic all have different enterprise data-handling terms, and the practical differences matter more than the logo on the login page. If the vendor cannot state whether customer prompts are excluded from training by default, or whether admin-controlled retention is available, you do not have a policy problem — you have an unknown exfiltration path.
Demand the full subprocessor chain, not the curated version
Most security teams stop at the vendor’s trust center page because it looks official and has a lock icon. That is not diligence; that is tourism. The useful question is which companies actually touch your data after the first hop: cloud hosting, telemetry, support tooling, content moderation, analytics, and backup providers. A vendor that uses AWS for compute but sends logs to Datadog, support cases to Zendesk, and abuse review to another AI vendor has already multiplied your exposure.
This is where the questionnaire should force specificity. Require a live subprocessor list with company names, regions, functions, and notification commitments for changes. If the vendor says “we may update subprocessors from time to time,” press for advance notice and an objection window. If they refuse to identify backup providers or cross-border transfer mechanisms, assume your data will eventually take the scenic route through jurisdictions your legal team never approved.
Separate model-training opt-outs from actual retention controls
A lot of enterprise AI contracts advertise “no training on your data,” which sounds reassuring until you notice the fine print still allows retention for abuse monitoring, debugging, or service improvement. That is not the same thing as non-retention, and it is not the same thing as deletion. A prompt that is excluded from model training can still sit in logs long enough to be subpoenaed, breached, or reused by a support analyst who has no business seeing it.
Your questionnaire should force the vendor to answer three separate questions: Is customer content used for training by default? Can training be disabled at the org level, and is that setting contractual or just configurable? How long are prompts, files, embeddings, and chat transcripts retained after deletion? If the answer to any of those is “it depends,” then you need the exact dependency, not the slogan.
Put breach-notification SLAs in writing, not in a blog post
Security teams love to ask whether a vendor has a breach notification process, as if process were the same thing as timing. It isn’t. The difference between a 24-hour notice and a “we’ll notify you without undue delay” clause can be the difference between preserving logs and explaining to regulators why you learned about the incident from a reporter. SolarWinds and Okta both taught the same lesson in different flavors: the first public clue is often not the first internal clue.
For AI vendors, the notification language should cover unauthorized access to prompts, uploaded files, fine-tuning datasets, vector stores, and admin consoles, not just “customer records” in the abstract. Ask for an SLA measured in hours, not business days. If the vendor won’t commit to notifying you when your content is exposed through a subcontractor or logging pipeline, then their incident process is designed for the vendor’s convenience, not your risk.
Test for prompt injection and data exfiltration before procurement, not after rollout
The lazy assumption is that AI risk is mostly about users typing secrets into a chatbot. That is only half the mess. If the tool can browse documents, query ticketing systems, or summarize inboxes, prompt injection becomes a supply-chain problem inside your own workflow. Microsoft’s Copilot ecosystem, Google Workspace integrations, and browser-based AI extensions all widen the blast radius because the model is now acting on behalf of the user with access to other systems.
Before procurement, run a small red-team exercise against the actual workflow: can a malicious PDF, email, or Jira ticket instruct the assistant to reveal hidden context, leak attachments, or forward data to an external endpoint? If the vendor has no documented controls for tool-use authorization, content isolation, or retrieval filtering, you are not buying “AI assistance.” You are buying a confused deputy with a license key.
The contrarian move: ban the tool only where it matters
Blanket bans on AI tools make for dramatic policy memos and terrible security. They also guarantee shadow usage, which means employees will paste the same data into personal accounts, consumer chatbots, or browser extensions you never reviewed. A safer approach is to classify the data, then permit only the tools that can prove org-level no-training, bounded retention, SSO/SAML, SCIM, audit logs, and admin disablement of public sharing.
That does not mean “approve everything with an enterprise plan.” It means the approval bar should be higher for tools that ingest source code, customer PII, or regulated data than for a generic writing assistant. If a vendor cannot support DPA terms, subprocessor disclosure, and deletion attestations, then the answer is no — even if the demo was slick and the sales team promised “enterprise readiness” in a font size large enough to be mistaken for a fact.
The Bottom Line
Require every AI vendor to answer, in writing, four questions before approval: Can they exclude your data from training, how long do they retain prompts/files, who are all subprocessors, and what is the breach-notification SLA. If they cannot answer cleanly, do not let them near customer data, source code, or internal incident content. Re-review the contract whenever they add a new subprocessor, change retention, or introduce browsing, connector, or agent features that expand what the tool can touch.