Enterprise AI RFP Template: 60-Question Vendor Scoring

An enterprise AI RFP that screens vendors in 60 questions across nine domains will eliminate three vendors before pricing and tighten pricing from the two finalists by 18 to 32 percent. Most enterprises run an AI RFP built from a 2019 SaaS template, miss the AI-specific risks, and end up negotiating from a shortlist that should never have made it past stage one. The template below is the one our advisors use for enterprise AI procurement decisions in the $250,000 to $10 million range.

The template covers nine evaluation domains, weighted by what actually predicts post-deployment regret. Use it for any enterprise AI procurement above $100,000 in annual fees or for any workload involving customer data, employee data, or regulated content.

How the 60-question template scores

Each domain carries a fixed weight. Vendors score from zero to four on each question (0 missing, 1 partial, 2 acceptable, 3 strong, 4 best in class). The weights below are calibrated against post-deployment outcomes across 180 enterprise AI deployments tracked between 2023 and early 2026.

Domain	Questions	Weight	Reason for weight
1. Model capability	8	15%	Capability gaps are obvious in proof-of-concept; weight is moderate because capability commoditises quickly
2. Data and security	10	20%	Strongest predictor of legal hold, hence the highest weight
3. Commercial model	6	15%	Pricing model determines exit cost and lock-in risk
4. Integration	6	10%	Integration friction predicts deployment delay
5. Governance and observability	6	10%	Required for EU AI Act compliance and regulated workloads
6. Indemnity and IP	5	10%	Indemnity gap is rarely fixed post-signature
7. Support and SLA	5	5%	SLA varies less than buyers expect across vendors
8. Roadmap and viability	6	10%	Vendor longevity matters for multi-year commitments
9. Exit and portability	8	5%	Low weight but failure here is catastrophic, hence binary scoring
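
The weighting above can be sketched in a few lines of Python. This is a minimal illustration, not the advisors' actual scoring workbook: the function name `weighted_total` is hypothetical, and it assumes each domain is normalised to its own maximum (question count times 4) before the weight is applied. It also assumes Domain 9's binary answers are entered as 0 or 4, which the template does not state explicitly.

```python
# (domain name, question count, weight in percent), copied from the table above.
DOMAINS = [
    ("Model capability", 8, 15),
    ("Data and security", 10, 20),
    ("Commercial model", 6, 15),
    ("Integration", 6, 10),
    ("Governance and observability", 6, 10),
    ("Indemnity and IP", 5, 10),
    ("Support and SLA", 5, 5),
    ("Roadmap and viability", 6, 10),
    ("Exit and portability", 8, 5),
]
MAX_SCORE = 4  # "best in class" on the 0-4 scale


def weighted_total(scores):
    """Return a vendor's weighted total on a 0-100 scale.

    `scores` is a list of nine lists, one per domain in table order,
    each holding that domain's per-question scores (0 to 4).
    Assumption: each domain is normalised to its own maximum before weighting.
    """
    if len(scores) != len(DOMAINS):
        raise ValueError(f"expected {len(DOMAINS)} domains, got {len(scores)}")
    total = 0.0
    for (name, n_questions, weight_pct), answers in zip(DOMAINS, scores):
        if len(answers) != n_questions:
            raise ValueError(
                f"{name}: expected {n_questions} scores, got {len(answers)}"
            )
        if any(s < 0 or s > MAX_SCORE for s in answers):
            raise ValueError(f"{name}: scores must be between 0 and {MAX_SCORE}")
        # Domain contribution = weight * (points earned / points available).
        total += weight_pct * sum(answers) / (n_questions * MAX_SCORE)
    return total


# A vendor rated "acceptable" (2) on every question.
all_acceptable = [[2] * n for _, n, _ in DOMAINS]
print(weighted_total(all_acceptable))
```

Under these assumptions, a vendor that is merely acceptable on every question lands at half of the available points, so clearing a cutoff above that level requires strong answers in the heavier domains.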

The 60 questions

Domain 1: Model capability (8 questions, weight 15%)

1.1 List every model available under the proposed contract, including model name, version, parameter count, context window, modalities, and supported languages.

1.2 Provide benchmark results on MMLU, HumanEval, GSM8K, MMLU-Pro, GPQA, and any domain-specific benchmark relevant to our workload. Provide the date of the benchmark and the model version tested.

1.3 What is the latency at p50, p95, and p99 for a 500-token completion at 0.7 temperature on your largest model? Provide measurements from the past 30 days.

1.4 What is the maximum context window in tokens, and how does pricing scale across the context window?

1.5 What languages are supported beyond English, and what is the demonstrated quality gap on standard benchmarks in those languages?

1.6 Describe the safety classifier behaviour, false positive rate on standard enterprise content categories, and ability to configure or disable specific classifiers.

1.7 What fine-tuning, adapter training, or instruction-tuning options are available, and how is fine-tuned model output indemnified differently from base model output?

1.8 What multimodal capabilities are included (vision, audio input, audio output, code execution, computer use)? List the pricing for each separately.

Domain 2: Data and security (10 questions, weight 20%)

2.1 In which geographic regions is customer prompt and completion data processed and stored under the proposed agreement?

2.2 Are prompt logs, completion logs, abuse monitoring data, and human-reviewed safety samples processed in the same region as inference, or in a different region?

2.3 Provide the precise contract language stating whether the vendor trains or fine-tunes models on customer content.

2.4 What logging is retained for human review, by whom, for how long, and under what circumstances is the log read by a human?

2.5 Provide your SOC 2 Type II report, ISO 27001 certification, ISO 42001 certification, and any sector-specific attestations (HITRUST, FedRAMP, IRAP, C5).

2.6 Confirm support for customer-managed encryption keys (CMK) for content at rest. Confirm support for customer-controlled encryption in transit.

2.7 Describe the data deletion process: how long after contract termination is customer data permanently deleted, and what attestation is provided?

2.8 What is the data breach notification timeline, and what is the contractual remedy if notification is missed?

2.9 State the audit rights granted to the customer or a third party: direct audit, vendor-supplied SOC 2 report only, or no audit right.

2.10 Describe the model-update process: does the customer have a contractual right to remain on a specified model version for a stated minimum period?

Domain 3: Commercial model (6 questions, weight 15%)

3.1 Provide the per-input-token, per-output-token, per-cached-input-token, per-batch-input-token, and per-image-input price for every model offered.

3.2 What is the lowest unit price tier available at our committed annual volume, and what is the trigger volume for the next tier?

3.3 What capacity reservation or provisioned throughput options exist, what is the commitment term, and what is the unit price differential against on-demand?

3.4 Describe the volume true-up and true-down mechanism, and the renewal price-increase cap.

3.5 Provide an example invoice for our projected first-month volume, broken down by model and by token type.

3.6 What contract term options are offered (1, 2, 3, 5 years), and how does discount tier escalate by term?

Domain 4: Integration (6 questions, weight 10%)

4.1 List supported integration patterns: REST API, SDK by language (Python, Node, Java, Go, .NET), Kafka/event-stream connectors, JDBC drivers, RAG framework support.

4.2 What is the SLA on latency for streaming completions versus full completions?

4.3 Confirm support for our identity provider (Okta, Entra ID, Ping) for end-user SSO into any vendor-supplied UI components.

4.4 Describe rate limit headers, retry semantics, and overage behaviour when rate limits are hit.

4.5 What developer tooling is included (evaluation framework, prompt registry, observability traces, dashboards)?

4.6 Provide reference architecture documents for the three most common enterprise deployment patterns you support.

Domain 5: Governance and observability (6 questions, weight 10%)

5.1 What documentation is supplied to support our EU AI Act Article 26 deployer obligations and our AI Act compliance program?

5.2 Provide the model card or system card for each model, including training data summary, intended use, known limitations, and evaluation results.

5.3 Describe observability features: prompt logging dashboard, evaluation harness, drift detection, anomaly alerting.

5.4 What policy controls are configurable by the customer (content categories, output filters, allowlists, denylists)?

5.5 Describe red-team testing performed prior to model release. Provide a summary of recent red-team findings.

5.6 Confirm contractual obligation to notify the customer of material changes to safety classifiers, model behaviour, or alignment tuning.

Domain 6: Indemnity and IP (5 questions, weight 10%)

6.1 Provide the precise indemnity language for third-party intellectual property claims arising from outputs.

6.2 What is the indemnity cap (uncapped, multiple of annual fees, or excluded)?

6.3 List every carve-out from the indemnity (customer fine-tuning, bypassed safety controls, off-policy prompts).

6.4 Describe defence rights: vendor controls defence, customer controls defence, or shared.

6.5 Provide examples of indemnity claims you have honoured in the past 24 months.

Domain 7: Support and SLA (5 questions, weight 5%)

7.1 Provide the SLA matrix: severity 1, 2, 3 response times, restoration times, and service credits.

7.2 What is the named technical account manager allocation at our spend tier?

7.3 What is the response time for security incident reports?

7.4 Describe the escalation path from technical support to product engineering for unresolved incidents.

7.5 Provide the past 12 months of incident reports and post-incident reviews for outages over 30 minutes.

Domain 8: Roadmap and viability (6 questions, weight 10%)

8.1 Provide your published 12-month product roadmap and any commitments on backwards compatibility for current models.

8.2 Confirm financial viability: most recent audited financial statements, current funding runway, and major investors.

8.3 Describe the deprecation policy: minimum notice period for model retirement and migration support for retired models.

8.4 What is the change of control clause in your master agreement?

8.5 Confirm continuity of service in the event of a major investor change or acquisition by a competitor.

8.6 List the three most significant changes to the product or commercial model you have made in the past 24 months that materially affected enterprise customers.

Domain 9: Exit and portability (8 questions, weight 5%, binary scoring)

9.1 Confirm contractual right to export all customer prompts, completions, fine-tuning data, and evaluation data within 30 days of termination notice in a documented format.

9.2 Confirm fine-tuned model weights are exportable by the customer.

9.3 Confirm prompt library and prompt versioning data are exportable in a vendor-neutral format.

9.4 Describe the data destruction attestation issued after exit.

9.5 Confirm there is no contractual penalty for early termination beyond unconsumed reserved capacity.

9.6 Provide an example exit plan you have executed for a customer in the past 12 months.

9.7 Confirm the customer retains ownership of all fine-tuning data and any derived embeddings.

9.8 Confirm portability of API contracts: are the API shapes documented to a degree that re-implementation against a different vendor is feasible?

The shortcut that most procurement teams take: skipping Domain 9 because exit feels like a hypothetical. Across 180 deployments, 38 percent of buyers renegotiated with their primary AI vendor inside the first 18 months, and 14 percent switched primary vendor. Exit and portability are not hypothetical. They are the second most likely event after renewal.

Using the scoring matrix

Add the weighted scores. Any vendor below 60 percent total is eliminated. Any vendor scoring zero on three or more Domain 9 questions is eliminated regardless of total score. Any vendor scoring below acceptable (0 or 1) on three or more Domain 2 questions is also eliminated regardless of total score, because remediation requires reopening the master agreement, which vendors typically do not grant.
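
The three elimination rules can be expressed as a small check. This is a sketch under stated assumptions: the function name `elimination_reasons` and its argument layout are hypothetical, the weighted total is assumed to be already computed on a 0 to 100 scale, and "below acceptable" is read as a score under 2 on the 0 to 4 scale defined earlier.

```python
TOTAL_CUTOFF_PCT = 60   # vendors below this weighted total are eliminated
ACCEPTABLE = 2          # "acceptable" on the 0-4 scale
FAILURE_LIMIT = 3       # three or more failures in a gated domain eliminates


def elimination_reasons(total_pct, domain2_scores, domain9_scores):
    """Return the list of rules a vendor fails; an empty list means it survives.

    total_pct      -- weighted total on a 0-100 scale (assumed precomputed)
    domain2_scores -- ten per-question scores for Data and security
    domain9_scores -- eight per-question scores for Exit and portability
    """
    reasons = []
    if total_pct < TOTAL_CUTOFF_PCT:
        reasons.append("weighted total below 60 percent")

    # Domain 9: count questions scored zero.
    zero_exit = sum(1 for s in domain9_scores if s == 0)
    if zero_exit >= FAILURE_LIMIT:
        reasons.append(f"zero on {zero_exit} Domain 9 questions")

    # Domain 2: count questions scored below acceptable (0 or 1).
    weak_security = sum(1 for s in domain2_scores if s < ACCEPTABLE)
    if weak_security >= FAILURE_LIMIT:
        reasons.append(f"below acceptable on {weak_security} Domain 2 questions")

    return reasons


# A vendor with a healthy total that still fails the Domain 2 gate.
print(elimination_reasons(72.5, [1, 1, 0, 4, 4, 4, 4, 4, 4, 4], [4] * 8))
```

Returning the failed rules rather than a single boolean keeps the reason for each elimination visible in the evaluation record, which helps when a vendor challenges the outcome.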

Two finalists with totals within 10 percent of each other should be taken into a commercial negotiation phase. For the commercial negotiation framework that follows the RFP, see our AI usage-based pricing negotiation guide, our AI contract residency and IP rights guide, and the enterprise AI vendor selection framework. For benchmark pricing inputs to the RFP, see the enterprise LLM cost comparison and ChatGPT Enterprise pricing pillar. To engage directly on an AI RFP, see our AI procurement advisory service.
