AI Strategy & Frameworks · May 12, 2026 · 12 min read · By Rodrigo Ortiz

How to Choose an AI Implementation Partner: 10 Questions Before You Sign

Ten questions that separate the firms that ship from the firms that pitch, and why they have to be asked before you sign rather than after.

The most important AI implementation question is not "which model should we use" — it is "who is going to ship this, and what happens when they're wrong." Most companies discover the answer after the contract is signed, the kickoff deck is delivered, and the partner has billed the first $250K. By that point the questions you should have asked at the sales stage are the same ones you cannot ask without admitting you should have asked them earlier.

Knowing how to choose an AI implementation partner is not a procurement exercise. It is a decision about whose judgment you will be relying on for the next 9 to 18 months, on a technology stack that is moving faster than your in-house team can keep up with. According to McKinsey's most recent state of AI survey, the organizations reporting disappointing AI results overwhelmingly trace the problem back to a single decision: choosing a partner on price or brand recognition instead of fit, accountability, and track record. The 10 questions below are the ones we have watched separate good partnerships from expensive ones across hundreds of buyer conversations.

The track-record questions: separating real shipping experience from deck experience

The first three questions are about what the partner has actually shipped — not what they have advised on, presented about, or written a thought-leadership post on. AI implementation is a doing discipline; the firms that have done it look different from the firms that have talked about it. The same is true whether the work is document intelligence pipelines or full multi-system automations — without a portfolio of live, in-production systems, the partner is asking you to be their first.

  • Q1: Show me three projects in the last 12 months that are still in production. Not pilots. Not "successful proofs of concept." Production systems still running, still used by the customer, with users still depending on them. If the partner cannot name three, the answer to "have you actually shipped AI" is no, regardless of the case studies on the website. Many AI consulting decks are roughly 80% pilot results and 20% single-sentence claims about scale-up — the pilot graveyard is what scared executives off AI for the last 18 months, and the firms that have not crossed it should not be running your rollout.
  • Q2: Of those three projects, what is the one that almost failed and how did you recover? A partner who has shipped real AI has at least one project where the model regressed, the data pipeline broke, an eval set caught something embarrassing in week six, or a stakeholder lost faith and had to be re-won. If they cannot tell that story specifically — with the people, the failure mode, and the recovery — they have not actually done the work. The honest stories are how you know.
  • Q3: Can I talk to two of those customers without you on the call? Reference calls supervised by the partner are sales theater. Unsupervised reference calls are diligence. The partners who will set up the unsupervised call are the partners whose customers are not quietly furious; the partners who deflect are deflecting for a reason.

The non-obvious point. The single highest-signal moment in the AI partner sales cycle is the response to "can I talk to your customers alone." Vendors confident in their work hand over the phone. Vendors who are not confident change the subject. Watch the response, not the words.

According to Deloitte's state of AI in the enterprise survey, the gap between AI projects that reach scale and those that stall is widening — and it tracks closely to the experience and accountability of the partner running the work. The track record is not a vanity check; it is the single best leading indicator of whether your rollout joins the production list or the pilot graveyard.

If a partner cannot name three production projects from the last year, share the recovery story for at least one, and connect you to customers without supervision, the answer to "have they actually shipped" is no — and you are about to pay them to learn on your dime.

The methodology questions: how they think about the work that determines whether it works

AI implementation is not a software build. The hard part is not writing the code; it is choosing the workflow to point the AI at, building the evaluation infrastructure that tells you whether it works, and running a careful enough rollout that the customer trusts the system the day it goes live. Questions 4 through 6 are about how the partner thinks about that work — and you can tell more from a 20-minute methodology conversation than from a 200-page proposal.

  • Q4: How do you build an evaluation set, and at what stage? The correct answer is "before we write the first prompt, against the customer's own historical data." If they say "we test as we go" or "we use industry benchmarks," they are about to ship you a system you cannot trust. The eval set is the substrate of every reliable AI deployment; partners who treat it as an afterthought treat reliability as an afterthought. The pattern that distinguishes successful AI projects from the ones that stall is exactly this — substrate first, model second. (A minimal sketch of what this looks like in code follows this list.)
  • Q5: Walk me through your shadow-mode rollout. Any partner who has shipped AI to production has a shadow-mode practice: running the new system in parallel with the existing process, comparing outputs, and only cutting over once the deltas are well understood (the second sketch below shows the shape of this). If they describe a "big bang launch" or "cutover weekend," they are about to put your business on a system they cannot verify in production. Run, do not walk.
  • Q6: What does your handoff look like at month 12? The partners worth hiring have a clear answer about who owns the system at the end of the engagement, what training your team gets, what documentation lives where, and what the partner's role becomes after handoff (typically: occasional improvements, eval-set refresh, and emergency support — not "we run it forever and you pay us forever"). Partners who hedge on this question are partners building a dependency, not a capability.
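
To make Q4 concrete: here is a minimal sketch of "eval set before the first prompt," built from the customer's own historical records. Every name in it (the file, the field names, the grader) is a hypothetical stand-in for whatever the customer's data actually contains, not a prescribed format.

```python
# Eval-set sketch: the test set exists BEFORE any prompt is written,
# built from real historical cases. All names here are illustrative.
import json

def load_eval_set(path: str) -> list[dict]:
    """Each record pairs a real historical input with its known-good output."""
    with open(path) as f:
        return [json.loads(line) for line in f]

def grade(predicted: str, expected: str) -> bool:
    """Deliberately simple grader; real projects use task-specific scoring."""
    return predicted.strip().lower() == expected.strip().lower()

def evaluate(system, eval_set: list[dict]) -> float:
    """Score any candidate system against the same fixed bar."""
    passed = sum(grade(system(case["input"]), case["expected"]) for case in eval_set)
    return passed / len(eval_set)

# Prompt v1, prompt v2, and a different model are all measured against
# the same eval set, which is what makes "is it working?" answerable.
eval_set = load_eval_set("historical_cases.jsonl")
```
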
The AI is not the hard part anymore. The hard part is the workflow you are pointing it at, the eval set that tells you it is working, and the rollout that keeps your team's trust intact.
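
Q5 is just as sketchable. In shadow mode the candidate system runs on live inputs alongside the incumbent process, the deltas get logged, and nothing the candidate produces reaches a user until cutover is a deliberate decision. The function and names below are hypothetical; the pattern is the point.

```python
# Shadow-mode sketch: the candidate runs in parallel on live traffic,
# but only the incumbent's output is ever served. Names are illustrative.
import logging

logger = logging.getLogger("shadow")

def handle_request(payload, incumbent, candidate):
    """Serve from the existing process; observe the new system silently."""
    live_result = incumbent(payload)        # what the user actually gets
    try:
        shadow_result = candidate(payload)  # what the new system would do
        if shadow_result != live_result:
            logger.info("delta on %r: live=%r shadow=%r",
                        payload, live_result, shadow_result)
    except Exception:
        # A shadow failure is data, not an outage: the user is unaffected.
        logger.exception("shadow system failed on %r", payload)
    return live_result

# Cutover happens only after the logged deltas are reviewed and understood,
# not on a "cutover weekend."
```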

Eval set before prompts, shadow-mode before cutover, and a clear handoff at month 12 — partners who cannot describe all three are running a methodology that does not survive contact with production.

The accountability questions: what happens when it goes wrong

It will go wrong. A model will drift. A vendor will deprecate the model you depend on. A workflow change on your side will break an assumption the partner made in week three. Questions 7 and 8 are about how the partner has agreed to handle that — and the answers are almost always in the contract you have not read yet.

  • Q7: What happens to the engagement if the agreed-upon ROI is not hit in the first 90 days? The partners worth hiring have a real answer: they run a structured root-cause review, agree on what changes, and continue with a documented adjustment to the plan. The partners not worth hiring have a non-answer ("every project is different") and a contract that protects their hours-billed regardless of whether anything works. The cleanest contracts we have seen tie a portion of fees to specific, measurable outcomes — and the firms willing to sign those contracts are the firms who know they can hit them. A real ROI framework is the basis of this conversation; without one, you and the partner are both flying blind.
  • Q8: Who owns the IP — the prompts, the eval sets, the integration code? The default answer should be "you own everything specific to your business; we retain the right to reuse generic frameworks and patterns." If the partner wants to own your prompts and your eval sets, they are building a moat against your ability to fire them, and you should leave the room. The substrate of your AI system has to be yours.

This is where buying patterns diverge sharply between firms that treat AI like an asset they own and firms that treat it like a service they rent. According to a Harvard Business Review piece on managing generative AI risk, the organizations that retained ownership of their AI substrate — data, evals, prompts, integration patterns — were materially better-positioned to switch vendors, upgrade models, and adapt to regulatory shifts than the organizations that outsourced ownership along with execution. The IP question is not a procurement detail. It is the difference between owning an asset and renting one.

Tie a portion of fees to outcomes, retain full ownership of the substrate (prompts, evals, integration code), and document the partner's accountability for misses — the contract you sign is the AI strategy you actually have.

The exit questions: how you fire them and what you keep

The most overlooked questions in the AI partner selection process are about how the relationship ends. Every engagement ends eventually: at the contracted milestone because the work is done, or earlier because it is not working. Questions 9 and 10 force the conversation now, when you have leverage, instead of later, when you do not.

  • Q9: If we terminate at month 6, what do we walk away with? The right answer is: every artifact produced to date, full documentation of what was built and why, the eval set, the prompts, the integration code, and a knowledge-transfer session with your team. If the answer is "we'd need to negotiate that at the time," they are setting up a future hostage situation. Negotiate it now, when you are the prospect and they want the contract.
  • Q10: What is your relationship with the model providers, and what happens if we want to switch models in year two? AI implementation partners exclusively wedded to one model vendor (one foundation model, one cloud, one tooling stack) will steer you in their preferred direction whether or not it serves your interests. Look for partners with cross-vendor experience and an architectural pattern that abstracts the model layer — so that switching providers is a model-config change, not a six-month migration. This is non-trivial: the firms that have actually shipped at scale have a strong opinion about this and can show you their abstraction layer; the firms that have not will hand-wave. The sketch after this list shows the shape of that abstraction.
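
What "abstracts the model layer" means in practice is small but decisive: application code depends on one internal interface, and the provider behind it is a configuration value. A minimal sketch follows, with hypothetical vendor names and adapter classes; the shape is the point, not the vendors.

```python
# Model-abstraction sketch: the application talks to one internal
# interface; which vendor sits behind it is pure configuration.
from typing import Protocol

class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class VendorAModel:
    """Hypothetical adapter around vendor A's SDK."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("vendor A API call goes here")

class VendorBModel:
    """Hypothetical adapter around vendor B's SDK."""
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("vendor B API call goes here")

PROVIDERS: dict[str, type] = {"vendor_a": VendorAModel, "vendor_b": VendorBModel}

def load_model(config: dict) -> TextModel:
    """Switching providers in year two is this one config value."""
    return PROVIDERS[config["model_provider"]]()
```

If switching providers requires more than changing that config value and writing one new adapter, the abstraction was never really there.
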

The reason these exit questions matter is that AI is moving too fast to lock yourself into a single vendor or a single methodology. The model you start with in 2026 will not be the model you ship to production in 2027, and the implementation partner who built your system has to be the one who helps you migrate — or you are now paying twice. Look across how partners structure rollouts in real estate versus how they structure them in legal, or how financial services firms approach the same problem under heavier regulatory load, and you will see the same underlying portability test: can the partner move the customer between models, providers, and even use cases without re-shipping the entire stack? The ones who can are the ones who built the abstraction layer correctly the first time.

The trap. Signing a multi-year contract with a partner who has not answered Q9 and Q10. Two years in, you discover you cannot fire them without rebuilding from scratch, and you cannot switch models without their cooperation. The premium you pay them from that point forward is the cost of having signed without the exit clauses.

Negotiate the exit before the entry — what you walk away with at month 6, and what happens when the model you started with is not the model you finish with — because all the leverage you have is in the sales cycle, not the renewal.

How to put the 10 questions to work when choosing your AI implementation partner

The 10 questions are not a checklist to be marched through. They are a way to listen. The best partners answer them directly, with specifics, and offer to put their answers in writing. The worst partners deflect, change the subject, or offer to "schedule a deeper dive next week." The medium partners give you 60% of an answer and assume you will not notice the missing 40%. Most procurement processes for AI partners fail because they grade on the quality of the proposal instead of the quality of the conversation, and the conversation is where the truth lives.

If you are running a real selection process, get every question answered by every shortlisted partner, in the same room with the same people, ideally with a senior technical buyer on your side who can pressure-test the methodology answers. Rank the partners not on which one promises the most but on which one is most specific about how the work actually gets done. The most specific partner is almost always the one who has actually done it. If you want a candid second opinion on a partner you are already evaluating — or if you want to bypass the procurement theater and talk directly to a team that can answer all 10 questions on the first call — talk to a Groath growth expert and we will tell you exactly how we would approach your specific rollout, including what would have to be true for us to walk away from the work.