The IT automation market is noisy. Every vendor claims to "use AI to resolve tickets automatically." But when you dig into the architecture, most fall into one of two categories: glorified scripting with a chatbot on top, or heavyweight ITSM platforms selling automation as an expensive add-on to software you already own.
Neither is what most enterprise IT teams actually need. This guide gives you the evaluation framework to find the real thing — and the questions that expose vendor weaknesses before you sign anything.
Before you evaluate: define your success criteria
Before you open a demo, write down the answers to three questions:
- What are your top 5 ticket types by volume? Any vendor you evaluate should be able to show you working automations for all five on day one.
- What does your current stack look like? List every tool involved in ticket resolution: ITSM (Jira, ServiceNow, Zendesk), identity (Okta, Entra), endpoint (Intune, Jamf), messaging (Slack, Teams). Your automation platform needs native integrations for all of them.
- What's your risk tolerance? Do you need human approval before every automated action, or are you comfortable with selective autopilot? This determines which control model is right for you.
The evaluation framework
1. Integration depth vs. breadth
Every vendor will show you a long list of integrations. Ask a sharper question: how deep does each integration go?
A shallow integration can read ticket data and create comments. A deep integration can reset MFA factors, push MDM profiles, assign licenses, trigger conditional access changes, and write back resolution details, all in one workflow. That gap decides whether the platform actually resolves a ticket or merely annotates it.
Questions to ask:
- "Can you show me a live demo of an Okta MFA reset, end-to-end, in my environment?"
- "What specific API scopes does your Okta integration require?"
- "What happens if an API call fails mid-workflow? Show me the rollback."
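To make the rollback question concrete, here is a minimal Python sketch of a multi-step workflow that undoes completed steps when a later API call fails. It is an illustration under stated assumptions, not any vendor's implementation: the `Step` type, the step names, and the simulated 503 failure are all hypothetical, and real compensating actions depend on what each target API allows.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Step:
    name: str
    run: Callable[[], None]   # performs the action
    undo: Callable[[], None]  # compensating action that reverses it


def run_workflow(steps: List[Step]) -> List[str]:
    """Run steps in order; on failure, undo completed steps in reverse order."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        try:
            step.run()
        except Exception as exc:
            log.append(f"failed: {step.name} ({exc})")
            for done in reversed(completed):
                done.undo()
                log.append(f"rolled back: {done.name}")
            return log
        completed.append(step)
        log.append(f"ok: {step.name}")
    return log


def failing_call() -> None:
    # Hypothetical stand-in for an API call that fails mid-workflow.
    raise RuntimeError("API returned 503")


# Hypothetical in-memory stand-in for changes made in external systems.
applied: Set[str] = set()

demo_steps = [
    Step("add_user_to_group", lambda: applied.add("group"), lambda: applied.discard("group")),
    Step("assign_license", failing_call, lambda: None),
    Step("write_back_to_ticket", lambda: None, lambda: None),
]
demo_log = run_workflow(demo_steps)
```

In this run the first step succeeds, the second fails, and the first is rolled back, so `applied` ends up empty and the third step never executes. A vendor's real answer should show the equivalent log for your own systems, including actions that cannot be undone.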
2. AI architecture: classification vs. execution
This is the most important architectural question, and most vendors answer it badly. Ask explicitly: does your AI model directly execute actions on production systems?
The answer should be no. Large language models should be limited to interpretation: classifying the intent of a ticket and extracting entities (user, device, system). All actual execution should happen through deterministic, version-controlled runbooks that the AI cannot modify or deviate from. This is not a nice-to-have; it's the only architecture that is safe for enterprise production environments.
If a vendor tells you their AI "figures out the right fix and runs it," that's a red flag. Ask to see the execution logs. Ask what would happen if the LLM hallucinated a username.
Questions to ask:
- "Is your LLM isolated from execution? Walk me through exactly where the boundary is."
- "Can I see the runbook that would be executed for a password reset? Is it version-controlled?"
- "What pre-checks run before any action is taken?"
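The boundary described in this section can be sketched in a few lines of Python. This is a simplified illustration, not a reference design: the `Classification` type, the `KNOWN_USERS` directory, and the runbook functions are hypothetical names. The point it shows is that the model's output is treated as untrusted data, and only a fixed table of runbooks, guarded by pre-checks, can act.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class Classification:
    """The only thing the model is allowed to produce: an intent and entities."""
    intent: str
    username: str


# Hypothetical stand-in for a lookup against the identity provider.
KNOWN_USERS: Set[str] = {"alice", "bob"}


def reset_password(username: str) -> str:
    return f"password reset issued for {username}"


def unlock_account(username: str) -> str:
    return f"account unlocked for {username}"


# Fixed, reviewable mapping of intents to runbooks. The model cannot add to it.
RUNBOOKS: Dict[str, Callable[[str], str]] = {
    "password_reset": reset_password,
    "account_unlock": unlock_account,
}


def execute(classification: Classification) -> str:
    # Pre-check 1: the intent must map to an approved runbook.
    runbook = RUNBOOKS.get(classification.intent)
    if runbook is None:
        return "escalated: no runbook for intent"
    # Pre-check 2: the extracted user must actually exist.
    if classification.username not in KNOWN_USERS:
        return "escalated: unknown user"
    return runbook(classification.username)
```

With this shape, a hallucinated username such as `"alcie"` fails the second pre-check and the ticket goes to a human instead of triggering an action, and an intent with no runbook is escalated rather than improvised.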
3. Human-in-the-loop controls
Even if you want full automation eventually, you should start with human oversight. Every serious vendor offers this. What separates good implementations from bad ones is how the approval flow works.
Good: The system stages the complete proposed action (what will happen, to which system, with what API call), presents it to an approver in Slack or Teams with one-click approve/deny, and executes only after confirmation.
Bad: A vague "an action requires your approval" notification that sends you to a portal where you have to figure out what's actually being requested.
Questions to ask:
- "Show me exactly what an approver sees in Slack before clicking approve."
- "Can we configure different approval thresholds by ticket type?"
- "What happens if an approver doesn't respond within X minutes?"
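As a rough illustration of the "good" approval flow, the Python sketch below stages a complete proposed action, renders what an approver would see, and decides what to do with the response or the lack of one. Every name here (`ProposedAction`, `render_approval_message`, `resolve`) is hypothetical, the message is plain text rather than a real Slack or Teams payload, and escalating on timeout is one possible policy among several.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProposedAction:
    ticket_id: str
    summary: str        # what will happen
    target_system: str  # which system it touches
    api_call: str       # the exact call that will be made


def render_approval_message(action: ProposedAction) -> str:
    """Build the full text an approver sees before deciding."""
    return "\n".join([
        f"Ticket: {action.ticket_id}",
        f"Action: {action.summary}",
        f"System: {action.target_system}",
        f"API call: {action.api_call}",
        "Reply: approve / deny",
    ])


def resolve(decision: Optional[str], waited_minutes: float, timeout_minutes: float) -> str:
    """Return the next step for a staged action.

    An explicit decision always wins. With no decision, the action waits until
    the timeout and is then escalated. It is never executed by default.
    """
    if decision == "approve":
        return "execute"
    if decision == "deny":
        return "cancel"
    if waited_minutes >= timeout_minutes:
        return "escalate"
    return "wait"
```

The design choice worth probing with a vendor is the last branch: silence from an approver should lead to escalation or cancellation, not to execution.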
4. Security and data architecture
You're giving this platform access to your identity provider and endpoint management system. Security is non-negotiable.
The minimum bar is single-tenant data isolation, least-privilege API scoping, encryption at rest and in transit, and a SOC 2 Type II report (or a credible roadmap with controls already in place).
Ask specifically about credential storage. Your Okta API token and Intune service account credentials should be stored in an encrypted vault within your tenant, not on a shared platform.
Questions to ask:
- "Is our data isolated from other customers? Logically or physically?"
- "Where are our API credentials stored? Can you show me the architecture diagram?"
- "What's your SOC 2 status and timeline?"
- "Can I revoke your access to our systems at any time, instantly?"
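Least-privilege scoping is easy to check once a vendor tells you which scopes each runbook needs. The Python sketch below compares the scopes actually granted with the scopes required by the runbooks you have enabled. The scope strings and the `REQUIRED_SCOPES` table are invented for illustration and are not real Okta or Intune scope names; substitute the list the vendor gives you.

```python
from typing import Dict, Set

# Hypothetical scope names, for illustration only.
REQUIRED_SCOPES: Dict[str, Set[str]] = {
    "password_reset": {"users.credentials.write"},
    "mfa_reset": {"users.factors.write"},
    "account_unlock": {"users.lifecycle.write"},
}


def needed_scopes(enabled_runbooks: Set[str]) -> Set[str]:
    """Union of the scopes required by the enabled runbooks."""
    needed: Set[str] = set()
    for runbook in enabled_runbooks:
        needed |= REQUIRED_SCOPES[runbook]
    return needed


def excess_scopes(granted: Set[str], enabled_runbooks: Set[str]) -> Set[str]:
    """Scopes the integration holds but no enabled runbook needs."""
    return granted - needed_scopes(enabled_runbooks)


def missing_scopes(granted: Set[str], enabled_runbooks: Set[str]) -> Set[str]:
    """Scopes an enabled runbook needs but the integration lacks."""
    return needed_scopes(enabled_runbooks) - granted
```

Any non-empty result from `excess_scopes` is a question for the vendor: why does the integration hold access that none of your automations use?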
5. Time to value
This is where most enterprise platforms fail. "We'll have you up in 3–6 months" is a common answer. But for Tier-1 automation, there's no reason it should take that long.
Pre-built runbooks for your top ticket types (password reset, MFA re-enrollment, VPN profile push, account unlock) should be ready to configure on day one. Your deployment timeline should be measured in hours or days, not months.
Questions to ask:
- "How long until we can handle our first automated ticket?"
- "What do we need to build ourselves vs. what comes pre-built?"
- "What does your onboarding process look like in week one?"
6. Pricing model clarity
Automation pricing is often opaque. Watch out for: per-action pricing that punishes you for high ticket volume, "platform fees" that dwarf the automation value, and contracts that require you to replace your existing ITSM.
A fair pricing model is per-seat or per-endpoint: it is predictable, scales with your team, and doesn't create a perverse incentive to limit automation volume.
Questions to ask:
- "Is pricing per action, per seat, or per endpoint?"
- "What happens if we process 10x our expected ticket volume?"
- "Do we need to replace any existing tools?"
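The 10x question is simple arithmetic, and it is worth running before the pricing call. The Python sketch below uses made-up numbers (2,000 automated actions a month, $1.50 per action, 500 seats at $6.00) purely to show the shape of the comparison; replace them with the figures from the vendor's quote.

```python
def per_action_cost(actions_per_month: int, price_per_action: float) -> float:
    """Monthly cost when every automated action is billed."""
    return actions_per_month * price_per_action


def per_seat_cost(seats: int, price_per_seat: float) -> float:
    """Monthly cost when billing follows headcount, not volume."""
    return seats * price_per_seat


# Hypothetical figures, for illustration only.
baseline = per_action_cost(2_000, 1.50)   # expected volume
spike = per_action_cost(20_000, 1.50)     # 10x the expected volume
flat = per_seat_cost(500, 6.00)           # same at any volume
```

With these assumed prices, the two models cost the same at expected volume (3,000 a month), but at 10x volume the per-action bill rises to 30,000 while the per-seat bill stays at 3,000.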
The shortlist test
Before you put a vendor on your shortlist, ask them to run a live demo against a scenario from your actual environment. Not a canned demo. Not a slide deck. An actual automated resolution of a real ticket type you handle, connected to a test instance of your identity provider.
Vendors who can't do this in a 30-minute call are vendors who need more time in your environment than they're admitting.
The right platform shows you a working automation on your stack before you sign. Everything else is a promise.
