Short answer: The AI model is the cheap part. With an efficient model, a typical business conversation costs roughly a cent or less in API fees — the real cost of an AI agent in 2026 is integration, data preparation, and ongoing maintenance, which usually dwarf the token bill. Decide the use case first; the cost follows from it.
Key takeaways
- Token costs are tiny: Claude Haiku 4.5 is $1 / $5 per million input/output tokens and GPT-4o mini is $0.15 / $0.60, so a conversation costs cents, not dollars.
- Integration dominates: connecting the agent to your data, tools, and workflow is where most of the budget goes.
- Managed vs. custom is the real fork: off-the-shelf tools price per seat or per resolution; a custom agent trades higher build cost for lower per-use cost and full control.
- Measure cost per outcome: cost per resolved ticket or booked lead matters more than cost per token.
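The managed-versus-custom fork in the takeaways comes down to a break-even volume: how many uses it takes for a custom agent's lower per-use cost to repay its build cost. The sketch below is a rough illustration, not a pricing model. The function name and every dollar figure are hypothetical, and it deliberately ignores maintenance and seat-based pricing.

```python
import math

def break_even_volume(build_cost, custom_cost_per_use, managed_price_per_use):
    """Number of uses at which a custom agent's total cost matches a managed tool's.

    Returns None when the custom agent is not cheaper per use, because the
    build cost is then never recovered on per-use savings alone.
    """
    saving_per_use = managed_price_per_use - custom_cost_per_use
    if saving_per_use <= 0:
        return None
    return math.ceil(build_cost / saving_per_use)

# Hypothetical figures: $15,000 build, $0.05 per use for the custom agent,
# $1.00 per resolution for a managed tool.
print(break_even_volume(15_000, 0.05, 1.00))  # 15790
```

Below that volume the managed tool is cheaper overall; above it, the custom build starts to pay for itself.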
What you actually pay for
An AI agent's cost has three layers, and the headline "API price" is the smallest. Model usage is metered per token and, with caching and an efficient model, is genuinely cheap. The platform layer — hosting, vector storage for a document/RAG agent, monitoring — is modest and predictable. The layer that moves the budget is build and integration: wiring the agent into your CRM, knowledge base, or WhatsApp channel, plus the data cleanup that makes answers accurate.
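To make the three layers concrete, here is a minimal sketch that adds them into a monthly total and divides by outcomes such as resolved tickets. The class, field names, and all dollar amounts are illustrative assumptions, not figures from any real deployment.

```python
from dataclasses import dataclass

@dataclass
class MonthlyCosts:
    model_usage: float            # metered API fees
    platform: float               # hosting, vector storage, monitoring
    build_and_integration: float  # build cost spread per month, plus upkeep

    def total(self) -> float:
        return self.model_usage + self.platform + self.build_and_integration

def cost_per_outcome(costs: MonthlyCosts, outcomes: int) -> float:
    """Total monthly cost divided by outcomes (resolved tickets, booked leads)."""
    if outcomes <= 0:
        raise ValueError("outcomes must be positive")
    return costs.total() / outcomes

# Hypothetical month: $40 in API fees, $150 platform, $810 build and upkeep.
costs = MonthlyCosts(model_usage=40, platform=150, build_and_integration=810)
print(f"Model share of total: {costs.model_usage / costs.total():.0%}")
print(f"Cost per resolved ticket: ${cost_per_outcome(costs, 2000):.2f}")
```

With these placeholder numbers the model accounts for a few percent of the bill, which is the pattern the paragraph above describes.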
Model API pricing (mid-2026)
Verified list prices per million tokens. Prompt caching can cut the price of cached input by up to about 90%, depending on the provider, and batch mode cuts the price by about 50%:
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| GPT-4o mini | $0.15 | $0.60 |
| Claude Haiku 4.5 | $1.00 | $5.00 |
| GPT-4o | $2.50 | $10.00 |
| Claude Sonnet 4.6 | $3.00 | $15.00 |
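
As a rough check on the "cents per conversation" claim, the sketch below turns the table's list prices into an estimated cost for one conversation. The token counts (6,000 input, 1,000 output) are an assumed example, the function and parameter names are hypothetical, and the 90% cache discount is the approximate figure quoted above rather than a guaranteed rate for every model.

```python
# List prices from the table above, in USD per million tokens: (input, output).
PRICES = {
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}

def conversation_cost(model, input_tokens, output_tokens,
                      cached_fraction=0.0, cache_discount=0.90):
    """Estimated API cost in USD for one conversation.

    cached_fraction: share of input tokens served from the prompt cache (assumed).
    cache_discount: price reduction applied to those cached tokens (assumed ~90%).
    """
    in_price, out_price = PRICES[model]
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    billable_input = fresh + cached * (1 - cache_discount)
    return (billable_input * in_price + output_tokens * out_price) / 1_000_000

# Assumed conversation size: 6,000 input tokens and 1,000 output tokens.
for model in PRICES:
    print(f"{model}: ${conversation_cost(model, 6000, 1000):.4f}")
```

Under these assumed token counts the estimate is about $0.0015 for GPT-4o mini and about $0.011 for Claude Haiku 4.5. Setting `cached_fraction=0.5` lowers the Haiku figure to roughly $0.0083.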