Flat-Rate vs Per-Execution Pricing for AI Platforms
Mentiko Team
Every AI agent platform has to answer one question before anything else: how do we charge for this? The answer they choose determines whether your costs stay flat or blow up the moment your product gains traction.
We've done the math. Here's what the pricing landscape actually looks like, and why we chose the model we did.
How most platforms charge
The industry has converged on a handful of models, all variations on the same theme: the more you use the platform, the more you pay.
- Per execution. You pay for every agent run, chain invocation, or workflow trigger. CrewAI Enterprise charges roughly $0.50 per run. LangSmith charges per trace. This is the most common model.
- Per seat. Traditional SaaS pricing bolted onto AI tooling. Works fine for dashboards, breaks down when your agents run autonomously at 3 AM with no humans involved.
- Per token. Some platforms meter your LLM usage and bill the token pass-through at a 2-5x markup. You're paying twice -- once to the platform, once to the model provider -- for the same inference call.
- Hybrid tiers. A monthly base fee plus overage charges once you exceed a run cap. This feels predictable until you actually exceed the cap.
All of these models share one trait: your costs scale with your success. The more value you extract from the platform, the more it costs you.
The math at scale
Let's put real numbers on this. We'll compare a per-execution model at $0.50/run against Mentiko's flat $29/month.
| Monthly runs | Per-execution cost | Mentiko cost | Per-execution multiplier |
|---|---|---|---|
| 100 | $50 | $29 | 1.7x more |
| 500 | $250 | $29 | 8.6x more |
| 1,000 | $500 | $29 | 17x more |
| 5,000 | $2,500 | $29 | 86x more |
| 10,000 | $5,000 | $29 | 172x more |
At 100 runs per month, per-execution is already more expensive. At 10,000 runs, you're paying 172x what you'd pay on a flat rate. And 10,000 runs isn't some hypothetical enterprise number. If you have a content pipeline that runs 4 chains per article and you publish 80 articles a month, that's 320 runs for one workflow. Add monitoring agents, QA chains, and scheduled maintenance tasks, and you're at thousands of runs before you've done anything unusual.
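The table's arithmetic is easy to verify yourself. A minimal sketch, using the article's figures ($0.50/run per-execution, $29/month flat):

```python
# Reproduce the cost comparison above. The $0.50/run and $29/month
# figures come from the article; everything else is plain arithmetic.
PER_RUN = 0.50
FLAT_MONTHLY = 29.0

for runs in (100, 500, 1_000, 5_000, 10_000):
    per_exec = runs * PER_RUN
    multiplier = per_exec / FLAT_MONTHLY
    print(f"{runs:>6} runs: ${per_exec:>8,.2f} vs ${FLAT_MONTHLY:.0f} flat "
          f"({multiplier:.1f}x more)")
```

The multiplier grows linearly with run count, which is the whole point: per-execution cost has no ceiling.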
The per-execution model doesn't just get expensive at scale -- it makes scale irrational. Every new automation you build increases your operating cost linearly. At some point, the math forces you to choose between automating more or keeping your margins intact.
Why per-execution pricing exists
Per-execution isn't a scam. It's a deliberate business model that optimizes for the vendor, not the user.
Low barrier to entry. Paying $0.50 for a test run feels cheap. You sign up, run a few chains, and the bill is $3. That initial experience is frictionless. The cost is invisible until you depend on the platform.
Revenue scales with adoption. From the vendor's perspective, per-execution is the dream. Every new user, every new workflow, every new cron schedule generates incremental revenue without the vendor lifting a finger. Usage-based pricing turns your growth into their growth.
It punishes experimentation. This is the part that doesn't show up in pitch decks. When every run costs money, you stop testing edge cases. You don't run that chain 50 times to fine-tune the prompt. You don't build speculative automations to see if they work. The pricing model creates a tax on iteration, and iteration is the entire point of building with AI agents.
Why flat-rate works for agent orchestration
The reason flat-rate pricing is viable for agent platforms is technical, not ideological. Agent orchestration is overwhelmingly I/O bound, not compute bound.
Here's what actually happens when an agent chain runs:
- The orchestrator reads the chain definition (microseconds)
- It makes an HTTP request to an LLM API (2-30 seconds of waiting)
- It parses the response and routes to the next agent (microseconds)
- Repeat steps 2-3 for each agent in the chain
The platform spends 99%+ of each run waiting on network I/O. The CPU cost of orchestrating a chain is negligible. The expensive part -- the LLM inference -- happens on Anthropic's or OpenAI's servers, and you pay for that directly through your own API keys.
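The loop above can be sketched in a few lines. This is an illustrative model, not Mentiko's actual implementation -- `call_llm`, the chain format, and the timings are stand-ins -- but it shows where the time goes:

```python
# Illustrative orchestration loop: almost all wall-clock time is spent
# awaiting network I/O, not doing compute. Names and timings are
# hypothetical, not any platform's real API.
import asyncio
import time

async def call_llm(prompt: str) -> str:
    # Stand-in for the HTTP request to an LLM API. The orchestrator
    # just waits here, consuming effectively no CPU.
    await asyncio.sleep(0.1)  # real calls: 2-30 seconds
    return f"response to: {prompt[:20]}"

async def run_chain(chain: list[str], user_input: str) -> str:
    text = user_input                       # step 1: read chain definition (microseconds)
    for agent_prompt in chain:
        text = await call_llm(f"{agent_prompt}\n{text}")  # step 2: wait on I/O
        # step 3: parse response, route to next agent (microseconds), repeat
    return text

start = time.perf_counter()
result = asyncio.run(run_chain(["summarize", "critique"], "draft text"))
print(f"chain finished in {time.perf_counter() - start:.2f}s")
```

Nearly all of the elapsed time sits inside the `await` -- which is why a single cheap machine can orchestrate many concurrent chains.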
This is why per-execution pricing for orchestration is a markup on almost nothing. The platform isn't doing $0.50 worth of compute per run. It's doing fractions of a cent worth of compute and charging you $0.50 for the privilege of triggering it.
Flat-rate pricing reflects the actual cost structure. It's honest.
And the downstream effects on how you build are significant:
- Free experimentation. Run your chain 200 times while you tune your prompts. It costs you the LLM tokens and nothing else on the platform side.
- Predictable budgets. Your orchestration cost is a fixed line item. The only variable is your LLM spend, which you control through model selection and prompt engineering.
- No automation tax. Build ten new workflows on a Tuesday because you had an idea. There's no incremental platform cost for automating more.
The Mentiko model
Here's exactly how our pricing works:
$29/month gets you a dedicated, isolated instance. Not a shared tenant on a multi-tenant cluster -- an actual machine provisioned for you. On that instance, you get:
- Unlimited chains, runs, agents, and schedules
- Visual chain builder and raw JSON editing
- Event-driven orchestration with cron scheduling
- Error handling with exponential backoff and alerting
- Full run logs and execution history
You bring your own LLM API keys. Your keys are stored on your instance and never transit our servers. Your model costs are between you and your provider. We don't mark up tokens, we don't proxy your calls through our infrastructure, and we don't see your prompts.
The only thing that scales your cost is compute tier. If you need more RAM or CPU for your instance -- maybe you're running dozens of concurrent chains with heavy tool use -- you upgrade your compute tier. That's it. There are no usage meters, no overage charges, no per-seat fees.
This model works because we're not trying to capture a percentage of your usage. We're selling you infrastructure, and infrastructure has a fixed cost.
When flat-rate is the wrong choice
We're not going to pretend flat-rate is optimal for everyone.
If you run 5 chains a month to generate a weekly report, per-execution pricing at $0.50/run costs you $2.50. That's cheaper than $29. If your usage is genuinely low and predictable, a usage-based platform might be the better deal.
Flat-rate wins when you actually use the platform. It wins when you're building multiple workflows, iterating on prompts, running scheduled jobs, and treating agent orchestration as core infrastructure rather than an occasional convenience.
The break-even point is 58 runs per month ($29 / $0.50 per run). If you're above that -- and anyone building production agent workflows will be far above that -- flat-rate saves you money every single month, and the gap widens the more you build.
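The break-even arithmetic takes two lines to check, using the article's own figures:

```python
# Break-even between a $29/month flat rate and $0.50/run pricing.
# Below this run count, per-execution is cheaper; above it, flat-rate wins.
FLAT_MONTHLY = 29.0
PER_RUN = 0.50

break_even_runs = FLAT_MONTHLY / PER_RUN
print(break_even_runs)  # 58.0
```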
The bottom line
Pricing models aren't just billing details. They shape how you build. Per-execution pricing makes you cautious. It adds friction to experimentation. It turns every new automation into a cost center.
Flat-rate pricing gets out of your way. Build more. Test more. Automate more. The platform cost stays the same.
That's the model we built Mentiko on, and we think it's the only one that makes sense for teams that are serious about agent orchestration.
See our pricing or join the waitlist to get your own instance.