WhatsApp Cloud API wrapper limits are a three-layer stack, not a single number
Most guides quote the 80 MPS default and stop. The actual ceiling is Meta's throughput cap, conversation tier, and per-user marketing cap, then the BSP wrapper's sandbox throttle and per-message markup, then country policy such as the US (+1) marketing pause. They compound. Here is the full stack with literal numbers, plus what a non-API path looks like when none of them apply.
Direct answer (verified 2026-05-20)
Three layers stack on top of each other. Meta caps throughput at 80 MPS per number (upgradable to 1,000 MPS if you qualify) and tiers business-initiated conversations at 250 → 1k → 10k → 100k → unlimited unique users per rolling 24 hours. BSP wrappers layer their own restrictions on top: the Twilio sandbox caps at 1 message every 3 seconds, 360dialog charges a $49+/mo channel fee on top of Meta pass-through, MessageBird adds about $0.005 markup per message, and Vonage charges a 10 to 20 percent markup. The third layer is country policy: marketing templates to US (+1) numbers have not been delivered since April 1, 2025. Separately, any reply outside the 24-hour customer service window must be a template, or the send is rejected (Twilio reports this as error 63016).
Sources: Meta messaging limits, Twilio WhatsApp sandbox, 360dialog pricing, Vonage WhatsApp pricing.
The three layers, in order
A send that fails could be hitting any one of these. The visible error usually points at the topmost layer (your BSP's 429), but the binding constraint is often two layers down. Knowing which layer is binding is the difference between waiting for tier promotion, renegotiating with the BSP, switching template categories, or changing the country mix of the campaign.
Layer 1. Meta's floor (the Cloud API itself)
These limits ship with the API. Every wrapper inherits them; nobody removes them.
- 80 messages per second per phone number, with the request rate capped at about 10 requests per second (each request can bundle multiple messages). The upgrade path to 1,000 MPS requires sending to 100k+ unique users within a 24h window outside the service window, holding the unlimited messaging tier, and holding a yellow-or-green quality rating.
- Tiered conversation cap: business-initiated conversations to unique users in a rolling 24h window are gated at 250, then 1k, 10k, 100k, then unlimited, contingent on quality and business verification.
- Per-user marketing template caps: a fixed maximum of marketing template sends per user across all businesses, enforced by Meta and not visible from your own dashboard.
- 24-hour customer service window: once a user messages you, free-form replies are allowed for 24 hours; after that you can only send approved templates. Sending anything else after the window closes fails: Twilio reports it as error 63016, and Meta's Cloud API reports it as error 131047.
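One common way to stay under the per-number throughput cap on the client side is a token bucket. The following is a minimal sketch, not part of any Meta or BSP SDK: `TokenBucket`, `try_acquire`, and the injectable `clock` parameter are illustrative names, and the 80 MPS figure is simply the default quoted in the list above.

```python
import time


class TokenBucket:
    """Client-side pacer: at most `rate` sends per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.tokens = capacity      # start full
        self.clock = clock          # injectable so the pacer can be tested
        self.last = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


META_DEFAULT_MPS = 80  # default per-number cap quoted above

# One bucket per phone number, since the cap is per number.
pacer = TokenBucket(rate=META_DEFAULT_MPS, capacity=META_DEFAULT_MPS)
```

A sender would call `pacer.try_acquire()` before each send and sleep briefly or queue the message when it returns `False`. This only smooths your own output; it does nothing about the tier, per-user, or service-window limits.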
Layer 2. The BSP wrapper sub-caps and per-message markups
Every wrapper adds something. Some add a slower sandbox. Some add a fixed monthly fee. Some take a percentage of every message.
- Twilio sandbox: capped at 1 message every 3 seconds (about 0.33 MPS, roughly 240x slower than Meta's default 80 MPS). Sandbox sessions expire 3 days after a user joins and the user has to rejoin by re-sending the opt-in code.
- 360dialog: pass-through on Meta's per-message conversation fee, but a recurring channel fee starting at about $49/mo per number (premium plans around $99/mo) plus standard Meta charges.
- MessageBird (Bird): adds about $0.005 markup per session message and $0.005 per template message on top of Meta's conversation charge.
- Vonage: platform fee per message starting around $0.00015, with reported aggregate markup of 10 to 20 percent depending on contract terms.
- Template approval queue: most templates clear in minutes, but the documented ceiling is about 48 hours per template. Bulk submissions, multi-language variants, and rejected/resubmitted templates compound this latency before any message can be sent.
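To compare how the fee structures above behave at a given volume, a rough cost model helps. This is a sketch under stated assumptions: `BspPricing` and `monthly_cost` are hypothetical names, the per-message Meta fee is a placeholder you must replace with the real rate for your category and country, and the Vonage entry uses 15 percent as the midpoint of the reported 10 to 20 percent range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BspPricing:
    monthly_fee: float = 0.0       # fixed channel fee, USD per month
    per_message_fee: float = 0.0   # flat markup, USD per message
    percent_markup: float = 0.0    # fraction of Meta's fee, e.g. 0.15 for 15%


def monthly_cost(messages: int, meta_fee: float, bsp: BspPricing) -> float:
    """Total monthly spend: Meta's fees plus whatever the BSP adds on top."""
    meta_total = messages * meta_fee
    return (
        meta_total * (1 + bsp.percent_markup)
        + messages * bsp.per_message_fee
        + bsp.monthly_fee
    )


# Figures taken from the list above; all approximate.
PRICING = {
    "360dialog": BspPricing(monthly_fee=49.0),
    "MessageBird": BspPricing(per_message_fee=0.005),
    "Vonage": BspPricing(percent_markup=0.15),  # assumed midpoint of 10-20%
}

ASSUMED_META_FEE = 0.02  # hypothetical USD per message, for illustration only

for name, pricing in PRICING.items():
    print(name, round(monthly_cost(10_000, ASSUMED_META_FEE, pricing), 2))
```

With these assumed inputs the flat monthly fee and the per-message markup land close together at 10,000 messages; the fixed fee dominates at low volume and the per-message or percentage markup dominates at high volume.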
Layer 3. Policy caps Meta layers on top of everything
Country bans and category gates can void a campaign even when both layers above are green.
- US marketing pause: marketing template messages to WhatsApp users with US (+1) numbers have not been delivered since April 1, 2025. The BSP will accept the marketing send and silently drop it. Utility and authentication templates still work, as do free-form service replies inside the 24-hour window.
- Quality rating gate: a red rating freezes tier advancement until rating recovers, even if the rest of the account is healthy.
- Opt-in proof requirement: marketing templates need documented opt-in for each user; some BSPs require uploading the opt-in flow URL during template review.
- Business verification: tier upgrades and unlimited messaging require Meta business verification, which depends on legal documents and can take days to weeks.
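Because a policy-blocked send can look like a success, it can be worth checking the policy layer before calling the BSP at all. The sketch below is illustrative only: `policy_block_reason` and its reason strings are hypothetical, and the `+1` prefix check is a deliberate simplification, since +1 is shared with Canada and other North American Numbering Plan countries and a real check would need area-code data.

```python
from typing import Optional


def policy_block_reason(to_e164: str, category: str, has_opt_in: bool) -> Optional[str]:
    """Return a reason if Layer 3 policy would void this send, else None.

    `category` is the template category: "marketing", "utility",
    or "authentication".
    """
    if category != "marketing":
        return None  # the pause and opt-in rule described above target marketing
    if to_e164.startswith("+1"):
        # Over-blocks: +1 also covers non-US NANP numbers (assumption noted above).
        return "us_marketing_pause"
    if not has_opt_in:
        return "missing_opt_in"
    return None


recipients = [
    ("+14155550100", True),
    ("+442071234567", True),
    ("+4930123456", False),
]
for number, opted_in in recipients:
    print(number, policy_block_reason(number, "marketing", opted_in))
```

Filtering the list this way keeps undeliverable sends out of your delivery metrics instead of letting them be accepted and dropped.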
What happens to one send as it passes through the stack
The steps below walk through every gate that a single business-initiated marketing template hits. Each layer can deny it independently. The documentation for each layer lives in a different place, and the layers do not share an error vocabulary.
1. Throughput cap (Meta)
Your sender process produces messages at, say, 200 MPS. Cloud API rate-limits to 80 MPS per number. Excess gets 429s or buffered server-side. If you have not been upgraded to 1,000 MPS you stay at 80.
2. Wrapper sub-cap (BSP)
If you are in the Twilio sandbox, the effective cap drops to 0.33 MPS regardless of Meta's 80. If you are on a small 360dialog plan, you also pay the channel fee that month even at zero throughput.
3. Per-24h conversation tier (Meta)
Even if throughput is fine, you cannot start a business-initiated conversation with a unique user beyond your tier (250, 1k, 10k, 100k, unlimited). The 24h window is rolling. Hitting the tier returns a tier-exceeded error per recipient.
4. Per-user marketing cap (Meta)
Individual users can only receive a fixed number of marketing templates per period, summed across all businesses. Your campaign may pass tier and throughput but get filtered per-recipient with no visible counter.
5. Country / category gate (Meta policy)
Marketing template to a +1 number? Not delivered since April 1, 2025. Your BSP returns success, but the message never lands. Utility and authentication templates are exempt; marketing is not.
6. 24-hour customer service window (Meta)
Any free-form reply sent more than 24 hours after the user's last message is rejected (error 63016 on Twilio), regardless of throughput or tier. Templates are the only path out.
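The six gates above can be sketched as an ordered check that reports the first layer likely to deny a send. This is a diagnostic sketch, not any vendor's API: `Send`, `first_denying_gate`, and every field name are hypothetical, and gate 4 is left as a comment because the per-user marketing cap has no counter you can read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Send:
    category: str                       # "marketing", "utility", "authentication", "freeform"
    to_e164: str
    offered_mps: float                  # rate your sender process produces
    meta_mps: float = 80.0              # 1000.0 once upgraded
    bsp_mps: Optional[float] = None     # e.g. 1/3 in the Twilio sandbox; None = no sub-cap
    tier_limit: Optional[int] = 250     # None = unlimited tier
    unique_users_24h: int = 0           # business-initiated conversations already opened
    hours_since_user_msg: Optional[float] = None  # None = user has never written


def first_denying_gate(s: Send) -> Optional[str]:
    """Walk the gates in document order; return the first that denies, else None."""
    in_window = s.hours_since_user_msg is not None and s.hours_since_user_msg < 24
    if s.offered_mps > s.meta_mps:
        return "1. throughput cap (Meta)"
    if s.bsp_mps is not None and s.offered_mps > s.bsp_mps:
        return "2. wrapper sub-cap (BSP)"
    # The tier only counts conversations the business initiates.
    if not in_window and s.tier_limit is not None and s.unique_users_24h >= s.tier_limit:
        return "3. conversation tier (Meta)"
    # Gate 4 (per-user marketing cap) has no visible counter, so it cannot be checked here.
    if s.category == "marketing" and s.to_e164.startswith("+1"):
        return "5. country/category gate (Meta policy)"
    if s.category == "freeform" and not in_window:
        return "6. 24-hour customer service window (Meta)"
    return None


print(first_denying_gate(Send("marketing", "+4930123456", offered_mps=200)))
print(first_denying_gate(Send("marketing", "+14155550100", offered_mps=10)))
```

Returning only the first denial mirrors how a real failure surfaces: you fix one layer, resend, and find the next one.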
“Cloud API limit is 80 messages per second, while the On-premise API is 13 messages per second.”
Documented Meta default throughput cap, per public BSP and academy references
Cloud API + wrapper stack vs. driving WhatsApp Desktop on macOS
The alternative is not a faster wrapper; it is a different surface entirely. whatsapp-mcp-macos does not call the Cloud API. It attaches to the WhatsApp Desktop process via macOS accessibility (AXUIElement), walks the UI tree, posts CGEvent clicks, and pastes text into the compose box. The Cloud API limits stop applying because nothing in this path is a Cloud API client.
| Feature | WhatsApp Cloud API via BSP wrapper | WhatsApp MCP on macOS |
|---|---|---|
| Default throughput | Cloud API: 80 MPS per number (upgradable to 1,000 MPS once you qualify). | macOS Desktop via accessibility: ~120ms per full AX tree traversal, single-process. Not a thousand-MPS path; designed for one-at-a-time agent workflows. |
| Slowest layer in the wrapper stack | Twilio sandbox: 1 message every 3 seconds. Production-grade Cloud API access tier still starts at 80 MPS and ramps via Meta-graded quality scoring. | There is no sandbox. Send rate is bounded by paste/return latency in WhatsApp Catalyst, roughly 1 message every 1-2 seconds, the same as a human typing. |
| Per-message cost | Meta conversation fee (varies by category and country) plus BSP markup. MessageBird publishes about $0.005 per message; Vonage about 10-20% markup; 360dialog passes through Meta + $49/mo channel fee. | Zero per-message cost. The MCP server runs locally; the only outbound traffic is the WhatsApp Desktop client itself, which uses your existing personal WhatsApp account. |
| Template approval | Required for all business-initiated messages outside the 24h service window. Approval ceiling around 48h per template; bulk submissions and rejections compound. | No templates. Outbound text is just typed into the compose box. Subject to WhatsApp's normal anti-spam behaviour on a personal account. |
| Country-level blocks | Marketing templates to US (+1) numbers undelivered since 2025-04-01. Other regions have category gates and opt-in proof requirements. | Sends are 1:1 messages from your account to a contact, like any human-driven send. Not subject to marketing template gating. |
| Opt-in proof | Mandatory for marketing templates; BSP review pipelines require documenting the opt-in flow. | Not applicable. You can only message your own contacts and recent chats from your own WhatsApp account. |
| Where the limits live | Documented across Meta + BSP docs + per-country policy notices, often in different places and not stacked together. | Documented in the Swift source of whatsapp-mcp-macos. The constraint surface is finite: ~15-level AX depth, ~120ms per tree walk, 350ms paste settle window, and macOS-only. |
Limits that do not exist on the desktop-app path
- No 80 MPS cap (the cap is human-paced typing, ~1 message per 1-2 seconds, single process)
- No 250/1k/10k/100k tier on unique users per 24h
- No per-user marketing template cap (no templates at all)
- No 24h customer service window
- No country-level marketing pause (no marketing template concept)
- No BSP markup (no BSP)
- No template approval queue (no templates)
- No opt-in flow review (you message your own contacts)
Limits that do exist on the desktop-app path
- macOS only (uses AXUIElement, a macOS framework)
- Requires WhatsApp Desktop running and Accessibility permission granted
- One process per account (you cannot horizontally scale the way a Cloud API number scales)
- Subject to WhatsApp's normal anti-spam heuristics on your personal account (sending to thousands of strangers will get flagged the same way a real user would)
- Not a replacement for high-volume opted-in template broadcasts; that is what the Cloud API exists for
The desktop-app path is the right tool for AI agents that need to act on your behalf across your existing chats.
Which limit binds for which use case
- Sandbox prototype: Twilio's sandbox at 1 msg per 3 seconds is the binding limit. Meta tier and country policy are irrelevant because you never reach them.
- Single-tenant SaaS sending a few thousand transactional alerts per day: Meta's 80 MPS is plenty, the BSP markup is the most expensive layer, and the 24h service window is the gate you keep hitting on replies.
- Marketing broadcast to global list: Meta's per-user marketing cap and the US (+1) pause bind first; then 24h conversation tier; throughput last. Switching BSPs does not move any of these.
- AI agent reading + replying for one person: none of the Cloud API limits apply when the agent drives WhatsApp Desktop directly. The binding constraints are the single WhatsApp Desktop process and the macOS-only requirement.
- Customer support handling inbound at scale: the 24h service window is the dominant constraint; throughput rarely matters because inbound paces outbound. Tier and policy apply only when initiating new conversations.
Frequently asked questions
Why are Cloud API limits and BSP wrapper limits different?
Meta publishes the floor: throughput per number (80 MPS default), the rolling-24h tier for business-initiated conversations, per-user marketing caps, and country policy. Each BSP layers something on top: Twilio runs a slower sandbox (1 msg per 3 seconds) and a separate production tier; 360dialog adds a recurring channel fee; MessageBird and Vonage take a per-message markup. Some BSPs add their own template review pipeline before submitting to Meta. So the BSP cannot remove Meta's floor, but it can add caps and costs on top.
What is the actual Cloud API throughput cap, and how do I get above it?
80 messages per second per phone number is the documented default for the Cloud API. The request rate is around 10 requests per second; each request can bundle multiple messages. The upgrade to 1,000 MPS requires sending to 100,000+ unique users within a 24h window outside the customer service window, holding an unlimited messaging tier, and maintaining a yellow or green quality rating. There is no public path to higher throughput without meeting those conditions.
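The three upgrade conditions in this answer are conjunctive, which a small predicate makes explicit. This is a sketch of the rule as stated here, not Meta's actual evaluation logic: `eligible_for_1000_mps` and its parameter names are hypothetical, and the quality rating is assumed to be one of the strings "green", "yellow", or "red".

```python
def eligible_for_1000_mps(unique_users_24h: int, tier_unlimited: bool, quality: str) -> bool:
    """True only when all three conditions quoted above hold at once."""
    return (
        unique_users_24h >= 100_000          # business-initiated reach in a 24h window
        and tier_unlimited                   # account holds the unlimited messaging tier
        and quality in {"green", "yellow"}   # a red rating disqualifies
    )


print(eligible_for_1000_mps(120_000, True, "yellow"))
print(eligible_for_1000_mps(120_000, True, "red"))
```

Missing any one condition leaves the number at the 80 MPS default, which is why high volume alone does not trigger the upgrade.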
What does the US (+1) marketing pause actually do?
Since April 1, 2025, marketing template messages sent to WhatsApp users with US phone numbers have not been delivered. The send call to your BSP returns success and you get a normal delivery callback chain, but the recipient does not see the message. Utility templates (account updates, transaction confirmations) and authentication templates (OTP codes) still work. The pause applies to the marketing category only.
Is the Twilio sandbox cap the same as the Twilio production cap?
No. The sandbox is hard-capped at 1 message every 3 seconds and sandbox sessions expire 3 days after a user joins. The production tier (after Twilio onboards your verified WhatsApp Business Account) inherits Meta's 80 MPS default. The sandbox is a developer sandbox; it is not representative of production throughput.
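The gap between the two caps is easiest to see as the time needed to drain a fixed queue. The arithmetic below uses only the rates quoted in this article; `drain_seconds` is an illustrative helper, and it ignores retries, tier limits, and network latency.

```python
def drain_seconds(messages: int, mps: float) -> float:
    """Ideal time to send `messages` at a steady `mps` messages per second."""
    return messages / mps


SANDBOX_MPS = 1 / 3    # Twilio sandbox: 1 message every 3 seconds
DEFAULT_MPS = 80       # Meta Cloud API default per number
UPGRADED_MPS = 1000    # after the throughput upgrade

queue = 10_000
for label, mps in [("sandbox", SANDBOX_MPS), ("default", DEFAULT_MPS), ("upgraded", UPGRADED_MPS)]:
    seconds = drain_seconds(queue, mps)
    print(f"{label}: {seconds:.0f} s ({seconds / 3600:.2f} h)")
```

For 10,000 messages that works out to about 30,000 seconds (over 8 hours) in the sandbox, 125 seconds at the default, and 10 seconds after the upgrade.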
Can I send messages without going through any BSP wrapper?
Two paths. First, Meta's own Cloud API (no BSP) which exposes the raw 80 MPS / tiered conversation surface and the template approval pipeline; you still pay Meta's conversation fees. Second, the desktop-app path: drive WhatsApp Desktop with macOS accessibility from a local MCP server. The second path has none of the throughput, tier, template, or markup ceilings, but is bounded by what a single WhatsApp account can plausibly do without getting flagged for spam.
How does the 24-hour customer service window interact with the other limits?
It is orthogonal. Throughput, tier, and per-user marketing caps gate sends initiated by you. The 24h window gates whether your reply can be free-form or has to be a template. Even if your account has unlimited tier and 1,000 MPS throughput, a reply 25 hours after the user's last message must be a template, or the send is rejected (Twilio returns error 63016).
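The window rule reduces to one timestamp comparison per conversation. A minimal sketch, assuming you store the time of each user's last inbound message yourself; `allowed_message_kind` and `SERVICE_WINDOW` are hypothetical names, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SERVICE_WINDOW = timedelta(hours=24)


def allowed_message_kind(last_user_message_at: Optional[datetime], now: datetime) -> str:
    """Return "freeform" inside the 24h window, otherwise "template"."""
    if last_user_message_at is not None and now - last_user_message_at < SERVICE_WINDOW:
        return "freeform"
    # Window closed, or the user has never written: only a template can go out.
    return "template"


last_inbound = datetime(2025, 6, 1, 9, 0, tzinfo=timezone.utc)
print(allowed_message_kind(last_inbound, last_inbound + timedelta(hours=23)))
print(allowed_message_kind(last_inbound, last_inbound + timedelta(hours=25)))
```

Checking this before every reply lets the sender switch to a template up front instead of discovering the closed window through a rejected send.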
Where does the desktop-app path break down?
It is macOS only and binds to a single WhatsApp Desktop process. It cannot scale to 1,000 messages per second the way an upgraded Cloud API number can; it sends at human pace. It is not a substitute for broadcast marketing templates that need to land at scale. It is the right tool when an AI agent needs to act on your behalf across your existing chats: triaging inbound, replying with context, searching contact history, sending one-off outbounds.
Does whatsapp-mcp-macos count as a 'wrapper'?
No, and that is the point. A wrapper is software that sits between you and the Cloud API. whatsapp-mcp-macos does not call the Cloud API at all. It drives the genuine WhatsApp Desktop client through macOS accessibility (AXUIElement). The client connects to WhatsApp over the same path your phone does; the MCP server only controls the local UI. There is no Meta developer account, no template review, no BSP, and no per-message fee.
Hitting one of the layers and not sure which?
15 minutes to walk through the stack against your specific send pattern and figure out whether the answer is a tier upgrade, a BSP switch, a template restructure, or a desktop-app path.