The retry primitive was correct. The callers weren’t.

Last week I wrote about how 429s turn into retry storms.
This week I started scanning real repos.
Same pattern.
Over and over.

So I built a small CLI to flag it:

npx pitstop-check ./src

It scans TypeScript/JavaScript and flags:

429 handled without Retry-After
blanket retry of all 429s (no CAP vs WAIT distinction)
unbounded retry loops

Example: OpenClaw
https://github.com/openclaw/openclaw/issues/50866

[WARN] src/agents/venice-models.ts:24 --- 429 handled without Retry-After
[WARN] src/agents/venice-models.ts:24 --- All 429s treated as retryable (CAP vs WAIT not distinguished)

pitstop-check found 2 issues

What's happening:

API returns Retry-After: 600
client ignores it
retries on its own schedule
multiple agents retry independently
request rate increases under throttling → sustained failure

What surprised me

The retry primitive is often correct.
The call sites aren't.

In OpenClaw:

retry.ts supports retryAfterMs
it's implemented correctly
it's tested

But:

venice-models.ts
compaction.ts
memory-tool.ts

all call retryAsync without passing it.
The correct behavior exists in the codebase—It's just not wired up.

The pattern

Most systems collapse three distinct cases:

WAIT → respect Retry-After
CAP → bound retries / reduce load
STOP → fail fast

into:

retry()

That's how you get retry storms.

Repo: https://github.com/SirBrenton/pitstop-check
If you run this on your code and it flags something interesting, I'd genuinely like to see it.

The retry primitive was correct. The callers weren’t.

What's happening:

What surprised me

The pattern

Comments

More from this blog

Status Decays. Artifacts Compound.

AI + API Execution Needs a Contract

How a 429 Turns Into a Retry Storm

Command Palette

What's happening:

What surprised me

The pattern

Comments

More from this blog