The n8n Wait node looks harmless. You drop it in, pick “resume after 30 minutes” or “resume on webhook call,” and move on. It runs green in your test. Then weeks later a customer asks why they never got the follow-up, the second reminder, the post-delay sync — and you discover a pile of executions that paused and never woke up. No error. No alert. Nothing in the error log. Just silence.
I consult exclusively on n8n, and the Wait node is one of the quietest sources of production incidents I find. It’s dangerous precisely because it doesn’t fail loudly — it fails by not happening, which the default setup gives you no way to notice. This post is the systematic version: the five ways the Wait node silently fails, and the concrete fix for each. It’s the same failure class I cover in the broader common production mistakes writeup.
First, understand how Wait actually works
When execution hits a Wait node, n8n persists the execution to the database and unloads it from the running process. (The exception is very short waits: n8n’s documentation notes that waits under 65 seconds are held in-process rather than offloaded.) The execution goes into a waiting state. Later, at the resume time or when the resume webhook is called, n8n loads that saved execution back from the database and continues it from where it paused.
That single fact is the root of every silent failure below: a waiting execution is just a row in your database that something has to come back and resume. If that row is deleted, or the thing meant to resume it never does, or the instance isn’t running at the moment it should — the execution doesn’t error. It just never finishes. And because it never finishes, it never reaches your Error Trigger, your Slack alert, or anything else you built to catch failures.
Silent failure #1: the execution timeout kills the wait
If you’ve set EXECUTIONS_TIMEOUT (or you’re on a plan with a max execution duration), a long Wait can be terminated by the timeout while it’s waiting. The clock is on the whole execution, not just the active steps. A workflow that waits 2 hours under a 1-hour timeout never makes it to the other side.
The fix: for any workflow that waits longer than a few minutes, either raise/disable the timeout for that workflow, or — better — don’t hold the process at all. For waits measured in hours or days, use the re-trigger pattern (#3 below) instead of an in-process Wait. Know your platform’s hard limits before you design around a long wait.
Silent failure #2: pruning deletes the waiting execution
This is the one that surprises people most. n8n prunes old execution data to keep the database lean — controlled by EXECUTIONS_DATA_PRUNE, EXECUTIONS_DATA_MAX_AGE (default 336 hours / 14 days), and EXECUTIONS_DATA_PRUNE_MAX_COUNT. If a Wait node is set to resume after the pruning window — say a 21-day “trial ending” reminder while max age is 14 days — the waiting execution can be deleted before it ever resumes. It’s gone. There is nothing left to wake up.
The fix: make your longest wait comfortably shorter than EXECUTIONS_DATA_MAX_AGE, or raise the max age past your longest wait, and check that the max-count prune isn’t evicting waiting rows under volume. For anything longer than the prune window, don’t rely on a held execution: persist the due time externally (a Postgres table, a Google Sheet, or any other datastore) and re-enter with a schedule.
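Failures #1 and #2 are both arithmetic you can check before a workflow ships. Here is a minimal sketch of that check, assuming you copy the two limits from your own environment by hand. The names InstanceLimits and wait_risks and the 50% safety margin are mine for illustration, not anything n8n provides.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InstanceLimits:
    # Values you would copy from your own instance configuration.
    execution_timeout_s: Optional[int]  # EXECUTIONS_TIMEOUT; None or <= 0 means no timeout
    prune_max_age_h: int                # EXECUTIONS_DATA_MAX_AGE, in hours


def wait_risks(wait_s: int, limits: InstanceLimits, safety_factor: float = 0.5) -> List[str]:
    """Return warnings for a planned wait of `wait_s` seconds."""
    risks = []

    timeout = limits.execution_timeout_s
    if timeout is not None and timeout > 0 and wait_s >= timeout:
        risks.append("wait exceeds execution timeout")

    # Keep the wait well inside the prune window, not just barely under it.
    prune_window_s = limits.prune_max_age_h * 3600
    if wait_s > prune_window_s * safety_factor:
        risks.append("wait is too close to or beyond the prune window")

    return risks


# A 2-hour wait under a 1-hour timeout, with the default 336-hour max age.
print(wait_risks(2 * 3600, InstanceLimits(execution_timeout_s=3600, prune_max_age_h=336)))
# A 21-day wait with no timeout and the same 14-day prune window.
print(wait_risks(21 * 86400, InstanceLimits(execution_timeout_s=None, prune_max_age_h=336)))
```

Any non-empty result is a signal to move that wait out of the held execution and into an external due-time record.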
Silent failure #3: a restart or deploy happens mid-wait
Every deploy, container restart, crash, or OOM kill is a moment your instance is down. Waiting executions live in the database, so they survive a restart, but the resume only happens if the instance is back up and running the main process at the moment the timer is due. On self-hosted setups this bites in two ways. First, a crash-loop or long deploy window can straddle the resume time. Second, in queue mode, waiting executions are resumed by the main instance, not the workers, so if main is down at the due moment, nothing resumes them.
The fix: for short waits, make sure restarts are quick and main stays up. For anything important or long-running, use the re-trigger pattern: instead of one execution that waits days, write a “due at” record to a datastore, end the execution cleanly, and run a Schedule Trigger every few minutes that picks up anything now due and continues it. State lives in your database, not in a fragile in-flight process — restarts become harmless.
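The re-trigger pattern is easier to see as a data structure than as a paragraph. Below is a minimal sketch using SQLite as a stand-in for whatever datastore you actually use. The due_jobs table and the schedule, fetch_due, and mark_done helpers are hypothetical names of mine. In n8n terms, schedule is the last step of the first workflow, and fetch_due plus mark_done are what the Schedule Trigger workflow runs every few minutes.

```python
import json
import sqlite3
from datetime import datetime, timedelta, timezone


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS due_jobs (
               id INTEGER PRIMARY KEY,
               due_at REAL NOT NULL,   -- Unix epoch seconds (UTC)
               payload TEXT NOT NULL,  -- JSON needed to continue the work
               done_at REAL            -- NULL until the follow-up has run
           )"""
    )
    return conn


def schedule(conn: sqlite3.Connection, due_at: datetime, payload: dict) -> int:
    """Record work to do later, then let the execution end cleanly."""
    if due_at.tzinfo is None:
        raise ValueError("due_at must be timezone-aware")
    cur = conn.execute(
        "INSERT INTO due_jobs (due_at, payload) VALUES (?, ?)",
        (due_at.timestamp(), json.dumps(payload)),
    )
    conn.commit()
    return cur.lastrowid


def fetch_due(conn: sqlite3.Connection, now: datetime) -> list:
    """Return (id, payload) for every unfinished job whose due time has passed."""
    rows = conn.execute(
        "SELECT id, payload FROM due_jobs "
        "WHERE done_at IS NULL AND due_at <= ? ORDER BY due_at",
        (now.timestamp(),),
    ).fetchall()
    return [(job_id, json.loads(payload)) for job_id, payload in rows]


def mark_done(conn: sqlite3.Connection, job_id: int, now: datetime) -> None:
    """Call only after the follow-up work has actually succeeded."""
    conn.execute("UPDATE due_jobs SET done_at = ? WHERE id = ?", (now.timestamp(), job_id))
    conn.commit()


conn = open_store()
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
job_id = schedule(conn, start + timedelta(days=3), {"customer": "example", "step": "reminder"})
print(fetch_due(conn, start))                      # nothing due yet
print(fetch_due(conn, start + timedelta(days=4)))  # the reminder is now due
```

Because a job stays visible until mark_done is called, a poll that lands during a restart is simply picked up by the next one. The trade-off is that the follow-up step should tolerate running twice if a poll dies between doing the work and marking it done.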
Silent failure #4: the resume webhook that’s never called
The “On webhook call” Wait mode pauses until something hits the execution’s resume URL ($execution.resumeUrl). The trouble: that something is usually a third party — a payment provider, an approval email, an external job. If they never call back (their webhook fails, the email link is never clicked, the job dies), your execution waits forever — or until a timeout silently ends it. No callback, no error, no notice.
The fix: never ship a resume-webhook Wait without a timeout and a fallback branch. The Wait node lets you set a maximum wait time; on timeout, route to a deliberate path — escalate, alert, mark the record stuck, retry the upstream call. A resume that depends on someone else calling you back must always have an answer for “what if they don’t.”
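The fallback branch comes down to one decision made right after the Wait. Here is a minimal sketch of that decision, assuming the step after the Wait can tell a real callback from a timeout by whether callback data is present. How that looks in your workflow depends on what the caller sends, and the branch names and retry limit are mine for illustration.

```python
from typing import Optional


def route_after_wait(callback_payload: Optional[dict], attempts: int, max_attempts: int = 3) -> str:
    """Pick the branch to follow once the Wait node has released the execution."""
    if callback_payload is not None:
        return "continue"        # the third party called back in time
    if attempts < max_attempts:
        return "retry_upstream"  # timed out: ask again and wait again
    return "escalate"            # timed out too often: alert and mark the record stuck


print(route_after_wait({"status": "paid"}, attempts=1))
print(route_after_wait(None, attempts=1))
print(route_after_wait(None, attempts=3))
```

What matters is that the timeout path always ends in a named branch, so “they never called back” becomes a visible state instead of an execution that quietly ends.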
Silent failure #5: a “wait until” time that’s already in the past
Set the Wait to resume at a specific timestamp and n8n will resume immediately if that time has already passed. The classic cause is a timezone mismatch — you compute “9am tomorrow” in local time, your instance runs UTC, and the value lands in the past. The delay you designed silently collapses to zero, the “next day” reminder fires instantly, and nobody notices until the timing looks wrong in the data.
The fix: compute resume times in UTC, explicitly, and validate the target is in the future before the Wait — if it isn’t, clamp it or branch. Treat “the time I’m about to wait until” as untrusted input, because a timezone bug makes it exactly that.
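Here is a minimal sketch of that validation, assuming you compute the target outside the Wait node and pass the result in. The function names and the one-minute minimum delay are mine. The example uses a fixed UTC+2 offset as a stand-in; for a real named zone you would pass a zoneinfo.ZoneInfo instead.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Tuple


def next_local_hour_utc(now_utc: datetime, local_tz: tzinfo, hour: int) -> datetime:
    """Return `hour`:00 on the next local calendar day, expressed in UTC."""
    local_now = now_utc.astimezone(local_tz)
    tomorrow = (local_now + timedelta(days=1)).date()
    local_target = datetime(tomorrow.year, tomorrow.month, tomorrow.day, hour, tzinfo=local_tz)
    return local_target.astimezone(timezone.utc)


def ensure_future(
    target_utc: datetime,
    now_utc: datetime,
    min_delay: timedelta = timedelta(minutes=1),
) -> Tuple[datetime, bool]:
    """Return (resume_at, was_clamped). A target in the past is pushed to now + min_delay."""
    earliest = now_utc + min_delay
    if target_utc < earliest:
        return earliest, True
    return target_utc, False


now = datetime(2024, 3, 10, 22, 30, tzinfo=timezone.utc)
customer_tz = timezone(timedelta(hours=2))  # stand-in for the customer's zone

target = next_local_hour_utc(now, customer_tz, hour=9)
resume_at, clamped = ensure_future(target, now)
print(resume_at.isoformat(), clamped)
```

When was_clamped comes back true, treat it as a signal worth logging or branching on, since it usually means the upstream time calculation is wrong.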
The safety net: monitor for stuck waiting executions
Every fix above is prevention. You also need detection, because the whole problem with the Wait node is that its failures are invisible by default. The catch-all is a watchdog: on a schedule, query for executions that have been in the waiting state longer than they ever should be, and alert on them.
You can do this through the n8n API (or a direct read of the execution_entity table) — list executions with status waiting, filter to those whose resume time is well past due or that have been waiting beyond your longest legitimate wait, and push the count to Slack or email. One waiting execution that’s three days overdue is a silent failure caught before a customer finds it for you. This is exactly the kind of outcome reconciler I build into every production workflow — not “did it error” but “did the thing that was supposed to happen actually happen.”
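The filtering half of that watchdog can be sketched as a pure function. This is a minimal sketch, assuming the input is the list you get back from listing executions with status waiting, and that each item carries waitTill and startedAt as ISO 8601 strings. Treat those field names as assumptions to verify against your n8n version. The grace period and the seven-day ceiling are placeholders for your own numbers.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def parse_utc(ts: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 timestamp, treating a missing offset as UTC."""
    if not ts:
        return None
    # A trailing "Z" is rejected by datetime.fromisoformat before Python 3.11.
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def overdue_waits(
    executions: Iterable[dict],
    now: datetime,
    grace: timedelta = timedelta(minutes=15),
    max_wait: timedelta = timedelta(days=7),
) -> List[dict]:
    """Return waiting executions that are past due or older than any legitimate wait."""
    stuck = []
    for ex in executions:
        wait_till = parse_utc(ex.get("waitTill"))
        started = parse_utc(ex.get("startedAt"))
        past_due = wait_till is not None and wait_till + grace < now
        too_old = started is not None and started + max_wait < now
        if past_due or too_old:
            stuck.append(ex)
    return stuck


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
waiting = [
    {"id": "1", "waitTill": "2024-01-07T12:00:00.000Z", "startedAt": "2024-01-06T12:00:00.000Z"},
    {"id": "2", "waitTill": "2024-01-11T12:00:00.000Z", "startedAt": "2024-01-09T12:00:00.000Z"},
    {"id": "3", "waitTill": None, "startedAt": "2023-12-01T00:00:00.000Z"},
]
print([ex["id"] for ex in overdue_waits(waiting, now)])
```

Execution 3 has no resume time at all, which is how a webhook wait that nobody ever called back would look here. The age check is what catches it.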
The one-line version
The Wait node doesn’t fail with a red error; it fails by quietly never resuming. Treat every wait as a database row that something has to come back for, and ask the five questions: Will the timeout kill it? Will pruning delete it? Will a restart strand it? Will the resume ever actually fire, and what happens if it doesn’t? Is the resume time actually in the future? Answer those, add a watchdog for stuck waits, and the quietest node in n8n stops being the riskiest.
This is one of eighteen silent-failure modes in the Production n8n Checklist — the list I run before any workflow ships. If you’ve got a workflow leaning on a Wait node and you’re not sure it’ll hold, a pre-flight review is the fastest way to find out.