Slack Dispatch Runbook
This runbook covers the ThinkWork Slack workspace app from signed ingress through outbound response delivery.
Metrics
The ThinkWork Slack app emits CloudWatch Embedded Metric Format records in the `ThinkWork/Slack` namespace.
| Metric | Dimensions | Meaning |
|---|---|---|
| `slack.events.ingest_ms` | `handler` | Time spent handling a signed Slack request. Watch for p95 approaching Slack’s 3-second ack limit. |
| `slack.events.dedupe_hits` | `surface` | Duplicate Slack delivery was accepted but not re-enqueued. Some hits are normal during Slack retries. |
| `slack.events.unknown_team` | `handler` | Slack sent a request for a workspace that is not actively installed. |
| `slack.dispatch.success` | `surface` | A completed Computer response was delivered to Slack. |
| `slack.dispatch.failure` | `error_class` | The dispatcher gave up and marked the task failed. |
| `slack.attribution.degraded` | none | Slack rejected customized username/avatar attribution and ThinkWork retried with bot identity. |
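
For orientation when reading raw log lines, the sketch below builds one Embedded Metric Format record for the `ThinkWork/Slack` namespace. The envelope fields (`_aws`, `CloudWatchMetrics`, `Dimensions`, `Metrics`) follow the public EMF specification. The helper name `buildEmfRecord` and its parameters are hypothetical and are not taken from `metrics.ts`.

```typescript
// Minimal shape of a CloudWatch Embedded Metric Format (EMF) log record.
interface EmfRecord {
  _aws: {
    Timestamp: number; // epoch milliseconds
    CloudWatchMetrics: Array<{
      Namespace: string;
      Dimensions: string[][];
      Metrics: Array<{ Name: string; Unit: string }>;
    }>;
  };
  [key: string]: unknown; // dimension values and metric values live at the top level
}

// Hypothetical helper for illustration; the real metrics helper may differ.
function buildEmfRecord(
  name: string,
  value: number,
  unit: "Milliseconds" | "Count",
  dimensions: Record<string, string>,
  timestampMs: number,
): EmfRecord {
  return {
    _aws: {
      Timestamp: timestampMs,
      CloudWatchMetrics: [
        {
          Namespace: "ThinkWork/Slack",
          // One dimension set containing every dimension key that was passed in.
          Dimensions: [Object.keys(dimensions)],
          Metrics: [{ Name: name, Unit: unit }],
        },
      ],
    },
    ...dimensions,
    [name]: value,
  };
}

// Example: one ingest latency sample for the events handler.
const sampleRecord = buildEmfRecord(
  "slack.events.ingest_ms",
  412,
  "Milliseconds",
  { handler: "events" },
  1700000000000,
);
const sampleLogLine = JSON.stringify(sampleRecord);
```

Each record is written as a single JSON log line, so a CloudWatch Logs search for the metric name also surfaces the dimension values that were emitted with it.
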
Common procedures
Attribution degradation alarm
Signal: `slack.attribution.degraded` is non-zero, or Slack messages render as the ThinkWork bot with a body prefix.
- Confirm the dispatcher still posted the response. `slack.dispatch.success` should increase for the same window.
- Check Slack app installed scopes in the affected workspace. Missing or rejected `chat:write.customize` is the expected cause.
- If the customer intentionally removed the scope, no incident is required. The fallback is by design: bot identity, shared Computer body prefix when needed, and attribution footer.
- If the scope should be present, ask the workspace admin to reinstall or reauthorize the Slack app with optional identity customization enabled.
- Watch the metric after reinstall. It should return to zero for new messages.
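
The fallback described above can be sketched as a pure decision function: given a rejected customized post and Slack's error code, it returns the bot-identity message to retry with. This is an illustration under stated assumptions, not the dispatcher's code. `planBotIdentityRetry` and the `[name]` prefix format are hypothetical, and treating `missing_scope` as the triggering error is an assumption. `username` and `icon_url` are the `chat.postMessage` arguments that depend on `chat:write.customize`.

```typescript
interface SlackMessage {
  channel: string;
  text: string;
  username?: string; // customized display name, needs chat:write.customize
  icon_url?: string; // customized avatar, needs chat:write.customize
}

// Assumption: Slack reports the missing or rejected scope with this error code.
const ATTRIBUTION_ERRORS = new Set<string>(["missing_scope"]);

// Hypothetical helper: returns the message to retry with, or null when the
// error is unrelated to attribution and the normal failure path applies.
function planBotIdentityRetry(
  rejected: SlackMessage,
  error: string,
  computerName: string,
): SlackMessage | null {
  if (!ATTRIBUTION_ERRORS.has(error)) {
    return null;
  }
  return {
    channel: rejected.channel,
    // Drop username/icon_url and carry attribution in the body instead.
    // The prefix format here is illustrative only.
    text: `[${computerName}] ${rejected.text}`,
  };
}
```

A non-null result corresponds to the case where `slack.attribution.degraded` is counted and the retry is posted as the bot.
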
Bot token revoked or workspace uninstalled
Signal: `slack.dispatch.failure{error_class="bot_token"}` or Slack Web API errors such as `not_authed`, `invalid_auth`, `token_revoked`, or `account_inactive`.
- Find the affected task in `computer_events` by `slack.dispatch_failed` and inspect `payload.error`.
- Resolve the Slack team id from the task envelope (`computer_tasks.input.slack.slackTeamId`).
- In admin, check the Slack workspace status. If the workspace was uninstalled in Slack, mark or treat it as revoked.
- Ask a tenant admin to reinstall the app from ThinkWork admin.
- Ask affected users to re-link only if their `slack_user_links` row is missing or stale. A workspace reinstall alone does not always require per-user relinking.
- Do not manually mutate production secrets. Token recovery flows through the normal OAuth install path.
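
When triaging `payload.error` by hand, it can help to see how the Slack error codes listed in the signal map onto the `bot_token` error class. The sketch below is a guess at that mapping for illustration. `classifySlackError` and the `other` class are hypothetical names, and the dispatcher may define more classes than shown.

```typescript
// Slack Web API error codes named in this runbook as token or install problems.
const BOT_TOKEN_ERRORS = new Set<string>([
  "not_authed",
  "invalid_auth",
  "token_revoked",
  "account_inactive",
]);

// "other" is a placeholder; the real error_class values are not listed here.
type ErrorClass = "bot_token" | "other";

// Hypothetical classifier for a Slack Web API error string.
function classifySlackError(error: string): ErrorClass {
  return BOT_TOKEN_ERRORS.has(error) ? "bot_token" : "other";
}

// Example: a revoked token points at the reinstall procedure above.
const revokedClass = classifySlackError("token_revoked");
```
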
Unknown Slack team
Signal: `slack.events.unknown_team` increases.
- Confirm whether the Slack team id belongs to a previously installed workspace.
- If the workspace is no longer active, this is likely a stale Slack retry or an uninstall race. No action is needed unless it persists.
- If the workspace should be active, verify the `slack_workspaces` row status is `active` and the Slack app install completed successfully.
- If the row is missing, ask the tenant admin to install the app again from admin.
Ingest latency near 3 seconds
Signal: `slack.events.ingest_ms` p95 approaches 3000 ms, or Slack retries increase.
- Check which handler dimension is slow: `events`, `slash-command`, or `interactivity`.
- For `interactivity`, prioritize modal-open latency. Message shortcut `trigger_id` values expire quickly.
- Review recent cold starts, Lambda duration, and Secrets Manager latency.
- Confirm the handlers are not waiting on Computer completion. They should only verify, resolve, enqueue, and ack.
- If retries already occurred, dedupe should prevent duplicate Computer work. Verify that `slack.events.dedupe_hits` increased and that no duplicate `computer_tasks` rows were created.
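
The dedupe behaviour in the last step can be sketched as follows: a retried delivery is still acknowledged, but only the first delivery enqueues work. This is a minimal in-memory illustration. The `Set` stands in for whatever durable store the handlers actually use, and `ingestEvent` is a hypothetical name. Slack's Events API does include an `event_id` on each delivery, which is assumed here to be the dedupe key.

```typescript
type IngestOutcome = "enqueued" | "duplicate";

// Hypothetical ingest step. Both outcomes are acked to Slack; only
// "enqueued" creates Computer work.
function ingestEvent(
  eventId: string,
  seen: Set<string>,
  enqueue: (eventId: string) => void,
): IngestOutcome {
  if (seen.has(eventId)) {
    // Corresponds to a slack.events.dedupe_hits increment.
    return "duplicate";
  }
  seen.add(eventId);
  enqueue(eventId);
  return "enqueued";
}

// Example: Slack retries the same event after a slow ack.
const seenEvents = new Set<string>();
const queued: string[] = [];
const firstDelivery = ingestEvent("Ev0001", seenEvents, (id) => queued.push(id));
const retriedDelivery = ingestEvent("Ev0001", seenEvents, (id) => queued.push(id));
// queued now holds a single entry for Ev0001.
```
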
Slash command response missing
Signal: `/thinkwork` was acked, but no ephemeral response appeared.
- Check `computer_tasks` for a `source=slack` task with `triggerSurface=slash_command`.
- If the task completed, check `slack.dispatch_failed` for `response_url` errors.
- If the Computer turn exceeded Slack’s `response_url` usability window or follow-up limits, use the source task/thread to post an operator note and investigate runtime latency.
- If dispatch succeeded, ask the user to check ephemeral visibility in the original channel; ephemeral responses are visible only to the invoking user.
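
To judge whether a late Computer turn could still have used the `response_url`, a check along these lines may help. The limits encoded here (30 minutes and 5 responses per `response_url`) are Slack's published values as best understood at the time of writing and should be confirmed against current Slack documentation. The function name and parameters are hypothetical.

```typescript
// Assumed Slack limits for a slash command response_url; verify before relying on them.
const RESPONSE_URL_WINDOW_MS = 30 * 60 * 1000;
const RESPONSE_URL_MAX_USES = 5;

// Hypothetical check: can one more message still be sent to this response_url?
function responseUrlUsable(
  receivedAtMs: number, // when the slash command payload arrived
  nowMs: number, // when the dispatcher wants to respond
  usesSoFar: number, // responses already sent to this response_url
): boolean {
  const withinWindow = nowMs - receivedAtMs < RESPONSE_URL_WINDOW_MS;
  const underLimit = usesSoFar < RESPONSE_URL_MAX_USES;
  return withinWindow && underLimit;
}

// Example: a turn that finished 31 minutes after the command was received.
const commandReceivedAt = 1700000000000;
const lateTurnUsable = responseUrlUsable(commandReceivedAt, commandReceivedAt + 31 * 60 * 1000, 1);
```

Comparing the task's creation and completion timestamps against this window indicates whether the missing response is a delivery bug or a runtime latency problem.
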
Message shortcut modal did not update
Signal: The working modal opened but stayed stale, or the answer posted without modal confirmation.
- Check the task envelope for `modalViewId`.
- Check `slack.dispatch_failed` for `views.update` failures.
- If the response was still posted in-thread, treat the modal update as a degraded delivery rather than lost work.
- If both modal update and thread post failed, follow the dispatch failure path.
Useful queries
Find recent Slack dispatch failures:

```sql
select created_at, tenant_id, computer_id, task_id, payload
from computer_events
where event_type = 'slack.dispatch_failed'
order by created_at desc
limit 25;
```

Find attribution degradation by workspace/team:

```sql
select e.created_at, t.input->'slack'->>'slackTeamId' as slack_team_id, e.payload
from computer_events e
join computer_tasks t on t.id = e.task_id
where e.event_type = 'slack.attribution_degraded'
order by e.created_at desc
limit 25;
```

Find pending completed Slack tasks that have not been dispatched:

```sql
select t.id, t.tenant_id, t.computer_id, t.status, t.updated_at, t.input->'slack' as slack
from computer_tasks t
where t.input->>'source' = 'slack'
  and t.status = 'completed'
  and not exists (
    select 1
    from computer_events e
    where e.task_id = t.id
      and e.event_type in ('slack.dispatch_completed', 'slack.dispatch_failed')
  )
order by t.updated_at asc
limit 25;
```

Recovery boundaries
- Do not manually invoke production Slack callbacks with forged payloads.
- Do not edit Slack bot tokens directly in Secrets Manager as a recovery path; reinstall through OAuth.
- Do not manually post final user answers from operator accounts unless the customer explicitly asks for a one-off status note.
- Do not delete `computer_events` or `computer_tasks` rows to “retry” a dispatch. Ship a code or operational fix, then let the scheduler drain eligible tasks.
Related code
- Slack ingress handlers: `packages/api/src/handlers/slack/`
- Slack metrics helper: `packages/api/src/lib/slack/metrics.ts`
- Dispatch Lambda: `packages/lambda/slack-dispatch.ts`
- Slack task envelope: `packages/api/src/lib/slack/envelope.ts`
- Slack data disclosure: Slack data handling