Evaluation Seed Cleanup
The ThinkWork RedTeam starter pack replaces the older maniflow evaluation seeds. Fresh tenants receive the new pack automatically. Existing deployments should run the cleanup SQL once so old seeded rows no longer appear in the Studio category list or future evaluation runs.
Run the cleanup
After pulling a release that includes the new seed pack, run the migration against each deployed tenant database:
    psql "$DATABASE_URL" \
      -v ON_ERROR_STOP=1 \
      -f packages/database-pg/drizzle/0089_remove_maniflow_eval_seeds.sql

Then open Evaluations Studio or run the seed command to import the new pack:
    thinkwork eval seed --stage <stage>

If the deployment previously imported now-retired non-RedTeam slices or ambiguous safety-scope cases, run the RedTeam-only cleanup as well:
    psql "$DATABASE_URL" \
      -v ON_ERROR_STOP=1 \
      -f packages/database-pg/drizzle/0096_true_redteam_eval_seed_cleanup.sql

Verify
Legacy seed rows should be gone:
    -- The `sub-agents` category is a legacy seed label retained for cleanup.
    SELECT category, count(*)
    FROM eval_test_cases
    WHERE source = 'yaml-seed'
      AND category IN (
        'email-calendar',
        'knowledge-base',
        'mcp-gateway',
        'red-team',
        'sub-agents',
        'brain-onepager-citations',
        'brain-triage-routing',
        'brain-trust-gradient-promotion',
        'brain-write-back-capture',
        'thread-management',
        'tool-safety',
        'workspace-memory',
        'workspace-routing'
      )
    GROUP BY category;

The query should return zero rows. New seeded categories should be limited to red-team dimensions:
    SELECT DISTINCT category
    FROM eval_test_cases
    WHERE source = 'yaml-seed'
    ORDER BY category;
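
When several tenant databases need checking, the legacy-row query can be wrapped in a small script that fails if anything is left behind. The following is a minimal sketch, not part of the ThinkWork tooling: the function names `count_legacy_rows` and `assert_clean` are hypothetical, and it assumes `psql` is on the PATH and that the connection string passed in points at a single tenant database. It uses the standard `psql` options `-A` (unaligned output) and `-t` (tuples only) so that only the count is printed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the thinkwork CLI.

# Print the number of remaining yaml-seed rows in the legacy categories.
# $1 is the connection string for one tenant database.
count_legacy_rows() {
  psql "$1" -v ON_ERROR_STOP=1 -At -c "
    SELECT count(*)
    FROM eval_test_cases
    WHERE source = 'yaml-seed'
      AND category IN (
        'email-calendar', 'knowledge-base', 'mcp-gateway', 'red-team',
        'sub-agents', 'brain-onepager-citations', 'brain-triage-routing',
        'brain-trust-gradient-promotion', 'brain-write-back-capture',
        'thread-management', 'tool-safety', 'workspace-memory',
        'workspace-routing'
      );"
}

# Turn a row count into a pass/fail result.
# Anything other than 0 (including empty output from a failed query)
# is reported as a failure.
assert_clean() {
  local count="$1"
  if [ "$count" -eq 0 ] 2>/dev/null; then
    echo "ok: no legacy seed rows"
  else
    echo "fail: ${count:-unknown} legacy seed rows remain" >&2
    return 1
  fi
}
```

With both functions loaded, a single tenant could be checked with `assert_clean "$(count_legacy_rows "$DATABASE_URL")"`. Keeping the pass/fail decision separate from the query makes it straightforward to call the check in a loop over tenants or from a deployment pipeline.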