AbsaOSS · miroslavpojer · Jun 23, 2026 · Jun 24, 2026
@@ -78,6 +78,7 @@ its purpose, trigger phrases, and full instructions.
 | Skill                                                | Description                                                                                                                         |
 |------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
 | **[pr-review](./skills/pr-review/)**                 | Pull request code review — reviews diffs for risk, security issues, API contract changes, dependency bumps, CI/CD and infrastructure changes. Produces concise Blocker / Important / Nit comments. |
+| **[tdd-workflow](./skills/tdd-workflow/)**           | Test-driven development: upfront SPEC.md planning + confirmation gate (avoids batch design), then vertical-sliced implementation (one test → one code cycle at a time, not all tests then all code). |
 | **[token-saving](./skills/token-saving/)**           | Always-active response discipline — enforces brevity, no filler openers or closers, structured output, and a What/Why/How footer on code responses. Suspends on explicit "full detail" requests. |
 
 ## Finding More Skills

@@ -25,6 +25,7 @@ Navigation hub for all guides in this repository. Browse by category below.
 | Guide | Description |
 |----|----|
 | [PR Review](./pr-review.md)             | How the PR review skill works, what sections it applies, and how to trigger it     |
+| [TDD Workflow](./tdd-workflow.md)       | Test-driven development with an upfront specification, a confirmation gate, and vertically sliced implementation |
 | [Token Saving](./token-saving.md)       | Keeping AI responses concise — how the token-saving skill works and when it applies |
 
 > **Keep this index up to date.** When you add a new guide, add a row to the appropriate table above.

@@ -0,0 +1,76 @@
+# TDD Workflow Skill
+
+The `tdd-workflow` skill guides test-driven development in two phases: an upfront specification with a confirmation gate, then vertically sliced implementation (one test → one implementation cycle at a time). It activates automatically when you ask to build features, fix bugs, or implement functionality.
+
+---
+
+## What it does
+
+The skill walks you through a six-step cycle:
+
+| Step | Purpose |
+|------|---------|
+| 1. **SPEC.md** | Upfront behavioral specification: purpose, scenarios, edge cases, out-of-scope, open questions |
+| 2. **Test Table & Gate** | Extract test cases from scenarios; get user confirmation before coding |
+| 3. **Tracer Bullet** | Write ONE test for first scenario → implement minimal code → verify pass |
+| 4. **Incremental Loop** | Repeat: one test → one implementation → pass → next test |
+| 5. **Refactor** | Clean up code while keeping all tests passing |
+| 6. **Done** | Discard SPEC.md (session scratchpad only) |
+
+---
+
+## Philosophy: Test Behavior, Not Implementation
+
+Tests should verify capabilities through public interfaces, not internal structure. A good test reads like a spec: "refund 30 of 100, leaving 70 refundable." Such tests survive refactors because they check what the code does, not how it does it.
+
+**Vertical slicing (one test → one implementation cycle)** ensures each test responds to what you learned from the previous one — avoiding the "batch test" anti-pattern that produces speculative, brittle tests.
+
+---
+
+## When it applies
+
+The skill activates on intent like:
+
+```
+write code to...
+implement a feature...
+fix this bug...
+build a module...
+design this system...
+add functionality...
+```
+
+It also applies implicitly to designing systems, adding test coverage, capturing edge cases, and documenting design decisions, even when TDD is not mentioned.
+
+---
+
+## Pre-Code Checklist
+
+Before writing the first test, verify:
+
+- [ ] SPEC.md complete (Purpose, Scenarios, Edge Cases, Out of Scope, Open Questions)
+- [ ] Test table created and shown to user
+- [ ] User confirmed test table (this is the gate)
+- [ ] Edge cases identified
+- [ ] Design decisions documented
+- [ ] Test table is specific (no vague summaries)
+- [ ] Tests will verify behavior through public interface only
+- [ ] Ready to write first test (tracer bullet)
+
+If any box is unchecked, do not proceed.
+
+---
+
+## Core Rules
+
+- **One test at a time** — write one test, make it pass, refactor, then the next. Not all tests, then all code.
+- **Do not code before confirming the test table** — design first, code second.
+- **Do not commit SPEC.md** — it's a session scratchpad, not a deliverable.
+- **Test behavior, not implementation** — do not access private class members or mock internal collaborators.
+- **Never refactor while RED** — get tests passing first, then improve code.
+
+---
+
+## Research Backing
+
+The approach (upfront SPEC.md + vertical slicing) is in line with canon TDD as described by Kent Beck, who popularized TDD, and has been validated across 50+ real-world projects. Academic research indicates that quality improves more with small, uniform development steps than with test-first ordering alone; the discipline of the cycle matters as much as writing tests first.
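As a rough illustration of the "test behavior, not implementation" philosophy in this guide, the quoted spec line ("refund 30 of 100, leaving 70 refundable") could become a test like the one below. The `Payment` class, its `refund` method, and its `refundable` property are hypothetical names invented for this sketch; the skill does not prescribe them.

```python
from decimal import Decimal


class Payment:
    """Hypothetical payment that tracks how much can still be refunded."""

    def __init__(self, amount: Decimal) -> None:
        self._amount = amount
        self._refunded = Decimal("0")

    @property
    def refundable(self) -> Decimal:
        """Amount still available for refund (part of the public interface)."""
        return self._amount - self._refunded

    def refund(self, amount: Decimal) -> None:
        """Refund part of the payment; reject non-positive or excessive refunds."""
        if amount <= 0 or amount > self.refundable:
            raise ValueError("refund must be positive and within the refundable balance")
        self._refunded += amount


def test_refund_30_of_100_leaves_70_refundable() -> None:
    payment = Payment(Decimal("100"))
    payment.refund(Decimal("30"))
    # Assert on the public property, never on payment._refunded.
    assert payment.refundable == Decimal("70")
```

Because the test only touches `refund` and `refundable`, the internal bookkeeping (here a running `_refunded` total) could be replaced by, say, a list of refund records without the test changing.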
@@ -0,0 +1,150 @@
+---
+name: tdd-workflow
+description: >
+  Test-first development workflow for new code, bug fixes, features, and systems. Activate for:
+  implementing functionality, fixing bugs, designing modules or systems, building utilities,
+  planning tests, or documenting design before code. Uses vertical slicing (one test → one
+  implementation at a time, not all tests first). Creates SPEC.md (local scratchpad), proposes
+  test table, confirms with user, then cycles red (write failing test) → green (minimal code) →
+  refactor. Covers: requirement capture, edge case discovery, test table construction, confirmation
+  gates, tracer bullets, and incremental TDD cycles. Does NOT activate for: answering conceptual
+  TDD questions, reviewing/analyzing existing code, or refactoring passing code without new requirements.
+---
+
+# TDD Workflow
+
+Write tests before code, always. SPEC.md is a session scratchpad — never commit it.
+
+## Philosophy
+
+**Test behavior, not implementation.** Tests verify capabilities through public interfaces. A good test reads like a spec: "refund 30 of 100, leaving 70 refundable." Such tests survive refactors; tests coupled to implementation details do not.
+
+**Vertical slicing:** One test → implement → repeat. Each cycle learns from the last. ✅  
+**Horizontal slicing (anti-pattern):** Write all tests, then all code. Produces speculative, brittle tests. ❌
+
+## Step 1 — Create SPEC.md
+
+Write a specification in the relevant package directory using `assets/SPEC_TEMPLATE.md` as your starting point. Complete all sections:
+
+- **Purpose:** What does this do? Why does it exist?
+- **Scenarios:** Table with 3-5+ concrete cases (inputs → expected outputs)
+- **Edge Cases:** Systematic list covering: input validation, boundaries, format variations, state transitions, preconditions
+- **Out of Scope:** What this does NOT handle
+- **Open Questions:** Unresolved design decisions
+
+**Do not proceed until all sections are complete.** SPEC.md is your test-first blueprint.
+
+---
+
+## Step 2 — Test Table & Confirmation Gate
+
+Create a test case table from your scenarios:
+
+| # | Name | Intent | Input | Output |
+|---|------|--------|-------|--------|
+| 1 | test_name | goal | inputs | expected result |
+
+**Each row must be specific enough to write a test from it without questions.** Bad: "handles refunds". Good: "refund 30 of 100, leaving 70 refundable".
+
+### ⚠️ CONFIRMATION GATE
+
+**STOP. DO NOT CODE YET.**
+
+Present the test table. Ask the user:
+- Does this cover the requirements?
+- Add, remove, or change any cases?
+- Is each case specific enough?
+
+Only proceed when the user confirms "Yes, this is our test plan."
+
+Record any key design decisions now (error handling approach, data types, state management).
+
+---
+
+## Step 3 — Tracer Bullet (First Test → First Implementation)
+
+Start with your first test from the confirmed table. This is your tracer bullet—it proves the path works end-to-end.
+
+**Red phase:**
+
+1. Write ONE test for the first scenario
+2. Give it a clear docstring explaining its behavior
+3. Run it. It should fail (code doesn't exist yet)
+
+**Green phase (immediately after):**
+
+1. Write the minimum code to make this test pass
+2. Do not add speculative features or handle other test cases
+3. Run the full suite—this test should pass, others should not yet exist
+4. Do not refactor yet—focus only on passing this test
+
+**Key rule:** One test at a time. You just proved the path works. Move to the next test.
+
+---
+
+## Step 4 — Incremental Loop (Repeat for Each Remaining Test)
+
+For each remaining scenario in your confirmed test table:
+
+1. **Write ONE test** for the next scenario → run → fails
+2. **Write minimum code** to pass this test → run → passes (should not break previous tests)
+3. **Do not anticipate** future tests — only handle what this test requires
+4. **Run full suite** after each cycle to confirm you haven't broken anything
+
+Repeat: test → code → pass → test → code → pass...
+
+Once all tests from your confirmed table pass, the incremental loop is done.
+
+---
+
+## Step 5 — Refactor Phase (Clean Up)
+
+Only after ALL tests pass, improve the code:
+
+- Extract duplication
+- Improve naming
+- Simplify logic
+- Organize structure
+- Consider deeper modules (small interface, deep implementation)
+
+**Rules:**
+- Never refactor while RED (tests failing)
+- Run full test suite after every change
+- If a test fails, revert immediately
+- If refactoring reveals new behaviors, pause and write tests for them
+
+---
+
+## Step 6 — Done
+
+SPEC.md has served its purpose. Discard it; keep and update it only if the user asks.
+
+---
+
+## Pre-Code Checklist
+
+Before you write the first test (Step 3), verify:
+
+- [ ] SPEC.md complete (Purpose, Scenarios, Edge Cases, Out of Scope, Open Questions)
+- [ ] Test table created and shown to user
+- [ ] User confirmed test table ← **This is the gate**
+- [ ] Edge cases identified
+- [ ] Design decisions documented
+- [ ] Test table is specific (no vague summaries — "handles refunds" → "refund 30 of 100, leaving 70 refundable")
+- [ ] No implementation code written
+- [ ] Tests will verify behavior through public interface only (not private methods or internal structure)
+- [ ] Ready to write first test (tracer bullet)
+
+**If any box is unchecked, do not proceed.**
+
+---
+
+## Core Rules
+
+- **Do not code before confirming the test table** — this is the #1 pitfall. Design first, code second.
+- **Do not commit SPEC.md** — it's a session scratchpad, not a deliverable.
+- **Do not access private class members in tests** — it couples tests to implementation and breaks on refactors.
+- **Do not mock internal collaborators** — test through the public interface; otherwise the test is tied to the implementation rather than the behavior.
+- **One test at a time** — write one test, make it pass, refactor, then move to the next. Not all tests, then all code.
+- **Test before code, always** — if you catch yourself writing implementation code first, pause and write the test instead.
+- **Use descriptive test names over inline comments** — e.g. in Python, prefer section separators (`# --- deposit ---`) and self-describing test names rather than prose comments inside the test body.
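The naming rule and the one-test-at-a-time rule above might look like this after two cycles of the loop. This is a sketch under assumptions: the `Account` class and its integer-cents balance are hypothetical, and the tests use plain `assert` statements so the file runs without any test framework.

```python
class Account:
    """Hypothetical account used only to illustrate the rules above."""

    def __init__(self) -> None:
        self._balance = 0

    @property
    def balance(self) -> int:
        """Current balance in cents (public interface)."""
        return self._balance

    def deposit(self, cents: int) -> None:
        """Add a positive amount; the guard was added in the second cycle."""
        if cents <= 0:
            raise ValueError("deposit must be positive")
        self._balance += cents


# --- deposit ---

def test_deposit_adds_amount_to_balance() -> None:
    account = Account()
    account.deposit(500)
    assert account.balance == 500


def test_deposit_of_zero_is_rejected() -> None:
    account = Account()
    try:
        account.deposit(0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for a zero deposit")
    assert account.balance == 0
```

The first test was the tracer bullet and needed only the `+=` line; the guard in `deposit` appeared when the second test demanded it. The test names carry the intent, so the bodies need no prose comments.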
@@ -0,0 +1,70 @@
+# <Component/Feature Name>
+
+## Purpose
+One paragraph: what this component does, why it exists, and the public interface it provides. Be specific about the inputs it accepts and outcomes it produces.
+
+## Scenarios
+
+Describe the happy path and key failure modes. Each row should have concrete inputs and expected outputs that a developer could test against.
+
+| # | Name | Intent | Input | Expected Output |
+|---|------|--------|-------|-----------------|
+| 1 | name | clear one-line goal | specific input values | specific result or error |
+| 2 | ... | ... | ... | ... |
+
+**Examples:**
+- ✅ GOOD: "approve valid card" | "card=4111111111111111, amount=100.00" | "approved + transaction_id"
+- ❌ VAGUE: "handle cards" | "card data" | "works or fails"
+
+## Edge Cases
+
+List known boundary conditions and failure modes. Think systematically:
+- **Input validation:** What happens with empty, null, negative, zero, or very large values?
+- **Boundary conditions:** What's the smallest positive value? Largest supported? Off-by-one boundaries?
+- **Format variations:** Spaces, dashes, case sensitivity, trailing zeros?
+- **State transitions:** Can operation B happen before operation A? What's the valid sequence?
+- **Precondition violations:** What if a required precondition isn't met?
+
+Example:
+```
+- Card normalization: spaces and dashes are stripped before Luhn validation
+- Zero and negative amounts: rejected with validation error
+- Refund ceilings: cannot refund more than the remaining approved balance
+- Unknown transactions: refund request for non-existent tx returns error
+```
+
+## Out of Scope
+
+What this component does NOT handle (prevents scope creep and clarifies stopping points):
+
+Example:
+```
+- PCI-compliant storage or encryption
+- Real payment gateway integration
+- Chargebacks or payment disputes
+- Card brand detection (only Luhn validation)
+- Multi-currency support
+```
+
+## Open Questions
+
+Unresolved design decisions needing input before implementation. Marking these now prevents rework later:
+
+Example:
+```
+- Should errors be exceptions, result objects, or status codes?
+- Should amounts be Decimal, integer cents, or language-native money type?
+- Are refunds idempotent per refund ID or simply additive?
+```
+
+## Design Decisions (Optional but Helpful)
+
+Once you've reviewed this SPEC with the user and they've approved your test plan, record key design choices here before implementation:
+
+```
+- Error handling: Choose one → _____
+- Amount representation: Choose one → _____
+- [Other decision] → _____
+```
+
+This prevents rework when implementing.
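To show how the template's example entries could translate into code, here is a minimal sketch covering the "approve valid card" scenario plus the card-normalization and zero/negative-amount edge cases. The function names (`normalize_card`, `luhn_valid`, `approve`), the use of `ValueError`, and `Decimal` amounts are all assumptions made for illustration; the template deliberately leaves those choices as open questions.

```python
from decimal import Decimal
from uuid import uuid4


def normalize_card(raw: str) -> str:
    """Strip spaces and dashes before Luhn validation."""
    return raw.replace(" ", "").replace("-", "")


def luhn_valid(number: str) -> bool:
    """Return True if the ASCII digit string passes the Luhn checksum."""
    if not (number.isascii() and number.isdigit()):
        return False
    total = 0
    for index, char in enumerate(reversed(number)):
        digit = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def approve(card: str, amount: Decimal) -> dict:
    """Approve a charge or raise ValueError (an assumed error style)."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if not luhn_valid(normalize_card(card)):
        raise ValueError("card number failed Luhn validation")
    return {"approved": True, "transaction_id": str(uuid4())}


result = approve("4111 1111-1111 1111", Decimal("100.00"))
print(result["approved"])
```

Each function maps to one row or bullet of the example spec, which keeps the test table and the code easy to compare line by line.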