TL;DR

Teams run experiments, write playbooks, and run training sessions, then forget 70% of it within days. The fix isn’t more slides or longer onboarding; it’s designing memory-first workflows: attach knowledge to context, turn lessons into reusable stories, validate them with small experiments, and automate recall where possible. This edition gives a practical, step-by-step playbook (templates included) that you can apply in the next 30–90 days.

A quick note

Amazon just announced 30,000 layoffs. UPS cut 48,000. Both cited AI as a key driver, and it’s not stopping there.

If you were recently affected and want to pivot into an AI-related role, I want to help. I’m putting together a small effort to connect people with resources, mentors, and companies hiring for AI-skilled roles.

I’ve been in the AI space for 8+ years, worked with ML systems at Meta, founded an AI education non-profit that reached 70,000 people, and now run an AI testing platform where I see firsthand how companies are implementing AI and reshaping their approach to business.

If that sounds useful, you can fill out the form below. I’ll share what I learn as I help people navigate this shift.

Most teams think they’re learning. They hold retros, publish “lessons learned,” and applaud experimentation. But the follow-through is the problem: knowledge decays fast, decisions get re-litigated, and the same problems reappear.

It’s not a people problem. It’s a systems problem.

Below is an engineer-first, founder-friendly manual to turn momentary learning into a durable advantage.

The problem, in three acts

1) Knowledge has a half-life.
Research on the forgetting curve shows that most new information decays rapidly without reinforcement. At scale, that means a product team can forget why a past experiment succeeded, re-run the same tests, and re-learn the same lesson. That wastes time and drains morale.

2) Documentation is fragmented and passive.
Docs live in five places: PR descriptions, Slack threads, Confluence pages, someone’s head, and — if you’re lucky — a dated Google Doc. No single source of truth. Even when it exists, it’s stored as instructions, not stories.

3) Learning rarely connects to daily work.
Training is siloed (courses, workshops). Work is context-rich. If you don’t immediately apply a lesson in the flow of work, it won’t stick.

The memory-first stack (what reliable learning systems actually look like)

Treat organizational memory like a product. Build a stack with clear responsibilities:

  1. Event layer (capture): hooks that capture decisions, outcomes, and signals automatically (PR merges, deployment results, user feedback, experiment results).

  2. Storybank (store): a searchable, linked knowledge graph of short narratives — what happened, why, who, and outcome. Not long docs, but focused stories.

  3. Context layer (attach): connect stories to the places people work: tickets, PRs, dashboards, meeting agendas.

  4. Recall layer (surface): nudges and discoverability, such as weekly digests, decision suggestions in PRs, and contextual prompts in Slack or your IDE.

  5. Apply layer (practice): small tasks, micro-experiments, or “do-it-now” challenges that force immediate application.

  6. Eval layer (measure): metrics and replay tests that validate whether knowledge changed behavior and outcomes.

You don’t need all six on day one. Start with capture + storybank + context; add recall and apply next.
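
To make those first three layers concrete, here’s a minimal sketch in Python. The names (Story, Storybank, attach_to_context) are illustrative, not a prescribed API, and a real version would persist to your docs tool rather than memory:

from dataclasses import dataclass, field

@dataclass
class Story:
    headline: str
    context: str                                    # what led to the decision
    outcome: str                                    # numeric or short qualitative result
    tags: list[str] = field(default_factory=list)   # e.g. ["billing", "proration"]
    links: list[str] = field(default_factory=list)  # PRs, tickets, dashboards

class Storybank:
    """Store layer: short narratives, searchable by tag."""
    def __init__(self):
        self.stories: list[Story] = []

    def capture(self, story: Story) -> None:
        """Event layer feeds this, e.g. from a PR-merge or deploy hook."""
        self.stories.append(story)

    def attach_to_context(self, tags: list[str]) -> list[Story]:
        """Context layer: return stories relevant to the ticket/PR at hand."""
        return [s for s in self.stories if set(tags) & set(s.tags)]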

Tactical playbook: do this in the next 30/60/90 days

0–30 days: Stop the leak

  • Mandate 1: Every decision that changes a user experience (or core infra) must have a one-paragraph “decision note” linked to the PR/ticket.

  • Capture point: Add a short “What changed / Why / Outcome metric” section to your PR and incident-report templates. Keep it ≤ 120 words.

  • Weekly habit: Run a 15-minute “what we learned” slot in your team’s weekly sync — one story only.

Decision note template (copy-paste):

TITLE: <short, imperative>
CONTEXT: <one sentence: what led to the decision>
DECISION: <what we did>
EXPECTED_METRIC: <what we were aiming to change>
ACTUAL_OUTCOME: <numeric or short qualitative>
NEXT_STEPS: <tweak, revert, monitor> 
AUTHOR, DATE
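
If you want to enforce Mandate 1 automatically, a minimal CI check might look like this sketch. It assumes your CI exposes the PR description in a PR_BODY environment variable (an assumption; adapt to your pipeline):

import os
import sys

REQUIRED = ["CONTEXT:", "DECISION:", "EXPECTED_METRIC:"]

def missing_fields(body: str) -> list[str]:
    """Return the decision-note fields absent from a PR description."""
    return [f for f in REQUIRED if f not in body]

if __name__ == "__main__":
    body = os.environ.get("PR_BODY", "")  # assumed: CI injects the PR description here
    missing = missing_fields(body)
    if missing:
        sys.exit(f"Decision note incomplete; missing: {', '.join(missing)}")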

30–60 days: Build a searchable storybank

  • Schema: short stories (200–400 words), tags, linked artifacts (PRs, dashboards), and outcome metrics.

  • Process: Convert 8 recent decisions/incidents into stories. Tag by theme (onboarding, payments, infra, UX).

  • Tooling: Start in Notion/Confluence/Mem — but enforce the schema and links. Index via simple tags and assign a named owner to each story.

Story skeleton (copy-paste):

HEADLINE: <what happened: a short story title>
WHEN / WHO: <date, owners>
THE PROBLEM: <1-2 sentences>
THE ACTION: <what we tried>
WHY IT MATTERED: <context>
OUTCOME (metrics): <before → after>
LESSON (1 line): <playbook name>
LINKS: <PR, ticket, dashboard>

60–90 days: Surface & apply knowledge in flow

  • Integrate recall: add either (a) a brief “related stories to consider” section in your PR template, or (b) a Slack bot that suggests a relevant story when certain keywords appear. (The PR Suggestion Hook sketched in the automation section below works for either.)

  • Micro-practice: pair each new hire’s onboarding with 3 “do-it-now” tasks derived from storybank playbooks (e.g., run the same smoke test, reproduce a bug fix).

  • Quarterly ritual: a 1-hour cross-team “story showcase” where each team presents two stories: one win, one failure.

Convert postmortems into playbooks (exact steps)

Too many postmortems end with “lessons learned.” Convert those lessons into playbooks your team can execute.

  1. Extract 3 reproducible steps from the postmortem that should be attempted next time.

  2. Create a short checklist (3–6 items) and a decision threshold (when to apply this playbook).

  3. Assign an owner and a test case to validate the playbook (replay a past failure in a sandbox).

  4. Publish the playbook and link it to the original story.

Example playbook (payment-retry):

  • When payment.failure_rate > 1% and error_code == TIMEOUT:

    1. Trigger retry_policy_v2 (max 3 retries, exponential backoff).

    2. Log retry_attempts and tag the ticket with retry_test.

    3. If failures persist after 3 attempts, open a manual review flag.
      Owner: payments lead. Test case: replay the last 10 failed transactions in staging.
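
As a sketch, the retry step might look like the following; retry_policy_v2’s internals, the charge callable, and TimeoutError are stand-ins for whatever your payment stack actually exposes:

import time

MAX_RETRIES = 3

def retry_policy_v2(charge, txn_id: str) -> bool:
    """Retry a timed-out charge with exponential backoff (1s, then 2s)."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            charge(txn_id)
            return True
        except TimeoutError:
            # Log retry_attempts so step 2 of the playbook stays auditable.
            print(f"retry_attempt={attempt} txn={txn_id}")
            if attempt < MAX_RETRIES:
                time.sleep(2 ** (attempt - 1))
    return False  # step 3: caller opens a manual review flag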

Metrics that prove your memory system works

Pick three of the following metrics and measure them weekly:

  1. Recall rate: % of PRs/tickets that link to one or more relevant stories. Goal: 60% within 90 days.

  2. Playbook adoption: % of incidents where a relevant playbook was executed. Goal: 40% in quarter 1 → 70% by quarter 3.

  3. Recurrence rate: frequency of repeated failures for the same class of issue (should trend down).

  4. Time-to-resolution (for recurring issues): median time to fix problems of the same class — trending down is success.

  5. Knowledge reuse events: count of times a story was surfaced and acted on (via PR suggestion, Slack, onboarding task).
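
Measuring the first metric can be a ten-line script. This sketch assumes you can export PRs with their linked-story references; the story_links field name is hypothetical:

def recall_rate(prs: list[dict]) -> float:
    """Percentage of PRs that link to at least one storybank entry."""
    if not prs:
        return 0.0
    linked = sum(1 for pr in prs if pr.get("story_links"))
    return 100 * linked / len(prs)

prs = [{"id": 101, "story_links": ["story-42"]}, {"id": 102, "story_links": []}]
print(f"Recall rate: {recall_rate(prs):.0f}%")  # Recall rate: 50%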

Low-friction automation ideas (practical experiments)

  • PR Suggestion Hook: a small script that looks for keywords in the PR description and returns 1–3 matching storybank entries. It starts as a query against your docs index (see the sketch after this list).

  • Deploy Digest: a daily automated email listing “new stories” + “stories related to today’s deploys.”

  • Onboarding checklist: for every new hire, auto-assign 3 story-based tasks in their first 2 weeks.

  • Incident-to-playbook pipeline: after a postmortem, require a follow-up ticket: “create or update playbook and assign test.”
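
A first version of the PR Suggestion Hook really can be this small. Everything here, including the shape of the index and the tag-overlap scoring, is an assumption to adapt to your storybank:

STORY_INDEX = [  # in practice, exported from your storybank
    {"headline": "Proration confusion on plan downgrades", "tags": {"billing", "proration"}},
    {"headline": "Timeout retries on the payment gateway", "tags": {"payments", "timeout"}},
]

def suggest_stories(pr_description: str, limit: int = 3) -> list[str]:
    """Return up to `limit` story headlines whose tags appear in the PR text."""
    words = set(pr_description.lower().split())
    scored = [(len(s["tags"] & words), s["headline"]) for s in STORY_INDEX]
    return [headline for score, headline in sorted(scored, reverse=True) if score > 0][:limit]

print(suggest_stories("Fix proration rounding in billing service"))
# ['Proration confusion on plan downgrades']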

These yield big returns because they make memory actionable.

One concrete example (how this plays out)

Problem: Support sees recurring confusion about subscription proration. Engineers add a band-aid; support invents manual workarounds; product re-implements the same UI twice.

Memory-first fix:

  1. Capture the incident as a story linked to the support ticket, PR, and product spec.

  2. Extract an actionable playbook: a 4-step check for proration logic, with a small test harness (sketched after this list).

  3. Surface the playbook in any PR that touches billing via the PR Suggestion Hook.

  4. Onboarding: new support hires complete the “billing playbook” task in week 1.
    Result (metric): recurrence of proration confusion drops from 3/month → 0.5/month over two quarters.
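
The test harness in step 2 can start as a handful of replay assertions. The prorate helper below is hypothetical; the point is pinning down the exact cases that caused past confusion:

import calendar
from datetime import date

def prorate(monthly_price_cents: int, change_date: date) -> int:
    """Hypothetical rule: charge only for the days remaining in the billing month."""
    days_in_month = calendar.monthrange(change_date.year, change_date.month)[1]
    days_remaining = days_in_month - change_date.day + 1
    return round(monthly_price_cents * days_remaining / days_in_month)

# Replay checks for a 30-day month (June 2025), $30.00 plan:
assert prorate(3000, date(2025, 6, 1)) == 3000   # change on day 1: full charge
assert prorate(3000, date(2025, 6, 16)) == 1500  # mid-month: half charge
assert prorate(3000, date(2025, 6, 30)) == 100   # last day: one day's worth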

Common traps & how to avoid them

  • Trap: “We’ll remember because we wrote it down.” → Reality: writing is necessary but insufficient. Pair it with recall.

  • Trap: “This is a knowledge problem, not a tooling problem.” → Reality: both matter. Good tools with no rules = clutter. Good rules with no tools = friction.

  • Trap: “We’ll build the index later.” → Reality: if you don’t capture now, you lose causal context. Capture when the memory is fresh.

Quick playbook for founders & leaders (1-page action plan)

Week 0 (this week): Add decision-note field to PR and incident templates. Mandate a 1-paragraph capture for key changes. Run one 15-minute retro slot.

Week 2: Convert 5 recent retrospective notes into storybank entries. Tag them, link artifacts.

Week 4: Deploy a simple PR Suggestion Hook (script or manual “related stories” checklist) and require 1 related story per PR.

Month 2: Add onboarding tasks from the top 10 story playbooks. Run the first cross-team story showcase.

Quarter 2: Measure recall rate and recurrence rate; iterate on playbooks.

Two ready-to-use short prompts for your team (copy-paste)

For engineers writing PRs

Before you merge, add a 1-line link to a related story (if any). If none exists, add a short decision note:
- Context
- Decision
- Expected metric

For incident responders (postmortem ending)

Action required before closing: create or update a playbook with:
- When to run (trigger)
- 3 reproducible steps
- Owner + test case
Link the playbook to the postmortem.

Final note

Culture is the multiplier. But culture without systems is optimism. If you want learning to compound — the thing that creates long-term advantage — design for memory with the same rigor you design for service reliability.

Playbooks, stories, and small automation win more than one-off trainings. Start with capture, make reuse mandatory, and measure whether knowledge changes what people do.

👉 If you found this issue useful, share it with a teammate or founder navigating AI adoption.

And subscribe to AI Ready for weekly lessons on how leaders are making AI real at scale.

Until next time,
Haroon
