The AI Evals Roadmap I Wish I Had
From vibe checking to trusted agents in production
Mar 24 • Paul Iusztin
Why RAG Has Exactly 6 Failure Modes. No More, No Less.
A complete guide for evaluating your retrieval-augmented generation systems.
Mar 17 • Paul Iusztin
Our LLM Judge Passed Everything. It Was Wrong.
Align your evaluator with human judgment, or don't trust it at all.
Mar 10 • Paul Iusztin
How to Design Evaluators That Catch What Actually Breaks
The practical guide to code-based checks, LLM judges, and rubrics for real-world AI apps
Mar 3 • Paolo Perrone
Generate Synthetic Datasets for AI Evals
5 strategies from cold start to 450 diverse inputs in minutes
Feb 24 • Paul Iusztin
No Evals Dataset? Here's How to Build One from Scratch
Build evaluators to signal problems that users actually care about. Step-by-step guide.
Feb 17 • Paul Iusztin
Integrating AI Evals Into Your AI App
The holistic guide: From optimization to production monitoring
Feb 10 • Paul Iusztin
Behind the Scenes of AI Observability in Production
What actually works after 6 months of trial and error
Feb 3 • Alejandro Aboy
Your Agent's Reasoning Is Fine - Its Memory Isn't
Using GraphRAG to build a Production Engineer agent that knows dependencies, incidents, and ownership.
Jan 20 • Anca Ioana Muscalagiu
From 100+ AI Tools to 4: My Prod Stack
How simplicity beats complexity in real AI systems
Dec 30, 2025 • Paul Iusztin
We Killed RAG, MCP, and Agentic Loops. Here's What Happened.
A brutally honest case study of building our vertical AI agent and shipping it to production.
Dec 23, 2025 • Paul Iusztin
Stop Launching AI Apps Without This Framework
A practical guide to building an eval-driven loop for your LLM app using synthetic data, before you have users.
Oct 30, 2025 • Hugo Bowne-Anderson