Thanks for yet another wonderful collaboration, Paul!
I hope that it helps your audience build more reliable AI-powered software :)
Great collab as always, Hugo! Hehe, I hope so too. It's hard to get into this eval-driven mindset.
This piece really resonated with me. The 'POC Purgatory' is such an accurate description; it’s a problem I've seen too often. Your emphasis on Evaluation-Driven Development is spot on. It truly feels like we're still figuring out the engineering principles for robust LLM apps. A fantastic read.
I'm so glad it resonated, Daniel, and thank you for your kind words!
Thanks for the good 😊
Amazing collab between you guys!
It's really helpful to read about these kinds of approaches, which set out best practices that help build meaningful products beyond demos.
Thanks for sharing!
Agree! Evaluation-driven design is probably the future of software as we start integrating more AI into it.
Also glad you enjoy this “Hugo” month 🤟😂
"vibe something to start, then add more tests/use cases to hit, see if you hit them, if not, figure out why, fix those problems, keep running those tests/evals, repeat", simple!