Course · YouTube Library
AI Engineer — Evals & Observability
AI Engineer · 6 lessons · 5h 14m
1. Build a Prompt Learning Loop - SallyAnn DeLucia & Fuad Ali, Arize
   Feedback-driven prompt optimization · intermediate · 52m
2. How METR measures Long Tasks and Experienced Open Source Dev Productivity - Joel Becker, METR
   Measuring AI agent developer productivity · intermediate · 1h 16m
3. Judge the Judge: Building LLM Evaluators That Actually Work with GEPA — Mahmoud Mabrouk, Agenta AI
   LLM evaluator calibration and optimization · intermediate · 41m
4. What Do Models Still Suck At? - Peter Gostev, Arena.ai, BullshitBench
   Benchmarking LLM real-world limitations · advanced · 20m
5. Why building eval platforms is hard — Phil Hetzel, Braintrust
   Building effective LLM agent eval platforms · intermediate · 26m
6. Shipping complex AI applications — Braintrust & Trainline
   Observability for production AI systems · intermediate · 1h 39m