Robert Lehmann
Planet-Scale Dashboards
#1 about 3 minutes
The challenge of creating monitoring dashboards from scratch
Monitoring is often an afterthought, leading to painful incident response without the necessary dashboards for troubleshooting.
#2 about 3 minutes
Understanding Google's unique observability scaling challenges
Google's massive scale, global distribution, and monorepo architecture created a unique need for a scalable, reusable monitoring solution.
#3 about 5 minutes
Building reusable dashboards with templated dimensions
Replace hardcoded values in queries with template variables, called dimensions, to create a single dashboard that can be reused for any service.
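The idea of templated dimensions can be sketched in a few lines of Python. This is a minimal illustration, not the system described in the talk: the PromQL-like query syntax, the metric name `http_requests_total`, and the dimension names `service` and `region` are all assumptions made for the example.

```python
from string import Template

# Hypothetical query with two dimensions instead of hardcoded values.
# The metric name and label keys are illustrative only.
QUERY_TEMPLATE = Template(
    'sum(rate(http_requests_total{service="$service", region="$region"}[5m]))'
)


def render_query(template: Template, dimensions: dict[str, str]) -> str:
    """Fill in every dimension; raises KeyError if one is missing."""
    return template.substitute(dimensions)


# The same dashboard definition serves two different services.
checkout_query = render_query(
    QUERY_TEMPLATE, {"service": "checkout", "region": "eu-west"}
)
search_query = render_query(
    QUERY_TEMPLATE, {"service": "search", "region": "us-east"}
)
print(checkout_query)
print(search_query)
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a dashboard with an unfilled dimension fails loudly instead of silently sending a query with a literal placeholder.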
#4 about 6 minutes
Solving dashboard discovery with scopes and traits
Address the problem of too many dashboards by having users select a "scope" (e.g., a service); the system then uses the traits it has discovered for that scope to show only the relevant dashboards.
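One way to picture trait-based discovery is as a subset check: a dashboard is relevant to a scope when every trait it requires is among the traits discovered for that scope. The sketch below assumes exactly that matching rule; the class names, trait names, and dashboard titles are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    name: str
    traits: frozenset[str]  # assumed to be discovered at runtime


@dataclass(frozen=True)
class Dashboard:
    title: str
    required_traits: frozenset[str]


def relevant_dashboards(scope: Scope, dashboards: list[Dashboard]) -> list[str]:
    """Return titles of dashboards whose required traits the scope has."""
    return [d.title for d in dashboards if d.required_traits <= scope.traits]


# Hypothetical catalog of dashboards and one selected scope.
ALL_DASHBOARDS = [
    Dashboard("RPC latency", frozenset({"serves_rpc"})),
    Dashboard("JVM heap", frozenset({"runs_jvm"})),
    Dashboard("RPC errors by JVM version", frozenset({"serves_rpc", "runs_jvm"})),
    Dashboard("SQL replication lag", frozenset({"sql_replica"})),
]
checkout = Scope("checkout", frozenset({"serves_rpc", "runs_jvm"}))

print(relevant_dashboards(checkout, ALL_DASHBOARDS))
```

The SQL dashboard is filtered out for `checkout` because that scope never exhibited the `sql_replica` trait, so the user never has to search past it.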
#5 about 2 minutes
Modeling different entities with scope types
Introduce "scope types" to create namespaces for different kinds of monitorable entities, such as servers, databases, or machine learning models.
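Treating scope types as namespaces can be sketched as a catalog keyed by type, so that a server and a database may share a name without colliding. The `ScopeCatalog` class and the type names below are hypothetical, chosen only to illustrate the namespacing idea.

```python
from collections import defaultdict


class ScopeCatalog:
    """Keeps scope names in a separate namespace per scope type."""

    def __init__(self) -> None:
        self._by_type: dict[str, set[str]] = defaultdict(set)

    def register(self, scope_type: str, name: str) -> None:
        self._by_type[scope_type].add(name)

    def contains(self, scope_type: str, name: str) -> bool:
        # .get avoids creating empty namespaces as a side effect of lookups.
        return name in self._by_type.get(scope_type, set())

    def types_of(self, name: str) -> list[str]:
        """List every scope type under which this name is registered."""
        return sorted(t for t, names in self._by_type.items() if name in names)


catalog = ScopeCatalog()
catalog.register("server", "payments")
catalog.register("database", "payments")  # same name, different namespace
catalog.register("ml_model", "fraud-detector")

print(catalog.types_of("payments"))
print(catalog.contains("ml_model", "payments"))
```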
#6 about 4 minutes
Why infrastructure as code is not the right solution
Static provisioning with infrastructure-as-code or dashboards-as-code is insufficient because it lacks dynamic runtime information and creates a stale second source of truth.
#7 about 3 minutes
Improving performance at scale with query variants
Use pre-aggregated metrics and define multiple query "variants" within a graph, allowing the system to automatically select the most performant query based on the user's drill-down level.
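Variant selection can be sketched as picking the cheapest query that still retains every dimension the current drill-down needs. The selection rule, the `cost` field, and the query names below are assumptions for illustration; the talk does not specify how its system ranks variants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    query: str                  # hypothetical name of a (pre-aggregated) metric
    dimensions: frozenset[str]  # dimensions this variant still retains
    cost: int                   # assumed relative cost; lower is cheaper


def pick_variant(variants: list[Variant], needed: frozenset[str]) -> Variant:
    """Pick the cheapest variant that retains every needed dimension."""
    candidates = [v for v in variants if needed <= v.dimensions]
    if not candidates:
        raise LookupError(f"no variant retains dimensions {sorted(needed)}")
    return min(candidates, key=lambda v: v.cost)


VARIANTS = [
    Variant("requests_by_task", frozenset({"service", "region", "task"}), cost=1000),
    Variant("requests_by_region", frozenset({"service", "region"}), cost=50),
    Variant("requests_by_service", frozenset({"service"}), cost=1),
]

# Overview: only the service matters, so the coarsest aggregate is enough.
print(pick_variant(VARIANTS, frozenset({"service"})).query)
# Drilling down to a single task forces the raw, most expensive variant.
print(pick_variant(VARIANTS, frozenset({"service", "task"})).query)
```

The graph definition stays the same at every zoom level; only the variant behind it changes as the user drills down.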
#8 about 1 minute
Visualizing dependencies with a service graph
Leverage the scope and dependency information to build a service graph that helps engineers quickly navigate between related systems during an incident.
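A service graph of this kind can be sketched as a plain adjacency map from each scope to the scopes it depends on. The service names and the two helper functions below are hypothetical; they only illustrate how an engineer might hop one step upstream or downstream during an incident.

```python
# Hypothetical dependency data: each scope maps to the scopes it calls.
DEPENDENCIES: dict[str, list[str]] = {
    "frontend": ["checkout", "search"],
    "checkout": ["payments-db"],
    "search": [],
    "payments-db": [],
}


def downstream(graph: dict[str, list[str]], scope: str) -> list[str]:
    """Scopes that the given scope depends on directly."""
    return sorted(graph.get(scope, []))


def upstream(graph: dict[str, list[str]], scope: str) -> list[str]:
    """Scopes that depend directly on the given scope."""
    return sorted(s for s, deps in graph.items() if scope in deps)


# Starting from a failing checkout service, look one hop in each direction.
print(downstream(DEPENDENCIES, "checkout"))
print(upstream(DEPENDENCIES, "checkout"))
```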
#9 about 1 minute
Key takeaways for building planet-scale dashboards
A summary of the core principles: use dimensions for reusability, traits for discovery, scope types for genericity, and variants for performance.