Security wants detections and investigations in logs because they are not able to detect threats in time.
What is Cloud SIEM?
What’s Datadog’s “secret sauce” in one word: connecting metrics, logs, traces, and user data together?
Correlation
“A time series chart of CPU%, request count, error rate.”
Metrics
What tool is important for a Product Owner/Product Manager?
Product Analytics
“We want one tool instead of five different dashboards.”
What is tool consolidation?
“Users say the app feels slow, but our servers look fine.”
What is Real User Monitoring (RUM)?
Customer asks: “Can Datadog monitor cloud + on-prem together?”
Yes, explain how
“A searchable record of an error message, stack trace, and context fields.”
Logs
The “allowed downtime” concept that helps teams balance reliability vs shipping features is called what?
What is an error budget?
“Our incidents take forever to resolve.”
What is high MTTR?
“We don't have a single pane of glass for managing the problems in our infrastructure, and routing and escalations are chaos.”
What is Datadog Incident Management Suite?
Customer says, “We have microservices that we need to track.” Which product helps them here?
Application Performance Monitoring (APM)
“Follow a single request as it travels through multiple services.”
The meeting/report teams do after an incident to capture learnings is called what?
What is a post-mortem?
“We keep getting paged for noise.”
What is Alert Fatigue?
“We just need to know if our website is up or down from different locations.”
What is Synthetic Monitoring (API tests)?
The Datadog concept for measuring reliability against a target (e.g., 99.9%) is called what?
SLOs
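A reliability target like the 99.9% example above translates directly into an error budget, the "allowed downtime" asked about earlier. The following is a minimal sketch of that arithmetic, not a Datadog API: the function name `error_budget_minutes` and the 30-day window are assumptions chosen for illustration.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the allowed downtime, in minutes, for an availability target.

    slo_target is a fraction (0.999 for 99.9%); window_days is an assumed
    measurement window, not something defined by the quiz.
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    total_minutes = window_days * 24 * 60
    # The error budget is the share of the window allowed to be "bad".
    return (1.0 - slo_target) * total_minutes


if __name__ == "__main__":
    # 99.9% over an assumed 30-day window
    print(f"{error_budget_minutes(0.999):.1f} minutes")  # prints "43.2 minutes"
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of downtime per 30 days, which is the budget a team can "spend" on risky releases before reliability work takes priority.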
“We need to connect a spike in errors to the exact related logs and trace spans.”
What is Correlation across metrics, logs, and traces?
The set of metrics often used to measure software delivery performance (deployment frequency, lead time, MTTR, change failure rate).
DORA Metrics
“I need to understand my data pipelines and see where the problem is.”
Data Streams Monitoring
"We need to track costs by team/service and cut wasted spend.”
Cloud Cost Management
The Datadog view that shows how services depend on each other is called what?
Service Map
“We want to see the full story of an incident in one place without tab-hopping.”
What is a unified observability workflow (single platform investigation)?
The job role focused on reliability, error budgets, and operational excellence.
Site Reliability Engineer
“I need a tool to understand the security problems in my application.”
App & API Protection