An event that disrupts or diminishes the quality of Spreedly’s service, impacting a customer's ability to use the product.
What is an incident?
State the navigation path to the incident page in Datadog.
What is "Service Management -> Incidents"?
- Complete the statement below. -
If you are unsure whether something you are seeing is an incident, _____ _____, err on the side of caution, and page someone.
What is "say something"?
These two ticket types are used for organizing and tracking tickets in Zendesk during an incident.
What are "Problem" and "Incident" tickets?
Spreedly currently has _____ severity levels.
What is "4"?
The goal of incident response is to ______ ______ as quickly as possible.
What is "restore service"?
True or False. Datadog allows posting directly to the status page.
What is "True"?
Some (not all) requests to Spreedly’s public API to tokenize, deliver, authorize, or purchase are down. Give this situation a severity.
What is a "Severity 2"?
State what's happening below.
What is "paging the incident coordinator on call"?
These three Slack channels are important during incidents.
What are:
-"incident-chat"
-"incident-response"
-"incident-post-review"
Part of this individual's responsibility is to:
-Declare the severity level of the incident
-Approve public messaging
-Form the incident response team
What is "Incident Coordinator"?
True or False: When declaring an incident in Datadog, an individual must choose the severity level and the correct Incident Coordinator before declaring the incident.
What is "False"?
These items are not mandatory, and the Incident Coordinator (IC) will update the fields.
True or False: A former customer writes in and states that all transactions are down on the Datatrans gateway. You should initiate an incident.
What is "False"?
This is not a current Spreedly customer, so we should not initiate an incident.
You are the support engineer on call for incidents, and you receive a page from OpsGenie. Indicate the channel you should access first.
What is "incident-chat"?
In a _____ session, we review the problems, bottlenecks, mistakes, and successes that occurred during the incident.
What is a "Retro" session?
This incident responder is called in to resolve the issue and identify the root cause of the incident.
What is "Engineering Responder"?
True or False. Datadog has built-in monitoring that can aid our engineers in the early detection of an issue.
What is "True"?
Datadog's monitoring enables our engineers to detect issues more quickly, even before customers are aware.
True or False: We don't have to engage the Account Manager for sensitive accounts after an incident.
What is "False"?
We should engage the Account Manager when we know the account is sensitive, especially in sticky situations like incidents.
List one of the three ways an incident may be triggered.
2. What is "monitoring"?
3. What is "via red alert"?
True or False: You have been asked to update the status page of an ongoing incident. You should write a draft and post it without review.
What is "False"?
You should write up a draft and have it reviewed by the Incident Coordinator.
This incident responder is in charge of customer communications, updating the Status Page, and handling communications via Zendesk and other channels.
What is "Support Responder"?
Name one of the two ways we can declare an incident.
1. What is "from Slack, type /datadog"?
Or
2. What is "from Datadog, Service Management → Incidents"?
See Visual: Declaring An Incident
All requests to Spreedly’s public API to tokenize, deliver, authorize, and purchase are down. Give this situation a severity.
What is a "Severity 1"?
True or False: A post-mortem will follow an incident that the incident coordinator labeled as a Sev1 or Sev2.
What is "True"?
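The severity and post-mortem cards in this deck can be summarized as a small decision rule. The Python below is only a study-aid sketch, not how Spreedly or Datadog assigns severity: the function names and the request counts passed in are hypothetical, and in practice the Incident Coordinator declares the severity. Only the Sev1 and Sev2 rules come from the cards; the deck does not define Sev3 or Sev4, so the sketch returns None for anything else.

```python
from enum import IntEnum
from typing import Optional


class Severity(IntEnum):
    """The four severity levels mentioned in the deck."""
    SEV1 = 1
    SEV2 = 2
    SEV3 = 3
    SEV4 = 4


def core_api_severity(failing_requests: int, total_requests: int) -> Optional[Severity]:
    """Hypothetical helper: rate an outage of the public API
    (tokenize, deliver, authorize, purchase) using the two cards above."""
    if total_requests <= 0 or failing_requests <= 0:
        # Nothing is failing, or there is no traffic to judge.
        return None
    if failing_requests >= total_requests:
        # All requests are down -> Severity 1.
        return Severity.SEV1
    # Some (not all) requests are down -> Severity 2.
    return Severity.SEV2


def needs_post_mortem(severity: Severity) -> bool:
    """A post-mortem follows incidents labeled Sev1 or Sev2."""
    return severity in (Severity.SEV1, Severity.SEV2)


if __name__ == "__main__":
    sev = core_api_severity(failing_requests=40, total_requests=100)
    print(sev.name, needs_post_mortem(sev))
```

If you are unsure which case applies, the deck's guidance still holds: say something and page someone.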
This channel is used after an incident to relay details of the post-mortem.
What is "incident-post-review"?