Where do you go to find out who a client's ISP is?
ITG Internet / WAN asset
You're working on a LOB app performance issue affecting everyone but need to restart SQL. Who do you contact to get approval? Do you need a CR?
The POC. No CR is needed since it's an incident, but you still need to notify CC Teams.
Something bad happened and you're trying to set a priority on the ticket but can't decide between two priorities (Med / High)
Which one do you set?
High. Always set the higher of the two priorities if you're unsure
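The "pick the higher one" rule can be sketched in a few lines of Python. This is only an illustration: the `PRIORITY_ORDER` list and the `pick_priority` helper are hypothetical names, and the Low-to-Critical ordering is an assumption, not something pulled from the ticketing system.

```python
# Assumed ordering, lowest to highest. Adjust to match the real ticketing system.
PRIORITY_ORDER = ["Low", "Med", "High", "Critical"]


def pick_priority(first: str, second: str) -> str:
    """Return the higher of two candidate priorities."""
    # max() compares by position in PRIORITY_ORDER, so a later entry wins.
    return max(first, second, key=PRIORITY_ORDER.index)


print(pick_priority("Med", "High"))  # High
```

An unknown priority name raises a `ValueError`, which is probably what you want: better to fail loudly than guess.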
What do you fill out after an incident is resolved?
Incident report
You use this to get pre-windows lights out control on HP servers
Bonus points if you know the words in the acronym
Integrated Lights-Out (iLO)
Uh oh. A user called to report their office internet is down. What's the first thing you do?
Hint: it's the first step in our network down ops manual article.
Check monitoring to verify that the alert or report from a user is indeed an outage.
Worldox is failing to launch for everyone. Who do you call if the client has both Baker Cadence and Worldox Support listed as vendors?
WD Support
BC is for consulting, projects or complex changes.
Who do you communicate incident updates to on the client side?
POC or alternate POCs if they're unavailable
Uh oh. Colo is officially down, what priority do we use?
Priority 1 - Critical
You use this to get pre-windows lights out control on Dell servers
Bonus points if you know the words in the acronym
Integrated Dell Remote Access Controller (iDRAC)
Modem is offline and confirmed to be powered on with cables firmly reseated. Who do you call to fix it?
ISP
Uh oh. PC Law is spitting out a funky error preventing users from using Matter Manager. You've gone through our docs, Googled and checked with our team but no solution in sight.
What do you do next?
Call the vendor for support
When you communicate outage updates, which medium do you use (phone, email, RFC 2549)? Why?
Phone (cell # if phones are down) because it's fastest, and if the network is down emails probably won't work
Ah man. A network is confirmed to be down. Which document should I follow to troubleshoot?
Network Down in the SDOM
Hmm... RAID controller is showing a failed disk. What do you do next?
Hint: this answer is pretty flexible so use your judgement
1. Check that we have a good backup and possibly run another one for now + shorten the time between backups
2. Dispatch onsite tech to reseat drive. If it still shows failed then remove drive.
3. Contact vendor while onsite to get a warranty issued.
In which order do you troubleshoot an internet down incident?
Router, AD / DNS server(s), Modem, Switch
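For anyone who likes to see the order as code, here is a rough Python sketch that walks the devices in the sequence given above and reports the first one that fails. The `first_failing` helper and the lambda checks are hypothetical stand-ins. Real checks would come from monitoring or whatever the ops manual article says.

```python
from typing import Callable, Dict, Optional

# Order taken from the answer above.
TROUBLESHOOT_ORDER = ["Router", "AD / DNS server(s)", "Modem", "Switch"]


def first_failing(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run the checks in the documented order and return the first device that fails."""
    for device in TROUBLESHOOT_ORDER:
        check = checks.get(device)
        # A device with no check supplied is skipped rather than treated as failed.
        if check is not None and not check():
            return device
    return None


# Hypothetical stand-in checks: each returns True when the device looks healthy.
example_checks = {
    "Router": lambda: True,
    "AD / DNS server(s)": lambda: True,
    "Modem": lambda: False,
    "Switch": lambda: True,
}
print(first_failing(example_checks))  # Modem
```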
Oh man. You have to do some risky stuff to fix a LOB app issue after hours on a Windows VM.
What should you do before and after you apply the fix?
Create an HV checkpoint before applying the fix and delete the checkpoint afterwards.
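The before-and-after pattern could be wrapped up like this. It is a minimal sketch, assuming the Hyper-V PowerShell module is available on the host and that `Checkpoint-VM` and `Remove-VMSnapshot` are run locally. The `hv_checkpoint` and `run_powershell` helpers and the VM and checkpoint names are made up for illustration. One design choice that goes beyond the answer above: if the fix blows up, the checkpoint is left in place so there is something to roll back to.

```python
import subprocess
from contextlib import contextmanager


def run_powershell(command: str) -> None:
    """Run one PowerShell command on the local host and raise if it fails."""
    subprocess.run(["powershell.exe", "-NoProfile", "-Command", command], check=True)


@contextmanager
def hv_checkpoint(vm_name: str, checkpoint_name: str, run=run_powershell):
    """Create a checkpoint before the risky work and delete it after a clean finish."""
    run(f"Checkpoint-VM -Name '{vm_name}' -SnapshotName '{checkpoint_name}'")
    yield
    # Only reached when the body did not raise. On failure the checkpoint
    # stays so the VM can be reverted by hand.
    run(f"Remove-VMSnapshot -VMName '{vm_name}' -Name '{checkpoint_name}'")


# Example usage (hypothetical VM name and fix function):
# with hv_checkpoint("LOB-APP01", "pre-fix"):
#     apply_risky_fix()
```

The `run` parameter is there so the helper can be dry-run with a fake runner before pointing it at a real host.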
Uh oh. A security incident has occurred and you've confirmed it with the client and systems. You've notified the CC team and are working on the issue now.
What's your first priority?
Stop the attack in progress
Awww yeah. The server performance issue self resolved.
Do we call it a win and close the issue?
No. If you did nothing then it's likely to recur. Be a good tech and find out why it happened and what mitigations should be put in place to prevent it from happening again.
Uh oh. Server won't power on and onsite contact confirmed that power is good on that circuit.
What should you do? Walk us through the steps and troubleshooting you might do.
1. Have onsite contact check that power cables are seated and UPS is powered on.
2. Dispatch onsite ASAP
3. While onsite, double check tests and contact vendor for warranty support.
4. Spin up BDR
Uh oh. Meraki is showing as online but all computers and servers are offline.
Wut could it be? Name 2 likely causes [SERIOUS]
¯\_(ツ)_/¯
DNS, L2 switching related, L3 (bad firewall rule, NAT, etc)
When assessing a LOB app outage, what should you do on that initial contact with them?
Phew. Security incident averted and incident report filed.
What is the last thing to do and who should do it?
Have Scott / Alfred contact the client with the incident report details.
Who can never be trusted?
The user
Server has died and is out of warranty. We've told the client multiple times already to replace the thing and they've turned us down.
After you've chuckled at their misfortune what are a few things you can do?
Send to sales and get a quote out for a replacement.
Get a hot spare replacement server in place with approval from Andy, Scott or Alfred.
Fire up BDR. Chances are if they didn't listen to our warnings about the server they didn't want to buy a BDR.