Fastest command to determine TSDB app health status across a Pod
tbrpc status
Number of master nodes in the per-pod OpenSearch clusters
6
Department that ensures Build Agents are standardized
TechOps
What to do when vpn.logicmonitor.net is not working for you
1. Clear route tables
2. Try location-specific VPN
Process for getting help after receiving an alert
1. Ping TOP Slack channel
2. Escalate the alert via PagerDuty (PD)
25TB for DC
12TB for AWS
Number of worker nodes in per-pod OpenSearch clusters
3
Department responsible for correcting/adjusting ACL permissions in Bamboo
InfoSec, à la Engineering Access Teams
How can you quickly notify backup on-call that you need help?
or File an SD ticket and let the page slip through
One way to keep support updated about a Service Disruption
Communicate via Slack in #help-support or #techops-support
What system or service does TSDB consume data from?
Kafka (topic name: tsdb)
How to resolve out-of-storage alerts in OpenSearch?
Resize cluster EBS volumes on worker nodes
Reason we cannot safely stop Santaba deployments
The Ansible play, if cancelled midway, leaves Santaba in a non-functioning state
Why load increases on CSProxies
When collectors cannot reach their normal endpoint (bonus for CAv2 mention)
Best course of action for Santaba HTTPS Can't Connect alerts at 2am
Restart Santaba
Do this when Santaba queries to one TSDB are failing
Shut down the TSDB app server on that server
How to resolve OpenSearch worker node over-utilization
Resize the instance types of the worker nodes
How Bamboo deploys non-k8s applications (and which are they)
Dockerized Ansible playbooks (Santaba, Reporting, TSDB)
Reason why a customer cannot reach LogicMonitor
The network
Two ways to get Development online to help with a Service Disruption
1. Escalate via PagerDuty
2. File a DEVTS P0 ticket
Two ways to restore data to TSDB
1. TSDB restore from backup procedure
2. tbclone
Two ways we limit access to OpenSearch clusters
1. VPC-facing (old ones were public-facing)
2. IAM policies
How Bamboo deploys k8s applications
k8sdeployer
How you determine active companies on a Santaba using only MySQL commands
select name, status from santaba.companies where status='active';
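The query above can be tried out without touching a real Santaba database. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and it assumes the companies table has just the two columns named in the query. The sample rows and the 'suspended' status value are made up for the example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the Santaba MySQL server.
conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "santaba" so the qualified
# table name santaba.companies from the query resolves as written.
conn.execute("ATTACH DATABASE ':memory:' AS santaba")
# Hypothetical minimal schema: only the columns the query references.
conn.execute("CREATE TABLE santaba.companies (name TEXT, status TEXT)")
# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO santaba.companies (name, status) VALUES (?, ?)",
    [("acme", "active"), ("globex", "suspended"), ("initech", "active")],
)


def active_companies(connection):
    """Return the names of companies whose status is 'active', sorted by name."""
    rows = connection.execute(
        "SELECT name, status FROM santaba.companies "
        "WHERE status = 'active' ORDER BY name"
    )
    return [name for name, _status in rows]


print(active_companies(conn))
```

Dropping the WHERE clause lists every company along with its status, which can help when a company is in an unexpected state.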
How you recover a failed Santaba server