Cloud DevOps Team
If you’re looking for Delivery documentation instead of DevOps, visit this page.
The two primary pillars of the Cloud Team are Availability and Observability, as defined in RFC 498.
This team ensures that Cloud has the same reliability and availability as other world-class SaaS offerings. It is also responsible for the observability monitoring and tooling used to verify that we are meeting these goals.
More detail can be found in our Cloud Vision.
Areas of Ownership
The Cloud DevOps team is responsible for the infrastructure used to host Cloud. This includes, but is not limited to, dashboards and observability, uptime and reliability, and management of our cloud provider resources. This team works closely with the other teams in the Cloud org to ensure Cloud is available and functional for our users. Notably, this team can slow or stop rollouts to Cloud if needed to improve stability.
This team is responsible for:
- Continuous deployment of Cloud
- Cloud monitoring infrastructure (Prometheus / Grafana)
- Managed instances
Team members
- Bill Creager, Product Manager
- Jennifer Mitchell, Engineering Manager - DevOps
- Dax McDonald, Software Engineer
- Manuel Ucles, Software Engineer
- Filip Haftek, Software Engineer DevOps
- Daniel Dides, Cloud DevOps
- Sander Ginn, Cloud DevOps Engineer
Guides and processes
- How to deploy a code change to the Cloud
- Large release (rollout release) process
- How to make configuration changes to Cloud
- Datadog monitoring
- Onboarding
- How to add or modify DNS Records
- Disaster Recovery
- How to resize disks in StatefulSet
- How to use preprod aka staging
- Persistent disk backup schedule
- Silencing Alerts
How to contact the team and ask for help
The best way to contact the Cloud DevOps team is in the #cloud-devops Slack channel.
Goals
- Assist the Cloud-SaaS team with RFC 525
- Stabilize our CD environment
- Create a pre-production environment for Cloud
- Standardize our monitoring, logging and error reporting systems
- Migrate our zonal cluster to a regional cluster and document the process
- Complete security tasks assigned by the Security team and provide evidence of completion