Overview

Join to apply for the Senior Incident Manager role at Databricks.

At Databricks, we are passionate about empowering data teams to tackle the world’s most challenging problems, building and operating the world’s best data and AI infrastructure platform to enable customers to leverage deep data insights and enhance their business. We design and scale our services across millions of virtual machines with a focus on reliability, transparency, and continuous improvement.

This role combines operational leadership, technical systems knowledge, and exceptional communication skills. You will be at the intersection of engineering depth and operational clarity, ensuring that major incidents are managed with precision and that customers and stakeholders remain informed and confident during high-impact events.

Responsibilities

- Lead critical incidents: coordinate multi-disciplinary response efforts across Databricks’ cloud-based services to rapidly mitigate impact and restore operations.
- Drive technical root cause analysis and reliability improvements: collaborate with engineering teams to trace and document underlying causes across distributed systems, services, and data stores. Summarize key learnings, communicate action items clearly, and ensure technical and procedural improvements are followed through.
- Own communications during incidents: deliver frequent, high-quality updates to internal stakeholders (executives, engineering leadership, support) and publish customer-facing notifications that are accurate, timely, and empathetic.
- Mentor and train peers in incident communication and technical response disciplines to raise the overall quality of Databricks’ incident response.

What we are looking for (Qualifications)

- 5+ years of experience in incident management, site reliability engineering, or production operations supporting large-scale, cloud-native systems.
- Proven ability to lead and coordinate high-severity incidents, including identifying impact, isolating fault domains, and managing multi-team response efforts.
- Strong understanding of cloud infrastructure (AWS, Azure, or GCP), including compute, networking, storage, and observability components.
- Deep expertise in log analysis and debugging: familiarity with log aggregation and search tools (e.g., Datadog, Elasticsearch, Splunk, Cloud Logging, or OpenTelemetry).
- Hands-on experience with observability systems: metrics, logging, and tracing frameworks (Prometheus, Grafana, OpenTelemetry, etc.).
- Proficiency in at least one major programming or scripting language (Python, Go, or Bash) for automating diagnostics, data collection, or analysis.
- Experience developing and maintaining incident playbooks and communication templates to ensure consistent, timely updates.
- Excellent contextual interpretation and writing skills, with the ability to summarize and communicate to both technical and business audiences.
- BS, Master’s or other advanced degree in Computer Science or Computer Engineering, or related engineering field.

Pay Range Transparency

Databricks is committed to fair and equitable compensation practices. The pay range for this role is listed below and represents the expected salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on factors including skills, depth of experience, certifications, training, and location. The total package may include eligibility for annual performance bonus, equity, and benefits. For more information regarding which range applies to your location, please visit the company page.

Zone 1 Pay Range