Position Description
The Senior DevOps & SRE Manager - Platform Reliability & Global Operations is a senior technical leader responsible for the reliability, scalability, security, and operational excellence of a complex, multi-platform ecosystem spanning applications, workflows, event streaming, and data platforms.

Location & Work Arrangement
Candidates must be able to work primarily within Pacific or Central Time Zone business hours to support collaboration with global teams. Employees located within 50 miles of a Qcells office (e.g., Irvine, San Francisco, Houston, or South Carolina locations) are expected to follow the company's hybrid work policy of at least three in-office days per week.

Responsibilities
Lead and scale a global, multi-tier (L1/L2/L3) DevOps and SRE organization
Design and operate follow-the-sun on-call and support models
Own incident management, including Sev-1/Sev-2 incident command and executive communication
Define and operate SLOs, SLIs, and error budgets across apps, workflows, events, and data pipelines
Oversee DevOps practices for CI/CD, Kubernetes, IaC, automation, and cost optimization
Ensure reliable operation of event-driven and telemetry pipelines
Govern and manage third-party DevOps and SRE vendors, including SLAs and escalations
Drive operational maturity: post-mortems, automation, reliability improvements
Partner with security on secure operations, incident response, and compliance readiness

Platforms in Scope
Application Platforms: Kubernetes, containerized, EMS telemetry & control
Workflow Orchestration: Fleet Manager, Power Automate, cross-system workflows
Event & Streaming: Microsoft Event Hub, event streams, Kafka, RabbitMQ
Data & Telemetry: Microsoft Fabric, Kusto, PostgreSQL, TimescaleDB, Cassandra
CI/CD & Infrastructure: GitHub Actions, Jenkins, Terraform, Helm, Ansible (Azure & AWS)
IAM across Azure and AWS
Experience with Salesforce and Snowflake preferred

Technical Strengths
Kubernetes and container platforms in production
Azure (required), AWS
Event streaming and messaging systems
Data pipelines and telemetry platforms
Power Pages, Power Automate
CI/CD, Infrastructure as Code, and automation
Observability and incident troubleshooting at scale

Operational Expectations
Escalation management for on-call and major incidents
Willingness to work off-hours when required
Comfortable making high-impact decisions under pressure

Required Qualifications
15+ years in DevOps, SRE, Platform Engineering, or Production Operations
5+ years leading globally distributed engineering teams
Proven ownership of 24x7, mission-critical production platforms
Strong experience managing third-party vendors and managed service providers
Deep hands-on experience with Kubernetes, cloud platforms, and event-driven systems

Preferred Qualifications
Solar industry experience (Renewable)

Use of AI Tools
Qcells expects team members to leverage AI models and AI-assisted tools in their daily workflows where appropriate. Candidates should be comfortable working in an AI-augmented environment and applying sound judgment when using AI-generated outputs. During the interview process, candidates will be asked to share examples of how they have used AI tools or models in their work.

Salary Range
The salary range is required by the California Pay Transparency Act and may differ depending on the location of candidates hired nationwide. Actual compensation is influenced by a wide array of factors including, but not limited to, skill set, education, licenses and certifications, essential job duties and requirements, and the necessary experience relative to the job's minimum qualifications. This target salary range is for CA positions only and should not be interpreted as an offer of compensation.

Job Title
Senior Manager, DevOps & SRE - Platform Reliability & Global Operations