Staff Software Development Engineer - Customer/Cloud Reliability
Role
We are looking for a Staff Software Engineer to join our team. This is a hybrid role, reporting to the Director, SW Engineering in the Software Engineering department. You will drive the technical evolution of our cloud reliability platforms, designing distributed software that enables autonomous detection and remediation of issues at enterprise scale. By leveraging agentic AI frameworks and deep systems expertise, you will reduce mitigation times and build the foundational tools that ensure our cloud services remain secure and highly available.
What You'll Do (Role Expectations)
Managing and resolving high-impact customer escalations for enterprise products and services
Acting as a technical liaison between engineering and support teams to drive rapid issue resolution
Debugging and troubleshooting complex problems in cloud environments and operating systems (Linux/Unix)
Driving hotfixes, patch releases, and other corrective actions to maintain high service availability
Identifying and recommending enhancements to proactively improve product reliability and customer experience
Who You Are (Success Profile)
You thrive in ambiguity. You're comfortable building the path as you walk it. In a dynamic environment, you see ambiguity not as a hindrance, but as the raw material to build something meaningful.
You act like an owner. Your passion for the mission fuels your bias for action. You operate with integrity because you genuinely care about the outcome. True ownership involves leveraging dynamic range: the ability to navigate seamlessly between high-level strategy and hands-on execution.
You are a problem-solver. You love running toward challenges because you are laser-focused on finding the solution, knowing that solving the hard problems delivers the biggest impact.
You are a high-trust collaborator. You are ambitious for the team, not just yourself. You embrace our challenge culture by giving and receiving ongoing feedback, knowing that candor delivered with clarity and respect is the truest form of teamwork and the fastest way to earn trust.
You are a learner. You have a true growth mindset and are obsessed with your own development, actively seeking feedback to become a better partner and a stronger teammate. You love what you do and you do it with purpose.
What We're Looking For (Minimum Qualifications)
Foundational understanding of AI/ML technologies and experience leveraging, securing, or positioning AI-driven solutions to optimize outcomes within your functional domain
5+ years of experience supporting enterprise products/services and handling customer escalations
In-depth knowledge of Linux/Unix operating systems and operating system principles
Demonstrated expertise in debugging issues in cloud environments and strong understanding of networking concepts including TCP/IP stack performance
Experience in identity and access management (IAM), especially SAML
What Will Make You Stand Out (Preferred Qualifications)
Hands-on experience designing, training, or deploying agentic AI frameworks or LLM-driven automation tools to architect autonomous self-healing capabilities within cloud-scale infrastructure
Strong proficiency in C programming, data structures, and algorithms, with hands-on experience in POSIX multi-threading and low-level network/socket programming
Demonstrated experience diagnosing complex issues across the stack, from low-level network performance to application-level APIs, coupled with conceptual or working knowledge of Zero Trust Security