Posted 22 May, 2026
Senior Staff Software Engineer-SRE
OneTrust
Bengaluru, India
Full Time
Reference: 102_700149_7825603
The Challenge
As a Senior Principal Software Engineer, you develop strategic methods and principles to improve the technical stability and efficiency of the company's products and services.
You lead and influence several teams and initiatives.
Your Mission
- Design and build platforms, tools, and frameworks to improve system reliability, scalability, and performance
- Define and implement SRE best practices, including SLIs/SLOs, error budgets, and reliability metrics
- Lead incident response efforts, drive root cause analysis, and implement long-term fixes to prevent recurrence
- Analyze system behavior, identify bottlenecks and saturation points, and implement solutions to improve resilience
- Partner with engineering teams to embed reliability into the software development lifecycle
- Evaluate emerging technologies and recommend tools that enhance productivity, observability, and system robustness
- Drive capacity planning, performance tuning, and cost optimization efforts
- Collaborate with cross-functional teams to identify gaps, prioritize improvements, and resolve production issues
- Provide technical leadership and mentorship across the engineering organization
- Influence senior leadership with insights, metrics, and recommendations to improve system health and operational excellence
You Are
- Bachelor's or Master's degree in Computer Science, Engineering, or related technical field
- 10+ years of experience in software engineering with a strong focus on backend systems and distributed architecture
- Extensive experience building and operating Java-based systems using RESTful APIs, Spring Boot, and microservices architecture
- Strong understanding of distributed systems concepts, including fault tolerance, eventual consistency, and scalability
- Proven experience with cloud platforms (AWS/Azure/GCP) and cloud-native architectures
- Expertise in observability tools (monitoring, logging, tracing) such as Prometheus, Grafana, ELK, or similar
- Experience defining and managing SLIs, SLOs, and error budgets
- Strong knowledge of CI/CD pipelines, automation, and infrastructure as code
- Hands-on experience with incident management, root cause analysis (RCA), and postmortems
- Excellent analytical, debugging, and problem-solving skills
- Strong communication, collaboration, and leadership abilities