Posted 24 June, 2026

Lead Site Reliability Engineer

Cvent
Faridabad, HR, IN Full Time
Reference: fe51176330999c7e

Job Description

Cvent is a leading meetings, events, and hospitality technology provider with more than 5,000 employees and 24,000 customers worldwide, including 60% of the Fortune 500. Founded in 1999, Cvent delivers a comprehensive event marketing and management platform for marketers and event professionals and offers software solutions to hotels, special event venues and destinations to help them grow their group/MICE and corporate travel business. Our technology brings millions of people together at events around the world.

In short, we’re transforming the meetings and events industry through innovative technology that powers the human connection.

Cvent's strength lies in its people, fostering a culture where everyone is encouraged to think like entrepreneurs, taking risks and making decisions confidently. We value diverse perspectives and celebrate differences, working together with colleagues and clients to build strong connections.

AI at Cvent: Leading the Future:
Are you ready to shape the future of work at the intersection of human expertise and AI innovation?

At Cvent, we’re committed to continuous learning and adaptation: AI isn’t just a tool for us, it’s part of our DNA. We’re looking for candidates who are eager to evolve alongside technology.

If you love to experiment boldly, share your discoveries, and help define best practices for AI-augmented work, you’ll thrive here. Our team values professionals who thoughtfully integrate AI into their daily work, delivering exceptional results while relying on the human judgment and creativity that drive real innovation.

Throughout our interview process, you’ll have the chance to demonstrate how you use AI to learn, iterate, and amplify your impact. If you’re excited to be part of a team that’s leading the way in AI-powered collaboration, we’d love to meet you.

Disclaimer: Beware of Recruitment Scams – Legitimate Cvent recruiting communications will always come from an official ‘’ email.

We never request any payments or ask for sensitive personal or financial information via chat or social media platforms. For more information, please visit: https://www.cvent.com/en/notice-recruitment-fraud

About the Role:
Site Reliability is about combining development and operations knowledge and skills to help make the organization better. Whether you have a development background and are interested in learning more about operations and security, or have an operations or security background and are interested in developing internal tools and automation, Cvent SRE can benefit from your skill set.

Ultimately, we are looking for passionate people who love learning, love technology and always want to make things better.

As a Lead SRE on the SRE Security team, you will be responsible for mentoring others and helping Cvent to both envision and achieve our DevSecOps goals. We are looking for someone with the drive, ownership and ability to take on challenging problems, both technical and process related, in a dynamic, collaborative and highly distributed, multi-disciplinary team environment. You will use your background as a generalist to work closely with product development teams, Information Security, Cloud Infrastructure and other SRE teams to ensure the effective and efficient maintenance of our platforms' security.

You must be able to see the big picture and work collaboratively with teams to solve hard multi-disciplinary problems.

Technical expertise in topics such as cloud operations, the software development lifecycle, and security vulnerability management will be of great help to you. However, excellent soft skills in mentorship, communication and the ability to drive alignment are must-haves. We use SRE principles such as blameless postmortems and a focus on automation to ensure we're constantly improving our knowledge and maintaining a good quality of life.

Overall, we're passionate about continuous improvement, learning and participating in dynamic day-to-day work where success is rewarded with recognition and upward mobility.

What You Will Be Doing:
Enlighten, Enable and Empower a fast-growing set of multi-disciplinary teams, across multiple applications and locations.
Tackle complex development, automation and business process problems.

Champion Cvent standards and best practices.
Ensure the scalability, performance, and resilience of security-related systems and processes.
Work with product development teams, Information Security, Cloud Automation and other SRE teams to ensure a holistic understanding of security concerns and their effective and efficient identification and resolution.
Identify recurring problems and anti-patterns in development, operational and security processes.
Develop build, test and deployment automation that seamlessly targets multiple on-premises and AWS regions.
Give back by working on and contributing to open-source projects.

What You Need for this Position:
7–10 years of hands-on experience in Site Reliability Engineering, with a demonstrated track record of owning reliability, security, and operational excellence at scale in production environments.
Hands-on experience with AWS WAF, including rule authoring, rate-based rules, bot control integration, WAF rule group management, and multi-product WAF sharing strategies (e.g., managing WAF rule limits across applications sharing the same WebACL).
Experience designing and implementing DDoS protection using AWS Shield Advanced, including transitioning endpoints from count to block mode, building observability solutions (Lambda + CloudWatch alarms), and self-service enablement for product teams.
Experience with bot mitigation strategies, including AWS Bot Control, silent challenge / token-based traffic classification (verified humans, verified bots, unknown traffic), JA4+ASN fingerprinting, and evaluation of third-party bot mitigation vendors (e.g., DataDome).
Experience managing AWS services and operational knowledge of running applications in AWS, ideally via automation and Infrastructure as Code (IaC) using CloudFormation or CDK.
Strong understanding of CI/CD pipelines: experience with Jenkins or equivalent, PR-based deployment workflows, build/test/deploy automation, and troubleshooting pipeline failures in distributed environments.
Incident management experience: able to act as IC, write clear incident summaries, drive RCA, and coordinate resolution across teams under pressure.
Change management discipline: ability to communicate changes proactively to stakeholders, document rollout strategies, and manage phased production deployments with rollback plans.
Fluent in at least one scripting language such as TypeScript, JavaScript, Python, Ruby, or Bash.
Experience with SDLC methodologies (preferably Agile).
Excellent communication skills and a track record of driving alignment across multi-disciplinary teams.

AI & Automation Literacy (Must Have):
Practical understanding and hands-on exposure to AI fundamentals as applied to SRE and operational workflows:
  Prompt Engineering: ability to design effective prompts for LLMs to assist with incident analysis, RCA generation, runbook creation, and on-call triage.
  Retrieval-Augmented Generation (RAG): basic understanding of RAG patterns; ability to leverage or contribute to RAG-based internal tools that surface relevant runbooks, past incidents, and knowledge base articles during operational events.
  AI-assisted Workflow & Process Automation: experience using or building AI-powered automations in operational contexts, such as automated incident summarization, alert enrichment, change risk assessment, or post-mortem drafting using LLM integrations (e.g., via MCP tools, Slack bots, or custom pipelines).

Good to Have Skills:
Disaster recovery planning and execution: experience with multi-region failover, DR runbooks, and recovery time / recovery point objective (RTO/RPO) management.
Experience managing CloudFront distributions, API Gateways, and ALBs as part of a layered security posture.
Experience with APM, monitoring and logging tools (Datadog, New Relic, Splunk).
Familiarity with security assessment tools and methodologies:
  Cloud Security Posture Management (CSPM)
  Infrastructure Vulnerability Scanning
  Static Code Analysis
  Software Composition Analysis (SCA)
  Static, Interactive and Dynamic Application Security Testing (SAST, IAST and DAST)
  Runtime Application Self Protection (RASP)
Good understanding of containerization concepts: Docker, ECS, EKS, Kubernetes.
Experience managing 3-tier application stacks.
Understanding of basic networking concepts.
Familiarity with risk assessment and management concepts and practices.
Experience with IaC tools such as CloudFormation, CDK (preferred), or Terraform.
