Linux OS SME - L3 - Pune
Role: Site Reliability Engineer (SRE) / Monitoring Engineer
Responsible for maintaining site reliability through proactive monitoring, incident detection, and resolution. Ensures system uptime, performance, and resilience by designing and adjusting monitoring solutions, collaborating with cross-functional teams, and automating repetitive tasks.
Skills / Products (Network, Common, Software Engineering):
Network: Palo Alto Firewall, Citrix NetScaler, Cisco ACI & Routers, VeloCloud & Aruba SD-WAN, Aruba Wireless, Meraki Wireless, Cisco RAS (VPN), NSX Firewall, Infoblox.
Common: Puppet, Ansible, Vulnerability management best practices, ITSM, Release management.
Software Engineering: GitHub, Python or another programming language.
Capabilities:
Troubleshoot routing, latency, throughput, and device-level issues.
Review and deploy automation manifests/playbooks.
Identify and prioritize vulnerability remediation.
Apply ITSM and release management practices.
Store and update code/scripts in GitHub.
Write automation code independently, using AI tools where needed.
Certifications:
RHCSA / RHCE
Experience with:
Configuration management tools (Ansible advanced usage)
Backup tools, DR environments
Exposure to:
Hybrid cloud and containerized workloads
Experience
5-8 years of relevant experience in Linux OS (L3 Engineer)
Experience in large-scale enterprise production environments
What Will Differentiate You
Ability to deep-dive and resolve critical Linux issues under pressure
Strong automation-first approach (not just manual ops)
Experience in 24x7 IT Command Center environments
High ownership with a "detect, fix, automate" mindset
Good to Have (Strong Advantage)