Posted 21 May, 2026
Staff MLOps Engineer (Manager)
Anblicks
Hyderabad, TG, IN
Full Time
Reference: 7881ea65ae99e013
Job Description
Role: Staff MLOps Engineer (Manager)
Experience: 10+ Years
Location: Hyderabad, India
About the Role
We are looking for a Staff MLOps Engineer (Manager) to lead the design, build, and scaling of enterprise-grade MLOps and DevOps platforms. This role combines hands-on engineering excellence with team leadership, focusing on productionizing machine learning systems, enabling developer productivity, and driving automation at scale.
You will work at the intersection of ML engineering, cloud infrastructure, and platform engineering, helping teams deliver reliable, scalable, and secure ML solutions in production.
Key Responsibilities
- Lead and mentor a team of MLOps/DevOps engineers, driving technical excellence and delivery outcomes
- Architect, build, and scale end-to-end MLOps platforms and CI/CD pipelines for ML workloads
- Design and implement automated deployment pipelines for training, testing, and model serving at scale
- Operationalize ML models into production with a focus on performance, reliability, and observability
- Partner with data scientists and engineering teams to enable self-service ML platforms and developer tooling
- Implement Infrastructure-as-Code (IaC) and automation frameworks for cloud environments
- Ensure platform compliance with security, governance, and reliability standards
- Troubleshoot complex production issues and continuously improve developer experience and system resilience
- Drive best practices for CI/CD, testing, monitoring, and release management across ML and data platforms
- Evaluate and optimize environments supporting large-scale data pipelines and ML workflows
Required Qualifications
- 10+ years of experience in DevOps, MLOps, or Platform Engineering roles
- 5+ years of people management experience, leading teams of 5+ engineers
- Strong hands-on expertise in building and scaling MLOps pipelines and platforms
- Proven experience with Infrastructure-as-Code (Terraform preferred) in public cloud environments
- Deep experience with CI/CD tools such as GitHub Actions and Jenkins, and with code quality/security tools (e.g., Snyk)
- Strong knowledge of MLOps and orchestration frameworks such as Airflow, Kubeflow, MLflow, or similar
- Experience deploying and managing ML models in production at scale
- Hands-on experience with distributed data processing frameworks and platforms such as Apache Spark, EMR, or Databricks
- Strong programming skills in Python (preferred) or Node.js/Bash
- Experience with containerization and orchestration (Docker, Kubernetes)
- Strong understanding of cloud platforms (AWS, Azure, or GCP) and cloud-native services
- Experience with data platforms and services such as Snowflake, Redshift, Glue, BigQuery, or similar
- Solid understanding of distributed systems, monitoring, logging, and reliability engineering
- Experience with Git-based workflows and version control best practices
Preferred Qualifications
- Experience with configuration management tools (Ansible, Chef, Puppet)
- Familiarity with ML libraries and frameworks such as scikit-learn, PyTorch, TensorFlow
- Exposure to large-scale inference systems and batch/real-time scoring architectures
- Experience supporting multi-runtime environments (Node.js, Java/Spark/Scala, React)
- Cloud certifications (AWS/GCP/Azure)
What We’re Looking For
- Strong problem-solving mindset with a passion for automation and scalability
- Ability to balance hands-on engineering with team leadership
- Focus on developer experience, platform reliability, and operational excellence
- Excellent collaboration skills across data, engineering, and architecture teams