Algotale (Safe Security) - DevOps Engineer
About Algotale
Algotale is a data-centric IT consulting and staffing firm dedicated to reshaping industries through AI, machine learning, and cutting-edge cloud solutions. We pride ourselves on building resilient digital infrastructures and empowering businesses to unlock the true potential of their data. As we scale our AI-driven platforms, we are looking for a Senior DevOps Engineer with an SRE mindset to keep our systems fast, reliable, and highly scalable.
The Role
We are seeking an expert DevOps/Site Reliability Engineer to bridge the gap between development and operations. You will be the architect of our cloud infrastructure on AWS, ensuring that our Kubernetes-orchestrated services run with high availability and efficiency. Drawing on more than 5 years of experience, you will lead our work on automation, performance tuning, and infrastructure security.
Key Responsibilities
- Kubernetes Orchestration: Design, deploy, and manage production-grade Kubernetes clusters (EKS). Optimize resource allocation, scaling policies, and service mesh configurations.
- Infrastructure as Code (IaC): Architect and maintain scalable AWS infrastructure using Terraform or CloudFormation, ensuring environment consistency across Dev, Staging, and Production.
- Reliability & Performance: Act as an SRE expert to monitor system health, reduce latency, and improve the reliability of our data-heavy applications.
- Automation & Scripting: Develop complex automation workflows and internal tools using Python and Shell/Bash to eliminate manual toil.
- CI/CD Excellence: Streamline our deployment pipelines (GitHub Actions/Jenkins) for zero-downtime releases and robust rollback capabilities.
- Linux Administration: Perform deep-dive troubleshooting and performance tuning within Linux-based environments.
- Security & Compliance: Implement IAM best practices, VPC security, and secret management to protect sensitive data.
Technical Requirements
- Experience: 5+ years of professional experience in DevOps, SRE, or Infrastructure Engineering.
- AWS Mastery: Expert-level knowledge of AWS services, including EC2, S3, Lambda, RDS, VPC, and EKS.
- Kubernetes: Deep hands-on experience with K8s, including Helm charts, Ingress controllers, and persistent storage.
- Programming: High proficiency in Python and advanced Shell scripting for system automation.
- Linux: Expert command-line skills and a deep understanding of Linux internals and networking.
- Monitoring: Experience with Prometheus, Grafana, or the ELK stack for proactive system monitoring.