Specialist Software Engineer - Data Engineer
Job Description - Senior Data Engineer
Overview
We are seeking a Senior Data Engineer to design, build, and optimize large-scale data platforms. The ideal candidate has strong expertise in big data ecosystems (Cloudera 7.1.9, Spark/Scala), data pipelines, CI/CD automation, and modern storage frameworks. This role involves collaborating with cross-functional teams to deliver robust, scalable, and secure data solutions.
Key Responsibilities
Data Engineering & Architecture
Design, build, and maintain data pipelines in large-scale environments using Spark/Scala.
Develop and optimize distributed data processing jobs (batch & near-real-time).
Manage and administer Cloudera 7.1.9 clusters and big data services.
Implement data ingestion, transformation, and storage processes.
Data Platforms & Storage
Build connectors and services using Apache Livy for Spark job orchestration.
Work with Scality S3 for object storage lifecycle, performance tuning, and data governance.
Manage relational datasets in PostgreSQL (schema design, optimization, partitioning).
Software Engineering & Automation
Develop microservices and data services using Java Spring Boot.
Implement CI/CD automation using GitHub Actions integrated with JFrog Artifactory and SonarQube.
Automate operations using Ansible AWX.
Quality, Governance & Performance
Apply coding best practices and maintain code quality via SonarQube.
Monitor and optimize pipeline performance, reliability, and scalability.
Ensure data security, compliance, and metadata documentation.
Collaboration & Leadership
Mentor junior engineers and contribute to engineering standards.
Work closely with data scientists, business analysts, and platform teams.
Participate in architecture decisions and roadmap definition.
Required Technical Skills (Hard Skills)
Big Data Processing
Apache Spark (Scala)
Apache Livy
Cloudera 7.1.9 (HDFS, YARN, Hive, Impala, Oozie, Kerberos)
Data Storage & Databases
PostgreSQL
Scality S3 or S3-compatible stores
Parquet/ORC/Avro
Advanced SQL & optimization
Programming & Backend Development
Scala
Java/Spring Boot
Shell scripting (Bash)
DevOps & CI/CD
GitHub Actions
JFrog Artifactory
SonarQube
Ansible AWX
Docker (optional)
Architecture & Engineering Practices
Data pipeline design patterns
Distributed systems fundamentals
Monitoring & observability
REST API design
Soft Skills - Sorted by Themes
Leadership & Ownership
Ability to mentor and guide junior engineers
Autonomy and initiative on complex topics
Strong ownership of deliverables and platform reliability
Collaboration & Communication
Clear communication with technical and business teams
Strong documentation skills
Team-player mindset
Problem Solving & Critical Thinking
Strong analytical skills
Ability to troubleshoot distributed systems
Capacity to simplify complex problems
Adaptability & Continuous Learning
Ability to adopt new technologies quickly
Comfort in dynamic environments
Curiosity for data engineering innovation
Nice-to-Have Skills
Experience with Kubernetes
Data governance (Atlas, Ranger)
Kafka or Flink streaming
Agile methodology knowledge