Posted 17 May, 2026
Data Engineering Pipeline Engineer - Role Description
Innoventes
Bengaluru, Karnataka, India, 560008
Full Time
Reference: 219_560405_Mh2z5RPOtAI8
ABOUT THE ROLE
We are looking for a skilled Data Engineering Pipeline Engineer to design, build, and maintain scalable data infrastructure that powers our analytics, machine learning, and business intelligence platforms. You will work across the full data lifecycle - from ingestion and transformation to storage and delivery - ensuring reliability, performance, and governance at every stage. This is a high-impact role where you will collaborate closely with data scientists, analysts, and platform engineers to ship robust pipelines that enable data-driven decisions across the organization.
KEY RESPONSIBILITIES
Design, build, and maintain scalable batch and real-time data pipelines using tools such as Apache Spark, Kafka, Flink, or Airflow
Develop and optimize ETL/ELT workflows to ingest data from diverse sources including APIs, databases, event streams, and flat files
Architect and manage cloud-based data infrastructure on AWS, GCP, or Azure (e.g., S3, BigQuery, Redshift, Databricks, Snowflake)
Implement data quality monitoring, alerting, and observability frameworks to ensure pipeline reliability and SLA compliance
Collaborate with data scientists and ML engineers to support model training, feature engineering, and inference pipelines
Partner with analytics engineers to maintain and evolve data warehouse models (dbt, dimensional modeling)
Define and enforce data governance standards including cataloging, lineage tracking, and access control policies
Optimize pipeline performance through profiling, query tuning, partitioning strategies, and cost management
Document pipeline architecture, data contracts, and runbooks for operational clarity
Participate in the on-call rotation and incident response for critical data infrastructure
REQUIRED QUALIFICATIONS
4+ years of experience in data engineering, ETL development, or a related software engineering discipline
Proficiency in Python and/or Scala for pipeline development; strong SQL skills across multiple dialects
Hands-on experience with distributed processing frameworks such as Apache Spark, Beam, or Flink
Experience with workflow orchestration tools such as Apache Airflow, Prefect, or Dagster
Deep familiarity with cloud data platforms (AWS, GCP, or Azure) and managed services such as BigQuery, Redshift, or Synapse
Experience designing and maintaining data warehouses or lakehouses (Snowflake, Databricks, Delta Lake, Iceberg)
Strong understanding of data modeling concepts: normalization, star/snowflake schema, slowly changing dimensions
Experience with streaming and event-driven architectures using Kafka, Kinesis, or Pub/Sub
Familiarity with CI/CD practices and infrastructure-as-code tools (Terraform, Pulumi) for data platform deployments
Excellent communication skills with the ability to translate business requirements into technical solutions