Data Engineer (Azure & Databricks)
Job Description
Role Overview
We are seeking a highly skilled Data Engineer with strong experience in Azure and Databricks, who will play a critical role in designing, transforming, and operationalizing data pipelines within a modern Lakehouse architecture.
The role primarily focuses on transforming data from the Bronze layer into curated analytics-ready datasets, building automated CI/CD pipelines, and developing high-quality Python and PySpark-based data solutions. The engineer will also collaborate closely with Data Scientists and Software Engineers and should be open to contributing to data-driven UI/UX initiatives.
Data Engineering & Transformation
- Design, develop, and maintain scalable data transformation pipelines in Azure Databricks using Python, PySpark, and SQL, alongside tools such as Azure Data Factory (ADF)
- Implement transformation logic to move data from Bronze to Silver/Gold layers following data engineering best practices
- Apply strong data engineering principles to ensure data reliability, quality, performance, and reusability
- Work with structured and semi-structured data at scale
Databricks, Azure & Cloud ETL
- Build and manage Databricks notebooks, jobs, Delta Lake tables, and orchestrated workflows
- Hands-on experience with cloud-based ETL platforms (preferred: Microsoft Azure Databricks, Synapse, Azure Functions; otherwise AWS or Google Cloud)
- Optimize data pipelines for performance, scalability, and cost efficiency
Python Applications, APIs & Automation
- Design, develop, and maintain Python applications, scripts, and APIs for data processing and automation
- Write production-grade Python code with a strong focus on readability, maintainability, and testing
- Leverage Python for orchestration, validation, and integration with downstream systems
Collaboration with Data Science & Engineering Teams
- Collaborate closely with Data Scientists and Data Analysts to understand data, analytical models, and consumption requirements
- Enable and support advanced analytics and data science workflows by preparing high-quality feature datasets
- Translate analytical needs into scalable data engineering solutions
CI/CD, DevOps & Platform Engineering
- Build and maintain automated CI/CD pipelines for data and Databricks workloads
- Hands-on experience with DevOps tools and practices, including Git-based version control
- Exposure to containerization and orchestration platforms such as Kubernetes / OpenShift
- Ensure smooth promotion of code and pipelines across environments (Dev/Test/Prod)
Data Modeling & Querying
- Design and implement robust data models optimized for analytics and reporting
- Strong hands-on knowledge of SQL and exposure to KQL or other query languages
- Apply best practices in data structures, indexing, and performance tuning
UI / UX & Data Applications (Additional Advantage)
- Open to contributing to data-driven UI/UX components, dashboards, or lightweight data applications
- Work with analytics and business teams to improve data usability and customer experience
Required Skills & Qualifications
Must-Have
- Strong hands-on expertise in Python (with frameworks like PySpark)
- Solid foundation in Data Engineering principles and large-scale data processing
- Experience with Azure Databricks and cloud-based ETL platforms
- Strong knowledge of SQL and data querying techniques
- Experience with CI/CD pipelines and DevOps practices
- Experience in pipeline monitoring and alerting
- Ability to design efficient, scalable solutions to complex data problems
Good-to-Have
- Experience with Azure Synapse, Azure Functions
- Exposure to AWS or Google Cloud data platforms
- Hands-on experience with OpenShift
- Knowledge of data science concepts and workflows
- Familiarity with analytics platforms, dashboards, and UI/UX considerations