Posted 20 May, 2026
Data Engineer (Python & PySpark)
Zensar Technologies
Bangalore, Karnataka, India
Full Time
Reference: 218_649632_145695
Key Responsibilities
- Pipeline Development: Design, develop, and maintain end-to-end ETL/ELT pipelines using Python and PySpark.
- Big Data Processing: Build large-scale data processing frameworks to handle structured and unstructured data, ensuring high performance and reliability.
- Cloud Infrastructure: Architect and manage data solutions within the GCP ecosystem, focusing on cost-efficiency and security.
- Data Modeling: Design and implement robust data warehouse models (Star/Snowflake schemas) and data lake architectures.
- Optimization: Identify, design, and implement internal process improvements, such as automating manual processes and optimizing data delivery for greater scalability.
- Collaboration: Work closely with stakeholders to understand data requirements and translate them into technical specifications.
Technical Qualifications
- Core Programming: Strong proficiency in Python, including experience with libraries such as Pandas and NumPy and with logging frameworks.
- Big Data: 3+ years of hands-on experience with Apache Spark (PySpark) for distributed data processing.
- GCP Ecosystem: Practical experience with Google Cloud services, specifically:
  - BigQuery (optimization, partitioning, clustering).
  - Cloud Dataproc or Dataflow.
  - Cloud Storage (GCS) and Cloud Functions.
  - Cloud Composer (Apache Airflow) for orchestration.
- Data Warehousing: Solid understanding of relational databases and SQL (PostgreSQL, MySQL) as well as NoSQL environments.
- DevOps & Tools: Experience with Git, Docker, and CI/CD pipelines. Familiarity with Terraform or other IaC tools is a significant plus.
Part of the $4.8 billion RPG Group, we're a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. Explore Life at Zensar and join us to Grow. Own. Achieve. Learn. to be the best version of yourself.
We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.
Soft Skills
- Analytical Thinking: Ability to break down complex data problems into manageable technical tasks.
- Communication: Strong verbal and written skills to interact with both technical and non-technical teams.
- Adaptability: A self-starter who stays current with the evolving data engineering landscape.
- Mentorship: Willingness to provide guidance and conduct code reviews for more junior team members.
Preferred Skills
- Experience with real-time data streaming (e.g., Google Pub/Sub or Kafka).
- Knowledge of data governance, security, and privacy compliance (GDPR/CCPA).
- Experience in optimizing Spark jobs (shuffling, partitioning, and memory management).
- Google Cloud Professional Data Engineer certification.