Posted 17 May, 2026
Data Engineer - PySpark
Veracity
Pune, Maharashtra, India
Full Time
Reference: 365_621153_24-02049
Skills:
- Python (advanced; should know about classes and inheritance)
- EMR, Athena, Redshift, AWS Glue, IAM role, CloudFormation CFT (optional), Apache Airflow, Git
- SQL
- PySpark
- OpenMetadata
- Data Lakehouse
- Metadata experience
- AWS services: S3
Important: The candidate should be able to:
1) Create ETL pipelines
2) Deploy code on EMR
3) Query data in Athena
4) Create Airflow DAGs to schedule ETL pipelines
5) Work with AWS Lambda, including creating a Lambda function
This position is for an individual contributor. Therefore, we expect the candidate to independently manage client communication and proactively resolve technical issues without external assistance.