Posted 14 June, 2026
Data Engineer - Python AND Kafka AND (Hadoop OR HDFS OR Hive) AND Snowflake AND apache AND (iceberg
VARITE INDIA PRIVATE LIMITED
Bangalore, Karnataka, IN
Full Time
Reference: 26-29244-2522-2
Company Name: VARITE India Private Limited
About The Client:
The Client is a multinational IT services and consulting company headquartered in Tokyo, Japan. It offers a wide array of IT services, including application development, infrastructure management, and business process outsourcing. Its consulting services span business and technology, while its digital solutions focus on transformation and user experience design. The Client also specializes in data and intelligence services, emphasizing analytics, AI, and machine learning. Its cybersecurity, cloud, and application services round out a portfolio designed to meet the diverse needs of businesses worldwide.
About The Job:
- The engineer will be part of the Datastore Migration Factory Team, which is responsible for end-to-end datastore migration from the on-prem data lake to an AWS-hosted Lakehouse.
- This is a high-visibility and business-critical project for the Client.
Pipeline Migration Logic & Scheduling
- Refactor and migrate:
  - Extraction logic
  - Job scheduling
- Transition from legacy frameworks to the new Lakehouse environment.
Data Transfer
- Execute physical migration of underlying datasets.
- Ensure:
  - Data integrity
  - Smooth migration process
Stakeholder Engagement
- Act as a technical liaison for internal clients.
- Facilitate:
  - Handoff discussions
  - Sign-off conversations with data owners
- Ensure migrated assets meet business requirements.
Consumption Pattern Migration / Code Conversion
- Translate and optimize:
  - Legacy SQL-based consumption patterns
  - Spark-based consumption patterns
  - Raw and modeled datasets
- Ensure compatibility with:
  - Snowflake
  - Iceberg
Usage Analysis
- Understand data usage patterns.
- Deliver required data products aligned with business needs.
Additional Stakeholder Collaboration
- Partner with data owners and stakeholders throughout migration activities.
- Ensure successful transition and validation of migrated assets.
Data Reconciliation & Quality
- Follow a rigorous approach to data validation.
- Work with reconciliation frameworks to:
  - Validate migrated datasets
  - Build confidence in migrated data quality
- Ensure migrated data is functionally equivalent to existing production data flows.
- Work with additional internal data management platforms.
- Demonstrate aptitude for:
  - Learning new workflows
  - Adapting to new language constructs
  - Supporting evolving platform requirements.
- Relevant experience: 6+ years
- PySpark
- Hadoop
- Data Lakehouse
Education
- Bachelor’s or Master’s degree in:
  - Computer Science
  - Applied Mathematics
  - Engineering
  - Related quantitative field
Experience
- Minimum 3–5 years of professional hands-on coding experience in a collaborative, team-based environment.
- Strong troubleshooting skills in SQL.
- Basic scripting experience required.
Programming Languages
- Professional proficiency in Python or Java
Methodology
- Deep familiarity with:
  - Full Software Development Life Cycle (SDLC)
  - CI/CD best practices
  - Kubernetes (K8s) deployment experience
Core Data Engineering Competencies
- Candidates should demonstrate a strong understanding of the following concepts to ensure data correctness during reconciliation:
Temporal Data Modeling
- Manage state changes over time
- Example: SCD Type 2
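
For candidates less familiar with the term, SCD Type 2 (Slowly Changing Dimension, Type 2) keeps history by closing the current row and adding a new one whenever a tracked attribute changes. The sketch below is only an illustration in plain Python, not the Client's implementation; the `CustomerVersion` record, its fields, and the `apply_scd2` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    """One version of a customer record (hypothetical example schema)."""
    customer_id: int           # natural key
    city: str                  # tracked attribute
    valid_from: date
    valid_to: Optional[date]   # None marks the current version


def apply_scd2(history: List[CustomerVersion], customer_id: int,
               new_city: str, change_date: date) -> List[CustomerVersion]:
    """Return a new history with the change applied in SCD Type 2 style."""
    result = []
    changed = False
    found_current = False
    for row in history:
        is_current = row.customer_id == customer_id and row.valid_to is None
        if is_current:
            found_current = True
            if row.city != new_city:
                # Close the old version instead of overwriting it.
                result.append(replace(row, valid_to=change_date))
                changed = True
                continue
        result.append(row)
    if changed or not found_current:
        # Open a new current version for a changed or brand-new key.
        result.append(CustomerVersion(customer_id, new_city, change_date, None))
    return result


history = [CustomerVersion(1, "Pune", date(2024, 1, 1), None)]
history = apply_scd2(history, 1, "Bangalore", date(2025, 6, 1))
```

After the call, `history` holds the closed "Pune" row and a new open "Bangalore" row, so a query "as of" any date can still recover the value that was valid at that time.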
Schema Management
- Expertise in:
  - Schema Evolution
  - Schema enforcement strategies
- Reference: Apache Iceberg
Performance Optimization
- Advanced knowledge of:
  - Data partitioning
  - Data clustering
Architectural Theory
- Understanding of:
  - Normalization vs. Denormalization
  - Natural Keys vs. Surrogate Keys
- Ability to apply the right design strategy based on use cases
Technical Stack Requirements
- While candidates are not expected to be experts in every technology, the collective team should cover:
Extraction & Logic
- Kafka
- ANSI SQL
- FTP
- Apache Spark
Data Formats
- JSON
- Avro
- Parquet
Platforms
- Hadoop (HDFS/Hive)
- Snowflake
- Apache Iceberg
- Sybase IQ
Core Competencies
- Demonstrates strong integrity and consistently models ethical behavior and decision-making.
- Acts as a trusted team player and collaborates effectively across multiple teams and functions.
- Communicates with clarity and confidence through:
  - Concise written communication
  - Structured verbal briefings
  - Proactive stakeholder management
- Works effectively with global teams across:
  - Time zones
  - Cultures
- Builds alignment and resolves issues constructively.
- Delivery-focused with a strong sense of ownership.
- Drives work to completion and consistently meets commitments.
- Brings high energy and urgency while maintaining:
  - Quality
  - Professionalism
- Demonstrates intellectual curiosity by:
  - Asking thoughtful questions
  - Identifying risks early
  - Seeking feedback for continuous improvement
Unlock Rewards: Refer Candidates and Earn.
If you're not available or interested in this opportunity, please pass this along to anyone in your network who might be a good fit and interested in our open positions. VARITE offers a Candidate Referral program, where you'll receive a one-time referral bonus based on the following scale if the preferred candidate completes a three-month assignment with VARITE.
| Experience Level | Referral Bonus |
| 0-2 years | INR 5,000 |
| 2-6 years | INR 7,500 |
| 6+ years | INR 10,000 |
About VARITE: VARITE is a global staffing and IT consulting company providing technical consulting and team augmentation services to Fortune 500 companies in the USA, UK, Canada, and India. VARITE is currently a primary and direct vendor to leading corporations in the verticals of Networking, Cloud Infrastructure, Hardware and Software, Digital Marketing and Media Solutions, Clinical Diagnostics, Utilities, Gaming and Entertainment, and Financial Services.
Equal Opportunity Employer:
VARITE is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. We do not discriminate based on race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, marital status, veteran status, or disability status.