Posted 12 June, 2026

Intern - Data Engineer

bluCognition

Remote Nationwide, IN

Full Time

Reference: eeda2ebfbfc0f2e3

Job Description

Data Engineer / Analytics Engineer

About bluCognition:

bluCognition is an AI/ML-based start-up specializing in risk analytics, data conversion, and data enrichment. Founded in 2017 by senior professionals from the financial services industry, the company is headquartered in the US, with its delivery centre in Pune. We build all our solutions on the latest AI, ML, and NLP technology stack, combined with decades of experience in risk management at some of the largest financial services firms in the world.

Our clients are some of the biggest and most progressive names in the financial services industry. We are entering a significant growth phase and are looking for motivated and analytical freshers who want to join us on this exciting journey.

What will your role involve?

• Work with large structured datasets using SQL and PySpark.
• Build, maintain, and optimize ETL/data processing pipelines.
• Assist in business/entity matching logic and fuzzy matching implementations.
• Create and validate analytical datasets for model development and reporting.
• Perform data cleaning, transformation, aggregation, and quality checks.
• Write efficient SQL queries using joins, CTEs, window functions, and aggregations.
• Support feature engineering for ML/risk modeling use cases.
• Work on incremental data processing and monthly/daily refresh strategies.
• Analyze data discrepancies, debug pipeline failures, and improve reliability.
• Collaborate with analytics, data science, and engineering teams.
• Participate in testing, deployment, and code review activities.

To help us level up, you will ideally have:

• A background in Computer Science, Data Science, Statistics, Mathematics, or a related field.
• Strong SQL knowledge: joins, CTEs, aggregations, CASE statements, and window functions.
• Basic understanding of Python and familiarity with PySpark or distributed data processing concepts.
• Understanding of relational databases, data structures, and ETL/data pipeline concepts.
• Exposure to AWS or cloud platforms such as Redshift, Spark, Hadoop, or Databricks is a plus.
• Familiarity with Git/version control and basic understanding of APIs.
• An analytical mindset and strong problem-solving skills, with attention to detail and data accuracy.
• The ability to work in a fast-paced environment and to deal with ambiguity.
• Strong communication, documentation, and collaboration skills across multiple teams.
