Data Engineer
Job Description
Data lakehouse
Python
ETL pipelines
Data Engineering
Apache Hop
Databases
NoSQL
SQL
Location: Bangalore (5 days work from office, HSR Layout)
Job Type: Full Time
Industry: Agritech
Openings: Multiple (junior, mid-level, etc.)
Python: Mandatory
Overview
The Data Engineer/Senior Data Engineer will design, develop, and maintain scalable data pipelines and infrastructure to support data-driven decision-making and advanced analytics. This role requires deep expertise in data engineering, strong problem-solving skills, and the ability to collaborate with cross-functional teams to deliver robust data solutions.
Key Responsibilities
• Data Pipeline Development: Design, build, and optimize scalable, secure, and reliable data pipelines to ingest, process, and transform large volumes of structured and unstructured data.
• Data Architecture: Architect and maintain data storage solutions, including data lakes, data warehouses, and databases, ensuring performance, scalability, and cost-efficiency.
• Data Integration: Integrate data from diverse sources, including APIs, third-party systems, and streaming platforms, ensuring data quality and consistency.
• Performance Optimization: Monitor and optimize data systems for performance, scalability, and cost, implementing best practices for partitioning, indexing, and caching.
• Collaboration: Work closely with data scientists, analysts, and software engineers to understand data needs and deliver solutions that enable advanced analytics, machine learning, and reporting.
• Data Governance: Implement data governance policies, ensuring compliance with data security, privacy regulations (e.g., GDPR, CCPA), and internal standards.
• Automation: Develop automated processes for data ingestion, transformation, and validation to improve efficiency and reduce manual intervention.
• Mentorship: Guide and mentor junior data engineers, fostering a culture of technical excellence and continuous learning.
• Troubleshooting: Diagnose and resolve complex data-related issues, ensuring high availability and reliability of data systems.
Required Qualifications
• Education: Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field.
• Experience: 5+ years of experience in data engineering or a related role, with a proven track record of building scalable data pipelines and infrastructure.
• Technical Skills:
  ◦ Proficiency in programming languages such as Python, Java, or Scala.
  ◦ Expertise in SQL and experience with NoSQL databases (e.g., MongoDB, Cassandra).
  ◦ Strong experience with cloud platforms (e.g., AWS, Azure, GCP) and their data services (e.g., Redshift, BigQuery, Snowflake).
  ◦ Hands-on experience with ETL/ELT tools (e.g., Apache Airflow, Talend, Informatica) and data integration frameworks.
  ◦ Familiarity with big data technologies (e.g., Hadoop, Spark, Kafka) and distributed systems.
  ◦ Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes) is a plus.
• Soft Skills:
  ◦ Excellent problem-solving and analytical skills.
  ◦ Strong communication and collaboration abilities.
  ◦ Ability to work in a fast-paced, dynamic environment and manage multiple priorities.
• Certifications (optional but preferred): Cloud certifications (e.g., AWS Certified Data Analytics, Google Professional Data Engineer) or relevant data engineering certifications.
Preferred Qualifications
• Experience with real-time data processing and streaming architectures.
• Familiarity with machine learning pipelines and MLOps practices.
• Knowledge of data visualization tools (e.g., Tableau, Power BI) and their integration with data pipelines.
• Experience in industries with high data complexity, such as finance, healthcare, or e-commerce.
Work Environment
• Team: Collaborative, cross-functional team environment with data scientists, analysts, and business stakeholders.
• Hours: Full-time, with occasional on-call responsibilities for critical data systems.