SarvaGram - Senior Data Engineer (Snowflake)
About Us
SarvaGram is building India's first household-centric, data-led 'high-tech, high-touch' FinTech platform to meet the growing aspirations of rural India, with a bouquet of financial and productivity-enhancing offerings.
We achieve this by offering a suite of customized financial and capacity-building products designed to meet the unique requirements of rural households.
From loans and insurance solutions to innovative services like Farming-as-a-Service (FaaS), we equip rural communities with the tools they need to prosper.
Job Description
Key Responsibilities
Data Ingestion & Pipeline Engineering
Design and build reliable, scalable data pipelines ingesting from PostgreSQL, MySQL, MongoDB, Apache Kafka, and AWS SQS into Snowflake.
Implement CDC (Change Data Capture) patterns for near-real-time ingestion from transactional Aurora (PostgreSQL and MySQL) databases.
Build and maintain ELT pipelines using Python and dbt, ensuring data quality, lineage, and observability at every stage.
Handle schema evolution gracefully across heterogeneous source systems without breaking downstream consumers.
Data Modelling & Warehouse Design
Design dimensional models, data vault structures, or medallion architecture layers (bronze/silver/gold) in Snowflake suited to SarvaGram's lending, collections, and field operations domains.
Own the Snowflake warehouse - clustering keys, micro-partition strategy, materialized views, dynamic tables, and cost governance.
Define and enforce data modelling standards, naming conventions, and documentation practices across all datasets.
Snowflake Administration
Manage Snowflake account administration - user and role management, RBAC configuration, resource monitors, and virtual warehouse sizing.
Monitor and govern Snowflake credit consumption - identify expensive queries, configure auto-suspend/auto-resume policies, and right-size virtual warehouses for different workload types.
Maintain Snowflake security posture - network policies, data masking policies, row access policies, and column-level security for PII fields.
Manage Snowflake storage - database, schema, and table lifecycle policies, Time Travel configuration, and Fail-safe awareness.
Drive Snowflake feature adoption (e.g. dynamic tables, Snowpark) and stay current with platform capabilities relevant to SarvaGram's data stack.
Analytics Layer & Trino
Maintain and optimise the Trino query layer over Snowflake and other data stores, ensuring performant and cost-efficient analytical queries.
Collaborate with product and business teams to design semantic layers, aggregated marts, and self-serve datasets for reporting and dashboards.
Partner with the analytics/BI function to ensure Grafana, Metabase, or equivalent dashboards are backed by well-structured, tested data models.
Data Quality & Observability
Implement data quality checks, anomaly detection, and freshness SLAs across all critical datasets.
Build alerting and monitoring for pipeline failures, schema drift, and data volume anomalies.
Maintain data cataloguing and lineage documentation so that any dataset consumed by product or business is traceable to its source.
Cross-functional Collaboration
Work closely with backend engineers to understand source system schemas, event structures, and data contracts as new product features are built.
Translate analytical and reporting requirements from product managers and business stakeholders into well-scoped data engineering deliverables.
Participate in data governance discussions, especially around PII handling, DPDP compliance, and RBI audit data requirements.
Requirements
Must-Have
3 to 5 years of hands-on data engineering experience.
SnowPro Core Certification.
Demonstrable production experience with Snowflake, not just familiarity.
Hands-on Snowflake administration experience - RBAC, resource monitors, virtual warehouse management, and credit governance.
Working knowledge of Snowflake security features - data masking policies, row access policies, and network policies.
Strong experience building ELT/ETL pipelines from relational (PostgreSQL, MySQL) and NoSQL (MongoDB) sources.
Hands-on experience consuming from event streaming systems - Kafka and/or AWS SQS - into a data warehouse.
Proficiency in Python for data pipeline development and orchestration.
Experience with Apache Airflow for pipeline orchestration and scheduling.
Strong understanding of data warehouse design - star schema, dimensional modelling, or medallion/data vault approaches.
Experience with dbt (data build tool) for transformation, testing, and documentation.
Advanced SQL skills - window functions, CTEs, query optimisation, execution plan analysis.
Ability to work directly with non-technical stakeholders - product managers, business analysts, and operations teams - and translate their requirements into data engineering deliverables.
Strong documentation skills - data dictionaries, pipeline runbooks, model documentation.
Good to Have
SnowPro Advanced certification (Data Engineer or Architect track).
Experience with CDC tools such as Debezium or AWS DMS.
Experience with Trino as a distributed query engine.
Familiarity with data quality frameworks - Great Expectations, Soda, or dbt tests.
Exposure to data cataloguing tools - Apache Atlas, Amundsen, or Collibra.
Understanding of PII masking, data anonymisation, and compliance obligations under the DPDP Act and RBI data guidelines.
Experience in fintech, lending, insurance, or any regulated financial domain.
Hands-on with AWS services - S3, Lambda, SQS - in a data engineering context.
Basic familiarity with infrastructure-as-code (Terraform) for data infrastructure.
Who You Are
A data platform owner who takes pride in reliability, model quality, and data that stakeholders can trust.
Someone who can sit in a business review and understand what the numbers need to say, then go build the pipeline that produces them correctly.
Comfortable working across engineering, product, and business - translating between technical constraints and analytical requirements.
Someone who treats documentation and data contracts as first-class engineering deliverables, not afterthoughts.
Curious about the fintech and rural finance domain - motivated by the idea that well-engineered data directly influences credit decisions for underserved households.
Preferred Qualification
Bachelor's or Master's in Computer Science, Engineering, Statistics, or a related quantitative field.
Benefits
SarvaGram is on a mission to revolutionize financial services for millions in rural India. We're building the nation's first data-driven platform that combines cutting-edge technology with a human touch to unlock financial possibilities for underserved households.
This is your chance to be at the forefront of innovation. Join us and:
Shape the future of FinTech: We're not just building a product, we're creating a new category. Be a part of defining the future of financial inclusion for rural India.
Embrace a high-growth, high-impact environment: This is a non-linear growth opportunity. Build a platform used by millions and witness the network effect drive massive scale.
Tackle real-world challenges: Apply your skills to solve critical problems and directly empower rural communities.
Craft solutions that touch lives: Develop innovative products used by diverse household members, each with unique needs.