Lead Data Engineer
Job Summary:
The Lead Data Engineer designs, builds, and maintains scalable data pipelines and infrastructure to support AI-driven healthcare SaaS applications. This role ensures data integrity, security, and compliance while enabling advanced analytics and machine learning capabilities. The Lead Data Engineer collaborates with cross-functional teams to deliver reliable data solutions that improve clinical and operational outcomes.
Key Responsibilities:
- Data Pipeline Development & ETL/ELT
  - Expertise in building robust ETL/ELT pipelines using tools like Apache Airflow, Talend, or Informatica
  - Ability to transform raw healthcare data into structured formats for analytics and AI models
- Cloud Platform & Big Data Technologies
  - Proficiency in AWS, Azure, or GCP for cloud-native development
  - Experience with big data frameworks such as Spark, Hadoop, and Hive
- Healthcare Data Standards & Compliance
  - Familiarity with HIPAA, HL7, and FHIR standards
  - Ability to ensure secure and compliant data handling across systems
- AI & Machine Learning Integration
  - Support for AI/ML model deployment and data preparation
  - Understanding of model monitoring, feature engineering, and real-time data streaming
- Technical Leadership
  - Lead the team to deliver planned commitments on time
  - Learn, understand, and leverage cutting-edge technologies to gain a competitive advantage
  - Brainstorm and innovate on prototypes as well as future products and features to drive the business forward
  - Research industry trends and best practices, and apply them where appropriate
  - Deliver high-quality work on schedule, following Agile software development methodology
  - Participate in Agile activities, including daily stand-ups, estimation, backlog grooming, and reviews
  - Provide internal development support, including fast, high-quality fixes for urgent production issues
- Database Management & Optimization
  - Strong SQL skills for complex queries, joins, and performance tuning
  - Experience with relational and NoSQL databases (e.g., PostgreSQL, MongoDB)
Required Qualifications:
Education & Experience
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field
- 10-14 years of experience in data engineering, preferably in healthcare or SaaS
- Experience with cloud platforms, data pipeline tools, and healthcare data standards
- Exposure to AI/ML workflows and real-time analytics is a plus
Other Preferred Knowledge, Skills, Abilities or Certifications:
- Cloud Certifications: AWS Data Analytics, Azure Data Engineer, Google Cloud Data Engineer
- Streaming & Real-Time Data: Apache Kafka, Spark Streaming, Flink
- DataOps & Automation: Airflow, dbt, CI/CD for data workflows
- Security & Compliance: HIPAA, GDPR, CCPA, data encryption
- Advanced Databases: PostgreSQL, MongoDB, Cassandra, DynamoDB
- AI/ML Support: Feature engineering, model monitoring, ML pipeline integration