Principal AI Automation Engineer
Job Description
Principal Engineer – AI Quality & Evaluation Architecture
Role Overview
Vocera (Stryker) is seeking a Principal Engineer to own end-to-end AI quality across the lifecycle: data, models, prompts, evaluation, deployment, and monitoring. This role will define and scale reliable, measurable, production-grade AI systems across speech, NLP, and GenAI in healthcare.
Key Responsibilities
AI Quality Ownership
Own AI quality across the full lifecycle
Define SLAs, KPIs, release gates, and production readiness decisions
Evaluation & Reliability
Build evaluation frameworks for ASR (WER, latency), NLP (intent/entity), and LLMs/RAG (hallucination, safety, groundedness)
Develop benchmarking, regression pipelines, and golden datasets
Drive adversarial testing, edge case handling, and failure analysis
AI Testing Platform
Architect scalable evaluation platforms (offline, regression, A/B, shadow testing)
Integrate with CI/CD and MLOps pipelines
Implement monitoring, observability, and drift detection
Data Governance
Define standards for data curation, annotation, and versioning
Ensure reproducibility and feedback loops from production
Maintain healthcare data compliance
MLOps & Continuous Quality
Establish AI MLOps standards for evaluation, retraining, and deployment
Enable continuous evaluation and performance monitoring at scale
Leadership
Act as AI quality authority across the organization
Mentor teams and align with product and business goals
Qualifications
12+ years in software/AI engineering; 5+ years in LLMs, NLP, RAG, or speech
Experience building scalable AI evaluation frameworks
Expertise in:
LLM evaluation (hallucination, safety, groundedness)
Golden datasets, regression testing, adversarial testing
Prompt validation, Python, data analysis, automation
CI/CD, MLOps, distributed systems
Nice to Have
RAG evaluation & retrieval benchmarking
Speech/ASR evaluation
Azure ML / OpenAI / AI Search
Responsible AI & compliance
Travel Percentage: 10%