Posted 01 July, 2026

Scribie - Senior Applied ML Engineer - Audio and Foundation Models

Scribie
Bengaluru, KA, IN Full Time
Reference: aec7575ae9d5f3ac

Job Description

Scribie is an AI-powered, Human Verified audio and video transcription service, trusted globally since 2008. We specialize in delivering accurate and reliable transcription solutions by blending advanced AI technology with human expertise. Headquartered in the US, we operate with a hybrid model in our Bangalore office, combining the flexibility of remote work with the collaboration of in-person engagement.

This approach offers our team both autonomy and growth opportunities in a dynamic and supportive environment.

We’re building production-grade audio foundation models for high-stakes legal and enterprise transcription: real customer data, messy audio, real consequences.

This is not a paper-only research role.

You’ll own the full ML lifecycle:

Fine-tuning large audio / multimodal models using SFT, LoRA, and RL-based preference optimization (DPO / PPO / ORPO)
Beating strong baselines like Whisper-large, GPT-4o, Gemini, and Claude on domain-specific data
Designing WER, diarization, and alignment-driven evaluation stacks
Taking models from research notebooks to production inference services
Running daily experiments that directly impact quality, cost, and customer satisfaction

You’ll be our first ML hire, with real ownership over the audio ML roadmap. This is not a side project or a support role.

Bangalore (primarily onsite)

Compensation: ₹9L – ₹11L

This role is a great fit if you:

Have shipped fine-tuned ASR / LLM / multimodal models into production
Are comfortable running large training jobs and debugging failures
Care about real-world impact, not just benchmarks

Not a fit if you’re looking for an academic or paper-only research role.

Apply here, or DM me with your LinkedIn/GitHub and a short note on the coolest audio or LLM system you’ve shipped.

If turning messy real-world audio into models that make humans 5–10× more efficient excites you, let’s talk.
