Scribie - Senior Applied ML Engineer - Audio and Foundation Models
Job Description
This approach offers our team both autonomy and growth opportunities in a dynamic and supportive environment.

We’re building production-grade audio foundation models for high-stakes legal and enterprise transcription: real customer data, messy audio, real consequences.

This is not a paper-only research role.

You’ll own the full ML lifecycle:

Fine-tuning large audio / multimodal models using SFT, LoRA, and RL-based preference optimization (DPO / PPO / ORPO)
Beating strong baselines like Whisper-large, GPT-4o, Gemini, and Claude on domain-specific data
Designing WER, diarization, and alignment-driven evaluation stacks
Taking models from research notebooks → production inference services
Running daily experiments that directly impact quality, cost, and customer satisfaction

You’ll be our first ML hire, with real ownership over the audio ML roadmap, not a side project, not a support role.

Location: Bangalore (primarily onsite)

Compensation: ₹9L – ₹11L

This role is a great fit if you:

Have shipped fine-tuned ASR / LLM / multimodal models into production
Are comfortable running large training jobs and debugging failures
Care about real-world impact, not just benchmarks

Not a fit if you’re looking for an academic or paper-only research role.

Apply here, or DM me with your LinkedIn/GitHub and a short note on the coolest audio or LLM system you’ve shipped.

If turning messy real-world audio into models that make humans 5–10× more efficient excites you, let’s talk.