{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/inference-workflows"},"x-facet":{"type":"skill","slug":"inference-workflows","display":"inference workflows","count":1},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c2c97849-e31"},"title":"Senior Machine Learning Engineer, Voice Experience","description":"<p>We are looking for a Senior Machine Learning Engineer, Voice Experience to help build the next generation of AI-powered voice systems for the contact center. In this role, you will work at the intersection of speech, language, and real-time production systems, improving how AI listens, understands, reasons, empathizes, and responds in live customer conversations.</p>\n<p>You will develop and improve machine learning systems that power voice experiences end to end, including automatic speech recognition, turn detection, downstream language understanding, retrieval-augmented and agentic workflows, quality measurement, text to speech, and production optimization.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Design, train, evaluate, and deploy machine learning systems that power real-time voice experiences, including ASR, speech understanding, turn detection, text to speech, speech to speech, classification, entity extraction, summarization, and structured insight generation.</li>\n<li>Improve the quality of voice AI systems through error analysis, data curation, metric design, benchmarking, and iterative model improvement, with a strong focus on real-world performance.</li>\n<li>Build evaluation frameworks for complex voice and agentic systems, measuring metrics such as accuracy, robustness, latency, faithfulness, naturalness, professionalism, task completion, and cost.</li>\n<li>Diagnose and mitigate failure modes across the voice stack, including transcription errors, hallucinations, retrieval failures, tool misuse, prompt brittleness, context drift, and multi-step reasoning breakdowns.</li>\n<li>Design and optimize low-latency ML workflows for live conversations, balancing model quality with system responsiveness, scalability, and reliability.</li>\n<li>Partner with platform and backend engineers to productionize real-time inference, streaming pipelines, quality monitoring, and continuous model iteration.</li>\n<li>Collaborate cross-functionally with product, design, frontend, and backend teams to integrate voice intelligence seamlessly into Cresta’s platform.</li>\n<li>Establish best practices for offline evaluation, online experimentation, model validation, observability, and ongoing quality monitoring in production.</li>\n<li>Mentor engineers, contribute to technical strategy, and help shape the roadmap for Cresta’s voice AI systems.</li>\n</ul>\n<p>Qualifications:</p>\n<ul>\n<li>Bachelor’s degree in Computer Science, Mathematics, Machine Learning, AI, or a related field; Master’s or Ph.D. preferred.</li>\n<li>5+ years of experience building, evaluating, and deploying machine learning systems in production.</li>\n<li>Strong background in one or more of the following: speech recognition, speech processing, NLP, generative AI, or conversational AI.</li>\n<li>Deep experience with model evaluation, benchmarking, error analysis, and quality improvement for production ML systems.</li>\n<li>Strong expertise with modern ML frameworks and tooling such as PyTorch, TensorFlow, and Hugging Face.</li>\n<li>Solid understanding of transformer-based models, embeddings, retrieval systems, and large-scale training or inference workflows.</li>\n<li>Experience designing and deploying real-time ML systems with strong requirements around latency, scalability, and reliability.</li>\n<li>Experience building data pipelines and tooling for experimentation, measurement, and large-scale quality analysis.</li>\n<li>Ability to work across research and engineering boundaries and translate promising ideas into production-grade systems.</li>\n<li>Strong communication and technical leadership skills, with the ability to influence cross-functional decisions and raise the engineering bar.</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Hands-on experience with ASR quality metrics such as WER and task-level evaluation methodologies.</li>\n<li>Experience with RAG systems, agentic workflows, multi-step reasoning systems, or LLM-as-a-judge evaluation methods.</li>\n<li>Familiarity with streaming inference, real-time voice pipelines, or media systems.</li>\n<li>Experience working closely with infrastructure or platform teams on production ML deployment, observability, and reliability.</li>\n<li>Experience in contact center AI, conversational intelligence, or enterprise voice products.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c2c97849-e31","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cresta","sameAs":"https://www.cresta.ai/","logo":"https://logos.yubhub.co/cresta.ai.png"},"x-apply-url":"https://job-boards.greenhouse.io/cresta/jobs/5199747008?utm_source=yubhub.co&utm_medium=jobs_feed&utm_campaign=apply","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$205,000–$270,000","x-skills-required":["speech recognition","speech processing","NLP","generative AI","conversational AI","PyTorch","TensorFlow","Hugging Face","transformer-based models","embeddings","retrieval systems","large-scale training","inference workflows","real-time ML systems","latency","scalability","reliability","data pipelines","tooling","experimentation","measurement","quality analysis"],"x-skills-preferred":[],"datePosted":"2026-04-24T15:18:12.809Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United States (Remote)"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"speech recognition, speech processing, NLP, generative AI, conversational AI, PyTorch, TensorFlow, Hugging Face, transformer-based models, embeddings, retrieval systems, large-scale training, inference workflows, real-time ML systems, latency, scalability, reliability, data pipelines, tooling, experimentation, measurement, quality analysis","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":205000,"maxValue":270000,"unitText":"YEAR"}}}]}