{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/model-compression"},"x-facet":{"type":"skill","slug":"model-compression","display":"Model Compression","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7f56054b-d77"},"title":"Principal Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Principal Software Engineer at their Mountain View office. This role sits at the heart of strategic decision-making, driving innovations in AI infrastructure. You&#39;ll work directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Principal Software Engineer, you will be responsible for engaging directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models. You will work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications. You will collaborate with external and internal teams to identify new areas for improvement and contribute to innovations that enhance model performance and deployment.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Engage directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models.</li>\n<li>Work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Experience with model compression (quantization, distillation, SVD, low-rank methods).</li>\n<li>Experience in building high-throughput inference serving stacks (continuous batching, KV-cache optimizations, routing).</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Solid experience in GPU inference optimization (CUDA, TensorRT, Triton, or custom GPU kernels).</li>\n<li>Proficiency in profiling tools (Nsight, TensorBoard, PyTorch profiler) and ability to identify CPU/GPU bottlenecks.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary range of USD $139,900 – $274,800 per year.</li>\n<li>Comprehensive benefits package, including health insurance, retirement plan, and paid time off.</li>\n<li>Opportunities for professional growth and development.</li>\n<li>Collaborative and dynamic work environment.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7f56054b-d77","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-24/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","model compression","GPU inference optimization"],"x-skills-preferred":["TensorRT","Triton","CUDA","Nsight","TensorBoard","PyTorch profiler"],"datePosted":"2026-03-06T07:30:21.077Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, model compression, GPU inference optimization, TensorRT, Triton, CUDA, Nsight, TensorBoard, PyTorch profiler","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a15b11dd-765"},"title":"Principal Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Principal Software Engineer at their Redmond office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising AI technology. You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Principal Software Engineer, you will be responsible for designing and implementing complex software systems that drive innovation in AI infrastructure. You will work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications. You will collaborate with external and internal teams to identify new areas for improvement and contribute to innovations that enhance model performance and deployment.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Engage directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models, driving innovations in AI infrastructure.</li>\n<li>Work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Experience with model compression (quantization, distillation, SVD, low-rank methods).</li>\n<li>Experience in building high-throughput inference serving stacks (continuous batching, KV-cache optimizations, routing).</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Solid experience in GPU inference optimization (CUDA, TensorRT, Triton, or custom GPU kernels).</li>\n<li>Proficiency in profiling tools (Nsight, TensorBoard, PyTorch profiler) and ability to identify CPU/GPU bottlenecks.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary</li>\n<li>Comprehensive benefits package</li>\n<li>Opportunities for professional growth and development</li>\n<li>Collaborative and dynamic work environment</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a15b11dd-765","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-23/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","model compression","GPU inference optimization","profiling tools"],"x-skills-preferred":["TensorRT","Triton","CUDA","TensorBoard","PyTorch profiler"],"datePosted":"2026-03-06T07:29:39.108Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, model compression, GPU inference optimization, profiling tools, TensorRT, Triton, CUDA, TensorBoard, PyTorch profiler","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}}]}