{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/nccl"},"x-facet":{"type":"skill","slug":"nccl","display":"Nccl","count":28},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_588dfb0e-611"},"title":"Solutions Architect - Kubernetes","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital role in helping customers succeed with our cloud infrastructure offerings, focusing on Kubernetes solutions within high-performance compute (HPC) environments.</p>\n<p>Your responsibilities will include serving as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings.</p>\n<p>You will collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements.</p>\n<p>You will lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments.</p>\n<p>You will drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise.</p>\n<p>You will act as a virtual member of CoreWeave&#39;s Kubernetes product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions.</p>\n<p>You will offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture.</p>\n<p>You will conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimization and suggesting suitable solutions.</p>\n<p>You will stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders.</p>\n<p>You will lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption.</p>\n<p>You will represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p>To be successful in this role, you will need to have a B.S. in Computer Science or a related technical discipline, or equivalent experience.</p>\n<p>You will also need to have 7+ years of proven experience as a Solutions Architect, engineer, researcher, or technical account manager in cloud infrastructure, focusing on building distributed systems or HPC/cloud services, with an expertise focused on scalable Kubernetes solutions.</p>\n<p>You will need to be fluent in cloud computing concepts, architecture, and technologies with hands-on experience in designing and implementing cloud solutions.</p>\n<p>You will need to have a proven track record with building customer relationships, communicating clearly and the ability to break down complex technical concepts to both technical and non-technical audiences.</p>\n<p>You will need to be familiar with NVIDIA GPUs typically used in AI/ML applications and associated technologies such as Infiniband and NVIDIA Collective Communications Library (NCCL).</p>\n<p>You will need to have experience with running large-scale Artificial Intelligence/Machine Learning (AI/ML) training and inference workloads on technologies such as Slurm and Kubernetes.</p>\n<p>Preferred qualifications include code contributions to open-source inference frameworks, experience with scripting and automation related to Kubernetes clusters and workloads, experience with building solutions across multi-cloud environments, and client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_588dfb0e-611","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4557835006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $220,000","x-skills-required":["Kubernetes","Cloud Computing","High-Performance Compute (HPC)","Distributed Systems","Cloud Infrastructure","Scalable Solutions","NVIDIA GPUs","Infiniband","NVIDIA Collective Communications Library (NCCL)","Slurm","Kubernetes Clusters"],"x-skills-preferred":["Code Contributions to Open-Source Inference Frameworks","Scripting and Automation Related to Kubernetes Clusters and Workloads","Building Solutions Across Multi-Cloud Environments","Client or Customer-Facing Publications/Talks on Latency, Optimization, or Advanced Model-Server Architectures"],"datePosted":"2026-04-18T15:57:29.779Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Kubernetes, Cloud Computing, High-Performance Compute (HPC), Distributed Systems, Cloud Infrastructure, Scalable Solutions, NVIDIA GPUs, Infiniband, NVIDIA Collective Communications Library (NCCL), Slurm, Kubernetes Clusters, Code Contributions to Open-Source Inference Frameworks, Scripting and Automation Related to Kubernetes Clusters and Workloads, Building Solutions Across Multi-Cloud Environments, Client or Customer-Facing Publications/Talks on Latency, Optimization, or Advanced Model-Server Architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d799d883-0dd"},"title":"Solutions Architect- Networking","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital role in leading innovation at every turn. You will have the opportunity to demonstrate thought leadership and engage hands-on throughout our customers&#39; entire lifecycle. From establishing their Kubernetes environment to developing proofs of concept, onboarding, and optimizing workloads, you will lead innovation at every turn.</p>\n<p>In this role, you will:</p>\n<p>Serve as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings, focusing on networking technologies within high-performance compute (HPC) environments Collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements. Lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments. Drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise. Act as a virtual member of CoreWeave&#39;s Networking product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions. Offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture. Conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimization and suggesting suitable solutions. Stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders. Lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption. Represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p>Who You Are:</p>\n<p>B.S. in Computer Science or a related technical discipline, or equivalent experience 7+ years of proven experience as a Solutions Architect, engineer, researcher, or technical account manager in cloud infrastructure focusing on building distributed systems or HPC/cloud services, with an expertise focused on infrastructure networking. Fluency in cloud computing concepts, architecture, and technologies with hands-on experience in designing and implementing cloud solutions Proven track record with building customer relationships, communicating clearly and the ability to break down complex technical concepts to both technical and non-technical audiences Expertise with a broad range of networking technologies and topics, with a familiarity to understand the needs and use cases is it relates to securing and enabling high performance networking environments. Experience with managing infrastructure networking, Kubernnetes CSI management, and private networking concepts Familiar with NVIDIA GPUs typically used in AI/ML applications and associated technologies such as Infiniband and NVIDIA Collective Communications Library (NCCL)</p>\n<p>Preferred:</p>\n<p>Code contributions to open-source inference frameworks Experience with scripting and automation related to network technologies Experience with building solutions across multi-cloud environments Client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d799d883-0dd","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4568528006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $220,000","x-skills-required":["cloud computing","Kubernetes","infrastructure networking","high-performance computing","networking technologies","NVIDIA GPUs","Infiniband","NVIDIA Collective Communications Library (NCCL)"],"x-skills-preferred":["open-source inference frameworks","scripting and automation","multi-cloud environments","latency, optimization, or advanced model-server architectures"],"datePosted":"2026-04-18T15:56:27.053Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud computing, Kubernetes, infrastructure networking, high-performance computing, networking technologies, NVIDIA GPUs, Infiniband, NVIDIA Collective Communications Library (NCCL), open-source inference frameworks, scripting and automation, multi-cloud environments, latency, optimization, or advanced model-server architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9166d234-4c5"},"title":"Solutions Architect - HPC/AI/ML","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital and dynamic role in helping customers establish their Kubernetes environment, develop proofs of concept, onboard, and optimise workloads. You will serve as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings, focusing on AI/ML workloads within high-performance compute (HPC) environments.</p>\n<p>Collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements. Lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments.</p>\n<p>Drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise. Act as a virtual member of CoreWeave&#39;s Kubernetes product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions.</p>\n<p>Offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture. Conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimisation and suggesting suitable solutions.</p>\n<p>Stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders. Lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption.</p>\n<p>Represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9166d234-4c5","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4649044006","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $225,000 SGD","x-skills-required":["cloud computing concepts","architecture","technologies","NVIDIA GPUs","Infiniband","NVIDIA Collective Communications Library (NCCL)","Slurm","Kubernetes"],"x-skills-preferred":["code contributions to open-source inference frameworks","scripting and automation related to AI/ML workloads","building solutions across multi-cloud environments","client or customer-facing publications/talks on latency, optimisation, or advanced model-server architectures"],"datePosted":"2026-04-18T15:51:30.371Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud computing concepts, architecture, technologies, NVIDIA GPUs, Infiniband, NVIDIA Collective Communications Library (NCCL), Slurm, Kubernetes, code contributions to open-source inference frameworks, scripting and automation related to AI/ML workloads, building solutions across multi-cloud environments, client or customer-facing publications/talks on latency, optimisation, or advanced model-server architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":225000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ec7cc743-ef4"},"title":"Senior Software Engineer II, Inference","description":"<p>We&#39;re seeking a senior software engineer to join our team and lead the design and development of our Kubernetes-native inference platform. As a senior engineer, you will be responsible for leading design reviews, driving architecture, and ensuring the reliability and scalability of our platform.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading design reviews and driving architecture within the team</li>\n<li>Defining and owning SLIs/SLOs and ensuring post-incident actions land and reliability improves release-over-release</li>\n<li>Implementing advanced optimizations such as micro-batch schedulers, speculative decoding, and KV-cache reuse</li>\n<li>Strengthening incident posture through capacity planning, autoscaling policy, and rollback/traffic-shift strategies</li>\n<li>Mentoring IC1/IC2 engineers and reviewing cross-team designs to elevate coding/testing standards</li>\n</ul>\n<p>We&#39;re looking for someone with strong coding skills in Python or Go, deep familiarity with networked systems and performance, and hands-on experience with Kubernetes at production scale. If you have experience with inference internals, batching, caching, mixed precision, and streaming token delivery, that&#39;s a plus.</p>\n<p>In addition to a competitive salary, we offer a range of benefits including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO. We&#39;re committed to creating a work environment that&#39;s inclusive, diverse, and supportive of our employees&#39; well-being.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ec7cc743-ef4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4604832006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Python","Go","Kubernetes","Networked systems","Performance","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies","Contributions to inference frameworks","Experience with multi-team initiatives"],"datePosted":"2026-04-18T15:50:27.738Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, Networked systems, Performance, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect topologies, Contributions to inference frameworks, Experience with multi-team initiatives","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3c6419c4-a9b"},"title":"Software Engineer, Compute Efficiency","description":"<p>As a Software Engineer for Compute Efficiency on the Capacity team, you will play a central role in making our systems more performant, cost-effective, and sustainable,without compromising reliability or latency.</p>\n<p>You will work across the full infrastructure stack, from cloud platforms and networking to application-level performance, and will bridge the gap between high-level research needs and low-level hardware constraints to build the most efficient AI infrastructure in the world. You will help with building the telemetry, cost attribution, and optimization frameworks that ensure every dollar of our infrastructure investment delivers maximum value.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build and evolve telemetry and monitoring systems to provide deep visibility into infrastructure performance, utilization, and costs across our cloud and datacenter fleets.</li>\n</ul>\n<ul>\n<li>Design and implement cost attribution frameworks for our multi-tenant infrastructure, enabling teams to understand and optimize their resource consumption.</li>\n</ul>\n<ul>\n<li>Identify and resolve performance bottlenecks and capacity hotspots through deep analysis of distributed systems at scale.</li>\n</ul>\n<ul>\n<li>Partner closely with cloud service providers and internal stakeholders to optimize cluster configurations, workload placement, and resource utilization across AI training and inference workloads,including large-scale clusters spanning thousands to hundreds of thousands of machines.</li>\n</ul>\n<ul>\n<li>Develop and champion engineering practices around efficiency, driving a culture of performance awareness and cost-conscious design across Anthropic.</li>\n</ul>\n<ul>\n<li>Collaborate with research and product teams to deeply understand their infrastructure needs, and design solutions that balance performance with cost efficiency.</li>\n</ul>\n<ul>\n<li>Drive architectural improvements and code-level optimizations across multiple services and platforms to deliver measurable utilization and performance gains.</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 6+ years of relevant industry experience, 1+ year leading large scale, complex projects or teams as a software engineer or tech lead</li>\n</ul>\n<ul>\n<li>Deep expertise in distributed systems at scale, with a strong focus on infrastructure reliability, scalability, and continuous improvement.</li>\n</ul>\n<ul>\n<li>Strong proficiency in at least one programming language (e.g., Python, Rust, Go, Java)</li>\n</ul>\n<ul>\n<li>Hands-on experience with cloud infrastructure, including Kubernetes, Infrastructure as Code, and major cloud providers such as AWS or GCP.</li>\n</ul>\n<ul>\n<li>Experience optimizing end-to-end performance of distributed systems, including workload right-sizing and resource utilization tuning.</li>\n</ul>\n<ul>\n<li>You possess a deep curiosity for how things work under the hood and have a proven ability to work independently to solve opaque performance issues</li>\n</ul>\n<ul>\n<li>Experience designing or working with performance and utilization monitoring tools in large-scale, distributed environments.</li>\n</ul>\n<ul>\n<li>Strong problem-solving skills with the ability to work independently and navigate ambiguity.</li>\n</ul>\n<ul>\n<li>Excellent communication and collaboration skills,you will work closely with internal and external stakeholders to build consensus and drive projects forward.</li>\n</ul>\n<p>Strong candidates may have:</p>\n<ul>\n<li>Experience with machine learning infrastructure workloads as well as associated networking technologies like NCCL.</li>\n</ul>\n<ul>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n</ul>\n<ul>\n<li>Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<ul>\n<li>Published work in performance optimization and scaling distributed systems</li>\n</ul>\n<p>The annual compensation range for this role is $320,000-$405,000 USD.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3c6419c4-a9b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5108982008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["distributed systems","cloud infrastructure","Kubernetes","Infrastructure as Code","AWS","GCP","Python","Rust","Go","Java"],"x-skills-preferred":["machine learning infrastructure workloads","NCCL","linux kernel tuning","eBPF","performance optimization","scaling distributed systems"],"datePosted":"2026-04-18T15:49:18.293Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, cloud infrastructure, Kubernetes, Infrastructure as Code, AWS, GCP, Python, Rust, Go, Java, machine learning infrastructure workloads, NCCL, linux kernel tuning, eBPF, performance optimization, scaling distributed systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_59e88547-efc"},"title":"Senior Software Engineer, Systems","description":"<p>About Anthropic</p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.</p>\n<p>About the Role</p>\n<p>Anthropic&#39;s Infrastructure organization is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users , demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand. The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Lead infrastructure projects from design through delivery, owning scope, execution, and outcomes</li>\n<li>Build and maintain systems that support AI clusters at massive scale (thousands to hundreds of thousands of machines)</li>\n<li>Partner with cloud providers and internal teams to solve compute, networking, and reliability challenges</li>\n<li>Tackle difficult technical problems in your domain and proactively fill gaps in tooling, documentation, and processes</li>\n<li>Contribute to operational practices including incident response, postmortems, and on-call rotations</li>\n</ul>\n<p>Benefits</p>\n<ul>\n<li>Competitive compensation and benefits</li>\n<li>Optional equity donation matching</li>\n<li>Generous vacation and parental leave</li>\n<li>Flexible working hours</li>\n<li>Lovely office space in which to collaborate with colleagues</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>6+ years of software engineering experience</li>\n<li>Have led technical projects end-to-end over multiple months, including scoping, breaking down work, and driving delivery</li>\n<li>Have deep knowledge of distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Solve hard problems independently and know when to pull others in</li>\n<li>Help teammates grow through knowledge sharing and thoughtful technical guidance</li>\n<li>Communicate clearly in design docs, presentations, and cross-functional discussions</li>\n</ul>\n<p>Preferred Qualifications</p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_59e88547-efc","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4915842008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"£240,000-£325,000 GBP","x-skills-required":["Distributed systems","Reliability","Cloud platforms","Kubernetes","IaC","AWS/GCP","Systems language","Python","Rust","Go","Java"],"x-skills-preferred":["Security and privacy best practice","Machine learning infrastructure","GPUs","TPUs","Trainium","Networking infrastructure","NCCL","Low level systems experience","Linux kernel tuning","eBPF"],"datePosted":"2026-04-18T15:48:47.617Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, UK"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Distributed systems, Reliability, Cloud platforms, Kubernetes, IaC, AWS/GCP, Systems language, Python, Rust, Go, Java, Security and privacy best practice, Machine learning infrastructure, GPUs, TPUs, Trainium, Networking infrastructure, NCCL, Low level systems experience, Linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"GBP","value":{"@type":"QuantitativeValue","minValue":240000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9701c504-1a6"},"title":"Senior Software Engineer I, Inference","description":"<p>We&#39;re looking for a Senior Software Engineer I to join our team. As a senior engineer, you&#39;ll lead designs, raise engineering standards, and deliver measurable improvements to latency, throughput, and reliability across multiple services. You&#39;ll partner with product, orchestration, and hardware teams to evolve our Kubernetes-native inference platform and meet strict P99 SLAs at scale.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones.</li>\n<li>Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release.</li>\n<li>Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact.</li>\n<li>Strengthen incident posture: capacity planning, autoscaling policy, graceful degradation, rollback/traffic-shift strategies.</li>\n<li>Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards.</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>3-5 years of industry experience building distributed systems or cloud services.</li>\n<li>Strong coding in Python or Go (C++ a plus) and deep familiarity with networked systems and performance.</li>\n<li>Hands-on experience with Kubernetes at production scale, CI/CD, and observability stacks (Prometheus, Grafana, OpenTelemetry).</li>\n<li>Practical knowledge of inference internals: batching, caching, mixed precision (BF16/FP8), streaming token delivery.</li>\n<li>Proven track record improving tail latency (P95/P99) and service reliability through metrics-driven work.</li>\n</ul>\n<p>Preferred qualifications include contributions to inference frameworks, experience with CUDA kernels, NCCL/SHARP, RDMA/NUMA, or GPU interconnect topologies, and leading multi-team initiatives or partnering with customers on mission-critical launches.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9701c504-1a6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4647603006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,000 to $204,000","x-skills-required":["Python","Go","Kubernetes","CI/CD","Observability stacks","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["Contributions to inference frameworks","CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies"],"datePosted":"2026-04-18T15:48:09.297Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, CI/CD, Observability stacks, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, Contributions to inference frameworks, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect topologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139000,"maxValue":204000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_93c935ee-a9f"},"title":"Network Engineer - AI/HPC","description":"<p><strong>About the Role</strong></p>\n<p>xAI is a technology company that aims to create AI systems to understand the universe and aid humanity in its pursuit of knowledge. We are seeking a Network Engineer - AI/HPC to join our team.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Develop and maintain large-scale networks with expertise in RoCEv2, optimizing performance and availability.</li>\n<li>Design and implement metric dashboards to monitor network performance.</li>\n<li>Collaborate with the team to design and implement the next iteration of our backend and front-end networks.</li>\n<li>Participate in a team on-call rotation and help with scaling and maintenance efforts.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Minimum 10 years designing and operating large-scale networks with 5 years in the ethernet AI/HPC space.</li>\n<li>Deep understanding of congestion control on ethernet with Infiniband an added bonus.</li>\n<li>Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic.</li>\n<li>Experience with Python to automate away repetitive tasks and facilitate daily job working with and analyzing large sets of data.</li>\n</ul>\n<p><strong>Compensation and Benefits</strong></p>\n<p>$180,000 - $440,000 base salary. Comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short &amp; long-term disability insurance, life insurance, and various other discounts and perks.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_93c935ee-a9f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com/","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/5074185007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000","x-skills-required":["RoCEv2","Ethernet","Infiniband","NCCL","Python","Large-scale network design and operation"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:45:48.262Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"RoCEv2, Ethernet, Infiniband, NCCL, Python, Large-scale network design and operation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_71554e46-b64"},"title":"Senior Engineering Manager, AI Runtime","description":"<p>At Databricks, we are committed to enabling data teams to solve the world&#39;s toughest problems. As a Senior Engineering Manager, you will lead the team owning both the product experience and the foundational infrastructure of our AI Runtime (AIR) product.</p>\n<p>You will be responsible for shaping customer-facing capabilities while designing for scalability, extensibility, and performance of GPU training and adjacent areas. This will involve collaborating closely across the platform, product, infrastructure, and research organisations.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading, mentoring, and growing a high-performing engineering team responsible for the Custom Training product and its foundational infrastructure</li>\n<li>Defining and owning the product and technical roadmap for AIR, balancing customer experience, functionality, and foundational investments</li>\n<li>Collaborating closely with product, research, platform, infrastructure teams, and customers to drive end-to-end delivery</li>\n<li>Driving architectural decisions and product design for managed GPU training at scale</li>\n<li>Advocating for customer needs through direct engagement, ensuring engineering decisions translate to clear product impact</li>\n</ul>\n<p>We are looking for someone with 8+ years of software engineering experience, with 3+ years in engineering management. You should have a track record of building and operating managed GPU training infrastructure at scale, as well as deep familiarity with distributed training frameworks and parallelism strategies.</p>\n<p>In addition, you should have experience with training resilience patterns, such as checkpointing, elastic training, and automated failure recovery for long-running jobs. You should also have a strong understanding of GPU performance fundamentals, including NCCL, interconnect topologies, and memory optimisation.</p>\n<p>Experience building platform products with clear SLAs is also essential, as is strong cross-functional leadership across platform, product, and research teams. Excellent collaboration and communication skills are also required.</p>\n<p>The pay range for this role is $228,600-$314,250 USD per year, depending on location. The total compensation package may also include eligibility for annual performance bonus, equity, and benefits.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_71554e46-b64","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8490282002","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$228,600-$314,250 USD per year","x-skills-required":["software engineering","engineering management","distributed training frameworks","parallelism strategies","GPU training infrastructure","checkpointing","elastic training","automated failure recovery","GPU performance fundamentals","NCCL","interconnect topologies","memory optimisation"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:45:28.312Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View, California; San Francisco, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, engineering management, distributed training frameworks, parallelism strategies, GPU training infrastructure, checkpointing, elastic training, automated failure recovery, GPU performance fundamentals, NCCL, interconnect topologies, memory optimisation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":228600,"maxValue":314250,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_99450ad6-e3b"},"title":"Network Engineer - AI/HPC","description":"<p><strong>About the Role</strong></p>\n<p>We are seeking a skilled Network Engineer to join our team at xAI. As a Network Engineer, you will play a critical role in designing and operating large-scale networks for our AI and HPC systems.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design and operate large-scale networks with a deep understanding of congestion control on ethernet and Infiniband</li>\n<li>Develop and optimize network configurations to ensure high performance and availability</li>\n<li>Collaborate with the team to design the next iteration of our backend and front-end networks</li>\n<li>Travel to Memphis to build capacity and participate in a team on-call rotation</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Minimum of 10 years designing and operating large-scale networks with 5 years in the ethernet AI/HPC space</li>\n<li>Deep understanding of congestion control on ethernet with Infiniband an added bonus</li>\n<li>Expertise in creating a portfolio of metrics for performance and operations to optimize the fleet for training and inference traffic</li>\n<li>Experience with Python to automate away repetitive tasks and facilitate daily job working with and analyzing large sets of data</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Opportunity to work with a highly motivated team focused on engineering excellence</li>\n<li>Collaborative and dynamic work environment</li>\n<li>Professional development opportunities</li>\n</ul>\n<p><strong>What We Offer</strong></p>\n<ul>\n<li>Competitive salary and benefits package</li>\n<li>Opportunity to work on cutting-edge AI and HPC projects</li>\n<li>Collaborative and dynamic work environment</li>\n</ul>\n<p><strong>How to Apply</strong></p>\n<p>If you are a motivated and experienced Network Engineer looking for a new challenge, please submit your application, including your resume and cover letter, to [insert contact information].</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_99450ad6-e3b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com/","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/4946691007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["RoCEv2","NCCL","Python","Ethernet","Infiniband","AI training and inference workloads"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:45:15.340Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Memphis, TN"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"RoCEv2, NCCL, Python, Ethernet, Infiniband, AI training and inference workloads"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_28107212-128"},"title":"Performance Engineer, GPU","description":"<p>As a GPU Performance Engineer at Anthropic, you will be responsible for architecting and implementing the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. You will maximize GPU utilization and performance at unprecedented scale, develop cutting-edge optimizations that directly enable new model capabilities, and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you will implement state-of-the-art techniques from custom kernel development to distributed system architectures. Your work will span the entire stack,from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Architect and implement foundational systems that power Claude</li>\n<li>Maximize GPU utilization and performance at unprecedented scale</li>\n<li>Develop cutting-edge optimizations that directly enable new model capabilities</li>\n<li>Dramatically improve inference efficiency</li>\n<li>Implement state-of-the-art techniques from custom kernel development to distributed system architectures</li>\n<li>Work at the intersection of hardware and software</li>\n<li>Span the entire stack,from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Deep experience with GPU programming and optimization at scale</li>\n<li>Impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Ability to navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Experience with GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p>Representative projects:</p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p>Note: The salary range for this position is $280,000-$850,000 USD per year.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_28107212-128","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000-$850,000 USD per year","x-skills-required":["GPU programming","optimization at scale","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","PyTorch/JAX internals","torch.compile","XLA","custom operators","kernel fusion","memory bandwidth optimization","profiling with Nsight","NCCL","NVLink","collective communication","model parallelism","INT8/FP8 quantization","mixed-precision techniques","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:40:11.758Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, PyTorch/JAX internals, torch.compile, XLA, custom operators, kernel fusion, memory bandwidth optimization, profiling with Nsight, NCCL, NVLink, collective communication, model parallelism, INT8/FP8 quantization, mixed-precision techniques, large-scale training infrastructure, fault tolerance, cluster orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0a872d93-7f6"},"title":"Engineering Manager, Cloud Inference AWS","description":"<p>We are seeking an experienced Engineering Manager to lead the Cloud Inference team for AWS. You will lead your team to scale and optimize Claude to serve the massive audiences of developers and enterprise companies using AWS.</p>\n<p>As an Engineering Manager, you will own the end-to-end product of Claude on AWS, including API, load balancing, inference, capacity and operations. Your team will ensure our LLMs meet rigorous performance, safety and security standards and enhance our core infrastructure for packaging, testing, and deploying inference technology across the globe.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Set technical strategy and oversee development of Claude on AWS across all layers of the technical stack.</li>\n<li>Collaborate across teams and companies to deeply understand product, infrastructure, operations and capacity needs, identifying potential solutions to support frontier LLM serving</li>\n<li>Work closely with cross-functional stakeholders across companies to align on goals and drive outcomes</li>\n<li>Create clarity for the team and stakeholders in an ambiguous and evolving environment</li>\n<li>Take an inclusive approach to hiring and coaching top technical talent, and support a high performing team</li>\n<li>Design and run processes (e.g. postmortem review, incident response, on-call rotations) that help the team operate effectively and never fail the same way twice</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>10+ years of experience in high-scale, high-reliability software development, particularly infrastructure or capacity management</li>\n<li>5+ years of engineering management experience</li>\n<li>Experience recruiting, scaling, and retaining engineering talent in a high growth environment</li>\n<li>Have experience scaling products, resources and operations to accommodate rapid growth</li>\n<li>Are deeply interested in the potential transformative effects of advanced AI systems and are committed to ensuring their safe development</li>\n<li>Excel at building strong relationships and strategy with stakeholders across engineering, product, finance, and sales</li>\n<li>Have experience working with external partners to align goals and deliver impact</li>\n<li>Enjoy working in a fast-paced, early environment; comfortable with adapting priorities as driven by the rapidly evolving AI space</li>\n<li>Have excellent written and verbal communication skills</li>\n<li>Demonstrated success building a culture of belonging and engineering excellence</li>\n<li>Are motivated by developing AI responsibly and safely</li>\n<li>Are willing and able to travel frequently between Seattle and the SF Bay Area</li>\n</ul>\n<p>Strong candidates may also have experience with:</p>\n<ul>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Experience as a Product Manager</li>\n<li>Experience with deployment and capacity management automation</li>\n<li>Security and privacy best practice expertise</li>\n</ul>\n<p>Annual compensation range for this role is $405,000-$485,000 USD.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0a872d93-7f6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5141377008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$405,000-$485,000 USD","x-skills-required":["Cloud Inference","AWS","Machine Learning","Infrastructure Management","Capacity Planning","Security and Privacy","Leadership","Communication","Collaboration"],"x-skills-preferred":["GPU","TPU","Trainium","NCCL","Product Management","Deployment Automation","Security Best Practices"],"datePosted":"2026-04-18T15:37:12.539Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cloud Inference, AWS, Machine Learning, Infrastructure Management, Capacity Planning, Security and Privacy, Leadership, Communication, Collaboration, GPU, TPU, Trainium, NCCL, Product Management, Deployment Automation, Security Best Practices","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_81c1220e-c71"},"title":"Applied Scientist / Research Engineer","description":"<p>About the Job</p>\n<p>Mistral AI is seeking Applied Scientists and Research Engineers to drive innovative research and collaborate with clients on complex research projects. You will develop SOTA models across different modalities such as text, image, and speech. By developing novel methods and research ideas you will apply these models across a diverse set of use cases and domains.</p>\n<p>Responsibilities</p>\n<p>• Run pre-training, post-training and deploy state of the art models on clusters with thousands of GPUs. You don’t panic when you see OOM errors or when NCCL feels like not wanting to talk.\n• Generate and curate data for pre-training and post-training, working on evaluations and making sure the model’s performance beats expectations.\n• Develop the necessary tools and frameworks to facilitate data generation, model training, evaluation and deployment.\n• Collaborate with cross-functional teams to tackle complex use cases using agents and RAG pipelines.\n• Manage research projects and communications with client research teams.</p>\n<p>About You</p>\n<p>• You are fluent in English, and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.\n• You’re an expert with PyTorch or JAX.\n• You’re not afraid of contributing to a big codebase and can find yourself around independently with little guidance.\n• You write clean, readable, high-performance, fault-tolerant Python code.\n• You don’t need roadmaps: you just do. You don’t need a manager: you just ship.\n• Low-ego, collaborative and eager to learn.\n• You have a track record of success through personal projects, professional projects or in academia.</p>\n<p>Benefits</p>\n<p>• Competitive cash salary and equity\n• Health Insurance\n• Sport: $90 for gym membership allowance\n• Food: $200 monthly allowance for meals (solution might evolve as we grow bigger)\n• Transportation: $120/month for public transport or Parking charges reimbursed</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_81c1220e-c71","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/c41d9d9e-f0ea-4621-a4a9-3f10dfa9ae84","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","Python","GPU","OOM","NCCL","Agents","RAG pipelines"],"x-skills-preferred":["PhD","Master","Mathematics","Physics","Machine Learning"],"datePosted":"2026-04-17T12:47:44.697Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, Python, GPU, OOM, NCCL, Agents, RAG pipelines, PhD, Master, Mathematics, Physics, Machine Learning"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c569b4c8-3f2"},"title":"Applied Scientist / Research Engineer - EMEA","description":"<p>About Mistral AI</p>\n<p>Mistral AI is a pioneering company shaping the future of AI. We believe in the power of AI to simplify tasks, save time, and enhance learning and creativity.</p>\n<p>About The Job</p>\n<p>We are seeking Applied Scientists and Research Engineers to drive innovative research and collaborate with clients on complex research projects. You will develop SOTA models across different modalities such as text, image, and speech. By developing novel methods and research ideas you will apply these models across a diverse set of use cases and domains.</p>\n<p>Responsibilities</p>\n<p>• Run pre-training, post-training and deploy state of the art models on clusters with thousands of GPUs. You don’t panic when you see OOM errors or when NCCL feels like not wanting to talk.\n• Generate and curate data for pre-training and post-training, working on evaluations and making sure the model’s performance beats expectations.\n• Develop the necessary tools and frameworks to facilitate data generation, model training, evaluation and deployment.\n• Collaborate with cross-functional teams to tackle complex use cases using agents and RAG pipelines.\n• Manage research projects and communications with client research teams.</p>\n<p>About You</p>\n<p>• You are fluent in English, and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.\n• You’re an expert with PyTorch or JAX.\n• You’re not afraid of contributing to a big codebase and can find yourself around independently with little guidance.\n• You write clean, readable, high-performance, fault-tolerant Python code.\n• You don’t need roadmaps: you just do. You don’t need a manager: you just ship.\n• Low-ego, collaborative and eager to learn.\n• You have a track record of success through personal projects, professional projects or in academia.</p>\n<p>Benefits</p>\n<p>We have local offices in Paris, London, Marseille, Singapore and Palo Alto. France:\n• Competitive cash salary and equity\n• Food: Daily lunch vouchers\n• Sport: Monthly contribution to a Gympass subscription\n• Transportation: Monthly contribution to a mobility pass\n• Health: Full health insurance for you and your family\n• Parental: Generous parental leave policy\n• Visa sponsorship</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c569b4c8-3f2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/b7ae8fc4-5779-4ad2-8f5b-632b4d9498cf","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","Python","GPU","OOM errors","NCCL"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:47:06.801Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, Python, GPU, OOM errors, NCCL"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8d359571-77e"},"title":"Lead Software Engineer, Runtime","description":"<p>As the Technical Lead for the Inference team, you will drive the architecture and optimization of our inference backbone, ensuring high performance, scalability, and efficiency in a dynamic environment.</p>\n<p>The role involves architecting and optimizing the inference for high-volume, low-latency, and high-availability environments, leading the acquisition and automation of benchmarks, collaborating with cross-functional teams, and innovating solutions to enhance our AI-powered applications.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Architecting and optimizing the inference for high-volume, low-latency, and high-availability environments</li>\n<li>Leading the acquisition and automation of benchmarks at both micro and macro scales</li>\n<li>Introducing new techniques and tools to improve performance, latency, throughput, and efficiency in our model inference stack</li>\n<li>Building tools to identify bottlenecks and sources of instability, and designing solutions to address them</li>\n<li>Collaborating with machine learning researchers, engineers, and product managers to bring cutting-edge technologies into production</li>\n<li>Optimizing code and infrastructure to maximize hardware utilization and efficiency</li>\n<li>Mentoring and guiding team members, fostering a culture of collaboration, innovation, and continuous learning</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>Extensive experience in C++ and Python, with a strong focus on backend development and performance optimization</li>\n<li>Deep understanding of modern ML architectures and experience with performance optimization for inference</li>\n<li>Proven track record with large-scale distributed systems, particularly performance-critical ones</li>\n<li>Familiarity with PyTorch, TensorRT, CUDA, NCCL</li>\n<li>Strong grasp of infrastructure, continuous integration, and continuous development principles</li>\n<li>Ability to lead and mentor team members, driving projects from concept to implementation</li>\n<li>Results-oriented mindset with a bias towards flexibility and impact</li>\n<li>Passion for staying ahead of emerging technologies and applying them to AI-driven solutions</li>\n<li>Humble attitude, eagerness to help colleagues, and a desire to see the team succeed</li>\n</ul>\n<p>Our Culture</p>\n<p>We&#39;re driven to build a strong company culture and are looking for individuals with solid alignment with the following:</p>\n<ul>\n<li>Reason with rigor</li>\n<li>Are you audacious enough?</li>\n<li>Make our customers succeed</li>\n<li>Ship early and accelerate</li>\n<li>Leave your ego aside</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8d359571-77e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/0593f273-44f5-4c20-a84c-0406d5da6a0b","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C++","Python","PyTorch","TensorRT","CUDA","NCCL"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:27:09.420Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, Python, PyTorch, TensorRT, CUDA, NCCL"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fe73c8b8-4de"},"title":"Applied Scientist / Research Engineer (Internship)","description":"<p>About Mistral AI</p>\n<p>Mistral AI is a pioneering company shaping the future of AI. We believe in the power of AI to simplify tasks, save time, and enhance learning and creativity.</p>\n<p>Role Summary</p>\n<p>Mistral AI is seeking Applied Scientists Interns and Research Engineers Interns to drive innovative research and collaborate with clients on complex research projects. You will develop SOTA models across different modalities such as text, image, and speech. By developing novel methods and research ideas you will apply these models across a diverse set of use cases and domains.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Run pre-training, post-training and deploy state of the art models on clusters with thousands of GPUs.</li>\n<li>Generate and curate data for pre-training and post-training, working on evaluations and making sure the model&#39;s performance beats expectations.</li>\n<li>Develop the necessary tools and frameworks to facilitate data generation, model training, evaluation and deployment.</li>\n<li>Collaborate with cross-functional teams to tackle complex use cases using agents and RAG pipelines.</li>\n<li>Manage research projects and communications with client research teams.</li>\n</ul>\n<p>About You</p>\n<ul>\n<li>You are fluent in English, and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.</li>\n<li>You&#39;re an expert with PyTorch or JAX.</li>\n<li>You&#39;re not afraid of contributing to a big codebase and can find yourself around independently with little guidance.</li>\n<li>You write clean, readable, high-performance, fault-tolerant Python code.</li>\n<li>You don&#39;t need roadmaps: you just do. You don&#39;t need a manager: you just ship.</li>\n<li>Low-ego, collaborative and eager to learn.</li>\n<li>You have a track record of success through personal projects, professional projects or in academia.</li>\n</ul>\n<p>It Would Be Great If</p>\n<ul>\n<li>You are pursuing a PhD / master in a relevant field (e.g., Mathematics, Physics, Machine Learning), but if you’re an exceptional candidate from a different background, you should apply.</li>\n<li>We’d love to have you for at least 3 months, ideally 6 months.</li>\n<li>You can bring a variety of research experiences, such as working with agents, multi-modality, robotics, diffusion models, or time-series analysis.</li>\n<li>Have contributed to a large codebase used by many (open source or in the industry).</li>\n<li>Have a track record of publications in top academic journals or conferences.</li>\n<li>Love improving existing code by fixing typing issues, adding tests and improving CI pipelines.</li>\n</ul>\n<p>Benefits</p>\n<ul>\n<li>Competitive salary</li>\n<li>Food: Daily lunch vouchers</li>\n<li>Sport: Monthly contribution to a Gympass subscription</li>\n<li>Transportation: Monthly contribution to a mobility pass</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_fe73c8b8-4de","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/426ef8c0-eb26-4004-a690-f33c62b445a7","x-work-arrangement":"onsite","x-experience-level":"entry","x-job-type":"internship","x-salary-range":null,"x-skills-required":["PyTorch","JAX","Python","GPU","OOM errors","NCCL"],"x-skills-preferred":["agents","multi-modality","robotics","diffusion models","time-series analysis"],"datePosted":"2026-03-10T11:25:01.722Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"INTERN","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, Python, GPU, OOM errors, NCCL, agents, multi-modality, robotics, diffusion models, time-series analysis"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9bc334f7-f1c"},"title":"Applied Scientist / Research Engineer","description":"<p>About the Job</p>\n<p>Mistral AI is seeking Applied Scientists and Research Engineers to drive innovative research and collaborate with clients on complex research projects.</p>\n<p>You will develop SOTA models across different modalities such as text, image, and speech. By developing novel methods and research ideas you will apply these models across a diverse set of use cases and domains. Working cross-functionally with both external and internal science, engineering, and product teams you will deliver high-impact AI solutions that turn the needle.</p>\n<p>Responsibilities</p>\n<p>• Run pre-training, post-training and deploy state of the art models on clusters with thousands of GPUs. You don’t panic when you see OOM errors or when NCCL feels like not wanting to talk.\n• Generate and curate data for pre-training and post-training, working on evaluations and making sure the model’s performance beats expectations.\n• Develop the necessary tools and frameworks to facilitate data generation, model training, evaluation and deployment.\n• Collaborate with cross-functional teams to tackle complex use cases using agents and RAG pipelines.\n• Manage research projects and communications with client research teams.</p>\n<p>About You</p>\n<p>• You are fluent in English, and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.\n• You’re an expert with PyTorch or JAX.\n• You’re not afraid of contributing to a big codebase and can find yourself around independently with little guidance.\n• You write clean, readable, high-performance, fault-tolerant Python code.\n• You don’t need roadmaps: you just do. You don’t need a manager: you just ship.\n• Low-ego, collaborative and eager to learn.\n• You have a track record of success through personal projects, professional projects or in academia.</p>\n<p>It would be great if you</p>\n<p>• Hold a PhD / master in a relevant field (e.g., Mathematics, Physics, Machine Learning), but if you’re an exceptional candidate from a different background, you should apply.\n• Can bring a variety of research experience (agents, multi-modality, robotics, diffusion, time-series).\n• Have contributed to a large codebase used by many (open source or in the industry).\n• Have a track record of publications in top academic journals or conferences.\n• Love improving existing code by fixing typing issues, adding tests and improving CI pipelines.</p>\n<p>Benefits</p>\n<p>• Competitive cash salary and equity\n• Health Insurance\n• Sport: $90 for gym membership allowance\n• Food: $200 monthly allowance for meals (solution might evolve as we grow bigger)\n• Transportation: $120/month for public transport or Parking charges reimbursed</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9bc334f7-f1c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/c41d9d9e-f0ea-4621-a4a9-3f10dfa9ae84","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","Python","GPU","OOM","NCCL"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:24:50.623Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, Python, GPU, OOM, NCCL"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a51375e8-30e"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost. Our work spans today&#39;s frontier AI workloads and directly shapes the next generation of accelerators, system architectures, and large-scale AI platforms. We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads large-scale training, inference, and emerging multimodal models and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures. This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale. In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>About the Team</p>\n<p>We build foundational AI infrastructure that enables large-scale training and inference across diverse workloads and rapidly evolving hardware generations. Our work directly shapes how AI systems are designed, deployed, and scaled today and into the future. Engineers on this team operate with end-to-end ownership, deep technical rigor, and a strong bias toward real-world impact.</p>\n<p>Microsoft Superintelligence Team</p>\n<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly-ambitious and low ego individual, you’ll fit right in—come and join us as we work on our next generation of models!</p>\n<p>Responsibilities</p>\n<p>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks. Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements. Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems. Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps. Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations. Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs. Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams. Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a51375e8-30e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-3/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["Experience designing or operating large-scale AI clusters for training or inference","Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications","Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Background in performance modeling and capacity planning for future hardware generations","Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:18:41.443Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, Experience designing or operating large-scale AI clusters for training or inference, Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications, Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Background in performance modeling and capacity planning for future hardware generations, Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cd1a0d16-311"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost.</p>\n<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads large-scale training, inference, and emerging multimodal models and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>\n<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>\n<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>Microsoft Superintelligence Team\nMicrosoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams giving our models the chance to reach billions of users and create immense positive impact.</p>\n<p>Responsibilities\nLead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</p>\n<p>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</p>\n<p>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</p>\n<p>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</p>\n<p>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</p>\n<p>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</p>\n<p>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</p>\n<p>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>\n<p>Qualifications\nBachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Additional or Preferred Qualifications\nMaster’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Strong background in one or more of the following areas: AI accelerator or GPU architectures Distributed systems and large-scale AI training/inference High-performance computing (HPC) and collective communications ML systems, runtimes, or compilers Performance modeling, benchmarking, and systems analysis Hardware–software co-design for AI workloads Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.</p>\n<p>Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders. Experience designing or operating large-scale AI clusters for training or inference. Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications. Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand). Background in performance modeling and capacity planning for future hardware generations. Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews. Publications, patents, or open-source contributions in systems, architecture, or ML systems are a plus.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cd1a0d16-311","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-2/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["LLMs, multimodal models, or recommendation systems, and their systems-level implications","Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Performance modeling and capacity planning for future hardware generations","Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:13:30.666Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, LLMs, multimodal models, or recommendation systems, and their systems-level implications, Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Performance modeling and capacity planning for future hardware generations, Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_139cd1f4-231"},"title":"Software Engineer, Compute Efficiency","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.</p>\n<p>At Anthropic, we are building some of the most complex and large-scale AI infrastructure in the world. As that infrastructure scales rapidly, so does the imperative to optimise how we use it. As a Software Engineer for Compute Efficiency on the Capacity team, you will play a central role in making our systems more performant, cost-effective, and sustainable—without compromising reliability or latency.</p>\n<p>You will work across the full infrastructure stack, from cloud platforms and networking to application-level performance, and will bridge the gap between high-level research needs and low-level hardware constraints to build the most efficient AI infrastructure in the world. You will help with building the telemetry, cost attribution, and optimisation frameworks that ensure every dollar of our infrastructure investment delivers maximum value. This is a high-impact, cross-functional role at the intersection of systems engineering, financial optimisation, and AI infrastructure.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Build and evolve telemetry and monitoring systems to provide deep visibility into infrastructure performance, utilisation, and costs across our cloud and datacentre fleets.</li>\n</ul>\n<ul>\n<li>Design and implement cost attribution frameworks for our multi-tenant infrastructure, enabling teams to understand and optimise their resource consumption.</li>\n</ul>\n<ul>\n<li>Identify and resolve performance bottlenecks and capacity hotspots through deep analysis of distributed systems at scale.</li>\n</ul>\n<ul>\n<li>Partner closely with cloud service providers and internal stakeholders to optimise cluster configurations, workload placement, and resource utilisation across AI training and inference workloads—including large-scale clusters spanning thousands to hundreds of thousands of machines.</li>\n</ul>\n<ul>\n<li>Develop and champion engineering practices around efficiency, driving a culture of performance awareness and cost-conscious design across Anthropic.</li>\n</ul>\n<ul>\n<li>Collaborate with research and product teams to deeply understand their infrastructure needs, and design solutions that balance performance with cost efficiency.</li>\n</ul>\n<ul>\n<li>Drive architectural improvements and code-level optimisations across multiple services and platforms to deliver measurable utilisation and performance gains.</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 6+ years of relevant industry experience, 1+ year leading large scale, complex projects or teams as a software engineer or tech lead</li>\n</ul>\n<ul>\n<li>Deep expertise in distributed systems at scale, with a strong focus on infrastructure reliability, scalability, and continuous improvement.</li>\n</ul>\n<ul>\n<li>Strong proficiency in at least one programming language (e.g., Python, Rust, Go, Java)</li>\n</ul>\n<ul>\n<li>Hands-on experience with cloud infrastructure, including Kubernetes, Infrastructure as Code, and major cloud providers such as AWS or GCP.</li>\n</ul>\n<ul>\n<li>Experience optimising end-to-end performance of distributed systems, including workload right-sizing and resource utilisation tuning.</li>\n</ul>\n<ul>\n<li>You possess a deep curiosity for how things work under the hood and have a proven ability to work independently to solve opaque performance issues</li>\n</ul>\n<ul>\n<li>Experience designing or working with performance and utilisation monitoring tools in large-scale, distributed environments.</li>\n</ul>\n<ul>\n<li>Strong problem-solving skills with the ability to work independently and navigate ambiguity.</li>\n</ul>\n<ul>\n<li>Excellent communication and collaboration skills—you will work closely with internal and external stakeholders to build consensus and drive projects forward.</li>\n</ul>\n<p><strong>Strong candidates may have:</strong></p>\n<ul>\n<li>Experience with machine learning infrastructure workloads as well as associated networking technologies like NCCL.</li>\n</ul>\n<ul>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n</ul>\n<ul>\n<li>Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<ul>\n<li>Published work in performance optimisation and scaling distributed systems</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong> Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work. We think AI systems like the ones we&#39;re building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.</p>\n<p><strong>Your safety matters to us.</strong> To protect yourself from potential scams, remember that Anthropic</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_139cd1f4-231","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5108982008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000 - $405,000USD","x-skills-required":["distributed systems","cloud infrastructure","Kubernetes","Infrastructure as Code","AWS","GCP","Python","Rust","Go","Java","performance optimisation","scalability","continuous improvement"],"x-skills-preferred":["machine learning infrastructure workloads","NCCL","linux kernel tuning","eBPF","systems design tradeoffs","published work in performance optimisation"],"datePosted":"2026-03-08T13:56:57.417Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, cloud infrastructure, Kubernetes, Infrastructure as Code, AWS, GCP, Python, Rust, Go, Java, performance optimisation, scalability, continuous improvement, machine learning infrastructure workloads, NCCL, linux kernel tuning, eBPF, systems design tradeoffs, published work in performance optimisation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0a7113f5-76c"},"title":"Engineering Manager, Cloud Inference AWS","description":"<p><strong>About the role</strong></p>\n<p>We are seeking an experienced Engineering Manager to lead the Cloud Inference team for AWS. You will lead your team to scale and optimize Claude to serve the massive audiences of developers and enterprise companies using AWS. You will own the end-to-end product of Claude on AWS, including API, load balancing, inference, capacity and operations. Your team will ensure our LLMs meet rigorous performance, safety and security standards and enhance our core infrastructure for packaging, testing, and deploying inference technology across the globe. Your work will increase the scale at which Anthropic operates and accelerate our ability to reliably launch new frontier models and innovative features to customers across all platforms.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Set technical strategy and oversee development of Claude on AWS across all layers of the technical stack.</li>\n<li>Collaborate across teams and companies to deeply understand product, infrastructure, operations and capacity needs, identifying potential solutions to support frontier LLM serving</li>\n<li>Work closely with cross-functional stakeholders across companies to align on goals and drive outcomes</li>\n<li>Create clarity for the team and stakeholders in an ambiguous and evolving environment</li>\n<li>Take an inclusive approach to hiring and coaching top technical talent, and support a high performing team</li>\n<li>Design and run processes (e.g. postmortem review, incident response, on-call rotations) that help the team operate effectively and never fail the same way twice</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 10+ years of experience in high-scale, high-reliability software development, particularly infrastructure or capacity management</li>\n<li>Have 5+ years of engineering management experience</li>\n<li>Experience recruiting, scaling, and retaining engineering talent in a high growth environment</li>\n<li>Have experience scaling products, resources and operations to accommodate rapid growth</li>\n<li>Are deeply interested in the potential transformative effects of advanced AI systems and are committed to ensuring their safe development</li>\n<li>Excel at building strong relationships and strategy with stakeholders across engineering, product, finance, and sales</li>\n<li>Have experience working with external partners to align goals and deliver impact</li>\n<li>Enjoy working in a fast-paced, early environment; comfortable with adapting priorities as driven by the rapidly evolving AI space</li>\n<li>Have excellent written and verbal communication skills</li>\n<li>Demonstrated success building a culture of belonging and engineering excellence</li>\n<li>Are motivated by developing AI responsibly and safely</li>\n<li>Are willing and able to travel frequently between Seattle and the SF Bay Area</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Experience as a Product Manager</li>\n<li>Experience with deployment and capacity management automation</li>\n<li>Security and privacy best practice expertise</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong> Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</p>\n<p><strong>Your safety matters to us.</strong> To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as a collaborative effort, and we work closely with other researchers, engineers, and experts to advance our understanding of AI and its applications.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0a7113f5-76c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5141377008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$405,000 - $485,000 USD","x-skills-required":["high-scale, high-reliability software development","infrastructure or capacity management","engineering management","recruiting, scaling, and retaining engineering talent","scaling products, resources and operations","machine learning infrastructure","deployment and capacity management automation","security and privacy best practice expertise"],"x-skills-preferred":["experience with GPUs, TPUs, or Trainium","experience as a Product Manager","experience with networking infrastructure like NCCL"],"datePosted":"2026-03-08T13:56:51.226Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"high-scale, high-reliability software development, infrastructure or capacity management, engineering management, recruiting, scaling, and retaining engineering talent, scaling products, resources and operations, machine learning infrastructure, deployment and capacity management automation, security and privacy best practice expertise, experience with GPUs, TPUs, or Trainium, experience as a Product Manager, experience with networking infrastructure like NCCL","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3b20b513-ea1"},"title":"Staff+ Software Engineer, Systems","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>Anthropic&#39;s Infrastructure organisation is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users — demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand.</p>\n<p>The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p>_Team Matching: Team matching is determined after the interview process based on interview performance, interests, and business priorities. Please note we may also consider you for different Infrastructure teams._</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Own the technical strategy and roadmap for your area, translating team-level goals into concrete execution plans</li>\n<li>Drive cross-team initiatives to build and scale AI clusters (thousands to hundreds of thousands of machines)</li>\n<li>Define infrastructure architecture, ensuring the hardest problems get solved — whether by you directly or by working through others</li>\n<li>Partner with cloud providers and internal stakeholders to shape long-term compute, data, and infrastructure strategy</li>\n<li>Establish and evolve operational excellence practices (incident response, postmortem culture, on-call)</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 10+ years of software engineering experience</li>\n<li>Have led complex, multi-quarter technical initiatives that span multiple teams or systems</li>\n<li>Can set technical direction for a team, not just execute within it</li>\n<li>Have deep expertise in distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Naturally uplevel the engineers around you and can redirect efforts when things are heading off track</li>\n<li>Build alignment across senior stakeholders and communicate effectively at all levels</li>\n</ul>\n<p><strong>Strong candidates may have:</strong></p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n<li>Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<p>_Deadline to apply: None. Applications will be reviewed on a rolling basis._</p>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</strong></p>\n<p><strong>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</strong></p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>\n<p>The easiest way to understand our research directions is to read our recent research. This re</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3b20b513-ea1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5108817008","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$405,000 - $485,000 USD","x-skills-required":["distributed systems","reliability","cloud platforms","Kubernetes","IaC","AWS/GCP","Python","Rust","Go","Java"],"x-skills-preferred":["security and privacy best practice expertise","machine learning infrastructure","GPUs","TPUs","Trainium","NCCL","low level systems experience","linux kernel tuning","eBPF"],"datePosted":"2026-03-08T13:49:17.054Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, reliability, cloud platforms, Kubernetes, IaC, AWS/GCP, Python, Rust, Go, Java, security and privacy best practice expertise, machine learning infrastructure, GPUs, TPUs, Trainium, NCCL, low level systems experience, linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_886a66bf-10d"},"title":"Senior Software Engineer, Systems","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>Anthropic&#39;s Infrastructure organisation is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users — demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand.</p>\n<p>The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p>_Team Matching: Team matching is determined after the interview process based on interview performance, interests, and business priorities. Please note we may also consider you for different Infrastructure teams._</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Lead infrastructure projects from design through delivery, owning scope, execution, and outcomes</li>\n<li>Build and maintain systems that support AI clusters at massive scale (thousands to hundreds of thousands of machines)</li>\n<li>Partner with cloud providers and internal teams to solve compute, networking, and reliability challenges</li>\n<li>Tackle difficult technical problems in your domain and proactively fill gaps in tooling, documentation, and processes</li>\n<li>Contribute to operational practices including incident response, postmortems, and on-call rotations</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 6+ years of software engineering experience</li>\n<li>Have led technical projects end-to-end over multiple months, including scoping, breaking down work, and driving delivery</li>\n<li>Have deep knowledge of distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Solve hard problems independently and know when to pull others in</li>\n<li>Help teammates grow through knowledge sharing and thoughtful technical guidance</li>\n<li>Communicate clearly in design docs, presentations, and cross-functional discussions</li>\n</ul>\n<p><strong>Strong candidates may have:</strong></p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n<li>Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<p>_Deadline to apply: None. Applications will be reviewed on a rolling basis._</p>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</strong></p>\n<p><strong>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</strong></p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>\n<p>The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_886a66bf-10d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4915842008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"£240,000 - £325,000GBP","x-skills-required":["distributed systems","reliability","cloud platforms","Kubernetes","IaC","AWS/GCP","Python","Rust","Go","Java"],"x-skills-preferred":["security and privacy best practice expertise","machine learning infrastructure","GPUs","TPUs","Trainium","NCCL","low level systems experience","linux kernel tuning","eBPF"],"datePosted":"2026-03-08T13:46:27.991Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, UK"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, reliability, cloud platforms, Kubernetes, IaC, AWS/GCP, Python, Rust, Go, Java, security and privacy best practice expertise, machine learning infrastructure, GPUs, TPUs, Trainium, NCCL, low level systems experience, linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"GBP","value":{"@type":"QuantitativeValue","minValue":240000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_11a60d5a-f54"},"title":"Performance Engineer, GPU","description":"<p><strong>About the role:</strong></p>\n<p>Pioneering the next generation of AI requires breakthrough innovations in GPU performance and systems engineering. As a GPU Performance Engineer, you&#39;ll architect and implement the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. You&#39;ll be responsible for maximizing GPU utilization and performance at unprecedented scale, developing cutting-edge optimizations that directly enable new model capabilities and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you&#39;ll implement state-of-the-art techniques from custom kernel development to distributed system architectures. Your work will span the entire stack—from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p><strong>You might be a good fit if you:</strong></p>\n<ul>\n<li>Have deep experience with GPU programming and optimization at scale</li>\n<li>Are impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Can navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p><strong>Representative projects:</strong></p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p><strong>Deadline to apply:</strong> None. Applications will be reviewed on a rolling basis.</p>\n<p>The expected salary range for this position is:</p>\n<p>Annual Salary: $280,000 - $850,000USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_11a60d5a-f54","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000 - $850,000USD","x-skills-required":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"datePosted":"2026-03-08T13:45:05.412Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, custom kernel development, distributed system architectures, low-level tensor core optimizations, orchestrating thousands of GPUs, GPU kernel development, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, ML compilers & frameworks, PyTorch/JAX internals, torch.compile, XLA, custom operators, performance engineering, kernel fusion, memory bandwidth optimization, profiling with Nsight, distributed systems, NCCL, NVLink, collective communication, model parallelism, low-precision, INT8/FP8 quantization, mixed-precision techniques, production systems, large-scale training infrastructure, fault tolerance, cluster orchestration, GPU programming, optimization at scale, custom kernel development, distributed system architectures, low-level tensor core optimizations, orchestrating thousands of GPUs, GPU kernel development, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, ML compilers & frameworks, PyTorch/JAX internals, torch.compile, XLA, custom operators, performance engineering, kernel fusion, memory bandwidth optimization, profiling with Nsight, distributed systems, NCCL, NVLink, collective communication, model parallelism, low-precision, INT8/FP8 quantization, mixed-precision techniques, production systems, large-scale training infrastructure, fault tolerance, cluster orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f2722128-3e2"},"title":"Inference Runtime, Engineering Manager","description":"<p><strong>Inference Runtime, Engineering Manager</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$455K – $555K</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprise and developers alike to use and access our start-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineering leader who wants to build and lead the worlds leading AI systems and modeling engineers who take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Lead a team of engineers who are experts in working with distributed systems, with a deep understanding of model architecture, system co-design with research and production team,</li>\n</ul>\n<ul>\n<li>Work alongside partners in machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work in an outcome-oriented environment where everyone contributes across layers of the stack, from infra plumbing to performance tuning.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of GPU’s to utilize every FLOP and every GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 15 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus point if worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f2722128-3e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/4f998abb-4510-4bd3-9922-161599625171","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$455K – $555K","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","InfiniBand","MPI","NVLink","HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"x-skills-preferred":["HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"datePosted":"2026-03-06T18:39:15.426Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, InfiniBand, MPI, NVLink, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU optimization, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":455000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0e457a06-cee"},"title":"Training Performance Engineer","description":"<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$250K – $445K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong> Training Runtime designs the core distributed machine-learning training runtime that powers everything from early research experiments to frontier-scale model runs. With a dual mandate to accelerate researchers and enable frontier scale, we’re building a unified, modular runtime that meets researchers where they are and moves with them up the scaling curve.</p>\n<p>Our work focuses on three pillars: high-performance, asynchronous, zero-copy tensor and optimizer-state-aware data movement; performant, high-uptime, fault-tolerant training frameworks (training loop, state management, resilient checkpointing, deterministic orchestration, and observability); and distributed process management for long-lived, job-specific and user-provided processes.</p>\n<p>We integrate proven large-scale capabilities into a composable, developer-facing runtime so teams can iterate quickly and run reliably at any scale, partnering closely with model-stack, research, and platform teams. Success for us is measured by raising both training throughput (how fast models train) and researcher throughput (how fast ideas become experiments and products).</p>\n<p><strong>About the Role</strong> As a Training Performance Engineer, you’ll drive efficiency improvements across our distributed training stack. You’ll analyze large-scale training runs, identify utilization gaps, and design optimizations that push the boundaries of throughput and uptime. This role blends deep systems understanding with practical performance engineering — analyzing GPU kernel performance, collective communication throughput, investigating I/O bottlenecks, and sharding our models so we can train them at massive scale.</p>\n<p>You’ll help ensure that our clusters are running at peak performance, enabling OpenAI to train larger, more capable models with the same compute budget.</p>\n<p>This role is based in San Francisco, CA. We use a hybrid work model of three days in the office per week and offer relocation assistance to new employees.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Profile end-to-end training runs to identify performance bottlenecks across compute, communication, and storage.</li>\n<li>Optimize GPU utilization and throughput for large-scale distributed model training.</li>\n<li>Collaborate with runtime and systems engineers to improve kernel efficiency, scheduling, and collective communication performance.</li>\n<li>Implement model graph transforms to improve end to end throughput.</li>\n<li>Build tooling to monitor and visualize MFU, throughput, and uptime across clusters.</li>\n<li>Partner with researchers to ensure new model architectures scale efficiently during pre-training.</li>\n<li>Contribute to infrastructure decisions that improve reliability and efficiency of large training jobs.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Love optimizing performance and digging into systems to understand how every layer interacts.</li>\n<li>Have strong programming skills in Python and C++ (Rust or CUDA a plus).</li>\n<li>Have experience running distributed training jobs on multi-GPU systems or HPC clusters.</li>\n<li>Enjoy debugging complex distributed systems and measuring efficiency rigorously.</li>\n<li>Have exposure to frameworks like PyTorch, JAX, or TensorFlow and an understanding of how large-scale training loops are built.</li>\n<li>Are comfortable collaborating across teams and translating raw profiling data into practical engineering improvements.</li>\n</ul>\n<p><strong>Nice to have:</strong></p>\n<ul>\n<li>Familiarity with NCCL, MPI, or UCX communication libraries.</li>\n<li>Experience with large-scale data loading and checkpointing systems.</li>\n<li>Prior work on training runtime, distributed scheduling, or ML compiler optimization.</li>\n</ul>\n<p><strong>About OpenAI</strong> OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0e457a06-cee","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/6eb386ac-9056-4795-aa79-a27e105faf5c","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$250K – $445K","x-skills-required":["Python","C++","Rust","CUDA","PyTorch","JAX","TensorFlow","NCCL","MPI","UCX"],"x-skills-preferred":["Large-scale data loading and checkpointing systems","Training runtime, distributed scheduling, or ML compiler optimization"],"datePosted":"2026-03-06T18:32:30.509Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, C++, Rust, CUDA, PyTorch, JAX, TensorFlow, NCCL, MPI, UCX, Large-scale data loading and checkpointing systems, Training runtime, distributed scheduling, or ML compiler optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":250000,"maxValue":445000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d5390946-539"},"title":"Software Engineer, Model Inference","description":"<p><strong>Software Engineer, Model Inference</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprise and developers alike to use and access our start-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineer who wants to take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work alongside researchers to enable advanced research through awesome engineering.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 5 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus point if worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d5390946-539","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/83b6755d-7785-4186-9050-5ef3ad127941","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$295K – $555K • Offers Equity","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","HPC technologies","InfiniBand","MPI","NVLink","Azure VMs","GPU RAM","FLOP"],"x-skills-preferred":["modern ML architectures","intuition for optimizing performance","distributed systems","performance-critical distributed systems"],"datePosted":"2026-03-06T18:31:29.482Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, HPC technologies, InfiniBand, MPI, NVLink, Azure VMs, GPU RAM, FLOP, modern ML architectures, intuition for optimizing performance, distributed systems, performance-critical distributed systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_989f992b-6b2"},"title":"Software Engineer, Inference – AMD GPU Enablement","description":"<p><strong>Software Engineer, Inference – AMD GPU Enablement</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. You’ll work across the stack - from low-level kernel performance to high-level distributed execution - and collaborate closely with research, infra, and performance teams to ensure our largest models run smoothly on new hardware.</p>\n<p>This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up with a particular focus on advancing inference performance on AMD accelerators.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Own bring-up, correctness and performance of the OpenAI inference stack on AMD hardware.</li>\n</ul>\n<ul>\n<li>Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into a variety of GPU-backed systems.</li>\n</ul>\n<ul>\n<li>Debug and optimize distributed inference workloads across memory, network, and compute layers.</li>\n</ul>\n<ul>\n<li>Validate correctness, performance, and scalability of model execution on large GPU clusters.</li>\n</ul>\n<ul>\n<li>Collaborate with partner teams to design and optimize high-performance GPU kernels for accelerators using HIP, Triton, or other performance-focused frameworks.</li>\n</ul>\n<ul>\n<li>Collaborate with partner teams to build, integrate and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs.</li>\n</ul>\n<p><strong>You can thrive in this role if you:</strong></p>\n<ul>\n<li>Have experience writing or porting GPU kernels using HIP, CUDA, or Triton, and care deeply about low-level performance.</li>\n</ul>\n<ul>\n<li>Are familiar with communication libraries like NCCL/RCCL and understand their role in high-throughput model serving.</li>\n</ul>\n<ul>\n<li>Have worked on distributed inference systems and are comfortable scaling models across fleets of accelerators.</li>\n</ul>\n<ul>\n<li>Enjoy solving end-to-end performance challenges across hardware, system libraries, and orchestration layers.</li>\n</ul>\n<ul>\n<li>Are excited to be part of a small, fast-moving team building new infrastructure from first principles.</li>\n</ul>\n<p><strong>Nice to Have:</strong></p>\n<ul>\n<li>Contributions to open-source libraries like RCCL, Triton, or vLLM.</li>\n</ul>\n<ul>\n<li>Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling.</li>\n</ul>\n<ul>\n<li>Prior experience deploying inference on other non-NVIDIA GPU environments.</li>\n</ul>\n<ul>\n<li>Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_989f992b-6b2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/9b79406c-89a8-49bd-8a38-e72db80996e9","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$295K – $555K • Offers Equity","x-skills-required":["GPU kernels","HIP","CUDA","Triton","NCCL/RCCL","distributed inference systems","GPU performance tools","memory/comms profiling"],"x-skills-preferred":["open-source libraries","GPU performance tools","memory/comms profiling"],"datePosted":"2026-03-06T18:28:36.084Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU kernels, HIP, CUDA, Triton, NCCL/RCCL, distributed inference systems, GPU performance tools, memory/comms profiling, open-source libraries, GPU performance tools, memory/comms profiling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":555000,"unitText":"YEAR"}}}]}