{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/tpus"},"x-facet":{"type":"skill","slug":"tpus","display":"TPUs","count":7},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_59e88547-efc"},"title":"Senior Software Engineer, Systems","description":"<p>About Anthropic</p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.</p>\n<p>About the Role</p>\n<p>Anthropic&#39;s Infrastructure organization is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users, demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand. 
The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Lead infrastructure projects from design through delivery, owning scope, execution, and outcomes</li>\n<li>Build and maintain systems that support AI clusters at massive scale (thousands to hundreds of thousands of machines)</li>\n<li>Partner with cloud providers and internal teams to solve compute, networking, and reliability challenges</li>\n<li>Tackle difficult technical problems in your domain and proactively fill gaps in tooling, documentation, and processes</li>\n<li>Contribute to operational practices including incident response, postmortems, and on-call rotations</li>\n</ul>\n<p>Benefits</p>\n<ul>\n<li>Competitive compensation and benefits</li>\n<li>Optional equity donation matching</li>\n<li>Generous vacation and parental leave</li>\n<li>Flexible working hours</li>\n<li>Lovely office space in which to collaborate with colleagues</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>6+ years of software engineering experience</li>\n<li>Have led technical projects end-to-end over multiple months, including scoping, breaking down work, and driving delivery</li>\n<li>Have deep knowledge of distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Solve hard problems independently and know when to pull others in</li>\n<li>Help teammates grow through knowledge sharing and thoughtful technical guidance</li>\n<li>Communicate clearly in design docs, presentations, and cross-functional discussions</li>\n</ul>\n<p>Preferred Qualifications</p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking 
infrastructure like NCCL</li>\n<li>Low level systems experience, for example linux kernel tuning and eBPF</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_59e88547-efc","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4915842008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"£240,000-£325,000 GBP","x-skills-required":["Distributed systems","Reliability","Cloud platforms","Kubernetes","IaC","AWS/GCP","Systems language","Python","Rust","Go","Java"],"x-skills-preferred":["Security and privacy best practice","Machine learning infrastructure","GPUs","TPUs","Trainium","Networking infrastructure","NCCL","Low level systems experience","Linux kernel tuning","eBPF"],"datePosted":"2026-04-18T15:48:47.617Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, UK"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Distributed systems, Reliability, Cloud platforms, Kubernetes, IaC, AWS/GCP, Systems language, Python, Rust, Go, Java, Security and privacy best practice, Machine learning infrastructure, GPUs, TPUs, Trainium, Networking infrastructure, NCCL, Low level systems experience, Linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"GBP","value":{"@type":"QuantitativeValue","minValue":240000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_173381a1-8d0"},"title":"Software Engineer, Sandboxing (Systems)","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission 
is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>Responsibilities:</strong></p>\n<p>We are seeking a Linux OS and System Programming Subject Matter Expert to join our Infrastructure team. In this role, you&#39;ll work on accelerating and optimising our virtualisation and VM workloads that power our AI infrastructure. Your expertise in low-level system programming, kernel optimisation, and virtualisation technologies will be crucial in ensuring Anthropic can scale our compute infrastructure efficiently and reliably for training and serving frontier AI models.</p>\n<ul>\n<li>Optimise our virtualisation stack, improving performance, reliability, and efficiency of our VM environments</li>\n<li>Design and implement kernel modules, drivers, and system-level components to enhance our compute infrastructure</li>\n<li>Investigate and resolve performance bottlenecks in virtualised environments</li>\n<li>Collaborate with cloud engineering teams to optimise interactions between our workloads and underlying hardware</li>\n<li>Develop tooling for monitoring and improving virtualisation performance</li>\n<li>Work with our ML engineers to understand their computational needs and optimise our systems accordingly</li>\n<li>Contribute to the design and implementation of our next-generation compute infrastructure</li>\n<li>Share knowledge with team members on low-level systems programming and Linux kernel internals</li>\n<li>Partner with cloud providers to influence hardware and platform features for AI workloads</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have experience with Linux kernel development, system programming, or related low-level software engineering</li>\n<li>Understand 
virtualisation technologies (KVM, Xen, QEMU, etc.) and their performance characteristics</li>\n<li>Have experience optimising system performance for compute-intensive workloads</li>\n<li>Are familiar with modern CPU architectures and memory systems</li>\n<li>Have strong C/C++ programming skills and ideally experience with systems languages like Rust</li>\n<li>Understand Linux resource management, scheduling, and memory management</li>\n<li>Have experience profiling and debugging system-level performance issues</li>\n<li>Are comfortable diving into unfamiliar codebases and technical domains</li>\n<li>Are results-oriented, with a bias towards practical solutions and measurable impact</li>\n<li>Care about the societal impacts of AI and are passionate about building safe, reliable systems</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>GPU virtualisation and acceleration technologies</li>\n<li>Cloud infrastructure at scale (AWS, GCP)</li>\n<li>Container technologies and their underlying implementation (Docker, containerd, runc, OCI)</li>\n<li>eBPF programming and kernel tracing tools</li>\n<li>OS-level security hardening and isolation techniques</li>\n<li>Developing custom scheduling algorithms for specialised workloads</li>\n<li>Performance optimisation for ML/AI specific workloads</li>\n<li>Network stack optimisation and high-performance networking</li>\n<li>Experience with TPUs, custom ASICs, or other ML accelerators</li>\n</ul>\n<p><strong>Representative projects:</strong></p>\n<ul>\n<li>Optimising kernel parameters and VM configurations to reduce inference latency for large language models</li>\n<li>Implementing custom memory management schemes for large-scale distributed training</li>\n<li>Developing specialised I/O schedulers to prioritise ML workloads</li>\n<li>Creating lightweight virtualisation solutions tailored for AI inference</li>\n<li>Building monitoring and instrumentation tools to identify system-level 
bottlenecks</li>\n<li>Enhancing communication between VMs for distributed training workloads</li>\n</ul>\n<p><strong>Deadline to apply:</strong></p>\n<p>None. Applications will be reviewed on a rolling basis.</p>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong></p>\n<p>We require at least a Bachelor&#39;s degree in a related field or equivalent experience.</p>\n<p><strong>Location-based hybrid policy:</strong></p>\n<p>Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong></p>\n<p>We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong></p>\n<p>Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</p>\n<p><strong>Your safety matters to us.</strong></p>\n<p>To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. 
If you&#39;re ever unsure about the authenticity of an email or a request, please reach out to us directly.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_173381a1-8d0","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5025591008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$300,000 - $405,000 USD","x-skills-required":["Linux kernel development","System programming","Low-level software engineering","Virtualisation technologies","Kernel optimisation","C/C++ programming","Rust programming","Linux resource management","Scheduling","Memory management"],"x-skills-preferred":["GPU virtualisation","Cloud infrastructure","Container technologies","eBPF programming","OS-level security hardening","Custom scheduling algorithms","Performance optimisation","Network stack optimisation","TPUs","Custom ASICs"],"datePosted":"2026-03-08T14:03:08.579Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux kernel development, System programming, Low-level software engineering, Virtualisation technologies, Kernel optimisation, C/C++ programming, Rust programming, Linux resource management, Scheduling, Memory management, GPU virtualisation, Cloud infrastructure, Container technologies, eBPF programming, OS-level security hardening, Custom scheduling algorithms, Performance optimisation, Network stack optimisation, TPUs, Custom 
ASICs","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":300000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0a7113f5-76c"},"title":"Engineering Manager, Cloud Inference AWS","description":"<p><strong>About the role</strong></p>\n<p>We are seeking an experienced Engineering Manager to lead the Cloud Inference team for AWS. You will lead your team to scale and optimize Claude to serve the massive audiences of developers and enterprise companies using AWS. You will own the end-to-end product of Claude on AWS, including API, load balancing, inference, capacity and operations. Your team will ensure our LLMs meet rigorous performance, safety and security standards and enhance our core infrastructure for packaging, testing, and deploying inference technology across the globe. Your work will increase the scale at which Anthropic operates and accelerate our ability to reliably launch new frontier models and innovative features to customers across all platforms.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Set technical strategy and oversee development of Claude on AWS across all layers of the technical stack.</li>\n<li>Collaborate across teams and companies to deeply understand product, infrastructure, operations and capacity needs, identifying potential solutions to support frontier LLM serving</li>\n<li>Work closely with cross-functional stakeholders across companies to align on goals and drive outcomes</li>\n<li>Create clarity for the team and stakeholders in an ambiguous and evolving environment</li>\n<li>Take an inclusive approach to hiring and coaching top technical talent, and support a high performing team</li>\n<li>Design and run processes (e.g. 
postmortem review, incident response, on-call rotations) that help the team operate effectively and never fail the same way twice</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 10+ years of experience in high-scale, high-reliability software development, particularly infrastructure or capacity management</li>\n<li>Have 5+ years of engineering management experience</li>\n<li>Experience recruiting, scaling, and retaining engineering talent in a high growth environment</li>\n<li>Have experience scaling products, resources and operations to accommodate rapid growth</li>\n<li>Are deeply interested in the potential transformative effects of advanced AI systems and are committed to ensuring their safe development</li>\n<li>Excel at building strong relationships and strategy with stakeholders across engineering, product, finance, and sales</li>\n<li>Have experience working with external partners to align goals and deliver impact</li>\n<li>Enjoy working in a fast-paced, early environment; comfortable with adapting priorities as driven by the rapidly evolving AI space</li>\n<li>Have excellent written and verbal communication skills</li>\n<li>Demonstrated success building a culture of belonging and engineering excellence</li>\n<li>Are motivated by developing AI responsibly and safely</li>\n<li>Are willing and able to travel frequently between Seattle and the SF Bay Area</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Experience as a Product Manager</li>\n<li>Experience with deployment and capacity management automation</li>\n<li>Security and privacy best practice expertise</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent 
experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong> Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</p>\n<p><strong>Your safety matters to us.</strong> To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. 
We view AI research as a collaborative effort, and we work closely with other researchers, engineers, and experts to advance our understanding of AI and its applications.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0a7113f5-76c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5141377008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$405,000 - $485,000 USD","x-skills-required":["high-scale, high-reliability software development","infrastructure or capacity management","engineering management","recruiting, scaling, and retaining engineering talent","scaling products, resources and operations","machine learning infrastructure","deployment and capacity management automation","security and privacy best practice expertise"],"x-skills-preferred":["experience with GPUs, TPUs, or Trainium","experience as a Product Manager","experience with networking infrastructure like NCCL"],"datePosted":"2026-03-08T13:56:51.226Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"high-scale, high-reliability software development, infrastructure or capacity management, engineering management, recruiting, scaling, and retaining engineering talent, scaling products, resources and operations, machine learning infrastructure, deployment and capacity management automation, security and privacy best practice expertise, experience with GPUs, TPUs, or Trainium, experience as a Product Manager, experience with networking infrastructure like 
NCCL","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_25934fbc-c50"},"title":"Staff / Senior Software Engineer, Cloud Inference","description":"<p><strong>About the Role</strong></p>\n<p>The Cloud Inference team scales and optimizes Claude to serve the massive audiences of developers and enterprise companies across AWS, GCP, Azure, and future cloud service providers (CSPs). We own the end-to-end product of Claude on each cloud platform—from API integration and intelligent request routing to inference execution, capacity management, and day-to-day operations.</p>\n<p>Our engineers are extremely high leverage: we simultaneously drive multiple major revenue streams while optimizing one of Anthropic&#39;s most precious resources—compute. As we expand to more cloud platforms, the complexity of managing inference efficiently across providers with different hardware, networking stacks, and operational models grows significantly. 
We need engineers who can navigate these platform differences, build robust abstractions that work across providers, and make smart infrastructure decisions that keep us cost-effective at massive scale.</p>\n<p>Your work will increase the scale at which our services operate, accelerate our ability to reliably launch new frontier models and innovative features to customers across all platforms, and ensure our LLMs meet rigorous safety, performance, and security standards.</p>\n<p><strong>What You&#39;ll Do</strong></p>\n<ul>\n<li>Design and build infrastructure that serves Claude across multiple CSPs, accounting for differences in compute hardware, networking, APIs, and operational models</li>\n<li>Collaborate with CSP partner engineering teams to resolve operational issues, influence provider roadmaps, and stand up end-to-end serving on new cloud platforms</li>\n<li>Design and evolve CI/CD automation systems, including validation and deployment pipelines, that reliably ship new model versions to millions of users across cloud platforms without regressions</li>\n<li>Design interfaces and tooling abstractions across CSPs that enable cost-effective inference management, scale across providers, and reduce per-platform complexity</li>\n<li>Contribute to capacity planning and autoscaling strategies that dynamically match supply with demand across CSP validation and production workloads</li>\n<li>Optimize inference cost and performance across providers—designing workload placement and routing systems that direct requests to the most cost-effective accelerator and region</li>\n<li>Contribute to inference features that must work consistently across all platforms</li>\n<li>Analyze observability data across providers to identify performance bottlenecks, cost anomalies, and regressions, and drive remediation based on real-world production workloads</li>\n</ul>\n<p><strong>You May Be a Good Fit If You:</strong></p>\n<ul>\n<li>Have significant software engineering experience, 
with a strong background in high-performance, large-scale distributed systems serving millions of users</li>\n<li>Have experience building or operating services on at least one major cloud platform (AWS, GCP, or Azure), with exposure to Kubernetes, Infrastructure as Code or container orchestration</li>\n<li>Have strong interest in inference</li>\n<li>Thrive in cross-functional collaboration with both internal teams and external partners</li>\n<li>Are a fast learner who can quickly ramp up on new technologies, hardware platforms, and provider ecosystems</li>\n<li>Are highly autonomous and self-driven, taking ownership of problems end-to-end with a bias toward flexibility and high-impact work</li>\n<li>Pick up slack, even when it goes outside your job description</li>\n</ul>\n<p><strong>Strong Candidates May Also Have Experience With</strong></p>\n<ul>\n<li>Direct experience working with CSP partner teams to scale infrastructure or products across multiple platforms, navigating differences in networking, security, privacy, billing, and managed service offerings</li>\n<li>A background in building platform-agnostic tooling or abstraction layers that work across cloud providers</li>\n<li>Hands-on experience with capacity management, cost optimization, or resource planning at scale across heterogeneous environments</li>\n<li>Strong familiarity with LLM inference optimization, batching, caching, and serving strategies</li>\n<li>Experience with Machine learning infrastructure including GPUs, TPUs, Trainium, or other AI accelerators</li>\n<li>Background designing and building CI/CD systems that automate deployment and validation across cloud environments</li>\n<li>Solid understanding of multi-region deployments, geographic routing, and global traffic management</li>\n<li>Proficiency in Python or Rust</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent 
experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_25934fbc-c50","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5107466008","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$300,000 - $485,000 USD","x-skills-required":["Software engineering","Cloud infrastructure","Kubernetes","Infrastructure as Code","Container orchestration","LLM inference optimization","Batching","Caching","Serving strategies","Machine learning infrastructure","GPUs","TPUs","Trainium","AI accelerators","CI/CD systems","Deployment and validation","Cloud environments","Multi-region deployments","Geographic routing","Global traffic management"],"x-skills-preferred":["Python","Rust","Cloud platforms","Networking","Security","Privacy","Billing","Managed service offerings","Platform-agnostic tooling","Abstraction layers","Capacity management","Cost optimization","Resource planning"],"datePosted":"2026-03-08T13:49:59.956Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Software 
engineering, Cloud infrastructure, Kubernetes, Infrastructure as Code, Container orchestration, LLM inference optimization, Batching, Caching, Serving strategies, Machine learning infrastructure, GPUs, TPUs, Trainium, AI accelerators, CI/CD systems, Deployment and validation, Cloud environments, Multi-region deployments, Geographic routing, Global traffic management, Python, Rust, Cloud platforms, Networking, Security, Privacy, Billing, Managed service offerings, Platform-agnostic tooling, Abstraction layers, Capacity management, Cost optimization, Resource planning","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":300000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3b20b513-ea1"},"title":"Staff+ Software Engineer, Systems","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>Anthropic&#39;s Infrastructure organisation is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. 
The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users — demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand.</p>\n<p>The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p><em>Team Matching: Team matching is determined after the interview process based on interview performance, interests, and business priorities. Please note we may also consider you for different Infrastructure teams.</em></p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Own the technical strategy and roadmap for your area, translating team-level goals into concrete execution plans</li>\n<li>Drive cross-team initiatives to build and scale AI clusters (thousands to hundreds of thousands of machines)</li>\n<li>Define infrastructure architecture, ensuring the hardest problems get solved — whether by you directly or by working through others</li>\n<li>Partner with cloud providers and internal stakeholders to shape long-term compute, data, and infrastructure strategy</li>\n<li>Establish and evolve operational excellence practices (incident response, postmortem culture, on-call)</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 10+ years of software engineering experience</li>\n<li>Have led complex, multi-quarter technical initiatives that span multiple teams or systems</li>\n<li>Can set technical direction for a team, not just execute within it</li>\n<li>Have deep expertise in distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Naturally uplevel the engineers around you and can redirect efforts when things are heading off 
track</li>\n<li>Build alignment across senior stakeholders and communicate effectively at all levels</li>\n</ul>\n<p><strong>Strong candidates may have:</strong></p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Low-level systems experience, for example Linux kernel tuning and eBPF</li>\n<li>Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<p><em>Deadline to apply: None. Applications will be reviewed on a rolling basis.</em></p>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</strong></p>\n<p><strong>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. 
In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</strong></p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>\n<p>The easiest way to understand our research directions is to read our recent research. 
This research continues many of the directions our team worked on prior to Anthropic.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3b20b513-ea1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5108817008","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$405,000 - $485,000 USD","x-skills-required":["distributed systems","reliability","cloud platforms","Kubernetes","IaC","AWS/GCP","Python","Rust","Go","Java"],"x-skills-preferred":["security and privacy best practice expertise","machine learning infrastructure","GPUs","TPUs","Trainium","NCCL","low level systems experience","linux kernel tuning","eBPF"],"datePosted":"2026-03-08T13:49:17.054Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, reliability, cloud platforms, Kubernetes, IaC, AWS/GCP, Python, Rust, Go, Java, security and privacy best practice expertise, machine learning infrastructure, GPUs, TPUs, Trainium, NCCL, low level systems experience, linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_886a66bf-10d"},"title":"Senior Software Engineer, Systems","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. 
We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>Anthropic&#39;s Infrastructure organisation is foundational to our mission of developing AI systems that are reliable, interpretable, and steerable. The systems we build determine how quickly we can train new models, how reliably we can run safety experiments, and how effectively we can scale Claude to millions of users — demonstrating that safe, reliable infrastructure and frontier capabilities can go hand in hand.</p>\n<p>The Systems engineering team owns compute uptime and resilience at massive scale, building the clusters, automation, and observability that make frontier AI research possible and safely deployable to customers.</p>\n<p>_Team Matching: Team matching is determined after the interview process based on interview performance, interests, and business priorities. 
Please note we may also consider you for different Infrastructure teams._</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Lead infrastructure projects from design through delivery, owning scope, execution, and outcomes</li>\n<li>Build and maintain systems that support AI clusters at massive scale (thousands to hundreds of thousands of machines)</li>\n<li>Partner with cloud providers and internal teams to solve compute, networking, and reliability challenges</li>\n<li>Tackle difficult technical problems in your domain and proactively fill gaps in tooling, documentation, and processes</li>\n<li>Contribute to operational practices including incident response, postmortems, and on-call rotations</li>\n</ul>\n<p><strong>You may be a good fit if you:</strong></p>\n<ul>\n<li>Have 6+ years of software engineering experience</li>\n<li>Have led technical projects end-to-end over multiple months, including scoping, breaking down work, and driving delivery</li>\n<li>Have deep knowledge of distributed systems, reliability, and cloud platforms (Kubernetes, IaC, AWS/GCP)</li>\n<li>Are strong in at least one systems language (Python, Rust, Go, Java)</li>\n<li>Solve hard problems independently and know when to pull others in</li>\n<li>Help teammates grow through knowledge sharing and thoughtful technical guidance</li>\n<li>Communicate clearly in design docs, presentations, and cross-functional discussions</li>\n</ul>\n<p><strong>Strong candidates may have:</strong></p>\n<ul>\n<li>Security and privacy best practice expertise</li>\n<li>Experience with machine learning infrastructure like GPUs, TPUs, or Trainium, as well as supporting networking infrastructure like NCCL</li>\n<li>Low-level systems experience, for example Linux kernel tuning and eBPF</li>\n<li>Technical expertise: Quickly understanding systems design tradeoffs, keeping track of rapidly evolving software systems</li>\n</ul>\n<p>_Deadline to apply: None. 
Applications will be reviewed on a rolling basis._</p>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</strong></p>\n<p><strong>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</strong></p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. 
At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>\n<p>The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_886a66bf-10d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4915842008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"£240,000 - £325,000 GBP","x-skills-required":["distributed systems","reliability","cloud platforms","Kubernetes","IaC","AWS/GCP","Python","Rust","Go","Java"],"x-skills-preferred":["security and privacy best practice expertise","machine learning infrastructure","GPUs","TPUs","Trainium","NCCL","low level systems experience","linux kernel tuning","eBPF"],"datePosted":"2026-03-08T13:46:27.991Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, UK"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"distributed systems, reliability, cloud platforms, Kubernetes, IaC, AWS/GCP, Python, Rust, Go, Java, 
security and privacy best practice expertise, machine learning infrastructure, GPUs, TPUs, Trainium, NCCL, low level systems experience, linux kernel tuning, eBPF","baseSalary":{"@type":"MonetaryAmount","currency":"GBP","value":{"@type":"QuantitativeValue","minValue":240000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_4e51470c-8f1"},"title":"Software Engineer, Accelerators","description":"<p><strong>Software Engineer, Accelerators</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $380K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>The Kernels team at OpenAI builds the low-level software that accelerates our most ambitious AI research.</p>\n<p>We work at the boundary of hardware and software, developing high-performance kernels, distributed system optimizations, and runtime improvements to make large-scale training and inference more efficient.</p>\n<p>Our work 
enables OpenAI to push the limits by ensuring models - from LLMs to recommender systems - run reliably on advanced supercomputing platforms. That includes adapting our software stack to new types of accelerators, tuning system performance end-to-end, and removing bottlenecks across every layer of the stack.</p>\n<p><strong>About the Role</strong></p>\n<p>On the Accelerators team, you will help OpenAI evaluate and bring up new compute platforms that can support large-scale AI training and inference.</p>\n<p>Your work will range from prototyping system software on new accelerators to enabling performance optimizations across our AI workloads.</p>\n<p>You’ll work across the stack, spanning both hardware and software - working on kernels, sharding strategies, scaling across distributed systems, and performance modeling.</p>\n<p>You&#39;ll help adapt OpenAI&#39;s software stack to non-traditional hardware and drive efficiency improvements in core AI workloads. This is not a compiler-focused role; rather, it bridges ML algorithms with system performance - especially at scale.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Prototype and enable OpenAI&#39;s AI software stack on new, exploratory accelerator platforms.</li>\n</ul>\n<ul>\n<li>Optimize large-scale model performance (LLMs, recommender systems, distributed AI workloads) for diverse hardware environments.</li>\n</ul>\n<ul>\n<li>Develop kernels, sharding mechanisms, and system scaling strategies tailored to emerging accelerators.</li>\n</ul>\n<ul>\n<li>Collaborate on optimizations at the model code level (e.g. 
PyTorch) and below to enhance performance on non-traditional hardware.</li>\n</ul>\n<p>Perform system-level performance modeling, debug bottlenecks, and drive end-to-end optimization.</p>\n<ul>\n<li>Work with hardware teams and vendors to evaluate alternatives to existing platforms and adapt the software stack to their architectures.</li>\n</ul>\n<ul>\n<li>Contribute to runtime improvements, compute/communication overlapping, and scaling efforts for frontier AI workloads.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>3+ years of experience working on AI infrastructure, including kernels, systems, or hardware-software co-design</li>\n</ul>\n<ul>\n<li>Hands-on experience with accelerator platforms for AI at data center scale (e.g., TPUs, custom silicon, exploratory architectures).</li>\n</ul>\n<ul>\n<li>Strong understanding of kernels, sharding, runtime systems, or distributed scaling techniques.</li>\n</ul>\n<ul>\n<li>Familiarity with optimizing LLMs, CNNs, or recommender models for hardware efficiency.</li>\n</ul>\n<ul>\n<li>Experience with performance modeling, system debugging, and software stack adaptation for novel architectures.</li>\n</ul>\n<ul>\n<li>Exposure to mobile accelerators is welcome, but experience enabling data center-scale AI hardware is preferred.</li>\n</ul>\n<ul>\n<li>Ability to operate across multiple levels of the stack, rapidly prototype solutions, and navigate ambiguity in early hardware bring-up phases</li>\n</ul>\n<ul>\n<li>Interest in shaping the future of AI compute through exploration of alternatives to mainstream accelerators.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_4e51470c-8f1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/f386b209-1259-4b79-bf5a-aa97fc7ce77b","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$295K – $380K • Offers Equity","x-skills-required":["AI infrastructure","kernels","systems","hardware-software co-design","accelerator platforms","TPUs","custom silicon","exploratory architectures","kernels","sharding","runtime systems","distributed scaling techniques","LLMs","CNNs","recommender models","hardware efficiency","performance modeling","system debugging","software stack adaptation","novel architectures"],"x-skills-preferred":["mobile accelerators","data center-scale AI hardware"],"datePosted":"2026-03-06T18:27:12.141Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI infrastructure, kernels, systems, hardware-software co-design, accelerator platforms, TPUs, custom silicon, exploratory architectures, kernels, sharding, runtime systems, distributed scaling techniques, LLMs, CNNs, recommender models, hardware efficiency, performance modeling, system debugging, software stack adaptation, novel architectures, mobile accelerators, data center-scale AI hardware","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":380000,"unitText":"YEAR"}}}]}