{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/deployment-infrastructure"},"x-facet":{"type":"skill","slug":"deployment-infrastructure","display":"Deployment Infrastructure","count":1},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_74be15a1-bce"},"title":"Software Engineer, Inference Deployment","description":"<p>Our mandate is to make inference deployment boring and unattended. We serve Claude to millions of users across GPUs, TPUs, and Trainium, and every model update must reach production safely, quickly, and without disrupting service. As a Software Engineer on the Launch Engineering team, you&#39;ll design and build the deployment infrastructure that moves inference code from merge to production.</p>\n<p>This is a resource-constrained optimization problem at its core: validation and deployment consume the same accelerator chips that serve customer traffic, so your deploys compete with live user requests for the same hardware. Every model brings different fleet sizes, startup times, and correctness requirements, so the system must adapt continuously. You&#39;ll build systems that navigate these constraints: orchestrating validation, scheduling deployments intelligently, and driving down cycle time from merge to production.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Own deployment orchestration that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions</li>\n<li>Improve capacity-aware deployment scheduling to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes</li>\n<li>Extend deployment observability: dashboards and tooling that answer &quot;what code is running in production,&quot; &quot;where is my commit,&quot; and &quot;what validation passed for this deploy&quot;</li>\n<li>Drive down cycle time from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism</li>\n<li>Optimize fleet rollout strategies for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity</li>\n<li>Evolve self-service model onboarding so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement</li>\n<li>Partner across the Inference organization with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems</li>\n</ul>\n<p>You May Be a Good Fit If You Have:</p>\n<ul>\n<li>5+ years of experience building deployment, release, or delivery infrastructure at scale</li>\n<li>Strong software engineering skills with experience designing systems that manage complex state machines and multi-stage pipelines</li>\n<li>Experience with deployment systems where resource constraints shape the design, whether that&#39;s fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows</li>\n<li>A track record of building automation that measurably improves deployment velocity and reliability</li>\n<li>Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration</li>\n<li>Comfort working across the stack, from backend services and databases to CLI tools and web UIs</li>\n<li>Strong communication skills and the ability to work closely with on-call engineers, model teams, and infrastructure partners</li>\n</ul>\n<p>Strong Candidates May Also Have:</p>\n<ul>\n<li>Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)</li>\n<li>Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)</li>\n<li>Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback</li>\n<li>Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)</li>\n<li>Experience with Python and/or Rust in production systems</li>\n</ul>\n<p>The annual compensation range for this role is $320,000-$485,000 USD.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_74be15a1-bce","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5111745008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$485,000 USD","x-skills-required":["deployment infrastructure","software engineering","complex state machines","multi-stage pipelines","Kubernetes-based deployments","container orchestration","backend services","databases","CLI tools","web UIs"],"x-skills-preferred":["ML inference","training infrastructure deployment","capacity planning","resource-constrained scheduling","deployments","progressive delivery","Python","Rust"],"datePosted":"2026-04-18T15:53:04.252Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"deployment infrastructure, software engineering, complex state machines, multi-stage pipelines, Kubernetes-based deployments, container orchestration, backend services, databases, CLI tools, web UIs, ML inference, training infrastructure deployment, capacity planning, resource-constrained scheduling, deployments, progressive delivery, Python, Rust","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":485000,"unitText":"YEAR"}}}]}