{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/job-scheduling"},"x-facet":{"type":"skill","slug":"job-scheduling","display":"Job Scheduling","count":6},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8f6ef3b1-c9b"},"title":"Technical Program Manager, Compute","description":"<p>As a Technical Program Manager on the Compute team, you will help drive the planning, coordination, and execution of programs that keep Anthropic&#39;s compute infrastructure running efficiently at scale.</p>\n<p>Our compute fleet is the foundation on which every model training run, evaluation, and inference workload depends. 
You&#39;ll join a small, high-impact TPM team and take ownership of critical workstreams across the compute lifecycle, from how supply is procured and brought online, to how capacity is allocated and utilized across teams.</p>\n<p>You&#39;ll partner with Infrastructure, Systems, Research, Finance, and Capacity Engineering to shape the processes, tooling, and coordination mechanisms that allow Anthropic to move fast while managing an increasingly complex compute environment.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Own and drive critical programs across the compute lifecycle, coordinating execution across multiple engineering, research, and operations teams</li>\n<li>Build and maintain operational visibility into the compute fleet, ensuring the organization has a clear picture of supply, demand, utilization, and health</li>\n<li>Lead cross-functional coordination for compute transitions: bringing new capacity online, migrating workloads, and managing decommissions across cloud providers and hardware platforms</li>\n<li>Partner with engineering and research leadership to navigate competing priorities and drive alignment on how compute resources are planned, allocated, and used</li>\n<li>Identify and close operational gaps across the compute pipeline, whether through new tooling, improved processes, or better cross-team communication</li>\n<li>Own trade-off discussions between utilization, cost, latency, and reliability, synthesizing inputs from technical and business stakeholders and communicating decisions to leadership</li>\n<li>Develop and improve the processes and frameworks the team uses to plan, track, and execute compute programs at increasing scale and complexity</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 7+ years of technical program management experience in infrastructure, platform engineering, or compute-intensive environments</li>\n<li>Have led complex, cross-functional programs involving multiple engineering teams with competing 
priorities and ambiguous requirements</li>\n<li>Have experience working with research or ML teams and translating their needs into operational plans and technical requirements</li>\n<li>Are comfortable diving deep into technical details (cloud infrastructure, cluster management, job scheduling, resource orchestration) while maintaining program-level visibility</li>\n<li>Thrive in ambiguous, fast-moving environments where you need to define scope and build processes from the ground up</li>\n<li>Have strong communication skills and can engage credibly with engineers, researchers, finance, and executive leadership</li>\n<li>Have a track record of building trust with engineering teams and driving changes through influence rather than authority</li>\n</ul>\n<p>Strong candidates may also have:</p>\n<ul>\n<li>Experience managing compute capacity across multiple cloud providers (AWS, GCP, Azure) or hybrid cloud/on-premises environments</li>\n<li>Familiarity with job scheduling, resource orchestration, or workload management systems (Kubernetes, Slurm, Borg, YARN, or custom schedulers)</li>\n<li>Experience with GPU or accelerator infrastructure, including the unique challenges of large-scale ML training and inference workloads</li>\n<li>Built or improved observability for infrastructure systems: dashboards, alerting, efficiency metrics, or cost attribution</li>\n<li>Capacity planning experience including demand forecasting, cost modeling, or hardware lifecycle management</li>\n<li>Scaled through hypergrowth in AI/ML, HPC, or large-scale cloud environments</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8f6ef3b1-c9b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5138044008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$290,000-$365,000 USD","x-skills-required":["Technical Program Management","Cloud Infrastructure","Cluster Management","Job Scheduling","Resource Orchestration","Compute Capacity Management","GPU or Accelerator Infrastructure","Observability for Infrastructure Systems","Capacity Planning"],"x-skills-preferred":["Kubernetes","Slurm","Borg","YARN","Custom Schedulers","Demand Forecasting","Cost Modeling","Hardware Lifecycle Management"],"datePosted":"2026-04-18T15:53:42.458Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Technical Program Management, Cloud Infrastructure, Cluster Management, Job Scheduling, Resource Orchestration, Compute Capacity Management, GPU or Accelerator Infrastructure, Observability for Infrastructure Systems, Capacity Planning, Kubernetes, Slurm, Borg, YARN, Custom Schedulers, Demand Forecasting, Cost Modeling, Hardware Lifecycle Management","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":290000,"maxValue":365000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_e8e9acc0-a63"},"title":"Technical Program Manager, Compute","description":"<p>As a Technical Program Manager on the Compute team, you will help drive the planning, coordination, and execution of programs that keep 
Anthropic&#39;s compute infrastructure running efficiently at scale.</p>\n<p>Our compute fleet is the foundation on which every model training run, evaluation, and inference workload depends. You&#39;ll join a small, high-impact TPM team and take ownership of critical workstreams across the compute lifecycle, from how supply is procured and brought online, to how capacity is allocated and utilized across teams.</p>\n<p>You&#39;ll partner with Infrastructure, Systems, Research, Finance, and Capacity Engineering to shape the processes, tooling, and coordination mechanisms that allow Anthropic to move fast while managing an increasingly complex compute environment.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Own and drive critical programs across the compute lifecycle, coordinating execution across multiple engineering, research, and operations teams</li>\n<li>Build and maintain operational visibility into the compute fleet, ensuring the organization has a clear picture of supply, demand, utilization, and health</li>\n<li>Lead cross-functional coordination for compute transitions: bringing new capacity online, migrating workloads, and managing decommissions across cloud providers and hardware platforms</li>\n<li>Partner with engineering and research leadership to navigate competing priorities and drive alignment on how compute resources are planned, allocated, and used</li>\n<li>Identify and close operational gaps across the compute pipeline, whether through new tooling, improved processes, or better cross-team communication</li>\n<li>Own trade-off discussions between utilization, cost, latency, and reliability, synthesizing inputs from technical and business stakeholders and communicating decisions to leadership</li>\n<li>Develop and improve the processes and frameworks the team uses to plan, track, and execute compute programs at increasing scale and complexity</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 7+ years of technical program management 
experience in infrastructure, platform engineering, or compute-intensive environments</li>\n<li>Have led complex, cross-functional programs involving multiple engineering teams with competing priorities and ambiguous requirements</li>\n<li>Have experience working with research or ML teams and translating their needs into operational plans and technical requirements</li>\n<li>Are comfortable diving deep into technical details (cloud infrastructure, cluster management, job scheduling, resource orchestration) while maintaining program-level visibility</li>\n<li>Thrive in ambiguous, fast-moving environments where you need to define scope and build processes from the ground up</li>\n<li>Have strong communication skills and can engage credibly with engineers, researchers, finance, and executive leadership</li>\n<li>Have a track record of building trust with engineering teams and driving changes through influence rather than authority</li>\n</ul>\n<p>Strong candidates may also have:</p>\n<ul>\n<li>Experience managing compute capacity across multiple cloud providers (AWS, GCP, Azure) or hybrid cloud/on-premises environments</li>\n<li>Familiarity with job scheduling, resource orchestration, or workload management systems (Kubernetes, Slurm, Borg, YARN, or custom schedulers)</li>\n<li>Experience with GPU or accelerator infrastructure, including the unique challenges of large-scale ML training and inference workloads</li>\n<li>Built or improved observability for infrastructure systems: dashboards, alerting, efficiency metrics, or cost attribution</li>\n<li>Capacity planning experience including demand forecasting, cost modeling, or hardware lifecycle management</li>\n<li>Scaled through hypergrowth in AI/ML, HPC, or large-scale cloud environments</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_e8e9acc0-a63","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5138044008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$290,000-$365,000 USD","x-skills-required":["Technical Program Management","Compute Infrastructure","Cloud Providers","Job Scheduling","Resource Orchestration","Workload Management","GPU or Accelerator Infrastructure","Observability","Capacity Planning"],"x-skills-preferred":["Kubernetes","Slurm","Borg","YARN","Custom Schedulers","Demand Forecasting","Cost Modeling","Hardware Lifecycle Management"],"datePosted":"2026-04-18T15:52:47.770Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Technical Program Management, Compute Infrastructure, Cloud Providers, Job Scheduling, Resource Orchestration, Workload Management, GPU or Accelerator Infrastructure, Observability, Capacity Planning, Kubernetes, Slurm, Borg, YARN, Custom Schedulers, Demand Forecasting, Cost Modeling, Hardware Lifecycle Management","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":290000,"maxValue":365000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_4c45d017-749"},"title":"FBS Mainframe System Administration- Application Subject Matter Expert II","description":"<p><strong>Job Summary</strong></p>\n<p>We are seeking a highly skilled FBS Mainframe System Administration- Application Subject Matter Expert II to join our team. 
As a key member of our Insurance Mainframe Job Subject Matter Expert (SME) team, you will be responsible for ensuring the stability, performance, and operational integrity of the mainframe environment supporting insurance applications.</p>\n<p><strong>Core Responsibilities</strong></p>\n<p><strong>Incident Management</strong></p>\n<ul>\n<li>Analyze and resolve job failures, spool issues, and performance bottlenecks.</li>\n<li>Provide root cause analysis (RCA) and implement preventive measures.</li>\n</ul>\n<p><strong>Environment Support</strong></p>\n<ul>\n<li>Maintain mainframe environments for development, testing, and production.</li>\n<li>Coordinate with infrastructure teams for system health and resource optimization.</li>\n</ul>\n<p><strong>Performance Tuning</strong></p>\n<ul>\n<li>Monitor CPU, spool, and memory utilization.</li>\n<li>Optimize job configurations to reduce resource consumption.</li>\n</ul>\n<p><strong>Compliance &amp; Audit</strong></p>\n<ul>\n<li>Ensure jobs comply with regulatory and security standards.</li>\n<li>Maintain documentation for audits and governance.</li>\n</ul>\n<p><strong>Collaboration</strong></p>\n<ul>\n<li>Work closely with application teams, operations, and business units.</li>\n<li>Provide technical guidance and best practices for job design and execution.</li>\n</ul>\n<p><strong>Key Responsibilities</strong></p>\n<ul>\n<li><strong>Environment Support &amp; Stability</strong></li>\n</ul>\n<p>+ Manage and maintain mainframe environments for development, testing, and production. \t+ Monitor system health, resource utilization, and job performance.</p>\n<ul>\n<li><strong>Batch Job Expertise</strong></li>\n</ul>\n<p>+ Oversee scheduling, execution, and troubleshooting of insurance-related batch jobs. 
\t+ Analyze job failures, spool issues, and CPU spikes; implement preventive measures.</p>\n<ul>\n<li><strong>Incident &amp; Problem Management</strong></li>\n</ul>\n<p>+ Provide root cause analysis (RCA) for outages and performance issues. \t+ Collaborate with operations and application teams to resolve incidents promptly.</p>\n<ul>\n<li><strong>Performance Optimization</strong></li>\n</ul>\n<p>+ Tune jobs and system parameters to improve efficiency and reduce resource consumption. \t+ Implement best practices for job design and output management.</p>\n<ul>\n<li><strong>Compliance &amp; Documentation</strong></li>\n</ul>\n<p>+ Ensure adherence to regulatory, security, and audit requirements. \t+ Maintain detailed documentation for processes and incident resolutions.</p>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Mainframe Technologies</li>\n<li>Patches</li>\n<li>System Administration</li>\n<li>zOS</li>\n<li>Mainframe technologies (JCL, COBOL, DB2, CICS) - Advanced</li>\n<li>Batch job scheduling tools (e.g., Control-M, CA7). - Advanced</li>\n<li>Knowledge of spool management, CPU optimization, and performance tuning. 
- Advanced</li>\n<li>Excellent problem-solving and communication skills - Advanced</li>\n<li>Insurance domain experience is a plus - Advanced</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive compensation and benefits package:</li>\n</ul>\n<p>+ Competitive salary and performance-based bonuses \t+ Comprehensive benefits package \t+ Career development and training opportunities \t+ Flexible work arrangements (remote and/or office-based) \t+ Dynamic and inclusive work culture within a globally renowned group \t+ Private Health Insurance \t+ Pension Plan \t+ Paid Time Off \t+ Training &amp; Development</p>\n<p>Note: Benefits differ based on employee level.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_4c45d017-749","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Capgemini","sameAs":"https://www.capgemini.com/us-en/about-us/who-we-are/","logo":"https://logos.yubhub.co/capgemini.com.png"},"x-apply-url":"https://jobs.workable.com/view/1k3E5rgxsguPKBxRevC7y7/hybrid-fbs-mainframe-system-administration--application-subject-matter-expert-ii-in-pune-at-capgemini","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Mainframe Technologies","Patches","System Administration","zOS","Mainframe technologies (JCL, COBOL, DB2, CICS)","Batch job scheduling tools (e.g., Control-M, CA7)","Knowledge of spool management, CPU optimization, and performance tuning","Excellent problem-solving and communication skills","Insurance domain experience"],"x-skills-preferred":[],"datePosted":"2026-03-09T17:02:07.155Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Pune, Maharashtra, India"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Mainframe Technologies, Patches, 
System Administration, zOS, Mainframe technologies (JCL, COBOL, DB2, CICS), Batch job scheduling tools (e.g., Control-M, CA7), Knowledge of spool management, CPU optimization, and performance tuning, Excellent problem-solving and communication skills, Insurance domain experience"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_047e9644-6cd"},"title":"FBS Mainframe System Administration- Application Subject Matter Expert II","description":"<p><strong>Job Description</strong></p>\n<p>Capgemini is seeking a highly skilled FBS Mainframe System Administration- Application Subject Matter Expert II to join our team. As a key member of our Insurance Mainframe Job Subject Matter Expert (SME) team, you will be responsible for ensuring the stability, performance, and operational integrity of the mainframe environment supporting insurance applications.</p>\n<p><strong>Core Responsibilities</strong></p>\n<ul>\n<li>Analyze and resolve job failures, spool issues, and performance bottlenecks.</li>\n<li>Provide root cause analysis (RCA) and implement preventive measures.</li>\n<li>Maintain mainframe environments for development, testing, and production.</li>\n<li>Coordinate with infrastructure teams for system health and resource optimization.</li>\n<li>Monitor CPU, spool, and memory utilization.</li>\n<li>Optimize job configurations to reduce resource consumption.</li>\n<li>Ensure jobs comply with regulatory and security standards.</li>\n<li>Maintain documentation for audits and governance.</li>\n<li>Work closely with application teams, operations, and business units.</li>\n<li>Provide technical guidance and best practices for job design and execution.</li>\n</ul>\n<p><strong>Key Responsibilities</strong></p>\n<ul>\n<li>Manage and maintain mainframe environments for development, testing, and production.</li>\n<li>Monitor system health, resource utilization, and job performance.</li>\n<li>Oversee scheduling, 
execution, and troubleshooting of insurance-related batch jobs.</li>\n<li>Analyze job failures, spool issues, and CPU spikes; implement preventive measures.</li>\n<li>Provide root cause analysis (RCA) for outages and performance issues.</li>\n<li>Collaborate with operations and application teams to resolve incidents promptly.</li>\n<li>Tune jobs and system parameters to improve efficiency and reduce resource consumption.</li>\n<li>Implement best practices for job design and output management.</li>\n<li>Ensure adherence to regulatory, security, and audit requirements.</li>\n<li>Maintain detailed documentation for processes and incident resolutions.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Mainframe technologies (JCL, COBOL, DB2, CICS) - Advanced</li>\n<li>Batch job scheduling tools (e.g., Control-M, CA7) - Advanced</li>\n<li>Knowledge of spool management, CPU optimization, and performance tuning - Advanced</li>\n<li>Excellent problem-solving and communication skills - Advanced</li>\n<li>Insurance domain experience is a plus - Advanced</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive compensation and benefits package:</li>\n</ul>\n<p>+ Competitive salary and performance-based bonuses \t+ Comprehensive benefits package \t+ Career development and training opportunities \t+ Flexible work arrangements (remote and/or office-based) \t+ Dynamic and inclusive work culture within a globally renowned group \t+ Private Health Insurance \t+ Pension Plan \t+ Paid Time Off \t+ Training &amp; Development</p>\n<p>Note: Benefits differ based on employee level.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_047e9644-6cd","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Capgemini","sameAs":"https://www.capgemini.com/us-en/about-us/who-we-are/","logo":"https://logos.yubhub.co/capgemini.com.png"},"x-apply-url":"https://jobs.workable.com/view/gDs4UDcPsPLvWwDx7Z6H6Y/hybrid-fbs-mainframe-system-administration--application-subject-matter-expert-ii-in-hyderabad-at-capgemini","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Mainframe technologies (JCL, COBOL, DB2, CICS)","Batch job scheduling tools (e.g., Control-M, CA7)","Spool management, CPU optimization, and performance tuning","Problem-solving and communication skills","Insurance domain experience"],"x-skills-preferred":[],"datePosted":"2026-03-09T16:53:44.424Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Hyderabad, Telangana, India"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Mainframe technologies (JCL, COBOL, DB2, CICS), Batch job scheduling tools (e.g., Control-M, CA7), Spool management, CPU optimization, and performance tuning, Problem-solving and communication skills, Insurance domain experience"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7b2b97d5-0a1"},"title":"Software Engineer, Inference Deployment","description":"<p><strong>About Anthropic</strong></p>\n<p>Anthropic&#39;s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. 
Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</p>\n<p><strong>About the Role</strong></p>\n<p>Our mandate is to make inference deployment boring and unattended.</p>\n<p>Anthropic serves Claude to millions of users across GPUs, TPUs, and Trainium — and every model update must reach production safely, quickly, and without disrupting service. We&#39;re building the systems that make inference deployment continuous and unattended.</p>\n<p>As a Software Engineer on the Launch Engineering team, you&#39;ll design and build the deployment infrastructure that moves inference code from merge to production. This is a resource-constrained optimization problem at its core: validation and deployment consume the same accelerator chips that serve customer traffic — your deploys compete with live user requests for the same hardware. Every model brings different fleet sizes, startup times, and correctness requirements, so the system must adapt continuously. 
You&#39;ll build systems that navigate these constraints — orchestrating validation, scheduling deployments intelligently, and driving down cycle time from merge to production.</p>\n<p>If you&#39;ve built deployment systems at scale and gravitate toward the hardest problems at the intersection of automation and resource management, this team will give you an outsized scope to work on them.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li><strong>Own deployment orchestration</strong> that continuously moves validated inference builds into production across GPU, TPU, and Trainium fleets, unattended under normal conditions</li>\n<li><strong>Improve capacity-aware deployment scheduling</strong> to maximize deployment throughput against constrained accelerator budgets and variable fleet sizes</li>\n<li><strong>Extend deployment observability</strong> — dashboards and tooling that answer &quot;what code is running in production,&quot; &quot;where is my commit,&quot; and &quot;what validation passed for this deploy&quot;</li>\n<li><strong>Drive down cycle time</strong> from code merge to production with pipeline architectures that minimize serial dependencies and maximize parallelism</li>\n<li><strong>Optimize fleet rollout strategies</strong> for large-scale deployments across thousands of GPU, TPU, and Trainium chips, minimizing disruption to serving capacity</li>\n<li><strong>Evolve self-service model onboarding</strong> so that new models can be added to the continuous deployment pipeline without Launch Engineering involvement</li>\n<li><strong>Partner across the Inference organization</strong> with teams owning validation, autoscaling, and model routing to integrate deployment automation with their systems</li>\n</ul>\n<p><strong>You May Be a Good Fit If You Have</strong></p>\n<ul>\n<li>5+ years of experience building deployment, release, or delivery infrastructure at scale</li>\n<li>Strong software engineering skills with experience designing systems that 
manage complex state machines and multi-stage pipelines</li>\n<li>Experience with deployment systems where resource constraints shape the design — whether that&#39;s fleet capacity, network bandwidth, hardware availability, or coordinated rollout windows</li>\n<li>A track record of building automation that measurably improves deployment velocity and reliability</li>\n<li>Proficiency with Kubernetes-based deployments, rolling update mechanics, and container orchestration</li>\n<li>Comfort working across the stack — from backend services and databases to CLI tools and web UIs</li>\n<li>Strong communication skills and the ability to work closely with oncall engineers, model teams, and infrastructure partners</li>\n</ul>\n<p><strong>Strong Candidates May Also Have</strong></p>\n<ul>\n<li>Experience with ML inference or training infrastructure deployment, particularly across multiple accelerator types (GPU, TPU, Trainium)</li>\n<li>Background in capacity planning or resource-constrained scheduling (e.g., bin-packing, fleet management, job scheduling with hardware affinity)</li>\n<li>Experience with progressive delivery in systems with long validation cycles: canary/soak testing, blue-green deployments, traffic shifting, automated rollback</li>\n<li>Experience at companies with large-scale release engineering challenges (mobile release trains, monorepo deployments, multi-datacenter rollouts)</li>\n<li>Experience with Python and/or Rust in production systems</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<p><strong>Education requirements:</strong> We require at least a Bachelor&#39;s degree in a related field or equivalent experience. <strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>\n<p><strong>Visa sponsorship:</strong> We do sponsor visas! 
However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification.</strong> Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7b2b97d5-0a1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5111745008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000 - $485,000 USD","x-skills-required":["deployment","release","delivery","infrastructure","Kubernetes","container","orchestration","pipelines","state machines","multi-stage","pipelines","parallelism","optimization","resource management","automation","velocity","reliability","communication","collaboration","oncall","model teams","infrastructure partners"],"x-skills-preferred":["ML inference","training infrastructure","capacity planning","resource-constrained scheduling","bin-packing","fleet management","job scheduling","hardware affinity","progressive delivery","canary/soak testing","blue-green deployments","traffic shifting","automated rollback","mobile release trains","monorepo deployments","multi-datacenter 
rollouts","Python","Rust"],"datePosted":"2026-03-08T13:54:19.012Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"deployment, release, delivery, infrastructure, Kubernetes, container, orchestration, pipelines, state machines, multi-stage, pipelines, parallelism, optimization, resource management, automation, velocity, reliability, communication, collaboration, oncall, model teams, infrastructure partners, ML inference, training infrastructure, capacity planning, resource-constrained scheduling, bin-packing, fleet management, job scheduling, hardware affinity, progressive delivery, canary/soak testing, blue-green deployments, traffic shifting, automated rollback, mobile release trains, monorepo deployments, multi-datacenter rollouts, Python, Rust","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_148ddf8d-fe9"},"title":"IT Director - Infrastructure Engineering","description":"<p>We are seeking an experienced IT Director to lead our Infrastructure Engineering team. 
As a seasoned IT leader, you will be responsible for developing and executing strategic IT infrastructure plans, policies, and procedures that align with Synopsys&#39; global and regional objectives.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<ul>\n<li>Developing and executing strategic IT infrastructure plans, policies, and procedures that align with Synopsys&#39; global and regional objectives.</li>\n<li>Overseeing the design, implementation, and maintenance of HPC Engineering infrastructure, including Compute, Citrix, Storage, Networks, and Data Centers to ensure seamless operations.</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>Extensive experience (20+ years) in managing large-scale, 24x7 IT infrastructure delivery programs for global organizations.</li>\n<li>Deep technical expertise in HPC Engineering infrastructure—Compute, Citrix, Storage, Networks, and Data Centers.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_148ddf8d-fe9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/it-director-infrastructure-engineering/44408/92296852016","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"employee","x-salary-range":null,"x-skills-required":["IT infrastructure management","HPC Engineering infrastructure","IT operations","governance","compliance","service management"],"x-skills-preferred":["job scheduling and queuing systems","LSF","SLURM"],"datePosted":"2026-03-06T07:20:45.459Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru"}},"occupationalCategory":"Information Technology","industry":"Technology","skills":"IT infrastructure management, HPC Engineering 
infrastructure, IT operations, governance, compliance, service management, job scheduling and queuing systems, LSF, SLURM"}]}