<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>4d924e95-bdd</externalid>
      <Title>Research Engineer, RL Infrastructure and Reliability (Knowledge Work)</Title>
      <Description><![CDATA[<p><strong>About the role</strong></p>
<p>The Knowledge Work team builds the training environments and evaluations that make Claude effective at real-world professional workflows: searching, analysing, and creating across the tools and documents knowledge workers use every day.</p>
<p>As that work scales, the systems behind it need to be as rigorous as the research itself. We are looking for a Research Engineer to own the reliability, observability, and infrastructure foundation that the team&#39;s research depends on.</p>
<p>You will be responsible for ensuring our training and evaluation runs remain stable, well-instrumented, and high-quality as they grow in scale and complexity. A core part of this role is shifting reliability work from reactive to proactive: hardening systems, stress-testing at realistic scale, and building the observability and tooling that surface problems early, so researchers can stay focused on research rather than incident response.</p>
<p>You will be the team&#39;s stable, context-rich owner for environment health and evaluation integrity, and the primary point of contact for partner teams when issues arise.</p>
<p>While you&#39;ll work closely with researchers building new training environments, the priority for this role is the reliability those environments depend on. It&#39;s best suited to an engineer who finds real ownership and impact in making critical systems dependable, and in being the person behind trustworthy evaluation results the entire organisation relies on.</p>
<p><strong>Key Responsibilities:</strong></p>
<ul>
<li>Serve as the dedicated reliability owner for the Knowledge Work training environments, providing continuity of context and reducing the operational overhead of rotating ownership</li>
<li>Own a clean, canonical set of evaluation tools and processes for Knowledge Work capabilities, including the process used for model releases</li>
<li>Build and automate observability, dashboards, and operational tooling for our training environments and evaluation systems, with an emphasis on high signal-to-noise: a small set of trusted metrics and alerts rather than sprawling instrumentation</li>
<li>Proactively harden environments and evaluation systems through load testing, fault injection, and stress testing at realistic scale, so failures surface early rather than during critical training work</li>
<li>Act as the primary point of contact for partner training and infrastructure teams when issues in our environments arise, and drive incidents to resolution</li>
<li>Reduce the operational burden on researchers so they can stay focused on research</li>
</ul>
<p><strong>Minimum Qualifications:</strong></p>
<ul>
<li>Highly experienced Python engineer who ships reliable, well-instrumented code that teammates trust in production</li>
<li>Demonstrated experience operating ML or distributed systems at scale, including significant on-call and incident-response experience</li>
<li>Strong SRE or production-engineering mindset: reaching for SLOs, load tests, and failure injection before reaching for more dashboards</li>
<li>Foundational ML knowledge sufficient to understand what a training environment or evaluation is actually measuring, and recognise when an evaluation has become stale or gameable</li>
<li>Able to read research code and reason about evaluation integrity</li>
</ul>
<p><strong>Preferred Qualifications:</strong></p>
<ul>
<li>5+ years of experience operating ML or distributed systems at scale</li>
<li>Experience building or operating RL environments, agent harnesses, or LLM evaluation frameworks</li>
<li>Familiarity with reward modelling, evaluation design, or detecting and mitigating reward hacking</li>
<li>Experience with observability stacks (metrics, tracing, structured logging) and operational dashboard tooling</li>
<li>Background in chaos engineering, fault injection, or large-scale load testing</li>
<li>Experience with data quality pipelines, drift detection, or evaluation-set curation and versioning</li>
<li>Familiarity with large-scale training or inference infrastructure (schedulers, multi-agent orchestration, sandboxed execution)</li>
<li>Prior experience as a dedicated reliability or operations owner embedded within a research team</li>
</ul>
<p><strong>Logistics</strong></p>
<ul>
<li><strong>Minimum education:</strong> Bachelor’s degree or an equivalent combination of education, training, and/or experience</li>
<li><strong>Required field of study:</strong> A field relevant to the role as demonstrated through coursework, training, or professional experience</li>
<li><strong>Minimum years of experience:</strong> Years of experience required will correlate with the internal job level requirements for the position</li>
<li><strong>Location-based hybrid policy:</strong> Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</li>
<li><strong>Visa sponsorship:</strong> We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>
</ul>
<p><strong>How we’re different</strong></p>
<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact, advancing our long-term goals of steerable, trustworthy AI, rather than working on smaller, more specific puzzles.</p>
<p><strong>Come work with us!</strong></p>
<p>Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, including a comprehensive health insurance package, 401(k) matching, and generous paid time off.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$350,000-$850,000 USD</Salaryrange>
      <Skills>Python, ML, Distributed Systems, SRE, Production-Engineering, Observability, Dashboards, Operational Tooling, Load Testing, Fault Injection, Stress Testing, Reward Modelling, Evaluation Design, Data Quality Pipelines, Drift Detection, Evaluation-Set Curation, Versioning, Large-Scale Training, Inference Infrastructure, Schedulers, Multi-Agent Orchestration, Sandboxed Execution, RL Environments, Agent Harnesses, LLM Evaluation Frameworks, Chaos Engineering, Structured Logging, Dashboard Tooling</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.com.png</Employerlogo>
      <Employerdescription>Anthropic is a public benefit corporation headquartered in San Francisco that creates reliable, interpretable, and steerable AI systems.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/5197337008</Applyto>
      <Location>San Francisco, CA</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>736969e6-3f9</externalid>
      <Title>CPU Storage Tech Lead</Title>
      <Description><![CDATA[<p><strong>Compensation</strong></p>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>
</ul>
<p><strong>About the Team</strong></p>
<p>The Stargate team is responsible for building the physical infrastructure that powers large-scale AI systems. We design and deliver next-generation data centers optimized for dense compute clusters, advanced networking, and rapidly evolving hardware platforms.</p>
<p><strong>About the Role</strong></p>
<p>We are seeking a CPU &amp; Storage Technical Lead to define and drive the server compute and storage architecture strategy for Stargate infrastructure.</p>
<p>In this role, you will own technical direction across CPU platforms, memory configurations, local and disaggregated storage systems, and their integration into large-scale AI clusters. You will evaluate vendor roadmaps, lead platform tradeoff decisions, and ensure compute and storage systems are optimized for training, inference, and supporting services.</p>
<p><strong>Key Responsibilities</strong></p>
<ul>
<li>Own CPU and storage technical strategy for Stargate compute infrastructure across current and future generations.</li>
<li>Evaluate CPU platforms across performance, efficiency, memory bandwidth, PCIe topology, cost, and roadmap alignment.</li>
<li>Define storage architectures for AI environments, including boot media, local NVMe, shared storage, caching tiers, metadata services, and high-performance data pipelines.</li>
<li>Drive server platform decisions involving CPU, memory, NIC, GPU, and storage subsystem integration.</li>
<li>Partner with performance modeling teams to quantify tradeoffs across compute, memory, I/O, and storage bottlenecks.</li>
<li>Work with silicon and hardware vendors on roadmap influence, feature requests, qualification plans, and technical escalations.</li>
<li>Lead bring-up and validation efforts for new CPU and storage platforms in lab and production environments.</li>
<li>Partner with networking and cluster architecture teams to optimize end-to-end node design and data movement.</li>
<li>Support supply chain and sourcing teams with technical vendor assessments and second-source strategies.</li>
<li>Drive reliability, serviceability, and fleet lifecycle planning for compute and storage platforms.</li>
<li>Translate future AI workload requirements into infrastructure platform specifications.</li>
<li>Provide technical leadership across cross-functional stakeholders and executive reviews.</li>
</ul>
<p><strong>Qualifications</strong></p>
<ul>
<li>Bachelor’s degree in Computer Engineering, Electrical Engineering, Computer Science, or related technical field; advanced degree preferred.</li>
<li>10+ years of experience in server hardware, systems architecture, data center infrastructure, or hyperscale compute platforms.</li>
<li>Deep expertise in modern CPU architectures (x86, ARM, accelerator host systems) and server platform design.</li>
<li>Strong understanding of memory systems, PCIe/CXL fabrics, NUMA behavior, and platform-level performance constraints.</li>
<li>Experience with storage systems including NVMe, SSD qualification, RAID, distributed storage, object/file systems, or high-performance data pipelines.</li>
<li>Experience evaluating hardware tradeoffs across performance, cost, power, thermals, and supply availability.</li>
<li>Familiarity with GPU clusters and AI training/inference infrastructure strongly preferred.</li>
<li>Experience working directly with OEMs, ODMs, silicon vendors, or storage vendors.</li>
<li>Strong systems thinking with the ability to connect component decisions to fleet-level outcomes.</li>
<li>Excellent communication skills with the ability to influence engineering and executive stakeholders.</li>
<li>Proven ability to operate in fast-moving, ambiguous environments with high ownership.</li>
</ul>
<p><strong>Preferred Skills</strong></p>
<ul>
<li>Experience designing infrastructure for large-scale AI or HPC environments.</li>
<li>Familiarity with CPU vendor roadmaps across AMD, Intel, and ARM ecosystems.</li>
<li>Experience with distributed storage architectures supporting GPU clusters.</li>
<li>Knowledge of fleet operations, hardware lifecycle management, and production deployments at scale.</li>
<li>Prior experience in hyperscale cloud, AI infrastructure, or advanced compute environments.</li>
</ul>
<p><strong>About OpenAI</strong></p>
<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>
<p>We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.</p>
<p>For additional information, please see <a href="https://cdn.openai.com/policies/eeo-policy-statement.pdf">OpenAI’s Affirmative Action and Equal Employment Opportunity Policy Statement</a>.</p>
<p>Background checks for applicants will be administered in accordance with applicable law, and qualified applicants with arrest or conviction records will be considered for employment consistent with those laws, including the San Francisco Fair Chance Ordinance, the Los Angeles County</p>
]]></Description>
      <Jobtype>Full time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$342K – $555K</Salaryrange>
      <Skills>server hardware, systems architecture, data center infrastructure, hyperscale compute platforms, modern CPU architectures, server platform design, memory systems, PCIe/CXL fabrics, NUMA behavior, platform-level performance constraints, storage systems, NVMe, SSD qualification, RAID, distributed storage, object/file systems, high-performance data pipelines, hardware tradeoffs, performance, cost, power, thermals, supply availability, GPU clusters, AI training/inference infrastructure, OEMs, ODMs, silicon vendors, storage vendors, strong systems thinking, component decisions, fleet-level outcomes, excellent communication skills, influence engineering and executive stakeholders, fast-moving, ambiguous environments, high ownership, infrastructure for large-scale AI or HPC environments, CPU vendor roadmaps across AMD, Intel, and ARM ecosystems, distributed storage architectures supporting GPU clusters, fleet operations, hardware lifecycle management, production deployments at scale, hyperscale cloud, AI infrastructure, advanced compute environments</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.</Employerdescription>
      <Employerwebsite>https://openai.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/18a60850-cf8b-4374-a214-ef78b9712deb</Applyto>
      <Location>San Francisco; Seattle</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>a21c0ab8-098</externalid>
      <Title>Researcher, Training</Title>
      <Description><![CDATA[<p><strong>Location</strong></p>
<p>San Francisco</p>
<p><strong>Employment Type</strong></p>
<p>Full time</p>
<p><strong>Department</strong></p>
<p>Research</p>
<p><strong>Compensation</strong></p>
<ul>
<li>$360K – $440K • Offers Equity</li>
</ul>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>
</ul>
<p>More details about our benefits are available to candidates during the hiring process.</p>
<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>
<p><strong>About the Team</strong></p>
<p>OpenAI&#39;s Training team is responsible for producing the large language models that power our research, our products, and ultimately bring us closer to AGI. Achieving this goal requires combining deep research into improving our current architecture, datasets, and optimization techniques with long-term bets aimed at improving the efficiency and capability of future generations of models. We are responsible for integrating these techniques and producing model artifacts used by the rest of the company, and for ensuring that these models are world-class in every respect. Recent examples of artifacts with major contributions from our team include GPT-4 Turbo, GPT-4o, and o1-mini.</p>
<p><strong>About the Role</strong></p>
<p>As a member of the architecture team, you will push the frontier of architecture development for OpenAI&#39;s flagship models, enhancing intelligence, efficiency, and adding new capabilities.</p>
<p>Ideal candidates have a deep understanding of LLM architectures, a sophisticated understanding of model inference, and a hands-on empirical approach. A good fit for this role will be equally happy coming up with a creative breakthrough, investing in strengthening a baseline, designing an eval, debugging a thorny regression, or tracking down a bottleneck.</p>
<p>This role is based in San Francisco. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.</p>
<p><strong>In this role, you will:</strong></p>
<ul>
<li>Design, prototype and scale up new architectures to improve model intelligence</li>
<li>Execute and analyze experiments autonomously and collaboratively</li>
<li>Study, debug, and optimize both model performance and computational performance</li>
<li>Contribute to training and inference infrastructure</li>
</ul>
<p><strong>You might thrive in this role if you:</strong></p>
<ul>
<li>Have experience landing contributions to major LLM training runs</li>
<li>Can thoroughly evaluate and improve deep learning architectures in a self-directed fashion</li>
<li>Are motivated by safely deploying LLMs in the real world</li>
<li>Are well-versed in state-of-the-art transformer modifications for efficiency</li>
</ul>
<p><strong>About OpenAI</strong></p>
<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$360K – $440K • Offers Equity</Salaryrange>
      <Skills>Deep learning, Transformers, Model inference, Architecture development, Experiment design, Optimization techniques, LLM architectures, Model performance, Computational performance, Training and inference infrastructure</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company was founded in 2015 and has since grown to become a leading player in the field of artificial intelligence.</Employerdescription>
      <Employerwebsite>https://openai.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/97d3670c-e75a-4bb2-a235-171765f5f10e</Applyto>
      <Location>San Francisco</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
    <job>
      <externalid>d3a39f4c-d95</externalid>
      <Title>Software Engineer, Inference - Multi Modal</Title>
      <Description><![CDATA[<p><strong>Software Engineer, Inference - Multi Modal</strong></p>
<p><strong>Location</strong></p>
<p>San Francisco</p>
<p><strong>Employment Type</strong></p>
<p>Full time</p>
<p><strong>Department</strong></p>
<p>Scaling</p>
<p><strong>Compensation</strong></p>
<ul>
<li>$295K – $555K • Offers Equity</li>
</ul>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>
</ul>
<p>More details about our benefits are available to candidates during the hiring process.</p>
<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>
<p><strong>About the Team</strong></p>
<p>OpenAI’s Inference team powers the deployment of our most advanced models - including our GPT models, 4o Image Generation, and Whisper - across a variety of platforms. Our work ensures these models are available, performant, and scalable in production, and we partner closely with Research to bring the next generation of models into the world. We&#39;re a small, fast-moving team of engineers focused on delivering a world-class developer experience while pushing the boundaries of what AI can do.</p>
<p>We’re expanding into multimodal inference, building the infrastructure needed to serve models that handle image, audio, and other non-text modalities. These workloads are inherently more heterogeneous and experimental, involving diverse model sizes and interactions, more complex input/output formats, and tighter coordination with product and research.</p>
<p><strong>About the Role</strong></p>
<p>We’re looking for a software engineer to help us serve OpenAI’s multimodal models at scale. You’ll be part of a small team responsible for building reliable, high-performance infrastructure for serving real-time audio, image, and other multimodal workloads in production.</p>
<p>This work is inherently cross-functional: you’ll collaborate directly with researchers training these models and with product teams defining new modalities of interaction. You&#39;ll build and optimize the systems that let users generate speech, understand images, and interact with models in ways far beyond text.</p>
<p><strong>In this role, you will:</strong></p>
<ul>
<li>Design and implement inference infrastructure for large-scale multimodal models.</li>
<li>Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.</li>
<li>Enable experimental research workflows to transition into reliable production services.</li>
<li>Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.</li>
<li>Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.</li>
</ul>
<p><strong>You might thrive in this role if you:</strong></p>
<ul>
<li>Have experience building and scaling inference systems for LLMs or multimodal models.</li>
<li>Have worked with GPU-based ML workloads and understand the performance dynamics of large models, especially with complex data like images or audio.</li>
<li>Enjoy experimental, fast-evolving work and collaborating closely with research.</li>
<li>Are comfortable dealing with systems that span networking, distributed compute, and high-throughput data handling.</li>
<li>Have familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems.</li>
<li>Own problems end-to-end and are excited to operate in ambiguous, fast-moving spaces.</li>
</ul>
<p><strong>Nice to Have:</strong></p>
<ul>
<li>Experience working with image generation or audio synthesis models in production.</li>
<li>Exposure to distributed ML training or system-efficient model design.</li>
</ul>
<p><strong>About OpenAI</strong></p>
<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$295K – $555K • Offers Equity</Salaryrange>
      <Skills>Software Engineer, Inference Infrastructure, GPU-based ML Workloads, Tensor Parallelism, Hardware Abstraction Layers, vLLM, TensorRT-LLM, Custom Model Parallel Systems, Image Generation, Audio Synthesis, Distributed ML Training, System-Efficient Model Design</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products.</Employerdescription>
      <Employerwebsite>https://jobs.ashbyhq.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/4d14449e-5e7f-45d4-b103-8776a6c87086</Applyto>
      <Location>San Francisco</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
    <job>
      <externalid>3b26742c-769</externalid>
      <Title>Software Engineer, Financial Engineering</Title>
      <Description><![CDATA[<p><strong>Software Engineer, Financial Engineering</strong></p>
<p><strong>Location</strong></p>
<p>San Francisco</p>
<p><strong>Employment Type</strong></p>
<p>Full time</p>
<p><strong>Location Type</strong></p>
<p>Hybrid</p>
<p><strong>Department</strong></p>
<p>Applied AI</p>
<p><strong>Compensation</strong></p>
<ul>
<li>$230K – $385K • Offers Equity</li>
</ul>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<p><strong>Benefits</strong></p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>
</ul>
<p><strong>About the team</strong></p>
<p>The Applied team at OpenAI safely brings cutting-edge technology to the world. We have released groundbreaking products such as ChatGPT, Plugins, DALL·E, and APIs for GPT-4, GPT-3, embeddings, and fine-tuning. Our team also manages large-scale inference infrastructure. With much more on the horizon, our impact continues to grow.</p>
<p>Our customers create fast-growing businesses using our APIs, enabling product features previously unimaginable. ChatGPT exemplifies the current scope of possibilities. We prioritize the responsible use of our powerful tools, valuing safe deployment over unchecked expansion.</p>
<p>Within Applied Engineering, the Financial Engineering team ensures that our products are monetized effectively to accommodate customers&#39; varying needs and scales. Collaborating closely with the GTM and Finance teams, we strive to tailor our billing stack to our evolving internal requirements. We seek an experienced engineer to architect and refine our billing systems, enhancing their functionality to meet the demands of our increasingly complex and expansive product offerings.</p>
<p><strong>In this role, you will:</strong></p>
<ul>
<li>Architect and build the next generation of billing and monetization systems at OpenAI.</li>
<li>Develop across the stack to create comprehensive billing integrations for our range of ChatGPT and API users.</li>
<li>Design a versatile billing platform suitable for both subscription and usage-based offerings, ensuring scalability, enterprise readiness, and flexibility.</li>
<li>Construct and integrate tools that empower internal teams to seamlessly incorporate billing data into their workflows.</li>
<li>Collaborate closely with a wide array of stakeholders, including the Product, Data, Finance, and Go-To-Market teams, as well as fellow engineers.</li>
</ul>
<p><strong>You might thrive in this role if you:</strong></p>
<ul>
<li>Possess a minimum of 5 years of professional software engineering experience; experience in payments, billing, or monetization is a bonus.</li>
<li>Enjoy engaging with various partners, particularly those outside of engineering.</li>
<li>Have a keen desire to learn and acquire new skills, coupled with a strong ability to impart that knowledge clearly and succinctly to others.</li>
<li>Bring significant experience in developing (and redeveloping) production systems to launch new product capabilities and to handle scaling challenges.</li>
<li>Are deeply invested in creating an exceptional user experience, taking pride in crafting products that address customer needs.</li>
<li>Are able to move fast in an environment where things are sometimes loosely defined and may have competing priorities or deadlines.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$230K – $385K • Offers Equity</Salaryrange>
      <Skills>software engineering, payments, billing, monetization, APIs, GPT-4, GPT-3, embeddings, fine-tuning, large-scale inference infrastructure, product development, team collaboration, communication, problem-solving, leadership, mentoring, technical writing, public speaking, project management</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is a technology company that develops and releases advanced artificial intelligence models. It has a large team of engineers and researchers working on various projects, including ChatGPT and DALL·E.</Employerdescription>
      <Employerwebsite>https://jobs.ashbyhq.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/4ef5bf23-cf0e-4b97-a639-11f963c99b88</Applyto>
      <Location>San Francisco</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
  </jobs>
</source>