<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>d08d38d2-b72</externalid>
      <Title>Engineering Manager, Agent Prompts &amp; Evals</Title>
      <Description><![CDATA[<p><strong>About the Role</strong></p>
<p>Anthropic is looking for an Engineering Manager to lead the Agent Prompts &amp; Evals team. This team owns the infrastructure that lets Anthropic ship model and prompt changes with confidence: the eval frameworks, system prompt pipelines, and regression-detection systems that every model launch depends on.</p>
<p>When a new Claude model is ready to ship, this team is the one answering “is it actually better in our products?” When a product team wants to change how Claude behaves, this team owns the tooling that tells them whether they broke something. It’s a platform team whose platform is model behavior itself.</p>
<p>The team sits deliberately at the seam between product engineering and research. You’ll partner closely with other evals groups across the company on shared infrastructure and methodology, with product teams who are shipping features on top of Claude, and with the TPMs and research PMs driving model launches. The pace is set by the model release cadence, and the team operates as both a platform owner and a hands-on partner during launch periods.</p>
<p><strong>Responsibilities</strong></p>
<ul>
<li>Lead and grow a team of prompt engineers and platform software engineers</li>
<li>Own the product-side eval platform: the frameworks, dashboards, bulk runners, and CI integrations that product teams use to measure Claude’s behavior and catch regressions before they ship</li>
<li>Own system prompt infrastructure: versioning, deployment, rollback, and review tooling for the prompts that run in production across claude.ai, the API, and agentic surfaces</li>
<li>Be a steady hand through model launches; these are the team’s highest-stakes operational moments, and the EM is the backstop when things get chaotic</li>
<li>Build durable collaboration with other evals groups across the company; this means real work on ownership boundaries, shared roadmaps, and avoiding tragedy-of-the-commons on shared eval infrastructure</li>
<li>Recruit, close, and retain engineers who want to work at the intersection of product engineering and model behavior</li>
<li>Shape where the team invests next: there are credible paths into frontier eval development, model launch automation, and deeper prompt engineering support, and part of the job is sequencing them</li>
<li>Push the team toward measuring things that are hard to measure (behavioral drift, prompt quality, harness parity), not just things that are easy</li>
</ul>
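<p>As an illustration only (not Anthropic’s actual tooling), the versioning, deployment, and rollback responsibilities above can be sketched as a minimal prompt registry; every name below is hypothetical:</p>

```python
from dataclasses import dataclass, field

@dataclass
class PromptRegistry:
    """Minimal versioned store for system prompts, with rollback.

    Hypothetical sketch: real prompt infrastructure would add review
    tooling, audit trails, and per-surface deployment gates.
    """
    _versions: dict = field(default_factory=dict)  # surface -> list of prompt texts
    _active: dict = field(default_factory=dict)    # surface -> index of active version

    def publish(self, surface: str, prompt: str) -> int:
        """Append a new prompt version for a surface and make it active."""
        versions = self._versions.setdefault(surface, [])
        versions.append(prompt)
        self._active[surface] = len(versions) - 1
        return self._active[surface]

    def active(self, surface: str) -> str:
        """Return the prompt currently deployed for a surface."""
        return self._versions[surface][self._active[surface]]

    def rollback(self, surface: str) -> int:
        """Revert a surface to its previous version."""
        if self._active[surface] == 0:
            raise ValueError("no earlier version to roll back to")
        self._active[surface] -= 1
        return self._active[surface]

registry = PromptRegistry()
registry.publish("claude.ai", "v1: be helpful")
registry.publish("claude.ai", "v2: be helpful and concise")
registry.rollback("claude.ai")
assert registry.active("claude.ai") == "v1: be helpful"
```

<p>The point of the sketch is the shape of the problem: prompts as immutable versions per surface, with deployment and rollback as cheap pointer moves.</p>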
<p><strong>You May Be a Good Fit If You Have</strong></p>
<ul>
<li>8+ years in software engineering with 3+ years managing engineering teams, including experience leading a platform, infra, or developer-tooling team where your customers were other engineers</li>
<li>A track record of building “pits of success”: tooling and process that made it easy for other teams to do the right thing without needing to understand all the details</li>
<li>Comfort managing a team with a mixed charter: platform ownership, service-to-other-teams, and a launch-driven operational rhythm, all at once</li>
<li>Enough technical depth to engage on system design, review pipeline architecture, and be credible in debates with strong ICs; you don’t need to be writing code by hand every day, but you should be able to read it, review it, and be comfortable leveraging Claude to understand, design, and occasionally build</li>
<li>A product mindset and willingness to wear multiple hats when the work calls for it</li>
<li>Demonstrated ability to build and maintain peer relationships with partner orgs that have different cultures and incentives: negotiating ownership, aligning roadmaps, and holding your ground when it matters without being territorial about it</li>
<li>Experience recruiting and closing senior ICs in a competitive market</li>
</ul>
<p><strong>Strong Candidates May Also Have</strong></p>
<ul>
<li>Prior exposure to LLM evals, ML experimentation platforms, or model quality work, even tangentially</li>
<li>Experience with A/B testing infrastructure, feature flagging, or gradual rollout systems</li>
<li>Background in devtools, CI/CD platforms, or testing infrastructure at scale</li>
<li>A history of managing teams that sit between two larger orgs and making that position an asset rather than a liability</li>
<li>Interest in AI safety and alignment; not required, but it makes the “why” of the work land harder</li>
</ul>
<p><strong>Logistics</strong></p>
<ul>
<li>Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience</li>
<li>Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience</li>
<li>Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position</li>
<li>Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</li>
<li>Visa sponsorship: We do sponsor visas! However, we aren’t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>
</ul>
<p><strong>How we’re different</strong></p>
<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. We value impact (advancing our long-term goals of steerable, trustworthy AI) over work on smaller, more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We’re an extremely collaborative group, and we host frequent research discussions.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$320,000-$405,000 USD</Salaryrange>
      <Skills>software engineering, team management, platform ownership, service-to-other-teams, launch-driven operational rhythm, system design, pipeline architecture, product mindset, recruiting and closing senior ICs, LLM evals, ML experimentation platforms, model quality work, A/B testing infrastructure, feature flagging, gradual rollout systems, devtools, CI/CD platforms, testing infrastructure at scale</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.com.png</Employerlogo>
      <Employerdescription>Anthropic is a company that creates reliable, interpretable, and steerable AI systems. The company has a team of researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/5159608008</Applyto>
      <Location>San Francisco, CA | New York City, NY</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>76d3f53b-3c6</externalid>
      <Title>Staff Software Engineer, Quality and Release Platform</Title>
      <Description><![CDATA[<p>About Us</p>
<p>We&#39;re looking for a Staff Software Engineer to join our Quality and Release Platform (QARP) team and lead the technical direction of the platforms that power how dbt Labs builds, tests, and ships software.</p>
<p>Our mission spans two critical areas: release engineering (making it easy for engineers to ship changes quickly, safely, and reliably) and code quality (building a platform that raises the bar for code quality across all of dbt Labs engineering).</p>
<p>In this role, you&#39;ll work with tools like Helm, ArgoCD, Terraform, Python, GitHub Actions, and Kargo to architect and scale our deployment systems, while also helping design and build the tooling, frameworks, and automation that enable engineering teams to consistently produce high-quality code.</p>
<p>This is a high-impact, staff-level role where you&#39;ll set architectural direction, mentor engineers, and drive initiatives that improve developer velocity, code quality, and reliability across the entire engineering organization.</p>
<p>Responsibilities</p>
<ul>
<li>Define and drive the technical strategy and architecture for our CI/CD platform, release management systems, and code quality platform.</li>
<li>Design and build tooling, frameworks, and automation that help engineering teams maintain and improve code quality across the organization.</li>
<li>Lead high-impact initiatives that improve automation, observability, and self-service capabilities for engineers across the organization.</li>
<li>Mentor and level up other engineers on the team, fostering a culture of technical excellence and continuous improvement.</li>
<li>Collaborate across teams and with engineering leadership to identify systemic challenges in our delivery and quality processes and architect solutions to address them.</li>
<li>Evolve our release architecture to support dbt Cloud&#39;s multi-cloud, cell-based infrastructure at scale.</li>
<li>Establish best practices and standards for build pipelines, release workflows, code quality, and infrastructure-as-code that are adopted across engineering.</li>
<li>Serve as a thought leader in engineering&#39;s internal AI strategy: evaluating AI-assisted development tools, defining adoption practices and guardrails, and enabling developers to use AI effectively across the org.</li>
</ul>
<p>Requirements</p>
<ul>
<li>8+ years of software engineering experience, with significant time in platform, infrastructure, release engineering, or developer tooling.</li>
<li>A track record of leading technical strategy and architecture for complex, production-scale CI/CD, code quality, or platform systems.</li>
<li>Deep experience with one or more of the following: Helm, ArgoCD, Terraform, GitHub Actions, or Kubernetes.</li>
<li>Strong background in Python, Go, or Rust for automation, platform tooling, or systems development.</li>
<li>Passion for code quality and experience building or improving tools, linters, static analysis, testing frameworks, or CI checks that help teams write better code.</li>
<li>Demonstrated ability to drive cross-team initiatives and influence engineering-wide practices and standards.</li>
<li>Excellent communication skills: able to translate complex technical concepts for diverse audiences and lead through influence.</li>
<li>Demonstrated interest or hands-on experience with AI-assisted development tools and practices, with a perspective on how AI can improve engineering productivity and code quality.</li>
<li>Experience working asynchronously as part of a fully remote, distributed team.</li>
</ul>
<p>Preferred Qualifications</p>
<ul>
<li>Experience with Kargo or similar progressive delivery systems.</li>
<li>Hands-on experience with multi-cloud architectures (AWS, GCP, Azure).</li>
<li>Experience building code quality platforms, static analysis tooling, or testing infrastructure at scale.</li>
<li>Experience defining and rolling out engineering-wide code quality standards or best practices.</li>
<li>A track record of improving developer productivity or release safety across a large engineering organization.</li>
<li>Experience mentoring engineers and shaping team culture in a staff or principal-level role.</li>
<li>Track record of evaluating, championing, and rolling out AI developer tools (e.g., Copilot, Cursor, Claude Code) within an engineering organization.</li>
<li>Experience defining guidelines, guardrails, or best practices for AI-assisted development.</li>
</ul>
<p>Compensation &amp; Benefits</p>
<p>Salary: We offer competitive compensation packages commensurate with experience, including salary, equity, and where applicable, performance-based pay.</p>
<p>In select locations (including Boston, Chicago, Denver, Los Angeles, Philadelphia, New York Metro, San Francisco, DC Metro, Seattle, Austin), an alternate range may apply, as specified below.</p>
<ul>
<li>The typical starting salary range for this role is: $207,000 - $251,000 USD</li>
<li>The typical starting salary range for this role in the select locations listed is: $230,000 - $279,000 USD</li>
</ul>
<p>Benefits</p>
<ul>
<li>dbt Labs offers: unlimited vacation, 401k w/3% guaranteed contribution, excellent healthcare, paid parental leave, wellness stipend, home office stipend, and more!</li>
</ul>
<p>Our Hiring Process</p>
<ul>
<li>Interview with a Talent Acquisition Partner (30 Mins)</li>
<li>Technical Interview with Hiring Manager (60 Mins)</li>
<li>Team Interviews - Technical (3 rounds, 60 Mins each)</li>
<li>Values Interview (30 Mins)</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>remote</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>Helm, ArgoCD, Terraform, Python, GitHub Actions, Kargo, Kubernetes, multi-cloud architectures, code quality platforms, static analysis tooling, testing infrastructure</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>dbt Labs</Employername>
      <Employerlogo>https://logos.yubhub.co/getdbt.com.png</Employerlogo>
      <Employerdescription>dbt Labs is a leading analytics engineering platform, used by over 90,000 teams every week, with annual recurring revenue exceeding $100 million.</Employerdescription>
      <Employerwebsite>https://www.getdbt.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/dbtlabsinc/jobs/4666468005</Applyto>
      <Location>US - Remote</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>cc051e9f-7ab</externalid>
      <Title>AI Product Owner - Growth</Title>
      <Description><![CDATA[<p>The Role ================ Product Owner, Growth (AI-First) The Role Belongs growth constraint is supply. Every homeowner who activates on the platform adds a home to the network, creates a resident opportunity, and moves Belong closer to the profitability inflection that defines the next chapter of the company.</p>
<p>The homeowner funnel, from first impression through signed agreement and activated listing, is the highest-leverage product surface in the business. Most growth product roles are about optimizing what already exists: faster page loads, shorter forms, better copy. This role is about building something structurally different.</p>
<p>Belong&#39;s homeowner acquisition funnel is being rebuilt as an AI-native system: conversational intake powered by LLMs, personalized onboarding that adapts dynamically to each homeowner&#39;s financial profile, predictive scoring that routes the right lead to the right moment in the Advisor workflow, and agentic follow-up that replaces manual sequences with intelligent, context-aware outreach.</p>
<p>The target is a funnel that learns, where every interaction generates signal that makes the next interaction more likely to convert. As Product Owner, Growth, you are the person building that system. You own the homeowner acquisition and activation funnel end to end, from first contact to listed home.</p>
<p><strong>Responsibilities</strong></p>
<p><strong>AI-native intake and qualification layer</strong></p>
<p>The first interaction a homeowner has with Belong, whether via belonghome.com, a paid channel, or a referral, is where trust is either established or lost. You will build conversational intake flows powered by LLMs that qualify, capture, and begin converting leads in real time.</p>
<p>These are not chatbots with decision trees. They are context-aware systems that understand the difference between a cashflow-positive homeowner who wants yield optimization and a cashflow-negative homeowner who needs a path to profitability, and adapt the conversation, the framing, and the call-to-action accordingly.</p>
<p><strong>Personalized onboarding and trust architecture</strong></p>
<p>A homeowner considering Belong is anxious. They are considering handing over their most valuable asset to a platform they found online. Conversion at this stage is not a UX problem. It is a trust architecture problem.</p>
<p>You will design onboarding sequences that adapt dynamically based on homeowner attributes: property type, cashflow profile, prior rental history, risk signals, and behavioral signals from in-session activity.</p>
<p>You will use LLMs to generate personalized content (market analyses, improvement ROI estimates, comparable listings) that makes the value proposition concrete and specific to their home, not generic.</p>
<p><strong>Predictive lead scoring and Advisor routing</strong></p>
<p>Belong&#39;s Advisors are the trust-critical human touchpoint in the homeowner funnel. Their time is finite and high-value. You will build the predictive infrastructure that scores every lead on conversion likelihood, property quality, and fit with Belong&#39;s ICP, and routes leads to Advisors with the context they need to have the right conversation immediately.</p>
<p>You will work with data science to train and evaluate these models, with RevOps to deploy them into the Salesforce workflow, and with Sales leadership to validate signal quality against actual close rates.</p>
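<p>As a hedged sketch of the scoring-and-routing idea above: the feature names, weights, and threshold below are illustrative assumptions, not Belong&#39;s model, which would be trained and calibrated with data science rather than hand-weighted:</p>

```python
# Hypothetical lead scoring + Advisor routing sketch. All signal names,
# weights, and the routing threshold are made up for illustration.
def score_lead(lead: dict) -> float:
    """Combine conversion, property-quality, and ICP signals into one score."""
    weights = {"conversion_likelihood": 0.5, "property_quality": 0.3, "icp_fit": 0.2}
    return sum(weights[k] * lead.get(k, 0.0) for k in weights)

def route(lead: dict, threshold: float = 0.6) -> str:
    """High-scoring leads go straight to an Advisor; the rest stay in nurture."""
    return "advisor_queue" if score_lead(lead) >= threshold else "nurture"

hot = {"conversion_likelihood": 0.9, "property_quality": 0.8, "icp_fit": 0.7}
cold = {"conversion_likelihood": 0.2, "property_quality": 0.4, "icp_fit": 0.3}
assert route(hot) == "advisor_queue"
assert route(cold) == "nurture"
```

<p>The validation loop the role describes is exactly the part this sketch omits: comparing scored-lead close rates against an unscored baseline to check that the score carries real signal.</p>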
<p><strong>Agentic follow-up and nurture sequences</strong></p>
<p>Most leads do not convert on the first contact. Today, nurture is a sequence of templated emails. The target state is an AI agent that monitors lead behavior (page views, document opens, return visits, session signals) and generates contextually appropriate, personalized outreach at the right moment, with the right frame, without a human initiating every touchpoint.</p>
<p>You will define the agent&#39;s decision logic, build the context retrieval pipeline, instrument the output quality, and iterate on conversion impact week over week.</p>
<p><strong>Funnel instrumentation and the learning loop</strong></p>
<p>An AI-native funnel without rigorous instrumentation is a black box. You will build the measurement architecture that makes every conversion decision traceable: which intake flow variant produced the lead, which scoring model routed it, which agent-generated touchpoint influenced the next action, which Advisor framing closed it.</p>
<p>You will design the feedback loops that push conversion signal back into model evaluation, prompt improvement, and scoring recalibration. The funnel gets smarter every week or it is not an AI-native funnel.</p>
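<p>The traceability requirement above can be sketched as conversion events that carry full attribution; the field names below are illustrative assumptions, not Belong&#39;s schema:</p>

```python
import json

# Hypothetical funnel event: every conversion decision carries the IDs of the
# intake variant, scoring model, and touchpoint that produced it, so impact
# can later be attributed per stage.
def record_event(lead_id: str, stage: str, attribution: dict) -> str:
    """Serialize one funnel event with its attribution for later analysis."""
    event = {"lead_id": lead_id, "stage": stage, **attribution}
    return json.dumps(event, sort_keys=True)

evt = record_event("lead-123", "agreement_signed", {
    "intake_variant": "conversational_v2",
    "scoring_model": "score_2026_04",
    "last_touchpoint": "follow_up_on_document",
})
assert json.loads(evt)["stage"] == "agreement_signed"
```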
<p><strong>The activation gap: agreement to listed home</strong></p>
<p>Signing the agreement is not growth. A listed home is growth. The conversion from signed agreement to activated listing is a product problem with high leverage: homeowners who do not complete inspection scheduling, who abandon the improvement process, or who sit in the pipeline without a live listing represent real lost revenue.</p>
<p>You will own the product layer that closes this gap, including AI-assisted improvement planning, proactive homeowner communication anchored to their cashflow profile, and predictive identification of homeowners at risk of churning before listing.</p>
<p><strong>The AI Stack You Will Work With</strong></p>
<ul>
<li>LLM-powered conversational intake with real-time lead qualification and cashflow profile detection</li>
<li>Personalized content generation using property-level market data, comparable listings, and improvement ROI modeling</li>
<li>Predictive lead scoring models trained on conversion, property quality, and ICP signals</li>
<li>Agentic follow-up workflows with behavioral trigger logic and context-aware generation</li>
<li>Retrieval-augmented generation for Advisor preparation: the right context, surfaced at the right moment before the call</li>
<li>A/B testing infrastructure applied to AI-generated content variants, not just static copy</li>
</ul>
<p><strong>What Success Looks Like</strong></p>
<p>90 days: The funnel is fully instrumented from first click to activated listing with conversion rates and drop-off points visible at each stage. An AI-assisted intake flow is in production and being tested against the baseline.</p>
<p>6 months: Lead-to-listing conversion is measurably above baseline. AI is integrated at a minimum of 3 funnel touchpoints with documented conversion impact per touchpoint. Advisor routing is scored, and the correlation between score and close rate is being tracked.</p>
<p>Year 1: The majority of homeowner outreach between first contact and agreement signing is AI-generated, with human Advisors focusing exclusively on trust-critical call moments. CAC on the supply side is trending down. Time-to-activation is compressing quarter over quarter.</p>
<p><strong>Example KPIs You Will Be Held To</strong></p>
<ul>
<li>Lead-to-listing conversion rate (the primary number)</li>
<li>Cost per activated listing</li>
<li>Time from first contact to listing live</li>
<li>AI-assisted funnel touchpoint conversion impact, measured per touchpoint</li>
<li>Advisor routing accuracy: scored lead close rate vs. unscored baseline</li>
<li>Experiment velocity: instrumented tests shipped per month</li>
<li>Homeowner CSAT at onboarding and inspection phases (the constraint: conversion gains cannot come at experience cost)</li>
</ul>
<p><strong>Who You Are</strong></p>
<p>AI systems builder, not AI enthusiast. You have shipped LLM-powered product features in production. You understand prompt engineering, retrieval quality, latency tradeoffs, output evaluation, and model feedback loops. You think about AI systems the way a statistician thinks about models: with explicit assumptions, known failure modes</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>LLM-powered conversational intake, Personalized content generation, Predictive lead scoring models, Agentic follow-up workflows, Retrieval-augmented generation, A/B testing infrastructure</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Belong</Employername>
      <Employerlogo>https://logos.yubhub.co/belonghome.com.png</Employerlogo>
      <Employerdescription>Belong is a platform that connects homeowners with potential renters. It is a rapidly growing company.</Employerdescription>
      <Employerwebsite>https://www.belonghome.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.lever.co/belong/0360a259-aa2d-492a-9c20-33497533573e</Applyto>
      <Location>Argentina</Location>
      <Country></Country>
      <Postedate>2026-04-17</Postedate>
    </job>
    <job>
      <externalid>3b6d1400-188</externalid>
      <Title>Android Engineer, Terminal Developer Productivity</Title>
      <Description><![CDATA[<p>We&#39;re looking for an experienced Android Engineer to join our Terminal Developer Productivity team. As a key member of the team, you will design, build, and maintain tools, libraries, and infrastructure that improve the productivity of Terminal engineers across mobile, backend, and embedded systems.</p>
<p>Responsibilities:</p>
<ul>
<li>Collaborate closely with mobile engineers to understand their workflows and pain points and translate them into practical short-term and long-term solutions.</li>
<li>Contribute to and improve our build, CI/CD, and test automation systems for Terminal SDKs, Android apps, and firmware.</li>
<li>Work with stakeholders across Terminal to prioritize work, balance competing needs, and ensure your solutions integrate cleanly into existing workflows.</li>
<li>Own projects end-to-end, from problem discovery and design through implementation, rollout, and ongoing operation.</li>
<li>Participate in code reviews, design discussions, and documentation to maintain a high bar for code quality, reliability, and developer experience.</li>
<li>Mentor other engineers in areas such as build, test, and release best practices, helping to spread strong developer productivity practices across the team.</li>
</ul>
<p>Requirements:</p>
<ul>
<li>BS or MS in Computer Science or a related field, or equivalent practical experience.</li>
<li>6+ years of software engineering experience, including meaningful experience with backend systems and at least one of: Android/mobile or embedded/firmware development.</li>
<li>Experience designing, implementing, and maintaining production systems or developer tooling.</li>
<li>Understanding of how to build scalable, reliable, and observable services, pipelines, or tooling.</li>
<li>Experience owning projects from design through implementation, rollout, and ongoing support.</li>
<li>Ability to thrive in a collaborative environment involving multiple stakeholders and subject matter experts.</li>
<li>Strong communication skills and the ability to explain technical concepts clearly to different audiences.</li>
</ul>
<p>Preferred Qualifications:</p>
<ul>
<li>Proficiency in one or more of: Kotlin, Java, or Go.</li>
<li>Experience building tools or platforms to improve developer productivity, with clear empathy for internal developer users.</li>
<li>Experience with CI/CD tooling and pipelines (e.g. Jenkins, GitLab CI, GitHub Actions) and modern build systems.</li>
<li>Experience designing and maintaining automated testing infrastructure (e.g. integration/end-to-end tests, test orchestration, or reducing test flakiness).</li>
<li>Experience with Android build and test tooling (e.g. Gradle, emulators, device farms) or firmware build pipelines.</li>
<li>Experience in payments, point-of-sale, or hardware-integrated systems is a plus.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>remote</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>Android, Java, Kotlin, Go, CI/CD, Test Automation, Scalable Systems, Reliable Systems, Observable Systems, CI/CD Tooling, Pipelines, Modern Build Systems, Automated Testing Infrastructure, Android Build and Test Tooling, Firmware Build Pipelines</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Stripe</Employername>
      <Employerlogo>https://logos.yubhub.co/stripe.com.png</Employerlogo>
      <Employerdescription>Stripe is a financial infrastructure platform for businesses, used by millions of companies worldwide.</Employerdescription>
      <Employerwebsite>https://stripe.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/stripe/jobs/7550154</Applyto>
      <Location>San Francisco, Seattle, Remote in US</Location>
      <Country></Country>
      <Postedate>2026-03-31</Postedate>
    </job>
    <job>
      <externalid>a34c84d3-af7</externalid>
      <Title>Software Engineer, Research Developer Productivity</Title>
      <Description><![CDATA[<p><strong>Job Posting</strong></p>
<p><strong>Software Engineer, Research Developer Productivity</strong></p>
<p><strong>Location</strong></p>
<p>San Francisco</p>
<p><strong>Employment Type</strong></p>
<p>Full time</p>
<p><strong>Department</strong></p>
<p>Scaling</p>
<p><strong>Compensation</strong></p>
<ul>
<li>$230K – $325K • Offers Equity</li>
</ul>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>
</ul>
<p>More details about our benefits are available to candidates during the hiring process.</p>
<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>
<p><strong>About the Team</strong></p>
<p>The Fleet team builds core components that enable productive research across OpenAI, from small experiments to state-of-the-art scale, with the goal of accelerating progress towards AGI. We frequently collaborate with other teams to speed up the development of new state-of-the-art capabilities.</p>
<p><strong>About the Role</strong></p>
<p>As we scale up with more researchers and engineers joining OpenAI, we seek a pragmatic and passionate engineer with a strong focus on the development experience for both engineers and scientists.</p>
<p>In this role, you will be responsible for building and maintaining the systems that allow our research and engineering organization to iteratively develop, test, and deploy new features reliably, with high velocity and a frictionless, fast development cycle.</p>
<p>You will help define and drive the vision for how we build, test, and deploy software. You will drive the design of our continuous integration pipelines and testing infrastructure, and provide training and support around our build system. Our current environment relies heavily on Python, Rust, and C++; you will take ownership of it and work to transform it into a state-of-the-art development experience for research.</p>
<p>Ultimately, your role will be to provide the necessary tools and metrics to support our fast-paced culture and ensure a stable, scalable platform for growth, while also fostering a seamless and low friction experience for OpenAI’s research.</p>
<p>This role is based in San Francisco, CA. For a San Francisco role, we use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.</p>
<p><strong>You might thrive in this role if you:</strong></p>
<ul>
<li>Have supported large monorepo development and deployment before</li>
<li>Are a proficient Python programmer working in large monorepos</li>
<li>Are proficient with Docker and Kubernetes</li>
<li>Are experienced in CI/CD</li>
</ul>
<p><strong>About OpenAI</strong></p>
<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$230K – $325K • Offers Equity</Salaryrange>
      <Skills>Python, Rust, C++, Docker, Kubernetes, CI/CD, large monorepo development, continuous integration pipelines, testing infrastructure</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products.</Employerdescription>
      <Employerwebsite>https://openai.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/e6d5ca02-f30b-4ac5-a69d-c947efb430f9</Applyto>
      <Location>San Francisco</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
  </jobs>
</source>