Tech Lead Manager- MLRE, ML Systems

539e2a23-ddf Tech Lead Manager- MLRE, ML Systems You will lead the development of our internal distributed framework for large language model training. The platform powers MLEs, researchers, data scientists, and operators for fast and automatic training and evaluation of LLMs. It also serves as the underlying training framework for the data quality evaluation pipeline.

You will work closely with Scale’s ML teams and researchers to build the foundation platform that supports all our ML research and development work. You will build and optimise the platform to enable our next generation of LLM training, inference and data curation.

Key responsibilities include:

Building, profiling and optimising our training and inference framework.
Collaborating with ML and research teams to accelerate their research and development, and enable them to develop the next generation of models and data curation.
Researching and integrating state-of-the-art technologies to optimise our ML system.

The ideal candidate will have:

A passion for system optimisation.
Experience with multi-node LLM training and inference.
Experience with developing large-scale distributed ML systems.
Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO etc.
Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc.

Nice-to-haves include demonstrated expertise in post-training methods and/or next-generation use cases for large language models, including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal.

XML job scraping automation by YubHub

]]> full-time senior hybrid $264,800-$331,000 USD system optimisation, multi-node LLM training and inference, large-scale distributed ML systems, post-training methods, software engineering skills, CUDA, PyTorch, transformers, flash attention, next generation use cases for large language models, instruction tuning, RLHF, tool use, reasoning, agents, multimodal Engineering Technology Scale https://logos.yubhub.co/scale.com.png Scale provides training and evaluation data and end-to-end solutions for the ML lifecycle. https://scale.com/ https://job-boards.greenhouse.io/scaleai/jobs/4618046005 San Francisco, CA; New York, NY 2026-04-18 840bab06-7be ML Research Engineer, ML Systems Job Description:

Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and operators for fast and automatic training and evaluation of LLMs, as well as evaluation of data quality.

At Scale, we're uniquely positioned at the heart of the field of AI as an indispensable provider of training and evaluation data and end-to-end solutions for the ML lifecycle. You will work closely across Scale's ML teams and researchers to build the foundation platform that supports all our ML research and development. You will be building and optimizing the platform to enable our next generation of LLM training, inference and data curation.

Responsibilities:

Build, profile and optimize our training and inference framework
Collaborate with ML teams to accelerate their research and development and enable them to develop the next generation of models and data curation
Research and integrate state-of-the-art technologies to optimize our ML system

Ideal Candidate:

Strong excitement about system optimization
Experience with multi-node LLM training and inference
Experience with developing large-scale distributed ML systems
Strong software engineering skills, proficient in frameworks and tools such as CUDA, PyTorch, transformers, flash attention, etc.
Strong written and verbal communication skills and the ability to operate in a cross-functional team environment

Nice to Have:

Demonstrated expertise in post-training methods and/or next-generation use cases for large language models, including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal.

Compensation Packages:

Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity-based compensation, subject to Board of Directors approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for an equity grant. You'll also receive benefits including, but not limited to: comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.

Please note that our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.

]]> full-time senior hybrid $189,600-$237,000 USD System Optimization, Multi-node LLM Training and Inference, Large-Scale Distributed ML Systems, CUDA, Pytorch, Transformers, Flash Attention, Post-Training Methods, Next Generation Use Cases for Large Language Models, Instruction Tuning, RLHF, Tool Use, Reasoning, Agents, Multimodal Engineering Technology Scale https://logos.yubhub.co/scale.com.png Scale develops reliable AI systems for the world's most important decisions, providing high-quality data and full-stack technologies for leading models. https://scale.com/ https://job-boards.greenhouse.io/scaleai/jobs/4534631005 San Francisco, CA; Seattle, WA; New York, NY 2026-04-18 4310ad06-74d Senior Product Manager, Copilot AI We are looking for a Senior Product Manager to drive Copilot model capabilities such as tool-use to ensure that the language models that power Microsoft Copilot deliver high quality responses to our users whilst being grounded, reliable, and cost-efficient.

As a Senior Product Manager, you will work at the nexus of product and research, driving execution in partnership with engineers, language engineers, data scientists and researchers. You will develop and execute on an LLM platform strategy for Copilot that extends language models' capabilities. You will prototype approaches by steering language models to drive response quality across a wide range of scenarios. You will identify and prioritise platform, orchestration and language model issues that impact quality, factuality and safety, and work with engineers and researchers to find a path to resolution.

You will define and build measurable evaluations with relevant datasets to demonstrate quality improvements. You will define, deploy and manage experiments in production that impact language models' tool use, driving measurable improvements in relevance for and engagement with Copilot users. You will partner with product teams to scale tool building and work with inference, agents and orchestration teams to resolve dependencies. You will be accountable for the status of key projects, proactively identifying risks and proposing solutions to ensure timely delivery.

Responsibilities include:

Developing and executing on an LLM platform strategy for Copilot that extends language models' capabilities
Prototyping approaches by steering language models to drive response quality across a wide range of scenarios
Identifying and prioritising platform, orchestration and language model issues that impact quality, factuality and safety and working with engineers and researchers to find a path to resolution
Defining and building measurable evaluations with relevant datasets to demonstrate quality improvements
Defining, deploying and managing experiments in production that impact language models' tool use, driving measurable improvements in relevance for and engagement with Copilot users
Partnering with product teams to scale tool building and working with inference, agents and orchestration teams to resolve dependencies
Being accountable for the status of key projects, proactively identifying risks and proposing solutions to ensure timely delivery

]]> full-time senior hybrid $119,800 - $234,700 per year LLM APIs, embeddings, vector databases, tool use, prompt design, context window management, model evaluation, product management, hands-on experience with LLM APIs, experience with language model development Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a technology company that creates products and services. It is a part of the larger Microsoft organisation. https://microsoft.ai https://microsoft.ai/job/senior-product-manager-copilot-ai-2/ Redmond 2026-03-08 7f0a1aea-7d3 Head of Agent Ops Compensation

$170K – $215K • 0.01% – 0.2%

Head of Agent Ops

You'll own the internal AI infrastructure that makes our team unreasonably fast. That means building, evaluating vendors, and continuously evolving the AI systems our team runs on — with the goal of maximizing every person's clock speed and scaling our ability to deploy agents across the business.

Salary Range:

$170,000–$215,000/year (Range shown is for U.S.-based employees in San Francisco, CA. Compensation outside the U.S. is adjusted fairly based on your country’s cost of living. See https://www.firecrawl.dev/careers/compensation for how we calculate this.)

Equity Range:

Up to 0.20%

Location:

San Francisco, CA (Hybrid); Truly exceptional remote considered.

Job Type:

Full-Time (SF)

Experience:

5+ years or equivalent shipped systems

Visa:

US Citizenship/Visa required for SF

About Firecrawl

Firecrawl is the easiest way to extract data from the web. Developers use us to reliably convert URLs into LLM-ready markdown or structured data with a single API call.

What We're Looking For

A technical practitioner, not a theorist.

You understand how models actually work — not just API calls, but attention, context windows, inference tradeoffs, tool use patterns. You read papers. But your first instinct is always to build, not to theorize. You've shipped real systems that automate real work.

Someone with battle-tested opinions on AI-assisted development.

You've pushed vibe coding techniques far enough to know where they break. You have strong, experience-driven opinions about what works and what doesn't — grounded in first principles, not hype. You know when to let the model drive and when to take the wheel.

An automation obsessive.

You've already automated your own life to a degree that others find somewhat unhinged. You see manual processes the way most people see bugs — something that shouldn't exist and won't for long.

Someone who understands the bitter lesson.

You know where the highest-leverage opportunities are because you understand which problems get solved by scale and which don't. You allocate your energy accordingly.

$10k+/month token budget

(will increase with demonstrated impact).

What We're NOT Looking For

AI skeptics.

If you think superintelligence is hundreds of years away, this isn't your role.

Comfort optimizers.

We're looking for people who seek discomfort and aim to make a difference — not optimize for comfy vibes.

Status chasers.

If you're here because AI is in the news, you're not a fit. If you believe we're at a once-in-a-species inflection point and you want to shape the wave, not just ride it — this is the place for you.

A Note On Pace

We operate at an absurd level of urgency because the window for what we're building won't stay open forever. If that excites you, keep reading. If it doesn't, no hard feelings — but this role probably isn't for you.

Benefits & Perks

Available to all employees

Salary that makes sense — $170,000–215,000/year (SF, U.S.-based), based on impact, not tenure

Own a piece — Up to 0.20% equity in what you're helping build

Generous PTO — 15 days mandatory, anything after 24 days, just ask (holidays excluded); take the time you need to recharge

Parental leave — 12 weeks fully paid, for moms and dads

Wellness stipend — $100/month for the gym, therapy, massages, or whatever keeps you human

Learning & Development - Expense up to $1,000/year toward anything that helps you grow professionally

Team offsites — A change of scenery, minus the trust falls

Sabbatical— 3 paid months off after 4 years, do something fun and new

Available to US-based full-time employees

Full coverage, no red tape — Medical, dental, and vision (100% for employees, 50% for spouse/kids) — no weird loopholes, just care that works

Life & Disability insurance — Employer-paid short-term disability, long-term disability, and life insurance — coverage for life's curveballs

Supplemental options — Optional accident, critical illness, hospital indemnity, and voluntary life insurance for extra peace of mind

Doctegrity telehealth — Talk to a doctor from your couch

401(k) plan — Retirement might be a ways off, but future-you will thank you

Pre-tax benefits — Access to FSAs and commuter benefits (US-only) to help your wallet out a bit

Pet insurance — Because fur babies are family too

Available to SF-based employees

SF HQ perks — Snacks, drinks, team lunches, intense ping pong, and peak startup energy

E-Bike transportation — A loaner electric bike to get you around the city, on us

Interview Process

Application Review – Send us your stuff + a quick note on why this excites you (plus links to things you’ve built).

Intro Chat (~20–25 min) – Quick alignment call focused on what you’ve shipped, how you think about agents vs workflows, and what you’d automate first at Firecrawl.

Async Systems Design (60–90 min) – A short, structured design exercise: propose a v1 agent/workflow to eliminate a real internal bottleneck. We’re looking for taste, guardrails, eval/monitoring, and a rollout plan — not a big build.

Founder Chat (~30 min) – Culture, pace, ownership, and how you like to work. Time for your questions too.

Paid Work Trial (1–2 weeks) – Test drive the real thing: ship something real, get feedback, and iterate.

]]> Full time senior Hybrid $170K – $215K • 0.01% – 0.2% AI infrastructure, vendor evaluation, continuous evolution, agent systems, workflow automation, design, evaluation, monitoring, rollout plan, attention, context windows, inference tradeoffs, tool use patterns, vibe coding techniques, automation, scale, energy allocation Engineering Technology Firecrawl https://logos.yubhub.co/firecrawl.com.png Firecrawl is a small, fast-moving, technical team building essential infrastructure super-intelligence will use to gather data on the web. They've hit 8 figures in ARR and 80k+ GitHub stars by building the fastest way for developers to get LLM-ready data. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/firecrawl/d8122162-754a-47fb-9ce7-51fdf6320a3f San Francisco, CA (Hybrid) 2026-03-08 8ee55a18-4c1 Researcher, Automated Red Teaming Location

San Francisco

Employment Type

Full time

Department

Safety Systems

Compensation

Estimated Base Salary $295K – $445K

The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

401(k) retirement plan with employer match

Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

Mental health and wellness support

Employer-paid basic life and disability coverage

Annual learning and development stipend to fuel your professional growth

Daily meals in our offices, and meal delivery credits as eligible

Relocation support for eligible employees

Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the team

The Safety Systems org ensures that OpenAI’s most capable models can be responsibly developed and deployed. We build evaluations, safeguards, and safety frameworks that help our models behave as intended in real-world settings.

The Preparedness team is an important part of the Safety Systems org at OpenAI, and is guided by OpenAI’s Preparedness Framework.

Frontier AI models have the potential to benefit all of humanity, but also pose increasingly severe risks. To ensure that AI promotes positive change, the Preparedness team helps us prepare for the development of increasingly capable frontier AI models. This team is tasked with identifying, tracking, and preparing for catastrophic risks related to frontier AI models.

The mission of the Preparedness team is to:

Closely monitor and predict the evolving capabilities of frontier AI systems, with an eye towards risks whose impact could be catastrophic
Ensure we have concrete procedures, infrastructure and partnerships to mitigate these risks and to safely handle the development of powerful AI systems

Preparedness tightly connects capability assessment, evaluations, internal red teaming, and mitigations for frontier models, as well as overall coordination on AGI preparedness. This is fast-paced, exciting work that has far-reaching importance for the company and for society.

About the role

This role leads the Automated Red Teaming (ART) effort: building scalable, research-driven systems that continuously discover failure modes in our models and mitigations — and translate those findings into actionable, production-facing improvements. The goal is to maximize counterfactual reduction in expected harm by finding the highest-leverage, least-covered weaknesses early and reliably.

In this role you will

You will own the research and technical direction for automated red teaming across catastrophic risk areas, with an initial emphasis on:

Automated classifier jailbreak discovery (cyber and bio)
Automated bio threat-development elicitation (worst-feasible planning uplift)
CoT monitoring evasion probing (and adjacent loss-of-control evaluations)

You will partner tightly with:

Vertical risk teams (Cyber, Bio, Loss of Control) to define threat models, prioritize targets, and land mitigations
The Classifiers team to turn discovered attacks into training data, evals, and measurable robustness gains
Product / eng / safety stakeholders to ensure ART outputs are operationally useful (not just interesting)

You might thrive in this role if you:

Feel a strong pull toward AI safety, and you’re motivated by reducing real-world catastrophic risk (not just publishing cool results)
Love breaking systems (responsibly) — you get energy from finding weird, high-severity failure modes and turning them into concrete fixes
Have strong applied research instincts, especially around evaluations: you’re good at designing experiments that are reproducible, interpretable, and hard to fool
Bring hands-on experience with LLMs and agents, including multi-turn behaviors, tool use, and the ways models adapt to constraints
Are comfortable building scalable automation, not just prototypes — you can turn red-teaming ideas into pipelines that run continuously and produce high-signal outputs
Have solid software engineering fundamentals (data structures, algorithms, testing discipline) and you can work effectively in a production-adjacent environment
Think in threat models and incentives, and you naturally ask “what would an attacker do next?” or “how would this fail under pressure?”
Can translate messy findings into action, communicating clearly with researchers, engineers, product, and policy — and driving alignment on what to fix first
Care about efficiency and prioritization, and you’re happy to say “no” to low-level

]]> full-time senior onsite $295K – $445K Applied research, Automated red teaming, Catastrophic risk assessment, Classifier jailbreak discovery, Cybersecurity, Data structures, Evaluations, LLMs and agents, Loss-of-control evaluations, Multi-turn behaviors, Red teaming, Scalable automation, Software engineering, Threat models, Tool use, Bio threat-development elicitation, CoT monitoring evasion probing Engineering Technology OpenAI https://logos.yubhub.co/openai.com.png OpenAI is a technology company that specializes in developing and training artificial intelligence models. It was founded in 2015 and is headquartered in San Francisco, California. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/openai/bf7d2623-7846-410c-87f8-c628915ec16c San Francisco 2026-03-06 a938a934-817 Software Engineer, Applied Evals Software Engineer, Applied Evals

Location

San Francisco

Employment Type

Full time

Location Type

Hybrid

Department

Applied AI

Compensation

$230K – $325K • Offers Equity

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

401(k) retirement plan with employer match

Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

Mental health and wellness support

Employer-paid basic life and disability coverage

Annual learning and development stipend to fuel your professional growth

Daily meals in our offices, and meal delivery credits as eligible

Relocation support for eligible employees

Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the team

Applied Evals defines what good looks like for safe, advanced AI systems. We turn complex, high-value workflows into clear, reproducible signals that guide model training and product quality. Our work bridges frontier customers and models, ensuring improvements show up where users experience them. We combine hands-on, unscalable efforts with systems that others can extend, creating a compounding loop of model improvement.

About the Role

We’re hiring product-minded engineers to design and build evals and harnesses that capture real-world quality for advanced AI systems. You’ll own the loop from prototyping with users to building reliable pipelines and integrating signals into training stacks. This role sits at the center of model improvement. The systems you design will directly shape how models behave, accelerate their reliability, and raise the standard for what customers expect.

You’ll collaborate closely with research and product teams and work across the stack, from backend pipelines to user-facing interfaces. The work includes evaluating multi-turn and tool-using systems, designing agent harnesses, and applying reinforcement learning and related methods in production settings. Engineers who succeed in this role bring both a builder’s mindset and the judgment to create reusable systems that others can build on. Many thrive here by operating like founders or founding engineers, taking initiative, moving quickly, and creating structure where none exists.

This role is based in our San Francisco HQ. We use a hybrid work model of 3 days in the office per week and offer relocation assistance.

In this role, you will:

Define the core evaluation signals that drive model improvement at OpenAI, turning vague product gaps into crisp, defensible measures of quality

Design agents, harnesses, and eval pipelines that are reliable, reproducible, and extendable

Prototype solutions with real workflows and convert them into scalable feedback loops

Connect evaluation signals directly to research and training systems so product improvements show up in what users experience

Shape model interaction paradigms by partnering with engineering, research, and product teams on how models are deployed and measured

Build reusable systems and tools that enable contributions from across the company and steadily raise the quality bar

You’ll thrive in this role if you:

Bring 4+ years of experience in software engineering with strong fundamentals and a track record of shipping production systems end-to-end

Have experience building AI agents or applications, including designing evals and improving performance through prompting or scaffolding

Are familiar with evaluation methods for LLMs and have worked with patterns like multi-agent workflows, tool use, or long context

Are familiar with deep learning concepts or have prior exposure to training models

Communicate clearly with technical and non-technical audiences at all levels

Are motivated by high-impact collaboration with research and product teams and thrive in ambiguity

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

]]> full-time senior hybrid $230K – $325K Software engineering, AI agents or applications, Evaluation methods for LLMs, Deep learning concepts, Training models, Reinforcement learning, Multi-agent workflows, Tool use, Long context Engineering Technology OpenAI https://logos.yubhub.co/openai.com.png OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/openai/99121e6d-a542-4881-968f-4cd89d9f583c San Francisco 2026-03-06 c224e1d4-cc6 Backend Software Engineer (Evals) Location

San Francisco; Seattle

Employment Type

Full time

Department

Applied AI

Compensation

$230K – $385K • Offers Equity

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

401(k) retirement plan with employer match

Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

Mental health and wellness support

Employer-paid basic life and disability coverage

Annual learning and development stipend to fuel your professional growth

Daily meals in our offices, and meal delivery credits as eligible

Relocation support for eligible employees

Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the Team

The Support Automation team at OpenAI scales the organization by applying cutting-edge AI models to real-world challenges, automating and enhancing work across the organization. From customer operations to engineering, we develop an ecosystem of automation products that empower our colleagues and drive impact. We're passionate about crafting products that serve those around us, blending rapid prototyping with a focus on long-term quality and reliability. By creating reusable solutions, we create patterns that can be applied across diverse domains within OpenAI.

TLDR: this team leverages OpenAI technology to improve OpenAI, and you’ll have the opportunity to leverage the full extent of our tech (both public and pre-released) to accomplish this mission.

About the Role

We’re looking for a Backend Software Engineer with experience working in ML/LLM-heavy domains to help design and build evals infrastructure that measures the quality of OpenAI’s support automation. This is a deeply technical and highly cross-functional role where you’ll build robust systems and backend services that serve as the foundation for how knowledge is created, accessed, and applied across OpenAI. The role will especially focus on working closely with Data Science and Research partners to design and build evals at scale.

In this role, you will:

Design eval pipelines that are reliable, reproducible, and extendable

Build the infrastructure for continuous eval monitoring frameworks (regression/drift monitoring, building robust golden datasets) along with feedback loops that ultimately strengthen support automation

Design, build, and maintain backend services and APIs to support intelligent automation and knowledge systems

Integrate and structure data across internal platforms, transforming it into formats optimized for use by downstream systems and AI workflows.

Collaborate closely with data, research, and engineering teams to integrate OpenAI models into high-leverage workflows

Own the full development lifecycle of new backend systems and internal platform capabilities

Build with scale and maintainability in mind, while rapidly iterating on new ideas

You might be a great fit if you have:

4+ years of backend engineering experience at product-driven companies (excluding internships)

Proficiency in backend technologies. Our tech stack includes Python, FastAPI, and Postgres

Experience designing and scaling distributed systems, APIs, or data processing pipelines

Experience building AI agents or applications, including designing evals and improving performance through prompting or scaffolding

Familiarity with evaluation methods for LLMs and experience with patterns like multi-agent workflows, tool use, or long context

Experience creating production evals and/or measuring performance of ML/LLM models at scale

A pragmatic mindset. You’re comfortable shipping iteratively while building toward a long-term vision

About OpenAI

]]> full-time senior hybrid $230K – $385K backend engineering, Python, FastAPI, Postgres, distributed systems, APIs, data processing pipelines, AI agents, evaluation methods for LLMs, ML/LLM-heavy domains, designing evals, improving performance through prompting or scaffolding, multi-agent workflows, tool use, long context Engineering Technology OpenAI https://logos.yubhub.co/openai.com.png OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. The company pushes the boundaries of the capabilities of AI systems and seeks to safely deploy them to the world through their products. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/openai/3d064454-c0c3-4225-bc2c-6d8c0f8735b2 San Francisco; Seattle 2026-03-06 146f30de-73a Principal Applied Scientist Summary

Microsoft AI are looking for a talented Principal Applied Scientist at their Beijing office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that's revolutionising the field of AI. You'll work directly with leadership to shape the company's direction in the AI market.

About the Role

We are seeking an Applied Scientist / AI Architect with strong hands-on experience in building and optimizing large language models (LLMs), agentic AI systems, and end-to-end model training workflows. This role is ideal for scientists with a solid applied background who can translate state-of-the-art research into real-world impact. A research-oriented mindset with publications in top AI/ML venues is highly preferred but not strictly required.

Accountabilities

Design and implement advanced LLM-based architectures and agentic systems for real-world product scenarios.
Translate research breakthroughs into production-ready algorithms, contributing to core capabilities such as reasoning, planning, long-term memory, and code-generation-based design.
Monitor and improve model performance post-deployment through data-driven iteration and error analysis.
Collaborate across teams to deliver robust, scalable models aligned with product objectives and user value.
Contribute to the organization’s scientific direction by identifying research opportunities that drive long-term differentiation.

The Candidate we're looking for

Experience:

M.S. or Ph.D. in Computer Science, Machine Learning, or a related field, or equivalent practical experience.
5+ years of experience in applied machine learning, with a focus on LLMs, agent systems, or reinforcement learning.

Technical skills:

Strong hands-on experience with prompt engineering, context engineering, retrieval-augmented generation (RAG), tool use, planning agents, and long-context modeling, etc.
Familiarity with model training pipelines using PyTorch, TensorFlow, JAX, or similar frameworks, evaluation strategies, and model deployment best practices.

Personal attributes:

Strong coding and debugging skills, and comfort working in cross-functional, agile environments.

Benefits

Competitive salary and benefits package.
Opportunities for professional growth and development.
Collaborative and dynamic work environment.
Access to cutting-edge technology and resources.

]]> full-time senior onsite Competitive salary and benefits package Applied machine learning, LLMs, Agent systems, Reinforcement learning, PyTorch, TensorFlow, JAX, Prompt engineering, Context engineering, RAG, Tool use, Planning agents, Long-context modeling Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading company in the field of artificial intelligence, with a mission to innovate and push the boundaries of what is possible with AI. They have a strong focus on research and development, and are constantly looking for new ways to apply AI to real-world problems. https://microsoft.ai https://microsoft.ai/job/principal-applied-scientist-12/ Beijing, China 2026-03-06 741e1ef8-936 Senior Applied Scientist Summary

Microsoft are looking for a talented Senior Applied Scientist at their Suzhou office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that's revolutionising the AI and engineering system. You'll work directly with leadership to shape the company's direction in the global Office users market.

About the Role

We are seeking a highly skilled and motivated Applied Scientist with strong hands-on experience in building and optimizing agentic AI systems. Our mission is to raise the productivity of Office users with rich content and tool support, and we are building the AI and engineering system to make it happen. This position offers an exciting opportunity to design and develop highly complex, comprehensive systems that combine engineering, AI, and human participation. You will work closely with product managers, designers, and users to turn ideas and AI into reality and make a great impact on Office users worldwide.

Accountabilities

Design and implement advanced LLM-based architectures and agentic systems for real-world product scenarios.
Translate research breakthroughs into production-ready algorithms, contributing to core capabilities such as reasoning, planning, long-term memory, and code-generation-based design.
Monitor and improve model performance post-deployment through data-driven iteration and error analysis.
Collaborate across teams to deliver robust, scalable models aligned with product objectives and user value.
Contribute to the organization’s scientific direction by identifying research opportunities that drive long-term differentiation.

The Candidate we're looking for

Experience:

M.S. or Ph.D. in Computer Science, Machine Learning, or a related field, or equivalent practical experience.
5+ years of experience in applied machine learning, with a focus on LLMs, agent systems, or reinforcement learning.

Technical skills:

Strong hands-on experience with prompt engineering, context engineering, retrieval-augmented generation (RAG), tool use, planning agents, and long-context modeling, etc.
Familiarity with model training pipelines using PyTorch, TensorFlow, JAX, or similar frameworks, evaluation strategies, and model deployment best practices.

Personal attributes:

Strong coding and debugging skills, and comfort working in cross-functional, agile environments.
Experience with office document generation and related applications is a plus.

Benefits

Competitive salary and benefits package.
Opportunities for professional growth and development.
Collaborative and dynamic work environment.
Access to cutting-edge technology and resources.
Flexible work arrangements and work-life balance.

]]> full-time senior onsite Competitive salary and benefits package Machine Learning, Artificial Intelligence, Python, PyTorch, TensorFlow, JAX, Prompt Engineering, Context Engineering, Retrieval-Augmented Generation (RAG), Tool Use, Planning Agents, Long-Context Modeling Engineering Technology Microsoft https://logos.yubhub.co/microsoft.ai.png Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices. They are a leader in the technology industry and have a strong presence in the global market. Microsoft is known for its innovative products and services, such as Windows, Office, and Azure. https://microsoft.ai https://microsoft.ai/job/senior-applied-scientist-34/ Suzhou 2026-03-06 18b450e2-d82 Senior Applied Scientist Summary

Microsoft are looking for a highly skilled and motivated Applied Scientist with strong hands-on experience in building and optimizing agentic AI systems. Our mission is to benefit Office users with rich content and tool support to raise productivity, and we are building the AI and engineering system to make it happen.

About the Role

Accountabilities

Design and implement advanced LLM-based architectures and agentic systems for real-world product scenarios.
Translate research breakthroughs into production-ready algorithms, contributing to core capabilities such as reasoning, planning, long-term memory, and code-generation-based design.
Monitor and improve model performance post-deployment through data-driven iteration and error analysis.
Collaborate across teams to deliver robust, scalable models aligned with product objectives and user value.
Contribute to the organization’s scientific direction by identifying research opportunities that drive long-term differentiation.

The Candidate we're looking for

Experience:

M.S. or Ph.D. in Computer Science, Machine Learning, or a related field, or equivalent practical experience.
5+ years of experience in applied machine learning, with a focus on LLMs, agent systems, or reinforcement learning.

Technical skills:

Strong hands-on experience with prompt engineering, context engineering, retrieval-augmented generation (RAG), tool use, planning agents, and long-context modeling, etc.
Familiarity with model training pipelines using PyTorch, TensorFlow, JAX, or similar frameworks, evaluation strategies, and model deployment best practices.

Personal attributes:

Strong coding and debugging skills, and comfort working in cross-functional, agile environments.
Experience with office document generation and related applications is a plus.

Benefits

Competitive salary and benefits package.
Opportunities for professional growth and development.
Collaborative and dynamic work environment.
Access to cutting-edge technology and resources.
Flexible work arrangements and work-life balance.

]]> full-time senior onsite Competitive salary and benefits package Machine Learning, Artificial Intelligence, Python, PyTorch, TensorFlow, JAX, Prompt Engineering, Context Engineering, Retrieval-Augmented Generation, Tool Use, Planning Agents, Long-Context Modeling Engineering Technology Microsoft https://logos.yubhub.co/microsoft.ai.png Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices. They are a leader in the technology industry and have a strong presence in the global market. https://microsoft.ai https://microsoft.ai/job/senior-applied-scientist-13/ Beijing 2026-03-06 15ef0497-1bc Senior Product Manager, Copilot AI Summary

Microsoft AI are looking for a talented Senior Product Manager, Copilot AI at their Mountain View office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that's revolutionising AI technology. You'll work directly with leadership to shape the company's direction in the AI landscape.

About the Role

As a Senior Product Manager, you will drive Copilot model capabilities such as tool use to ensure that the language models that power Microsoft Copilot deliver high-quality responses to our users whilst being grounded, reliable, and cost-efficient. You will work at the nexus of product and research, driving execution in partnership with engineers, language engineers, data scientists, and researchers.

Accountabilities

Develop and execute an LLM platform strategy for Copilot that extends language models’ capabilities.
Prototype approaches by steering language models to drive response quality across a wide range of scenarios.

The Candidate we're looking for

Experience:

5+ years of experience in product management.
3+ years of experience leading ambiguous product areas, defining requirements, developing roadmaps, and working with multi-disciplinary teams to execute them.

Technical skills:

Hands-on experience with LLM APIs (e.g. OpenAI, Anthropic, Azure OpenAI), embeddings, vector databases, and tool use.
Hands-on experience with prompt design, context window management, and model evaluation.

Personal attributes:

Abundance of positive energy, empathy, and kindness.
Highly effective.

Benefits

Competitive salary.
Benefits and other compensation.

]]> full-time senior onsite USD $119,800 – $234,700 per year product management, LLM APIs, embeddings, vector databases, tool use, prompt design, context window management, model evaluation, positive energy, empathy, kindness, highly effective Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is pushing the boundaries of technology, creating unique, beautiful and powerful products that will change lives. As a small, friendly, fast-moving team, we support each other to do the best work of our lives, always looking to break new ground, fast. https://microsoft.ai https://microsoft.ai/job/senior-product-manager-copilot-ai/ Mountain View 2026-03-06