<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>b1be4c11-417</externalid>
      <Title>Senior Research Scientist, Reward Models</Title>
<Description><![CDATA[<p>As a Senior Research Scientist on our Reward Models team, you&#39;ll lead research efforts to improve how we specify and learn human preferences at scale. Your work will directly shape how our models understand and optimize for what humans actually want, enabling Claude to be more useful, more reliable, and better aligned with human values.</p>
<p>This role focuses on pushing the frontier of reward modeling for large language models. You&#39;ll develop novel architectures and training methodologies for RLHF, research new approaches to LLM-based evaluation and grading (including rubric-based methods), and investigate techniques to identify and mitigate reward hacking. You&#39;ll collaborate closely with teams across Anthropic, including Finetuning, Alignment Science, and our broader research organization, to ensure your work translates into concrete improvements in both model capabilities and safety.</p>
<p>We&#39;re looking for someone who can drive ambitious research agendas while also shipping practical improvements to production systems. You&#39;ll have the opportunity to work on some of the most important open problems in AI alignment, with access to frontier models and significant computational resources. Your work will directly advance the science of how we train AI systems to be both highly capable and safe.</p>
<p>Responsibilities:</p>
<ul>
<li>Lead research on novel reward model architectures and training approaches for RLHF</li>
<li>Develop and evaluate LLM-based grading and evaluation methods, including rubric-driven approaches that improve consistency and interpretability</li>
<li>Research techniques to detect, characterize, and mitigate reward hacking and specification gaming</li>
<li>Design experiments to understand reward model generalization, robustness, and failure modes</li>
<li>Collaborate with the Finetuning team to translate research insights into improvements for production training pipelines</li>
<li>Contribute to research publications, blog posts, and internal documentation</li>
<li>Mentor other researchers and help build institutional knowledge around reward modeling</li>
</ul>
<p>You may be a good fit if you:</p>
<ul>
<li>Have a track record of research contributions in reward modeling, RLHF, or closely related areas of machine learning</li>
<li>Have experience training and evaluating reward models for large language models</li>
<li>Are comfortable designing and running large-scale experiments with significant computational resources</li>
<li>Can work effectively across research and engineering, iterating quickly while maintaining scientific rigor</li>
<li>Enjoy collaborative research and can communicate complex ideas clearly to diverse audiences</li>
<li>Care deeply about building AI systems that are both highly capable and safe</li>
</ul>
<p>Strong candidates may also:</p>
<ul>
<li>Have published research on reward modeling, preference learning, or RLHF</li>
<li>Have experience with LLM-as-judge approaches, including calibration and reliability challenges</li>
<li>Have worked on reward hacking, specification gaming, or related robustness problems</li>
<li>Have experience with constitutional AI, debate, or other scalable oversight approaches</li>
<li>Have contributed to production ML systems at scale</li>
<li>Have familiarity with interpretability techniques as applied to understanding reward model behavior</li>
</ul>
<p>The annual compensation range for this role is $350,000-$500,000 USD.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$350,000-$500,000 USD</Salaryrange>
      <Skills>reward modeling, RLHF, LLM-based evaluation and grading, rubric-driven approaches, reward hacking, specification gaming, large-scale experiments, computational resources, research and engineering, collaborative research, complex ideas communication, AI systems development, published research, LLM-as-judge approaches, calibration and reliability challenges, constitutional AI, debate, scalable oversight approaches, production ML systems, interpretability techniques</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.com.png</Employerlogo>
      <Employerdescription>Anthropic creates reliable, interpretable, and steerable AI systems. It is a public benefit corporation headquartered in San Francisco.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/5024835008</Applyto>
      <Location>Remote-Friendly (Travel Required) | San Francisco, CA</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>8549c317-12f</externalid>
      <Title>Senior Research Scientist, Reward Models</Title>
      <Description><![CDATA[<p>As a Senior Research Scientist on our Reward Models team, you&#39;ll lead research efforts to improve how we specify and learn human preferences at scale.</p>
<p>Your work will directly shape how our models understand and optimize for what humans actually want, enabling Claude to be more useful, more reliable, and better aligned with human values.</p>
<p>This role focuses on pushing the frontier of reward modeling for large language models. You&#39;ll develop novel architectures and training methodologies for RLHF, research new approaches to LLM-based evaluation and grading (including rubric-based methods), and investigate techniques to identify and mitigate reward hacking.</p>
<p>You&#39;ll collaborate closely with teams across Anthropic, including Finetuning, Alignment Science, and our broader research organization, to ensure your work translates into concrete improvements in both model capabilities and safety.</p>
<p>We&#39;re looking for someone who can drive ambitious research agendas while also shipping practical improvements to production systems. You&#39;ll have the opportunity to work on some of the most important open problems in AI alignment, with access to frontier models and significant computational resources.</p>
<p>Your work will directly advance the science of how we train AI systems to be both highly capable and safe.</p>
<p>Responsibilities:</p>
<ul>
<li>Lead research on novel reward model architectures and training approaches for RLHF</li>
<li>Develop and evaluate LLM-based grading and evaluation methods, including rubric-driven approaches that improve consistency and interpretability</li>
<li>Research techniques to detect, characterize, and mitigate reward hacking and specification gaming</li>
<li>Design experiments to understand reward model generalization, robustness, and failure modes</li>
<li>Collaborate with the Finetuning team to translate research insights into improvements for production training pipelines</li>
<li>Contribute to research publications, blog posts, and internal documentation</li>
<li>Mentor other researchers and help build institutional knowledge around reward modeling</li>
</ul>
<p>You may be a good fit if you:</p>
<ul>
<li>Have a track record of research contributions in reward modeling, RLHF, or closely related areas of machine learning</li>
<li>Have experience training and evaluating reward models for large language models</li>
<li>Are comfortable designing and running large-scale experiments with significant computational resources</li>
<li>Can work effectively across research and engineering, iterating quickly while maintaining scientific rigor</li>
<li>Enjoy collaborative research and can communicate complex ideas clearly to diverse audiences</li>
<li>Care deeply about building AI systems that are both highly capable and safe</li>
</ul>
<p>Strong candidates may also:</p>
<ul>
<li>Have published research on reward modeling, preference learning, or RLHF</li>
<li>Have experience with LLM-as-judge approaches, including calibration and reliability challenges</li>
<li>Have worked on reward hacking, specification gaming, or related robustness problems</li>
<li>Have experience with constitutional AI, debate, or other scalable oversight approaches</li>
<li>Have contributed to production ML systems at scale</li>
<li>Have familiarity with interpretability techniques as applied to understanding reward model behavior</li>
</ul>
<p>The annual compensation range for this role is $350,000-$500,000 USD.</p>
<p>Logistics:</p>
<ul>
<li>Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience</li>
<li>Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience</li>
<li>Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position</li>
</ul>
<p>Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</p>
<p>Visa sponsorship: We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</p>
<p>We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</p>
<p>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links; visit anthropic.com/careers directly for confirmed position openings.</p>
<p>How we&#39;re different:</p>
<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact, advancing our long-term goals of steerable, trustworthy AI, rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>
<p>The easiest way to understand our research directions is to read our recent research. This research continues many of the directions our team worked on prior to Anthropic, including: GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI &amp; Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.</p>
<p>Come work with us!</p>
<p>Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$350,000-$500,000 USD</Salaryrange>
      <Skills>reward modeling, RLHF, large language models, novel architectures, training methodologies, evaluation and grading, rubric-based methods, reward hacking, specification gaming, generalization, robustness, failure modes, computational resources, scientific rigor, communication skills, interpretability techniques</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.com.png</Employerlogo>
      <Employerdescription>Anthropic is a public benefit corporation that aims to create reliable, interpretable, and steerable AI systems.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/5024835008</Applyto>
      <Location>Remote-Friendly (Travel Required) | San Francisco, CA</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>d2f5b1e5-545</externalid>
      <Title>Research Scientist, Gemini Safety</Title>
<Description><![CDATA[<p>We&#39;re seeking a versatile Research Scientist to join our Gemini Safety team. As a Research Scientist, you will apply and develop cutting-edge data and algorithmic solutions to advance our latest user-facing models. Your work will focus on advancing the safety and fairness behavior of state-of-the-art AI models, driving the development of foundational technology adopted by numerous product areas, including Gemini App, Cloud API, and Search.</p>
<p>Key responsibilities include:</p>
<ul>
<li>Post-training/instruction tuning state-of-the-art LLMs, focusing on text-to-text, image/video/audio-to-text modalities and agentic capabilities</li>
<li>Exploring data, reasoning, and algorithmic solutions to ensure Gemini Models are safe, maximally helpful, and work for everyone</li>
<li>Improve Gemini&#39;s adversarial robustness, with a focus on high-stakes abuse risks</li>
<li>Design and maintain high-quality evaluation protocols to assess model behavior gaps and headroom related to safety and fairness</li>
<li>Develop and execute experimental plans to address known gaps, or construct entirely new capabilities</li>
<li>Drive innovation and enhance understanding of Supervised Fine Tuning and Reinforcement Learning fine-tuning at scale</li>
</ul>
<p>To succeed as a Research Scientist in the Gemini Safety team, we look for the following skills and experience:</p>
<ul>
<li>PhD in Computer Science, a related field, or equivalent practical experience</li>
<li>Significant LLM post-training experience</li>
<li>Experience in Reward Modeling and Reinforcement Learning for LLM instruction tuning</li>
<li>Experience with Long-range Reinforcement Learning</li>
<li>Experience in areas such as Safety, Fairness, and Alignment</li>
<li>Track record of publications at NeurIPS, ICLR, ICML</li>
<li>Experience taking research from concept to product</li>
<li>Experience with collaborating or leading an applied research project</li>
<li>Strong experimental taste: Good judgment regarding baselines, ablations, and what is worth testing</li>
<li>Experience with JAX</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>PhD in Computer Science, LLM post-training experience, Reward modeling and Reinforcement Learning for LLMs Instruction tuning, Long-range Reinforcement learning, Safety, Fairness, and Alignment, NeurIPS, ICLR, ICML publications, Research from concept to product, Collaborating or leading an applied research project, JAX</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Google DeepMind</Employername>
      <Employerlogo>https://logos.yubhub.co/deepmind.com.png</Employerlogo>
      <Employerdescription>Google DeepMind is a subsidiary of Alphabet Inc., a multinational conglomerate.</Employerdescription>
      <Employerwebsite>https://deepmind.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/deepmind/jobs/7731944</Applyto>
      <Location>Zurich, Switzerland</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>540ce49c-271</externalid>
      <Title>Member of Technical Staff - Multimodal Understanding</Title>
      <Description><![CDATA[<p><strong>About the Role</strong></p>
<p>You will join the multimodal team to push toward superhuman multimodal intelligence. You&#39;ll advance understanding and generation across modalities (image, video, audio, and text), spanning the full stack: data curation/acquisition, tokenizer training, large-scale pre-training, post-training/alignment, infrastructure/scaling, evaluation, tooling/demos, and end-to-end product experiences.</p>
<p>Collaborate cross-functionally with pre-training, post-training, reasoning, data, applied, and product teams to deliver frontier capabilities in multimodal reasoning, world modeling, tool use, agentic behaviors, and interactive human-AI collaboration. Contribute to building models that can see, hear, reason about, and interact with the world in real time at unprecedented levels.</p>
<p><strong>Responsibilities</strong></p>
<ul>
<li>Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale.</li>
<li>Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text).</li>
<li>Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding &amp; generation, real-time video processing, and noisy data handling.</li>
<li>Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models.</li>
<li>Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy.</li>
<li>Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance.</li>
<li>Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration based on feedback.</li>
<li>Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions.</li>
</ul>
<p><strong>Basic Qualifications</strong></p>
<ul>
<li>Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal).</li>
<li>Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA.</li>
<li>Proven track record building or optimizing large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design).</li>
<li>Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data.</li>
<li>Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors).</li>
<li>Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers.</li>
<li>Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences.</li>
</ul>
<p><strong>Preferred Skills and Experience</strong></p>
<ul>
<li>Experience leading major improvements in model capabilities through better data, modeling, algorithms, or scaling.</li>
<li>Familiarity with state-of-the-art in multimodal LLMs, scaling laws, tokenizers, compression techniques, reasoning, or agentic systems.</li>
<li>Proficiency in Rust and/or C++ for performance-critical components.</li>
<li>Hands-on work with large-scale orchestration tools such as Spark, Ray, or Kubernetes.</li>
<li>Background building full-stack tooling: performant interfaces, real-time research demos/apps, or end-to-end product ownership.</li>
<li>Passion for end-to-end user experience in interactive, real-time multimodal AI systems.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$180,000 - $440,000 USD</Salaryrange>
      <Skills>Multimodal pre-training, Post-training, Fine-tuning, Python, JAX, PyTorch, XLA, Large-scale distributed ML systems, Data pipelines, Evaluation design, Benchmarks, Reward modeling, RL techniques, State-of-the-art in multimodal LLMs, Scaling laws, Tokenizers, Compression techniques, Reasoning, Agentic systems, Rust, C++, Spark, Ray, Kubernetes, Full-stack tooling</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>xAI</Employername>
      <Employerlogo>https://logos.yubhub.co/xai.com.png</Employerlogo>
      <Employerdescription>xAI creates AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.</Employerdescription>
      <Employerwebsite>https://www.xai.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/xai/jobs/5111374007</Applyto>
      <Location>Palo Alto, CA</Location>
      <Country></Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>a48bc0a6-719</externalid>
      <Title>Research Scientist, Gemini Safety</Title>
      <Description><![CDATA[<p>Job Title: Research Scientist, Gemini Safety</p>
<p>We&#39;re looking for a versatile Research Scientist to join our Gemini Safety team at Google DeepMind. As a Research Scientist, you will be responsible for applying and developing cutting-edge data and algorithmic solutions to advance the safety and fairness behavior of our latest user-facing models.</p>
<p>The Gemini Safety team is accountable for the safety and fairness behavior of GDM&#39;s latest Gemini models. Our team focuses on advancing the safety and fairness behavior of state-of-the-art AI models, driving the development of foundational technology adopted by numerous product areas, including Gemini App, Cloud API, and Search.</p>
<p>Key Responsibilities:</p>
<ul>
<li>Post-training/instruction tuning state-of-the-art LLMs, focusing on text-to-text, image/video/audio-to-text modalities and agentic capabilities</li>
<li>Exploring data, reasoning, and algorithmic solutions to ensure Gemini Models are safe, maximally helpful, and work for everyone</li>
<li>Improve Gemini&#39;s adversarial robustness, with a focus on high-stakes abuse risks</li>
<li>Design and maintain high-quality evaluation protocols to assess model behavior gaps and headroom related to safety and fairness</li>
<li>Develop and execute experimental plans to address known gaps, or construct entirely new capabilities</li>
<li>Drive innovation and enhance understanding of Supervised Fine Tuning and Reinforcement Learning fine-tuning at scale</li>
</ul>
<p>About You:</p>
<ul>
<li>PhD in Computer Science, a related field, or equivalent practical experience</li>
<li>Significant LLM post-training experience</li>
<li>Experience in Reward Modeling and Reinforcement Learning for LLM instruction tuning</li>
<li>Experience with Long-range Reinforcement Learning</li>
<li>Experience in areas such as Safety, Fairness, and Alignment</li>
<li>Track record of publications at NeurIPS, ICLR, ICML, RL/DL, EMNLP, AAAI, UAI</li>
<li>Experience taking research from concept to product</li>
<li>Experience with collaborating or leading an applied research project</li>
<li>Experience with JAX</li>
</ul>
<p>At Google DeepMind, we value diversity of experience, knowledge, backgrounds, and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunity regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>PhD in Computer Science, LLM post-training experience, Reward modeling and Reinforcement Learning for LLMs Instruction tuning, Long-range Reinforcement learning, Safety, Fairness, and Alignment, JAX</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Google DeepMind</Employername>
      <Employerlogo>https://logos.yubhub.co/deepmind.com.png</Employerlogo>
      <Employerdescription>Google DeepMind is a subsidiary of Alphabet Inc., a multinational conglomerate, and is involved in the development of artificial intelligence.</Employerdescription>
      <Employerwebsite>https://deepmind.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/deepmind/jobs/7421111</Applyto>
      <Location>Mountain View, California, US</Location>
      <Country></Country>
      <Postedate>2026-03-16</Postedate>
    </job>
    <job>
      <externalid>ca908406-7b8</externalid>
      <Title>Member of Technical Staff - Post Training - MAI Superintelligence Team</Title>
      <Description><![CDATA[<p><strong>Summary</strong></p>
<p>Microsoft AI is looking for a talented Member of Technical Staff - Post Training - MAI Superintelligence Team at their New York office. This role sits at the heart of strategic decision-making for a company that&#39;s revolutionising AI technology. You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>
<p><strong>About the Role</strong></p>
<p>This role involves contributions to all stages of the post-training process: driving data collection and acquisition, building evaluations of model capabilities, and applying advanced reward modeling and RL techniques to develop and improve the post-training recipe. We work on the bleeding edge and leverage the most powerful pretrained models and algorithms for our needs. We are an interdisciplinary team of engineers and scientists, learning from each other and collaborating to create the best models.</p>
<p><strong>Accountabilities</strong></p>
<ul>
<li>Develop data collection, evaluation, and post-training methods for models.</li>
<li>Design hypotheses and experiment plans for rapidly iterating on model performance.</li>
</ul>
<p><strong>The Candidate we&#39;re looking for</strong></p>
<p><strong>Experience:</strong></p>
<ul>
<li>Bachelor’s Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>
</ul>
<p><strong>Technical skills:</strong></p>
<ul>
<li>Experience with reward modeling, RL, or other post-training techniques.</li>
</ul>
<p><strong>Personal attributes:</strong></p>
<ul>
<li>Passionate about advancing the state of post-training research.</li>
<li>Willing to contribute meaningfully as individuals and take end-to-end ownership of projects.</li>
</ul>
<p><strong>Benefits</strong></p>
<ul>
<li>Competitive salary range: $119,800 - $234,700 per year.</li>
<li>Comprehensive benefits package, including health insurance, retirement plan, and paid time off.</li>
<li>Opportunities for professional growth and development.</li>
<li>Collaborative and inclusive work environment.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$119,800 - $234,700 per year</Salaryrange>
      <Skills>reward modeling, RL, post-training techniques, C, C++, C#, Java, JavaScript, Python, conversational AI, deployment</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a leading technology company that specializes in developing cutting-edge algorithms for post-training large language models. They aim to empower every person and every organization on the planet to achieve more. Their mission is to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-post-training-mai-superintelligence-team-3/</Applyto>
      <Location>New York</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
    <job>
      <externalid>a98b937b-085</externalid>
      <Title>Member of Technical Staff - Post Training - MAI Superintelligence Team</Title>
      <Description><![CDATA[<p><strong>Summary</strong></p>
<p>Microsoft AI is looking for a talented Member of Technical Staff - Post Training - MAI Superintelligence Team at their Redmond office. This role sits at the heart of strategic decision-making for a company that&#39;s revolutionising AI technology. You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>
<p><strong>About the Role</strong></p>
<p>This role involves contributions to all stages of the post-training process: driving data collection and acquisition, building evaluations of model capabilities, and applying advanced reward modeling and RL techniques to develop and improve the post-training recipe. We work on the bleeding edge and leverage the most powerful pretrained models and algorithms for our needs. We are an interdisciplinary team of engineers and scientists, learning from each other and collaborating to create the best models.</p>
<p><strong>Accountabilities</strong></p>
<ul>
<li>Develop data collection, evaluation, and post-training methods for models.</li>
<li>Design hypotheses and experiment plans for rapidly iterating on model performance.</li>
</ul>
<p><strong>The Candidate we&#39;re looking for</strong></p>
<p><strong>Experience:</strong></p>
<ul>
<li>Bachelor’s Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>
</ul>
<p><strong>Technical skills:</strong></p>
<ul>
<li>Experience with reward modeling, RL, or other post-training techniques.</li>
</ul>
<p><strong>Personal attributes:</strong></p>
<ul>
<li>Passionate about advancing the state of post-training research.</li>
<li>Willing to contribute meaningfully as an individual and take end-to-end ownership of projects.</li>
</ul>
<p><strong>Benefits</strong></p>
<ul>
<li>Competitive salary range: $119,800 - $234,700 per year.</li>
<li>Comprehensive benefits package, including health insurance, retirement plan, and paid time off.</li>
<li>Opportunities for professional growth and development.</li>
<li>Collaborative and inclusive work environment.</li>
<li>Access to cutting-edge technology and resources.</li>
</ul>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$119,800 - $234,700 per year</Salaryrange>
      <Skills>reward modeling, RL, post-training techniques, C, C++, C#, Java, JavaScript, Python, conversational AI, deployment</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a leading technology company that specializes in developing artificial intelligence (AI) and machine learning (ML) solutions. They are known for their cutting-edge research and development in AI, and their products are used by millions of users worldwide. Microsoft AI is committed to empowering every person and every organization on the planet to achieve more.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-post-training-mai-superintelligence-team-2/</Applyto>
      <Location>Redmond</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
    <job>
      <externalid>99319c10-68b</externalid>
      <Title>Member of Technical Staff - Post Training - MAI Superintelligence Team</Title>
      <Description><![CDATA[<p><strong>Summary</strong></p>
<p>Microsoft AI is looking for a talented Member of Technical Staff - Post Training - MAI Superintelligence Team at their Mountain View office. This role sits at the heart of post-training, improving pre-trained models to advance the state of the art on a wide variety of internal and external benchmarks. You&#39;ll work on the bleeding edge and leverage the most powerful pretrained models and algorithms for your needs.</p>
<p><strong>About the Role</strong></p>
<p>This role involves contributions to all stages of the post-training process: driving data collection and acquisition, building evaluations of model capabilities, and applying advanced reward modeling and RL techniques to develop and improve the post-training recipe. You will design hypotheses and experiment plans for rapidly iterating on model performance.</p>
<p><strong>Accountabilities</strong></p>
<ul>
<li>Develop data collection, evaluation, and post-training methods for models.</li>
<li>Design hypotheses and experiment plans for rapidly iterating on model performance.</li>
</ul>
<p><strong>The Candidate we&#39;re looking for</strong></p>
<p><strong>Experience:</strong></p>
<ul>
<li>Bachelor’s Degree in Computer Science, Machine Learning, Mathematics, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>
</ul>
<p><strong>Technical skills:</strong></p>
<ul>
<li>Experience with reward modeling, RL, or other post-training techniques.</li>
</ul>
<p><strong>Personal attributes:</strong></p>
<ul>
<li>Passionate about advancing the state of post-training research.</li>
<li>Will thrive in a highly collaborative, fast-paced environment.</li>
</ul>
<p><strong>Additional Information</strong></p>
<ul>
<li>Starting January 26, 2026, MAI employees are expected to work from a designated Microsoft office at least four days a week if they live within 50 miles (U.S.) or 25 miles (non-U.S., country-specific) of that location.</li>
<li>The Microsoft Superintelligence team&#39;s mission is to empower every person and every organization on the planet to achieve more.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>USD $119,800 – </Salaryrange>
      <Skills>reward modeling, RL, post-training techniques, C, C++, C#, Java, JavaScript, Python, conversational AI, large-scale AI</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a leading technology company that specializes in developing cutting-edge algorithms for post-training large language models. They aim to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-post-training-mai-superintelligence-team/</Applyto>
      <Location>Mountain View</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
  </jobs>
</source>