Machine Learning Research Scientist / Engineer, Reasoning

fb1f459e-b3a Machine Learning Research Scientist / Engineer, Reasoning

About Scale

At Scale, our mission is to accelerate the development of AI applications. We're looking for a Machine Learning Research Scientist/Engineer to join our team and help us shape the future of AI.

This role operates at the forefront of AI research and real-world implementation, with a strong focus on reasoning within large language models (LLMs). You will study the data types critical for advancing LLM-based agents, including browser and software engineering (SWE) agents. You will play a key role in shaping Scale's data strategy by identifying the most effective data sources and methodologies for improving LLM reasoning.

Success in this role requires a deep understanding of LLMs, planning algorithms, and novel approaches to agentic reasoning, as well as creativity in tackling challenges related to data generation, model interaction, and evaluation. You will contribute to impactful research on language model reasoning, collaborate with external researchers, and work closely with engineering teams to bring state-of-the-art advancements into scalable, real-world solutions.

Responsibilities

Study the data types critical for advancing LLM-based agents, including browser and software engineering (SWE) agents
Shape Scale's data strategy by identifying the most effective data sources and methodologies for improving LLM reasoning
Contribute to impactful research on language model reasoning
Collaborate with external researchers
Work closely with engineering teams to bring state-of-the-art advancements into scalable, real-world solutions

Requirements

Practical experience working with LLMs, with proficiency in frameworks like PyTorch, JAX, or TensorFlow
A track record of published research in top ML and NLP venues (e.g., ACL, EMNLP, NAACL, NeurIPS, ICML, ICLR, COLM)
At least three years of experience solving complex ML challenges, either in a research setting or product development, particularly in areas related to LLM capabilities and reasoning
Strong written and verbal communication skills, along with the ability to work effectively across teams

Nice to Have

Hands-on experience fine-tuning open-source LLMs or leading bespoke LLM fine-tuning projects using PyTorch/JAX
Research and practical experience in building applications and evaluations related to LLM-based agents, including tool-use, text-to-SQL, browser agents, coding agents, and GUI agents
Experience with agent frameworks such as OpenHands, Swarm, LangGraph, or similar
Familiarity with advanced agentic reasoning techniques such as STaR and PLANSEARCH
Proficiency in cloud-based ML development, with experience in AWS or GCP environments

Benefits

Comprehensive health, dental and vision coverage
Retirement benefits
A learning and development stipend
Generous PTO
Commuter stipend

Salary Range

$252,000-$315,000 USD

XML job scraping automation by YubHub

full-time senior remote
$252,000-$315,000 USD
PyTorch, JAX, TensorFlow, Large Language Models (LLMs), Planning Algorithms, Agentic Reasoning, Data Generation, Model Interaction, Evaluation, Agent Frameworks, Cloud-Based ML Development, AWS, GCP, STaR, PLANSEARCH
Engineering Technology
Scale AI
https://logos.yubhub.co/scale.com.png
Scale AI is a leading AI data foundry that provides high-quality data to drive progress toward Artificial General Intelligence (AGI). It was founded in 2016 and has since become a major player in the AI industry.
https://scale.com/
https://job-boards.greenhouse.io/scaleai/jobs/4605596005
San Francisco, CA; Seattle, WA; New York, NY
2026-04-18

307c2f1c-d78 Senior SDET - Tooling Engineer

We are looking for a highly skilled Senior Software Quality Engineer (SDET) to lead our end-to-end quality engineering initiatives across mobile, web, backend, and data platforms. This role combines deep technical expertise with a forward-thinking, AI-first mindset, driving innovation, scalability, and reliability through advanced automation and intelligent testing strategies.

As a senior member of the team, you will champion modern, AI-enhanced quality practices and help build a culture where continuous improvement, automation-first thinking, and data-driven decisions are embedded at every stage of product development. This is a hybrid position in Mountain View (Headquarters) and will require in-office work 2 days a week.

The base salary range for this full-time position is $210,000 to $257,000, plus equity and benefits. Our salary ranges are determined by role, level, and location. EarnIn provides excellent benefits for our employees, including healthcare, internet/cell phone reimbursement, and a learning and development stipend.

Quality Engineering & Test

Own end-to-end quality across iOS and Android applications and their supporting backend services, ensuring high confidence in weekly (or faster) releases. Design and implement comprehensive test strategies covering:

Native mobile applications (iOS & Android)
Mobile-to-backend integrations (REST APIs, auth flows, event-driven systems)
Microservices and distributed systems
Critical web workflows that intersect with mobile journeys
Device, OS, browser, and network variability
App lifecycle events, offline behavior, retries, and edge cases

Ensure critical user journeys are validated across mobile UI → API → backend → web touchpoints, preventing production escapes in high-impact flows. Partner with engineering teams to embed quality gates into the mobile release lifecycle, including pre-merge validation, release candidate verification, and post-deploy smoke testing.

Drive improvements in testability by introducing better logging, API contracts, observability hooks, feature flags, and deterministic state management. Establish meaningful quality metrics (crash analytics, defect trends, flaky tests, API reliability, release risk scoring) and surface actionable insights to engineering stakeholders.

Champion shift-left quality by influencing design reviews, API schema discussions, and acceptance criteria early in development.

AI-Driven Quality and Automation

Leverage AI to enhance mobile, backend, and web testing effectiveness, including:

AI-assisted test case and test data generation
Intelligent regression suite prioritization based on code changes
Predictive defect detection and risk-based testing
Flaky test detection and automated stabilization insights

Integrate AI-powered log intelligence, crash clustering, and anomaly detection into quality workflows. Continuously evaluate and experiment with AI-driven QA tools to increase coverage, reduce maintenance overhead, and accelerate release cycles.

Contribute to building an AI-augmented quality ecosystem that improves speed without compromising reliability.

Automation Excellence

Design, build, and scale robust automation frameworks using:

XCUITest, Espresso, Appium (mobile automation)
Playwright (web and mobile web validation)
REST Assured or similar tools for API and service validation

Ensure frameworks are modular, maintainable, and optimized for scale across multiple teams. Integrate automated validation into CI/CD pipelines (Jenkins, GitHub Actions, etc.) to enable:

Pre-merge quality gates
Parallelized execution
Environment-aware test runs
Post-deployment smoke and regression coverage

Build developer-friendly tooling that enables:

Self-service test execution
Real-time reporting and dashboards
Faster debugging and failure triage
Scalable test data and environment management

Continuously reduce flakiness, improve signal quality, and optimize execution time across mobile and backend suites.

Performance, Scalability & Reliability

Design and execute performance validation across:

Mobile app startup time and responsiveness
API latency, throughput, and reliability
Backend load and stress conditions
Web performance for critical flows

Partner with engineering teams to analyze production logs, crash reports, browser telemetry, and service metrics. Lead root-cause analysis of complex cross-layer defects spanning mobile UI, APIs, backend services, and web surfaces.

Ensure reliability validation is embedded directly into release workflows.

Cross-Functional Collaboration and Leadership

Collaborate closely with mobile engineers, backend developers, web engineers, product managers, DevOps teams, and release managers to define clear, testable requirements and release criteria. Actively participate in sprint grooming, planning, stand-ups, and retrospectives.

Influence best practices around mobile-first design, API contracts, and release readiness. Support mobile app release activities, including release candidate validation, go/no-go recommendations, and post-release monitoring.

Mentor junior QA engineers and contribute to raising the technical bar in automation and cross-platform validation. Work effectively with globally distributed teams to coordinate testing across time zones.

full-time senior hybrid
$210,000 to $257,000, plus equity and benefits
XCUITest, Espresso, Appium, Playwright, REST Assured, API contracts, Feature flags, Deterministic state management, AI-assisted test case and test data generation, Intelligent regression suite prioritization, Predictive defect detection, Risk-based testing, Flaky test detection, Automated stabilization insights, Log intelligence, Crash clustering, Anomaly detection, CI/CD pipelines, Pre-merge quality gates, Parallelized execution, Environment-aware test runs, Post-deployment smoke and regression coverage, Self-service test execution, Real-time reporting and dashboards, Faster debugging and failure triage, Scalable test data and environment management
Engineering Technology
EarnIn
https://logos.yubhub.co/earnin.com.png
EarnIn is a financial technology company that provides earned wage access to individuals.
https://www.earnin.com/
https://job-boards.greenhouse.io/earnin/jobs/7403324
Mountain View, US
2026-04-18

cdd2fbec-490 Member of Technical Staff - Mid-training

About xAI

xAI's mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge.

Our team is small and highly motivated, focused on engineering excellence. We operate with a flat organisational structure, where all employees are expected to be hands-on and contribute directly to the company's mission.

Responsibilities

We're looking for a Member of Technical Staff to join our team. Key responsibilities include:

Scaling synthetic coding data to trillions of tokens with large-scale Docker verification
Distilling the intelligence of flagship models into flash models through synthetic data generation
Optimising mid-training data mixtures to boost the ceiling for RL
Engineering long-context data recipes
Developing robust and diverse evaluation for mid-training checkpoints

Basic Qualifications

To be successful in this role, you'll need:

Expertise in ML and large model scaling, with familiarity across all kinds of scaling laws
Strong ability to design ML experiments
Familiarity with state-of-the-art techniques for curating AI training data for text, image, audio, and video modalities
Strong engineering abilities in Spark, Ray, and other frameworks for large-scale data processing

Compensation and Benefits

The base salary for this role is $180,000 - $440,000 USD. Our total rewards package includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

full-time mid onsite
$180,000 - $440,000 USD
ML, large model scaling, Docker verification, synthetic data generation, Spark, Ray
Engineering Technology
xAI
https://logos.yubhub.co/xai.com.png
xAI creates AI systems to understand the universe and aid humanity in its pursuit of knowledge.
https://www.xai.com/
https://job-boards.greenhouse.io/xai/jobs/4965893007
Palo Alto, CA
2026-04-18

9e926934-312 Applied Scientist / Research Engineer (Internship)

About Mistral AI

At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.

We are a global company with teams distributed between France, USA, UK, Germany, and Singapore. We offer a comprehensive AI platform that meets enterprise needs, whether on-premises or in cloud environments. Our offerings include le Chat, the AI assistant for life and work.

Role Summary

Mistral AI is seeking Applied Scientist Interns and Research Engineer Interns to drive innovative research and collaborate with clients on complex research projects. You will develop SOTA models across different modalities such as text, image, and speech. By developing novel methods and research ideas, you will apply these models across a diverse set of use cases and domains.

Responsibilities

• Run pre-training and post-training, and deploy state-of-the-art models on clusters with thousands of GPUs.
• Generate and curate data for pre-training and post-training, working on evaluations and making sure the model's performance beats expectations.
• Develop the necessary tools and frameworks to facilitate data generation, model training, evaluation, and deployment.
• Collaborate with cross-functional teams to tackle complex use cases using agents and RAG pipelines.
• Manage research projects and communications with client research teams.

About You

• You are fluent in English and have excellent communication skills. You are at ease explaining complex technical concepts to both technical and non-technical audiences.
• You're an expert with PyTorch or JAX.
• You're not afraid of contributing to a big codebase and can find your way around independently with little guidance.
• You write clean, readable, high-performance, fault-tolerant Python code.
• You don't need roadmaps: you just do. You don't need a manager: you just ship.
• You are low-ego, collaborative, and eager to learn.
• You have a track record of success through personal projects, professional projects, or in academia.

Benefits

• Competitive salary
• Food: Daily lunch vouchers
• Sport: Monthly contribution to a Gympass subscription
• Transportation: Monthly contribution to a mobility pass

internship entry onsite
PyTorch, JAX, Python, GPU, data generation, model training, evaluation, deployment, agents, multi-modality, robotics, diffusion models, time-series analysis
Engineering Technology
Mistral AI
https://logos.yubhub.co/mistral.ai.png
Mistral AI develops high-performance, open-source, and cutting-edge AI models, products, and solutions for enterprise needs.
https://mistral.ai
https://jobs.lever.co/mistral/426ef8c0-eb26-4004-a690-f33c62b445a7
Paris
2026-04-17

d9383bcf-242 Model Behavior Architect

About this role

As a Model Behavior Architect at Mistral AI, you will be at the forefront of defining and measuring Large Language Model (LLM) behavior. You will work closely with our Science team to define what 'good' looks like for various tasks, including Reasoning, Audio, Alignment, Tools, and Frontier bets.

Responsibilities

Interact with models to identify areas for improvement in model behavior
Gather internal and external feedback on model behavior to scope areas for improvement
Design and implement evaluation pipelines, data guidelines, data generation, and synthetic testing environments
Identify and fix edge case behaviors through rigorous testing
Develop robust evaluation pipelines for model candidates
Collaborate with AI Scientists

About you

You have a deep understanding of at least one of the following: linguistics, language, and translation; engineering and code behavior; or LLM agents at work, including reasoning and tool use
You have prior knowledge of training and optimizing model behavior
You are an expert at building robust evaluations
You thrive in dynamic and technically complex environments
You have a track record of delivering innovative, out-of-the-box solutions to address real-world constraints

full-time senior onsite
Large Language Models, Model Evaluation, Policy Writing, Evaluation Pipelines, Data Generation, Synthetic Testing Environments
Engineering Technology
Mistral AI
https://logos.yubhub.co/mistral.ai.png
Mistral AI develops high-performance, open-source, and cutting-edge AI models, products, and solutions for enterprise and personal use.
https://mistral.ai
https://jobs.lever.co/mistral/4337cebc-b951-4528-98f8-ebcb45db5645
Paris
2026-04-17