Staff Machine Learning Systems Engineer

3b01c809-8ef Staff Machine Learning Systems Engineer As a Staff Machine Learning Systems Engineer at Reddit, you will lead the development of a platform for large-scale ML models. Your responsibilities will include designing end-to-end model lifecycle patterns (MLOps) to boost velocity of development for ML engineers, zero-to-one development and support of a graph ML codebase and platform, collaborating with ML engineers on performance tuning, optimizing batch data processing, and architecting pipelines to build and maintain massive graph data structures.

We are looking for an engineer with 8+ years of experience in ML infrastructure, including model training and model deployments. You should have hands-on experience with ML optimization, cloud-based technologies, and MLOps tools, and be proficient with common ML programming languages and frameworks. A strong focus on scalability, reliability, performance, and ease of use is essential.

In addition to base salary, this job is eligible to receive equity in the form of restricted stock units, and depending on the position offered, it may also be eligible to receive a commission. Reddit offers a wide range of benefits to U.S.-based employees, including medical, dental, and vision insurance, 401(k) program with employer match, generous time off for vacation, and parental leave.

XML job scraping automation by YubHub

]]> full-time staff remote $230,000-$322,000 USD ML infrastructure, model training, model deployments, ML optimization, cloud-based technologies, MLOps tools, Python, PyTorch, Tensorflow, graph ML codebase and platform, Apache Beam, Apache Spark, Ray Data Engineering Technology Reddit https://logos.yubhub.co/redditinc.com.png Reddit is a community-driven platform with over 121 million daily active unique visitors and 100,000+ active communities. https://www.redditinc.com https://job-boards.greenhouse.io/reddit/jobs/7731788 Remote - United States 2026-04-18 980a6242-1cf Member of Technical Staff - Quantitative Research We're looking for a full-stack scientist to pioneer quantitative research efforts at Udio. You will build at the intersection of research, engineering and product, bridging disciplines by drawing on huge, one-of-a-kind proprietary datasets of music, metadata and user interactions/feedback.

Design & own evaluation/optimization frameworks for frontier music models. Dive deep under the hood of our music generation systems, applying computational & human resources to understand model capabilities and identify areas for growth. Build optimization loops and apply your findings to our pretraining, post-training and inference systems as applicable.

Drive product & research roadmap. Own our data roadmap end-to-end, formulating research questions, exploring/linking/expanding data sources and conducting experiments at your discretion. Your work will span data mining, machine learning, causal inference, survey design and more, and your results will be critical for decision-making in product development, research investment and overall business direction.

Build stable infrastructure. Your work will reach far beyond the Jupyter kernel, manifesting in robust integrations with our research & product tech stacks, potentially in performance-critical paths. You'll also build large-scale standalone data processing systems, allocating resources as needed to manage the data ecosystem.

Champion scientific rigor. As our first quantitative researcher, you'll cultivate a culture of scientific rigor across the company and deepen common understanding of models, users and data. You'll proactively identify opportunities, define metrics, share results, and build a rigorous foundation upon which to understand our highly subjective domain.

We're looking for someone with deep quantitative expertise, preferably a Ph.D. in statistics, mathematics, physics, or another quantitative discipline, or 5+ years' industry experience as a quantitative analyst / data scientist. Autonomy & ownership are key, as you'll thrive in greenfield research domains, undefined product categories and small, flat teams. Engineering chops are also important, as you'll need to translate your ideas into clear, production-ready code and collaborate in an active research codebase.

]]> full-time staff remote $250k - $350k Ph.D. in statistics, mathematics, physics, or another quantitative discipline, 5+ years' industry experience as a quantitative analyst / data scientist, Deep learning frameworks, JAX, GCP, Apache Beam/DataFlow, Kubernetes, TensorFlow Data / TFRecord, Obsession with music & the science of sound, Experience in DSP, MIR, music production / composition / performance, Big record collection Engineering Technology Udio https://logos.yubhub.co/udio.com.png Udio builds AI experiences to empower musical artists and super fans, using best-in-class AI models and partnerships across the music industry. https://udio.com https://job-boards.greenhouse.io/udio/jobs/5081608008 New York City (Remote possible for exceptional candidates) 2026-04-17 e06c831d-23a Machine Learning Engineer The Personalization team at Spotify makes deciding what to play next on Spotify easier and more enjoyable for every listener. We seek to understand the world of music, podcasts, and audiobooks better than anyone else so that we can make great recommendations to every individual person and keep the world listening.

Our Minesweeper squad produces Human Understandable Language Knowledge to enrich music and talk content understanding. We use AI and ML techniques, including Large Language Models, to understand music, podcasts and audiobooks, building reliable, scalable systems to distribute that knowledge to Spotify internal teams, users, and creators.

We are looking for a Machine Learning Engineer to join our team and help build the future of music, podcast and audiobook listening experiences for millions of listeners at Spotify. This is a unique opportunity to help develop and shape Spotify content enrichment, and recommendations.

As a Machine Learning Engineer, you will:

Utilize in-house and 3rd party LLMs to solve language understanding problems
Employ techniques such as fine-tuning and RAG to improve models
Contribute to designing, building, evaluating, shipping, and refining Spotify’s product through hands-on ML development
Help drive optimization, testing, and tooling to improve the quality of our content enrichment assets
Collaborate with cross-functional teams of MLEs, data and backend engineers, and other stakeholders including tech research, data science, and product to develop new features and technologies
Perform data analysis to establish baselines and inform product decisions
Stay up-to-date on the latest machine learning algorithms and techniques

You will be part of a motivated and supportive team that values agile software processes, data-driven development, reliability, and disciplined experimentation.

If you have a strong background in machine learning, especially experience with Large Language Models, and are passionate about fostering collaborative teams, we encourage you to apply.

]]> full-time mid remote $138,250-$197,500 Large Language Models, Machine Learning, Python, Scala, Java, SQL, PyTorch, TensorFlow, Ray, TFX, Apache Beam, Dataflow, Spark Engineering Technology Spotify https://logos.yubhub.co/spotify.com.png Spotify is a music streaming service that offers users access to millions of songs, podcasts, and audiobooks. It has hundreds of millions of users worldwide. https://www.spotify.com/ https://jobs.lever.co/spotify/de3f6a47-4d75-4512-8351-b362f1d1c32e North America 2026-03-31 da758a3e-06e Machine Learning Engineer The Personalization (PZN) team at Spotify makes deciding what to play next on Spotify easier and more enjoyable for every listener. We seek to understand the world of music, podcasts and audiobooks better than anyone else so that we can make great recommendations to every individual and keep the world listening.

The TurnTable team’s mission is to own and innovate on AI DJ and the interactive listening experiences. Using a mixture of LLMs and traditional ML, we strive to provide depth and connection for all listeners. We are looking for a Machine Learning Engineer to join our team to build and improve our interactive listening experiences.

Responsibilities

Design, build, evaluate, and ship an agent-based DJ solution that brings our DJ and interactive experiences to the next level.
Collaborate with cross-functional teams spanning user research, design, data science, product management, and engineering to build new product features that advance our mission to connect artists and fans in personalized and useful ways.
Prototype new approaches and productionize solutions at scale for our hundreds of millions of active users.
Promote and role-model best practices of ML systems development, testing, evaluation, etc., both inside the team as well as throughout the organization.
Be part of an active group of machine learning practitioners.

Requirements

An experienced ML practitioner motivated to work on complex real-world problems in a fast-paced and collaborative environment.
Strong background in machine learning, natural language processing, and generative AI, with experience in applying theory to develop real-world applications.
Hands-on expertise with implementing end-to-end production ML systems at scale. Experience with production-scale LLM-based systems is a plus.
Experience with incorporating human feedback to improve LLM-based systems using techniques like DPO, KTO, and reinforcement fine-tuning.
Experience with designing end-to-end tech specs and modular architectures for ML frameworks in complex problem spaces in collaboration with product teams.
Experience with large-scale, distributed data processing frameworks/tools like Apache Beam, Apache Spark, and cloud platforms like GCP or AWS.

Benefits

Health insurance
Six-month paid parental leave
401(k) retirement plan
Monthly meal allowance
23 paid days off
13 paid flexible holidays
Paid sick leave

The United States base range for this position is $176,166 - $251,666 plus equity.

]]> full-time senior remote $176,166 - $251,666 Machine Learning, Natural Language Processing, Generative AI, Apache Beam, Apache Spark, GCP, AWS Engineering Technology Spotify https://logos.yubhub.co/spotify.com.png Spotify is a music streaming service that provides access to millions of songs, podcasts, and audiobooks. It has hundreds of millions of active users worldwide. https://www.spotify.com https://jobs.lever.co/spotify/0cd7549d-880c-4861-b343-c0564cc8e9de North America 2026-03-31 11a36eab-3cb Senior Data Engineer Job Description

Are you ready to contribute to the evolution of our data pipelines for our B2C division? At Future, we are transforming our data-driven decision-making processes and we are looking for a passionate and experienced Data Engineer to join us.

This is an exciting opportunity for someone who excels in a creative environment, enjoys solving complex data challenges, and is eager to build impactful business insights. In this role, you will report directly to the Head of Data Engineering.

Responsibilities

Develop and maintain new/current features of the data platform.
Responsible for delivery of development projects, including scoping, writing and sizing of stories involved.
Take ownership of BAU processes, develop area-specific domain mastery, and seek means to automate them or reduce their impact.
Propose and advocate for changes to reduce risk, cost and overhead.
Provide appropriate documentation for pipelines developed.
Parameterise pipelines so configuration can be changed easily without having to perform deep changes to the codebase.
Apply appropriate testing principles to ensure code is fit for purpose.

Experience

Experience using Python on Google Cloud Platform for Big Data projects, BigQuery, DataFlow (Apache Beam), Cloud Run Functions, Cloud Run, Cloud Workflows, Cloud Composer
SQL development skills
Experience using Dataform or dbt
Demonstrated strength in data modelling, ETL development, and data warehousing
Knowledge of data management fundamentals and data storage principles
Familiarity with statistical models or data mining algorithms and practical experience applying these to business problems

What's in it for you

The expected range for this role is £50,000 - £60,000

This is a Hybrid role from our Bath Office, working three days from the office, two from home. Plus more great perks, which include:

Uncapped leave, because we trust you to manage your workload and time
When we hit our targets, enjoy a share of our profits with a bonus
Refer a friend and get rewarded when they join Future
Wellbeing support with access to our Colleague Assistant Programmes
Opportunity to purchase shares in Future, with our Share Incentive Plan

]]> full-time senior hybrid £50,000 - £60,000 Python, Google Cloud Platform, BigQuery, DataFlow, Apache Beam, Cloud Run Functions, Cloud Run, Cloud Workflows, Cloud Composer, SQL, Dataform, dbt, data modelling, ETL development, data warehousing, data management fundamentals, data storage principles, statistical models, data mining algorithms Engineering Technology Future https://logos.yubhub.co/j.com.png Future is a global leader in specialist media, with over 3,000 employees working across 200+ media brands. https://apply.workable.com https://apply.workable.com/j/3535C2B9B5 Bath 2026-03-09 6d5e164b-74d Data Engineer Data Engineer

Are you ready to contribute to the evolution of our data pipelines for our B2C division? We are transforming our data-driven decision-making processes and we are looking for a passionate and experienced Data Engineer to join us. This is an exciting opportunity for someone who thrives in a creative environment and enjoys solving complex data challenges. You'll report into the Lead Data Engineer for this position and sit within the wider Data Engineer team.

The Data & Business Intelligence team guides our organisation to become more data-driven. Our responsiveness to market changes gives us a competitive edge. By ensuring visibility of objective performance data, we empower our teams to make rapid, informed decisions that enhance overall performance.

Responsibilities

Maintain new/current features of the data platform.
Responsible for delivery of development projects.
Utilise established software engineering practices and principles.
Take ownership of BAU processes and develop area-specific domain mastery.
Ensure compliance matters are followed.
Utilise CI/CD and infrastructure as code (Terraform) for rapid deployment of changes.

Experience

Experience using Python on Google Cloud Platform for Big Data projects, BigQuery, DataFlow (Apache Beam), Cloud Run Functions, Cloud Run, Cloud Workflows, Cloud Composer.
SQL development skills.
Demonstrated strength in data modelling, ETL development, and data warehousing.
Knowledge of data management fundamentals and data storage principles.
Familiarity with statistical models or data mining algorithms and practical experience applying these to business problems.

What's in it for you

The expected range for this role is £45,000 - £50,000. This is a Hybrid role from our Bath Office, working three days from the office, two from home. Plus more great perks, which include:

Uncapped leave, because we trust you to manage your workload and time.
When we hit our targets, enjoy a share of our profits with a bonus.
Refer a friend and get rewarded when they join Future.
Wellbeing support with access to our Colleague Assistant Programmes.
Opportunity to purchase shares in Future, with our Share Incentive Plan.

]]> full-time mid hybrid £45,000 - £50,000 Python, Google Cloud Platform, BigQuery, DataFlow, Apache Beam, Cloud Run Functions, Cloud Run, Cloud Workflows, Cloud Composer, SQL, data modelling, ETL development, data warehousing, data management fundamentals, data storage principles, statistical models, data mining algorithms Engineering Technology Future https://logos.yubhub.co/j.com.png Future is a global leader in specialist media, with over 3,000 employees working across 200+ media brands. https://apply.workable.com https://apply.workable.com/j/BDB1B6F4CF 2026-03-09 7f345e34-fa0 Software Engineering Manager At Ford Motor Company, we believe freedom of movement drives human progress. We are seeking a Software Engineering Manager to provide engineering leadership to multiple product lines within the Ford Customer Service Division (FCSD). FCSD is a true one-stop shop, offering comprehensive diagnostics, repair, and service capabilities for a full portfolio of electrified, hybrid, and internal combustion vehicles globally.

Responsibilities

Provide Engineering Leadership

Provide engineering leadership to multiple product lines within FCSD
Help business partners understand our iterative development approach and focus on delivering a Minimum Viable Product (MVP) and releases
Design and deliver industry-leading products and services to maximize value and productivity for commercial customers

Ensure Software Engineering Excellence

Ensure software engineering excellence (e.g. best practices and quality) is achieved within the FCSD Tech product line
Collaborate with other Product Line Anchors to reduce complexity across the portfolio, enhance interoperability between services, and build reusable API services

Provide Thought Leadership

Provide thought leadership for the development, structure, technology, and tools used within FCSD
Innovate and operate with an iterative, agile, and user-centric perspective

Communicate Technology Strategy

Clearly communicate technology strategy and vision to team members and internal and external stakeholders
Demonstrate software engineering excellence through actively coding, pairing, and performing code and architecture reviews with the software engineers within the FCSD Tech product line

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field
5+ years of experience with progressive leadership responsibilities in Software Engineering, Architecture, and Agile Framework
Experience with Lean methodology & eXtreme Programming
Must be able to operationalize and assist teams with abstract technology concepts
Strong communication, collaborative, and influencing skills
Proven ability to work closely with senior leadership
Strong personal presence and capabilities to resolve technical concerns
Demonstrated ability to drive development of highly technical technology services and capabilities
Demonstrated understanding and ability to drive API economy and solutions
Demonstrated understanding and ability to drive highly available consumer-ready Internet properties and technical platforms
Experience collaborating with engineers, designers, and product owners
Excellent communication skills with the ability to adapt your communication style to the audience
Ability to work collaboratively and navigate complex decision making in a rapidly changing environment
Strong leadership and communication skills and the ability to teach others
3+ years of experience building and supporting cloud-native applications using the Java, Spring Boot, and React tech stack
Experience with cloud services and platform knowledge
Modern databases (Relational and non-relational)
Continuous integration/continuous delivery tools and pipelines, such as Tekton, Jenkins, Terraform, SonarQube, Maven, Gradle, Harness, Apigee X, etc.
Experience developing and deploying to cloud platforms, such as Google Cloud Platform, Pivotal Cloud Foundry, Amazon Web Services, and Microsoft Azure
Experience with GCP Dataflow (Apache Beam) and workflow orchestration

Benefits

Immediate medical, dental, vision, and prescription drug coverage
Flexible family care days, paid parental leave, new parent ramp-up programs, subsidized back-up child care, and more
Family building benefits, including adoption and surrogacy expense reimbursement, fertility treatments, and more
Vehicle discount program for employees and family members and management leases
Tuition assistance
Established and active employee resource groups
Paid time off for individual and team community service
A generous schedule of paid holidays, including the week between Christmas and New Year's Day
Paid time off and the option to purchase additional vacation time

]]> full-time senior remote This position is a range of salary grade LL6. Software Engineering, Agile Framework, Lean methodology, eXtreme Programming, Java, Spring Boot, REACT, Cloud services, Platform knowledge, Modern databases, Continuous integration/continuous delivery tools, Pipelines, GCP Dataflow, Apache Beam, Workflow orchestration, Cloud-native applications, Cloud platforms, API economy, Highly available consumer-ready Internet properties, Technical platforms Engineering Automotive Ford Motor Company Ford Motor Company is a multinational automaker that designs, manufactures, and markets vehicles and automotive-related products. It is one of the largest automakers in the world. https://efds.fa.em5.oraclecloud.com https://efds.fa.em5.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1/job/59597 United States 2026-03-09 0841fcf4-9ab Data Engineer SE - II We are on a mission to rid the world of bad customer service by “mobilizing” the way help is delivered. Today’s consumers want an always-available customer service experience that leaves them feeling valued and respected.

Helpshift helps B2B brands deliver this modern customer service experience through a mobile-first approach. We have changed how conversations take place, moving the conversation away from a slow, outdated email and desktop experience to an in-app chat experience that allows users to interact with brands in their own time.

Through our market-leading AI-powered chatbots and automation, we help brands deliver instant and rapid resolutions. Because agents play a key role in delivering help, our platform gives agents superpowers with automation and AI that simply works.

About the Team

Consumers care first and foremost about having their time valued by brands. Brands need insights into their customer service operation to serve their consumers effectively. Such insights and analytics are delivered through various data products like in-app analytics dashboards and data-sharing integrations.

The data platform team is responsible for designing, building, and maintaining the data infrastructure that enables such data and analytics products at scale. We build and manage data pipelines, databases, and other data structures to ensure that the data is reliable, accurate, and easily accessible.

We also support internal stakeholders with business intelligence, and machine learning teams with data ops. This team manages the platform that handles 2 million events per minute and processes 1+ terabytes of data daily.

About the Role

Building maintainable data pipelines for both data ingestion and operational analytics, for data collected from 2 billion devices and 900M monthly active users
Building customer-facing analytics products that deliver actionable insights and data and make anomalies easy to detect
Collaborating with data stakeholders to understand their data needs and taking part in the analysis process
Writing design specifications and test, deployment, and scaling plans for the data pipelines
Mentoring people in the team and organization

Requirements

3+ years of experience in building and running data pipelines that scale for TBs of data
Proficiency in a high-level object-oriented programming language (Python or Java) is a must
Experience with cloud data platforms like Snowflake and AWS (EMR/Athena) is a must
Experience in building modern data lakehouse architectures using Snowflake and formats like Apache Iceberg/Hudi, Parquet, etc.
Proficiency in data modeling, SQL query profiling, and data warehousing is a must
Experience in distributed data processing engines like Apache Spark, Apache Flink, Dataflow/Apache Beam, etc.
Knowledge of workflow orchestrators like Airflow, Dagster, etc. is a plus
Data visualization skills are a plus (PowerBI, Metabase, Tableau, Hex, Sigma, etc)
Excellent verbal and written communication skills
Bachelor’s Degree in Computer Science (or equivalent)

Benefits

Hybrid setup
Worker's insurance
Paid time off
Other employee benefits to be discussed by our Talent Acquisition team in India.

]]> full-time senior hybrid Python, Java, Snowflake, AWS, EMR/Athena, Apache Iceberg/Hudi, Parquet, Apache Spark, Apache Flink, Dataflow/Apache Beam, Airflow, Data modeling, SQL query profiling, data warehousing, PowerBI, Metabase, Tableau, Hex, Sigma Engineering Technology Helpshift https://logos.yubhub.co/j.com.png Helpshift is a company that provides a mobile-first customer service experience for B2B brands. It has over 900 million active monthly consumers and is used by hundreds of leading brands. https://apply.workable.com https://apply.workable.com/j/D451DB2325 Pune, Maharashtra, India 2026-03-09 5008b4f7-b62 Member of Technical Staff - Data Research Engineer - MAI Superintelligence Team We are seeking Data Research Engineers to join our Multimodal team, where we are building the next generation of foundation models across vision, language, audio, and beyond. If you are passionate about designing and curating high-quality datasets to power frontier AI models, this role is for you.

In this role, you’ll work at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse multimodal data sources critical to model development. You will lead efforts to:

Develop novel data collection strategies
Improve dataset quality and integrity
Understand data-driven model behaviors
Align datasets with ethical and societal values

This is a cross-disciplinary, high-impact role ideal for engineers who want to push the boundaries of what AI can learn from data, especially in multimodal contexts.

Microsoft Superintelligence Team

The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control.

Responsibilities

Create high-quality datasets for training and evaluation; run experiments on new datasets (data ablations) to assess their impact and determine the most effective data.
Develop and maintain scalable data pipelines for multimodal ingestion, preprocessing, filtering, and annotation.
Analyze real-world multimodal datasets to assess quality, diversity, relevance, and identify areas for improvement.
Build lightweight tools and workflows for dataset auditing, visualization, and versioning.
Collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices.

Embody our culture and values.

Qualifications

Bachelor’s Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience.
2+ years of experience in data analysis or data engineering, including work with large-scale datasets that are unstructured or semi-structured.
Proficiency in statistics and exploratory data analysis methods.
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam.
Ability to communicate technical findings clearly to research and product teams.

]]> full-time staff hybrid USD $119,800 – $234,700 per year Python, Pandas, NumPy, Spark, Ray, Apache Beam, Data analysis, Data engineering, Statistics, Exploratory data analysis, Data processing frameworks, Lightweight tools and workflows, Dataset auditing, Visualization, Versioning Engineering Technology Microsoft https://logos.yubhub.co/microsoft.ai.png Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-data-research-engineer-mai-superintelligence-team-6/ New York 2026-03-08 9ee6a205-e17 Member of Technical Staff - Pretraining Text Data We are seeking engineers and researchers to join our Pretraining Text Data team, where we are building the next generation of foundation large language models. If you are passionate about designing and curating high-quality datasets to power frontier AI models, this role is for you.

In this role, you’ll work at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse text datasets critical to model development. You will lead efforts to:

Develop novel data collection strategies
Improve dataset quality and integrity
Understand data-driven model behaviors
Train models to understand the impact of data and data mixes
Align datasets with ethical and societal values

This is a cross-disciplinary, high-impact role ideal for engineers and researchers who want to push the boundaries of what AI can learn from data.

Microsoft Superintelligence Team

Responsibilities

Create high-quality datasets for training and evaluation; run experiments on new datasets (data ablations) to assess their impact and determine the most effective data.
Develop and maintain scalable data pipelines for text data ingestion, preprocessing, filtering, and annotation.
Analyze real-world text datasets to assess quality, diversity, relevance, and identify areas for improvement.
Build lightweight tools and workflows for dataset auditing, visualization, and versioning.
Collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices.

Embody our culture and values.

Qualifications

Bachelor’s Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience.
2+ years of experience in data analysis or data engineering, including work with large-scale datasets that are unstructured or semi-structured.
Proficiency in statistics and exploratory data analysis methods.
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam.
Ability to communicate technical findings clearly to research and product teams.

Software Engineering IC4

The typical base pay range for this role across the U.S. is USD $119,800 – $234,700 per year.

Software Engineering IC5

The typical base pay range for this role across the U.S. is USD $139,900 – $274,800 per year.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

]]> full-time staff hybrid USD $119,800 – $234,700 per year Python, Pandas, NumPy, Spark, Ray, Apache Beam, Data Science, Statistics, Physics, Engineering, Data Analysis, Data Engineering, Exploratory Data Analysis Engineering Technology Microsoft https://logos.yubhub.co/microsoft.ai.png Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-pretraining-text-data-3/ New York 2026-03-08 f0e01847-2e0 Member of Technical Staff - Data Research Engineer - MAI Superintelligence Team We are seeking Data Research Engineers to join our Multimodal team, where we are building the next generation of foundation models across vision, language, audio, and beyond. If you are passionate about designing and curating high-quality datasets to power frontier AI models, this role is for you. In this role, you’ll work at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse multimodal data sources critical to model development. You will lead efforts to:

Develop novel data collection strategies

Improve dataset quality and integrity

Understand data-driven model behaviors

Align datasets with ethical and societal values

This is a cross-disciplinary, high-impact role ideal for engineers who want to push the boundaries of what AI can learn from data, especially in multimodal contexts.

]]> full-time staff hybrid USD $119,800 – $234,700 per year (U.S.) or USD $158,400 – $258,000 per year (San Francisco Bay area and New York City metropolitan area) Python, Pandas, NumPy, data libraries, data analysis, data engineering, large-scale datasets, unstructured or semi-structured data, statistics, exploratory data analysis methods, data processing frameworks, Spark, Ray, Apache Beam, Master’s Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline, 8+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) Engineering Technology Microsoft https://logos.yubhub.co/microsoft.ai.png Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-data-research-engineer-mai-superintelligence-team-4/ Mountain View 2026-03-08 2bfc37e4-bc3 Researcher, Pretraining Safety Job Posting

Researcher, Pretraining Safety

Location

San Francisco

Employment Type

Full time

Department

Safety Systems

Compensation

$295K – $445K • Offers Equity

The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts

Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)

401(k) retirement plan with employer match

Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)

Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees

13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)

Mental health and wellness support

Employer-paid basic life and disability coverage

Annual learning and development stipend to fuel your professional growth

Daily meals in our offices, and meal delivery credits as eligible

Relocation support for eligible employees

Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.

More details about our benefits are available to candidates during the hiring process.

This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.

About the Team

The Safety Systems team is responsible for the safety work that ensures our best models can be safely deployed to the real world to benefit society. It is at the forefront of OpenAI's mission to build and deploy safe AGI, driving our commitment to AI safety and fostering a culture of trust and transparency.

The Pretraining Safety team’s goal is to build safer, more capable base models and enable earlier, more reliable safety evaluation during training. We aim to:

Develop upstream safety evaluations to monitor how and when unsafe behaviors and goals emerge

Create safer priors through targeted pretraining and mid-training interventions that make downstream alignment more effective and efficient

Design safe-by-design architectures that allow for more controllability of model capabilities

In addition, we will conduct the foundational research necessary for understanding how behaviors emerge, generalize, and can be reliably measured throughout training.

About the Role

The Pretraining Safety team is pioneering how safety is built into models before they reach post-training and deployment. In this role, you will work throughout the full stack of model development with a focus on pre-training:

Identify safety-relevant behaviors as they first emerge in base models

Evaluate and reduce risk without waiting for full-scale training runs

Design architectures and training setups that make safer behavior the default

Strengthen models by incorporating richer, earlier safety signals

We collaborate across OpenAI’s safety ecosystem—from Safety Systems to Training—to ensure that safety foundations are robust, scalable, and grounded in real-world risks.

In this role, you will:

Develop new techniques to predict, measure, and evaluate unsafe behavior in early-stage models

Design data curation strategies that improve pretraining priors and reduce downstream risk

Explore safe-by-design architectures and training configurations that improve controllability

Introduce novel safety-oriented loss functions, metrics, and evals into the pretraining stack

Work closely with cross-functional safety teams to unify pre- and post-training risk reduction

You might thrive in this role if you:

Have experience developing or scaling pretraining architectures (LLMs, diffusion models, multimodal models, etc.)

Are comfortable working with training infrastructure, data pipelines, and evaluation frameworks (e.g., Python, PyTorch/JAX, Apache Beam)

Enjoy hands-on research — designing, implementing, and iterating on experiments

Enjoy collaborating with diverse technical and cross-functional partners (e.g., policy, legal, training)

Are data-driven with strong statistical reasoning and rigor in experimental design

Value building clean, scalable research workflows and streamlining processes for yourself and others

About OpenAI

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.

]]> full-time mid onsite $295K – $445K • Offers Equity pretraining architectures, training infrastructure, data pipelines, evaluation frameworks, Python, PyTorch/JAX, Apache Beam, hands-on research, collaboration, data-driven, statistical reasoning, LLMs, diffusion models, multimodal models, safe-by-design architectures, training configurations, loss functions, metrics, evals Engineering Technology OpenAI https://logos.yubhub.co/openai.com.png OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. https://jobs.ashbyhq.com https://jobs.ashbyhq.com/openai/d829b701-5ee2-414f-8596-ef94911a168a San Francisco 2026-03-06 55f3e52b-904 Member of Technical Staff - Data Research Engineer Summary

Microsoft AI are looking for a talented Member of Technical Staff - Data Research Engineer at their Redmond office. This role sits at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse multimodal data sources critical to model development. You will lead efforts to develop novel data collection strategies, improve dataset quality and integrity, understand data-driven model behaviors, and align datasets with ethical and societal values.

About the Role

As a Data Research Engineer, you will be responsible for creating high-quality datasets for training and evaluation and for running experiments on new datasets (data ablations) to assess their impact and determine the most effective data. You will also develop and maintain scalable data pipelines for multimodal ingestion, preprocessing, filtering, and annotation. Additionally, you will analyze real-world multimodal datasets to assess quality, diversity, and relevance and to identify areas for improvement. You will build lightweight tools and workflows for dataset auditing, visualization, and versioning. You will collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices.

Accountabilities

Create high-quality datasets for training and evaluation
Run experiments on new datasets (data ablations) to assess their impact and determine the most effective data
Develop and maintain scalable data pipelines for multimodal ingestion, preprocessing, filtering, and annotation
Analyze real-world multimodal datasets to assess quality, diversity, and relevance and to identify areas for improvement
Build lightweight tools and workflows for dataset auditing, visualization, and versioning

The Candidate we're looking for

Experience:

4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.)

Technical skills:

Proficiency in statistics and exploratory data analysis methods
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam

Personal attributes:

Ability to communicate technical findings clearly to research and product teams

Benefits

Competitive salary
Comprehensive benefits package
Opportunities for professional growth and development
Collaborative and dynamic work environment

]]> full-time staff onsite USD $119,800 – $234,700 per year Python, Pandas, NumPy, Spark, Ray, Apache Beam, statistics, exploratory data analysis, data processing frameworks Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading technology company that specializes in artificial intelligence, machine learning, and data science. They are known for their innovative products and services that help organizations make data-driven decisions. Microsoft AI is committed to empowering every person and organization on the planet to achieve more. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-data-research-engineer-mai-superintelligence-team-5/ Redmond 2026-03-06 41ac4a39-9a3 Member of Technical Staff - Pretraining Text Data Summary

Microsoft AI are looking for a talented Member of Technical Staff - Pretraining Text Data at their Redmond office. This role sits at the intersection of data and innovation: you will collaborate with scientists, engineers, and annotators to curate, analyze, and evaluate the text datasets critical to model development.

About the Role

We are seeking engineers and researchers to join our Pretraining Text Data team, where we are building the next generation of foundation large language models. If you are passionate about designing and curating high-quality datasets to power frontier AI models, this role is for you. In this role, you’ll work at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse text datasets critical to model development. You will lead efforts to:

Develop novel data collection strategies
Improve dataset quality and integrity
Understand data-driven model behaviors
Train models to understand the impact of data and data mixes
Align datasets with ethical and societal values

Accountabilities

Create high-quality datasets for training and evaluation; run experiments on new datasets (data ablations) to assess their impact and determine the most effective data.
Develop and maintain scalable data pipelines for text data ingestion, preprocessing, filtering, and annotation.
Analyze real-world text datasets to assess quality, diversity, and relevance and to identify areas for improvement.
Build lightweight tools and workflows for dataset auditing, visualization, and versioning.
Collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices.

The Candidate we're looking for

Experience:

Bachelor’s Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience.

Technical skills:

Proficiency in statistics and exploratory data analysis methods.
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam.

Personal attributes:

Ability to communicate technical findings clearly to research and product teams.

Benefits

Competitive salary
Comprehensive benefits package
Opportunities for professional growth and development
Collaborative and dynamic work environment
Access to cutting-edge technology and resources

]]> full-time staff onsite USD $119,800 – $234,700 per year Python, Pandas, NumPy, Spark, Ray, Apache Beam, statistics, exploratory data analysis, data processing frameworks Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading technology company that specializes in artificial intelligence, machine learning, and data science. They are known for their innovative products and services that empower individuals and organizations to achieve more. Microsoft AI is committed to pushing the boundaries of what is possible with AI and making it accessible to everyone. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-pretraining-text-data-2/ Redmond 2026-03-06 365605e7-0ca Member of Technical Staff, Data Research Engineer Summary

Microsoft AI are looking for a talented Member of Technical Staff, Data Research Engineer to join their MAI Superintelligence Team in Zürich, Switzerland. In this role, you will create high-quality datasets for training and evaluation and build the scalable data pipelines that support multimodal model development.

About the Role

As a Data Research Engineer, you will be responsible for creating high-quality datasets for training and evaluation, running experiments on new datasets to assess their impact, and developing and maintaining scalable data pipelines for multimodal ingestion, pre-processing, filtering, and annotation. You will also analyze real-world multimodal datasets to assess quality, diversity, and relevance and to identify areas for improvement. Additionally, you will build lightweight tools and workflows for dataset auditing, visualization, and versioning, and collaborate with Safety, Ethics, and Governance teams to ensure datasets meet standards for quality, privacy, and responsible AI practices.

Accountabilities

Create high-quality datasets for training and evaluation
Run experiments on new datasets to assess their impact and determine the most effective data
Develop and maintain scalable data pipelines for multimodal ingestion, pre-processing, filtering, and annotation
Analyze real-world multimodal datasets to assess quality, diversity, and relevance and to identify areas for improvement
Build lightweight tools and workflows for dataset auditing, visualization, and versioning

The Candidate we're looking for

Experience:

Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or a related technical field
Technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.)

Technical skills:

Proficiency in statistics and exploratory data analysis methods
Experience in data analysis or data engineering

Personal attributes:

Ability to communicate technical findings effectively to research and product teams

Benefits

Competitive salary and benefits package
Opportunity to work with a leading technology company in the AI industry
Collaborative and dynamic work environment
Professional development opportunities

]]> full-time staff onsite Python, Pandas, NumPy, statistics, data analysis, data engineering, Spark, Ray, Apache Beam, large-scale data processing Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading technology company that specializes in artificial intelligence, machine learning, and data science. They are known for their innovative products and services that empower individuals and organizations to achieve more. Microsoft AI is committed to making a positive impact on society through their work in AI. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-data-research-engineer-mai-superintelligence-team-2/ Zürich, Switzerland 2026-03-06 88f19c96-557 Member of Technical Staff, Data Research Engineer Summary

Microsoft AI are looking for a talented Data Research Engineer to join their MAI Superintelligence Team in London. In this role, you will create high-quality datasets for training and evaluation, run experiments to assess their impact, and develop scalable data pipelines for multimodal ingestion, pre-processing, filtering, and annotation.

About the Role

Accountabilities

Create high-quality datasets for training and evaluation
Run experiments on new datasets to assess their impact and determine the most effective data
Develop and maintain scalable data pipelines for multimodal ingestion, pre-processing, filtering, and annotation

The Candidate we're looking for

Experience:

Bachelor's Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or a related technical field
Technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.)

Technical skills:

Proficiency in statistics and exploratory data analysis methods
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam

Personal attributes:

Ability to communicate technical findings effectively to research and product teams

Benefits

Competitive salary and benefits package
Opportunities for professional growth and development
Collaborative and dynamic work environment

]]> full-time staff onsite Competitive salary and benefits package Python, Pandas, NumPy, Spark, Ray, Apache Beam, Data processing frameworks, Machine learning algorithms Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading technology company that specializes in artificial intelligence, machine learning, and data science. They are known for their innovative products and services that empower individuals and organizations to achieve more. Microsoft AI is committed to making a positive impact on society through their technology and research. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-data-research-engineer-mai-superintelligence-team/ London 2026-03-06 63bd919b-7b3 Member of Technical Staff - Pretraining Text Data Summary

Microsoft AI are looking for a talented Member of Technical Staff - Pretraining Text Data at their Mountain View office. This role sits at the intersection of data and innovation: you will collaborate with scientists, engineers, and annotators to curate, analyze, and evaluate the text datasets critical to model development.

About the Role

In this role, you'll work at the intersection of data and innovation—collaborating with scientists, engineers, and annotators to curate, analyze, and evaluate diverse text datasets critical to model development. You will lead efforts to:

Develop novel data collection strategies
Improve dataset quality and integrity
Understand data-driven model behaviors
Train models to understand the impact of data and data mixes
Align datasets with ethical and societal values

Accountabilities

Create high-quality datasets for training and evaluation; run experiments on new datasets (data ablations) to assess their impact and determine the most effective data.
Develop and maintain scalable data pipelines for text data ingestion, preprocessing, filtering, and annotation.

The Candidate we're looking for

Experience:

Bachelor’s Degree in AI, Computer Science, Data Science, Statistics, Physics, Engineering, or related technical discipline AND 4+ years technical engineering experience with coding in languages including, but not limited to, Python and common data libraries (Pandas, NumPy, etc.) OR equivalent experience.

Technical skills:

Proficiency in statistics and exploratory data analysis methods.
Familiarity with data processing frameworks such as Spark, Ray, or Apache Beam.

Personal attributes:

Ability to communicate technical findings clearly to research and product teams.

Benefits

Competitive salary
Comprehensive benefits package
Opportunities for professional growth and development
Collaborative and dynamic work environment

]]> full-time staff onsite USD $119,800 – $234,700 per year Python, Pandas, NumPy, Spark, Ray, Apache Beam, Data analysis, Data engineering, Machine learning Engineering Technology Microsoft AI https://logos.yubhub.co/microsoft.ai.png Microsoft AI is a leading technology company that specializes in artificial intelligence, machine learning, and data science. They are known for their innovative products and services that empower individuals and organizations to achieve more. Microsoft AI is committed to pushing the boundaries of what is possible with AI and making it accessible to everyone. https://microsoft.ai https://microsoft.ai/job/member-of-technical-staff-pretraining-text-data/ Mountain View 2026-03-06