<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>47de4683-b45</externalid>
      <Title>Staff+ Software Engineer, Platform</Title>
      <Description><![CDATA[<p>We are looking for experienced software engineers to join our Platform organisation. We build the foundational primitives that accelerate product development across Anthropic, and own infrastructure and systems that teams depend on to ship reliably and at scale.</p>
<p>As a Staff+ Software Engineer, you will independently scope complex, multi-month projects, drive cross-org alignment through ambiguous problem spaces, and make architectural decisions that shape how Anthropic builds and scales its products. You will partner directly with research to productize cutting-edge capabilities, and will have lasting impact on the platform that hundreds of thousands of companies and internal/external engineers depend on every day.</p>
<p>Our team is responsible for Platform Acceleration, Service Infra, Multicloud, Auth &amp; Identity, and Connectivity. We work on maximising developer productivity of product engineers at Anthropic, building and maintaining the core infrastructure that powers Anthropic&#39;s engineering organisation, operating across multiple cloud providers, and powering identity and authentication across Anthropic&#39;s product suite.</p>
<p>You will work on problems where reliability and enterprise trust are the bar: token refresh at scale, admin controls that let IT govern what agents can do, proxy infrastructure that stays up when partner servers don&#39;t. We ship for claude.ai, Claude Code, Cowork, and the API.</p>
<p>Relevant experience includes OAuth, API gateways, multi-tenant platforms, building for enterprise, and MCP.</p>
<p>We are looking for someone with 8-10+ years of practical full-stack engineering experience, ideally with 2+ years operating at a Staff or equivalent technical leadership level. You should have led the design and delivery of complex, consumer or B2B user-facing products across the full stack, and take a product-focused approach to building solutions that are robust, scalable, and easy to use.</p>
<p>Strong candidates may also have served as a technical lead or architect for a foundational platform system, owning both the technical vision and execution end-to-end, or experience designing or scaling billing, payments, or financial infrastructure at high transaction volumes.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$405,000-$485,000 USD</Salaryrange>
      <Skills>OAuth, API gateways, multi-tenant platforms, building for enterprise, MCP, full-stack engineering, platform infrastructure</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Anthropic</Employername>
      <Employerlogo>https://logos.yubhub.co/anthropic.co.png</Employerlogo>
      <Employerdescription>Anthropic creates reliable, interpretable, and steerable AI systems. It has a rapidly growing team of researchers, engineers, policy experts, and business leaders.</Employerdescription>
      <Employerwebsite>https://www.anthropic.com/</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>405000</Compensationmin>
      <Compensationmax>485000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/anthropic/jobs/5157847008</Applyto>
      <Location>San Francisco, CA | New York City, NY | Seattle, WA</Location>
      <Country>United States</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>96d05ee1-799</externalid>
      <Title>Staff Software Engineer, Cluster Orchestration</Title>
      <Description><![CDATA[<p><strong>Job Description</strong></p>
<p>CoreWeave is The Essential Cloud for AI. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence.</p>
<p>Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability.</p>
<p>Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025.</p>
<p><strong>About the Role</strong></p>
<p>As part of the Cluster Orchestration team, you will play a key role in advancing CoreWeave&#39;s orchestration platform, including SUNK (Slurm on Kubernetes) and beyond: the Kubernetes-native foundation that powers AI training and inference at scale.</p>
<p>This is an opportunity to help shape one of the most critical layers of the AI cloud: ensuring workloads run seamlessly, reliably, and efficiently across massive GPU clusters.</p>
<p>By building the systems that eliminate infrastructure bottlenecks and create new orchestration capabilities, you will directly empower customers to innovate faster and push the boundaries of what&#39;s possible with AI.</p>
<p><strong>What You&#39;ll Do</strong></p>
<p>As a Staff Engineer, you will be a technical leader shaping the long-term strategy for CoreWeave&#39;s orchestration platform.</p>
<p>You&#39;ll define architectural direction, own critical parts of the orchestration platform and other managed services, and drive cross-org initiatives in scheduling, quota enforcement, and scaling at hyperscale.</p>
<p>You&#39;ll mentor senior engineers, establish org-wide best practices in reliability and observability, and ensure CoreWeave&#39;s orchestration layer evolves to meet the demands of next-generation AI workloads.</p>
<p><strong>Who You Are</strong></p>
<ul>
<li>8+ years of software engineering experience.</li>
<li>Proven track record designing and operating large-scale distributed systems in production.</li>
<li>Deep expertise in Slurm/Kubernetes internals and cloud-native development.</li>
<li>Advanced proficiency in Go and distributed systems design.</li>
<li>Experience setting technical direction and influencing cross-team architecture.</li>
<li>Bachelor&#39;s or Master&#39;s degree in CS, EE, or related field.</li>
</ul>
<p><strong>Preferred</strong></p>
<ul>
<li>Familiarity with orchestration and workflow technologies such as Ray, Kubeflow, Kueue, Istio, Knative, or Argo Workflows.</li>
<li>Experience with distributed workloads, GPU-based applications, or ML pipelines.</li>
<li>Knowledge of scheduling concepts like quota enforcement, pre-emption, and scaling strategies.</li>
<li>Exposure to reliability practices including SLOs, alarms, and post-incident reviews.</li>
<li>Experience with AI infrastructure and workloads (ML training, inference, or HPC).</li>
<li>Ability to mentor senior engineers and elevate organizational standards.</li>
</ul>
<p><strong>Why CoreWeave?</strong></p>
<p>At CoreWeave, we work hard, have fun, and move fast! We&#39;re in an exciting stage of hyper-growth that you will not want to miss out on.</p>
<p>We&#39;re not afraid of a little chaos, and we&#39;re constantly learning.</p>
<p>Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>
<ul>
<li>Be Curious at Your Core</li>
<li>Act Like an Owner</li>
<li>Empower Employees</li>
<li>Deliver Best-in-Class Client Experiences</li>
<li>Achieve More Together</li>
</ul>
<p>We support and encourage an entrepreneurial outlook and independent thinking.</p>
<p>We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems.</p>
<p>As we get set for takeoff, the growth opportunities within the organization are constantly expanding.</p>
<p>You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.</p>
<p>Come join us!</p>
<p><strong>Salary and Benefits</strong></p>
<p>The base salary range for this role is $185,000 to $275,000.</p>
<p>The starting salary will be determined based on job-related knowledge, skills, experience, and market location.</p>
<p>We strive for both market alignment and internal equity when determining compensation.</p>
<p>In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>
<p><strong>What We Offer</strong></p>
<p>The range we&#39;ve posted represents the typical compensation range for this role.</p>
<p>To determine actual compensation, we review the market rate for each candidate which can include a variety of factors.</p>
<p>These include qualifications, experience, interview performance, and location.</p>
<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>
<ul>
<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>
<li>Company-paid Life Insurance</li>
<li>Voluntary supplemental life insurance</li>
<li>Short and long-term disability insurance</li>
<li>Flexible Spending Account</li>
<li>Health Savings Account</li>
<li>Tuition Reimbursement</li>
<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>
<li>Mental Wellness Benefits through Spring Health</li>
<li>Family-Forming support provided by Carrot</li>
<li>Paid Parental Leave</li>
<li>Flexible, full-service childcare support with Kinside</li>
<li>401(k) with a generous employer match</li>
<li>Flexible PTO</li>
<li>Catered lunch each day in our office and data center locations</li>
<li>A casual work environment</li>
<li>A work culture focused on innovative disruption</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$185,000 to $275,000</Salaryrange>
      <Skills>software engineering, distributed systems, Slurm, Kubernetes, cloud-native development, Go, scheduling, quota enforcement, scaling strategies, reliability practices, SLOs, alarms, post-incident reviews, AI infrastructure, workloads, ML training, inference, HPC, orchestration and workflow technologies, Ray, Kubeflow, Kueue, Istio, Knative, Argo Workflows</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>CoreWeave</Employername>
      <Employerlogo>https://logos.yubhub.co/coreweave.com.png</Employerlogo>
      <Employerdescription>CoreWeave is a cloud computing company that provides a platform for building and scaling AI with confidence.</Employerdescription>
      <Employerwebsite>https://www.coreweave.com</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>185000</Compensationmin>
      <Compensationmax>275000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/coreweave/jobs/4658801006</Applyto>
      <Location>Bellevue, WA / Sunnyvale, CA</Location>
      <Country>United States</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>ff4d3a91-b20</externalid>
      <Title>Principal Engineer - Perf and Benchmarking</Title>
      <Description><![CDATA[<p>We&#39;re looking for a Principal Engineer to be the technical lead of CoreWeave&#39;s Benchmarking &amp; Performance team. You will be responsible for our planet-scale performance data warehouse: ingesting, storing, transforming, and analyzing performance events across all the data centers in our global infrastructure.</p>
<p>You will also be integral to achieving industry-leading end-to-end performance benchmarking publications. If MLPerf (Training &amp; Inference), working closely with NVIDIA (Megatron-LM, TensorRT-LLM &amp; DGX Cloud), and the open-source community (llm-d, vLLM, and other popular ML frameworks) speak to you, come help us demonstrate CoreWeave&#39;s performance and reliability leadership in the field.</p>
<p><strong>Responsibilities</strong></p>
<ul>
<li>Strategy &amp; Leadership - Define the multi-year benchmarking strategy and roadmap; prioritize models/workloads (LLMs, diffusion, vision, speech) and hardware tiers. Build, lead, and mentor a high-performing team of performance engineers and data analysts. Establish governance for claims: documented methodologies, versioning, reproducibility, and audit trails.</li>
<li>Perf Ownership - Lead end-to-end MLPerf Inference and Training submissions: workload selection, cluster planning, runbooks, audits, and result publication. Coordinate optimization tracks with NVIDIA (CUDA, cuDNN, TensorRT/TensorRT-LLM, Triton, NCCL) to hit competitive results; drive upstream fixes where needed.</li>
<li>Internal Latency &amp; Throughput Benchmarks - Design a Kubernetes-native, repeatable benchmarking service that exercises CoreWeave stacks across SUNK (Slurm on Kubernetes), Kueue, and Kubeflow pipelines. Measure and report p50/p95/p99 latency, jitter, tokens/s, time-to-first-token, cold-start/warm-start, and cost-per-token/request across models, precisions (BF16/FP8/FP4), batch sizes, and GPU types. Maintain a corpus of representative scenarios (streaming, batch, multi-tenant) and data sets; automate comparisons across software releases and hardware generations.</li>
<li>Tooling &amp; Automation - Build CI/CD pipelines and K8s controllers/operators to schedule benchmarks at scale; integrate with observability stacks (Prometheus, Grafana, OpenTelemetry) and results warehouses. Implement supply-chain integrity for benchmark artifacts (SBOMs, Cosign signatures).</li>
<li>Cross-functional &amp; Community - Partner with NVIDIA, key ISVs, and OSS projects (vLLM, Triton, KServe, PyTorch/DeepSpeed, ONNX Runtime) to co-develop optimizations and upstream improvements. Support Sales/SEs with authoritative numbers for RFPs and competitive evaluations; brief analysts and press with rigorous, defensible data.</li>
</ul>
<p><strong>Requirements</strong></p>
<ul>
<li>10+ years building distributed systems or HPC/cloud services, with deep expertise on large-scale ML training or similar high-performance workloads.</li>
<li>Proven track record of architecting or building planet-scale data systems (e.g., telemetry platforms, observability stacks, cloud data warehouses, large-scale OLAP engines).</li>
<li>Deep understanding of GPU performance (CUDA, NCCL, RDMA, NVLink/PCIe, memory bandwidth), model-server stacks (Triton, vLLM, TensorRT-LLM, TorchServe), and distributed training frameworks (PyTorch FSDP/DeepSpeed/Megatron-LM).</li>
<li>Proficient with Kubernetes and ML control planes; familiarity with SUNK, Kueue, and Kubeflow in production environments.</li>
<li>Excellent communicator able to interface with executives, customers, auditors, and OSS communities.</li>
</ul>
<p><strong>Nice to have</strong></p>
<ul>
<li>Experience with time-series databases, log-structured merge trees (LSM), or custom storage engine development.</li>
<li>Experience running MLPerf submissions (Inference and/or Training) or equivalent audited benchmarks at scale.</li>
<li>Contributions to MLPerf, Triton, vLLM, PyTorch, KServe, or similar OSS projects.</li>
<li>Experience benchmarking multi-region fleets and large clusters (thousands of GPUs).</li>
<li>Publications/talks on ML performance, latency engineering, or large-scale benchmarking methodology.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>principal</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$206,000 to $333,000</Salaryrange>
      <Skills>Distributed systems, HPC/cloud services, Large-scale ML training, GPU performance, Model-server stacks, Distributed training frameworks, Kubernetes, ML control planes, Time-series databases, Log-structured merge trees, Custom storage engine development, MLPerf submissions, Audited benchmarks, Contributions to OSS projects, Benchmarking multi-region fleets, Large clusters, Publications/talks on ML performance</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>CoreWeave</Employername>
      <Employerlogo>https://logos.yubhub.co/coreweave.com.png</Employerlogo>
      <Employerdescription>CoreWeave is a cloud-based platform for artificial intelligence that provides technology, tools, and teams to enable innovators to build and scale AI with confidence.</Employerdescription>
      <Employerwebsite>https://www.coreweave.com</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>206000</Compensationmin>
      <Compensationmax>333000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/coreweave/jobs/4627302006</Applyto>
      <Location>Sunnyvale, CA / Bellevue, WA</Location>
      <Country>United States</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>372999e8-579</externalid>
      <Title>Senior Software Engineer II, AI Workload Orchestration</Title>
      <Description><![CDATA[<p>As a Senior Software Engineer II on the AI Workload Orchestration team, you will help build and operate CoreWeave&#39;s Kubernetes-native platform for admitting, scheduling, and operating AI workloads at scale.</p>
<p>This platform integrates multiple orchestration and scheduling frameworks such as Kueue, Volcano, and Ray to support modern AI training and inference workflows. It complements SUNK (Slurm on Kubernetes) by providing a Kubernetes-first, cloud-native orchestration layer with deep platform integration.</p>
<p>You will own meaningful components of the platform, drive reliability and performance improvements, and help scale the system as customer demand and workload complexity continue to grow.</p>
<p>Responsibilities:</p>
<ul>
<li>Design, build, and operate Kubernetes-native services for AI workload orchestration and scheduling</li>
<li>Own one or more platform components end-to-end, including design, implementation, testing, and on-call support</li>
<li>Improve scheduling latency, cluster utilization, and workload reliability through metrics-driven engineering</li>
<li>Contribute to architectural discussions across services and influence design decisions within the platform</li>
<li>Work closely with adjacent teams (CKS, infrastructure, managed inference) to ensure clean interfaces and integrations</li>
<li>Mentor junior engineers and raise the quality bar for code, design, and operations</li>
</ul>
<p>About the role:</p>
<ul>
<li>5–8 years of professional software engineering experience in distributed systems, cloud infrastructure, or platform engineering</li>
<li>Strong experience building production systems in Go (Python or C++ a plus)</li>
<li>Solid understanding of Kubernetes fundamentals, APIs, controllers, and operating services in production</li>
<li>Experience working with scheduling, resource management, or quota-based systems</li>
<li>Proven ability to improve system reliability and performance using data and operational metrics</li>
<li>Comfortable owning services in production and participating in on-call rotations</li>
</ul>
<p>Preferred:</p>
<ul>
<li>Experience with Kubernetes-native orchestration frameworks such as Kueue, Volcano, Ray, Kubeflow, or Argo Workflows</li>
<li>Familiarity with GPU-based workloads, ML training, or inference pipelines</li>
<li>Knowledge of scheduling concepts such as quota enforcement, pre-emption, and backfilling</li>
<li>Experience with reliability practices including SLOs, alerting, and incident response</li>
<li>Exposure to AI infrastructure, HPC, or large-scale distributed compute environments</li>
</ul>
<p>Why CoreWeave?</p>
<p>At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>
<ul>
<li>Be Curious at Your Core</li>
<li>Act Like an Owner</li>
<li>Empower Employees</li>
<li>Deliver Best-in-Class Client Experiences</li>
<li>Achieve More Together</li>
</ul>
<p>The base salary range for this role is $165,000 to $242,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>
<p>What We Offer</p>
<p>The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate which can include a variety of factors. These include qualifications, experience, interview performance, and location.</p>
<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>
<ul>
<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>
<li>Company-paid Life Insurance</li>
<li>Voluntary supplemental life insurance</li>
<li>Short and long-term disability insurance</li>
<li>Flexible Spending Account</li>
<li>Health Savings Account</li>
<li>Tuition Reimbursement</li>
<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>
<li>Mental Wellness Benefits through Spring Health</li>
<li>Family-Forming support provided by Carrot</li>
<li>Paid Parental Leave</li>
<li>Flexible, full-service childcare support with Kinside</li>
<li>401(k) with a generous employer match</li>
<li>Flexible PTO</li>
<li>Catered lunch each day in our office and data center locations</li>
<li>A casual work environment</li>
<li>A work culture focused on innovative disruption</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$165,000 to $242,000</Salaryrange>
      <Skills>Kubernetes, Go, Distributed systems, Cloud infrastructure, Platform engineering, Scheduling, Resource management, Quota-based systems, Kueue, Volcano, Ray, Kubeflow, Argo Workflows, GPU-based workloads, ML training, Inference pipelines, SLOs, Alerting, Incident response, AI infrastructure, HPC, Large-scale distributed compute environments</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>CoreWeave</Employername>
      <Employerlogo>https://logos.yubhub.co/coreweave.com.png</Employerlogo>
      <Employerdescription>CoreWeave is a technology company that delivers a platform for building and scaling AI with confidence.</Employerdescription>
      <Employerwebsite>https://www.coreweave.com</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>165000</Compensationmin>
      <Compensationmax>242000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/coreweave/jobs/4647595006</Applyto>
      <Location>Sunnyvale, CA / Bellevue, WA</Location>
      <Country>United States</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>1ee5ad51-8f0</externalid>
      <Title>SWE - Grids - Fixed Term Contract - 6 Months - London, UK</Title>
      <Description><![CDATA[<p>We are seeking an experienced and hands-on Software Engineer for a fixed-term contract to join the Energy Grids team at Google DeepMind. In this individual contributor role, you will work at the cutting edge of power systems and machine learning, developing and deploying innovative AI solutions to optimize the operation of electrical power grids.</p>
<p>Your work will be critical to delivering a real-world validation of our approach, with a primary focus on core software engineering tasks to:</p>
<ul>
<li>Enable rapid, trustworthy experimentation.</li>
<li>Maintain rigorous benchmarking and testing.</li>
<li>Manage scale for both data and model size.</li>
<li>Ensure and maintain high data quality for both real-world and synthetic data.</li>
</ul>
<p><strong>Key Responsibilities</strong></p>
<ul>
<li>Design, implement, and maintain robust and reliable systems and workflows for generating large-scale synthetic and real datasets of power grid optimization problems.</li>
<li>Design and implement rigorous unit, integration, and system tests to ensure the reliability, accuracy, and maintained performance of our models and software, with a focus on data pipelines.</li>
<li>Maintain and contribute to our machine learning codebase, ensuring efficient data structures and seamless integration with our power system models and optimization solvers.</li>
<li>Ensure the codebase supports ongoing experimentation, while simultaneously increasing scalability, robustness, and reliability via improved integration testing and performance benchmarking.</li>
<li>Work closely and collaboratively with a team of engineers, research scientists, and product managers to deliver real-world impact.</li>
</ul>
<p><strong>Minimum Qualifications</strong></p>
<ul>
<li>Bachelor&#39;s degree in Computer Science, Software Engineering, or equivalent practical experience.</li>
<li>Excellent proficiency in C++, Python, or Jax.</li>
<li>Demonstrated experience developing or utilizing solutions for robustness or quality assurance within software and/or ML systems.</li>
<li>Experience processing, generating, and analyzing large-scale data, e.g. for ML applications.</li>
<li>Proven ability to discuss technical ideas effectively and collaborate in interdisciplinary teams.</li>
<li>Motivated by the prospect of real-world impact and focused on excellence in software development.</li>
</ul>
<p><strong>Preferred Qualifications</strong></p>
<ul>
<li>Experience with Google&#39;s technical stack and/or Google Cloud Platform (GCP).</li>
<li>Familiarity with modern hardware accelerators (GPU / TPU).</li>
<li>Experience with modern ML training frameworks, such as Jax.</li>
<li>Experience in developing software in a translational research or production setting.</li>
<li>Proficiency in Julia.</li>
</ul>
]]></Description>
      <Jobtype>contract</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>C++, Python, Jax, Robustness, Quality Assurance, Software Development, Machine Learning, Data Analysis, Google&apos;s technical stack, Google Cloud Platform (GCP), Modern hardware accelerators (GPU / TPU), Modern ML training frameworks (Jax), Software development in a translational research or production setting, Proficiency in Julia</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Google DeepMind</Employername>
      <Employerlogo>https://logos.yubhub.co/deepmind.com.png</Employerlogo>
      <Employerdescription>Google DeepMind is a subsidiary of Alphabet Inc., a multinational conglomerate.</Employerdescription>
      <Employerwebsite>https://deepmind.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/deepmind/jobs/7750738</Applyto>
      <Location>London, UK</Location>
      <Country>United Kingdom</Country>
      <Postedate>2026-04-18</Postedate>
    </job>
    <job>
      <externalid>f19254b6-7fd</externalid>
      <Title>SWE - Grids - Fixed Term Contract - 6 Months - London, UK</Title>
      <Description><![CDATA[<p>We are seeking an experienced Software Engineer for a fixed-term contract to join the Energy Grids team at Google DeepMind. You will work at the cutting edge of power systems and machine learning, developing and deploying innovative AI solutions to optimize the operation of electrical power grids.</p>
<p>Your key responsibilities will include:</p>
<ul>
<li>Designing, implementing, and maintaining robust and reliable systems and workflows for generating large-scale synthetic and real datasets of power grid optimization problems.</li>
<li>Designing and implementing rigorous unit, integration, and system tests to ensure the reliability, accuracy, and maintained performance of our models and software, with a focus on data pipelines.</li>
<li>Maintaining and contributing to our machine learning codebase, ensuring efficient data structures and seamless integration with our power system models and optimization solvers.</li>
<li>Ensuring the codebase supports ongoing experimentation, while simultaneously increasing scalability, robustness, and reliability via improved integration testing and performance benchmarking.</li>
<li>Working closely and collaboratively with a team of engineers, research scientists, and product managers to deliver real-world impact.</li>
</ul>
<p>To be successful in this role, you will need:</p>
<ul>
<li>A Bachelor&#39;s degree in Computer Science, Software Engineering, or equivalent practical experience.</li>
<li>Excellent proficiency in C++, Python, or Jax.</li>
<li>Demonstrated experience developing or utilizing solutions for robustness or quality assurance within software and/or ML systems.</li>
<li>Experience processing, generating, and analyzing large-scale data, e.g. for ML applications.</li>
<li>Proven ability to discuss technical ideas effectively and collaborate in interdisciplinary teams.</li>
<li>Motivation by the prospect of real-world impact and a focus on excellence in software development.</li>
</ul>
<p>Preferred qualifications include experience with Google&#39;s technical stack and/or Google Cloud Platform (GCP), familiarity with modern hardware accelerators (GPU / TPU), experience with modern ML training frameworks, such as Jax, and experience in developing software in a translational research or production setting.</p>
]]></Description>
      <Jobtype>contract</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>C++, Python, Jax, Machine Learning, Software Development, Data Analysis, Data Pipelines, Google Cloud Platform (GCP), Modern Hardware Accelerators (GPU / TPU), Modern ML Training Frameworks (Jax), Translational Research or Production Setting</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Google DeepMind</Employername>
      <Employerlogo>https://logos.yubhub.co/deepmind.com.png</Employerlogo>
      <Employerdescription>Google DeepMind is a subsidiary of Alphabet Inc., a multinational conglomerate. It focuses on artificial intelligence research and development.</Employerdescription>
      <Employerwebsite>https://deepmind.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/deepmind/jobs/7750738</Applyto>
      <Location>London, UK</Location>
      <Country></Country>
      <Postedate>2026-03-31</Postedate>
    </job>
    <job>
      <externalid>93a4ece6-182</externalid>
      <Title>Member of Technical Staff, Site Reliability Engineer (HPC)</Title>
<Description><![CDATA[<p>As Microsoft continues to push the boundaries of AI, we are on the lookout for experienced individuals to work with us on the most interesting and challenging AI questions of our time. Our vision is to build systems that have true artificial intelligence across agents, applications, services, and infrastructure. We&#39;re looking for an experienced HPC Site Reliability Engineer (SRE) to join our High Performance Computing (HPC) infrastructure team. In this role, you&#39;ll blend software engineering and systems engineering to keep our large-scale distributed AI infrastructure reliable, efficient, and highly available.</p>
<p>Microsoft&#39;s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>
<p>This role is part of Microsoft AI&#39;s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being.</p>
<p><strong>Responsibilities</strong></p>
<ul>
<li><strong>Reliability &amp; Availability:</strong> Ensure uptime, resiliency, and fault tolerance of HPC clusters powering MAI model training and inference.</li>
<li><strong>Observability:</strong> Design and maintain monitoring, alerting, and logging systems that provide real-time visibility into all aspects of HPC systems, including GPUs, clusters, storage, and networking.</li>
<li><strong>Automation &amp; Tooling:</strong> Build automation for deployments, incident response, scaling, and failover in CPU+GPU environments.</li>
<li><strong>Incident Management:</strong> Lead on-call rotations, troubleshoot production issues, conduct blameless postmortems, and drive continuous improvements.</li>
<li><strong>Security &amp; Compliance:</strong> Ensure data privacy, compliance, and secure operations across model training and serving environments.</li>
<li><strong>Collaboration:</strong> Partner with ML engineers and platform teams to improve developer experience and accelerate research-to-production workflows.</li>
</ul>
<p><strong>Qualifications</strong></p>
<p><strong>Required Qualifications</strong></p>
<ul>
<li>Master’s Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering; OR</li>
<li>Bachelor’s Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering; OR</li>
<li>Equivalent experience.</li>
</ul>
<p><strong>Preferred Qualifications</strong></p>
<ul>
<li>Strong proficiency in Kubernetes, Docker, and container orchestration.</li>
<li>Knowledge of CI/CD pipelines for inference and ML model deployment.</li>
<li>Hands-on experience with public cloud platforms such as Azure, AWS, or GCP, and with infrastructure-as-code.</li>
<li>Expertise in monitoring &amp; observability tools (Grafana, Datadog, OpenTelemetry, etc.).</li>
<li>Strong programming/scripting skills in Python, Go, or Bash.</li>
<li>Solid knowledge of distributed systems, networking, and storage.</li>
<li>Experience running large-scale GPU clusters for ML/AI workloads.</li>
<li>Familiarity with ML training/inference pipelines.</li>
<li>Experience with high-performance computing (HPC) and workload schedulers (e.g. Kubernetes operators).</li>
<li>Background in capacity planning &amp; cost optimization for GPU-heavy environments.</li>
</ul>
<p>Work on cutting-edge infrastructure that powers the future of Generative AI. Collaborate with world-class researchers and engineers. Impact millions of users through reliable and responsible AI deployments. Competitive compensation, equity options, and comprehensive benefits.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$139,900 – $274,800 per year</Salaryrange>
      <Skills>Kubernetes, Docker, container orchestration, CI/CD pipelines, public cloud platforms, infrastructure-as-code, monitoring &amp; observability tools, Python, Go, Bash, distributed systems, networking, storage, GPU clusters, ML training/inference pipelines, high-performance computing, workload schedulers</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft is a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-site-reliability-engineer-hpc-mai-superintelligence-team/</Applyto>
      <Location>Mountain View</Location>
      <Country></Country>
      <Postedate>2026-03-08</Postedate>
    </job>
    <job>
      <externalid>d3a39f4c-d95</externalid>
      <Title>Software Engineer, Inference - Multi Modal</Title>
      <Description><![CDATA[<p><strong>Software Engineer, Inference - Multi Modal</strong></p>
<p><strong>Location</strong></p>
<p>San Francisco</p>
<p><strong>Employment Type</strong></p>
<p>Full time</p>
<p><strong>Department</strong></p>
<p>Scaling</p>
<p><strong>Compensation</strong></p>
<ul>
<li>$295K – $555K • Offers Equity</li>
</ul>
<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>
<ul>
<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>
<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>
<li>401(k) retirement plan with employer match</li>
<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>
<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>
<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>
<li>Mental health and wellness support</li>
<li>Employer-paid basic life and disability coverage</li>
<li>Annual learning and development stipend to fuel your professional growth</li>
<li>Daily meals in our offices, and meal delivery credits as eligible</li>
<li>Relocation support for eligible employees</li>
<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided</li>
</ul>
<p>More details about our benefits are available to candidates during the hiring process.</p>
<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>
<p><strong>About the Team</strong></p>
<p>OpenAI’s Inference team powers the deployment of our most advanced models - including our GPT models, 4o Image Generation, and Whisper - across a variety of platforms. Our work ensures these models are available, performant, and scalable in production, and we partner closely with Research to bring the next generation of models into the world. We&#39;re a small, fast-moving team of engineers focused on delivering a world-class developer experience while pushing the boundaries of what AI can do.</p>
<p>We’re expanding into multimodal inference, building the infrastructure needed to serve models that handle image, audio, and other non-text modalities. These workloads are inherently more heterogeneous and experimental, involving diverse model sizes and interactions, more complex input/output formats, and tighter coordination with product and research.</p>
<p><strong>About the Role</strong></p>
<p>We’re looking for a software engineer to help us serve OpenAI’s multimodal models at scale. You’ll be part of a small team responsible for building reliable, high-performance infrastructure for serving real-time audio, image, and other MM workloads in production.</p>
<p>This work is inherently cross-functional: you’ll collaborate directly with researchers training these models and with product teams defining new modalities of interaction. You&#39;ll build and optimize the systems that let users generate speech, understand images, and interact with models in ways far beyond text.</p>
<p><strong>In this role, you will:</strong></p>
<ul>
<li>Design and implement inference infrastructure for large-scale multimodal models.</li>
<li>Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs.</li>
<li>Enable experimental research workflows to transition into reliable production services.</li>
<li>Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities.</li>
<li>Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers.</li>
</ul>
<p><strong>You might thrive in this role if you:</strong></p>
<ul>
<li>Have experience building and scaling inference systems for LLMs or multimodal models.</li>
<li>Have worked with GPU-based ML workloads and understand the performance dynamics of large models, especially with complex data like images or audio.</li>
<li>Enjoy experimental, fast-evolving work and collaborating closely with research.</li>
<li>Are comfortable dealing with systems that span networking, distributed compute, and high-throughput data handling.</li>
<li>Have familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model parallel systems.</li>
<li>Own problems end-to-end and are excited to operate in ambiguous, fast-moving spaces.</li>
</ul>
<p><strong>Nice to Have:</strong></p>
<ul>
<li>Experience working with image generation or audio synthesis models in production.</li>
<li>Exposure to distributed ML training or system-efficient model design.</li>
</ul>
<p><strong>About OpenAI</strong></p>
<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$295K – $555K • Offers Equity</Salaryrange>
      <Skills>Software Engineer, Inference Infrastructure, GPU-based ML Workloads, Tensor Parallelism, Hardware Abstraction Layers, vLLM, TensorRT-LLM, Custom Model Parallel Systems, Image Generation, Audio Synthesis, Distributed ML Training, System-Efficient Model Design</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>OpenAI</Employername>
      <Employerlogo>https://logos.yubhub.co/openai.com.png</Employerlogo>
      <Employerdescription>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products.</Employerdescription>
      <Employerwebsite>https://openai.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/openai/4d14449e-5e7f-45d4-b103-8776a6c87086</Applyto>
      <Location>San Francisco</Location>
      <Country></Country>
      <Postedate>2026-03-06</Postedate>
    </job>
    <job>
      <externalid>ba52acc3-4fd</externalid>
      <Title>Engineering Site Lead</Title>
      <Description><![CDATA[<p>We&#39;re seeking an exceptional Site Lead to establish and scale our London office. This is a unique opportunity to shape Perplexity&#39;s presence in one of the world&#39;s leading tech hubs, building teams and culture from the ground up while driving technical excellence in infrastructure and AI systems.</p>
<p><strong>What you&#39;ll do</strong></p>
<p>As Site Lead, you&#39;ll serve as the face of Perplexity in London, responsible for building our technical organization, fostering a world-class engineering culture, and directly managing one or more infrastructure teams. You&#39;ll report to senior leadership and work cross-functionally with teams across our global footprint.</p>
<p><strong>What you need</strong></p>
<ul>
<li>10+ years of experience in software engineering with 5+ years in infrastructure, cloud infrastructure, or AI infrastructure roles</li>
<li>3+ years of people management experience, including building and scaling teams</li>
<li>Proven track record of establishing or significantly growing an engineering site or office</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>distributed systems, cloud platforms, infrastructure automation, GPU infrastructure and orchestration, ML training and inference pipelines, Model serving and deployment at scale, Kubernetes, Terraform, container orchestration, CI/CD systems, experience at companies focused on AI/ML, search, or large-scale consumer applications, previous experience as a site lead, office lead, or similar multi-team leadership role, background in building infrastructure for LLM training or inference, contributions to open-source infrastructure or AI infrastructure projects, experience scaling teams from 0 to 20+ engineers, active involvement in the London or European tech community</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Perplexity</Employername>
      <Employerlogo>https://logos.yubhub.co/perplexity.com.png</Employerlogo>
      <Employerdescription>Perplexity is revolutionizing how people discover and interact with information through AI-powered search and knowledge tools. As we expand our global footprint, we&apos;re establishing a strategic presence in London to drive innovation and growth across Europe.</Employerdescription>
      <Employerwebsite>https://www.perplexity.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/perplexity/638e6823-be7f-46c6-9675-7b1197fc9b8c</Applyto>
      <Location>London</Location>
      <Country></Country>
      <Postedate>2026-03-04</Postedate>
    </job>
  </jobs>
</source>