<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>64b848a0-7f0</externalid>
      <Title>5-6 Month Software Development Internship - GPU Simulation Data Processing</Title>
      <Description><![CDATA[<p>Our internship programs offer real-world projects, hands-on experience, and opportunities to collaborate with passionate teams globally. Explore your interests, share your ideas, and bring them to life while shaping your career path within our inclusive culture that fosters innovation and collaboration.</p>
<p>As a GPU Simulation Data Processing Intern, you will join the DPF group to extend a GPU-based simulation data processing toolbox, improving performance and scalability for Synopsys solvers and applications in numerical simulation domains.</p>
<p>Key responsibilities include:</p>
<ul>
<li>Extending a simulation data processing toolbox on GPU architectures to enhance performance and scalability for Synopsys solvers and applications</li>
<li>Collaborating with the DPF group to design and implement generic tools for pre- and post-processing in numerical simulation domains</li>
<li>Working closely with team members to identify and solve challenges based on GPU technologies, contributing innovative solutions</li>
<li>Participating in code reviews, documentation, and optimization of existing tools to ensure high-quality and robust deliverables</li>
<li>Sharing ideas and actively contributing to a collaborative, creative environment</li>
</ul>
<p>Requirements include:</p>
<ul>
<li>Currently pursuing a master’s or engineering degree in Computer Science or a related field</li>
<li>Strong programming skills in C++ and Python; familiarity with GitHub for version control</li>
<li>Experience with CUDA and understanding of GPU architectures</li>
<li>Ability to work effectively as part of a team, take initiative, and propose innovative solutions to technical challenges</li>
</ul>
<p>This is a 6-month internship with a full-time schedule (35 hours/week). The internship is located in Lyon, France, and requires the ability to work in-office.</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>internship</Jobtype>
      <Experiencelevel>entry</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>C++, Python, CUDA, GPU architectures, GitHub</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Synopsys</Employername>
      <Employerlogo>https://logos.yubhub.co/careers.synopsys.com.png</Employerlogo>
      <Employerdescription>Synopsys is a leading provider of electronic design automation (EDA) software and services for designing, verifying, and manufacturing electronic systems and microelectronic components.</Employerdescription>
      <Employerwebsite>https://careers.synopsys.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://careers.synopsys.com/job/villeurbanne/5-6-month-software-development-internship-gpu-simulation-data-processing-internship-toolbox-cuda/44408/93816738928</Applyto>
      <Location>Lyon</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>d8ab9aea-841</externalid>
      <Title>Senior Client Engineer - Rendering</Title>
      <Description><![CDATA[<p>Electronic Arts creates next-level entertainment experiences that inspire players and fans around the world. As one of the largest sports entertainment platforms in the world, EA SPORTS FC is redefining football with genre-leading interactive experiences, connecting a global community of fans to The World&#39;s Game through innovation and unrivalled authenticity.</p>
<p>We invite you to join our passionate and dynamic team as we pioneer the future of football fandom. As a Senior Client Engineer - Rendering, you will be responsible for using our existing game engine for game development, building upon the foundation of our current engine to enhance its functionality and performance. You will implement and fine-tune real-time shaders, dynamic lighting, and post-processing effects to enhance visual fidelity while maintaining target frame rates.</p>
<p>Responsibilities:</p>
<ul>
<li>Use our existing game engine for game development, building upon the foundation of our current engine to enhance its functionality and performance.</li>
<li>Implement and fine-tune real-time shaders, dynamic lighting, and post-processing effects to enhance visual fidelity while maintaining target frame rates.</li>
<li>Develop new tools and improve existing ones based on our existing toolchain.</li>
<li>Identify performance bottlenecks within games and improve them.</li>
<li>Communicate with game designers and artists, ensuring that program functionality aligns with design requirements.</li>
<li>Help create technical specifications and software architecture documents.</li>
<li>Communicate project progress and risks to superiors promptly.</li>
</ul>
<p>Qualifications:</p>
<ul>
<li>5+ years of professional experience in real-time graphics or rendering development for games.</li>
<li>Experience with 3D mathematics and algorithms, GPU architectures, and graphics APIs (such as Metal, Vulkan, or DirectX).</li>
<li>Experience in profiling and optimizing rendering performance on iOS and Android platforms.</li>
<li>Expert-level C++ and mastery of at least one shader language (such as HLSL or GLSL).</li>
<li>Hands-on experience developing in-house game engines or modifying at least one commercial game engine (such as Unreal Engine or Unity).</li>
<li>A collaborative mindset for working with art, design, and QA teams.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>C++, Real-time graphics, Rendering development, Game engine development, Shader languages (HLSL, GLSL), Metal, Vulkan, DirectX, 3D mathematics, GPU architectures</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Electronic Arts</Employername>
      <Employerlogo>https://logos.yubhub.co/jobs.ea.com.png</Employerlogo>
      <Employerdescription>Electronic Arts is a leading video game developer and publisher with a portfolio of popular titles such as EA SPORTS FC.</Employerdescription>
      <Employerwebsite>https://jobs.ea.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ea.com/en_US/careers/JobDetail/213520-Rendering-Engineer-III/213520</Applyto>
      <Location>Shanghai</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>002c5e0f-f56</externalid>
      <Title>Member of Technical Staff, Software Co-Design AI HPC Systems</Title>
      <Description><![CDATA[<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost.</p>
<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>
<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>
<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>
<p>About the Team</p>
<p>We build foundational AI infrastructure that enables large-scale training and inference across diverse workloads and rapidly evolving hardware generations. Our work directly shapes how AI systems are designed, deployed, and scaled today and into the future. Engineers on this team operate with end-to-end ownership, deep technical rigor, and a strong bias toward real-world impact.</p>
<p>Microsoft Superintelligence Team</p>
<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>
<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence: ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society, advancing science, education, and global well-being.</p>
<p>We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly ambitious, and low-ego individual, you’ll fit right in. Come and join us as we work on our next generation of models!</p>
<p>Responsibilities</p>
<ul>
<li>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</li>
<li>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</li>
<li>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</li>
<li>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</li>
<li>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</li>
<li>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</li>
<li>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</li>
<li>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</li>
</ul>
<p>Qualifications</p>
<p>Minimum Qualifications:</p>
<ul>
<li>Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or a related technical field, or equivalent practical experience.</li>
<li>10+ years of experience (or equivalent depth) working across systems software, hardware architecture, or AI infrastructure, with demonstrated impact at scale.</li>
<li>Strong background in one or more of the following areas: AI accelerator or GPU architectures; distributed systems and large-scale AI training/inference; high-performance computing (HPC) and collective communications; ML systems, runtimes, or compilers; performance modeling, benchmarking, and systems analysis; hardware–software co-design for AI workloads.</li>
<li>Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.</li>
<li>Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders.</li>
</ul>
<p>Preferred Qualifications:</p>
<ul>
<li>Experience designing or operating large-scale AI clusters for training or inference.</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python)</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a subsidiary of Microsoft Corporation, a multinational technology company that develops, manufactures, licenses, and supports a wide range of software products, services, and devices.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-5/</Applyto>
      <Location>Zürich</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>51b57192-d10</externalid>
      <Title>Member of Technical Staff, Capacity &amp; Efficiency Infrastructure - MAI Superintelligence Team</Title>
      <Description><![CDATA[<p>Microsoft AI is looking for a Member of Technical Staff – Capacity &amp; Efficiency Infrastructure to help us manage, and improve the efficiency of, our compute fleet. We&#39;re seeking someone who brings an abundance of positive energy, empathy, and kindness to the team every day, in addition to being highly effective. The ideal candidate enjoys building world-class consumer experiences and products in a fast-paced environment. You will actively contribute to the development of AI models powering our innovative products. Expect to wear multiple hats and work across engineering, research, and everything in between.</p>
<p>Your contributions will span model architecture, data curation, training and inference infrastructure, evaluation protocols, alignment and reinforcement learning from human feedback (RLHF), and many other exciting topics at the cutting edge of AI. Microsoft AI is building the training infrastructure that powers frontier-scale models and advances research toward humanist superintelligence. As a Member of Technical Staff – Capacity &amp; Efficiency, you will contribute to a fast-moving codebase that enables training at an unprecedented scale. This role will require building software and mathematical models for measuring the effectiveness of our capacity usage and then developing tools and techniques to help us improve. This will require you to partner with ML researchers to scale up the latest research recipes, implement new forms of distributed training parallelism, and ensure the reliability and performance of thousands of GPUs across our supercomputing fleet. Profiling, benchmarking, debugging, and fine-grained optimization are core to this role, demanding both engineering rigor and creativity.</p>
<p>Microsoft Superintelligence Team:</p>
<p>The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence: ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society, advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact.</p>
<p>Responsibilities:</p>
<ul>
<li>Design, implement, test, and optimize distributed training infrastructure in Python and C++ for large-scale GPU clusters.</li>
<li>Build and evolve telemetry systems to provide visibility into infrastructure &amp; ML model performance, utilization, and cost-related metrics.</li>
<li>Profile, benchmark, and debug performance bottlenecks across compute, memory, networking, and storage subsystems.</li>
<li>Drive architectural improvements across various ML services that deliver measurable efficiency improvements.</li>
<li>Build and evolve tools to automatically provide insights and recommendations to improve fleet-wide efficiency.</li>
<li>Optimize collective communication libraries (e.g., NCCL) for emerging NVLink and InfiniBand topologies.</li>
<li>Partner with ML researchers and infrastructure engineers to understand their plans and future needs and develop plans to balance growth with efficiency.</li>
<li>Collaborate with hardware teams to optimize for next-generation accelerators (NVIDIA, MAIA, and beyond).</li>
</ul>
<p>Qualifications:</p>
<ul>
<li>Bachelor’s Degree in Computer Science, or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>
<li>Deep understanding of the fundamentals of GPU architectures and DL/LLM architectures.</li>
<li>Deep experience in profiling and analyzing performance in large-scale distributed computing systems.</li>
<li>Experience with low-level GPU programming (CUDA, Triton, NCCL) and frameworks such as PyTorch or JAX.</li>
<li>Experience in leading technical projects and supporting architectural decisions with data.</li>
<li>Experience building infrastructure for large-scale machine learning or generative AI workloads.</li>
<li>Experience in networking (InfiniBand, NVLink), storage systems, or distributed training parallelisms.</li>
<li>Track record of contributing to high-performance computing or large-scale AI infrastructure projects.</li>
</ul>
<p>Software Engineering IC4 – The typical base pay range for this role across the U.S. is USD $119,800 – $234,700 per year.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>C, C++, Python, GPU architectures, DL/LLM architectures, low-level GPU programming, PyTorch, JAX, networking, storage systems, distributed training parallelisms</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a technology company that develops and markets software products and services. It is one of the largest and most successful companies in the world.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-capacity-efficiency-infrastructure-mai-superintelligence-team-2/</Applyto>
      <Location>Redmond</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>540d480d-7d6</externalid>
      <Title>Member of Technical Staff, Capacity &amp; Efficiency Infrastructure - MAI Superintelligence Team</Title>
      <Description><![CDATA[<p>Microsoft AI is looking for a Member of Technical Staff – Capacity &amp; Efficiency Infrastructure to help us manage, and improve the efficiency of, our compute fleet. We&#39;re seeking someone who brings an abundance of positive energy, empathy, and kindness to the team every day, in addition to being highly effective. The ideal candidate enjoys building world-class consumer experiences and products in a fast-paced environment. You will actively contribute to the development of AI models powering our innovative products. Expect to wear multiple hats and work across engineering, research, and everything in between. Your contributions will span model architecture, data curation, training and inference infrastructure, evaluation protocols, alignment and reinforcement learning from human feedback (RLHF), and many other exciting topics at the cutting edge of AI. Microsoft AI is building the training infrastructure that powers frontier-scale models and advances research toward humanist superintelligence. As a Member of Technical Staff – Capacity &amp; Efficiency, you will contribute to a fast-moving codebase that enables training at an unprecedented scale. This role will require building software and mathematical models for measuring the effectiveness of our capacity usage and then developing tools and techniques to help us improve. This will require you to partner with ML researchers to scale up the latest research recipes, implement new forms of distributed training parallelism, and ensure the reliability and performance of thousands of GPUs across our supercomputing fleet. Profiling, benchmarking, debugging, and fine-grained optimization are core to this role, demanding both engineering rigor and creativity.</p>
<p>Responsibilities:</p>
<ul>
<li>Design, implement, test, and optimize distributed training infrastructure in Python and C++ for large-scale GPU clusters.</li>
<li>Build and evolve telemetry systems to provide visibility into infrastructure &amp; ML model performance, utilization, and cost-related metrics.</li>
<li>Profile, benchmark, and debug performance bottlenecks across compute, memory, networking, and storage subsystems.</li>
<li>Drive architectural improvements across various ML services that deliver measurable efficiency improvements.</li>
<li>Build and evolve tools to automatically provide insights and recommendations to improve fleet-wide efficiency.</li>
<li>Optimize collective communication libraries (e.g., NCCL) for emerging NVLink and InfiniBand topologies.</li>
<li>Partner with ML researchers and infrastructure engineers to understand their plans and future needs and develop plans to balance growth with efficiency.</li>
<li>Collaborate with hardware teams to optimize for next-generation accelerators (NVIDIA, MAIA, and beyond).</li>
<li>Embody our Culture and Values.</li>
</ul>
<p>Qualifications:</p>
<p>Required Qualifications:</p>
<ul>
<li>Bachelor’s Degree in Computer Science, or related technical discipline AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>
</ul>
<p>Preferred Qualifications:</p>
<ul>
<li>Bachelor’s Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C++ or Python OR Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C++ or Python OR equivalent experience.</li>
<li>Deep understanding of the fundamentals of GPU architectures and DL/LLM architectures.</li>
<li>Deep experience in profiling and analyzing performance in large-scale distributed computing systems.</li>
<li>Deep experience in profiling and analyzing performance of ML models, especially GenAI models.</li>
<li>Experience with low-level GPU programming (CUDA, Triton, NCCL) and frameworks such as PyTorch or JAX.</li>
<li>Experience in leading technical projects and supporting architectural decisions with data.</li>
<li>Experience building infrastructure for large-scale machine learning or generative AI workloads.</li>
<li>Experience in networking (InfiniBand, NVLink), storage systems, or distributed training parallelisms.</li>
<li>Track record of contributing to high-performance computing or large-scale AI infrastructure projects.</li>
</ul>
<p>Software Engineering IC4 – The typical base pay range for this role across the U.S. is USD $119,800 – $234,700 per year. A different range applies to specific work locations within the San Francisco Bay Area and New York City.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$119,800 – $234,700 per year</Salaryrange>
      <Skills>C, C++, C#, Java, JavaScript, Python, GPU architectures, DL/LLM architectures, low-level GPU programming, PyTorch, JAX, networking, storage systems, distributed training parallelisms</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a technology company that develops and markets software products and services. It is one of the largest and most influential technology companies in the world.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-capacity-efficiency-infrastructure-mai-superintelligence-team/</Applyto>
      <Location>Mountain View</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
    <job>
      <externalid>a51375e8-30e</externalid>
      <Title>Member of Technical Staff, Software Co-Design AI HPC Systems</Title>
      <Description><![CDATA[<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost. Our work spans today&#39;s frontier AI workloads and directly shapes the next generation of accelerators, system architectures, and large-scale AI platforms.</p>
<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>
<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>
<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>
<p>About the Team</p>
<p>We build foundational AI infrastructure that enables large-scale training and inference across diverse workloads and rapidly evolving hardware generations. Our work directly shapes how AI systems are designed, deployed, and scaled today and into the future. Engineers on this team operate with end-to-end ownership, deep technical rigor, and a strong bias toward real-world impact.</p>
<p>Microsoft Superintelligence Team</p>
<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>
<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly ambitious, and low-ego individual, you’ll fit right in—come and join us as we work on our next generation of models!</p>
<p>Responsibilities</p>
<ul>
<li>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</li>
<li>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</li>
<li>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</li>
<li>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</li>
<li>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</li>
<li>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</li>
<li>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</li>
<li>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</li>
</ul>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, Experience designing or operating large-scale AI clusters for training or inference, Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications, Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Background in performance modeling and capacity planning for future hardware generations, Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a technology company that develops and markets software products and services. It is one of the largest and most successful technology companies in the world.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-3/</Applyto>
      <Location>London</Location>
      <Country></Country>
      <Postedate>2026-03-08</Postedate>
    </job>
    <job>
      <externalid>cd1a0d16-311</externalid>
      <Title>Member of Technical Staff, Software Co-Design AI HPC Systems</Title>
      <Description><![CDATA[<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost.</p>
<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>
<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>
<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>
<p><strong>Microsoft Superintelligence Team</strong></p>
<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>
<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams giving our models the chance to reach billions of users and create immense positive impact.</p>
<p><strong>Responsibilities</strong></p>
<p>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</p>
<p>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</p>
<p>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</p>
<p>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</p>
<p>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</p>
<p>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</p>
<p>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</p>
<p>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>
<p><strong>Qualifications</strong></p>
<p>Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.</p>
<p><strong>Additional or Preferred Qualifications</strong></p>
<p>Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python; OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python; OR equivalent experience.</p>
<p>Strong background in one or more of the following areas:</p>
<ul>
<li>AI accelerator or GPU architectures</li>
<li>Distributed systems and large-scale AI training/inference</li>
<li>High-performance computing (HPC) and collective communications</li>
<li>ML systems, runtimes, or compilers</li>
<li>Performance modeling, benchmarking, and systems analysis</li>
<li>Hardware–software co-design for AI workloads</li>
</ul>
<p>Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.</p>
<ul>
<li>Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders</li>
<li>Experience designing or operating large-scale AI clusters for training or inference</li>
<li>Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications</li>
<li>Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)</li>
<li>Background in performance modeling and capacity planning for future hardware generations</li>
<li>Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews</li>
<li>Publications, patents, or open-source contributions in systems, architecture, or ML systems are a plus</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>staff</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange>$139,900 – $274,800 per year</Salaryrange>
      <Skills>C, C++, C#, Java, JavaScript, Python, AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, LLMs, multimodal models, or recommendation systems, and their systems-level implications, Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Performance modeling and capacity planning for future hardware generations, Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Microsoft AI</Employername>
      <Employerlogo>https://logos.yubhub.co/microsoft.ai.png</Employerlogo>
      <Employerdescription>Microsoft AI is a technology company that develops and markets software products and services. It is one of the largest and most successful technology companies in the world.</Employerdescription>
      <Employerwebsite>https://microsoft.ai</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-2/</Applyto>
      <Location>Redmond</Location>
      <Country></Country>
      <Postedate>2026-03-08</Postedate>
    </job>
    <job>
      <externalid>4054dca1-a4f</externalid>
      <Title>AI Inference Engineer</Title>
      <Description><![CDATA[<p>We are looking for an AI Inference engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.</p>
<p><strong>What you&#39;ll do</strong></p>
<ul>
<li>Develop APIs for AI inference that will be used by both internal and external customers</li>
<li>Benchmark and address bottlenecks throughout our inference stack</li>
<li>Improve the reliability and observability of our systems and respond to system outages</li>
<li>Explore novel research and implement LLM inference optimizations</li>
</ul>
<p><strong>What you need</strong></p>
<ul>
<li>Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)</li>
<li>Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization, etc.)</li>
<li>Understanding of GPU architectures or experience with GPU kernel programming using CUDA</li>
</ul>
<p><strong>Why this matters</strong></p>
<p>As an AI Inference engineer, you will play a critical role in the development and deployment of our machine learning models. Your work will have a direct impact on the performance and reliability of our systems, and will help us to continue to innovate and improve our products.</p>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>Final offer amounts are determined by multiple factors, including experience and expertise.</Salaryrange>
      <Skills>ML systems, deep learning frameworks, GPU architectures, LLM architectures, inference optimization techniques</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Perplexity</Employername>
      <Employerlogo>https://logos.yubhub.co/perplexity.com.png</Employerlogo>
      <Employerdescription>Perplexity is a technology company working on the large-scale deployment of machine learning models for real-time inference.</Employerdescription>
      <Employerwebsite>https://jobs.ashbyhq.com</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/perplexity/e4777627-ff8f-4257-8612-3a016bb58592</Applyto>
      <Location>London</Location>
      <Country></Country>
      <Postedate>2026-03-04</Postedate>
    </job>
    <job>
      <externalid>e37be4c0-4be</externalid>
      <Title>AI Inference Engineer</Title>
      <Description><![CDATA[<p>Perplexity is looking for an AI Inference Engineer to join their team. The successful candidate will be responsible for developing APIs for AI inference, benchmarking and addressing bottlenecks throughout the inference stack, improving the reliability and observability of systems, and exploring novel research and implementing LLM inference optimisations.</p>
<p><strong>What you&#39;ll do</strong></p>
<p>As an AI Inference Engineer at Perplexity, you will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.</p>
<ul>
<li>Develop APIs for AI inference that will be used by both internal and external customers</li>
<li>Benchmark and address bottlenecks throughout our inference stack</li>
<li>Improve the reliability and observability of our systems and respond to system outages</li>
<li>Explore novel research and implement LLM inference optimisations</li>
</ul>
<p><strong>What you need</strong></p>
<ul>
<li>Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)</li>
<li>Familiarity with common LLM architectures and inference optimisation techniques (e.g. continuous batching, quantisation, etc.)</li>
<li>Understanding of GPU architectures or experience with GPU kernel programming using CUDA</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>mid</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$220K – $405K</Salaryrange>
      <Skills>ML systems, deep learning frameworks, LLM architectures, inference optimisation techniques, GPU architectures, GPU kernel programming, continuous batching, quantisation, PyTorch, TensorFlow, ONNX</Skills>
      <Category>Engineering</Category>
      <Industry>Technology</Industry>
      <Employername>Perplexity</Employername>
      <Employerlogo>https://logos.yubhub.co/perplexity.ai.png</Employerlogo>
      <Employerdescription>Perplexity is a cutting-edge technology company that specialises in artificial intelligence and machine learning. They are looking for talented individuals to join their team and contribute to the development of their AI products.</Employerdescription>
      <Employerwebsite>https://www.perplexity.ai/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.ashbyhq.com/perplexity/8a976851-9bef-4b07-8d36-567fa9540aef</Applyto>
      <Location>San Francisco, New York City, Palo Alto</Location>
      <Country></Country>
      <Postedate>2026-03-04</Postedate>
    </job>
  </jobs>
</source>