{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/cuda"},"x-facet":{"type":"skill","slug":"cuda","display":"Cuda","count":66},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_566c8778-7f9"},"title":"Quantitative Developer (Python) -  Central Liquidity Strategies","description":"<p>We are seeking a highly driven, results-oriented Senior Quantitative Developer to join a dynamic group tasked with developing our next-generation alpha research pipeline, encompassing data ingestion to model evaluation and reporting.</p>\n<p>The successful candidate will be expected to:</p>\n<ul>\n<li>Help design and contribute to the alpha research platform</li>\n<li>Support, maintain, and test their own code following best practices, including unit testing, regression testing, documentation, and automation within typical CI processes</li>\n<li>Provide leadership and vision to help determine the overall direction, design, and architecture of the alpha research pipeline</li>\n<li>Mentor junior resources</li>\n<li>Regularly interact with quantitative researchers and other stakeholders, and prioritise and implement features</li>\n</ul>\n<p>The ideal candidate will have:</p>\n<ul>\n<li>5+ years of Python experience in a quantitative finance setting</li>\n<li>Familiarity with linear models and basic statistics for creating model evaluation and reporting workflows</li>\n<li>Familiarity with the Python 
data science ecosystem, including dashboarding and popular ML libraries such as Plotly, Altair, JAX, TensorFlow, and PyTorch</li>\n<li>Prior experience building alpha research or machine learning pipelines</li>\n<li>Highly analytical with strong problem-solving skills and attention to detail</li>\n<li>Strong communication skills, with the ability to explain technical and sophisticated concepts clearly and concisely</li>\n<li>Ability to tune and debug runtime performance of data applications</li>\n<li>Familiarity with C++/Rust/CUDA to debug and profile underlying native code in ML libraries (Nice to have)</li>\n</ul>\n<p>The estimated base salary range for this position is $160,000 to $250,000, which is specific to New York and may change in the future.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_566c8778-7f9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Central Execution Book","sameAs":"https://mlp.eightfold.ai","logo":"https://logos.yubhub.co/mlp.eightfold.ai.png"},"x-apply-url":"https://mlp.eightfold.ai/careers/job/755954183338","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$160,000 to $250,000","x-skills-required":["Python","linear models","basic statistics","Plotly","Altair","JAX","TensorFlow","PyTorch","C++/Rust/CUDA"],"x-skills-preferred":[],"datePosted":"2026-04-18T22:13:25.204Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, New York, United States of America"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Finance","skills":"Python, linear models, basic statistics, Plotly, Altair, JAX, TensorFlow, PyTorch, 
C++/Rust/CUDA","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":160000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f28927b0-573"},"title":"Machine Learning Systems Research Engineer, Agent Post-training - Enterprise GenAI","description":"<p>At Scale, our mission is to accelerate the development of AI applications. We are working on an arsenal of proprietary research and resources that serve all of our enterprise clients. As an ML Sys Research Engineer, you&#39;ll work on building out the algorithms for our next-gen Agent RL training platform, support large scale training, and research and integrate state-of-the-art technologies to optimize our ML system.</p>\n<p>Your customer will be other MLREs and AAIs on the Enterprise AI team who are taking the training algorithms and applying them to client use-cases ranging from next-generation AI cybersecurity firewall LLMs to training foundation healthtech search models.</p>\n<p>If you are excited about shaping the future of the modern AI movement, we would love to hear from you!</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li>Build, profile and optimize our training and inference framework.</li>\n<li>Post-train state of the art models, developed both internally and from the community, to define stable post-training recipes for our enterprise engagements.</li>\n<li>Collaborate with ML teams to accelerate their research and development, and enable them to develop the next generation of models and data curation.</li>\n<li>Create a next-gen agent training algorithm for multi-agent/multi-tool rollouts.</li>\n</ul>\n<p>Ideal Candidate:</p>\n<ul>\n<li>At least 1-3 years of LLM training in a production environment.</li>\n<li>Passionate about system optimization.</li>\n<li>Experience with post-training methods like RLHF/RLVR and related algorithms like PPO/GRPO 
etc.</li>\n<li>Ability to demonstrate know-how on how to operate the architecture of the modern GPU cluster.</li>\n<li>Experience with multi-node LLM training and inference.</li>\n<li>Strong software engineering skills, proficient in frameworks and tools such as CUDA, Pytorch, transformers, flash attention, etc.</li>\n<li>Strong written and verbal communication skills to operate in a cross functional team environment.</li>\n<li>PhD or Masters in Computer Science or a related field.</li>\n</ul>\n<p>Compensation:</p>\n<p>We offer competitive compensation packages, including base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training.</p>\n<p>Benefits:</p>\n<ul>\n<li>Comprehensive health, dental and vision coverage.</li>\n<li>Retirement benefits.</li>\n<li>A learning and development stipend.</li>\n<li>Generous PTO.</li>\n<li>Commuter stipend.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f28927b0-573","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://www.scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4625341005","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$189,600-$237,000 USD","x-skills-required":["LLM training","System optimization","Post-training methods","GPU cluster operation","Multi-node LLM training","Inference","CUDA","Pytorch","Transformers","Flash attention"],"x-skills-preferred":[],"datePosted":"2026-04-18T16:00:01.664Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San 
Francisco, CA; New York, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"LLM training, System optimization, Post-training methods, GPU cluster operation, Multi-node LLM training, Inference, CUDA, Pytorch, Transformers, Flash attention","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":189600,"maxValue":237000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_539e2a23-ddf"},"title":"Tech Lead Manager- MLRE, ML Systems","description":"<p>You will lead the development of our internal distributed framework for large language model training. The platform powers MLEs, researchers, data scientists, and operators for fast and automatic training and evaluation of LLMs. It also serves as the underlying training framework for the data quality evaluation pipeline.</p>\n<p>You will work closely with Scale’s ML teams and researchers to build the foundation platform that supports all our ML research and development work. 
You will be building and optimising the platform to enable our next-generation LLM training, inference and data curation.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Building, profiling and optimising our training and inference framework.</li>\n<li>Collaborating with ML and research teams to accelerate their research and development, and enabling them to develop the next generation of models and data curation.</li>\n<li>Researching and integrating state-of-the-art technologies to optimise our ML system.</li>\n</ul>\n<p>The ideal candidate will have:</p>\n<ul>\n<li>A passion for system optimisation.</li>\n<li>Experience with multi-node LLM training and inference.</li>\n<li>Experience with developing large-scale distributed ML systems.</li>\n<li>Experience with post-training methods such as RLHF/RLVR and related algorithms such as PPO/GRPO.</li>\n<li>Strong software engineering skills and proficiency in frameworks and tools such as CUDA, PyTorch, transformers, and flash attention.</li>\n</ul>\n<p>Nice-to-haves include demonstrated expertise in post-training methods and/or next-generation use cases for large language models, including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal.</p>\n<p>Compensation packages at Scale for eligible roles include base salary, equity, and benefits. 
The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_539e2a23-ddf","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4618046005","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$264,800-$331,000 USD","x-skills-required":["system optimisation","multi-node LLM training and inference","large-scale distributed ML systems","post-training methods","software engineering skills","CUDA","PyTorch","transformers","flash attention"],"x-skills-preferred":["next generation use cases for large language models","instruction tuning","RLHF","tool use","reasoning","agents","multimodal"],"datePosted":"2026-04-18T15:59:21.558Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA; New York, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"system optimisation, multi-node LLM training and inference, large-scale distributed ML systems, post-training methods, software engineering skills, CUDA, PyTorch, transformers, flash attention, next generation use cases for large language models, instruction tuning, RLHF, tool use, reasoning, agents, 
multimodal","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":264800,"maxValue":331000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_840bab06-7be"},"title":"ML Research Engineer, ML Systems","description":"<p>Job Description:</p>\n<p>Scale&#39;s ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering MLEs, researchers, data scientists and operators for fast and automatic training and evaluation of LLMs, as well as evaluation of data quality.</p>\n<p>At Scale, we&#39;re uniquely positioned at the heart of the field of AI as an indispensable provider of training and evaluation data and end-to-end solutions for the ML lifecycle. You will work closely with Scale&#39;s ML teams and researchers to build the foundation platform that supports all our ML research and development. 
You will be building and optimizing the platform to enable our next generation of LLM training, inference and data curation.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build, profile and optimize our training and inference framework</li>\n<li>Collaborate with ML teams to accelerate their research and development and enable them to develop the next generation of models and data curation</li>\n<li>Research and integrate state-of-the-art technologies to optimize our ML system</li>\n</ul>\n<p>Ideal Candidate:</p>\n<ul>\n<li>Strong excitement about system optimization</li>\n<li>Experience with multi-node LLM training and inference</li>\n<li>Experience with developing large-scale distributed ML systems</li>\n<li>Strong software engineering skills, proficient in frameworks and tools such as CUDA, Pytorch, transformers, flash attention, etc.</li>\n<li>Strong written and verbal communication skills and the ability to operate in a cross functional team environment</li>\n</ul>\n<p>Nice to Have:</p>\n<ul>\n<li>Demonstrated expertise in post-training methods &amp;/or next generation use cases for large language models including instruction tuning, RLHF, tool use, reasoning, agents, and multimodal, etc.</li>\n</ul>\n<p>Compensation Packages:</p>\n<p>Compensation packages at Scale for eligible roles include base salary, equity, and benefits. The range displayed on each job posting reflects the minimum and maximum target for new hire salaries for the position, determined by work location and additional factors, including job-related skills, experience, interview performance, and relevant education or training. Scale employees in eligible roles are also granted equity based compensation, subject to Board of Director approval. Your recruiter can share more about the specific salary range for your preferred location during the hiring process, and confirm whether the hired role will be eligible for equity grant. 
You&#39;ll also receive benefits including, but not limited to: Comprehensive health, dental and vision coverage, retirement benefits, a learning and development stipend, and generous PTO. Additionally, this role may be eligible for additional benefits such as a commuter stipend.</p>\n<p>Please note that our policy requires a 90-day waiting period before reconsidering candidates for the same role. This allows us to ensure a fair and thorough evaluation of all applicants.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_840bab06-7be","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Scale","sameAs":"https://scale.com/","logo":"https://logos.yubhub.co/scale.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/scaleai/jobs/4534631005","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$189,600-$237,000 USD","x-skills-required":["System Optimization","Multi-node LLM Training and Inference","Large-Scale Distributed ML Systems","CUDA","Pytorch","Transformers","Flash Attention"],"x-skills-preferred":["Post-Training Methods","Next Generation Use Cases for Large Language Models","Instruction Tuning","RLHF","Tool Use","Reasoning","Agents","Multimodal"],"datePosted":"2026-04-18T15:58:47.020Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA; Seattle, WA; New York, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"System Optimization, Multi-node LLM Training and Inference, Large-Scale Distributed ML Systems, CUDA, Pytorch, Transformers, Flash Attention, Post-Training Methods, Next Generation Use Cases for Large Language Models, Instruction Tuning, RLHF, Tool Use, Reasoning, Agents, 
Multimodal","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":189600,"maxValue":237000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9af8d812-df8"},"title":"AI Infrastructure Engineer","description":"<p>We&#39;re looking for Senior+ AI Infrastructure Engineers to build the systems that train and serve Intercom&#39;s next generation of AI products.</p>\n<p>As a Senior AI Infrastructure Engineer focused on model training and inference, you will:</p>\n<p>Implement and scale training pipelines for large transformer and LLM models, from data ingestion and preprocessing through distributed training and evaluation.</p>\n<p>Build and optimize inference services that deliver low-latency, high-reliability experiences for our customers, including autoscaling, routing, and fallbacks.</p>\n<p>Work on GPU-level performance: tuning kernels, improving utilization, and identifying bottlenecks across our training and inference stack.</p>\n<p>Collaborate closely with ML scientists to implement cutting edge training and inference methods and bring them to production.</p>\n<p>Play an active role in hiring, mentoring, and developing other engineers on the team.</p>\n<p>Raise the bar for technical standards, reliability, and operational excellence across Intercom’s AI platform.</p>\n<p>We’re looking to hire Senior+ AI Infrastructure Engineers. 
You’re likely a great fit if:</p>\n<p>You have 5+ years of experience in software engineering, with a strong track record of shipping high-quality products or platforms.</p>\n<p>You hold a degree in Computer Science, Computer Engineering, or a related field (or you have equivalent experience with very strong fundamentals).</p>\n<p>You have hands-on experience with one or more of the following:</p>\n<p>Model training (especially transformers and LLMs).</p>\n<p>Model inference at scale (again, especially transformers and LLMs).</p>\n<p>Low-level GPU work, such as writing CUDA or Triton kernels.</p>\n<p>Comfortable working in production environments at meaningful scale (traffic, data, or organizational).</p>\n<p>You communicate clearly, can explain complex technical topics to different audiences, and enjoy close collaboration with both engineers and non-engineers.</p>\n<p>You take pride in strong technical fundamentals, love learning, and are willing to invest in your own development.</p>\n<p>Have deep knowledge of at least one programming language (for example Python, Ruby, Java, Go, etc.). Specific language experience is less important than your ability to write clean, reliable code and learn new stacks quickly.</p>\n<p>We are a well-treated bunch, with awesome benefits! 
If there’s something important to you that’s not on this list, talk to us!</p>\n<p>Competitive salary, annual bonus and equity</p>\n<p>Regular compensation reviews - we reward great work!</p>\n<p>Unlimited access to Claude Code and best-in-class AI tools; experimentation &amp; building is encouraged &amp; celebrated.</p>\n<p>Generous paid time off above statutory minimum</p>\n<p>Hybrid working</p>\n<p>MacBooks are our standard, but we also offer Windows for certain roles when needed.</p>\n<p>Fun events for employees, friends, and family!</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9af8d812-df8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Intercom","sameAs":"https://www.intercom.com/","logo":"https://logos.yubhub.co/intercom.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/intercom/jobs/7824142","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["model training","model inference","low-level GPU work","CUDA","Triton","Python","Ruby","Java","Go"],"x-skills-preferred":["experience at AI native companies","running training or inference workloads on Kubernetes","AWS","cloud providers","production experience with Python in ML or infrastructure contexts"],"datePosted":"2026-04-18T15:57:33.379Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Berlin, Germany"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"model training, model inference, low-level GPU work, CUDA, Triton, Python, Ruby, Java, Go, experience at AI native companies, running training or inference workloads on Kubernetes, AWS, cloud providers, production experience with Python in ML or infrastructure 
contexts"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f0f66ce3-d78"},"title":"Senior GenAI Research Engineer - Optimization and Kernels","description":"<p>As a research engineer on the Scaling team at Databricks, you will be responsible for keeping up with the latest developments in deep learning and advancing the scientific frontier by creating new techniques that go beyond the state of the art.</p>\n<p>You will work together on a collaborative team of researchers and engineers with diverse backgrounds and technical training. Your goal will be to make our customers successful in applying state-of-the-art LLMs and AI systems, and we encode our scientific expertise into our products to make that possible.</p>\n<p>Your responsibilities will include:</p>\n<ul>\n<li>Driving performance improvements through advanced optimization techniques including kernel fusion, mixed precision, memory layout optimization, tiling strategies, and tensorization for training-specific patterns</li>\n</ul>\n<ul>\n<li>Designing, implementing, and optimizing high-performance GPU kernels for training workloads (e.g., attention mechanisms, custom layers, gradient computation, activation functions) targeting NVIDIA architectures</li>\n</ul>\n<ul>\n<li>Designing and implementing distributed training frameworks for large language models, including parallelism strategies (data, tensor, pipeline, ZeRO-based) and optimized communication patterns for gradient synchronization and collective operations</li>\n</ul>\n<ul>\n<li>Profiling, debugging, and optimizing end-to-end training workflows to identify and resolve performance bottlenecks, applying memory optimization techniques like activation checkpointing, gradient sharding, and mixed precision training</li>\n</ul>\n<p>We look for candidates with a strong background in computer science or a related field, hands-on experience writing and tuning CUDA kernels for ML training 
applications, and a deep understanding of parallelism techniques and memory optimization strategies for large-scale model training.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f0f66ce3-d78","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8297797002","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$166,000-$225,000 USD","x-skills-required":["CUDA","NVIDIA GPU architecture","PyTorch","distributed training frameworks","parallelism techniques","memory optimization strategies"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:57:26.571Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"CUDA, NVIDIA GPU architecture, PyTorch, distributed training frameworks, parallelism techniques, memory optimization strategies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":166000,"maxValue":225000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_53bd182c-902"},"title":"DSP Engineer, EW","description":"<p>Anduril Industries is seeking a highly skilled DSP Engineer to join our team. 
As a DSP Engineer, you will design, develop, and optimize digital signal processing algorithms and systems for radio direction finding and direction-of-arrival estimation in defense applications.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Collaborating with a multidisciplinary team of software and hardware engineers to develop software-defined radios;</li>\n<li>Implementing high-performance, real-time signal processing chains on embedded and hardware platforms to support mission-critical sensing capabilities;</li>\n<li>Developing Modeling and Simulation (M&amp;S) code for RADAR techniques and data analysis, including Hardware-in-the-Loop / Software-in-the-Loop (HIL/SIL) testing;</li>\n<li>Participating in laboratory and field testing of RF systems and techniques;</li>\n<li>Participating in the maturation of RF systems into deployable systems and products.</li>\n</ul>\n<p>Required qualifications include:</p>\n<ul>\n<li>5+ years of experience with a BSEE or related field;</li>\n<li>Strong foundation in digital signal processing, comms theory, and system engineering with emphasis on direction finding algorithm implementation;</li>\n<li>Hands-on experience with direction finding, angle-of-arrival estimation, and multi-antenna signal processing;</li>\n<li>Strong experience with DSP implementation for embedded devices including FPGA, Nvidia Jetson, and software-defined radios;</li>\n<li>Strong knowledge of Python and MATLAB;</li>\n<li>Experience with CUDA or GPU-accelerated frameworks like cuSignal is preferred;</li>\n<li>Familiarity with deep learning algorithms;</li>\n<li>Familiarity with wireless communication standards (Bluetooth, 3G/4G/5G, Wi-Fi, SINCGARS, MUOS, etc.).</li>\n</ul>\n<p>Preferred qualifications include:</p>\n<ul>\n<li>Master&#39;s or PhD degree in Electrical, Electronics, Computer Engineering, or related fields;</li>\n<li>Experience with ML frameworks such as TensorFlow and PyTorch;</li>\n<li>Defense, national security, or 
aerospace domain familiarity through industry or education;</li>\n<li>Extensive Digital Signal Processing (DSP) knowledge and experience;</li>\n<li>Expertise in Synthetic Aperture Radar (SAR) and/or Inverse SAR (ISAR): Image formation, waveforms, phenomenology, modeling and simulation.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_53bd182c-902","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://anduril.com","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/5031495007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$166,000-$220,000 USD","x-skills-required":["Digital Signal Processing","Comms Theory","System Engineering","Direction Finding Algorithm Implementation","Embedded Devices","FPGA","Nvidia Jetson","Software Defined Radios","Python","MATLAB","CUDA","GPU Accelerated Frameworks","Deep Learning Algorithms","Wireless Communication Standards"],"x-skills-preferred":["ML Frameworks","TensorFlow","PyTorch","Defense Domain","National Security","Aerospace Domain","Synthetic Aperture Radar","Inverse SAR","Image Formation","Waveforms","Phenomenology","Modeling and Simulation"],"datePosted":"2026-04-18T15:57:17.065Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Costa Mesa, California, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Digital Signal Processing, Comms Theory, System Engineering, Direction Finding Algorithm Implementation, Embedded Devices, FPGA, Nvidia Jetson, Software Defined Radios, Python, MATLAB, CUDA, GPU Accelerated Frameworks, Deep Learning Algorithms, Wireless Communication Standards, ML Frameworks, TensorFlow, PyTorch, Defense 
Domain, National Security, Aerospace Domain, Synthetic Aperture Radar, Inverse SAR, Image Formation, Waveforms, Phenomenology, Modeling and Simulation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":166000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a2c81b27-4e2"},"title":"Sr. Engineering Manager, AI/ML Serving Platform","description":"<p>We&#39;re seeking a Sr. Engineering Manager to lead the team that builds the serving and deployment infrastructure for all AI/ML models at Pinterest. The AI/ML Serving Platform team provides foundational tools and infrastructure used by hundreds of AI/ML engineers across Pinterest, including recommendations, ads, visual search, growth/notifications, trust and safety.</p>\n<p>The ideal candidate will have experience managing platform engineering teams with many cross-organizational customers, leading the development of large-scale distributed serving systems, and working with AI/ML inference technologies for online serving at Web scale.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading the team to deliver continual improvements in advanced model architectures, cost-efficient resource utilization, and AI/ML developer productivity.</li>\n<li>Setting technical direction for the team based on company and org priorities.</li>\n<li>Coaching and developing talent on the team.</li>\n</ul>\n<p>In return, you&#39;ll have the opportunity to work on a high-impact project that will shape the future of AI/ML at Pinterest.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a2c81b27-4e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Pinterest","sameAs":"https://www.pinterest.com/","logo":"https://logos.yubhub.co/pinterest.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/pinterest/jobs/7569150","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$208,592-$429,454 USD","x-skills-required":["AI/ML inference technologies","PyTorch","TensorFlow","Kubernetes","C++","TorchScript","CUDA Graph"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:55:20.853Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA, US; Remote, US"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI/ML inference technologies, PyTorch, TensorFlow, Kubernetes, C++, TorchScript, CUDA Graph","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":208592,"maxValue":429454,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cba88898-896"},"title":"Research Engineer, Infrastructure, Kernels","description":"<p>We&#39;re looking for an infrastructure research engineer to design, optimize, and maintain the compute foundations that power large-scale language model training. You will develop high-performance ML kernels (e.g., CUDA, CuTe, Triton), enable efficient low-precision arithmetic, and improve the distributed compute stack that makes training large models possible.</p>\n<p>This role is perfect for an engineer who enjoys working close to the metal and across the research boundary. You&#39;ll collaborate with researchers and systems architects to bridge algorithmic design with hardware efficiency. 
You&#39;ll prototype new kernel implementations, profile performance across hardware generations, and help define the numerical and parallelism strategies that determine how we scale next-generation AI systems.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design and implement custom ML kernels (e.g., CUDA, CuTe, Triton) for core LLM operations such as attention, matrix multiplication, gating, and normalization, optimized for modern GPU and accelerator architectures.</li>\n<li>Design and think through compute primitives to reduce memory bandwidth bottlenecks and improve kernel compute efficiency.</li>\n<li>Collaborate with research teams to align kernel-level optimizations with model architecture and algorithmic goals.</li>\n<li>Develop and maintain a library of reusable kernels and performance benchmarks that serve as the foundation for internal model training.</li>\n<li>Contribute to infrastructure stability and scalability, ensuring reproducibility, consistency across precision formats, and high utilization of compute resources.</li>\n<li>Document and share insights through internal talks, technical papers, or open-source contributions to strengthen the broader ML systems community.</li>\n</ul>\n<p><strong>Skills and Qualifications</strong></p>\n<p>Minimum qualifications:</p>\n<ul>\n<li>Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, physics, robotics, or similar.</li>\n<li>Strong engineering skills, ability to contribute performant, maintainable code and debug in complex codebases</li>\n<li>Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures.</li>\n<li>Thrive in a highly collaborative environment involving many, different cross-functional partners and subject matter experts.</li>\n<li>A bias for action with a mindset to take initiative to work across different stacks and different teams where you spot the opportunity to make sure 
something ships.</li>\n<li>Proficiency in CUDA, CuTe, Triton, or other GPU programming frameworks.</li>\n<li>Demonstrated ability to analyze, profile, and optimize compute-intensive workloads.</li>\n</ul>\n<p>Preferred qualifications:</p>\n<ul>\n<li>Experience training or supporting large-scale language models with tens of billions of parameters or more.</li>\n<li>Track record of improving research productivity through infrastructure design or process improvements.</li>\n<li>Experience developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators.</li>\n<li>Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks.</li>\n<li>Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM).</li>\n<li>Contributions to open-source GPU, ML systems, or compiler optimization projects.</li>\n<li>Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI infrastructure.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cba88898-896","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Thinking Machines Lab","sameAs":"https://thinkingmachines.ai/","logo":"https://logos.yubhub.co/thinkingmachines.ai.png"},"x-apply-url":"https://job-boards.greenhouse.io/thinkingmachines/jobs/5013934008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000 - $475,000 USD","x-skills-required":["CUDA","CuTe","Triton","GPU programming frameworks","Deep learning frameworks (e.g., PyTorch, JAX)","Computer science","Electrical engineering","Statistics","Machine learning","Physics","Robotics"],"x-skills-preferred":["Experience training or supporting large-scale language models with 
tens of billions of parameters or more","Track record of improving research productivity through infrastructure design or process improvements","Experience developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators","Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks","Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM)","Contributions to open-source GPU, ML systems, or compiler optimization projects","Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI infrastructure"],"datePosted":"2026-04-18T15:54:38.498Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"CUDA, CuTe, Triton, GPU programming frameworks, Deep learning frameworks (e.g., PyTorch, JAX), Computer science, Electrical engineering, Statistics, Machine learning, Physics, Robotics, Experience training or supporting large-scale language models with tens of billions of parameters or more, Track record of improving research productivity through infrastructure design or process improvements, Experience developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators, Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks, Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM), Contributions to open-source GPU, ML systems, or compiler optimization projects, Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI 
infrastructure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":475000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dd290e64-a85"},"title":"Quantum Software Engineer","description":"<p>We are seeking a talented and innovative Quantum Software Engineer to join our forward-looking team at Anduril Labs. In this role, you will be instrumental in building and delivering impactful quantum solutions for both Anduril-internal use cases and external customer applications.</p>\n<p>You will work closely with delivery leads, application developers, and other solutions architects, as well as internal and external partners to design, implement, and deliver bleeding-edge quantum solutions on state-of-the-art quantum-inspired, quantum annealing, and quantum gate platforms for real-world defense and national security challenges.</p>\n<p>The ideal candidate will combine a strong foundation in quantum computing principles with hands-on classical and quantum software development expertise. You will leverage your skills to translate complex problems into (hybrid) quantum algorithms, applications, and services.</p>\n<p>This includes developing robust software implementations and integrating quantum-enhanced solutions into existing and new defense systems.</p>\n<p>If you are passionate about applying theoretical quantum concepts to deliver tangible, high-impact results, and thrive in an environment that values innovation, collaboration, and rapid prototyping, we encourage you to apply.</p>\n<p><strong>Key Responsibilities:</strong> Be a key contributor to the development of next-generation quantum-enhanced Anduril offerings and lead the design, development, and deployment of novel quantum-enhanced applications and services in the defense and national security domain. 
Develop impactful hybrid quantum algorithms and applications that promise significant decision advantages and focus on practical scalability and real-world applicability. Contribute knowledge of classical and quantum optimization algorithms and tools, evaluating and communicating their pros and cons, current state of the art, scaling behaviors, trade-offs, and cross-over points. Participate in the full (hybrid) quantum software development lifecycle, from concept and design to testing, deployment, and ongoing maintenance.</p>\n<p><strong>Requirements:</strong> Bachelor&#39;s degree in Computer Science, Quantum Information Science, Physics, Mathematics, or a closely related technical field. 3+ years of hands-on, professional software development experience with C, C++, Python, or another general-purpose compiled programming language. Practical experience in quantum computing, including programming quantum applications, or quantum circuit compilation. Proficiency with one or more leading quantum programming languages, SDKs, or APIs such as Qiskit, CUDA-Q, Q#, Cirq, PennyLane, or similar. Expertise in key mathematical techniques foundational to quantum computing, including linear algebra, matrix decompositions, probability theory, group theory, symmetry, and computational complexity. Proficient with database systems and SQL, with hands-on experience working with relational databases (e.g., PostgreSQL, Oracle, MySQL). Experience with Git version control, build tools, and CI/CD pipelines. Demonstrated understanding and application of software testing principles and practices, including unit testing, integration testing, and end-to-end testing. Strong problem-solving skills, meticulous attention to detail, and the ability to work effectively in a collaborative team environment. Excellent communication and interpersonal skills, with the ability to effectively articulate complex technical concepts to diverse audiences. Eligible to obtain and maintain an active U.S. 
Top Secret SCI security clearance. Demonstrable hands-on experience using GenAI tools (e.g., OpenAI Codex, Claude Code, Gemini Code Assist, GitHub Copilot, Amazon CodeWhisperer, or similar) for software development, code generation, debugging, and algorithmic exploration.</p>\n<p><strong>Preferred Qualifications:</strong> Master&#39;s or Ph.D. in Quantum Information Science, Physics, Computer Science, or a related quantitative field. Familiarity with leading classical optimization tools and solvers (e.g., CPLEX, Gurobi, OR-Tools) and knowledge of mathematical modeling and classical optimization solution techniques. Experience building and deploying applications to solve complex business or defense problems for customers. Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality. Experience with deployment of code in distributed environments, cloud application development (e.g., AWS, Azure, GCP), and RESTful API-driven architectures. Experience with high-performance computing (HPC) environments or parallel programming. Familiarity with quantum hardware platforms and their unique characteristics. Prior experience in defense, aerospace, or related industries applying advanced technologies. 
Willingness to travel up to approximately 10%.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_dd290e64-a85","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://www.anduril.com/","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/5089054007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$132,000-$198,000 USD","x-skills-required":["C","C++","Python","Qiskit","CUDA-Q","Q#","Cirq","PennyLane","Linear Algebra","Matrix Decompositions","Probability Theory","Group Theory","Symmetry","Computational Complexity","Database Systems","SQL","Git","Build Tools","CI/CD Pipelines","Software Testing Principles","Unit Testing","Integration Testing","End-to-End Testing","GenAI Tools"],"x-skills-preferred":["Master's or Ph.D. 
in Quantum Information Science, Physics, Computer Science, or a related quantitative field","Familiarity with leading classical optimization tools and solvers","Experience building and deploying applications to solve complex business or defense problems for customers","Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality","Experience with deployment of code in distributed environments, cloud application development, and RESTful API-driven architectures","Experience with high-performance computing (HPC) environments or parallel programming","Familiarity with quantum hardware platforms and their unique characteristics","Prior experience in defense, aerospace, or related industries applying advanced technologies"],"datePosted":"2026-04-18T15:54:19.846Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Washington, District of Columbia, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, Python, Qiskit, CUDA-Q, Q#, Cirq, PennyLane, Linear Algebra, Matrix Decompositions, Probability Theory, Group Theory, Symmetry, Computational Complexity, Database Systems, SQL, Git, Build Tools, CI/CD Pipelines, Software Testing Principles, Unit Testing, Integration Testing, End-to-End Testing, GenAI Tools, Master's or Ph.D. 
in Quantum Information Science, Physics, Computer Science, or a related quantitative field, Familiarity with leading classical optimization tools and solvers, Experience building and deploying applications to solve complex business or defense problems for customers, Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality, Experience with deployment of code in distributed environments, cloud application development, and RESTful API-driven architectures, Experience with high-performance computing (HPC) environments or parallel programming, Familiarity with quantum hardware platforms and their unique characteristics, Prior experience in defense, aerospace, or related industries applying advanced technologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":132000,"maxValue":198000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f2196e99-854"},"title":"Software Engineer - GenAI inference","description":"<p>As a software engineer for GenAI inference, you will help design, develop, and optimize the inference engine that powers Databricks&#39; Foundation Model API. You&#39;ll work at the intersection of research and production, ensuring our large language model (LLM) serving systems are fast, scalable, and efficient.</p>\n<p>Your work will touch the full GenAI inference stack, from kernels and runtimes to orchestration and memory management. 
You will contribute to the design and implementation of the inference engine, and collaborate on a model-serving stack optimized for large-scale LLM inference.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Collaborating with researchers to bring new model architectures or features (sparsity, activation compression, mixture-of-experts) into the engine</li>\n<li>Optimizing for latency, throughput, memory efficiency, and hardware utilization across GPUs and accelerators</li>\n<li>Building and maintaining instrumentation, profiling, and tracing tooling to uncover bottlenecks and guide optimizations</li>\n<li>Developing and enhancing scalable routing, batching, scheduling, memory management, and dynamic loading mechanisms for inference workloads</li>\n<li>Supporting reliability, reproducibility, and fault tolerance in the inference pipelines, including A/B launches, rollback, and model versioning</li>\n<li>Integrating with federated, distributed inference infrastructure: orchestrating across nodes, balancing load, and handling communication overhead</li>\n<li>Collaborating cross-functionally with platform engineers, cloud infrastructure, and security/compliance teams</li>\n<li>Documenting and sharing learnings, contributing to internal best practices and open-source efforts when possible</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>BS/MS/PhD in Computer Science, or a related field</li>\n<li>Strong software engineering background (3+ years or equivalent) in performance-critical systems</li>\n<li>Solid understanding of ML inference internals: attention, MLPs, recurrent modules, quantization, sparse operations, etc.</li>\n<li>Hands-on experience with CUDA, GPU programming, and key libraries (cuBLAS, cuDNN, NCCL, etc.)</li>\n<li>Comfortable designing and operating distributed systems, including RPC frameworks, queuing, RPC batching, sharding, memory partitioning</li>\n<li>Demonstrated ability to uncover and solve performance bottlenecks across layers (kernel, memory, 
networking, scheduler)</li>\n<li>Experience building instrumentation, tracing, and profiling tools for ML models</li>\n<li>Ability to work closely with ML researchers, translate novel model ideas into production systems</li>\n<li>Ownership mindset and eagerness to dive deep into complex system challenges</li>\n<li>Bonus: published research or open-source contributions in ML systems, inference optimization, or model serving</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f2196e99-854","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8202670002","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$142,200-$204,600 USD","x-skills-required":["software engineering","performance-critical systems","ML inference internals","CUDA","GPU programming","distributed systems","RPC frameworks","queuing","RPC batching","sharding","memory partitioning","instrumentation","tracing","profiling tools","ML researchers","complex system challenges"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:54:17.777Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, performance-critical systems, ML inference internals, CUDA, GPU programming, distributed systems, RPC frameworks, queuing, RPC batching, sharding, memory partitioning, instrumentation, tracing, profiling tools, ML researchers, complex system 
challenges","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":142200,"maxValue":204600,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c9ab5cbc-dd6"},"title":"Research Engineer, Performance RL","description":"<p>We&#39;re hiring a Research Engineer to join our Code RL team within the RL organization. As a Research Engineer, you&#39;ll advance our models&#39; ability to safely write correct, fast code for accelerators.</p>\n<p>You&#39;ll need to know accelerator performance well to turn it into tasks and signals models can learn from. Specifically, you will:</p>\n<ul>\n<li>Invent, design and implement RL environments and evaluations.</li>\n<li>Conduct experiments and shape our research roadmap.</li>\n<li>Deliver your work into training runs.</li>\n<li>Collaborate with other researchers, engineers, and performance engineering specialists across and outside Anthropic.</li>\n</ul>\n<p>We&#39;re looking for someone with expertise in accelerators (CUDA, ROCm, Triton, Pallas), ML framework programming (JAX or PyTorch), and experience with balancing research exploration with engineering implementation.</p>\n<p>Strong candidates may also have experience with reinforcement learning, porting ML workloads between different types of accelerators, and familiarity with LLM training methodologies.</p>\n<p>The annual compensation range for this role is $350,000-$850,000 USD.</p>\n<p>Please note that we&#39;re an extremely collaborative group, and we value communication skills. 
The easiest way to understand our research directions is to read our recent research.</p>\n<p>We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c9ab5cbc-dd6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5160330008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000-$850,000 USD","x-skills-required":["accelerator performance","ML framework programming","reinforcement learning","RL environments and evaluations","experiments and research roadmap","training runs","collaboration with researchers and engineers"],"x-skills-preferred":["CUDA","ROCm","Triton","Pallas","JAX","PyTorch","LLM training methodologies"],"datePosted":"2026-04-18T15:54:02.762Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"accelerator performance, ML framework programming, reinforcement learning, RL environments and evaluations, experiments and research roadmap, training runs, collaboration with researchers and engineers, CUDA, ROCm, Triton, Pallas, JAX, PyTorch, LLM training 
methodologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dc17980d-461"},"title":"Research Engineer, Interpretability","description":"<p>JOB TITLE: Research Engineer, Interpretability</p>\n<p>LOCATION: San Francisco, CA</p>\n<p>DEPARTMENT: AI Research &amp; Engineering</p>\n<p><strong>JOB DESCRIPTION:</strong></p>\n<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? How can we trust them?&quot;</p>\n<p>The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>\n<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>\n<p>More resources to learn about our work:</p>\n<ul>\n<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>\n<li>An Introduction to Interpretability from our research lead, Chris Olah</li>\n<li>The Urgency of Interpretability from CEO Dario Amodei</li>\n<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>\n<li>60 Minutes segment - Around 8:07, see a demo of tooling our team built</li>\n<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>\n</ul>\n<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>\n<ul>\n<li>Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>\n<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot;</li>\n<li>Performance: Like all LLM work, we push up against the limits of hardware and software. Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them and moving ahead given rapidly evolving research and safety mission</li>\n</ul>\n<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>\n<p><strong>RESPONSIBILITIES:</strong></p>\n<ul>\n<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application</li>\n<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>\n<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>\n<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>\n<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>\n</ul>\n<p><strong>YOU MAY BE A GOOD FIT IF YOU:</strong></p>\n<ul>\n<li>Have 5-10+ years of experience building software</li>\n<li>Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>\n<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. diving into new layers of the stack to find bottlenecks</li>\n<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>\n<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>\n<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>\n<li>Care about the societal impacts and ethics of your work</li>\n<li>Are comfortable working closely with researchers, translating research needs into engineering solutions.</li>\n</ul>\n<p><strong>STRONG CANDIDATES MAY ALSO HAVE EXPERIENCE WITH:</strong></p>\n<ul>\n<li>Optimizing the performance of large-scale distributed systems</li>\n<li>Language modeling fundamentals with transformers</li>\n<li>High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>\n<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>\n<li>Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges</li>\n</ul>\n<p><strong>REPRESENTATIVE PROJECTS:</strong></p>\n<ul>\n<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>\n<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them</li>\n<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>\n<li>Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research)</li>\n</ul>\n<p><strong>ROLE SPECIFIC LOCATION POLICY:</strong></p>\n<ul>\n<li>This role is based in the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>\n</ul>\n<p>The annual compensation range for this role is listed below.</p>\n<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $315,000-$560,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_dc17980d-461","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","High Performance LLM optimization","memory management","compute efficiency","parallelism strategies","inference throughput optimization"],"x-skills-preferred":["large-scale distributed systems","language modeling fundamentals","transformers","collaborating closely with researchers","building tooling to support research teams"],"datePosted":"2026-04-18T15:53:01.682Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, High Performance LLM optimization, memory management, compute efficiency, parallelism strategies, inference throughput optimization, large-scale distributed systems, language modeling fundamentals, transformers, collaborating closely with researchers, building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ec7cc743-ef4"},"title":"Senior Software Engineer II, Inference","description":"<p>We&#39;re seeking a senior software 
engineer to join our team and lead the design and development of our Kubernetes-native inference platform. As a senior engineer, you will be responsible for leading design reviews, driving architecture, and ensuring the reliability and scalability of our platform.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading design reviews and driving architecture within the team</li>\n<li>Defining and owning SLIs/SLOs and ensuring post-incident actions land and reliability improves release-over-release</li>\n<li>Implementing advanced optimizations such as micro-batch schedulers, speculative decoding, and KV-cache reuse</li>\n<li>Strengthening incident posture through capacity planning, autoscaling policy, and rollback/traffic-shift strategies</li>\n<li>Mentoring IC1/IC2 engineers and reviewing cross-team designs to elevate coding/testing standards</li>\n</ul>\n<p>We&#39;re looking for someone with strong coding skills in Python or Go, deep familiarity with networked systems and performance, and hands-on experience with Kubernetes at production scale. If you have experience with inference internals, batching, caching, mixed precision, and streaming token delivery, that&#39;s a plus.</p>\n<p>In addition to a competitive salary, we offer a range of benefits including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO. 
We&#39;re committed to creating a work environment that&#39;s inclusive, diverse, and supportive of our employees&#39; well-being.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ec7cc743-ef4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4604832006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Python","Go","Kubernetes","Networked systems","Performance","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies","Contributions to inference frameworks","Experience with multi-team initiatives"],"datePosted":"2026-04-18T15:50:27.738Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, Networked systems, Performance, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect topologies, Contributions to inference frameworks, Experience with multi-team initiatives","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a2b0b667-b4a"},"title":"Senior DSP Engineer","description":"<p>Anduril Industries is seeking a Senior DSP Engineer to join their team. 
As a Senior DSP Engineer, you will guide DSP engineers in the execution of DSP trade studies and optimization of signal processing and machine learning algorithms for deployment on FPGAs and GPUs. You will collaborate with a multidisciplinary team of software and hardware engineers to develop software defined radios, and direct the DSP team in its engagement with the software &amp; hardware team, including the implementation of DSP techniques into software and firmware and integration activities. You will design and implement algorithms and techniques for RADAR systems, develop Modeling and Simulation (M&amp;S) code for RADAR techniques and data analysis including Hardware-in-the-Loop / Software-in-the-Loop (HIL/SIL) testing, participate in laboratory and field testing of RF systems and techniques, and participate in the maturation of RF systems into deployable systems and products.</p>\n<p>The ideal candidate will have 7+ years of experience with a BSEE or related field, strong experience with DSP implementation for embedded devices and/or software defined radios, strong knowledge of Python and MATLAB, experience with CUDA or GPU accelerated frameworks like cuSignal, experience with embedded devices, including FPGA, Nvidia Jetson, and Software Defined Radios, skilled with Modeling and Simulation of RF systems including Radar and SAR, familiar with deep learning algorithms, experience with ML frameworks such as TensorFlow and PyTorch, familiar with wireless communication standards (Bluetooth, 3G/4G/5G, Wi-Fi, SINCGARS, MUOS, etc.), excellent at balancing multiple projects at any given time and/or managing a larger team for a larger program, enthusiastic about both working with a team and executing some work individually (depending on program scope), experience with Electronic Warfare systems, and currently possesses and is able to maintain an active U.S. 
Secret security clearance.</p>\n<p>Preferred qualifications include a Master&#39;s or PhD degree in Electrical, Electronics, Computer Engineering, or related fields, defense, national security, or aerospace domain familiarity through industry or education, extensive Digital Signal Processing (DSP) knowledge and experience, expertise in Synthetic Aperture Radar (SAR) and/or Inverse SAR (ISAR): Image formation, waveforms, phenomenology, modeling and simulation.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a2b0b667-b4a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://anduril.com","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/5031497007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$191,000-$253,000 USD","x-skills-required":["Digital Signal Processing","Embedded Devices","Software Defined Radios","Python","MATLAB","CUDA","GPU Accelerated Frameworks","Modeling and Simulation","RF Systems","Radar and SAR","Deep Learning Algorithms","ML Frameworks","Wireless Communication Standards","Electronic Warfare Systems"],"x-skills-preferred":["Synthetic Aperture Radar (SAR)","Inverse SAR (ISAR)","Image Formation","Waveforms","Phenomenology"],"datePosted":"2026-04-18T15:49:39.269Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Costa Mesa, California, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Digital Signal Processing, Embedded Devices, Software Defined Radios, Python, MATLAB, CUDA, GPU Accelerated Frameworks, Modeling and Simulation, RF Systems, Radar and SAR, Deep Learning Algorithms, ML Frameworks, Wireless Communication Standards, Electronic Warfare 
Systems, Synthetic Aperture Radar (SAR), Inverse SAR (ISAR), Image Formation, Waveforms, Phenomenology","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":191000,"maxValue":253000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9701c504-1a6"},"title":"Senior Software Engineer I, Inference","description":"<p>We&#39;re looking for a Senior Software Engineer I to join our team. As a senior engineer, you&#39;ll lead designs, raise engineering standards, and deliver measurable improvements to latency, throughput, and reliability across multiple services. You&#39;ll partner with product, orchestration, and hardware teams to evolve our Kubernetes-native inference platform and meet strict P99 SLAs at scale.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones.</li>\n<li>Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release.</li>\n<li>Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact.</li>\n<li>Strengthen incident posture: capacity planning, autoscaling policy, graceful degradation, rollback/traffic-shift strategies.</li>\n<li>Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards.</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>3-5 years of industry experience building distributed systems or cloud services.</li>\n<li>Strong coding in Python or Go (C++ a plus) and deep familiarity with networked systems and performance.</li>\n<li>Hands-on experience with Kubernetes at production scale, CI/CD, and observability stacks (Prometheus, Grafana, OpenTelemetry).</li>\n<li>Practical knowledge of inference internals: batching, caching, mixed precision (BF16/FP8), 
streaming token delivery.</li>\n<li>Proven track record improving tail latency (P95/P99) and service reliability through metrics-driven work.</li>\n</ul>\n<p>Preferred qualifications include contributions to inference frameworks, experience with CUDA kernels, NCCL/SHARP, RDMA/NUMA, or GPU interconnect topologies, and leading multi-team initiatives or partnering with customers on mission-critical launches.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9701c504-1a6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4647603006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,000 to $204,000","x-skills-required":["Python","Go","Kubernetes","CI/CD","Observability stacks","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["Contributions to inference frameworks","CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies"],"datePosted":"2026-04-18T15:48:09.297Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, CI/CD, Observability stacks, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, Contributions to inference frameworks, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect 
topologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139000,"maxValue":204000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_126e36d8-668"},"title":"Perception Engineering Intern","description":"<p>We are seeking a perception engineer with a strong background in computer vision to join our rapidly growing team in Costa Mesa, CA. In this role, you will be at the forefront of developing advanced perception systems for complex autonomous aerial platforms.</p>\n<p>Your expertise in computer vision algorithms, combined with your understanding of robotics principles, will be crucial in solving a wide variety of challenges involving visual perception, SLAM, motion planning, controls, and state estimation. This role requires not only technical expertise in computer vision and robotics but also the ability to make pragmatic engineering tradeoffs, considering the unique constraints of aerial platforms.</p>\n<p>Your work will directly contribute to the seamless integration of Anduril&#39;s products, achieving critical outcomes in autonomous operations. 
This position demands strong systems-level knowledge and experience, as you&#39;ll be working on the intersection of computer vision, robotics, and autonomous systems.</p>\n<p>If you are passionate about pushing the boundaries of computer vision in robotics, possess a &#39;Whatever It Takes&#39; mindset, and can execute in an expedient, scalable, and pragmatic way while keeping the mission top-of-mind and making sound engineering decisions, then this role is for you.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Work at the intersection of 3D perception and computer vision, developing robust algorithms that power real-time decision-making for autonomous aerial systems.</li>\n</ul>\n<ul>\n<li>Develop and implement advanced structure from motion and SLAM algorithms to create accurate 3D models from multiple camera inputs in real-time.</li>\n</ul>\n<ul>\n<li>Integrate perception outputs with path planning algorithms to enable autonomous navigation in complex, unstructured environments.</li>\n</ul>\n<ul>\n<li>Design experiments, data collection efforts, and curate training/evaluation sets to develop insights for both internal purposes and customers.</li>\n</ul>\n<ul>\n<li>Collaborate closely with robotics, software, and hardware teams to integrate perception algorithms into autonomous aerial systems.</li>\n</ul>\n<ul>\n<li>Work with vendors and government stakeholders to advance the state-of-the-art in perception and world modeling for autonomous aerial systems.</li>\n</ul>\n<p>Required Qualifications:</p>\n<ul>\n<li>BS in Robotics, Computer Science, Mechatronics, Electrical Engineering, Mechanical Engineering, or related field.</li>\n</ul>\n<ul>\n<li>Strong knowledge of 3D computer vision concepts, including multi-view geometry, camera models, photogrammetry, depth estimation, and 3D reconstruction techniques.</li>\n</ul>\n<ul>\n<li>Fluency in standard domain libraries (numpy, opencv, pytorch, etc).</li>\n</ul>\n<ul>\n<li>Proven understanding of data structures, 
algorithms, concurrency, and code optimization.</li>\n</ul>\n<ul>\n<li>Experience working with Python, PyTorch, or C++ programming languages.</li>\n</ul>\n<ul>\n<li>Experience deploying software to end customers, internal or external.</li>\n</ul>\n<ul>\n<li>Must be willing to travel 25%.</li>\n</ul>\n<ul>\n<li>Eligible to obtain an active U.S. Secret security clearance.</li>\n</ul>\n<p>Preferred Qualifications:</p>\n<ul>\n<li>MS or PhD in Robotics, Computer Science, Engineering, or related field.</li>\n</ul>\n<ul>\n<li>Experience with perception systems for aerial robotics or other highly dynamic platforms.</li>\n</ul>\n<ul>\n<li>Experience with real-world sensor integrations, including LiDAR, RGB-D cameras, IR cameras, stereo cameras, or TOF cameras.</li>\n</ul>\n<ul>\n<li>Experience with GPU / CUDA programming for accelerated computer vision processing.</li>\n</ul>\n<ul>\n<li>Knowledge of path planning algorithms and their integration with perception systems in dynamic environments.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_126e36d8-668","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://www.anduril.com/","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/4830032007","x-work-arrangement":"onsite","x-experience-level":"entry","x-job-type":"internship","x-salary-range":null,"x-skills-required":["computer vision","robotics","Python","PyTorch","C++","numpy","opencv","data structures","algorithms","concurrency","code optimization"],"x-skills-preferred":["perception systems","aerial robotics","LiDAR","RGB-D cameras","IR cameras","stereo cameras","TOF cameras","GPU","CUDA"],"datePosted":"2026-04-18T15:48:07.380Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Costa Mesa, 
California, United States"}},"employmentType":"INTERN","occupationalCategory":"Engineering","industry":"Technology","skills":"computer vision, robotics, Python, PyTorch, C++, numpy, opencv, data structures, algorithms, concurrency, code optimization, perception systems, aerial robotics, LiDAR, RGB-D cameras, IR cameras, stereo cameras, TOF cameras, GPU, CUDA"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_97212bdf-dd1"},"title":"Research Engineer, Interpretability","description":"<p>Job Title: Research Engineer, Interpretability</p>\n<p>About the Role:</p>\n<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? How can we trust them?&quot; The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>\n<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>\n<p>More resources to learn about our work:</p>\n<ul>\n<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>\n</ul>\n<ul>\n<li>An Introduction to Interpretability from our research lead, Chris Olah</li>\n</ul>\n<ul>\n<li>The Urgency of Interpretability from CEO Dario Amodei</li>\n</ul>\n<ul>\n<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>\n</ul>\n<ul>\n<li>60 Minutes segment - Around 8:07, see a demo of tooling our team built</li>\n</ul>\n<ul>\n<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>\n</ul>\n<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>\n<ul>\n<li>Pretraining: 
Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>\n</ul>\n<ul>\n<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot;</li>\n</ul>\n<ul>\n<li>Performance: Like all LLM work, we push up against the limits of hardware and software. Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them and moving ahead given rapidly evolving research and safety mission</li>\n</ul>\n<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application</li>\n</ul>\n<ul>\n<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>\n</ul>\n<ul>\n<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>\n</ul>\n<ul>\n<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>\n</ul>\n<ul>\n<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 5-10+ years of experience building software</li>\n</ul>\n<ul>\n<li>Are highly proficient in at least one 
programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>\n</ul>\n<ul>\n<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. diving into new layers of the stack to find bottlenecks</li>\n</ul>\n<ul>\n<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>\n</ul>\n<ul>\n<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>\n</ul>\n<ul>\n<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>\n</ul>\n<ul>\n<li>Care about the societal impacts and ethics of your work</li>\n</ul>\n<ul>\n<li>Are comfortable working closely with researchers, translating research needs into engineering solutions.</li>\n</ul>\n<p>Strong candidates may also have experience with:</p>\n<ul>\n<li>Optimizing the performance of large-scale distributed systems</li>\n</ul>\n<ul>\n<li>Language modeling fundamentals with transformers</li>\n</ul>\n<ul>\n<li>High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>\n</ul>\n<ul>\n<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>\n</ul>\n<ul>\n<li>Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges</li>\n</ul>\n<p>Representative Projects:</p>\n<ul>\n<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>\n</ul>\n<ul>\n<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them</li>\n</ul>\n<ul>\n<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>\n</ul>\n<ul>\n<li>Building a steered inference system that applies 
targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research)</li>\n</ul>\n<p>Role Specific Location Policy:</p>\n<ul>\n<li>This role is based in the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>\n</ul>\n<p>The annual compensation range for this role is listed below.</p>\n<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $315,000-$560,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_97212bdf-dd1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","Transformers","High Performance LLM optimization","Memory management","Compute efficiency","Parallelism strategies","Inference throughput optimization"],"x-skills-preferred":["Optimizing the performance of large-scale distributed systems","Language modeling fundamentals","Collaborating closely with researchers and building tooling to support research teams"],"datePosted":"2026-04-18T15:46:01.999Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, Transformers, High 
Performance LLM optimization, Memory management, Compute efficiency, Parallelism strategies, Inference throughput optimization, Optimizing the performance of large-scale distributed systems, Language modeling fundamentals, Collaborating closely with researchers and building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_44251c7b-221"},"title":"Member of Technical Staff - Recommendation Systems","description":"<p>We&#39;re seeking exceptional Applied engineers to join a high-priority project used by approximately 600 million monthly users. This is an exciting opportunity for individuals with an engineer or scientist background to apply their skills to recommendation systems, ranking algorithms, search technologies, and many other systems.</p>\n<p>You&#39;ll work at the intersection of advanced AI development and real-world impact, enhancing the ability to connect users with relevant content, accounts, and experiences.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Designing and architecting recommendation algorithms across various product surfaces</li>\n</ul>\n<ul>\n<li>Leveraging all of xAI&#39;s infrastructure and AI stacks to dramatically enhance the user experience</li>\n</ul>\n<ul>\n<li>Writing data pipelines and training jobs that continuously learn from product data</li>\n</ul>\n<ul>\n<li>Iterating and improving the algorithm by gathering user feedback in real time through experimentation</li>\n</ul>\n<ul>\n<li>Ensuring scalability and efficiency of machine learning systems</li>\n</ul>\n<p>Basic Qualifications:</p>\n<ul>\n<li>Knowledge of data infrastructure like Kafka, Clickhouse, and Spark</li>\n</ul>\n<ul>\n<li>Experienced in implementing recommender systems and/or deep learning applications at industrial 
scale</li>\n</ul>\n<ul>\n<li>Skilled in one or more DL software frameworks such as JAX or PyTorch</li>\n</ul>\n<ul>\n<li>Exceptional candidates may be experienced in writing CUDA kernels</li>\n</ul>\n<p>Compensation and Benefits:</p>\n<p>$180,000 - $440,000 USD</p>\n<p>Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short &amp; long-term disability insurance, life insurance, and various other discounts and perks.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_44251c7b-221","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com/","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/4703144007","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000 USD","x-skills-required":["data infrastructure","recommender systems","deep learning","DL software frameworks","CUDA kernels"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:45:00.153Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data infrastructure, recommender systems, deep learning, DL software frameworks, CUDA kernels","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1507524b-770"},"title":"Research Engineer, Performance RL","description":"<p>We&#39;re hiring a Research Engineer to join our Code RL team within the RL 
organization. As a Research Engineer, you&#39;ll advance our models&#39; ability to safely write correct, fast code for accelerators.</p>\n<p>You&#39;ll need to know accelerator performance well to turn it into tasks and signals models can learn from. Specifically, you will:</p>\n<ul>\n<li>Invent, design and implement RL environments and evaluations.</li>\n<li>Conduct experiments and shape our research roadmap.</li>\n<li>Deliver your work into training runs.</li>\n<li>Collaborate with other researchers, engineers, and performance engineering specialists across and outside Anthropic.</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have expertise with accelerators (CUDA, ROCm, Triton, Pallas) and ML framework programming (JAX or PyTorch).</li>\n<li>Have worked across the stack – kernels, model code, distributed systems.</li>\n<li>Know how to balance research exploration with engineering implementation.</li>\n<li>Are passionate about AI&#39;s potential and committed to developing safe and beneficial systems.</li>\n</ul>\n<p>Strong candidates may also have:</p>\n<ul>\n<li>Experience with reinforcement learning.</li>\n<li>Experience porting ML workloads between different types of accelerators.</li>\n<li>Familiarity with LLM training methodologies.</li>\n</ul>\n<p>The annual compensation range for this role is $350,000-$850,000 USD.</p>\n<p>We&#39;re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.</p>\n<p>We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact, advancing our long-term goals of steerable, trustworthy AI, rather than work on smaller and more specific puzzles. 
We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science.</p>\n<p>Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1507524b-770","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5160330008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000-$850,000 USD","x-skills-required":["accelerators","ML framework programming","distributed systems","reinforcement learning","LLM training methodologies"],"x-skills-preferred":["CUDA","ROCm","Triton","Pallas","JAX","PyTorch"],"datePosted":"2026-04-18T15:42:09.925Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"accelerators, ML framework programming, distributed systems, reinforcement learning, LLM training methodologies, CUDA, ROCm, Triton, Pallas, JAX, PyTorch","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_28107212-128"},"title":"Performance Engineer, GPU","description":"<p>As a GPU Performance Engineer at Anthropic, 
you will be responsible for architecting and implementing the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. You will maximize GPU utilization and performance at unprecedented scale, develop cutting-edge optimizations that directly enable new model capabilities, and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you will implement state-of-the-art techniques from custom kernel development to distributed system architectures. Your work will span the entire stack, from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Architect and implement foundational systems that power Claude</li>\n<li>Maximize GPU utilization and performance at unprecedented scale</li>\n<li>Develop cutting-edge optimizations that directly enable new model capabilities</li>\n<li>Dramatically improve inference efficiency</li>\n<li>Implement state-of-the-art techniques from custom kernel development to distributed system architectures</li>\n<li>Work at the intersection of hardware and software</li>\n<li>Span the entire stack, from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Deep experience with GPU programming and optimization at scale</li>\n<li>Impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Ability to navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with 
real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Experience with GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p>Representative projects:</p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p>Note: The salary range for this position is $280,000-$850,000 USD per year.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_28107212-128","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000-$850,000 USD per year","x-skills-required":["GPU programming","optimization at scale","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","PyTorch/JAX internals","torch.compile","XLA","custom operators","kernel fusion","memory bandwidth optimization","profiling with Nsight","NCCL","NVLink","collective communication","model parallelism","INT8/FP8 quantization","mixed-precision techniques","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:40:11.758Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, PyTorch/JAX internals, torch.compile, XLA, custom operators, kernel fusion, memory bandwidth optimization, profiling with Nsight, NCCL, NVLink, collective communication, model parallelism, INT8/FP8 quantization, mixed-precision techniques, large-scale training infrastructure, fault tolerance, cluster orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_98550091-2de"},"title":"Staff Engineer, State 
Estimation","description":"<p>As a Staff State Estimation Engineer, you will play a critical role on the GNC team, contributing to the development, optimisation, and deployment of advanced sensor fusion and navigation algorithms for autonomous UAV operations in dynamic and contested environments.</p>\n<p>Your work will support the transition of cutting-edge research into fielded capabilities, helping Shield AI deliver precision navigation solutions for mission-critical applications.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Develop and implement real-time state estimation algorithms including inertial navigation, sensor fusion, and alternative navigation methods for GPS-denied or degraded environments.</li>\n<li>Integrate data from IMUs, GNSS receivers, visual odometry, magnetometers, barometers, and radar into robust estimation frameworks.</li>\n<li>Design sensor processing pipelines focused on accuracy, robustness, and system-level fault tolerance.</li>\n<li>Collaborate with autonomy, software, and hardware teams to ensure end-to-end integration of navigation and PNT systems.</li>\n<li>Conduct simulation, lab testing, and field trials to evaluate algorithm performance under real-world conditions.</li>\n<li>Stay current on advancements in state estimation and navigation technologies and help adapt new innovations into deployable solutions.</li>\n</ul>\n<p><strong>Qualifications:</strong></p>\n<ul>\n<li>Typically requires a minimum of 7 years of relevant experience with a bachelor’s degree; or 6 years with a master’s degree; or 4 years with a PhD; or equivalent practical experience.</li>\n<li>Experience developing and deploying real-time navigation or sensor fusion algorithms using IMUs, GPS, or other sensors.</li>\n<li>Strong understanding of filtering and estimation techniques (e.g., Kalman filters, EKF, UKF, particle filters).</li>\n<li>Proficient in C++11 or newer in real-time environments.</li>\n<li>Comfortable working in Linux, with experience 
using standard command-line tools and scripting.</li>\n<li>Strong written and verbal communication skills with a collaborative mindset.</li>\n<li>Demonstrated success working in fast-paced development cycles and delivering high-quality results.</li>\n</ul>\n<p><strong>Salary:</strong></p>\n<p>$187,531 - $281,297 a year</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_98550091-2de","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Shield AI","sameAs":"https://www.shield.ai","logo":"https://logos.yubhub.co/shield.ai.png"},"x-apply-url":"https://jobs.lever.co/shieldai/f8849287-b9ff-4c3e-a37f-be20e39c597b","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$187,531 - $281,297 a year","x-skills-required":["state estimation","sensor fusion","inertial navigation","Kalman filters","C++11","Linux"],"x-skills-preferred":["visual odometry","computer vision","CUDA","hardware acceleration"],"datePosted":"2026-04-17T13:04:28.903Z","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"state estimation, sensor fusion, inertial navigation, Kalman filters, C++11, Linux, visual odometry, computer vision, CUDA, hardware acceleration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":187531,"maxValue":281297,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7da005da-ff5"},"title":"Senior Engineer, State Estimation","description":"<p>As a Senior Engineer, State Estimation, you will work on the GNC team to develop and optimise algorithms that process and fuse data from various sensors to provide accurate, reliable state estimates, enabling the X-BAT to operate autonomously in complex and contested 
environments.</p>\n<p>Your key responsibilities will include:</p>\n<p>Developing and implementing advanced sensor algorithms for processing data from IMUs, radar, cameras, GPS, and other sensors.\nEnhancing state estimation algorithms by integrating multi-sensor data for improved accuracy and robustness.\nDesigning and implementing real-time sensor data processing pipelines.\nCollaborating with cross-functional teams, including software engineers, autonomy researchers, and hardware engineers, to ensure seamless integration of state estimation algorithms.\nConducting experiments and field tests to validate the performance of state estimation algorithms in real-world scenarios.\nStaying updated with the latest advancements in sensor technologies and state estimation, applying them to our systems.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7da005da-ff5","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Shield AI","sameAs":"https://www.shield.ai","logo":"https://logos.yubhub.co/shield.ai.png"},"x-apply-url":"https://jobs.lever.co/shieldai/0c6acdd5-a39b-4ad3-84fa-b1a1f83409d3","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$160,000 - $240,000 a year","x-skills-required":["C++ 11 or newer","Linux","command line tools","Kalman filters","particle filters"],"x-skills-preferred":["inertial navigation algorithms","computer vision techniques","optimising algorithms for compute-constrained systems","CUDA or other hardware acceleration technologies"],"datePosted":"2026-04-17T13:03:53.983Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Dallas, Texas / Boston, MA / San Diego, California / Washington, DC"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++ 11 or newer, Linux, command 
line tools, Kalman filters, particle filters, inertial navigation algorithms, computer vision techniques, optimising algorithms for compute-constrained systems, CUDA or other hardware acceleration technologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":160000,"maxValue":240000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_5b9f33df-224"},"title":"Engineer, State Estimation","description":"<p>As a State Estimation Engineer, you will play a critical role on the GNC team, contributing to the development, optimization, and deployment of advanced sensor fusion and navigation algorithms for autonomous UAV operations in dynamic and contested environments. You will help design real-time sensor processing pipelines, integrate multi-sensor data for robust state estimation, and collaborate closely with autonomy researchers, software engineers, and hardware teams to ensure high system performance and reliability.</p>\n<p>Your work will support the transition of cutting-edge research into fielded capabilities, helping Shield AI deliver precision navigation solutions for mission-critical applications.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Develop and implement real-time state estimation algorithms including inertial navigation, sensor fusion, and alternative navigation methods for GPS-denied or degraded environments.</li>\n<li>Integrate data from IMUs, GNSS receivers, visual odometry, magnetometers, barometers, and radar into robust estimation frameworks.</li>\n<li>Design sensor processing pipelines focused on accuracy, robustness, and system-level fault tolerance.</li>\n<li>Collaborate with autonomy, software, and hardware teams to ensure end-to-end integration of navigation and PNT systems.</li>\n<li>Conduct simulation, lab testing, and field trials to evaluate algorithm performance under real-world 
conditions.</li>\n<li>Stay current on advancements in state estimation and navigation technologies and help adapt new innovations into deployable solutions.</li>\n</ul>\n<p><strong>Qualifications:</strong></p>\n<ul>\n<li>Typically requires a minimum of 3 years of relevant experience with a bachelor’s degree; or 2 years with a master’s degree; or 1 year with a PhD; or equivalent practical experience.</li>\n<li>Familiarity with algorithms.</li>\n<li>Proficient in C++11 or newer in real-time environments.</li>\n<li>Comfortable working in Linux, with experience using standard command-line tools and scripting.</li>\n<li>Strong written and verbal communication skills with a collaborative mindset.</li>\n<li>Demonstrated success working in fast-paced development cycles and delivering high-quality results.</li>\n</ul>\n<p><strong>Preferred Qualifications:</strong></p>\n<ul>\n<li>Experience developing and deploying real-time navigation or sensor fusion algorithms using IMUs, GPS, or other sensors.</li>\n<li>Strong understanding of filtering and estimation techniques (e.g., Kalman filters, EKF, UKF, particle filters).</li>\n<li>Experience implementing inertial navigation algorithms in degraded or GPS-denied conditions.</li>\n<li>Exposure to visual odometry or computer vision-based navigation approaches.</li>\n<li>Experience optimizing code for performance on compute-constrained platforms.</li>\n<li>Familiarity with CUDA or hardware acceleration techniques (e.g., FPGAs).</li>\n<li>Experience transitioning navigation solutions from research into production environments.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_5b9f33df-224","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Shield 
AI","sameAs":"https://www.shield.ai","logo":"https://logos.yubhub.co/shield.ai.png"},"x-apply-url":"https://jobs.lever.co/shieldai/133e1006-4bcd-4a31-afaf-c85ad113b749","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$120,000 - $250,000 a year","x-skills-required":["C++11","Linux","standard command-line tools","scripting","algorithms","real-time environments"],"x-skills-preferred":["Kalman filters","EKF","UKF","particle filters","visual odometry","computer vision-based navigation","CUDA","hardware acceleration techniques"],"datePosted":"2026-04-17T13:02:29.847Z","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++11, Linux, standard command-line tools, scripting, algorithms, real-time environments, Kalman filters, EKF, UKF, particle filters, visual odometry, computer vision-based navigation, CUDA, hardware acceleration techniques","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":120000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d2256e99-10a"},"title":"Research Engineer, Machine Learning","description":"<p>About Mistral AI</p>\n<p>Mistral AI is a pioneering company shaping the future of AI. They believe in the power of AI to simplify tasks, save time, and enhance learning and creativity.</p>\n<p>Role Summary</p>\n<p>The Research Engineering team at Mistral AI spans Platform (shared infra &amp; clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve. 
As a Research Engineer – ML track, you’ll build and optimise the large-scale learning systems that power their open-weight models.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.</li>\n<li>Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.</li>\n<li>Conduct experiments on the latest deep-learning techniques (sparsified 70 B + runs, distributed training on thousands of GPUs).</li>\n<li>Design, implement and benchmark ML algorithms; write clear, efficient code in Python.</li>\n<li>Deliver prototypes that become production-grade components for Le Chat and their enterprise API.</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>Master’s or PhD in Computer Science (or equivalent proven track record).</li>\n<li>4 + years working on large-scale ML codebases.</li>\n<li>Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).</li>\n<li>Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops.</li>\n<li>Strong software-design instincts: testing, code review, CI/CD.</li>\n<li>Self-starter, low-ego, collaborative.</li>\n</ul>\n<p>What we offer</p>\n<ul>\n<li>Competitive salary and equity.</li>\n<li>Healthcare: Medical/Dental/Vision covered for you and your family.</li>\n<li>Pension: 401K (6% matching)</li>\n<li>PTO: 18 days</li>\n<li>Transportation: Reimburse office parking charges, or $120/month for public transport</li>\n<li>Sport: $120/month reimbursement for gym membership</li>\n<li>Meal stipend: $400 monthly allowance for meals (solution might evolve as they grow bigger)</li>\n<li>Visa sponsorship</li>\n<li>Coaching: they offer BetterUp coaching on a voluntary basis</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d2256e99-10a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/bada0014-0f32-4370-b55f-81c5595c7339","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","TensorFlow","Distributed training","Deep learning","NLP","LLMs","CUDA","Data pipeline"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:47:41.659Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, TensorFlow, Distributed training, Deep learning, NLP, LLMs, CUDA, Data pipeline"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_50cacac8-b47"},"title":"Research Engineer, Machine Learning","description":"<p><strong>About the Role</strong></p>\n<p>We are seeking a Research Engineer to join our Machine Learning team. 
As a Research Engineer, you will work on building and optimizing large-scale learning systems that power our open-weight models.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.</li>\n<li>Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.</li>\n<li>Conduct experiments on the latest deep-learning techniques.</li>\n<li>Design, implement and benchmark ML algorithms; write clear, efficient code in Python.</li>\n<li>Deliver prototypes that become production-grade components for Le Chat and our enterprise API.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Master&#39;s or PhD in Computer Science (or equivalent proven track record).</li>\n<li>4 + years working on large-scale ML codebases.</li>\n<li>Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).</li>\n<li>Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops.</li>\n<li>Strong software-design instincts: testing, code review, CI/CD.</li>\n<li>Self-starter, low-ego, collaborative.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive cash salary and equity.</li>\n<li>Food: Daily lunch vouchers.</li>\n<li>Sport: Monthly contribution to a Gympass subscription.</li>\n<li>Transportation: Monthly contribution to a mobility pass.</li>\n<li>Health: Full health insurance for you and your family.</li>\n<li>Parental: Generous parental leave policy.</li>\n</ul>\n<p>Note: Benefits may vary depending on location.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_50cacac8-b47","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral 
AI","sameAs":"https://mistral.ai/careers","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/07447e1d-7900-46d4-b61b-186f2f76847f","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","TensorFlow","DeepSpeed","FSDP","SLURM","K8s","Python","CUDA","data-pipeline"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:47:05.094Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, TensorFlow, DeepSpeed, FSDP, SLURM, K8s, Python, CUDA, data-pipeline"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_690339e7-e86"},"title":"Senior Software Engineer, Autonomy - Calibration, Mapping & Localization","description":"<p>About Cyngn</p>\n<p>Based in Mountain View, CA, Cyngn is a publicly-traded autonomous technology company. We deploy self-driving industrial vehicles like forklifts and tuggers to factories, warehouses, and other facilities throughout North America.</p>\n<p>To build this emergent technology, we are looking for innovative, motivated, and experienced leaders to join us and move this field forward. If you like to build, tinker, and create with a team of trusted and passionate colleagues, then Cyngn is the place for you.</p>\n<p>Key reasons to join Cyngn:</p>\n<p>We are small and big. With under 100 employees, Cyngn operates with the energy of a startup. On the other hand, we’re publicly traded. This means our employees not only work in close-knit teams with mentorship from company leaders, they also get access to the liquidity of our publicly-traded equity.</p>\n<p>We build today and deploy tomorrow. Our autonomous vehicles aren’t just test concepts; they’re deployed to real clients right now. 
That means your work will have a tangible, visible impact.</p>\n<p>We aren’t robots. We just develop them. We’re a welcoming, diverse team of sharp thinkers and kind humans. Collaboration and trust drive our creative environment. At Cyngn, everyone’s perspective matters, and that’s what powers our innovation.</p>\n<p>About this role:</p>\n<p>As a Staff/Senior Software Engineer on our Calibration, Localization, &amp; Mapping (CLAM) team, you will be responsible for delivering mission-critical improvements and new features to our calibration, localization, and mapping subsystems. You will work on a small, highly focused team developing production-quality software that enables efficient and accurate creation of HD maps at Cyngn deployment sites and robust localization for Cyngn’s autonomous vehicle fleets.</p>\n<p>Responsibilities</p>\n<ul>\n<li><p>Design, implement, tune, and test mapping, localization, and sensor calibration algorithms for our autonomous vehicle platforms using C++ and Python.</p>\n</li>\n<li><p>Develop tooling and metrics for performance validation and continuous testing frameworks.</p>\n</li>\n<li><p>Balance project tasks, code reviews, and research to meet product-driven milestones in a fast-paced startup environment.</p>\n</li>\n</ul>\n<p>Qualifications</p>\n<ul>\n<li><p>MS/PhD with a focus in robotics or a similar technical field of study</p>\n</li>\n<li><p>Solid foundation in probability theory, linear algebra, 3D geometry, and spatial coordinate transformations.</p>\n</li>\n<li><p>In-depth understanding of matrix factorization algorithms and Lie algebra/groups.</p>\n</li>\n<li><p>Solid theoretical knowledge of state-of-the-art techniques in 3D Lidar-based mapping and localization for autonomous vehicles (LOAM series, GICP, FastLIO, bundle-adjustment)</p>\n</li>\n<li><p>Familiarity with state estimation frameworks such as EKFs as well as modern nonlinear optimization libraries (GTSAM, G2O, Ceres-Solver, GNC-Solver, etc.)</p>\n</li>\n<li><p>6+ 
years of industry experience as an autonomous vehicle or robotics software engineering professional including hands-on implementation and tuning on production hardware.</p>\n</li>\n<li><p>6+ years industry experience writing C++ software in a production environment - architecture design, unit testing, code review, algorithm performance trade-offs, etc.</p>\n</li>\n<li><p>Proficiency in Python.</p>\n</li>\n<li><p>Excellent written &amp; verbal communication skills.</p>\n</li>\n</ul>\n<p>Bonus Qualifications</p>\n<ul>\n<li><p>Proven record of top-tier publications or patents.</p>\n</li>\n<li><p>Experience with GPU programming, CUDA.</p>\n</li>\n<li><p>Experience in implementing automated map change detection and updating techniques.</p>\n</li>\n<li><p>Experience implementing modern multi-sensor calibration and sensor mis-alignment detection algorithms.</p>\n</li>\n<li><p>Experience with camera-based SLAM and 3D multi-view geometry.</p>\n</li>\n<li><p>Experience working with ROS2 to design, build, and operate robotic systems.</p>\n</li>\n<li><p>Exposure to modern software development version control and project management tools - Git, Jira, etc.</p>\n</li>\n</ul>\n<p>Benefits &amp; Perks</p>\n<ul>\n<li><p>Health benefits (Medical, Dental, Vision, HSA and FSA (Health &amp; Dependent Daycare), Employee Assistance Program, 1:1 Health Concierge)</p>\n</li>\n<li><p>Life, Short-term and long-term disability insurance (Cyngn funds 100% of premiums)</p>\n</li>\n<li><p>Company 401(k)</p>\n</li>\n<li><p>Commuter Benefits</p>\n</li>\n<li><p>Flexible vacation policy</p>\n</li>\n<li><p>Sabbatical leave opportunity after 5 years with the company</p>\n</li>\n<li><p>Paid Parental Leave</p>\n</li>\n<li><p>Daily lunches for in-office employees and fully-stocked kitchen with snacks and beverages</p>\n</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_690339e7-e86","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cyngn","sameAs":"https://www.cyngn.com/","logo":"https://logos.yubhub.co/cyngn.com.png"},"x-apply-url":"https://jobs.lever.co/cyngn/716dbe41-cac5-4d23-9ec3-cc05b32322b4","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$180,000-198,000 per year","x-skills-required":["C++","Python","Probability theory","Linear algebra","3D geometry","Spatial coordinate transformations","Matrix factorization algorithms","Lie algebra/groups","State estimation frameworks","Nonlinear optimization libraries"],"x-skills-preferred":["GPU programming","CUDA","Automated map change detection and updating techniques","Modern multi-sensor calibration and sensor mis-alignment detection algorithms","Camera-based SLAM and 3D multi-view geometry","ROS2","Git","Jira"],"datePosted":"2026-04-17T12:28:37.248Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, Python, Probability theory, Linear algebra, 3D geometry, Spatial coordinate transformations, Matrix factorization algorithms, Lie algebra/groups, State estimation frameworks, Nonlinear optimization libraries, GPU programming, CUDA, Automated map change detection and updating techniques, Modern multi-sensor calibration and sensor mis-alignment detection algorithms, Camera-based SLAM and 3D multi-view geometry, ROS2, Git, Jira","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":198000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8e582153-6af"},"title":"Senior DevOps Lead - Cloud & Autonomous 
System","description":"<p>About Cyngn</p>\n<p>Cyngn is a publicly-traded autonomous technology company that deploys self-driving industrial vehicles to factories, warehouses, and other facilities throughout North America.</p>\n<p>We are a small company with under 100 employees, operating with the energy of a startup. However, we&#39;re also publicly traded, which means our employees get access to the liquidity of our publicly-traded equity.</p>\n<p>As a Senior DevOps Lead at Cyngn, you will play a vital role in architecting and managing infrastructure across cloud and autonomous vehicle systems. This position combines traditional cloud DevOps leadership with specialized expertise in robotics and autonomous systems infrastructure.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Lead and architect cloud and vehicle infrastructure initiatives across AWS and ROS/Linux environments</li>\n<li>Design and implement scalable solutions for both cloud services and autonomous vehicle systems</li>\n<li>Establish and maintain DevOps best practices, CI/CD pipelines, and infrastructure as code</li>\n<li>Drive observability, monitoring, and incident response strategies</li>\n<li>Optimize performance and cost efficiency of cloud and edge computing resources</li>\n<li>Mentor team members and foster a developer-friendly environment</li>\n<li>Manage on-call rotations and incident response processes</li>\n<li>Architect solutions for processing and storing large-scale vehicle telemetry data</li>\n<li>Lead security initiatives and compliance efforts across infrastructure</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>10+ years of relevant DevOps/Infrastructure experience</li>\n<li>Proven track record as a technical lead in platform or infrastructure teams</li>\n<li>Advanced expertise in AWS services, infrastructure as code (Terraform), and Kubernetes</li>\n<li>Strong experience with service mesh (Istio) and Helm/Kustomize</li>\n<li>Deep understanding of ROS/ROS2 and Linux kernel 
configurations</li>\n<li>Experience with GPU configurations and ML infrastructure</li>\n<li>Expertise in ARM and NVIDIA CUDA platform configurations</li>\n<li>Strong programming skills in Python and shell scripting</li>\n<li>Experience with infrastructure automation (Ansible)</li>\n<li>Expertise in CI/CD tools (Jenkins, GitHub Actions)</li>\n<li>Strong system architecture and design skills</li>\n<li>Excellence in technical documentation</li>\n<li>Outstanding problem-solving abilities</li>\n<li>Strong leadership and mentoring capabilities</li>\n</ul>\n<p>Nice to haves</p>\n<ul>\n<li>Experience with autonomous vehicle systems</li>\n<li>Track record of optimizing GPU-based ML infrastructure</li>\n<li>Experience with large-scale IoT deployments</li>\n<li>Contributions to open-source projects</li>\n<li>Experience with real-time systems and low-latency requirements</li>\n<li>Expertise in security implementations including SSO, IdP, and AWS Cognito</li>\n<li>Experience with JFrog artifactory and container registry management</li>\n<li>Proficiency in AWS IoT Greengrass</li>\n<li>Experience with container resource management on edge devices</li>\n<li>Understanding of CPU affinity and priority scheduling</li>\n<li>Track record of implementing cost optimization strategies</li>\n<li>Experience with scaling systems both horizontally and vertically</li>\n</ul>\n<p>Benefits &amp; Perks</p>\n<ul>\n<li>Health benefits (Medical, Dental, Vision, HSA and FSA (Health &amp; Dependent Daycare), Employee Assistance Program, 1:1 Health Concierge)</li>\n<li>Life, Short-term, and long-term disability insurance (Cyngn funds 100% of premiums)</li>\n<li>Company 401(k)</li>\n<li>Commuter Benefits</li>\n<li>Flexible vacation policy</li>\n<li>Sabbatical leave opportunity after five years with the company</li>\n<li>Paid Parental Leave</li>\n<li>Daily lunches for in-office employees</li>\n<li>Monthly meal and tech allowances for remote employees</li>\n</ul>\n<p 
style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8e582153-6af","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cyngn","sameAs":"https://www.cyngn.com/","logo":"https://logos.yubhub.co/cyngn.com.png"},"x-apply-url":"https://jobs.lever.co/cyngn/1c31b7d8-cf85-472f-9358-1e10189cf815","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$198,000-225,000 per year","x-skills-required":["AWS services","infrastructure as code (Terraform)","Kubernetes","service mesh (Istio)","Helm/Kustomize","ROS/ROS2","Linux kernel configurations","GPU configurations","ML infrastructure","ARM","NVIDIA CUDA platform configurations","Python","shell scripting","infrastructure automation (Ansible)","CI/CD tools (Jenkins, GitHub Actions)","system architecture and design skills","technical documentation","problem-solving abilities","leadership and mentoring capabilities"],"x-skills-preferred":["autonomous vehicle systems","optimizing GPU-based ML infrastructure","large-scale IoT deployments","open-source projects","real-time systems and low-latency requirements","security implementations including SSO, IdP, and AWS Cognito","JFrog artifactory and container registry management","AWS IoT Greengrass","container resource management on edge devices","CPU affinity and priority scheduling","cost optimization strategies","scaling systems both horizontally and vertically"],"datePosted":"2026-04-17T12:27:09.593Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AWS services, infrastructure as code (Terraform), Kubernetes, service mesh (Istio), Helm/Kustomize, ROS/ROS2, Linux kernel configurations, GPU configurations, ML infrastructure, 
ARM, NVIDIA CUDA platform configurations, Python, shell scripting, infrastructure automation (Ansible), CI/CD tools (Jenkins, GitHub Actions), system architecture and design skills, technical documentation, problem-solving abilities, leadership and mentoring capabilities, autonomous vehicle systems, optimizing GPU-based ML infrastructure, large-scale IoT deployments, open-source projects, real-time systems and low-latency requirements, security implementations including SSO, IdP, and AWS Cognito, JFrog artifactory and container registry management, AWS IoT Greengrass, container resource management on edge devices, CPU affinity and priority scheduling, cost optimization strategies, scaling systems both horizontally and vertically","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":198000,"maxValue":225000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c9678449-2cf"},"title":"Senior C++ Robotics Engineer","description":"<p>About Cyngn</p>\n<p>Cyngn is a publicly-traded autonomous technology company that deploys self-driving industrial vehicles to factories, warehouses, and other facilities throughout North America.</p>\n<p>We are looking for a Senior C++ Robotics Engineer to join our team. 
As a key member of our engineering team, you will play a vital role in developing and integrating autonomous vehicle systems.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Design and implement robust robotics software using C++ and ROS/ROS2 framework</li>\n<li>Develop and maintain critical system components including state management, health monitoring, and diagnostic tools</li>\n<li>Create and optimize high-performance software for processing sensor data from LiDAR, cameras, and other perception systems</li>\n<li>Implement and maintain CAN bus communications and firmware update systems</li>\n<li>Configure and optimize container environments for various autonomous vehicle components</li>\n<li>Develop and maintain system provisioning and configuration management tools</li>\n<li>Implement performance profiling and optimization across the autonomous vehicle stack</li>\n<li>Create and maintain automated testing and validation frameworks for system integration</li>\n<li>Troubleshoot complex system issues across hardware, software, and network interfaces</li>\n<li>Collaborate with cross-functional teams to integrate perception, localization, and control systems</li>\n</ul>\n<p>Requirements</p>\n<ul>\n<li>5+ years of experience in robotics software development or system integration</li>\n<li>Strong proficiency in ROS/ROS2 and Ubuntu-based systems</li>\n<li>Extensive experience with real-time system performance optimization and CUDA programming</li>\n<li>Deep understanding of autonomous vehicle architecture and systems integration</li>\n<li>Strong background in electrical systems, CAN protocols, and firmware development</li>\n<li>Expertise in container technologies (Docker, Podman) and their underlying systems</li>\n<li>Experience with configuration management tools like Ansible</li>\n<li>Strong programming skills in C++, Python, and shell scripting</li>\n<li>Thorough understanding of networking principles and protocols</li>\n<li>Experience with high-performance computing and 
system optimization</li>\n<li>Strong debugging and problem-solving skills across hardware and software domains</li>\n<li>Excellent documentation and communication skills</li>\n</ul>\n<p>Nice to Have</p>\n<ul>\n<li>Experience with fleet management systems or logistics software</li>\n<li>Experience with industrial automation or autonomous mobile robots</li>\n<li>Knowledge of Open-RMF middleware framework</li>\n<li>Experience with telematics data processing and analytics</li>\n<li>Familiarity with computer vision and machine learning deployment</li>\n<li>Experience with over-the-air (OTA) update systems</li>\n<li>Knowledge of safety-critical software development practices</li>\n<li>Experience with real-time operating systems</li>\n<li>Familiarity with automotive-grade software development</li>\n<li>Background in system safety and fault tolerance design</li>\n<li>Experience with simulation environments for autonomous systems testing</li>\n<li>Knowledge of DevOps practices and CI/CD pipelines</li>\n</ul>\n<p>Benefits &amp; Perks</p>\n<ul>\n<li>Health benefits (Medical, Dental, Vision, HSA and FSA (Health &amp; Dependent Daycare), Employee Assistance Program, 1:1 Health Concierge)</li>\n<li>Life, Short-term, and long-term disability insurance (Cyngn funds 100% of premiums)</li>\n<li>Company 401(k)</li>\n<li>Commuter Benefits</li>\n<li>Flexible vacation policy</li>\n<li>Sabbatical leave opportunity after five years with the company</li>\n<li>Paid Parental Leave</li>\n<li>Daily lunches for in-office employees</li>\n<li>Monthly meal and tech allowances for remote employees</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c9678449-2cf","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cyngn","sameAs":"https://www.cyngn.com/","logo":"https://logos.yubhub.co/cyngn.com.png"},"x-apply-url":"https://jobs.lever.co/cyngn/d5a8db2f-b21f-4e57-a64a-1dfd642a49b7","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$198,000-225,000 per year","x-skills-required":["C++","ROS/ROS2","Ubuntu-based systems","Real-time system performance optimization","CUDA programming","Autonomous vehicle architecture","Systems integration","Electrical systems","CAN protocols","Firmware development","Container technologies","Configuration management tools","Programming skills in C++, Python, and shell scripting","Networking principles and protocols","High-performance computing","System optimization"],"x-skills-preferred":["Fleet management systems","Industrial automation","Open-RMF middleware framework","Telematics data processing","Computer vision","Machine learning deployment","Over-the-air (OTA) update systems","Safety-critical software development practices","Real-time operating systems","Automotive-grade software development","System safety and fault tolerance design","Simulation environments","DevOps practices and CI/CD pipelines"],"datePosted":"2026-04-17T12:26:58.443Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, ROS/ROS2, Ubuntu-based systems, Real-time system performance optimization, CUDA programming, Autonomous vehicle architecture, Systems integration, Electrical systems, CAN protocols, Firmware development, Container technologies, Configuration management tools, Programming skills in C++, Python, and shell scripting, Networking principles and protocols, 
High-performance computing, System optimization, Fleet management systems, Industrial automation, Open-RMF middleware framework, Telematics data processing, Computer vision, Machine learning deployment, Over-the-air (OTA) update systems, Safety-critical software development practices, Real-time operating systems, Automotive-grade software development, System safety and fault tolerance design, Simulation environments, DevOps practices and CI/CD pipelines","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":198000,"maxValue":225000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_76ec9c27-a1c"},"title":"Signal Processing Engineer","description":"<p>We&#39;re seeking a highly skilled Signal Processing Engineer to join our growing team. As a Signal Processing Engineer at CX2, you will design, implement, and test signal processing techniques using MATLAB, Python, and other existing frameworks. 
You will work on digital signal processing, write and contribute to existing Python repositories using CUDA and PyTorch, own requirements, ICDs, and verification from concept through delivery, and stay current with advances in signal processing techniques and associated technologies.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Design, characterize, and deliver algorithms such as channelizers, frequency agile detection, adaptive filters, MIMO, wideband detectors, and other algorithms related to signal sorting</li>\n<li>Write and contribute to existing Python repositories using CUDA and PyTorch</li>\n<li>Own requirements, ICDs, and verification from concept through delivery</li>\n<li>Stay current: Track and insert advances in signal processing techniques and associated technologies, adaptive beamforming, RF machine learning, and resilient PNT for GPS-denied ops.</li>\n</ul>\n<p>Required Qualifications:</p>\n<ul>\n<li>Masters Degree in Electrical, Computer or Systems Engineering or related field with Graduate study emphasis in Signal Processing; OR a Bachelor’s Degree in an Engineering discipline with 3-5 years relevant working Signal Processing Experience</li>\n<li>Intermediate to advanced proficiency in Python</li>\n<li>Willingness to support critical test events that occasionally require extended hours/weekends.</li>\n<li>Ability to obtain and maintain a security clearance. Learn more about Security Clearances here.</li>\n<li>Must be a U.S. Person (see ITAR Regulations below) due to required access to U.S. 
export-controlled information or facilities</li>\n</ul>\n<p>Bonus Points:</p>\n<ul>\n<li>PhD in Electrical Engineering, Computer Engineering, or related field</li>\n<li>5+ years’ experience with EW subsystems and payloads.</li>\n<li>EA/ECM technique design (deception, Digital RF Memory, coherent/non-coherent techniques).</li>\n<li>Comms system design (LPI/LPD, Waveform-of-Interest exploitation)</li>\n<li>RF machine learning for emitter ID, modulation/classification, anomaly detection, PDW creation</li>\n<li>Tools Experience: ADS/AWR/SystemVue, MATLAB/Simulink, Python (NumPy/SciPy), GNU Radio/SDR (USRP/RFSoC), VITA-49; HDL/firmware experience also helpful (Vivado/Quartus/Libero).</li>\n<li>Clearance: Active Secret or ability to obtain and maintain; TS/SCI eligibility preferred. ITAR/EAR-controlled work.</li>\n<li>Field work: supporting periodic travel for flight tests and customer demonstrations/support</li>\n<li>Mindset: Builder-tester who loves first-principles RF, rapid lab iteration, and getting hardware flying fast.</li>\n</ul>\n<p>What We Offer:</p>\n<ul>\n<li>Competitive salary, stock options and benefits, including health, vision and dental.</li>\n<li>401K enrollment at 90 days.</li>\n<li>Generous PTO + most Federal Holidays observed.</li>\n<li>Collaborative and inclusive work environment.</li>\n<li>Access to the latest tools and technologies.</li>\n<li>High levels of responsibility and autonomy.</li>\n<li>Professional growth and development opportunities.</li>\n<li>Access to the hardest problems in electronic warfare.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_76ec9c27-a1c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CX2","sameAs":"https://cx2.com/","logo":"https://logos.yubhub.co/cx2.com.png"},"x-apply-url":"https://jobs.lever.co/cx2/c03eadf7-133f-4785-b7f9-37e5c3d52db9","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["MATLAB","Python","CUDA","PyTorch","Digital Signal Processing","Channelizers","Frequency Agile Detection","Adaptive Filters","MIMO","Wideband Detectors"],"x-skills-preferred":["ADS/AWR/SystemVue","MATLAB/Simulink","Python (NumPy/SciPy)","GNU Radio/SDR (USRP/RFSoC)","VITA-49","HDL/Firmware (Vivado/Quartus/Libero)"],"datePosted":"2026-04-17T12:26:41.350Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"El Segundo"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"MATLAB, Python, CUDA, PyTorch, Digital Signal Processing, Channelizers, Frequency Agile Detection, Adaptive Filters, MIMO, Wideband Detectors, ADS/AWR/SystemVue, MATLAB/Simulink, Python (NumPy/SciPy), GNU Radio/SDR (USRP/RFSoC), VITA-49, HDL/Firmware (Vivado/Quartus/Libero)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_38debdc4-b87"},"title":"GPU R&D Engineer (CUDA programming)","description":"<p>You are a passionate technology leader with deep expertise in GPU-accelerated computing and algorithm design. With over a decade of experience in software engineering, you thrive in environments that challenge you to innovate and push boundaries.</p>\n<p>As a GPU R&amp;D Engineer at Synopsys, you will be responsible for optimizing and enhancing existing GPU implementations for cutting-edge ILT (Inverse Lithography Technology) software. 
You will also design, develop, and deploy new GPU-accelerated algorithms for handling large-scale geometric data in mask synthesis tools.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Optimizing and enhancing existing GPU implementations for cutting-edge ILT software</li>\n<li>Designing, developing, and deploying new GPU-accelerated algorithms for handling large-scale geometric data in mask synthesis tools</li>\n<li>Collaborating with software, hardware, and QA teams to ensure seamless integration of advanced GPU features into Synopsys solutions</li>\n<li>Leading benchmarking and performance testing efforts to maximize throughput and efficiency of GPU algorithms</li>\n<li>Conducting research and staying current on GPU technology advancements, integrating the latest trends into Synopsys EDA products</li>\n<li>Interfacing with customers and hardware vendors to deliver optimal solutions and support rapid chip manufacturing cycles</li>\n</ul>\n<p>This role requires a strong foundation in algorithms and data structures, with proven experience optimizing for performance. You should also have exceptional troubleshooting skills and the ability to resolve complex integration challenges.</p>\n<p>In return, you will have the opportunity to make a tangible impact in the world of electronic design automation and lead initiatives that shape the next generation of semiconductor technology.</p>\n<p>The team you will be a part of is a dynamic, diverse group of engineers focused on advancing mask synthesis and lithography solutions within Synopsys. 
The team is renowned for its innovative spirit, technical excellence, and collaborative approach, working closely with customers and hardware partners to deliver industry-leading EDA tools.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_38debdc4-b87","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/gpu-r-and-d-engineer-cuda-programming/44408/91681543296","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Advanced knowledge of CUDA or similar GPU computing technologies","Proficiency in C/C++, Python, and distributed computing environments","Strong foundation in algorithms and data structures, with proven experience optimizing for performance","Exceptional troubleshooting skills and ability to resolve complex integration challenges","Experience with computational geometry algorithms, including Beziers, NURBS, and B-splines"],"x-skills-preferred":["Background in designing algorithms for Optical Proximity Correction and Inverse Lithography Technology","Experience with large-scale data handling and distributed systems"],"datePosted":"2026-04-05T13:22:03.873Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Advanced knowledge of CUDA or similar GPU computing technologies, Proficiency in C/C++, Python, and distributed computing environments, Strong foundation in algorithms and data structures, with proven experience optimizing for performance, Exceptional troubleshooting skills and ability to resolve complex integration challenges, 
Experience with computational geometry algorithms, including Beziers, NURBS, and B-splines, Background in designing algorithms for Optical Proximity Correction and Inverse Lithography Technology, Experience with large-scale data handling and distributed systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1662ffb6-3c9"},"title":"R&D Engineering, Sr Staff Engineer","description":"<p>You will work as a senior staff engineer in the R&amp;D engineering team at Synopsys. As a member of this team, you will be responsible for architecting and optimizing high-performance simulation kernels for the Synopsys VCS RTL simulator using advanced C++ techniques. You will also explore and implement GPU acceleration strategies with CUDA to significantly reduce simulation runtimes for customers. Additionally, you will leverage deep knowledge of Verilog/SystemVerilog LRM to ensure accurate and reliable simulation across diverse design environments.</p>\n<p>Your responsibilities will include:</p>\n<ul>\n<li>Architecting and optimizing high-performance simulation kernels for the Synopsys VCS RTL simulator using advanced C++ techniques.</li>\n<li>Exploring and implementing GPU acceleration strategies with CUDA to significantly reduce simulation runtimes for customers.</li>\n<li>Leveraging deep knowledge of Verilog/SystemVerilog LRM to ensure accurate and reliable simulation across diverse design environments.</li>\n<li>Integrating AI-powered tools (such as Cursor, GitHub Copilot, and generative AI assistants) to automate code generation and debugging processes.</li>\n<li>Mentoring and guiding junior engineers, fostering skills development and technical growth within the team.</li>\n<li>Collaborating with distributed R&amp;D teams to maintain Synopsys&#39; leadership and drive innovation in the EDA industry.</li>\n</ul>\n<p>As a senior staff engineer, you will have a significant impact on the company&#39;s 
success. You will be responsible for driving the evolution of the world&#39;s fastest Verilog simulator, setting new industry standards for performance and reliability. You will also empower customers to achieve greater productivity and efficiency through advanced simulation capabilities and reduced runtimes.</p>\n<p>To be successful in this role, you will need to have:</p>\n<ul>\n<li>8-10 years of relevant experience.</li>\n<li>Expert-level proficiency in C++ with proven experience in performance-critical software development.</li>\n<li>Deep understanding of Verilog/SystemVerilog Language Reference Manuals (LRM) and simulation methodologies.</li>\n<li>Hands-on experience with GPU programming, especially using CUDA for parallel acceleration.</li>\n<li>Familiarity with AI-powered development tools such as Cursor, GitHub Copilot, and generative AI assistants.</li>\n<li>Strong architectural design skills and ability to analyze and optimize complex software systems.</li>\n<li>Experience in mentoring and guiding junior engineers within an R&amp;D environment.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1662ffb6-3c9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/sunnyvale/r-and-d-engineering-sr-staff-engineer/44408/92995225280","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 - $248,000","x-skills-required":["C++","Verilog/SystemVerilog LRM","GPU programming","AI-powered development tools","architectural design skills"],"x-skills-preferred":["CUDA","Cursor","GitHub Copilot","generative AI 
assistants"],"datePosted":"2026-04-05T13:21:56.685Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, Verilog/SystemVerilog LRM, GPU programming, AI-powered development tools, architectural design skills, CUDA, Cursor, GitHub Copilot, generative AI assistants","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":248000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f1a00bea-138"},"title":"R&D Engineering, Staff Engineer (EDA, GPU Acceleration)","description":"<p>We Are:</p>\n<p>At Synopsys, we drive the innovations that shape the way we live and connect. Our technology is central to the Era of Pervasive Intelligence, from self-driving cars to learning machines. We lead in chip design, verification, and IP integration, empowering the creation of high-performance silicon chips and software content.</p>\n<p>You Are:</p>\n<p>You are an accomplished engineering leader with 3-6 years of experience in developing large-scale applications, particularly within the EDA domain. Your expertise spans the entire lifecycle of solution development, from initial specification to hands-on implementation, customer engagement, and iterative refinement. You thrive in environments that demand both deep technical prowess and strong leadership, and you are passionate about mentoring the next generation of engineers. Your background in C/C++ development is robust, and you approach new languages and technologies with curiosity and adaptability, making you an ideal fit for our dynamic team. 
Experience with CUDA, GPU acceleration, and GPU architecture is a plus.</p>\n<p>What You’ll Be Doing:</p>\n<p>Enabling GPU acceleration for the entire Fusion Compiler R2G flow. This includes GPU acceleration of each engine in the R2G flow, with new problem formulations that take advantage of GPU architectures. These engines include placement, global routing, detail routing, CTS, optimization, timer, extraction, legalizer, and synthesis.</p>\n<p>Owning projects end-to-end, from requirements gathering and design specification to development, testing, and customer interaction, ensuring high-quality deliverables.</p>\n<p>Collaborating closely with cross-functional teams, including product management and product engineering.</p>\n<p>The Impact You Will Have:</p>\n<p>Delivering a GPU-accelerated Fusion Compiler, which will be game-changing for chip design and implementation by reducing flow cycle times from weeks to days (or hours).</p>\n<p>Empowering Synopsys customers to achieve faster turnaround times, accelerate their design cycles, and reduce time to market.</p>\n<p>Elevating the technical excellence of the team by sharing best practices, fostering a culture of learning, and mentoring future leaders.</p>\n<p>Shaping the roadmap for Digital Implementation solutions, ensuring that Synopsys remains at the forefront of EDA technology.</p>\n<p>What You’ll Need:</p>\n<p>3-6 years of hands-on experience developing software projects, preferably in EDA or semiconductor domains.</p>\n<p>Expert proficiency in C/C++ development, with a proven track record of delivering robust, scalable solutions.</p>\n<p>Experience with physical design, placement, and routing flows in EDA tools.</p>\n<p>Experience with CUDA, GPU acceleration, and GPU architecture is a plus.</p>\n<p>Strong knowledge of software architecture, Design Thinking, and use of design patterns.</p>\n<p>Excellent communication skills for technical interactions.</p>\n<p>Who You 
Are:</p>\n<p>Innovative thinker who embraces new technologies and methodologies.</p>\n<p>Strong problem solver with a strategic mindset and attention to detail.</p>\n<p>Effective communicator, able to translate complex technical concepts for diverse audiences.</p>\n<p>Collaborative team player, eager to contribute and learn from others.</p>\n<p>Adaptable and resilient in the face of evolving challenges and requirements.</p>\n<p>The Team You’ll Be A Part Of:</p>\n<p>You’ll join the Fusion Compiler GPU Acceleration team in Synopsys Sunnyvale, CA (or Hillsboro, OR), a group of passionate engineers focused on developing an industry-first, game-changing GPU-accelerated Digital Implementation solution. This development is part of the Nvidia/Synopsys GPU Acceleration collaboration. This team is driving innovation in EDA and empowering customers worldwide by accelerating their design cycles and reducing time to market.</p>\n<p>Rewards and Benefits:</p>\n<p>We offer a comprehensive range of health, wellness, and financial benefits to cater to your needs. Our total rewards include both monetary and non-monetary offerings. Your recruiter will provide more details about the salary range and benefits during the hiring process.</p>\n<p>At Synopsys, we want talented people of every background to feel valued and supported to do their best work. Synopsys considers all applicants for employment without regard to race, color, religion, national origin, gender, sexual orientation, age, military veteran status, or disability.</p>\n<p>In addition to the base salary, this role may be eligible for an annual bonus, equity, and other discretionary bonuses. Synopsys offers comprehensive health, wellness, and financial benefits as part of a competitive total rewards package. The actual compensation offered will be based on a number of job-related factors, including location, skills, experience, and education. 
Your recruiter can share more specific details on the total rewards package upon request. The base salary range for this role is across the U.S.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f1a00bea-138","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/sunnyvale/r-and-d-engineering-staff-engineer-eda-gpu-acceleration/44408/93189758192","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$138000-$207000","x-skills-required":["C/C++ development","CUDA","GPU acceleration","GPU architecture knowledge","Physical design","Placement","Routing flows","Software architecture","Design Thinking","Use of design patterns"],"x-skills-preferred":[],"datePosted":"2026-04-05T13:20:37.265Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++ development, CUDA, GPU acceleration, GPU architecture knowledge, Physical design, Placement, Routing flows, Software architecture, Design Thinking, Use of design patterns","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":138000,"maxValue":207000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f50d9e71-65b"},"title":"Staff Engineer (R&D Engineering)","description":"<p>You will be working as a Staff Engineer in the R&amp;D Engineering team at Synopsys. As a key member of the team, you will be responsible for designing and implementing GPU/CPU performance optimizations for large-scale TCAD simulations. 
You will also develop distributed-computing solutions to enable efficient simulation of complex nanoscale devices and apply machine learning models to address emerging technology challenges in semiconductor simulation.</p>\n<p>Your responsibilities will include performing numerical analysis of strongly coupled PDE systems to enhance simulation accuracy and speed, collaborating closely with Application Engineering and cross-functional teams to refine and validate solutions, and contributing to the continuous improvement of the Sentaurus product line used by semiconductor companies, research institutions, and universities worldwide.</p>\n<p>As a Staff Engineer, you will have the opportunity to drive innovations that enable next-generation chip design and simulation for global industry leaders, accelerate the development of consumer products (phones, cameras, cars, and more) by advancing simulation technology, and enhance the performance and scalability of the Sentaurus product line, directly influencing semiconductor research and development.</p>\n<p>You will be part of a high-performing, collaborative group of engineers, scientists, and innovators dedicated to advancing simulation technology for semiconductor devices. 
Our team brings together diverse expertise in mathematics, physics, computing, and engineering, fostering a creative, international work environment where every member&#39;s contributions matter.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f50d9e71-65b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/glasgow/staff-engineer-r-and-d-engineering/44408/93437232528","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["numerical methods","high-performance computing","parallel programming","C++","CUDA","Python"],"x-skills-preferred":["applied physics","electrical engineering","mechanical engineering","linear solver methods","discretization methods (FEM, FVM)"],"datePosted":"2026-04-05T13:17:46.886Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Glasgow"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"numerical methods, high-performance computing, parallel programming, C++, CUDA, Python, applied physics, electrical engineering, mechanical engineering, linear solver methods, discretization methods (FEM, FVM)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_baea7339-8a8"},"title":"Sr. Systems Sales Engineer","description":"<p>We&#39;re looking for a Sr. 
Systems Sales Engineer who combines strong technical depth with a passion for solving complex customer challenges.</p>\n<p>You will design end-to-end enterprise solutions, guide customers through technical decision-making, and partner with sales to expand Corsair&#39;s footprint in high-performance computing and AI-driven workloads. The ideal candidate merges advanced systems knowledge with customer-facing expertise to shape solutions, support sales, and accelerate adoption of Corsair&#39;s enterprise platforms.</p>\n<p><strong>Key Responsibilities:</strong></p>\n<ul>\n<li>Platform &amp; Product Strategy: Collaborate with product management, engineering teams, and HPC integration partners to shape the roadmap for workstations and HPC platforms. Identify and evaluate emerging technologies, market trends, and evolving workloads to inform product strategies and unlock new business opportunities.</li>\n</ul>\n<ul>\n<li>New Product Introduction (NPI): Develop and drive NPI readiness plans including technical documentation, sales enablement resources, and customer-facing solution guides. 
Ensure smooth product rollout by aligning engineering, marketing, support, and ecosystem partners.</li>\n</ul>\n<ul>\n<li>AI &amp; Developer Ecosystem Engagement: Align with AI software partners, SDK/tool providers, and developer communities to build value-added integrations and optimize emerging AI workloads on Corsair platforms and provide architecture-level guidance to support AI, ML, and HPC applications.</li>\n</ul>\n<ul>\n<li>Customer Solutions &amp; Technical Leadership: Design system- and application-level solutions based on customer requirements; perform diagnostics, optimization, and version upgrade management and act as a technical subject matter expert for enterprise accounts, providing advanced troubleshooting guidance and deployment support.</li>\n</ul>\n<ul>\n<li>Client Relationship &amp; Escalation Management: Build and maintain strong customer relationships through effective communication, pre-sales support, and solution clarity. Manage hardware escalations by coordinating with internal teams and vendor partners to ensure timely issue resolution and serve as a trusted hardware and technical SME across internal and external engagements.</li>\n</ul>\n<p><strong>Qualifications:</strong></p>\n<ul>\n<li>Bachelor’s degree in Computer Science, Engineering, or related field; equivalent practical experience (10+ years) considered.</li>\n</ul>\n<ul>\n<li>Extensive experience in high-performance computing, workstation architecture, or enterprise systems design.</li>\n</ul>\n<ul>\n<li>Strong background in Solutions Architecture, Sales Engineering, Product Marketing, or ODM platform development.</li>\n</ul>\n<ul>\n<li>Deep knowledge of Linux ecosystems, software build pipelines, and GPU computing technologies (NVIDIA CUDA, AMD ROCm, PCIe, InfiniBand).</li>\n</ul>\n<ul>\n<li>Excellent communication and leadership skills, with the ability to translate complex technical concepts into clear business value.</li>\n</ul>\n<p>For roles that are based at our 
headquarters in Milpitas, CA: The starting base pay for this position is as shown below. The actual base pay is dependent upon a variety of job-related factors such as professional background, training, work experience, location, business needs and market demand. Therefore, in some circumstances, the actual salary could fall outside of this expected range. This pay range is subject to change and may be modified in the future.</p>\n<p>Annual Salary Range $165,000—$180,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_baea7339-8a8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Corsair","sameAs":"https://www.corsair.com/","logo":"https://logos.yubhub.co/corsair.com.png"},"x-apply-url":"https://edix.fa.us2.oraclecloud.com/hcmUI/CandidateExperience/en/sites/CX_1/job/8694","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000—$180,000 USD","x-skills-required":["Linux ecosystems","software build pipelines","GPU computing technologies (NVIDIA CUDA, AMD ROCm, PCIe, InfiniBand)","high-performance computing","workstation architecture","enterprise systems design"],"x-skills-preferred":[],"datePosted":"2026-03-10T12:20:26.667Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Milpitas"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux ecosystems, software build pipelines, GPU computing technologies (NVIDIA CUDA, AMD ROCm, PCIe, InfiniBand), high-performance computing, workstation architecture, enterprise systems 
design","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":180000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ea503adf-fac"},"title":"Research Engineer, Machine Learning","description":"<p><strong>About the Role</strong></p>\n<p>We are seeking a Research Engineer to join our Machine Learning team. As a Research Engineer, you will work on building and optimizing large-scale learning systems that power our open-weight models.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.</li>\n<li>Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.</li>\n<li>Conduct experiments on the latest deep-learning techniques.</li>\n<li>Design, implement and benchmark ML algorithms; write clear, efficient code in Python.</li>\n<li>Deliver prototypes that become production-grade components for Le Chat and our enterprise API.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Master&#39;s or PhD in Computer Science (or equivalent proven track record).</li>\n<li>4 + years working on large-scale ML codebases.</li>\n<li>Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).</li>\n<li>Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops.</li>\n<li>Strong software-design instincts: testing, code review, CI/CD.</li>\n<li>Self-starter, low-ego, collaborative.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive cash salary and equity</li>\n<li>Food: Daily lunch vouchers</li>\n<li>Sport: Monthly contribution to a Gympass subscription</li>\n<li>Transportation: Monthly contribution to a mobility pass</li>\n<li>Health: Full health insurance for you and 
your family</li>\n<li>Parental: Generous parental leave policy</li>\n<li>Visa sponsorship</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ea503adf-fac","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers"},"x-apply-url":"https://jobs.lever.co/mistral/07447e1d-7900-46d4-b61b-186f2f76847f","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","TensorFlow","Distributed training","Deep learning","NLP","LLMs","CUDA","Data pipeline"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:33:33.327Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, TensorFlow, Distributed training, Deep learning, NLP, LLMs, CUDA, Data pipeline"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_797494bd-994"},"title":"Research Engineer, Machine Learning","description":"<p><strong>About Mistral AI</strong></p>\n<p>Mistral AI is a pioneering company that develops and provides high-performance, open-source AI models, products, and solutions.</p>\n<p><strong>Role Summary</strong></p>\n<p>The Research Engineering team at Mistral AI spans Platform (shared infrastructure and clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve.</p>\n<p>As a Research Engineer – ML track, you’ll build and optimize the large-scale learning systems that power our open-weight models. 
Working hand-in-hand with Research Scientists, you’ll either join:</p>\n<ul>\n<li>Platform RE Team: Enhance the shared training framework, data pipelines, and cluster tooling used by every team;</li>\n<li>Embedded RE Team: Sit inside a research squad (Alignment, Pre-training, Multimodal, …) and turn fresh ideas into repeatable, scalable code.</li>\n</ul>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.</li>\n<li>Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.</li>\n<li>Conduct experiments on the latest deep-learning techniques (sparsified 70 B + runs, distributed training on thousands of GPUs).</li>\n<li>Design, implement, and benchmark ML algorithms; write clear, efficient code in Python.</li>\n<li>Deliver prototypes that become production-grade components for Le Chat and our enterprise API.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Master’s or PhD in Computer Science (or equivalent proven track record).</li>\n<li>4 + years working on large-scale ML codebases.</li>\n<li>Hands-on with PyTorch, JAX, or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).</li>\n<li>Experience in deep learning, NLP, or LLMs; bonus for CUDA or data-pipeline chops.</li>\n<li>Strong software-design instincts: testing, code review, CI/CD.</li>\n<li>Self-starter, low-ego, collaborative.</li>\n</ul>\n<p><strong>What We Offer</strong></p>\n<ul>\n<li>Competitive salary and equity.</li>\n<li>Healthcare: Medical/Dental/Vision covered for you and your family.</li>\n<li>Pension: 401K (6% matching).</li>\n<li>PTO: 18 days.</li>\n<li>Transportation: Reimburse office parking charges, or $120/month for public transport.</li>\n<li>Sport: $120/month reimbursement for gym membership.</li>\n<li>Meal stipend: $400 monthly allowance for meals (solution might evolve as we grow 
bigger).</li>\n<li>Visa sponsorship.</li>\n<li>Coaching: we offer BetterUp coaching on a voluntary basis.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_797494bd-994","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers"},"x-apply-url":"https://jobs.lever.co/mistral/bada0014-0f32-4370-b55f-81c5595c7339","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["PyTorch","JAX","TensorFlow","Distributed Training","Deep Learning","NLP","LLMs","CUDA","Data Pipelines"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:33:07.101Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, JAX, TensorFlow, Distributed Training, Deep Learning, NLP, LLMs, CUDA, Data Pipelines"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8d359571-77e"},"title":"Lead Software Engineer, Runtime","description":"<p>As the Technical Lead for the Inference team, you will drive the architecture and optimization of our inference backbone, ensuring high performance, scalability, and efficiency in a dynamic environment.</p>\n<p>The role involves architecting and optimizing the inference for high-volume, low-latency, and high-availability environments, leading the acquisition and automation of benchmarks, collaborating with cross-functional teams, and innovating solutions to enhance our AI-powered applications.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Architecting and optimizing the inference for high-volume, low-latency, and high-availability environments</li>\n<li>Leading the acquisition and 
automation of benchmarks at both micro and macro scales</li>\n<li>Introducing new techniques and tools to improve performance, latency, throughput, and efficiency in our model inference stack</li>\n<li>Building tools to identify bottlenecks and sources of instability, and designing solutions to address them</li>\n<li>Collaborating with machine learning researchers, engineers, and product managers to bring cutting-edge technologies into production</li>\n<li>Optimizing code and infrastructure to maximize hardware utilization and efficiency</li>\n<li>Mentoring and guiding team members, fostering a culture of collaboration, innovation, and continuous learning</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>Extensive experience in C++ and Python, with a strong focus on backend development and performance optimization</li>\n<li>Deep understanding of modern ML architectures and experience with performance optimization for inference</li>\n<li>Proven track record with large-scale distributed systems, particularly performance-critical ones</li>\n<li>Familiarity with PyTorch, TensorRT, CUDA, NCCL</li>\n<li>Strong grasp of infrastructure, continuous integration, and continuous development principles</li>\n<li>Ability to lead and mentor team members, driving projects from concept to implementation</li>\n<li>Results-oriented mindset with a bias towards flexibility and impact</li>\n<li>Passion for staying ahead of emerging technologies and applying them to AI-driven solutions</li>\n<li>Humble attitude, eagerness to help colleagues, and a desire to see the team succeed</li>\n</ul>\n<p>Our Culture</p>\n<p>We&#39;re driven to build a strong company culture and are looking for individuals with solid alignment with the following:</p>\n<ul>\n<li>Reason with rigor</li>\n<li>Are you audacious enough?</li>\n<li>Make our customers succeed</li>\n<li>Ship early and accelerate</li>\n<li>Leave your ego aside</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job 
scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8d359571-77e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/0593f273-44f5-4c20-a84c-0406d5da6a0b","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C++","Python","PyTorch","TensorRT","CUDA","NCCL"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:27:09.420Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, Python, PyTorch, TensorRT, CUDA, NCCL"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_af442d9f-834"},"title":"Senior AI Developer Technology Engineer, Financial Sector","description":"<p>We&#39;re seeking a Senior AI Developer Technology Engineer to help shape the future of financial AI and data analytics by designing and optimizing parallel algorithms on cutting-edge computing platforms. You will research and develop techniques to GPU-accelerate high-performance workloads at the intersection of AI and financial markets. You will work directly with other technical experts in their fields to perform in-depth analysis and optimization of complex AI and HPC workloads to ensure the best possible performance on modern CPU and GPU architectures. You will publish and present discovered optimization techniques in developer blogs or relevant conferences to engage and educate the Developer community. 
You will influence the design of next-generation hardware architectures, software, and programming models in collaboration with research, hardware, system software, libraries, and tools teams at NVIDIA.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Research and develop techniques to GPU-accelerate high-performance workloads at the intersection of AI and financial markets.</li>\n<li>Work directly with other technical experts in their fields to perform in-depth analysis and optimization of complex AI and HPC workloads to ensure the best possible performance on modern CPU and GPU architectures.</li>\n<li>Publish and present discovered optimization techniques in developer blogs or relevant conferences to engage and educate the Developer community.</li>\n<li>Influence the design of next-generation hardware architectures, software, and programming models in collaboration with research, hardware, system software, libraries, and tools teams at NVIDIA.</li>\n</ul>\n<p><strong>Requirements:</strong></p>\n<ul>\n<li>An advanced degree in Computer Science, Computer Engineering, or related computationally focused science degree (or equivalent experience).</li>\n<li>5+ years of relevant work or research experience.</li>\n<li>Direct experience improving the performance of large computational applications used by financial institutions.</li>\n<li>Excellent understanding of linear algebra.</li>\n<li>Programming fluency in C/C++ with a deep understanding of algorithms and software design.</li>\n<li>Hands-on experience with low-level parallel programming, e.g., CUDA, OpenACC, OpenMP, MPI, pthreads, TBB, etc.</li>\n<li>In-depth expertise with CPU/GPU architecture fundamentals.</li>\n<li>Good communication and organization skills, with a logical approach to problem solving, and prioritization skills.</li>\n</ul>\n<p><strong>Ways to stand out from the crowd:</strong></p>\n<ul>\n<li>A Master’s or PhD in a relevant field is highly valued.</li>\n<li>Prior work experience in 
capital markets with exposure to systematic/algorithmic strategies and quantitative trading.</li>\n<li>Experience with parallelizing and optimizing machine learning algorithms like decision trees, time series, and Monte Carlo simulations.</li>\n<li>Deep knowledge of financial data models, pricing/risk simulation algorithms, portfolio optimization, or other financial specific applications/ services.</li>\n<li>Have developed ML/DL techniques in the finance space, such as stock market prediction, fraud detection, portfolio optimization/selection.</li>\n</ul>\n<p>You will also be eligible for equity and benefits.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_af442d9f-834","directApply":true,"hiringOrganization":{"@type":"Organization","name":"NVIDIA","sameAs":"https://nvidia.wd5.myworkdayjobs.com","logo":"https://logos.yubhub.co/nvidia.com.png"},"x-apply-url":"https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-AI-Developer-Technology-Engineer--Financial-Sector_JR2013482","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C/C++","CUDA","OpenACC","OpenMP","MPI","pthreads","TBB","CPU/GPU architecture fundamentals","Linear algebra","Parallel programming"],"x-skills-preferred":["Machine learning","Deep learning","Financial data models","Pricing/risk simulation algorithms","Portfolio optimization"],"datePosted":"2026-03-09T20:46:22.712Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Santa Clara, Remote, New York"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, CUDA, OpenACC, OpenMP, MPI, pthreads, TBB, CPU/GPU architecture fundamentals, Linear algebra, Parallel programming, Machine learning, Deep 
learning, Financial data models, Pricing/risk simulation algorithms, Portfolio optimization"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ce88828f-470"},"title":"Solutions Architect, AI and ML","description":"<p>We are building the world&#39;s leading AI company and are looking for an experienced Cloud Solution Architect to help customers adopt GPU hardware and software, and to build and deploy Machine Learning (ML), Deep Learning (DL), and data analytics solutions on various Cloud Computing Platforms.</p>\n<p>As part of the Solutions Architecture team, we work with some of the most exciting computing hardware and software technologies including the latest breakthroughs in machine learning and data science. A Solutions Architect is the first line of technical expertise between NVIDIA and our customers, so you will engage directly with developers, researchers, and data scientists at some of NVIDIA&#39;s most strategic technology customers as well as work directly with business and engineering teams on product strategy.</p>\n<p><strong>What you will be doing:</strong></p>\n<ul>\n<li>Working with Cloud Service Providers to develop and demonstrate solutions based on NVIDIA&#39;s ML/DL and data science software and hardware technologies</li>\n</ul>\n<ul>\n<li>Build and deploy AI/ML solutions at scale using NVIDIA&#39;s AI software on cloud-based GPU platforms.</li>\n</ul>\n<ul>\n<li>Build custom PoCs for solutions that address customers&#39; critical business needs by applying NVIDIA hardware and software technology</li>\n</ul>\n<ul>\n<li>Partner with Sales Account Managers or Developer Relations Managers to identify and secure new business opportunities for NVIDIA products and solutions for ML/DL and other software solutions</li>\n</ul>\n<ul>\n<li>Prepare and deliver technical content to customers including presentations about purpose-built solutions, workshops about 
NVIDIA products and solutions, etc.</li>\n</ul>\n<ul>\n<li>Conduct regular technical customer meetings for project/product roadmap, feature discussions, and intro to new technologies. Establish close technical ties to the customer to facilitate rapid resolution of customer issues</li>\n</ul>\n<p><strong>What we need to see:</strong></p>\n<ul>\n<li>3+ years of Solutions Engineering (or similar Sales Engineering roles) or equivalent experience</li>\n</ul>\n<ul>\n<li>3+ years of work-related experience in Deep Learning and Machine Learning, including deep learning frameworks TensorFlow or PyTorch, GPU, and CUDA experience extremely helpful.</li>\n</ul>\n<ul>\n<li>BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Statistics, Physics, or other Engineering fields or equivalent experience.</li>\n</ul>\n<ul>\n<li>Established track record of deploying solutions in cloud computing environments including AWS, GCP, or Azure</li>\n</ul>\n<ul>\n<li>Knowledge of DevOps/ML Ops technologies such as Docker/containers, Kubernetes, data center deployments</li>\n</ul>\n<ul>\n<li>Ability to use at least one scripting language (i.e., Python)</li>\n</ul>\n<ul>\n<li>Good programming and debugging skills</li>\n</ul>\n<ul>\n<li>Ability to communicate your ideas/code clearly through documents, presentation etc.</li>\n</ul>\n<p><strong>Ways to stand out from the crowd:</strong></p>\n<ul>\n<li>AWS, GCP or Azure Professional Solution Architect Certification.</li>\n</ul>\n<ul>\n<li>Hands-on experience with NVIDIA GPUs and SDKs (e.g. 
CUDA, RAPIDS, Triton etc.)</li>\n</ul>\n<ul>\n<li>System-level experience specifically GPU-based systems</li>\n</ul>\n<ul>\n<li>Experience with Deep Learning at scale</li>\n</ul>\n<ul>\n<li>Familiarity with parallel programming and distributed computing platforms</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ce88828f-470","directApply":true,"hiringOrganization":{"@type":"Organization","name":"NVIDIA","sameAs":"https://nvidia.wd5.myworkdayjobs.com","logo":"https://logos.yubhub.co/nvidia.com.png"},"x-apply-url":"https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-WA-Redmond/Solutions-Architect--AI-and-ML_JR2000691","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Solutions Engineering","Deep Learning and Machine Learning","TensorFlow or PyTorch","GPU and CUDA experience","BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Statistics, Physics, or other Engineering fields","DevOps/ML Ops technologies","Docker/containers, Kubernetes, data center deployments","Scripting language (i.e., Python)","Good programming and debugging skills","Ability to communicate your ideas/code clearly through documents, presentation etc."],"x-skills-preferred":["AWS, GCP or Azure Professional Solution Architect Certification","Hands-on experience with NVIDIA GPUs and SDKs (e.g. 
CUDA, RAPIDS, Triton etc.)","System-level experience specifically GPU-based systems","Experience with Deep Learning at scale","Familiarity with parallel programming and distributed computing platforms"],"datePosted":"2026-03-09T20:46:16.733Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond, Santa Clara, Seattle"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Solutions Engineering, Deep Learning and Machine Learning, TensorFlow or PyTorch, GPU and CUDA experience, BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Statistics, Physics, or other Engineering fields, DevOps/ML Ops technologies, Docker/containers, Kubernetes, data center deployments, Scripting language (i.e., Python), Good programming and debugging skills, Ability to communicate your ideas/code clearly through documents, presentation etc., AWS, GCP or Azure Professional Solution Architect Certification, Hands-on experience with NVIDIA GPUs and SDKs (e.g. CUDA, RAPIDS, Triton etc.), System-level experience specifically GPU-based systems, Experience with Deep Learning at scale, Familiarity with parallel programming and distributed computing platforms"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cf4fd05b-818"},"title":"Senior Software Engineer, NCCL","description":"<p>We are looking for a highly motivated senior software engineer to join our communication libraries and network software team. The position will be part of a fast-paced crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep Learning.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Design, implement and maintain highly-optimized communication runtimes for Deep Learning frameworks (e.g. 
NCCL for TensorFlow/PyTorch) and HPC programming interfaces (e.g. UCX for MPI/OpenSHMEM) on GPU clusters.</li>\n<li>Participate in and contribute to parallel programming interface specifications like MPI/OpenSHMEM.</li>\n<li>Design, implement and maintain system software that enables interactions among GPUs and interactions between GPUs and other system components.</li>\n<li>Create proof-of-concepts to evaluate and motivate extensions in programming models, new designs in runtimes and new features in hardware.</li>\n</ul>\n<p><strong>Requirements:</strong></p>\n<ul>\n<li>M.S./Ph.D. degree in CS/CE or equivalent experience.</li>\n<li>5+ years of relevant experience.</li>\n<li>Excellent C/C++ programming and debugging skills.</li>\n<li>Strong experience with Linux.</li>\n<li>Expert understanding of computer system architecture and operating systems.</li>\n<li>Experience with parallel programming interfaces and communication runtimes.</li>\n<li>Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.</li>\n</ul>\n<p><strong>Nice to Have:</strong></p>\n<ul>\n<li>Deep understanding of technology and passionate about what you do.</li>\n<li>Experience with CUDA programming and NVIDIA GPUs.</li>\n<li>Knowledge of high-performance networks like InfiniBand, iWARP etc.</li>\n<li>Experience with HPC applications.</li>\n<li>Experience with Deep Learning Frameworks such as PyTorch, TensorFlow, etc.</li>\n<li>Strong collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic matrix environment.</li>\n</ul>\n<p><strong>Benefits:</strong></p>\n<ul>\n<li>Highly competitive salaries.</li>\n<li>Comprehensive benefits package.</li>\n<li>Eligibility for equity.</li>\n<li>Opportunity to work with a world-class engineering team.</li>\n<li>Ability to work in a dynamic matrix environment.</li>\n<li>Opportunity to contribute to cutting-edge technology.</li>\n<li>Flexible 
work arrangements.</li>\n<li>Professional development opportunities.</li>\n</ul>\n<p><strong>How to Apply:</strong></p>\n<p>Applications for this job will be accepted at least until March 13, 2026. NVIDIA uses AI tools in its recruiting processes.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cf4fd05b-818","directApply":true,"hiringOrganization":{"@type":"Organization","name":"NVIDIA","sameAs":"https://nvidia.wd5.myworkdayjobs.com","logo":"https://logos.yubhub.co/nvidia.com.png"},"x-apply-url":"https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-Software-Engineer--GPU-Communications-and-Networking_JR1997186","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C/C++","Linux","Computer system architecture","Operating systems","Parallel programming interfaces","Communication runtimes"],"x-skills-preferred":["CUDA programming","NVIDIA GPUs","High-performance networks","HPC applications","Deep Learning Frameworks"],"datePosted":"2026-03-09T20:44:17.925Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Santa Clara"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, Linux, Computer system architecture, Operating systems, Parallel programming interfaces, Communication runtimes, CUDA programming, NVIDIA GPUs, High-performance networks, HPC applications, Deep Learning Frameworks"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a51375e8-30e"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. 
We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost. Our work spans today&#39;s frontier AI workloads and directly shapes the next generation of accelerators, system architectures, and large-scale AI platforms. We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures. This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale. In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>About the Team</p>\n<p>We build foundational AI infrastructure that enables large-scale training and inference across diverse workloads and rapidly evolving hardware generations. Our work directly shapes how AI systems are designed, deployed, and scaled today and into the future. 
Engineers on this team operate with end-to-end ownership, deep technical rigor, and a strong bias toward real-world impact.</p>\n<p>Microsoft Superintelligence Team</p>\n<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly-ambitious and low ego individual, you’ll fit right in—come and join us as we work on our next generation of models!</p>\n<p>Responsibilities</p>\n<p>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks. Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements. 
Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems. Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps. Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations. Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs. Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams. Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a51375e8-30e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-3/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems 
analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["Experience designing or operating large-scale AI clusters for training or inference","Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications","Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Background in performance modeling and capacity planning for future hardware generations","Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:18:41.443Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, Experience designing or operating large-scale AI clusters for training or inference, Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications, Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Background in performance modeling and capacity planning for future hardware generations, Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture 
reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cd1a0d16-311"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost.</p>\n<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>\n<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>\n<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. 
Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>Microsoft Superintelligence Team\nMicrosoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. 
We’re also fortunate to partner with incredible product teams giving our models the chance to reach billions of users and create immense positive impact.</p>\n<p>Responsibilities\nLead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</p>\n<p>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</p>\n<p>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</p>\n<p>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</p>\n<p>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</p>\n<p>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</p>\n<p>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</p>\n<p>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>\n<p>Qualifications\nBachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR 
equivalent experience.</p>\n<p>Additional or Preferred Qualifications\nMaster’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Strong background in one or more of the following areas: AI accelerator or GPU architectures; Distributed systems and large-scale AI training/inference; High-performance computing (HPC) and collective communications; ML systems, runtimes, or compilers; Performance modeling, benchmarking, and systems analysis; Hardware–software co-design for AI workloads. Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.</p>\n<p>Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders. Experience designing or operating large-scale AI clusters for training or inference. Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications. Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand). Background in performance modeling and capacity planning for future hardware generations. Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews. 
Publications, patents, or open-source contributions in systems, architecture, or ML systems are a plus.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cd1a0d16-311","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-2/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["LLMs, multimodal models, or recommendation systems, and their systems-level implications","Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Performance modeling and capacity planning for future hardware generations","Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:13:30.666Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, AI accelerator or 
GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, LLMs, multimodal models, or recommendation systems, and their systems-level implications, Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Performance modeling and capacity planning for future hardware generations, Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8b82d370-9f7"},"title":"Open Application","description":"<p>At Varjo, we are pioneers in the immersive computing revolution. Our mixed reality solutions redefine realism, creating virtual experiences that match the authenticity of the real world. We are not just a company; we are a team of talents from around the globe, where diversity fuels innovation and drives results.</p>\n<p>Join a multicultural team where English is our daily working language, providing an inclusive and collaborative atmosphere. At Varjo, we believe in the power of different experiences, backgrounds, and ideas coming together to shape the future of immersive computing.</p>\n<p>We are seeking the best and brightest to join us on this exhilarating journey. As we continue to set new standards in technology, we invite you to be a part of our vision for the future. 
When we are done, computers will look nothing like what they do right now.</p>\n<p>Areas and Technologies We Work With:</p>\n<ul>\n<li>C++/C Programming</li>\n<li>Embedded C/C++</li>\n<li>SLAM and Computer Vision (image processing, object recognition and detection)</li>\n<li>Algorithm Design and Optimization</li>\n<li>GPU/CPU Programming</li>\n<li>Unity and Unreal Development</li>\n<li>Sensor Fusion</li>\n<li>ROS (Robot Operating System)</li>\n<li>ADAS (Advanced Driver Assistance Systems)</li>\n<li>3D Reconstruction</li>\n<li>CUDA</li>\n<li>Optics and Cameras</li>\n<li>Audio and Video Streaming</li>\n</ul>\n<p>Apply Now\nSubmit an open application, including your CV, a link to your LinkedIn profile, and details of projects that make you particularly proud. If you have connections at Varjo, feel free to drop some names. Join Varjo and play a crucial role in shaping the future of immersive computing.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8b82d370-9f7","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Varjo","sameAs":"https://apply.workable.com","logo":"https://logos.yubhub.co/j.com.png"},"x-apply-url":"https://apply.workable.com/j/B64F9C1C64","x-work-arrangement":"onsite","x-experience-level":"entry|mid|senior|staff|executive","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C++/C Programming","Embedded C/C++","SLAM and Computer Vision","Algorithm Design and Optimization","GPU/CPU Programming","Unity and Unreal Development","Sensor Fusion","ROS (Robot Operating System)","ADAS (Advanced Driver Assistance Systems)","3D Reconstruction","CUDA","Optics and Cameras","Audio and Video 
Streaming"],"x-skills-preferred":[],"datePosted":"2026-03-08T17:55:55.964Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Helsinki"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++/C Programming, Embedded C/C++, SLAM and Computer Vision, Algorithm Design and Optimization, GPU/CPU Programming, Unity and Unreal Development, Sensor Fusion, ROS (Robot Operating System), ADAS (Advanced Driver Assistance Systems), 3D Reconstruction, CUDA, Optics and Cameras, Audio and Video Streaming"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_11a60d5a-f54"},"title":"Performance Engineer, GPU","description":"<p><strong>About the role:</strong></p>\n<p>Pioneering the next generation of AI requires breakthrough innovations in GPU performance and systems engineering. As a GPU Performance Engineer, you&#39;ll architect and implement the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. You&#39;ll be responsible for maximizing GPU utilization and performance at unprecedented scale, developing cutting-edge optimizations that directly enable new model capabilities and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you&#39;ll implement state-of-the-art techniques from custom kernel development to distributed system architectures. 
Your work will span the entire stack—from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p><strong>You might be a good fit if you:</strong></p>\n<ul>\n<li>Have deep experience with GPU programming and optimization at scale</li>\n<li>Are impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Can navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p><strong>Representative projects:</strong></p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU 
clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p><strong>Deadline to apply:</strong> None. Applications will be reviewed on a rolling basis.</p>\n<p>The expected salary range for this position is:</p>\n<p>Annual Salary: $280,000 - $850,000USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_11a60d5a-f54","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000 - $850,000USD","x-skills-required":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production 
systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"datePosted":"2026-03-08T13:45:05.412Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, custom kernel development, distributed system architectures, low-level tensor core optimizations, orchestrating thousands of GPUs, GPU kernel development, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, ML compilers & frameworks, PyTorch/JAX internals, torch.compile, XLA, custom operators, performance engineering, kernel fusion, memory bandwidth optimization, profiling with Nsight, distributed systems, NCCL, NVLink, collective communication, model parallelism, low-precision, INT8/FP8 quantization, mixed-precision techniques, production systems, large-scale training infrastructure, fault tolerance, cluster orchestration, GPU programming, optimization at scale, custom kernel development, distributed system architectures, low-level tensor core optimizations, orchestrating thousands of GPUs, GPU kernel development, CUDA, 
Triton, CUTLASS, Flash Attention, tensor core optimization, ML compilers & frameworks, PyTorch/JAX internals, torch.compile, XLA, custom operators, performance engineering, kernel fusion, memory bandwidth optimization, profiling with Nsight, distributed systems, NCCL, NVLink, collective communication, model parallelism, low-precision, INT8/FP8 quantization, mixed-precision techniques, production systems, large-scale training infrastructure, fault tolerance, cluster orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7badeaf5-492"},"title":"Hardware / Software CoDesign Engineer","description":"<p><strong>Hardware / Software CoDesign Engineer</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$342K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>OpenAI’s Hardware organization develops silicon and system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI-native silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. 
In addition to delivering production-grade silicon for OpenAI’s supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.</p>\n<p><strong>About the Role</strong></p>\n<p>As an Engineer on our hardware optimization and co-design team, you will co-design future hardware from different vendors for programmability and performance. You will work with our kernel, compiler and machine learning engineers to understand their unique needs related to ML techniques, algorithms, numerical approximations, programming expressivity, and compiler optimizations. You will evangelize these constraints with various vendors to develop and influence future hardware architectures towards efficient training and inference on our models. If you are excited about efficiently distributing a large language model across devices, dealing with and optimizing system-wide/rack-wide networking bottlenecks and eventually tailoring the compute pipe and memory hierarchy of the hardware platform, simulating workloads at different abstractions and working closely with our partners, this is the perfect opportunity!</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Co-design future hardware for programmability and performance with our hardware vendors</li>\n</ul>\n<ul>\n<li>Assist hardware vendors in developing optimal kernels and add support for it in our compiler</li>\n</ul>\n<ul>\n<li>Develop performance estimates for critical kernels for different hardware configurations and drive decisions on compute core and memory hierarchy features</li>\n</ul>\n<ul>\n<li>Build system performance models at different abstraction levels and carry out analysis to drive decisions on scale up, scale out, front end networking</li>\n</ul>\n<ul>\n<li>Work with machine learning engineers, kernel engineers and compiler developers to understand their vision and needs from high performance 
accelerators</li>\n</ul>\n<ul>\n<li>Manage communication and coordination with internal and external partners</li>\n</ul>\n<ul>\n<li>Influence the roadmap of hardware partners to optimize them for OpenAI’s workloads.</li>\n</ul>\n<ul>\n<li>Evaluate potential partners’ accelerators and platforms.</li>\n</ul>\n<ul>\n<li>As the scope of the role and team grows, understand and influence roadmaps for hardware partners for our datacenter networks, racks, and buildings.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>4+ years of industry experience, including experience harnessing compute at scale and optimizing ML platform code to run efficiently on target hardware.</li>\n</ul>\n<ul>\n<li>Strong experience in software/hardware co-design</li>\n</ul>\n<ul>\n<li>Deep understanding of GPU and/or other AI accelerators</li>\n</ul>\n<ul>\n<li>Experience with CUDA, Triton or a related accelerator programming language</li>\n</ul>\n<ul>\n<li>Experience driving Machine Learning accuracy with low precision formats</li>\n</ul>\n<ul>\n<li>Experience with system performance modeling and analysis to optimize ML model deployment</li>\n</ul>\n<ul>\n<li>Strong coding skills in C/C++ and Python</li>\n</ul>\n<ul>\n<li>Are familiar with the fundamentals of deep learning computing and chip architecture/microarchitecture.</li>\n</ul>\n<p><strong>These attributes are nice to have:</strong></p>\n<ul>\n<li>PhD in Computer Science and Engineering with a specialization in Computer Architecture, Parallel Computing. 
Compilers or other Systems</li>\n</ul>\n<ul>\n<li>Strong understanding of LLMs and challenges related to their training and inference</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7badeaf5-492","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/bdbb2292-ecb3-42dc-ba89-65edf397d8f8","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$342K – $555K • Offers Equity","x-skills-required":["software/hardware co-design","GPU and/or other AI accelerators","CUDA, Triton or a related accelerator programming language","Machine Learning accuracy with low precision formats","system performance modeling and analysis to optimize ML model deployment","C/C++ and Python"],"x-skills-preferred":["PhD in Computer Science and Engineering with a specialization in Computer Architecture, Parallel Computing. Compilers or other Systems","Strong understanding of LLMs and challenges related to their training and inference"],"datePosted":"2026-03-06T18:39:51.459Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software/hardware co-design, GPU and/or other AI accelerators, CUDA, Triton or a related accelerator programming language, Machine Learning accuracy with low precision formats, system performance modeling and analysis to optimize ML model deployment, C/C++ and Python, PhD in Computer Science and Engineering with a specialization in Computer Architecture, Parallel Computing. 
Compilers or other Systems, Strong understanding of LLMs and challenges related to their training and inference","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":342000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f2722128-3e2"},"title":"Inference Runtime, Engineering Manager","description":"<p><strong>Inference Runtime, Engineering Manager</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$455K – $555K</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology 
to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineering leader who wants to build and lead the world&#39;s leading AI systems and modeling engineers who take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Lead a team of engineers who are experts in working with distributed systems, with a deep understanding of model architecture and system co-design with research and production teams.</li>\n</ul>\n<ul>\n<li>Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work in an outcome-oriented environment where everyone contributes across layers of the stack, from infra plumbing to performance tuning.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of GPUs to utilize every FLOP and every GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up 
whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 15 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus points if you have worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f2722128-3e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/4f998abb-4510-4bd3-9922-161599625171","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$455K – $555K","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","InfiniBand","MPI","NVLink","HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"x-skills-preferred":["HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"datePosted":"2026-03-06T18:39:15.426Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, InfiniBand, MPI, NVLink, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU optimization, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU 
optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":455000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0e457a06-cee"},"title":"Training Performance Engineer","description":"<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$250K – $445K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health 
and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong> Training Runtime designs the core distributed machine-learning training runtime that powers everything from early research experiments to frontier-scale model runs. With a dual mandate to accelerate researchers and enable frontier scale, we’re building a unified, modular runtime that meets researchers where they are and moves with them up the scaling curve.</p>\n<p>Our work focuses on three pillars: high-performance, asynchronous, zero-copy tensor and optimizer-state-aware data movement; performant, high-uptime, fault-tolerant training frameworks (training loop, state management, resilient checkpointing, deterministic orchestration, and observability); and distributed process management for long-lived, job-specific and user-provided processes.</p>\n<p>We integrate proven large-scale capabilities into a composable, developer-facing runtime so teams can iterate quickly and run reliably at any scale, partnering closely with model-stack, research, and platform teams. 
Success for us is measured by raising both training throughput (how fast models train) and researcher throughput (how fast ideas become experiments and products).</p>\n<p><strong>About the Role</strong> As a Training Performance Engineer, you’ll drive efficiency improvements across our distributed training stack. You’ll analyze large-scale training runs, identify utilization gaps, and design optimizations that push the boundaries of throughput and uptime. This role blends deep systems understanding with practical performance engineering — analyzing GPU kernel performance, collective communication throughput, investigating I/O bottlenecks, and sharding our models so we can train them at massive scale.</p>\n<p>You’ll help ensure that our clusters are running at peak performance, enabling OpenAI to train larger, more capable models with the same compute budget.</p>\n<p>This role is based in San Francisco, CA. We use a hybrid work model of three days in the office per week and offer relocation assistance to new employees.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Profile end-to-end training runs to identify performance bottlenecks across compute, communication, and storage.</li>\n<li>Optimize GPU utilization and throughput for large-scale distributed model training.</li>\n<li>Collaborate with runtime and systems engineers to improve kernel efficiency, scheduling, and collective communication performance.</li>\n<li>Implement model graph transforms to improve end to end throughput.</li>\n<li>Build tooling to monitor and visualize MFU, throughput, and uptime across clusters.</li>\n<li>Partner with researchers to ensure new model architectures scale efficiently during pre-training.</li>\n<li>Contribute to infrastructure decisions that improve reliability and efficiency of large training jobs.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Love optimizing performance and digging into systems to understand how 
every layer interacts.</li>\n<li>Have strong programming skills in Python and C++ (Rust or CUDA a plus).</li>\n<li>Have experience running distributed training jobs on multi-GPU systems or HPC clusters.</li>\n<li>Enjoy debugging complex distributed systems and measuring efficiency rigorously.</li>\n<li>Have exposure to frameworks like PyTorch, JAX, or TensorFlow and an understanding of how large-scale training loops are built.</li>\n<li>Are comfortable collaborating across teams and translating raw profiling data into practical engineering improvements.</li>\n</ul>\n<p><strong>Nice to have:</strong></p>\n<ul>\n<li>Familiarity with NCCL, MPI, or UCX communication libraries.</li>\n<li>Experience with large-scale data loading and checkpointing systems.</li>\n<li>Prior work on training runtime, distributed scheduling, or ML compiler optimization.</li>\n</ul>\n<p><strong>About OpenAI</strong> OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0e457a06-cee","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/6eb386ac-9056-4795-aa79-a27e105faf5c","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$250K – $445K","x-skills-required":["Python","C++","Rust","CUDA","PyTorch","JAX","TensorFlow","NCCL","MPI","UCX"],"x-skills-preferred":["Large-scale data loading and checkpointing systems","Training runtime, distributed scheduling, or ML compiler optimization"],"datePosted":"2026-03-06T18:32:30.509Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, C++, Rust, CUDA, PyTorch, JAX, TensorFlow, NCCL, MPI, UCX, Large-scale data loading and checkpointing systems, Training runtime, distributed scheduling, or ML compiler optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":250000,"maxValue":445000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d5390946-539"},"title":"Software Engineer, Model Inference","description":"<p><strong>Software Engineer, Model Inference</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market 
location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. 
We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineer who wants to take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work alongside researchers to enable advanced research through awesome engineering.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of Azure VMs to utilize every FLOP and every GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 5 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. 
NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus points if you have worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d5390946-539","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/83b6755d-7785-4186-9050-5ef3ad127941","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$295K – $555K • Offers Equity","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","HPC technologies","InfiniBand","MPI","NVLink","Azure VMs","GPU RAM","FLOP"],"x-skills-preferred":["modern ML architectures","intuition for 
optimizing performance","distributed systems","performance-critical distributed systems"],"datePosted":"2026-03-06T18:31:29.482Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, HPC technologies, InfiniBand, MPI, NVLink, Azure VMs, GPU RAM, FLOP, modern ML architectures, intuition for optimizing performance, distributed systems, performance-critical distributed systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2ad46876-f84"},"title":"Software Engineer, Collective Communication","description":"<p><strong>Job Posting</strong></p>\n<p><strong>Software Engineer, Collective Communication</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$380K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>The Workload Networking team is responsible for the collective 
communication stack used in our largest training jobs. Using a combination of C++ and CUDA, we work on novel collective communication techniques that enable efficient training of our flagship models on our largest custom-built supercomputers.</p>\n<p>The models we train are key ingredients to the AI research progress at OpenAI and the field as a whole, and we continually incorporate learnings from our entire research org into our training platform.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Software Engineer, Networking, you will design and implement custom networking collectives that are tightly integrated into our training stack.</p>\n<p>We’re looking for people who have a background in low-level, performance-critical software. Experience with collective communication is a bonus.</p>\n<p>This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Collaborate closely with ML researchers to design and implement efficient collective operations in C++ and CUDA.</li>\n</ul>\n<ul>\n<li>Ensure that our largest training jobs take full advantage of the different network transports used in our supercomputers.</li>\n</ul>\n<ul>\n<li>Work on simulations to inform our future supercomputer network designs.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have written distributed algorithms using RDMA in the past.</li>\n</ul>\n<ul>\n<li>Are comfortable writing low-level, performance-sensitive CPU and/or GPU code.</li>\n</ul>\n<ul>\n<li>Are familiar with network simulation techniques.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. 
We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2ad46876-f84","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/340c0c22-8d8f-4232-b17e-f642b64c25c3","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$380K – $555K • Offers Equity","x-skills-required":["C++","CUDA","RDMA","network simulation techniques","low level performance sensitive CPU and/or GPU code"],"x-skills-preferred":["distributed algorithms","collective communication"],"datePosted":"2026-03-06T18:29:12.241Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, CUDA, RDMA, network simulation techniques, low level performance sensitive CPU and/or GPU code, distributed algorithms, collective communication","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":380000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_989f992b-6b2"},"title":"Software Engineer, Inference – AMD GPU Enablement","description":"<p><strong>Software Engineer, Inference – AMD GPU 
Enablement</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable 
fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. 
You’ll work across the stack - from low-level kernel performance to high-level distributed execution - and collaborate closely with research, infra, and performance teams to ensure our largest models run smoothly on new hardware.</p>\n<p>This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up with a particular focus on advancing inference performance on AMD accelerators.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Own bring-up, correctness and performance of the OpenAI inference stack on AMD hardware.</li>\n</ul>\n<ul>\n<li>Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into a variety of GPU-backed systems.</li>\n</ul>\n<ul>\n<li>Debug and optimize distributed inference workloads across memory, network, and compute layers.</li>\n</ul>\n<ul>\n<li>Validate correctness, performance, and scalability of model execution on large GPU clusters.</li>\n</ul>\n<ul>\n<li>Collaborate with partner teams to design and optimize high-performance GPU kernels for accelerators using HIP, Triton, or other performance-focused frameworks.</li>\n</ul>\n<ul>\n<li>Collaborate with partner teams to build, integrate and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs.</li>\n</ul>\n<p><strong>You can thrive in this role if you:</strong></p>\n<ul>\n<li>Have experience writing or porting GPU kernels using HIP, CUDA, or Triton, and care deeply about low-level performance.</li>\n</ul>\n<ul>\n<li>Are familiar with communication libraries like NCCL/RCCL and understand their role in high-throughput model serving.</li>\n</ul>\n<ul>\n<li>Have worked on distributed inference systems and are comfortable scaling models across fleets of accelerators.</li>\n</ul>\n<ul>\n<li>Enjoy solving end-to-end performance challenges across hardware, system libraries, and orchestration layers.</li>\n</ul>\n<ul>\n<li>Are excited to be part of a small, fast-moving team 
building new infrastructure from first principles.</li>\n</ul>\n<p><strong>Nice to Have:</strong></p>\n<ul>\n<li>Contributions to open-source libraries like RCCL, Triton, or vLLM.</li>\n</ul>\n<ul>\n<li>Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling.</li>\n</ul>\n<ul>\n<li>Prior experience deploying inference on other non-NVIDIA GPU environments.</li>\n</ul>\n<ul>\n<li>Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_989f992b-6b2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/9b79406c-89a8-49bd-8a38-e72db80996e9","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$295K – $555K • Offers Equity","x-skills-required":["GPU kernels","HIP","CUDA","Triton","NCCL/RCCL","distributed inference systems","GPU performance tools","memory/comms profiling"],"x-skills-preferred":["open-source libraries","GPU performance tools","memory/comms 
profiling"],"datePosted":"2026-03-06T18:28:36.084Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU kernels, HIP, CUDA, Triton, NCCL/RCCL, distributed inference systems, GPU performance tools, memory/comms profiling, open-source libraries, GPU performance tools, memory/comms profiling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_46bb9922-091"},"title":"ML Research Engineer - Hardware Codesign","description":"<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$185K – $455K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>OpenAI’s Hardware organization develops silicon and 
system-level solutions designed for the unique demands of advanced AI workloads. The team is responsible for building the next generation of AI silicon while working closely with software and research partners to co-design hardware tightly integrated with AI models. In addition to delivering production-grade silicon for OpenAI’s supercomputing infrastructure, the team also creates custom design tools and methodologies that accelerate innovation and enable hardware optimized specifically for AI.</p>\n<p><strong>About the Role</strong></p>\n<p>We’re seeking a Research-Hardware Codesign Engineer to operate at the boundary between model research and silicon/system architecture. You’ll help shape the numerics, architecture, and technology bets of future OpenAI silicon in collaboration with both Research and Hardware.</p>\n<p>Your work will include debugging gaps between rooflines and reality, writing quantization kernels, derisking numerics via model evals, quantifying system architecture tradeoffs, and implementing novel numeric RTL. This is a hands-on role for people who go looking for hard problems, get to ground truth, and drive it to production. 
Strong prioritization and clear, honest communication are essential.</p>\n<p>Location: San Francisco, CA (Hybrid: 3 days/week onsite)</p>\n<p>Relocation assistance available.</p>\n<p><strong>In this role you will:</strong></p>\n<ul>\n<li>Build on our roofline simulator to track evolving workloads, and deliver analyses that quantify the impact of system architecture decisions and support technology pathfinding.</li>\n</ul>\n<ul>\n<li>Debug gaps between performance simulation and real measurements; clearly communicate root cause, bottlenecks, and invalid assumptions.</li>\n</ul>\n<ul>\n<li>Write emulation kernels for low-precision numerics and lossy compression schemes, and get Research the information they need to trade efficiency with model quality.</li>\n</ul>\n<ul>\n<li>Prototype numerics modules by pushing RTL through synthesis; hand off novel numerics cleanly, or occasionally own an RTL module end-to-end.</li>\n</ul>\n<ul>\n<li>Proactively pull in new ML workloads, prototype them with rooflines and/or functional simulation, and drive initial evaluation of new opportunities or risks.</li>\n</ul>\n<ul>\n<li>Understand the whole picture from ML science to hardware optimization, and slice this end-to-end objective into near-term deliverables.</li>\n</ul>\n<ul>\n<li>Build ad-hoc collaborations across teams with very different goals and areas of expertise, and keep progress unblocked.</li>\n</ul>\n<ul>\n<li>Communicate design tradeoffs clearly with explicit assumptions and confidence levels; produce a trail of evidence that enables confident execution.</li>\n</ul>\n<p><strong>You will thrive in this role if you have:</strong></p>\n<ul>\n<li>An exceptional track record of high-quality technical output, and a bias for shipping a prototype now and iterating later in the absence of clear requirements.</li>\n</ul>\n<ul>\n<li>Strong Python, and C++ or Rust, with a cautious attitude toward correctness and an intuition for clean 
extensibility.</li>\n</ul>\n<ul>\n<li>Experience writing Triton, CUDA, or similar, and an understanding of the resulting mapping of tensor ops to functional units.</li>\n</ul>\n<ul>\n<li>Working knowledge of PyTorch or JAX; experience in large ML codebases is a plus.</li>\n</ul>\n<ul>\n<li>Practical understanding of floating point numerics, the ML tradeoffs of reduced precision, and the current state of the art in model quantization.</li>\n</ul>\n<ul>\n<li>Deep understanding of transformer models, and strong intuition for transformer rooflines and the tradeoffs of sharded training and inference in large-scale ML systems.</li>\n</ul>\n<ul>\n<li>Experience writing RTL (especially for floating point logic) and understanding of PPA tradeoffs is a plus.</li>\n</ul>\n<ul>\n<li>Strong cross-functional communication (e.g. across ML researchers and hardware engineers); ability to slice ambiguous early-incubation ideas into concrete arenas in which progress can be made.</li>\n</ul>\n<p>To comply with U.S. 
export control laws and regulations, candidates for this role may need to meet certain legal status requirements.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_46bb9922-091","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/5931abef-191b-417e-89f1-1d06f00e908c","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$185K – $455K","x-skills-required":["Python","C++","Rust","Triton","CUDA","PyTorch","JAX","Floating point numerics","Model quantization","Transformer models","RTL","PPA tradeoffs"],"x-skills-preferred":["Strong Python","C++ or Rust","Experience writing Triton","CUDA or similar","Working knowledge of PyTorch or JAX","Experience in large ML codebases"],"datePosted":"2026-03-06T18:28:06.437Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, C++, Rust, Triton, CUDA, PyTorch, JAX, Floating point numerics, Model quantization, Transformer models, RTL, PPA tradeoffs, Strong Python, C++ or Rust, Experience writing Triton, CUDA or similar, Working knowledge of PyTorch or JAX, Experience in large ML codebases","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":185000,"maxValue":455000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1ee94df2-ca6"},"title":"Senior Research Engineer/Scientist - On-Device Transformer Models","description":"<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment 
Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Consumer Products</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$380K – $445K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable 
donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>The Future of Computing Research team is an Applied Research team within the Consumer Products group focused on developing new methods and models to support our vision for the future of computing as we advance our mission of building AGI that benefits all of humanity.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Research Engineer/Scientist on the Future of Computing Research team, you will work together with <em>both</em> the best ML researchers in the world and the greatest design talent of our generation to push the frontier of model capabilities.</p>\n<p><strong>This role is based in San Francisco, CA. We follow a hybrid model with 4 days a week in the office and offer relocation assistance to new employees.</strong></p>\n<p><strong>In this role you will:</strong></p>\n<ul>\n<li>Train and evaluate multimodal SoTA models along axes that are important to our vision for future devices.</li>\n<li>Develop novel architectures that improve model performance when scaling the models themselves is not an option.</li>\n<li>Run through the necessary walls to take nascent research capabilities and turn them into capabilities we can build on top of.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have a research background related to developing on-device transformer models.</li>\n<li>Love performance optimization and working with GPU kernel engineers (but you do not need CUDA experience yourself).</li>\n<li>Do rigorous science (rather than vibes-based). 
We need confidence in the experiments we run to move quickly.</li>\n<li>Have already spent time in the weeds teaching models to speak and perceive.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1ee94df2-ca6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/7f9eb43b-423e-43e4-9f42-d14b8ba0f234","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$380K – $445K • Offers Equity","x-skills-required":["research background related to developing on-device transformer models","performance optimization","GPU kernel engineers","rigorous science","teaching models to speak and perceive"],"x-skills-preferred":["CUDA experience","multimodal SoTA models","novel architectures","nascent research capabilities"],"datePosted":"2026-03-06T18:22:44.309Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"research background related to developing on-device transformer models, performance optimization, GPU kernel engineers, rigorous science, teaching 
models to speak and perceive, CUDA experience, multimodal SoTA models, novel architectures, nascent research capabilities","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":380000,"maxValue":445000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3c0a8f07-6b9"},"title":"Principal Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft are looking for a talented Principal Software Engineer at their Beijing office. This role sits at the heart of AI infrastructure development, driving innovation in large-scale AI infrastructure. You will be instrumental in designing and implementing high-performance, massively scalable infrastructure required to deploy frontier LLM models.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Principal Software Engineer on the AI Infrastructure team, you will be responsible for designing and implementing innovative system optimization solutions for internal LLM workloads. You will optimize LLM inference workloads through innovative kernel, algorithm, scheduling, and parallelization technologies. 
You will also continuously develop and maintain internal LLM inference infrastructure, discovering new LLM system optimization needs and innovations.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Keep up to date with and utilize the latest developments in LLM system optimization.</li>\n<li>Take the lead in designing innovative system optimization solutions for internal LLM workloads.</li>\n<li>Optimize LLM inference workloads through innovative kernel, algorithm, scheduling, and parallelization technologies.</li>\n<li>Continuously develop and maintain internal LLM inference infrastructure.</li>\n<li>Discover new LLM system optimization needs and innovations.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>A bachelor&#39;s degree or higher in computer science, engineering, or a related field; a PhD is preferred.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Strong programming skills in Python and C/C++.</li>\n<li>5+ years of experience in machine learning system development and optimization.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>A growth mindset and a passion for learning new things.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary and benefits package.</li>\n<li>Opportunities for professional growth and development.</li>\n<li>Collaborative and dynamic work environment.</li>\n<li>Access to cutting-edge technology and resources.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3c0a8f07-6b9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-28/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"Competitive salary and benefits package","x-skills-required":["Python","C/C++","Machine learning system development and optimization"],"x-skills-preferred":["CUDA kernel development and optimization","Experience in optimizing communication layer / kernels for deep learning systems"],"datePosted":"2026-03-06T07:32:22.965Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Beijing"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, C/C++, Machine learning system development and optimization, CUDA kernel development and optimization, Experience in optimizing communication layer / kernels for deep learning systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_96cf54a4-999"},"title":"Senior Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft are looking for a talented Senior Software Engineer at their Beijing office. This role sits at the heart of AI Infrastructure development, driving innovation in large-scale AI infrastructure. You will be instrumental in designing and implementing high-performance, massively scalable infrastructure required to deploy frontier LLM models.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking brilliant and passionate engineers to work with us on the most interesting and challenging problems of AI Infrastructure development. 
As a Senior Software Engineer, you will be responsible for designing and implementing the high-performance, massively scalable infrastructure required to deploy frontier LLM models through innovative GPU kernel, compression, scheduling and parallelization optimizations.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Keep up to date with and utilize the latest developments in LLM system optimization.</li>\n<li>Discover/solve impactful technical problems, advance state-of-the-art LLM technologies, and translate ideas into production.</li>\n<li>Optimize LLM inference workloads through innovative kernel, algorithm, scheduling, and parallelization technologies.</li>\n<li>Continuously maintain internal LLM inference infrastructure.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>A bachelor&#39;s degree or higher in computer science, engineering, or a related field; a PhD is preferred.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Strong programming skills in Python and C/C++.</li>\n<li>2+ years of experience in machine learning system development and optimization.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>A growth mindset and a passion for learning new things.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary and benefits package.</li>\n<li>Opportunities for professional growth and development.</li>\n<li>Collaborative and dynamic work environment.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_96cf54a4-999","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/senior-software-engineer-64/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"Competitive salary and benefits package","x-skills-required":["Python","C/C++","Machine learning system development and optimization"],"x-skills-preferred":["CUDA kernel development and optimization","Experience in optimizing communication layer / kernels for deep learning systems"],"datePosted":"2026-03-06T07:32:05.702Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Beijing"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, C/C++, Machine learning system development and optimization, CUDA kernel development and optimization, Experience in optimizing communication layer / kernels for deep learning systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7f56054b-d77"},"title":"Principal Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Principal Software Engineer at their Mountain View office. This role sits at the heart of strategic decision-making, driving innovations in AI infrastructure. You&#39;ll work directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Principal Software Engineer, you will be responsible for engaging directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models. 
You will work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications. You will collaborate with external and internal teams to identify new areas for improvement and contribute to innovations that enhance model performance and deployment.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Engage directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models.</li>\n<li>Work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Experience with model compression (quantization, distillation, SVD, low-rank methods).</li>\n<li>Experience in building high-throughput inference serving stacks (continuous batching, KV-cache optimizations, routing).</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Solid experience in GPU inference optimization (CUDA, TensorRT, Triton, or custom GPU kernels).</li>\n<li>Proficiency in profiling tools (Nsight, TensorBoard, PyTorch profiler) and ability to identify CPU/GPU bottlenecks.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary range of USD $139,900 – $274,800 per year.</li>\n<li>Comprehensive benefits package, including health insurance, retirement plan, and paid time off.</li>\n<li>Opportunities for professional growth and development.</li>\n<li>Collaborative and dynamic work environment.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7f56054b-d77","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-24/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","model compression","GPU inference optimization"],"x-skills-preferred":["TensorRT","Triton","CUDA","Nsight","TensorBoard","PyTorch profiler"],"datePosted":"2026-03-06T07:30:21.077Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, model compression, GPU inference optimization, TensorRT, Triton, CUDA, Nsight, TensorBoard, PyTorch profiler","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_961a53f3-82e"},"title":"Senior Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft are looking for a talented Senior Software Engineer at their Suzhou office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising the search engine and online advertising ecosystem. 
You&#39;ll work directly with leadership to shape the company&#39;s direction in the search and advertising markets.</p>\n<p><strong>About the Role</strong></p>\n<p>The R&amp;D of Search Ads aims to build an online advertising ecosystem of users, advertisers, and the search engine. The Bing Search Ads Understanding team is chartered to deliver world-class algorithms using web-scale data. Our mission is to drive user satisfaction, advertiser ROI and Bing revenue. A core challenge is to match advertisers’ “Ad display” and users’ “query” by building an intelligent system that truly understands users’ needs. This is a very hard problem that demands the most advanced AI models and sophisticated engineering systems. Join us to work on projects highly strategic to Bing search in a fun and fast-paced environment!</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Design, develop, and maintain high-performance software in C/C++ and Python, including GPU programming with CUDA, ROCm, or Triton.</li>\n<li>Optimize model inference and training pipelines for speed, throughput, memory efficiency, and cost across GPU platforms.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, CUDA, or ROCm OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Practical experience writing new GPU kernels, going beyond experience of GPU workloads with existing library kernels.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Work on projects highly strategic to Bing search in a fun and fast-paced 
environment.</li>\n<li>Collaborate with platform teams to integrate and tune solutions on emerging accelerator stacks and rapidly evolving toolchains.</li>\n<li>Partner with internal and external stakeholders to translate requirements into scalable performance features and optimizations for state-of-the-art models.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_961a53f3-82e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/senior-software-engineer-76/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C/C++","Python","CUDA","ROCm","Triton","GPU programming","High-performance software development"],"x-skills-preferred":["Deep learning frameworks","Inference optimization","GPU profiling tools"],"datePosted":"2026-03-06T07:29:46.024Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Suzhou"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, Python, CUDA, ROCm, Triton, GPU programming, High-performance software development, Deep learning frameworks, Inference optimization, GPU profiling tools"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a15b11dd-765"},"title":"Principal Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Principal Software Engineer at their Redmond office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising AI technology. 
You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>\n<p><strong>About the Role</strong></p>\n<p>As a Principal Software Engineer, you will be responsible for designing and implementing complex software systems that drive innovation in AI infrastructure. You will work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications. You will collaborate with external and internal teams to identify new areas for improvement and contribute to innovations that enhance model performance and deployment.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Engage directly with key partners to understand, design, and implement complex inferencing capabilities for state-of-the-art deep learning models, driving innovations in AI infrastructure.</li>\n<li>Work with cutting-edge hardware and software stacks to deliver best-in-class inference performance while optimizing for cost, leveraging open-source projects to advance deep learning applications.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Experience with model compression (quantization, distillation, SVD, low-rank methods).</li>\n<li>Experience in building high-throughput inference serving stacks (continuous batching, KV-cache optimizations, routing).</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Solid experience in GPU inference optimization (CUDA, TensorRT, Triton, or custom GPU kernels).</li>\n<li>Proficiency in profiling tools (Nsight, TensorBoard, 
PyTorch profiler) and ability to identify CPU/GPU bottlenecks.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary</li>\n<li>Comprehensive benefits package</li>\n<li>Opportunities for professional growth and development</li>\n<li>Collaborative and dynamic work environment</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a15b11dd-765","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-23/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","model compression","GPU inference optimization","profiling tools"],"x-skills-preferred":["TensorRT","Triton","CUDA","TensorBoard","PyTorch profiler"],"datePosted":"2026-03-06T07:29:39.108Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, model compression, GPU inference optimization, profiling tools, TensorRT, Triton, CUDA, TensorBoard, PyTorch profiler","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_426a1b6c-bb9"},"title":"Senior Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft are looking for a talented Senior Software Engineer at their Beijing office. 
This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising the search engine and online advertising ecosystem. You&#39;ll work directly with leadership to shape the company&#39;s direction in the search engine and online advertising markets.</p>\n<p><strong>About the Role</strong></p>\n<p>The R&amp;D of Search Ads aims to build an online advertising ecosystem of users, advertisers, and the search engine. The Bing Search Ads Understanding team is chartered to deliver world-class algorithms using web-scale data. Our mission is to drive user satisfaction, advertiser ROI and Bing revenue. A core challenge is to match advertisers’ “Ad display” and users’ “query” by building an intelligent system that truly understands users’ needs. This is a very hard problem that demands the most advanced AI models and sophisticated engineering systems. Join us to work on projects highly strategic to Bing search in a fun and fast-paced environment!</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Design, develop, and maintain high-performance software in C/C++ and Python, including GPU programming with CUDA, ROCm, or Triton.</li>\n<li>Optimize model inference and training pipelines for speed, throughput, memory efficiency, and cost across GPU platforms.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, CUDA, or ROCm OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Practical experience writing new GPU kernels, going beyond experience of GPU workloads with existing library kernels.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Cross-team collaboration skills and the 
desire to collaborate in a team of researchers and developers.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Work on projects highly strategic to Bing search in a fun and fast-paced environment.</li>\n<li>Collaborate with platform teams to integrate and tune solutions on emerging accelerator stacks and rapidly evolving toolchains.</li>\n<li>Partner with internal and external stakeholders to translate requirements into scalable performance features and optimizations for state-of-the-art models.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_426a1b6c-bb9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/senior-software-engineer-75/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C/C++","Python","CUDA","ROCm","Triton","GPU programming","High-performance software development"],"x-skills-preferred":["Deep learning frameworks","Inference optimization","Software engineering principles","Architecture design"],"datePosted":"2026-03-06T07:29:11.951Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Beijing"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, Python, CUDA, ROCm, Triton, GPU programming, High-performance software development, Deep learning frameworks, Inference optimization, Software engineering principles, Architecture design"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_356892b1-542"},"title":"Senior Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Senior 
Software Engineer at their Suzhou office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising AI technology. You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking an expert Senior GPU Engineer to join our AI Infrastructure team. In this role, you will architect and optimize the core inference engine that powers our large-scale AI models. You will be responsible for pushing the boundaries of hardware performance, reducing latency, and maximizing throughput for Generative AI and Deep Learning workloads. You will work at the intersection of Deep Learning algorithms and low-level hardware, designing custom operators and building a highly efficient training/inference execution engine from the ground up.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Custom Operator Development: Design and implement highly optimized GPU kernels (CUDA/Triton) for critical deep learning operations (e.g., FlashAttention, GEMM, LayerNorm) to outperform standard libraries.</li>\n<li>Inference Engine Architecture: Contribute to the development of our high-performance inference engine, focusing on graph optimizations, operator fusion, and dynamic memory management (e.g., KV Cache optimization).</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor’s Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Expertise in the CUDA programming model and NVIDIA GPU architectures (specifically Ampere/Hopper).</li>\n<li>Deep understanding of the memory hierarchy (Shared Memory, L2 cache, 
Registers), warp-level primitives, occupancy optimization, and bank conflict resolution.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Proven ability to navigate and modify complex, large-scale codebases (e.g., PyTorch internals, Linux kernel).</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Starting January 26, 2026, Microsoft AI employees who live within a 50-mile commute of a designated Microsoft office in the U.S. or 25-mile commute of a non-U.S., country-specific location are expected to work from the office at least four days per week.</li>\n<li>Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or protected veteran status.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_356892b1-542","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/senior-software-engineer-18/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C","C++","CUDA","NVIDIA GPU architectures","Deep Learning algorithms","low-level hardware"],"x-skills-preferred":["PyTorch","Linux kernel"],"datePosted":"2026-03-06T07:26:27.271Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Suzhou"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, CUDA, NVIDIA GPU architectures, Deep Learning algorithms, low-level hardware, PyTorch, Linux 
kernel"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7c0b682d-d0b"},"title":"Senior Software Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI are looking for a talented Senior Software Engineer at their Beijing office. This role sits at the heart of strategic decision-making, turning market data into actionable insights for a company that&#39;s revolutionising AI technology. You&#39;ll work directly with leadership to shape the company&#39;s direction in the AI market.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking an expert Senior GPU Engineer to join our AI Infrastructure team. In this role, you will architect and optimize the core inference engine that powers our large-scale AI models. You will be responsible for pushing the boundaries of hardware performance, reducing latency, and maximizing throughput for Generative AI and Deep Learning workloads. You will work at the intersection of Deep Learning algorithms and low-level hardware, designing custom operators and building a highly efficient training/inference execution engine from the ground up.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Custom Operator Development: Design and implement highly optimized GPU kernels (CUDA/Triton) for critical deep learning operations (e.g., FlashAttention, GEMM, LayerNorm) to outperform standard libraries.</li>\n<li>Inference Engine Architecture: Contribute to the development of our high-performance inference engine, focusing on graph optimizations, operator fusion, and dynamic memory management (e.g., KV Cache optimization).</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python.</li>\n</ul>\n<p><strong>Technical 
skills:</strong></p>\n<ul>\n<li>Expertise in the CUDA programming model and NVIDIA GPU architectures (specifically Ampere/Hopper).</li>\n<li>Deep understanding of the memory hierarchy (Shared Memory, L2 cache, Registers), warp-level primitives, occupancy optimization, and bank conflict resolution.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Proven ability to navigate and modify complex, large-scale codebases (e.g., PyTorch internals, Linux kernel).</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Starting January 26, 2026, Microsoft AI employees who live within a 50-mile commute of a designated Microsoft office in the U.S. or 25-mile commute of a non-U.S., country-specific location are expected to work from the office at least four days per week.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7c0b682d-d0b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/senior-software-engineer-17/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C","C++","CUDA","Triton","PyTorch","Linux"],"x-skills-preferred":["CMake","pybind11","CI/CD","GPU workloads"],"datePosted":"2026-03-06T07:25:46.472Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Beijing"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, CUDA, Triton, PyTorch, Linux, CMake, pybind11, CI/CD, GPU workloads"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_29f7bd1f-c36"},"title":"Principal Software 
Engineer","description":"<p><strong>Summary</strong></p>\n<p>Microsoft are looking for an experienced Software Engineer to join the Ads Engineering team and help advance the core capabilities of our Ads serving stack. This system powers advertisements across a range of Microsoft services, including Bing Search, MSN, Start.com, and shopping experiences in the Edge browser.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an experienced Software Engineer to join the Ads Engineering team and help advance the core capabilities of our Ads serving stack. This system powers advertisements across a range of Microsoft services, including Bing Search, MSN, Start.com, and shopping experiences in the Edge browser. Our serving stack is a high-scale, low-latency, geo-distributed system with numerous components—including large-scale machine learning inference for ad ranking, real-time bidding infrastructure, and other subsystems supporting diverse ad scenarios. This role offers an exciting opportunity to contribute to the innovation and evolution of a system operating at an exceptional scale and speed. You’ll face a wide variety of technical challenges: from designing new features and optimizing performance down to the millisecond, to building scalable infrastructure for containerized services. You’ll work alongside a passionate, world-class engineering team, own major feature areas, and collaborate globally. 
If you thrive on solving complex technical problems in a dynamic environment, this is the opportunity for you.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Design and develop large-scale, distributed systems—including CPU and GPU ranking platforms—to support real-time processing of millions of ad requests per second with high efficiency, extensibility, diagnosability, reliability, and maintainability.</li>\n<li>Lead architecture discussions, create technical design documents, and drive end-to-end solution planning—identifying system dependencies, performance optimizations, and security/compliance requirements across interconnected services.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>6+ years technical engineering experience with coding in languages including, but not limited to, C++, C#, Python.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Proven experience in designing, implementing, and validating deep learning systems for real-time online inference.</li>\n<li>Solid expertise in optimizing machine learning models for GPUs, including development of custom CUDA kernels for performance-critical workloads.</li>\n<li>Hands-on experience in designing, implementing, and scaling large-scale, distributed online systems with a deep understanding of system architecture.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Proven ability to profile, analyze, and optimize performance and capacity of native C++ systems in complex, high-throughput environments.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary range: $139,900 - $274,800 per year.</li>\n<li>Comprehensive benefits package, including medical, dental, and vision insurance.</li>\n<li>401(k) matching program.</li>\n<li>Paid time off and holidays.</li>\n<li>Opportunities for professional growth and development.</li>\n</ul>\n<p 
style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_29f7bd1f-c36","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/principal-software-engineer-25/","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,900 - $274,800 per year","x-skills-required":["C++","C#","Python","Deep learning","GPU optimization","System architecture"],"x-skills-preferred":["CUDA","Containerized services","Scalable infrastructure"],"datePosted":"2026-03-05T19:49:43.106Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, C#, Python, Deep learning, GPU optimization, System architecture, CUDA, Containerized services, Scalable infrastructure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c041d54a-929"},"title":"Internship Program","description":"<p>Perplexity is excited to announce the Internship Program for exceptional Master’s or PhD students studying Computer Science or Engineering in the UK, enrolled in the 2025-2026 academic year. 
This is an intensive program in which you will work directly with our AI Inference team.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<ul>\n<li>Work with the inference team to improve serving latency and throughput</li>\n<li>Bring up support for new models and state-of-the-art inference optimizations or quantization schemes</li>\n<li>Optimize inference across the entire stack, from GPU kernels to serving endpoints</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>Strong engineering track record with proven knowledge of fundamentals and programming languages (multi-threaded programming, networking, compilation, systems programming, etc)</li>\n<li>Pursuing a Master&#39;s or PhD in Computer Science with a focus on performance-related subjects (HPC, Compilers, Distributed Systems)</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c041d54a-929","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Perplexity","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/perplexity.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/perplexity/79a07e2d-6150-4929-80fe-bbe13a641763","x-work-arrangement":"hybrid","x-experience-level":"entry","x-job-type":"internship","x-salary-range":null,"x-skills-required":["strong engineering track record","proven knowledge of fundamentals and programming languages","pursuing a Master's or PhD in Computer Science"],"x-skills-preferred":["experience with ML frameworks (Torch, JAX)","experience with GPU programming (CUDA, Triton)","experience with High-Performance Computing (OpenMPI)"],"datePosted":"2026-03-04T12:25:51.516Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London"}},"employmentType":"INTERN","occupationalCategory":"Engineering","industry":"Technology","skills":"strong engineering track record, proven knowledge of 
fundamentals and programming languages, pursuing a Master's or PhD in Computer Science, experience with ML frameworks (Torch, JAX), experience with GPU programming (CUDA, Triton), experience with High-Performance Computing (OpenMPI)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7917d1eb-6e2"},"title":"Engineering Manager - Inference","description":"<p>We are looking for an Inference Engineering Manager to lead our AI Inference team. This is a unique opportunity to build and scale the infrastructure that powers Perplexity&#39;s products and APIs, serving millions of users with state-of-the-art AI capabilities.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<p>You will own the technical direction and execution of our inference systems while building and leading a world-class team of inference engineers. Our current stack includes Python, PyTorch, Rust, C++, and Kubernetes.</p>\n<ul>\n<li>Lead and grow a high-performing team of AI inference engineers</li>\n<li>Develop APIs for AI inference used by both internal and external customers</li>\n<li>Architect and scale our inference infrastructure for reliability and efficiency</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>5+ years of engineering experience with 2+ years in a technical leadership or management role</li>\n<li>Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)</li>\n<li>Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7917d1eb-6e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Perplexity","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/perplexity.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/perplexity/2a87ccbf-82ef-4fc7-b1ed-4dd18b11baf9","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$300K - $405K","x-skills-required":["ML systems","inference frameworks","LLM architecture"],"x-skills-preferred":["CUDA","Triton","custom kernel development"],"datePosted":"2026-03-04T12:24:50.159Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"ML systems, inference frameworks, LLM architecture, CUDA, Triton, custom kernel development","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":300000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_46711770-4ab"},"title":"AI Researcher","description":"<p>Perplexity is seeking top-tier AI Research Scientists and Engineers to advance our AI products and capabilities. We&#39;re building the future of AI-powered search and agent experiences through our Sonar models, Deep Research Agent, Comet Agent, and Search products. 
Join us in creating SOTA experiences that handle hundreds of millions of queries and continue to scale rapidly.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<p>Research &amp; Development</p>\n<ul>\n<li>Post-train SOTA LLMs using the latest supervised and reinforcement learning techniques (SFT/DPO/GRPO)</li>\n<li>Leverage our rich query/answer dataset to scale model performance across Sonar, Deep Research, Comet, and Search products</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>Proven experience with large-scale LLMs and Deep Learning systems</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_46711770-4ab","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Perplexity","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/perplexity.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/perplexity/8fe61c73-0daf-4432-a47d-44714c1ef764","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$220K – $485K","x-skills-required":["large-scale LLMs","Deep Learning systems","Python/PyTorch","post-training techniques","reinforcement learning"],"x-skills-preferred":["PhD in Machine Learning, AI, Systems, or related areas","C++/CUDA programming skills","experience building LLM training frameworks","academic publications and research impact","experience with agent systems and multi-step reasoning","background in personalization and preference learning"],"datePosted":"2026-03-04T12:24:44.562Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, Palo Alto"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large-scale LLMs, Deep Learning systems, Python/PyTorch, post-training techniques, reinforcement learning, PhD in Machine Learning, AI, Systems, or related 
areas, C++/CUDA programming skills, experience building LLM training frameworks, academic publications and research impact, experience with agent systems and multi-step reasoning, background in personalization and preference learning","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":220000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2dd5703c-3f0"},"title":"Internship Development of Driving Assistance Systems","description":"<p><strong>What you&#39;ll do</strong></p>\n<p>Your main tasks will fall into one of the following areas:</p>\n<ul>\n<li>Software engineering</li>\n<li>Cutting-edge AI prototyping</li>\n<li>Innovative data analysis</li>\n<li>Research &amp; development</li>\n<li>System validation</li>\n</ul>\n<p><strong>What you need</strong></p>\n<p>Enrollment in a MINT (STEM) or comparable degree program (at least in your 3rd semester, or in a gap year between your bachelor&#39;s and master&#39;s degrees)</p>\n<ul>\n<li>Knowledge of Python or C++ is required</li>\n<li>Experience with one or more of the following is preferred: Matlab, Simulink, CARLA, Linux, PyTorch, CUDA, Computer Vision</li>\n<li>Independent working style</li>\n<li>Ability to work in a team</li>\n<li>Confident use of MS Office</li>\n<li>Very good German or English skills (min. B2)</li>\n</ul>\n<p><strong>Why this matters</strong></p>\n<p>This role keeps a world-championship-winning F1 team running. When equipment fails, races can be lost, so your work directly impacts performance. You&#39;ll develop deep expertise in high-spec facilities and have clear progression into senior facilities management roles. 
The F1 environment means you&#39;ll work with cutting-edge building systems and learn from the best in the industry.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2dd5703c-3f0","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Dr. Ing. h.c. F. Porsche AG","sameAs":"https://jobs.porsche.com","logo":"https://logos.yubhub.co/jobs.porsche.com.png"},"x-apply-url":"https://jobs.porsche.com/index.php?ac=jobad&id=18936","x-work-arrangement":"onsite","x-experience-level":"entry","x-job-type":"internship","x-salary-range":null,"x-skills-required":["Python","C++"],"x-skills-preferred":["Matlab","Simulink","CARLA","Linux","PyTorch","CUDA","Computer Vision"],"datePosted":"2025-12-08T16:35:04.543Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mönsheim"}},"employmentType":"INTERN","occupationalCategory":"Engineering","industry":"Motorsport","skills":"Python, C++, Matlab, Simulink, CARLA, Linux, PyTorch, CUDA, Computer Vision"}]}