{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/xla"},"x-facet":{"type":"skill","slug":"xla","display":"Xla","count":19},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8a3caae4-044"},"title":"Member of Technical Staff - Imagine Model","description":"<p>As a Member of Technical Staff on the Imagine Model Team, you will develop cutting-edge AI experiences beyond text, with a strong focus on enabling high-fidelity understanding and generation across image and video modalities. Responsibilities span data curation, modeling, training, inference serving, and product integration, covering both pretraining and post-training phases. 
You will collaborate closely with product teams to push model frontiers and deliver exceptional end-to-end user experiences.</p>\n<p>Key responsibilities include creating and driving engineering agendas to advance multimodal capabilities, improving data quality through annotation, filtering, augmentation, synthetic generation, captioning, and in-depth data studies, designing evaluation frameworks, metrics, benchmarks, evals, and reward models tailored to image/video/audio quality and coherence, implementing efficient algorithms for state-of-the-art model performance, and developing scalable data collection and processing pipelines for multimodal (primarily image/video-focused) datasets.</p>\n<p>The ideal candidate will have a track record in leading studies that significantly improve neural network capabilities and performance through better data or modeling, experience in data-driven experiment designs, systematic analysis, and iterative model debugging, experience developing or working with large-scale distributed machine learning systems, and ability to deliver optimal end-to-end user experiences.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8a3caae4-044","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com/","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/5051985007","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000 USD","x-skills-required":["data curation","modeling","training","inference serving","product integration","large-scale distributed machine learning systems"],"x-skills-preferred":["SFT","RL","evals","human/synthetic data collection","agentic 
systems","Python","JAX/XLA","PyTorch","Rust/C++","Spark","Ray"],"datePosted":"2026-04-18T15:58:43.641Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA; Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data curation, modeling, training, inference serving, product integration, large-scale distributed machine learning systems, SFT, RL, evals, human/synthetic data collection, agentic systems, Python, JAX/XLA, PyTorch, Rust/C++, Spark, Ray","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_07a3c83e-51e"},"title":"Research Engineer, Infrastructure, Numerics","description":"<p>We&#39;re looking for an infrastructure research engineer to design and build the core systems that enable efficient large-scale model training with a focus on numerics. 
You will focus on improving the numerical foundations of our distributed training stack, from precision formats and kernel optimizations to communication frameworks that make training trillion-parameter models stable, scalable, and fast.</p>\n<p>This role is ideal for someone who thrives at the intersection of research and systems engineering: a builder who understands both the math of optimization and the realities of distributed compute.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Design and optimize distributed training infrastructure for large-scale LLMs, focusing on performance, stability, and reproducibility across multi-GPU and multi-node setups.</li>\n<li>Implement and evaluate low-precision numerics (for example, BF16, MXFP8, NVFP4) to improve efficiency without sacrificing model quality.</li>\n<li>Develop kernels and communication primitives that use hardware-level support for mixed and low-precision arithmetic.</li>\n<li>Collaborate with research teams to co-design model architectures and training recipes that align with emerging numeric formats and stability constraints.</li>\n<li>Prototype and benchmark scaling strategies such as data, tensor, and pipeline parallelism that integrate precision-adaptive computation and quantized communication.</li>\n<li>Contribute to the design of our internal orchestration and monitoring systems to ensure that thousands of distributed experiments can run efficiently and reproducibly.</li>\n<li>Publish and share learnings through internal documentation, open-source libraries, or technical reports that advance the field of scalable AI infrastructure.</li>\n</ul>\n<p>Skills and Qualifications:</p>\n<p>Minimum qualifications:</p>\n<ul>\n<li>Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, physics, robotics, or similar.</li>\n<li>Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures.</li>\n<li>Thrive in a 
highly collaborative environment involving many, different cross-functional partners and subject matter experts.</li>\n<li>A bias for action with a mindset to take initiative to work across different stacks and different teams where you spot the opportunity to make sure something ships.</li>\n<li>Strong engineering skills, ability to contribute performant, maintainable code and debug in complex codebases in areas such as floating-point numerics, low-precision arithmetic, and distributed systems.</li>\n</ul>\n<p>Preferred qualifications , we encourage you to apply if you meet some but not all of these:</p>\n<ul>\n<li>Familiarity with distributed frameworks such as PyTorch/XLA, DeepSpeed, Megatron-LM.</li>\n<li>Experience implementing FP8, INT8, or block-floating point (MX) formats and understanding their numerical trade-offs.</li>\n<li>Prior contributions to open-source deep learning infrastructure such as PyTorch, DeepSpeed, or XLA.</li>\n<li>Publications, patents, or projects related to numerical optimization, communication-efficient training, or systems for large models.</li>\n<li>Experience training and supporting large-scale AI models.</li>\n<li>Track record of improving research productivity through infrastructure design or process improvements.</li>\n</ul>\n<p>Logistics:</p>\n<ul>\n<li>Location: This role is based in San Francisco, California.</li>\n<li>Compensation: Depending on background, skills and experience, the expected annual salary range for this position is $350,000 - $475,000 USD.</li>\n<li>Visa sponsorship: We sponsor visas. 
While we can&#39;t guarantee success for every candidate or role, if you&#39;re the right fit, we&#39;re committed to working through the visa process together.</li>\n<li>Benefits: Thinking Machines offers generous health, dental, and vision benefits, unlimited PTO, paid parental leave, and relocation support as needed.</li>\n</ul>","url":"https://yubhub.co/jobs/job_07a3c83e-51e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Thinking Machines Lab","sameAs":"https://thinkingmachines.ai/","logo":"https://logos.yubhub.co/thinkingmachines.ai.png"},"x-apply-url":"https://job-boards.greenhouse.io/thinkingmachines/jobs/5013937008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000 - $475,000 USD","x-skills-required":["Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, physics, robotics, or similar","Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures","Thriving in a highly collaborative environment involving many, different cross-functional partners and subject matter experts","Strong engineering skills, ability to contribute performant, maintainable code and debug in complex codebases in areas such as floating-point numerics, low-precision arithmetic, and distributed systems","Familiarity with distributed frameworks such as PyTorch/XLA, DeepSpeed, Megatron-LM"],"x-skills-preferred":["Experience implementing FP8, INT8, or block-floating point (MX) formats and understanding their numerical trade-offs","Prior contributions to open-source deep learning infrastructure such as PyTorch, DeepSpeed, or XLA","Publications, patents, or projects related to numerical optimization, communication-efficient training, or systems for large 
models","Experience training and supporting large-scale AI models","Track record of improving research productivity through infrastructure design or process improvements"],"datePosted":"2026-04-18T15:56:14.922Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, physics, robotics, or similar, Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures, Thriving in a highly collaborative environment involving many, different cross-functional partners and subject matter experts, Strong engineering skills, ability to contribute performant, maintainable code and debug in complex codebases in areas such as floating-point numerics, low-precision arithmetic, and distributed systems, Familiarity with distributed frameworks such as PyTorch/XLA, DeepSpeed, Megatron-LM, Experience implementing FP8, INT8, or block-floating point (MX) formats and understanding their numerical trade-offs, Prior contributions to open-source deep learning infrastructure such as PyTorch, DeepSpeed, or XLA, Publications, patents, or projects related to numerical optimization, communication-efficient training, or systems for large models, Experience training and supporting large-scale AI models, Track record of improving research productivity through infrastructure design or process improvements","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":475000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cba88898-896"},"title":"Research Engineer, Infrastructure, Kernels","description":"<p>We&#39;re looking for an infrastructure 
research engineer to design, optimize, and maintain the compute foundations that power large-scale language model training. You will develop high-performance ML kernels (e.g., CUDA, CuTe, Triton), enable efficient low-precision arithmetic, and improve the distributed compute stack that makes training large models possible.</p>\n<p>This role is perfect for an engineer who enjoys working close to the metal and across the research boundary. You&#39;ll collaborate with researchers and systems architects to bridge algorithmic design with hardware efficiency. You&#39;ll prototype new kernel implementations, profile performance across hardware generations, and help define the numerical and parallelism strategies that determine how we scale next-generation AI systems.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design and implement custom ML kernels (e.g., CUDA, CuTe, Triton) for core LLM operations such as attention, matrix multiplication, gating, and normalization, optimized for modern GPU and accelerator architectures.</li>\n<li>Design and think through compute primitives to reduce memory bandwidth bottlenecks and improve kernel compute efficiency.</li>\n<li>Collaborate with research teams to align kernel-level optimizations with model architecture and algorithmic goals.</li>\n<li>Develop and maintain a library of reusable kernels and performance benchmarks that serve as the foundation for internal model training.</li>\n<li>Contribute to infrastructure stability and scalability, ensuring reproducibility, consistency across precision formats, and high utilization of compute resources.</li>\n<li>Document and share insights through internal talks, technical papers, or open-source contributions to strengthen the broader ML systems community.</li>\n</ul>\n<p><strong>Skills and Qualifications</strong></p>\n<p>Minimum qualifications:</p>\n<ul>\n<li>Bachelor’s degree or equivalent experience in computer science, electrical engineering, statistics, machine learning, 
physics, robotics, or similar.</li>\n<li>Strong engineering skills, ability to contribute performant, maintainable code and debug in complex codebases</li>\n<li>Understanding of deep learning frameworks (e.g., PyTorch, JAX) and their underlying system architectures.</li>\n<li>Thrive in a highly collaborative environment involving many, different cross-functional partners and subject matter experts.</li>\n<li>A bias for action with a mindset to take initiative to work across different stacks and different teams where you spot the opportunity to make sure something ships.</li>\n<li>Proficiency in CUDA, CuTe, Triton, or other GPU programming frameworks.</li>\n<li>Demonstrated ability to analyze, profile, and optimize compute-intensive workloads.</li>\n</ul>\n<p>Preferred qualifications:</p>\n<ul>\n<li>Experience training or supporting large-scale language models with tens of billions of parameters or more.</li>\n<li>Track record of improving research productivity through infrastructure design or process improvements.</li>\n<li>Experience developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators.</li>\n<li>Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks.</li>\n<li>Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM).</li>\n<li>Contributions to open-source GPU, ML systems, or compiler optimization projects.</li>\n<li>Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI infrastructure.</li>\n</ul>","url":"https://yubhub.co/jobs/job_cba88898-896","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Thinking Machines 
Lab","sameAs":"https://thinkingmachines.ai/","logo":"https://logos.yubhub.co/thinkingmachines.ai.png"},"x-apply-url":"https://job-boards.greenhouse.io/thinkingmachines/jobs/5013934008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$350,000 - $475,000 USD","x-skills-required":["CUDA","CuTe","Triton","GPU programming frameworks","Deep learning frameworks (e.g., PyTorch, JAX)","Computer science","Electrical engineering","Statistics","Machine learning","Physics","Robotics"],"x-skills-preferred":["Experience training or supporting large-scale language models with tens of billions of parameters or more","Track record of improving research productivity through infrastructure design or process improvements","Experience developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators","Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks","Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM)","Contributions to open-source GPU, ML systems, or compiler optimization projects","Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI infrastructure"],"datePosted":"2026-04-18T15:54:38.498Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"CUDA, CuTe, Triton, GPU programming frameworks, Deep learning frameworks (e.g., PyTorch, JAX), Computer science, Electrical engineering, Statistics, Machine learning, Physics, Robotics, Experience training or supporting large-scale language models with tens of billions of parameters or more, Track record of improving research productivity through infrastructure design or process improvements, Experience 
developing or tuning kernels for deep learning frameworks such as PyTorch, JAX, or custom accelerators, Familiarity with tensor parallelism, pipeline parallelism, or distributed data processing frameworks, Experience implementing low-precision formats (FP8, INT8, block floating point) or contributing to related compiler stacks (e.g., XLA, TVM), Contributions to open-source GPU, ML systems, or compiler optimization projects, Prior research or engineering experience in numerical optimization, communication-efficient training, or scalable AI infrastructure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":350000,"maxValue":475000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fd9dc769-68b"},"title":"Senior Network Engineer, Data Center","description":"<p>We are looking for an experienced Senior Network Engineer to join our rapidly growing Data Center Network Engineering Team. 
In this role, you&#39;ll play a critical part in designing, deploying, and managing the data center network that powers CoreWeave&#39;s AI, Hyperscale GPU cloud.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Quickly adapt to a rapidly changing environment and learn new emerging networking technologies</li>\n<li>Design and deploy consistent large scale networks rapidly, leveraging automation</li>\n<li>Participate in peer reviews, design discussions, and architectural decisions</li>\n<li>Troubleshoot complex problems and support internal and external customers</li>\n<li>Share your knowledge and guide junior team members, fostering a culture of continuous learning and improvement</li>\n<li>Effectively collaborate across teams and areas of expertise</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>7+ years of experience as a Network Engineer</li>\n<li>Experience managing Hyperscale Clos fabrics</li>\n<li>Networking OS Experience - Cumulus Linux, Sonic, Arista EOS, Junos, Cisco NX-OS</li>\n<li>Expertise with major routing protocols (BGP, ISIS, OSPF)</li>\n<li>Experience with encapsulation protocols (EVPN/VXLAN or Geneve)</li>\n<li>Experience automating configuration management (Ansible, SaltStack, home grown)</li>\n<li>Familiarity with scripting/programming (Shell, Python)</li>\n<li>Operational experience with Git</li>\n<li>Large scale network design, maintenance, and operations</li>\n<li>Experience leading complex projects as part of a team</li>\n<li>Deep understanding of TCP/IP</li>\n<li>Strong Linux skills</li>\n</ul>\n<p><strong>Nice-to-Haves</strong></p>\n<ul>\n<li>College education in Computer Science, Electrical Engineering</li>\n<li>Certifications like CCNA, CCNP, JNCIA</li>\n<li>Experience with Kubernetes and CNIs (Calico, Cilium)</li>\n</ul>","url":"https://yubhub.co/jobs/job_fd9dc769-68b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4562279006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,000 to $242,000","x-skills-required":["Network Engineer","Hyperscale Clos fabrics","Cumulus Linux","Sonic","Arista EOS","Junos","Cisco NX-OS","BGP","ISIS","OSPF","EVPN","VXLAN","Geneve","Ansible","SaltStack","Shell","Python","Git","TCP/IP","Linux"],"x-skills-preferred":["Kubernetes","CNIs","Calico","Cilium"],"datePosted":"2026-04-18T15:53:15.470Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA / Richmond, VA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Network Engineer, Hyperscale Clos fabrics, Cumulus Linux, Sonic, Arista EOS, Junos, Cisco NX-OS, BGP, ISIS, OSPF, EVPN, VXLAN, Geneve, Ansible, SaltStack, Shell, Python, Git, TCP/IP, Linux, Kubernetes, CNIs, Calico, Cilium","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dc17980d-461"},"title":"Research Engineer, Interpretability","description":"<p>JOB TITLE: Research Engineer, Interpretability \\n LOCATION: San Francisco, CA \\n DEPARTMENT: AI Research &amp; Engineering \\n \\n JOB DESCRIPTION: \\n \\n When you see what modern language models are capable of, do you wonder, &quot;How do these things work? 
How can we trust them?&quot; \\n \\n The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe. \\n \\n Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs. \\n \\n More resources to learn about our work: \\n - Our research blog - covering advances including Monosemantic Features and Circuits \\n - An Introduction to Interpretability from our research lead, Chris Olah \\n - The Urgency of Interpretability from CEO Dario Amodei \\n - Engineering Challenges Scaling Interpretability - directly relevant to this role \\n - 60 Minutes segment - Around 8:07, see a demo of tooling our team built \\n - New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems \\n \\n Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model: \\n - Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips \\n - Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot; \\n - Performance: Like all LLM work, we push up against the limits of hardware and software. Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them and moving ahead given rapidly evolving research and safety mission \\n \\n The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. 
As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI. \\n \\n RESPONSIBILITIES: \\n - Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application \\n - Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams \\n - Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers \\n - Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations \\n - Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling \\n \\n YOU MAY BE A GOOD FIT IF YOU: \\n - Have 5-10+ years of experience building software \\n - Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python \\n - Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. diving into new layers of the stack to find bottlenecks \\n - Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions \\n - Prefer fast-moving collaborative projects to extensive solo efforts \\n - Are curious about interpretability research and its role in AI safety (though no research experience is required!) \\n - Care about the societal impacts and ethics of your work \\n - Are comfortable working closely with researchers, translating research needs into engineering solutions. 
\\n \\n STRONG CANDIDATES MAY ALSO HAVE EXPERIENCE WITH: \\n - Optimizing the performance of large-scale distributed systems \\n - Language modeling fundamentals with transformers \\n - High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization \\n - Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs \\n - Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges \\n \\n REPRESENTATIVE PROJECTS: \\n - Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations \\n - Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them \\n - Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization \\n - Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research) \\n \\n ROLE SPECIFIC LOCATION POLICY: \\n - This role is based in the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis. \\n \\n The annual compensation range for this role is listed below. \\n For sales roles, the range provided is the role&#39;s On Target Earnings (\\&quot;OTE\\&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role. 
\\n Annual Salary:\\\\$315,000-\\\\$560,000 USD</p>","url":"https://yubhub.co/jobs/job_dc17980d-461","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","High Performance LLM optimization","memory management","compute efficiency","parallelism strategies","inference throughput optimization"],"x-skills-preferred":["large-scale distributed systems","language modeling fundamentals","transformers","collaborating closely with researchers","building tooling to support research teams"],"datePosted":"2026-04-18T15:53:01.682Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, High Performance LLM optimization, memory management, compute efficiency, parallelism strategies, inference throughput optimization, large-scale distributed systems, language modeling fundamentals, transformers, collaborating closely with researchers, building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ac14f361-5b8"},"title":"Network Engineer, Capacity and Efficiency","description":"<p>We&#39;re looking for a network 
engineer who thinks in metrics first. You will use deep networking knowledge and rigorous measurement to figure out where and how bandwidth, latency, and dollars are being used, find optimization opportunities and land them.</p>\n<p>You will instrument spine-leaf fabrics, BGP, SDN overlays, and cloud interconnect products well enough to build them. You&#39;ll own the observability and efficiency surface for Anthropic&#39;s network: from per-flow telemetry on backbone routers, to QoS policy on cross-region links carrying inference traffic, to cost attribution that tells a research team exactly what their checkpoint sync is costing.</p>\n<p>This is a hands-on IC role. You&#39;ll write code (Python, Go), build dashboards, model capacity, and ship config changes to production routers. You&#39;ll also influence architecture: when the data says a traffic pattern is pathological, you&#39;ll be in the room root causing it and fixing it.</p>\n<p>You will be working across three areas: network telemetry and observability, traffic engineering, and cost modeling and attribution. 
We expect you to be strong in at least two and willing to grow into the third.</p>","url":"https://yubhub.co/jobs/job_ac14f361-5b8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5177143008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["BGP","ECMP","VXLAN/EVPN","QoS","L1/optical basics","CSP networking model","network telemetry","flow export","eBPF-based host-side instrumentation","Python","Go"],"x-skills-preferred":["SRE experience for large-scale network infrastructure","cloud provider's networking team or a cloud networking product team","AI/ML infrastructure traffic patterns","HPC fabrics","traffic engineering for large backbones"],"datePosted":"2026-04-18T15:52:49.160Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"BGP, ECMP, VXLAN/EVPN, QoS, L1/optical basics, CSP networking model, network telemetry, flow export, eBPF-based host-side instrumentation, Python, Go, SRE experience for large-scale network infrastructure, cloud provider's networking team or a cloud networking product team, AI/ML infrastructure traffic patterns, HPC fabrics, traffic engineering for large backbones"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a1ab2590-2b4"},"title":"Staff Security Engineer, Network Security","description":"<p>We are seeking a Staff Network Security Engineer to architect the defense of our global backbone, edge, and 
massive-scale GPU clusters. You will move beyond configuring firewalls to engineering security into the network fabric itself, utilizing telemetry, automation, and deep protocol analysis.</p>\n<p>As a Staff Network Security Engineer, you will:</p>\n<p>Unravel and tackle network security challenges at an exhilarating global scale. Collaborate with exceptional network architects and engineers building the backbone infrastructure for the AI revolution. Enjoy the freedom and support to experiment, innovate, and significantly shape our approach to securing the underlay and overlay of our cloud.</p>\n<p>In this role, your work will include: Conducting architecture reviews, protocol analysis, and design assessments to proactively identify and fix vulnerabilities in our backbone and data center fabrics. Developing robust, repeatable frameworks for network security automation (CoPP, ACL generation, Route Filtering) that make it easy for teams to build securely from day one. Collaborating closely with Network Engineering teams to integrate security checks and validation seamlessly into their CI/CD and config-push pipelines. Crafting clear, practical security guidance and documentation that empowers engineers to deploy secure routing policies and topologies. Actively participating in architectural discussions regarding peering, transit, and traffic engineering, providing insightful security recommendations. 
Occasionally, &#39;drawing the owl&#39; - figuring out innovative solutions for securing massive throughput environments while navigating ambiguous situations.</p>\n<p>You will be working with a talented team of network engineers, security experts, and AI researchers to build and deploy a highly scalable and secure cloud infrastructure.</p>\n<p>If you are passionate about network security, cloud computing, and AI, and enjoy working in a fast-paced, dynamic environment, we encourage you to apply for this exciting opportunity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a1ab2590-2b4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4620164006","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$188,000 to $275,000","x-skills-required":["core network protocols (BGP, OSPF/IS-IS, TCP/IP)","deep knowledge of how they function at the packet level","network automation or security tooling in Go, Python, or similar modern languages","collaborating with network architects to implement secure designs in multi-vendor environments","Linux networking internals, control plane protection, and managing infrastructure as code"],"x-skills-preferred":["hyperscale network architectures (CLOS fabrics, MPLS/EVPN, VXLAN)","hardware-level networking security (SmartNICs/DPUs, connectX)","flow-based telemetry analysis","internet routing security standards (RPKI, MANRS)","advanced DDoS mitigation strategies at the network layer","Infiniband and RoCE"],"datePosted":"2026-04-18T15:52:43.431Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, 
WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"core network protocols (BGP, OSPF/IS-IS, TCP/IP), deep knowledge of how they function at the packet level, network automation or security tooling in Go, Python, or similar modern languages, collaborating with network architects to implement secure designs in multi-vendor environments, Linux networking internals, control plane protection, and managing infrastructure as code, hyperscale network architectures (CLOS fabrics, MPLS/EVPN, VXLAN), hardware-level networking security (SmartNICs/DPUs, connectX), flow-based telemetry analysis, internet routing security standards (RPKI, MANRS), advanced DDoS mitigation strategies at the network layer, Infiniband and RoCE","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":188000,"maxValue":275000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_97212bdf-dd1"},"title":"Research Engineer, Interpretability","description":"<p>Job Title: Research Engineer, Interpretability</p>\n<p>About the Role:</p>\n<p>When you see what modern language models are capable of, do you wonder, &quot;How do these things work? 
How can we trust them?&quot; The Interpretability team at Anthropic is working to reverse-engineer how trained models work because we believe that a mechanistic understanding is the most robust way to make advanced systems safe.</p>\n<p>Think of us as doing &quot;neuroscience&quot; of neural networks using &quot;microscopes&quot; we build - or reverse-engineering neural networks like binary programs.</p>\n<p>More resources to learn about our work:</p>\n<ul>\n<li>Our research blog - covering advances including Monosemantic Features and Circuits</li>\n</ul>\n<ul>\n<li>An Introduction to Interpretability from our research lead, Chris Olah</li>\n</ul>\n<ul>\n<li>The Urgency of Interpretability from CEO Dario Amodei</li>\n</ul>\n<ul>\n<li>Engineering Challenges Scaling Interpretability - directly relevant to this role</li>\n</ul>\n<ul>\n<li>60 Minutes segment - Around 8:07, see a demo of tooling our team built</li>\n</ul>\n<ul>\n<li>New Yorker article - what it&#39;s like to work on one of AI&#39;s hardest open problems</li>\n</ul>\n<p>Even if you haven&#39;t worked on interpretability before, the infrastructure expertise is similar to what&#39;s needed across the lifecycle of a production language model:</p>\n<ul>\n<li>Pretraining: Training dictionary learning models looks a lot like model pretraining - creating stable, performant training jobs for massively parameterized models across thousands of chips</li>\n</ul>\n<ul>\n<li>Inference: Interp runs a customized inference stack. Day-to-day analysis requires services that allow editing a model&#39;s internal activations mid-forward-pass - for example, adding a &quot;steering vector&quot;</li>\n</ul>\n<ul>\n<li>Performance: Like all LLM work, we push up against the limits of hardware and software. 
Rather than squeezing the last 0.1%, we are focused on finding bottlenecks, fixing them, and moving ahead, given our rapidly evolving research and safety mission</li>\n</ul>\n<p>The science keeps scaling - and it&#39;s now applied directly in safety audits on frontier models, with real deadlines. As our research has matured, engineering and infrastructure have become a bottleneck. Your work will have a direct impact on one of the most important open problems in AI.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build and maintain the specialized inference and training infrastructure that powers interpretability research - including instrumented forward/backward passes, activation extraction, and steering vector application</li>\n</ul>\n<ul>\n<li>Resolve scaling and efficiency bottlenecks through profiling, optimization, and close collaboration with peer infrastructure teams</li>\n</ul>\n<ul>\n<li>Design tools, abstractions, and platforms that enable researchers to rapidly experiment without hitting engineering barriers</li>\n</ul>\n<ul>\n<li>Help bring interpretability research into production safety audits - with real deadlines and high reliability expectations</li>\n</ul>\n<ul>\n<li>Work across the stack - from model internals and accelerator-level optimization to user-facing research tooling</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 5-10+ years of experience building software</li>\n</ul>\n<ul>\n<li>Are highly proficient in at least one programming language (e.g., Python, Rust, Go, Java) and productive with Python</li>\n</ul>\n<ul>\n<li>Are extremely curious about unfamiliar domains; can quickly learn and put that knowledge to work, e.g. 
diving into new layers of the stack to find bottlenecks</li>\n</ul>\n<ul>\n<li>Have a strong ability to prioritize the most impactful work and are comfortable operating with ambiguity and questioning assumptions</li>\n</ul>\n<ul>\n<li>Prefer fast-moving collaborative projects to extensive solo efforts</li>\n</ul>\n<ul>\n<li>Are curious about interpretability research and its role in AI safety (though no research experience is required!)</li>\n</ul>\n<ul>\n<li>Care about the societal impacts and ethics of your work</li>\n</ul>\n<ul>\n<li>Are comfortable working closely with researchers, translating research needs into engineering solutions.</li>\n</ul>\n<p>Strong candidates may also have experience with:</p>\n<ul>\n<li>Optimizing the performance of large-scale distributed systems</li>\n</ul>\n<ul>\n<li>Language modeling fundamentals with transformers</li>\n</ul>\n<ul>\n<li>High Performance LLM optimization: memory management, compute efficiency, parallelism strategies, inference throughput optimization</li>\n</ul>\n<ul>\n<li>Working hands-on in a mainstream ML stack - PyTorch/CUDA on GPUs or JAX/XLA on TPUs</li>\n</ul>\n<ul>\n<li>Collaborating closely with researchers and building tooling to support research teams; or directly performed research with complex engineering challenges</li>\n</ul>\n<p>Representative Projects:</p>\n<ul>\n<li>Building Garcon, a tool that allows researchers to easily instrument LLMs to extract internal activations</li>\n</ul>\n<ul>\n<li>Designing and optimizing a pipeline to efficiently collect petabytes of transformer activations and shuffle them</li>\n</ul>\n<ul>\n<li>Profiling and optimizing ML training jobs, including multi-GPU parallelism and memory optimization</li>\n</ul>\n<ul>\n<li>Building a steered inference system that applies targeted interventions to model internals at scale (conceptually similar to Golden Gate Claude but for safety research)</li>\n</ul>\n<p>Role Specific Location Policy:</p>\n<ul>\n<li>This role is based in 
the San Francisco office; however, we are open to considering exceptional candidates for remote work on a case-by-case basis.</li>\n</ul>\n<p>The annual compensation range for this role is listed below.</p>\n<p>For sales roles, the range provided is the role&#39;s On Target Earnings (&quot;OTE&quot;) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $315,000-$560,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_97212bdf-dd1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4980430008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$315,000-$560,000 USD","x-skills-required":["Python","Rust","Go","Java","PyTorch","CUDA","JAX","XLA","Transformers","High Performance LLM optimization","Memory management","Compute efficiency","Parallelism strategies","Inference throughput optimization"],"x-skills-preferred":["Optimizing the performance of large-scale distributed systems","Language modeling fundamentals","Collaborating closely with researchers and building tooling to support research teams"],"datePosted":"2026-04-18T15:46:01.999Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Rust, Go, Java, PyTorch, CUDA, JAX, XLA, Transformers, High Performance LLM optimization, Memory management, Compute efficiency, Parallelism strategies, Inference throughput optimization, Optimizing the performance of large-scale distributed systems, Language modeling 
fundamentals, Collaborating closely with researchers and building tooling to support research teams","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":315000,"maxValue":560000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9cd0420a-99d"},"title":"Network Engineer, Capacity and Efficiency","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re looking for a network engineer who thinks in metrics first. You will use deep networking knowledge and rigorous measurement to figure out where and how bandwidth, latency, and dollars are being used, find optimization opportunities, and land them.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Build the network observability stack. Design and deploy telemetry pipelines (sFlow/IPFIX, gNMI streaming, eBPF host probes) that turn packet counters into per-flow, per-tenant, per-workload cost and utilization data. Own the SLIs for backbone and DCN fabric health.</li>\n<li>Hunt for efficiency. Analyze inter-region traffic patterns, identify hot links and stranded capacity, and quantify the dollar impact. Build the models that tell us whether we should buy more capacity or move the workload.</li>\n<li>Own QoS and traffic engineering. Design and operate traffic classification, marking, and shaping across the backbone. Make sure bulk checkpoint transfers don’t starve latency-sensitive inference, and that we’re not paying premium cross-region rates for traffic that could take the cheap path.</li>\n<li>Drive cost attribution. Tie network spend (egress, interconnect ports, transit, optical leases) back to the teams and workloads that generate it. Make network cost a first-class input to capacity planning and workload placement decisions.</li>\n<li>Influence decisions you don&#39;t own. 
A large fraction of this role is convincing other teams to act on what your data shows: making the case to research that a traffic pattern needs to change, to finance that an interconnect tranche is worth buying, to Systems Networking that a QoS policy needs rewriting.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Have 5+ years operating large-scale production networks: data center fabrics (spine-leaf, Clos), backbone/WAN, or hyperscaler-adjacent environments.</li>\n<li>Are genuinely fluent across the stack: BGP (including policy and communities), ECMP, VXLAN/EVPN or equivalent overlays, QoS (DSCP, queuing, shaping), and L1/optical basics (DWDM, coherent, LAGs).</li>\n<li>Know at least one major CSP’s networking model deeply, whether AWS (VPC, TGW, Direct Connect, Gateway Load Balancer) or GCP (Shared VPC, Interconnect, Cloud Router, Network Connectivity Center), and understand how their overlays interact with physical underlays.</li>\n<li>Have built or operated network telemetry at scale: streaming telemetry (gNMI/OpenConfig), flow export (sFlow, IPFIX, NetFlow), or eBPF-based host-side instrumentation. You can reason about sampling, cardinality, and storage tradeoffs.</li>\n<li>Comfortable writing Python or Go to build tooling (telemetry pipelines, infrastructure-as-code, config management and automation for network devices) that you’ll ship to production.</li>\n<li>Think quantitatively by default. You reach for a notebook or a Grafana query before you reach for an opinion, and you can turn messy counter data into a defensible cost model.</li>\n<li>Communicate crisply. 
You can explain to a finance partner why a 10% egress reduction matters, and to a network engineer why a specific ECMP imbalance is costing real money.</li>\n</ul>\n<p><strong>Nice to Have</strong></p>\n<ul>\n<li>SRE experience for large-scale network infrastructure: designing for reliability, defining SLOs/SLIs for network services, capacity planning with error budgets, and incident response for network-impacting outages at scale.</li>\n<li>Background on a cloud provider&#39;s networking team or a cloud networking product team: building or operating the interconnect, backbone, or SDN control plane from the provider side, not just consuming it as a customer.</li>\n<li>Familiarity with AI/ML infrastructure traffic patterns like collective communication (all-reduce, all-gather), checkpoint/weight transfer, inference serving, and how these workloads stress networks differently than traditional ones in terms of burst behavior, flow synchronization, and bandwidth symmetry.</li>\n<li>Experience with HPC fabrics like InfiniBand, RoCE v2, lossless Ethernet, or custom high-radix topologies and an understanding of how job placement, congestion management, and adaptive routing interact at scale.</li>\n<li>Background in traffic engineering for large backbones and the operational judgment to know when TE is worth the complexity.</li>\n<li>Hands-on time with multi-cloud connectivity: cross-cloud peering, private interconnect products, and the billing models that come with them.</li>\n<li>Experience building cost/chargeback systems for shared infrastructure, or FinOps exposure in a large cloud environment.</li>\n</ul>\n<p><strong>Representative Projects</strong></p>\n<ul>\n<li>Build a per-flow cost attribution pipeline that traces every byte of cross-region egress back to the team and workload that generated it</li>\n<li>Design QoS policy for the private backbone that prevents bulk checkpoint transfers from starving inference traffic</li>\n<li>Model whether it&#39;s cheaper to buy an 
additional 1.6Tb interconnect tranche or to re-route traffic through existing capacity</li>\n<li>Instrument DCN fabric utilization with streaming telemetry and build the Grafana dashboards that become the team&#39;s source of truth for network observability</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9cd0420a-99d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5177143008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["network engineering","network observability","telemetry pipelines","sFlow/IPFIX","gNMI streaming","eBPF host probes","BGP","ECMP","VXLAN/EVPN","QoS","DSCP","queuing","shaping","L1/optical basics","DWDM","coherent","LAGs","AWS","GCP","cloud networking","infrastructure-as-code","config management","automation","Python","Go","quantitative analysis","cost modeling","communication"],"x-skills-preferred":["SRE","cloud provider's networking team","cloud networking product team","AI/ML infrastructure traffic patterns","HPC fabrics","traffic engineering","multi-cloud connectivity","cost/chargeback systems","FinOps"],"datePosted":"2026-04-18T15:42:29.482Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"network engineering, network observability, telemetry pipelines, sFlow/IPFIX, gNMI streaming, eBPF host probes, BGP, ECMP, VXLAN/EVPN, QoS, DSCP, queuing, shaping, L1/optical basics, DWDM, coherent, LAGs, AWS, GCP, cloud networking, infrastructure-as-code, config management, automation, 
Python, Go, quantitative analysis, cost modeling, communication, SRE, cloud provider's networking team, cloud networking product team, AI/ML infrastructure traffic patterns, HPC fabrics, traffic engineering, multi-cloud connectivity, cost/chargeback systems, FinOps"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_28107212-128"},"title":"Performance Engineer, GPU","description":"<p>As a GPU Performance Engineer at Anthropic, you will be responsible for architecting and implementing the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. You will maximize GPU utilization and performance at unprecedented scale, develop cutting-edge optimizations that directly enable new model capabilities, and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you will implement state-of-the-art techniques from custom kernel development to distributed system architectures. 
Your work will span the entire stack, from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Architect and implement foundational systems that power Claude</li>\n<li>Maximize GPU utilization and performance at unprecedented scale</li>\n<li>Develop cutting-edge optimizations that directly enable new model capabilities</li>\n<li>Dramatically improve inference efficiency</li>\n<li>Implement state-of-the-art techniques from custom kernel development to distributed system architectures</li>\n<li>Work at the intersection of hardware and software</li>\n<li>Span the entire stack, from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Deep experience with GPU programming and optimization at scale</li>\n<li>Impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Ability to navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Experience with GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, 
collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p>Representative projects:</p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p>Note: The salary range for this position is $280,000-$850,000 USD per year.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_28107212-128","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000-$850,000 USD per year","x-skills-required":["GPU programming","optimization at scale","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","PyTorch/JAX internals","torch.compile","XLA","custom operators","kernel fusion","memory bandwidth 
optimization","profiling with Nsight","NCCL","NVLink","collective communication","model parallelism","INT8/FP8 quantization","mixed-precision techniques","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:40:11.758Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, PyTorch/JAX internals, torch.compile, XLA, custom operators, kernel fusion, memory bandwidth optimization, profiling with Nsight, NCCL, NVLink, collective communication, model parallelism, INT8/FP8 quantization, mixed-precision techniques, large-scale training infrastructure, fault tolerance, cluster orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_28b01ce3-8a3"},"title":"Member of Technical Staff - Imagine Model","description":"<p>As a Member of Technical Staff on the Imagine Model Team, you will develop cutting-edge AI experiences beyond text, with a strong focus on enabling high-fidelity understanding and generation across image and video modalities, while also incorporating audio where it enhances visual content.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Create and drive engineering agendas to advance multimodal capabilities, with emphasis on image and video generation, editing, understanding, controllable/long-horizon synthesis, agentic planning, RL training, and world simulation (including audio integration for richer video experiences).</li>\n<li>Improve data quality through annotation, 
filtering, augmentation, synthetic generation, captioning, and in-depth data studies, particularly for visual and audio data.</li>\n<li>Design evaluation frameworks, metrics, benchmarks, evals, and reward models tailored to image/video/audio quality and coherence.</li>\n<li>Implement efficient algorithms for state-of-the-art model performance, including real-time inference, distillation, and scalable serving for visual content.</li>\n<li>Develop scalable data collection and processing pipelines for multimodal (primarily image/video-focused) datasets.</li>\n<li>Collaborate cross-functionally to integrate AI solutions into production and rapidly iterate based on user feedback.</li>\n</ul>\n<p>Basic Qualifications:</p>\n<ul>\n<li>Track record in leading studies that significantly improve neural network capabilities and performance through better data or modeling.</li>\n<li>Experience in data-driven experiment designs, systematic analysis, and iterative model debugging.</li>\n<li>Experience developing or working with large-scale distributed machine learning systems.</li>\n<li>Ability to deliver optimal end-to-end user experiences.</li>\n<li>Hands-on contributor with initiative, excellence, strong work ethic, prioritization skills, and excellent communication.</li>\n</ul>\n<p>Preferred Skills and Experience:</p>\n<ul>\n<li>Experience in SFT, RL, evals, human/synthetic data collection, or agentic systems.</li>\n<li>Proficiency in Python, JAX/XLA, PyTorch, Rust/C++, Spark, Ray, and related large-scale frameworks.</li>\n<li>Domain expertise in multimodal applications such as graphics engines, rendering techniques, image/video understanding and generation, world models, real-time simulation, or controllable/long-horizon visual content creation (audio/speech processing or music/audio generation experience is a plus where it supports video).</li>\n<li>Experience with agentic RL training, controllable/long-horizon generation, or multimodal agents that reason and act across 
modalities (especially in visual domains).</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_28b01ce3-8a3","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com/","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/5051985007","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000 USD","x-skills-required":["Python","JAX/XLA","PyTorch","Rust/C++","Spark","Ray","multimodal applications","agentic systems","RL training","controllable/long-horizon generation"],"x-skills-preferred":["SFT","evals","human/synthetic data collection","graphics engines","rendering techniques","image/video understanding and generation","world models","real-time simulation"],"datePosted":"2026-04-18T15:24:12.847Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA; Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, JAX/XLA, PyTorch, Rust/C++, Spark, Ray, multimodal applications, agentic systems, RL training, controllable/long-horizon generation, SFT, evals, human/synthetic data collection, graphics engines, rendering techniques, image/video understanding and generation, world models, real-time simulation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_540ce49c-271"},"title":"Member of Technical Staff - Multimodal Understanding","description":"<p><strong>About the Role</strong></p>\n<p>You will join the multimodal team to push toward superhuman multimodal 
intelligence. Advance understanding and generation across modalities (image, video, audio, and text), spanning the full stack: data curation/acquisition, tokenizer training, large-scale pre-training, post-training/alignment, infrastructure/scaling, evaluation, tooling/demos, and end-to-end product experiences.</p>\n<p>Collaborate cross-functionally with pre-training, post-training, reasoning, data, applied, and product teams to deliver frontier capabilities in multimodal reasoning, world modeling, tool use, agentic behaviors, and interactive human-AI collaboration. Contribute to building models that can see, hear, reason about, and interact with the world in real time at unprecedented levels.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design, build, and optimize large-scale distributed systems for multimodal pre-training, post-training, inference, data processing, and tokenization at web/petabyte scale.</li>\n<li>Develop high-throughput pipelines for data acquisition, preprocessing, filtering, generation, decoding, loading, crawling, visualization, and management (images, videos, audio + text).</li>\n<li>Advance multimodal capabilities including spatial-temporal compression, cross-modal alignment, world modeling, reasoning, emergent abilities, audio/image/video understanding &amp; generation, real-time video processing, and noisy data handling.</li>\n<li>Drive data quality and studies: curation (human/synthetic), filtering techniques, analysis, and scalable pipelines to support trillion-parameter models.</li>\n<li>Create evaluation frameworks, internal benchmarks, reward models, and metrics that capture real-world usage, failure modes, interactive dynamics, and human-AI synergy.</li>\n<li>Innovate on algorithms, modeling approaches, hardware/software/algorithm co-design, and scaling paradigms for state-of-the-art performance.</li>\n<li>Build research tooling, user-friendly interfaces, prototypes/demos, full-stack applications, and enable rapid iteration 
based on feedback.</li>\n<li>Work across the stack (pre-training → SFT/RL/post-training) to enable reasoning, tool calling, agentic behaviors, orchestration, and seamless real-time interactions.</li>\n</ul>\n<p><strong>Basic Qualifications</strong></p>\n<ul>\n<li>Hands-on experience with multimodal pre-training, post-training, or fine-tuning (vision, audio, video, or cross-modal).</li>\n<li>Expert-level proficiency in Python (core language), with strong experience in at least one of: JAX / PyTorch / XLA.</li>\n<li>Proven track record building or optimizing large-scale distributed ML systems (training/inference optimization, GPU utilization, multi-GPU/TPU setups, hardware co-design).</li>\n<li>Deep experience designing and running data pipelines at scale: curation, filtering, generation, quality studies, especially for noisy/real-world multimodal data.</li>\n<li>Strong fundamentals in evaluation design, benchmarks, reward modeling, or RL techniques (particularly for interactive/agentic behaviors).</li>\n<li>Proactive self-starter who thrives in high-intensity environments and is passionate about pushing multimodal AI frontiers.</li>\n<li>Willingness to own end-to-end initiatives and do whatever it takes to deliver breakthrough user experiences.</li>\n</ul>\n<p><strong>Preferred Skills and Experience</strong></p>\n<ul>\n<li>Experience leading major improvements in model capabilities through better data, modeling, algorithms, or scaling.</li>\n<li>Familiarity with state-of-the-art in multimodal LLMs, scaling laws, tokenizers, compression techniques, reasoning, or agentic systems.</li>\n<li>Proficiency in Rust and/or C++ for performance-critical components.</li>\n<li>Hands-on work with large-scale orchestration tools such as Spark, Ray, or Kubernetes.</li>\n<li>Background building full-stack tooling: performant interfaces, real-time research demos/apps, or end-to-end product ownership.</li>\n<li>Passion for end-to-end user experience in interactive, real-time 
multimodal AI systems.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_540ce49c-271","directApply":true,"hiringOrganization":{"@type":"Organization","name":"xAI","sameAs":"https://www.xai.com","logo":"https://logos.yubhub.co/xai.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/xai/jobs/5111374007","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$180,000 - $440,000 USD","x-skills-required":["Multimodal pre-training","Post-training","Fine-tuning","Python","JAX","PyTorch","XLA","Large-scale distributed ML systems","Data pipelines","Evaluation design","Benchmarks","Reward modeling","RL techniques"],"x-skills-preferred":["State-of-the-art in multimodal LLMs","Scaling laws","Tokenizers","Compression techniques","Reasoning","Agentic systems","Rust","C++","Spark","Ray","Kubernetes","Full-stack tooling"],"datePosted":"2026-04-18T15:23:05.119Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Multimodal pre-training, Post-training, Fine-tuning, Python, JAX, PyTorch, XLA, Large-scale distributed ML systems, Data pipelines, Evaluation design, Benchmarks, Reward modeling, RL techniques, State-of-the-art in multimodal LLMs, Scaling laws, Tokenizers, Compression techniques, Reasoning, Agentic systems, Rust, C++, Spark, Ray, Kubernetes, Full-stack tooling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":440000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8980bea0-e13"},"title":"Senior Software Engineer, Java - Network Team","description":"<p>We are revolutionizing the way 
large networks are managed. Our Forward Enterprise platform delivers a vendor-agnostic &#39;digital twin&#39; of the network, based on a mathematical model. The platform scales to support hundreds of thousands of network devices, whether cloud, hybrid cloud, or on-prem. It serves as a single source of truth for the network, enabling network operators to instantly verify security posture, accelerate troubleshooting, avoid outages, and modernize network management.</p>\n<p>Our team is currently seeking experienced Java developers to work as part of our Network team. As a senior software engineer, you will help bring the best ideas from the software development world into the networking industry.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Contribute to our code base, systems and software architecture as a member of our engineering team.</li>\n<li>Help create and optimize network device models for different device vendors and protocols.</li>\n<li>Help create infrastructure needed to configure, collect and test network devices.</li>\n<li>Work with peers who are experts in Networking, Distributed Systems, Big Data and Search.</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>5+ years of work experience in software development</li>\n<li>3+ years of work experience with Java</li>\n<li>BS in Computer Science or related degree</li>\n<li>Solid software engineering experience with large code bases</li>\n<li>Basic understanding of networking and TCP/IP.</li>\n<li>Strong verbal and written communication skills.</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Working knowledge of how switches, routers, firewalls or load balancers work.</li>\n<li>Experience working with networking protocols such as BGP/OSPF/IS-IS, IPv4/IPv6, MPLS, VLAN, VXLAN, etc.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8980bea0-e13","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Forward Networks","sameAs":"https://www.forward.net/","logo":"https://logos.yubhub.co/forward.net.png"},"x-apply-url":"https://job-boards.greenhouse.io/forwardnetworks/jobs/5967053003","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Java","networking","TCP/IP","software engineering","large code bases"],"x-skills-preferred":["switches","routers","firewalls","load balancers","BGP/OSPF/IS-IS","IPv4/IPv6","MPLS","VLAN","VXLAN"],"datePosted":"2026-04-17T12:36:15.038Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru, India"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Java, networking, TCP/IP, software engineering, large code bases, switches, routers, firewalls, load balancers, BGP/OSPF/IS-IS, IPv4/IPv6, MPLS, VLAN, VXLAN"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ec71d906-a19"},"title":"IT Security Network Engineer - Sr Staff","description":"<p>Synopsys is seeking a motivated and passionate Sr. Staff Network Engineer to join our dynamic global network engineering team. As a Sr. Staff Network Engineer, you will be responsible for leading the design, architecture, and implementation of complex network solutions to meet evolving business requirements and objectives. You will maintain network standards, policies, and best practices to ensure consistency, reliability, and security across global operations.</p>\n<p>Your responsibilities will include designing, configuring, deploying, monitoring, and troubleshooting production network infrastructure and associated services, including LAN, WAN, Data Center, remote access, wireless, and firewall security. 
You will develop and maintain comprehensive documentation for network configurations, processes, security policies, and procedures.</p>\n<p>You will define and lead networking strategy aligned with business growth, automation goals, and scalable AI infrastructure. You will implement automation tools and AI-driven solutions to optimize network operations and reduce manual intervention. You will utilize automation tools to standardize deployment configurations and environments for consistency and efficiency.</p>\n<p>You will identify and resolve issues related to network and security infrastructure performance, efficiency, and availability. You will communicate effectively with stakeholders at all levels, translating complex technical topics into accessible insights.</p>\n<p>As a Sr. Staff Network Engineer, you will enable secure, scalable, and highly available network infrastructure supporting Synopsys&#39; global business operations. You will drive innovation through the adoption of automation and AI, enhancing network efficiency and reducing manual overhead. You will champion best practices and standards that elevate network reliability, security, and performance.</p>\n<p>You will mentor and empower team members, fostering a culture of learning, collaboration, and technical excellence. You will contribute to strategic initiatives that align networking capabilities with company growth and emerging technologies. 
You will enhance stakeholder engagement through clear communication and the delivery of impactful solutions.</p>\n<p>You will influence the direction of Synopsys&#39; network architecture, ensuring it remains at the forefront of industry advancements.</p>\n<p>To be successful in this role, you will need:</p>\n<ul>\n<li>Bachelor&#39;s degree in Computer Science, Information Technology, Engineering, or related field</li>\n<li>8+ years of experience in network engineering, with several years in senior staff or architecture-oriented roles</li>\n<li>Expertise in designing and supporting large-scale enterprise networks</li>\n<li>Deep understanding of network security systems and protocols (IPSec, IKE, GRE, TACACS, RADIUS, 802.1x, NAC, EAP-TLS)</li>\n<li>Expert-level knowledge of networking fundamentals: TCP/IP, switching/routing, BGP, OSPF, DMVPN, EVPN/VXLAN, SD-WAN, MPLS</li>\n<li>Proficiency in wireless standards and technologies: 802.11a/b/g/n/ac/ax, MIMO, beamforming, channel planning</li>\n<li>Experience with network configuration management and automation tools (Python, Ansible, OpenStack, Terraform, REST API)</li>\n<li>Extensive hands-on experience with Cisco, Aruba, Zscaler, Palo Alto Networks equipment and platforms</li>\n<li>Ability to analyze raw packet data to uncover network performance issues (latency, packet loss, application errors)</li>\n<li>Ability to work after hours for project and maintenance needs</li>\n<li>Program management skills to align cross-functional teams and drive results</li>\n<li>Understanding of AI, machine learning, LLMs, MCP technologies</li>\n<li>Relevant certifications (PCNSE, ZIA/ZPA, CCNP, CCDP, CCIE, CISSP, CCDE, CEH, Security+ or equivalent experience) are a plus</li>\n</ul>\n<p>As a Sr. Staff Network Engineer, you will be a forward-thinking and innovative individual, always seeking to improve and streamline processes. You will be a collaborative leader and mentor, passionate about empowering others and sharing expertise. 
You will be an excellent communicator, able to bridge technical and non-technical audiences. You will be adaptable and resilient, thriving in dynamic environments. You will be strategic and detail-oriented, balancing big-picture vision with hands-on execution. You will be committed to integrity, excellence, leadership, and passion: core Synopsys values.</p>\n<p>You will join a dynamic global network engineering team responsible for designing and supporting all network services, including LAN, WAN, Data Center, remote access, wireless, and firewall security. This collaborative group is focused on delivering secure, scalable, and resilient network solutions, embracing automation and AI to drive continuous improvement. As a mentor and leader, you will help shape a culture of innovation and learning within the team.</p>\n<p>We offer a comprehensive range of health, wellness, and financial benefits to cater to your needs. Our total rewards include both monetary and non-monetary offerings. Your recruiter will provide more details about the salary range and benefits during the hiring process.</p>\n<p>At Synopsys, we want talented people of every background to feel valued and supported to do their best work. 
Synopsys considers all applicants for employment without regard to race, color, religion, national origin, gender, sexual orientation, age, military veteran status, or disability.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ec71d906-a19","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/sunnyvale/it-security-network-engineer-sr-staff/44408/92616532928","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$158,000-$236,000","x-skills-required":["network engineering","large-scale enterprise networks","network security systems","TCP/IP","switching/routing","BGP","OSPF","DMVPN","EVPN/VXLAN","SD-WAN","MPLS","wireless standards","802.11a/b/g/n/ac/ax","MIMO","beamforming","channel planning","network configuration management","automation tools","Python","Ansible","OpenStack","Terraform","REST API","Cisco","Aruba","Zscaler","Palo Alto Networks","raw packet data analysis","program management","AI","machine learning","LLMs","MCP technologies"],"x-skills-preferred":[],"datePosted":"2026-04-05T13:22:18.109Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"network engineering, large-scale enterprise networks, network security systems, TCP/IP, switching/routing, BGP, OSPF, DMVPN, EVPN/VXLAN, SD-WAN, MPLS, wireless standards, 802.11a/b/g/n/ac/ax, MIMO, beamforming, channel planning, network configuration management, automation tools, Python, Ansible, OpenStack, Terraform, REST API, Cisco, Aruba, Zscaler, Palo Alto Networks, raw packet data analysis, program management, AI, machine 
learning, LLMs, MCP technologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":158000,"maxValue":236000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1338e7d1-ad8"},"title":"Cloud Machine Learning Engineer","description":"<p>At Hugging Face, we&#39;re on a journey to democratize good AI. We are building the fastest growing platform for AI builders. We are looking for a Cloud Machine Learning Engineer responsible for helping build machine learning solutions used by millions, leveraging cloud technologies.</p>\n<p>You will work on integrating Hugging Face&#39;s open-source libraries, such as Transformers and Diffusers, with major cloud platforms or managed SaaS solutions. This role involves bridging and integrating models with different cloud providers, ensuring the models meet expected performance, designing and developing easy-to-use, secure, and robust developer experiences and APIs for our users, writing technical documentation, examples and notebooks to demonstrate new features, and sharing and advocating your work and the results with the community.</p>\n<p>The ideal candidate will have deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets, expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding, strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents, experience in building MLOps pipelines for containerizing models and solutions with Docker, familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful, ability to write clear documentation, examples and definitions, and work across the full product development lifecycle, and bonus experience with Svelte &amp; TailwindCSS.</p>\n<p>We are actively working to build a culture that values 
diversity, equity, and inclusivity. We are intentionally building a workplace where people feel respected and supported—regardless of who you are or where you come from. We believe this is foundational to building a great company and community.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1338e7d1-ad8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Hugging Face","sameAs":"https://huggingface.co/"},"x-apply-url":"https://apply.workable.com/j/A3879724CD","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets","Expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding","Strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents","Experience in building MLOps pipelines for containerizing models and solutions with Docker","Familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful"],"x-skills-preferred":["Bonus experience with Svelte & TailwindCSS"],"datePosted":"2026-03-10T11:32:29.200Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets, Expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding, Strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents, Experience in building MLOps pipelines for containerizing models 
and solutions with Docker, Familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful, Bonus experience with Svelte & TailwindCSS"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_af4253f8-57e"},"title":"Cloud Machine Learning Engineer - EMEA remote","description":"<p>At Hugging Face, we&#39;re on a journey to democratize good AI. We are building the fastest growing platform for AI builders with over 11 million users who collectively shared over 2M models, 700k datasets &amp; 600k apps. Our open-source libraries have more than 600k stars on GitHub. Hugging Face has become the most popular, community-driven project for training, sharing, and deploying the most advanced machine learning models.</p>\n<p>We are looking for a Cloud Machine Learning Engineer responsible for helping build machine learning solutions used by millions, leveraging cloud technologies. You will work on integrating Hugging Face&#39;s open-source libraries, such as Transformers and Diffusers, with major cloud platforms or managed SaaS solutions.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Bridging and integrating 🤗 transformers/diffusers models with different cloud providers.</li>\n<li>Ensuring the above models meet the expected performance</li>\n<li>Designing &amp; Developing easy-to-use, secure, and robust Developer Experiences &amp; APIs for our users.</li>\n<li>Writing technical documentation, examples and notebooks to demonstrate new features</li>\n<li>Sharing &amp; Advocating your work and the results with the community.</li>\n</ul>\n<p>About You\nYou&#39;ll enjoy working on this team if you have experience with and interest in deploying machine learning systems to production and building great developer experiences. 
The ideal candidate will have skills including:</p>\n<ul>\n<li>Deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets</li>\n<li>Expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding</li>\n<li>Strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents.</li>\n<li>Experience in building MLOps pipelines for containerizing models and solutions with Docker</li>\n<li>Familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful</li>\n<li>Ability to write clear documentation, examples and definitions, and work across the full product development lifecycle</li>\n<li>Bonus: Experience with Svelte &amp; TailwindCSS</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_af4253f8-57e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Hugging Face","sameAs":"https://huggingface.co/"},"x-apply-url":"https://apply.workable.com/j/0CE9E806CC","x-work-arrangement":"remote","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets","Expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding","Strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents.","Experience in building MLOps pipelines for containerizing models and solutions with Docker","Familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful"],"x-skills-preferred":["Svelte & 
TailwindCSS"],"datePosted":"2026-03-10T11:32:17.703Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Deep experience building with Hugging Face Technologies, including Transformers, Diffusers, Accelerate, PEFT, Datasets, Expertise in Deep Learning Framework, preferably PyTorch, optionally XLA understanding, Strong knowledge of cloud platforms like AWS and services like Amazon SageMaker, EC2, S3, CloudWatch and/or Azure and GCP equivalents., Experience in building MLOps pipelines for containerizing models and solutions with Docker, Familiarity with Typescript, Rust, and MongoDB, Kubernetes are helpful, Svelte & TailwindCSS"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c1ce6197-2e2"},"title":"Data Center Networking Specialist - Sr Staff","description":"<p><strong>Engineer the Future with Us</strong></p>\n<p>We are seeking a motivated and passionate Network Engineer to join our team. 
As a Data Center Networking Specialist - Sr Staff, you will be responsible for leading the design and implementation of scalable, high-performance network architectures that integrate data center and cloud environments.</p>\n<p><strong>What You&#39;ll Be Doing:</strong></p>\n<ul>\n<li>Leading the design and implementation of scalable, high-performance network architectures that integrate data center and cloud environments.</li>\n<li>Developing strategy and architecture for DC engineering, influencing standards and technology direction across data centers and clouds.</li>\n<li>Program managing complex cross-functional projects, aligning stakeholders and driving collaboration to achieve strategic goals.</li>\n<li>Owning the end-to-end data center design lifecycle, from blueprinting through day-2 operations, and creating repeatable templates for broader team support.</li>\n<li>Implementing automation tools and AI-driven solutions to streamline network operations and improve efficiency.</li>\n<li>Establishing and tracking key performance indicators (KPIs) to measure network efficiency and effectiveness, and to drive continuous improvement.</li>\n<li>Mentoring and guiding junior engineers, fostering a culture of knowledge sharing and continuous learning.</li>\n<li>Staying current with industry trends and emerging technologies in data center and cloud networking, evaluating their impact on operations.</li>\n<li>Developing and maintaining comprehensive documentation for network configurations, processes, and procedures.</li>\n<li>Communicating effectively with stakeholders at all levels, conveying complex concepts to both technical and non-technical audiences.</li>\n</ul>\n<p><strong>The Impact You Will Have:</strong></p>\n<ul>\n<li>Driving innovation in data center and cloud network architectures, ensuring Synopsys remains at the forefront of technology.</li>\n<li>Enhancing operational efficiency and scalability through automation and AI-driven solutions.</li>\n<li>Shaping 
the standards and technology direction for global data center initiatives.</li>\n<li>Improving network reliability, security, and performance for mission-critical business applications.</li>\n<li>Fostering a high-performing, collaborative engineering culture through mentorship and leadership.</li>\n<li>Enabling seamless integration of emerging technologies and adapting strategies to evolving business needs.</li>\n<li>Reducing manual intervention and operational risks, supporting robust and resilient infrastructure.</li>\n<li>Ensuring documentation and processes are streamlined, accessible, and actionable for all stakeholders.</li>\n</ul>\n<p><strong>What You&#39;ll Need:</strong></p>\n<ul>\n<li>BS in Engineering or related field; MS preferred.</li>\n<li>10+ years of experience in network engineering/data center infrastructure with significant production ownership of large-scale networks.</li>\n<li>Expert-level proficiency in DC and service provider grade protocols: BGP, ISIS, MPLS, Segment Routing, EVPN, VXLAN, QoS, traffic engineering.</li>\n<li>High proficiency with Cisco ACI (Application Centric Infrastructure) solutions, including Multi-Pod/Multi-Site architecture and ACI automation (APIC REST/SDK/Ansible collections).</li>\n<li>Proficiency in Python and Ansible for automation.</li>\n<li>Experience integrating telemetry (gNMI/streaming) and flow analytics (NetFlow/IPFIX/sFlow) with platforms like Elastic/Grafana.</li>\n<li>Demonstrated leadership and mentoring abilities, with successful program management experience.</li>\n<li>Strong organizational skills, capable of managing multiple projects and priorities.</li>\n<li>Good to have: Relevant certifications (e.g., CCNP, CCIE – Data Center, AWS certified solution architect), DC operations experience, strong security mindset, ITIL change/incident familiarity.</li>\n</ul>\n<p><strong>Who You Are:</strong></p>\n<ul>\n<li>Self-driven and proactive, consistently finding ways to contribute beyond your 
charter.</li>\n<li>Excellent communicator, able to convey complex concepts to both technical and non-technical audiences.</li>\n<li>Collaborative team player, inspiring and mentoring others.</li>\n<li>Adaptable and agile, thriving in a fast-paced, evolving environment.</li>\n<li>Innovative thinker, always seeking to improve and future-proof network solutions.</li>\n<li>Organized, detail-oriented, and results-focused.</li>\n<li>Strong leader with a passion for continuous learning and development.</li>\n</ul>\n<p><strong>The Team You’ll Be A Part Of:</strong></p>\n<p>The Synopsys Network and Datacenter team is responsible for the design, vision, and driving major initiatives for Data Center and Cloud. You’ll be joining a diverse, high-impact group of experts who collaborate across global sites to deliver scalable, secure, and innovative network solutions. The team values knowledge sharing, continuous improvement, and a culture of excellence, empowering each member to make a meaningful impact on Synopsys’ technological landscape.</p>\n<p><strong>Rewards and Benefits:</strong></p>\n<p>We offer a comprehensive range of health, wellness, and financial benefits to cater to your needs. Our total rewards include both monetary and non-monetary offerings. 
Your recruiter will provide more details about the salary range and benefits during the hiring process.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c1ce6197-2e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/data-center-networking-specialist-sr-staff/44408/91926832256","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["BGP","ISIS","MPLS","Segment Routing","EVPN","VXLAN","QoS","traffic engineering","Cisco ACI","Python","Ansible","telemetry","flow analytics","Elastic","Grafana"],"x-skills-preferred":[],"datePosted":"2026-03-09T11:10:35.299Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru, Karnataka, India"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"BGP, ISIS, MPLS, Segment Routing, EVPN, VXLAN, QoS, traffic engineering, Cisco ACI, Python, Ansible, telemetry, flow analytics, Elastic, Grafana"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_11a60d5a-f54"},"title":"Performance Engineer, GPU","description":"<p><strong>About the role:</strong></p>\n<p>Pioneering the next generation of AI requires breakthrough innovations in GPU performance and systems engineering. As a GPU Performance Engineer, you&#39;ll architect and implement the foundational systems that power Claude and push the frontiers of what&#39;s possible with large language models. 
You&#39;ll be responsible for maximizing GPU utilization and performance at unprecedented scale, developing cutting-edge optimizations that directly enable new model capabilities and dramatically improve inference efficiency.</p>\n<p>Working at the intersection of hardware and software, you&#39;ll implement state-of-the-art techniques from custom kernel development to distributed system architectures. Your work will span the entire stack—from low-level tensor core optimizations to orchestrating thousands of GPUs in perfect synchronization.</p>\n<p>Strong candidates will have a track record of delivering transformative GPU performance improvements in production ML systems and will be excited to shape the future of AI infrastructure alongside world-class researchers and engineers.</p>\n<p><strong>You might be a good fit if you:</strong></p>\n<ul>\n<li>Have deep experience with GPU programming and optimization at scale</li>\n<li>Are impact-driven, passionate about delivering measurable performance breakthroughs</li>\n<li>Can navigate complex systems from hardware interfaces to high-level ML frameworks</li>\n<li>Enjoy collaborative problem-solving and pair programming</li>\n<li>Want to work on state-of-the-art language models with real-world impact</li>\n<li>Care about the societal impacts of your work</li>\n<li>Thrive in ambiguous environments where you define the path forward</li>\n</ul>\n<p><strong>Strong candidates may also have experience with:</strong></p>\n<ul>\n<li>GPU Kernel Development: CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization</li>\n<li>ML Compilers &amp; Frameworks: PyTorch/JAX internals, torch.compile, XLA, custom operators</li>\n<li>Performance Engineering: Kernel fusion, memory bandwidth optimization, profiling with Nsight</li>\n<li>Distributed Systems: NCCL, NVLink, collective communication, model parallelism</li>\n<li>Low-Precision: INT8/FP8 quantization, mixed-precision techniques</li>\n<li>Production Systems: Large-scale 
training infrastructure, fault tolerance, cluster orchestration</li>\n</ul>\n<p><strong>Representative projects:</strong></p>\n<ul>\n<li>Co-design attention mechanisms and algorithms for next-generation hardware architectures</li>\n<li>Develop custom kernels for emerging quantization formats and mixed-precision techniques</li>\n<li>Design distributed communication strategies for multi-node GPU clusters</li>\n<li>Optimize end-to-end training and inference pipelines for frontier language models</li>\n<li>Build performance modeling frameworks to predict and optimize GPU utilization</li>\n<li>Implement kernel fusion strategies to minimize memory bandwidth bottlenecks</li>\n<li>Create resilient systems for planet-scale distributed training infrastructure</li>\n<li>Profile and eliminate performance bottlenecks in production serving infrastructure</li>\n<li>Partner with hardware vendors to influence future accelerator capabilities and software stacks</li>\n</ul>\n<p><strong>Deadline to apply:</strong> None. 
Applications will be reviewed on a rolling basis.</p>\n<p>The expected salary range for this position is:</p>\n<p>Annual Salary: $280,000 - $850,000 USD</p>","url":"https://yubhub.co/jobs/job_11a60d5a-f54","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4926227008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$280,000 - $850,000 USD","x-skills-required":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"x-skills-preferred":["GPU programming","optimization at scale","custom kernel development","distributed system architectures","low-level tensor core optimizations","orchestrating thousands of GPUs","GPU kernel development","CUDA","Triton","CUTLASS","Flash Attention","tensor core optimization","ML compilers & frameworks","PyTorch/JAX internals","torch.compile","XLA","custom operators","performance engineering","kernel fusion","memory bandwidth optimization","profiling with Nsight","distributed systems","NCCL","NVLink","collective 
communication","model parallelism","low-precision","INT8/FP8 quantization","mixed-precision techniques","production systems","large-scale training infrastructure","fault tolerance","cluster orchestration"],"datePosted":"2026-03-08T13:45:05.412Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY | Seattle, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU programming, optimization at scale, custom kernel development, distributed system architectures, low-level tensor core optimizations, orchestrating thousands of GPUs, GPU kernel development, CUDA, Triton, CUTLASS, Flash Attention, tensor core optimization, ML compilers & frameworks, PyTorch/JAX internals, torch.compile, XLA, custom operators, performance engineering, kernel fusion, memory bandwidth optimization, profiling with Nsight, distributed systems, NCCL, NVLink, collective communication, model parallelism, low-precision, INT8/FP8 quantization, mixed-precision techniques, production systems, large-scale training infrastructure, fault tolerance, cluster 
orchestration","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":280000,"maxValue":850000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ef6cad27-130"},"title":"Datacenter Networking Technician, AI Compute Deployment - Stargate","description":"<p><strong>Job Posting</strong></p>\n<p><strong>Datacenter Networking Technician, AI Compute Deployment - Stargate</strong></p>\n<p><strong>Location</strong></p>\n<p>Remote - US</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Remote</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$86.4K – $228K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n<li>401(k) retirement plan with employer match</li>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n<li>Mental health and wellness support</li>\n<li>Employer-paid basic life and disability coverage</li>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n<li>Relocation support for eligible employees</li>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team:</strong></p>\n<p>OpenAI, in close collaboration with our capital 
partners, is embarking on a journey to build the world’s most advanced AI infrastructure ecosystem. Our Stargate program develops and deploys massive, state-of-the-art data center campuses in partnership with industry leaders today—and through future OpenAI infrastructure projects tomorrow. We design for scale, speed, and reliability, and we need experienced technicians who can translate network blueprints into physical reality.</p>\n<p><strong>About the Role:</strong></p>\n<p>We are seeking a Senior Data Center Networking Technician who thrives in fast-moving build environments and is eager to roll up their sleeves during active datacenter deployments.</p>\n<p>Your first assignment will focus on the physical bring-up of network infrastructure at a large partner-operated campus, collaborating with partner teams and their delivery vendors to achieve agreed performance and reliability targets. As that campus reaches steady state, you will transition to lead network deployment for future OpenAI data center projects, defining standards and guiding implementation across multiple locations.</p>\n<p><em>Candidates must be able to sit onsite in Abilene, Texas 5 days per week</em></p>\n<p><strong>In this role you will:</strong></p>\n<ul>\n<li>Serve as OpenAI’s technical lead technician during the current campus build, partnering with internal engineers and external contractors on design reviews, installation plans, and acceptance criteria.</li>\n<li>Spend significant time on the data-center floor performing inspections, assisting with cable routing/termination when needed, conducting fiber testing (OTDR, power levels, continuity), and resolving installation challenges in real time.</li>\n<li>Troubleshoot and optimize cabling routes, patching, and equipment turn-up to ensure clean, reliable handoff to network operations.</li>\n<li>Contribute to design discussions and peer reviews for structured cabling and 
physical network layouts, providing practical field feedback to engineering teams.</li>\n<li>Develop repeatable engineering standards, as-built documentation, and deployment playbooks to accelerate future OpenAI campuses.</li>\n<li>Transition to hands-on design and deployment leadership for upcoming OpenAI data center expansions, owning network physical-layer deployment from design through commissioning.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have 7+ years of experience in large-scale datacenter network deployment, structured cabling, or physical-layer installation.</li>\n<li>Possess deep knowledge of fiber/copper plant design, cabling standards, and hyperscale network best practices.</li>\n<li>Excel in field work while applying seasoned technical judgment to solve complex installation and testing issues.</li>\n<li>Adapt quickly to changing build conditions and enjoy learning emerging networking technologies.</li>\n<li>Communicate clearly with construction, engineering, and operations teams to drive projects to completion.</li>\n<li>Are willing to be based at a partner campus during the initial build phase and to travel to future OpenAI data center projects.</li>\n</ul>\n<p><strong>Preferred Skills:</strong></p>\n<ul>\n<li>Exposure to automation tools (Ansible, SaltStack) or scripting (Python, Shell) for configuration or documentation.</li>\n<li>Experience with hyperscale Clos fabrics or large-scale network design.</li>\n<li>Familiarity with routing and switching concepts (BGP, OSPF, EVPN/VXLAN) and basic Linux/network troubleshooting.</li>\n<li>Industry certifications such as BICSI, CCNA/CCNP, or equivalent.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose 
artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we are building a team of talented individuals who share our values and are passionate about making a positive impact on the world.</p>","url":"https://yubhub.co/jobs/job_ef6cad27-130","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/3ca637f8-c0d6-4253-8568-1f31b02adf76","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$86.4K – $228K • Offers Equity","x-skills-required":["large-scale datacenter network deployment","structured cabling","physical-layer installation","fiber/copper plant design","cabling standards","hyperscale network best practices","routing and switching concepts","BGP","OSPF","EVPN/VXLAN","basic Linux/network troubleshooting"],"x-skills-preferred":["automation tools","Ansible","SaltStack","scripting","Python","Shell","hyperscale Clos fabrics","large-scale network design","industry certifications","BICSI","CCNA/CCNP"],"datePosted":"2026-03-06T18:29:44.685Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote - US"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large-scale datacenter network deployment, structured cabling, physical-layer installation, fiber/copper plant design, cabling standards, hyperscale network best practices, routing and switching concepts, BGP, OSPF, EVPN/VXLAN, basic 
Linux/network troubleshooting, automation tools, Ansible, SaltStack, scripting, Python, Shell, hyperscale Clos fabrics, large-scale network design, industry certifications, BICSI, CCNA/CCNP","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":86400,"maxValue":228000,"unitText":"YEAR"}}}]}