{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/gpu-interconnect-topologies"},"x-facet":{"type":"skill","slug":"gpu-interconnect-topologies","display":"GPU Interconnect Topologies","count":2},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ec7cc743-ef4"},"title":"Senior Software Engineer II, Inference","description":"<p>We&#39;re seeking a senior software engineer to join our team and lead the design and development of our Kubernetes-native inference platform. 
As a senior engineer, you will be responsible for leading design reviews, driving architecture, and ensuring the reliability and scalability of our platform.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading design reviews and driving architecture within the team</li>\n<li>Defining and owning SLIs/SLOs and ensuring post-incident actions land and reliability improves release-over-release</li>\n<li>Implementing advanced optimizations such as micro-batch schedulers, speculative decoding, and KV-cache reuse</li>\n<li>Strengthening incident posture through capacity planning, autoscaling policy, and rollback/traffic-shift strategies</li>\n<li>Mentoring IC1/IC2 engineers and reviewing cross-team designs to elevate coding/testing standards</li>\n</ul>\n<p>We&#39;re looking for someone with strong coding skills in Python or Go, deep familiarity with networked systems and performance, and hands-on experience with Kubernetes at production scale. If you have experience with inference internals, batching, caching, mixed precision, and streaming token delivery, that&#39;s a plus.</p>\n<p>In addition to a competitive salary, we offer a range of benefits including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO. 
We&#39;re committed to creating a work environment that&#39;s inclusive, diverse, and supportive of our employees&#39; well-being.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ec7cc743-ef4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4604832006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Python","Go","Kubernetes","Networked systems","Performance","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies","Contributions to inference frameworks","Experience with multi-team initiatives"],"datePosted":"2026-04-18T15:50:27.738Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, Networked systems, Performance, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect topologies, Contributions to inference frameworks, Experience with multi-team initiatives","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9701c504-1a6"},"title":"Senior Software Engineer I, Inference","description":"<p>We&#39;re looking for a Senior Software Engineer I to join our team. 
As a senior engineer, you&#39;ll lead designs, raise engineering standards, and deliver measurable improvements to latency, throughput, and reliability across multiple services. You&#39;ll partner with product, orchestration, and hardware teams to evolve our Kubernetes-native inference platform and meet strict P99 SLAs at scale.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Lead design reviews and drive architecture within the team; decompose multi-service work into clear milestones.</li>\n<li>Define and own SLIs/SLOs; ensure post-incident actions land and reliability improves release-over-release.</li>\n<li>Implement advanced optimizations (e.g., micro-batch schedulers, speculative decoding, KV-cache reuse) and quantify impact.</li>\n<li>Strengthen incident posture: capacity planning, autoscaling policy, graceful degradation, rollback/traffic-shift strategies.</li>\n<li>Mentor IC1/IC2 engineers; review cross-team designs and elevate coding/testing standards.</li>\n</ul>\n<p>Requirements include:</p>\n<ul>\n<li>3-5 years of industry experience building distributed systems or cloud services.</li>\n<li>Strong coding in Python or Go (C++ a plus) and deep familiarity with networked systems and performance.</li>\n<li>Hands-on experience with Kubernetes at production scale, CI/CD, and observability stacks (Prometheus, Grafana, OpenTelemetry).</li>\n<li>Practical knowledge of inference internals: batching, caching, mixed precision (BF16/FP8), streaming token delivery.</li>\n<li>Proven track record improving tail latency (P95/P99) and service reliability through metrics-driven work.</li>\n</ul>\n<p>Preferred qualifications include contributions to inference frameworks, experience with CUDA kernels, NCCL/SHARP, RDMA/NUMA, or GPU interconnect topologies, and leading multi-team initiatives or partnering with customers on mission-critical launches.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9701c504-1a6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4647603006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$139,000 to $204,000","x-skills-required":["Python","Go","Kubernetes","CI/CD","Observability stacks","Inference internals","Batching","Caching","Mixed precision","Streaming token delivery"],"x-skills-preferred":["Contributions to inference frameworks","CUDA kernels","NCCL/SHARP","RDMA/NUMA","GPU interconnect topologies"],"datePosted":"2026-04-18T15:48:09.297Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Go, Kubernetes, CI/CD, Observability stacks, Inference internals, Batching, Caching, Mixed precision, Streaming token delivery, Contributions to inference frameworks, CUDA kernels, NCCL/SHARP, RDMA/NUMA, GPU interconnect topologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139000,"maxValue":204000,"unitText":"YEAR"}}}]}