{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/hpc"},"x-facet":{"type":"skill","slug":"hpc","display":"Hpc","count":50},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_326f90c8-11f"},"title":"Senior High Frequency C++ Engineer","description":"<p>The Systematic Platform Execution &amp; Exchange Data (SPEED) Team is at the core of our organisation, powering our lowest-latency solutions for systematic and high-frequency trading. We deliver the live trading and market-data platforms used by portfolio managers and risk systems, including Latency Critical Trading (LCT), DMA OMS (Client Direct), DMA market data feeds, packet capture (PCAPs), enterprise market data, and intraday data services across latency tiers from sub-100 nanoseconds to millisecond-sensitive workflows.</p>\n<p>As a Senior HFT Developer on SPEED, you will design and build core low-latency components for order entry, market data, exchange simulation, feature extraction, and strategy containers, initially focused on delivering the full set of capabilities required for trading and research infrastructure. 
You will collaborate closely with system architects and quantitative researchers, operate and optimise these systems in production, and have clear opportunities to grow into technical and team leadership as the effort scales.</p>\n<p>Principal Responsibilities:</p>\n<ul>\n<li>Build low-latency infrastructure for order entry, market data, exchange simulators, feature extraction, strategy containers, and other systems.</li>\n<li>Build convenience-layer tools and services to facilitate trading teams&#39; onboarding at MLP.</li>\n<li>Provide level 2 support for the systems in production.</li>\n<li>Work closely with the SPEED architect, quantitative researchers, and the business to provide high-ROI solutions that are aligned with both the business and the platform strategy.</li>\n<li>Grow into technical leadership as the effort expands.</li>\n<li>Liaise with many other MLP teams depending on project focus.</li>\n</ul>\n<p>Qualifications/Skills Required:</p>\n<ul>\n<li>5+ years with a well-regarded HFT group, delivering production-grade, low-latency systems.</li>\n<li>Demonstrated expertise in C++ and Python for production, low-latency systems.</li>\n<li>Deep familiarity with low-level systems: OS tuning, networking stack, user-space drivers, and kernel-bypass patterns.</li>\n<li>Strong understanding of the HFT quantitative research pipeline.</li>\n<li>Experience with HPC grids (scheduling, storage, job management) for research and production workloads.</li>\n<li>Cloud experience (AWS, GCP) is a plus.</li>\n<li>Proven ability to navigate large organisations, create cross-team synergies, and influence outcomes.</li>\n<li>High accountability and ownership; able to self-manage time, set priorities, and meet deadlines.</li>\n<li>Potential to provide technical leadership and manage a small team.</li>\n</ul>\n<p>The estimated base salary range for this position is $175,000 to $250,000, which is specific to New York and may change in the future. 
We pay a total compensation package which includes a base salary, discretionary performance bonus, and a comprehensive benefits package.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_326f90c8-11f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Unknown","sameAs":"https://mlp.eightfold.ai","logo":"https://logos.yubhub.co/mlp.eightfold.ai.png"},"x-apply-url":"https://mlp.eightfold.ai/careers/job/755954694645","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$175,000 to $250,000","x-skills-required":["C++","Python","low-level Systems","OS tuning","networking stack","user-space drivers","kernel-bypass patterns","HFT quantitative research pipeline","HPC grids","scheduling","storage","job management"],"x-skills-preferred":[],"datePosted":"2026-04-18T22:13:18.115Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, New York, United States of America"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C++, Python, low-level Systems, OS tuning, networking stack, user-space drivers, kernel-bypass patterns, HFT quantitative research pipeline, HPC grids, scheduling, storage, job management","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":175000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_588dfb0e-611"},"title":"Solutions Architect - Kubernetes","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital role in helping customers succeed with our cloud infrastructure offerings, focusing on Kubernetes solutions within high-performance compute (HPC) environments.</p>\n<p>Your responsibilities will 
include serving as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings.</p>\n<p>You will collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements.</p>\n<p>You will lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments.</p>\n<p>You will drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise.</p>\n<p>You will act as a virtual member of CoreWeave&#39;s Kubernetes product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions.</p>\n<p>You will offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture.</p>\n<p>You will conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimization and suggesting suitable solutions.</p>\n<p>You will stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders.</p>\n<p>You will lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption.</p>\n<p>You will represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p>To be successful in this role, you will need to have a B.S. 
in Computer Science or a related technical discipline, or equivalent experience.</p>\n<p>You will also need to have 7+ years of proven experience as a Solutions Architect, engineer, researcher, or technical account manager in cloud infrastructure, focusing on building distributed systems or HPC/cloud services, with an expertise focused on scalable Kubernetes solutions.</p>\n<p>You will need to be fluent in cloud computing concepts, architecture, and technologies with hands-on experience in designing and implementing cloud solutions.</p>\n<p>You will need to have a proven track record with building customer relationships, communicating clearly and the ability to break down complex technical concepts to both technical and non-technical audiences.</p>\n<p>You will need to be familiar with NVIDIA GPUs typically used in AI/ML applications and associated technologies such as Infiniband and NVIDIA Collective Communications Library (NCCL).</p>\n<p>You will need to have experience with running large-scale Artificial Intelligence/Machine Learning (AI/ML) training and inference workloads on technologies such as Slurm and Kubernetes.</p>\n<p>Preferred qualifications include code contributions to open-source inference frameworks, experience with scripting and automation related to Kubernetes clusters and workloads, experience with building solutions across multi-cloud environments, and client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures.</p>
","url":"https://yubhub.co/jobs/job_588dfb0e-611","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4557835006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $220,000","x-skills-required":["Kubernetes","Cloud Computing","High-Performance Compute (HPC)","Distributed Systems","Cloud Infrastructure","Scalable Solutions","NVIDIA GPUs","Infiniband","NVIDIA Collective Communications Library (NCCL)","Slurm","Kubernetes Clusters"],"x-skills-preferred":["Code Contributions to Open-Source Inference Frameworks","Scripting and Automation Related to Kubernetes Clusters and Workloads","Building Solutions Across Multi-Cloud Environments","Client or Customer-Facing Publications/Talks on Latency, Optimization, or Advanced Model-Server Architectures"],"datePosted":"2026-04-18T15:57:29.779Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Kubernetes, Cloud Computing, High-Performance Compute (HPC), Distributed Systems, Cloud Infrastructure, Scalable Solutions, NVIDIA GPUs, Infiniband, NVIDIA Collective Communications Library (NCCL), Slurm, Kubernetes Clusters, Code Contributions to Open-Source Inference Frameworks, Scripting and Automation Related to Kubernetes Clusters and Workloads, Building Solutions Across Multi-Cloud Environments, Client or Customer-Facing Publications/Talks on Latency, Optimization, or Advanced Model-Server 
Architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ce2e6ab7-617"},"title":"Data Center Design Execution Lead","description":"<p>We are seeking a Data Center Design Execution Lead to join our Infrastructure team. As a key member of our team, you will be responsible for owning the bridge between our technical requirements and third-party partners who bring our data centers to life. This is a critical role that requires a unique blend of technical expertise, project management skills, and leadership abilities.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Drive execution of our technical requirements for third-party data center delivery partners, ensuring design intent is consistently translated from BOD through construction documents.</li>\n<li>Define the design execution framework (deliverable requirements, review gates, and quality standards) for each partner engagement.</li>\n<li>Drive accountability for partner deliverable quality across milestones, evaluating deliverables for cross-discipline consistency and alignment to our requirements.</li>\n<li>Partner with external design teams and stakeholders to execute design document development and issuance across all project phases.</li>\n</ul>\n<p>Change Management &amp; Technical Continuity:</p>\n<ul>\n<li>Own the design change management process across projects, ensuring technical decisions are documented, resolved, and implemented through construction.</li>\n<li>Review and develop responses to contractor RFIs, maintaining design intent while accommodating field conditions.</li>\n<li>Own technical continuity across design, construction, commissioning, and turnover, driving alignment between internal leads on phase transition standards and acceptance criteria.</li>\n<li>Support project 
closeout and as-built documentation processes.</li>\n</ul>\n<p>Constructability &amp; Risk Management:</p>\n<ul>\n<li>Facilitate constructability review processes between design and construction teams, identifying integration risks and maintaining alignment between design intent and field execution as conditions evolve.</li>\n<li>Identify and mitigate design risks; ensure robust QA/QC practices across the design and construction lifecycle.</li>\n<li>Perform project site reviews and deliver technical reports on design compliance and construction progress.</li>\n</ul>\n<p>Cross-Functional Coordination:</p>\n<ul>\n<li>Interface with internal stakeholders (construction execution, supply chain, facilities operations, and security) to align design solutions with technical and business objectives.</li>\n<li>Partner with internal teams to define infrastructure requirements and translate them into actionable design criteria for external partners.</li>\n<li>Identify and implement opportunities for process improvements, design optimization, and schedule/performance/cost tradeoffs.</li>\n<li>Review commissioning scripts and final reports to validate performance and functionality against our standards.</li>\n</ul>\n<p>Qualifications:</p>\n<ul>\n<li>10+ years in data center or mission-critical infrastructure delivery, spanning design and delivery phases.</li>\n<li>Direct experience in owner&#39;s engineer or technical oversight roles, not purely design production or purely project management.</li>\n<li>Working knowledge of mechanical, electrical, and cooling systems in data center environments.</li>\n<li>Experience managing design change processes, RFIs, and construction documentation workflows.</li>\n<li>Track record of coordinating across multiple disciplines and organizations on complex infrastructure projects.</li>\n<li>Deep knowledge of industry standards, building codes, and safety standards applicable to mission-critical facilities.</li>\n<li>Comfortable operating with 
authority in ambiguous environments where processes need to be built, not just followed.</li>\n<li>BS in Mechanical Engineering, Electrical Engineering, Architecture, or related field.</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>Professional Engineer (PE) or Registered Architect (RA) license.</li>\n<li>Experience stamping construction drawing packages.</li>\n<li>Proficiency with Revit/BIM, Autodesk, or similar design software applications.</li>\n<li>Experience integrating sustainability, energy efficiency, and resiliency goals into data center system designs.</li>\n<li>Direct experience with large-scale AI/HPC infrastructure, including high-density cooling systems and power distribution at 50+ MW scale.</li>\n</ul>","url":"https://yubhub.co/jobs/job_ce2e6ab7-617","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5157023008","x-work-arrangement":"remote-hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["data center design","project management","leadership","technical expertise","industry standards","building codes","safety standards","mechanical engineering","electrical engineering","architecture","Revit/BIM","Autodesk","design software","sustainability","energy efficiency","resiliency","large-scale AI/HPC infrastructure"],"x-skills-preferred":["professional engineer","registered architect","construction drawing packages","high-density cooling systems","power distribution"],"datePosted":"2026-04-18T15:55:57.925Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly, United 
States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data center design, project management, leadership, technical expertise, industry standards, building codes, safety standards, mechanical engineering, electrical engineering, architecture, Revit/BIM, Autodesk, design software, sustainability, energy efficiency, resiliency, large-scale AI/HPC infrastructure, professional engineer, registered architect, construction drawing packages, high-density cooling systems, power distribution","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_6d46741a-b4c"},"title":"Senior Systems Engineer, OS Automation","description":"<p>CoreWeave is looking for a Senior Systems Engineer who is ready to evolve beyond traditional DevOps. You will start by stabilizing and scaling our Linux OS and Kernel build pipelines. 
Once the foundation is set, you will lead the transition to AI-native infrastructure, building &#39;smart&#39; workflows that don&#39;t just report errors, but understand and fix them.</p>\n<p>You are a Systems Engineer at heart, but you are ready to apply LLMs, RAG, and predictive modeling to solve infrastructure challenges at scale.</p>\n<p>Our Team&#39;s Stack:</p>\n<ul>\n<li>Languages: Python, Go, bash/sh</li>\n<li>Observability: Prometheus, Victoria Metrics, Grafana</li>\n<li>OS &amp; Kernel: Linux Kernel (custom build), Ubuntu</li>\n<li>Hardware: Intel/AMD/ARM CPUs, Nvidia GPUs, DPUs, Infiniband and Ethernet NICs</li>\n<li>Containerization: Docker, Kubernetes (k8s), KubeVirt, containerd, kubelet</li>\n</ul>\n<p>Responsibilities:</p>\n<ul>\n<li>Pipeline Architecture: Design, maintain, and automate reproducible OS image build pipelines for our massive fleet of GPU-accelerated servers.</li>\n<li>Kernel Distribution: Collaborate with kernel engineers to package, validate, and distribute custom Linux builds across Intel, AMD, and ARM architectures.</li>\n<li>Dependency Management: Build tooling to manage dependencies, versioning, and release workflows, ensuring hermetic builds.</li>\n<li>Telemetry &amp; Metrics: Standardize the collection of build metrics to create a baseline for future AI modeling.</li>\n<li>&#39;Smart&#39; CI/CD &amp; Auto-Remediation: Architect AI agents that ingest and analyze build logs in real-time. 
Develop systems that auto-triage errors, categorize failure patterns, and generate context-aware fix suggestions for engineering teams.</li>\n<li>Predictive Regression Modeling: Design ML workflows that utilize historical performance data to detect kernel and OS regressions (latency, throughput, stability) in staging environments before they impact production.</li>\n<li>Dynamic Kernel Tuning: Implement closed-loop feedback systems that analyze real-time system metrics and automatically suggest or apply sysctl parameter optimizations for specific customer workloads.</li>\n<li>Next-Gen ChatOps: Engineer LLM-driven interfaces for Slack/internal tools, enabling stakeholders to query build statuses, request log summaries, or provision resources using natural language commands.</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>4+ years of professional experience in Linux Systems Engineering, Release Engineering, or DevOps.</li>\n<li>Deep knowledge of Linux internals (boot process, kernel modules, networking stack).</li>\n<li>Experience with package management (Debian/Ubuntu) and build systems.</li>\n<li>Strong proficiency in Python (essential for the AI integration aspects of this role).</li>\n<li>Demonstrable experience integrating API-based AI models (OpenAI, Anthropic, or local open-source models) into software workflows.</li>\n<li>Understanding of RAG (Retrieval-Augmented Generation) architectures for querying technical documentation or logs.</li>\n<li>Experience building event-driven automation (e.g., using webhooks to trigger analysis agents).</li>\n<li>Familiarity with data structures required for vector search or time-series analysis.</li>\n</ul>\n<p>Nice-to-haves:</p>\n<ul>\n<li>Experience with Kubeflow or MLFlow.</li>\n<li>Background in High-Performance Computing (HPC).</li>\n<li>Experience fine-tuning small language models (SLMs) for code or log analysis tasks.</li>\n</ul>
","url":"https://yubhub.co/jobs/job_6d46741a-b4c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4396057006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$153,000 to $242,000","x-skills-required":["Linux Systems Engineering","Release Engineering","DevOps","Python","API-based AI models","RAG (Retrieval-Augmented Generation)","Event-driven automation","Vector search","Time-series analysis"],"x-skills-preferred":["Kubeflow","MLFlow","High-Performance Computing (HPC)","Small language models (SLMs)"],"datePosted":"2026-04-18T15:55:17.014Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York City, NY/ Sunnyvale, CA/ Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux Systems Engineering, Release Engineering, DevOps, Python, API-based AI models, RAG (Retrieval-Augmented Generation), Event-driven automation, Vector search, Time-series analysis, Kubeflow, MLFlow, High-Performance Computing (HPC), Small language models (SLMs)","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":153000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7c2b1fd1-6ca"},"title":"Staff Software Engineer- AI Workload Orchestration","description":"<p>As a Staff Software Engineer on the AI Workload Orchestration Platform team, you will act as a technical leader for CoreWeave&#39;s Kubernetes-native orchestration strategy for AI workloads.</p>\n<p>You will define and evolve the architecture for how AI workloads are admitted, 
scheduled, and governed across large GPU clusters using frameworks such as Kueue, Volcano, and Ray. This platform serves as a strategic complement to SUNK (Slurm on Kubernetes) and underpins both training and inference workloads across the CoreWeave cloud.</p>\n<p>This role requires strong systems thinking, cross-team influence, and a long-term view of platform scalability, reliability, and developer experience.</p>\n<p>You will:</p>\n<ul>\n<li>Own the technical vision and architecture for major portions of the AI Workload Orchestration Platform.</li>\n<li>Design scalable, reliable orchestration primitives for AI workloads across multiple schedulers and runtimes.</li>\n<li>Lead cross-team architecture reviews and drive alignment across infrastructure, CKS, and managed inference teams.</li>\n<li>Define platform standards for reliability, observability, capacity management, and operational excellence.</li>\n<li>Identify and resolve systemic performance, scalability, and fairness issues across large GPU clusters.</li>\n<li>Mentor senior engineers and grow technical leadership within the organization.</li>\n<li>Represent the platform in technical reviews and influence broader CoreWeave platform strategy.</li>\n</ul>\n<p>You will be responsible for leading technical initiatives across teams without direct authority, owning mission-critical systems at scale, and bringing a strong operational mindset. 
You will also have the opportunity to mentor senior engineers and grow technical leadership within the organization.</p>\n<p>If you&#39;re a strong systems thinker with a passion for AI and cloud computing, this could be the perfect opportunity for you to join a team of innovators and help shape the future of AI workload orchestration.</p>","url":"https://yubhub.co/jobs/job_7c2b1fd1-6ca","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4647586006","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$188,000 to $275,000","x-skills-required":["Go","Kubernetes","Distributed systems","Cloud platforms","Kueue","Volcano","Ray"],"x-skills-preferred":["AI infrastructure","ML platforms","HPC","Large-scale batch and streaming systems","Scheduling concepts","Fairness","Pre-emption","Quota management","Multi-tenant isolation"],"datePosted":"2026-04-18T15:54:24.822Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Go, Kubernetes, Distributed systems, Cloud platforms, Kueue, Volcano, Ray, AI infrastructure, ML platforms, HPC, Large-scale batch and streaming systems, Scheduling concepts, Fairness, Pre-emption, Quota management, Multi-tenant isolation","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":188000,"maxValue":275000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_dd290e64-a85"},"title":"Quantum 
Software Engineer","description":"<p>We are seeking a talented and innovative Quantum Software Engineer to join our forward-looking team at Anduril Labs. In this role, you will be instrumental in building and delivering impactful quantum solutions for both Anduril-internal use cases and external customer applications.</p>\n<p>You will work closely with delivery leads, application developers, and other solutions architects, as well as internal and external partners to design, implement, and deliver bleeding-edge quantum solutions on state-of-the-art quantum-inspired, quantum annealing, and quantum gate platforms for real-world defense and national security challenges.</p>\n<p>The ideal candidate will combine a strong foundation in quantum computing principles with hands-on classical and quantum software development expertise. You will leverage your skills to translate complex problems into (hybrid) quantum algorithms, applications, and services.</p>\n<p>This includes developing robust software implementations and integrating quantum-enhanced solutions into existing and new defense systems.</p>\n<p>If you are passionate about applying theoretical quantum concepts to deliver tangible, high-impact results, and thrive in an environment that values innovation, collaboration, and rapid prototyping, we encourage you to apply.</p>\n<p><strong>Key Responsibilities:</strong> Be a key contributor to the development of next-generation quantum-enhanced Anduril offerings and lead the design, development, and deployment of novel quantum-enhanced applications and services in the defense and national security domain. Develop impactful hybrid quantum algorithms and applications that promise significant decision advantages and focus on practical scalability and real-world applicability. 
Contribute knowledge of classical and quantum optimization algorithms and tools, evaluating, and communicating their pros and cons, current state-of-the-art, scaling behaviors, trade-offs, and cross-over points. Participate in the full (hybrid) quantum software development lifecycle, from concept and design to testing, deployment, and ongoing maintenance.</p>\n<p><strong>Requirements:</strong> Bachelor&#39;s degree in Computer Science, Quantum Information Science, Physics, Mathematics, or a closely related technical field. 3+ years of hands-on, professional software development experience with C, C++, Python, or another general-purpose compiled programming language. Practical experience in quantum computing, including programming quantum applications, or quantum circuit compilation. Proficiency with one or more leading quantum programming languages, SDKs, or APIs such as Qiskit, CUDA-Q, Q#, Cirq, PennyLane, or similar. Expertise in key mathematical techniques foundational to quantum computing, including linear algebra, matrix decompositions, probability theory, group theory, symmetry, and computational complexity. Proficient with database systems and SQL, with hands-on experience working with relational databases (e.g., PostgreSQL, Oracle, MySQL). Experience with Git version control, build tools, and CI/CD pipelines. Demonstrated understanding and application of software testing principles and practices, including unit testing, integration testing, and end-to-end testing. Strong problem-solving skills, meticulous attention to detail, and the ability to work effectively in a collaborative team environment. Excellent communication and interpersonal skills, with the ability to effectively articulate complex technical concepts to diverse audiences. Eligible to obtain and maintain an active U.S. Top Secret SCI security clearance. 
Demonstrable hands-on experience using GenAI tools (e.g., OpenAI Codex, Claude Code, Gemini Code Assist, GitHub Copilot, Amazon CodeWhisperer, or similar) for software development, code generation, debugging, and algorithmic exploration.</p>\n<p><strong>Preferred Qualifications:</strong> Master&#39;s or Ph.D. in Quantum Information Science, Physics, Computer Science, or a related quantitative field. Familiarity with leading classical optimization tools and solvers (e.g., CPLEX, Gurobi, OR-Tools) and knowledge of mathematical modeling and classical optimization solution techniques. Experience building and deploying applications to solve complex business or defense problems for customers. Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality. Experience with deployment of code in distributed environments, cloud application development (e.g., AWS, Azure, GCP), and RESTful API-driven architectures. Experience with high-performance computing (HPC) environments or parallel programming. Familiarity with quantum hardware platforms and their unique characteristics. Prior experience in defense, aerospace, or related industries applying advanced technologies. 
Willingness to travel up to approximately 10%.</p>","url":"https://yubhub.co/jobs/job_dd290e64-a85","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://www.anduril.com/","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/5089054007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$132,000-$198,000 USD","x-skills-required":["C","C++","Python","Qiskit","CUDA-Q","Q#","Cirq","PennyLane","Linear Algebra","Matrix Decompositions","Probability Theory","Group Theory","Symmetry","Computational Complexity","Database Systems","SQL","Git","Build Tools","CI/CD Pipelines","Software Testing Principles","Unit Testing","Integration Testing","End-to-End Testing","GenAI Tools"],"x-skills-preferred":["Master's or Ph.D. 
in Quantum Information Science, Physics, Computer Science, or a related quantitative field","Familiarity with leading classical optimization tools and solvers","Experience building and deploying applications to solve complex business or defense problems for customers","Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality","Experience with deployment of code in distributed environments, cloud application development, and RESTful API-driven architectures","Experience with high-performance computing (HPC) environments or parallel programming","Familiarity with quantum hardware platforms and their unique characteristics","Prior experience in defense, aerospace, or related industries applying advanced technologies"],"datePosted":"2026-04-18T15:54:19.846Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Washington, District of Columbia, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, Python, Qiskit, CUDA-Q, Q#, Cirq, PennyLane, Linear Algebra, Matrix Decompositions, Probability Theory, Group Theory, Symmetry, Computational Complexity, Database Systems, SQL, Git, Build Tools, CI/CD Pipelines, Software Testing Principles, Unit Testing, Integration Testing, End-to-End Testing, GenAI Tools, Master's or Ph.D. 
in Quantum Information Science, Physics, Computer Science, or a related quantitative field, Familiarity with leading classical optimization tools and solvers, Experience building and deploying applications to solve complex business or defense problems for customers, Proven record of successful on-time delivery of complex software projects with a high degree of predictability and quality, Experience with deployment of code in distributed environments, cloud application development, and RESTful API-driven architectures, Experience with high-performance computing (HPC) environments or parallel programming, Familiarity with quantum hardware platforms and their unique characteristics, Prior experience in defense, aerospace, or related industries applying advanced technologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":132000,"maxValue":198000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_96d05ee1-799"},"title":"Staff Software Engineer, Cluster Orchestration","description":"<p><strong>Job Description</strong></p>\n<p>CoreWeave is The Essential Cloud for AI. 
Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to build and scale AI with confidence.</p>\n<p>Trusted by leading AI labs, startups, and global enterprises, CoreWeave combines superior infrastructure performance with deep technical expertise to accelerate breakthroughs and turn compute into capability.</p>\n<p>Founded in 2017, CoreWeave became a publicly traded company (Nasdaq: CRWV) in March 2025.</p>\n<p><strong>About the Role</strong></p>\n<p>As part of the Cluster Orchestration team, you will play a key role in advancing CoreWeave&#39;s orchestration platform, including SUNK (Slurm on Kubernetes) and beyond, our Kubernetes-native foundation that powers AI training and inference at scale.</p>\n<p>This is an opportunity to help shape one of the most critical layers of the AI cloud: ensuring workloads run seamlessly, reliably, and efficiently across massive GPU clusters.</p>\n<p>By building the systems that eliminate infrastructure bottlenecks and create new orchestration capabilities, you will directly empower customers to innovate faster and push the boundaries of what&#39;s possible with AI.</p>\n<p><strong>What You&#39;ll Do</strong></p>\n<p>As a Staff Engineer, you will be a technical leader shaping the long-term strategy for CoreWeave&#39;s orchestration platform.</p>\n<p>You&#39;ll define architectural direction, own critical parts of the orchestration platform and other managed services, and drive cross-org initiatives in scheduling, quota enforcement, and scaling at hyperscale.</p>\n<p>You&#39;ll mentor senior engineers, establish org-wide best practices in reliability and observability, and ensure CoreWeave&#39;s orchestration layer evolves to meet the demands of next-generation AI workloads.</p>\n<p><strong>Who You Are</strong></p>\n<ul>\n<li>8+ years of software engineering experience.</li>\n<li>Proven track record designing and operating large-scale distributed systems 
in production.</li>\n<li>Deep expertise in Slurm/Kubernetes internals and cloud-native development.</li>\n<li>Advanced proficiency in Go and distributed systems design.</li>\n<li>Experience setting technical direction and influencing cross-team architecture.</li>\n<li>Bachelor&#39;s or Master&#39;s degree in CS, EE, or related field.</li>\n</ul>\n<p><strong>Preferred</strong></p>\n<ul>\n<li>Familiarity with orchestration and workflow technologies such as Ray, Kubeflow, Kueue, Istio, Knative, or Argo Workflows.</li>\n<li>Experience with distributed workloads, GPU-based applications, or ML pipelines.</li>\n<li>Knowledge of scheduling concepts like quota enforcement, pre-emption, and scaling strategies.</li>\n<li>Exposure to reliability practices including SLOs, alarms, and post-incident reviews.</li>\n<li>Experience with AI infrastructure and workloads (ML training, inference, or HPC).</li>\n<li>Ability to mentor senior engineers and elevate organizational standards.</li>\n</ul>\n<p><strong>Why CoreWeave?</strong></p>\n<p>At CoreWeave, we work hard, have fun, and move fast! 
We&#39;re in an exciting stage of hyper-growth that you will not want to miss out on.</p>\n<p>We&#39;re not afraid of a little chaos, and we&#39;re constantly learning.</p>\n<p>Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>We support and encourage an entrepreneurial outlook and independent thinking.</p>\n<p>We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems.</p>\n<p>As we get set for take off, the growth opportunities within the organization are constantly expanding.</p>\n<p>You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.</p>\n<p>Come join us!</p>\n<p><strong>Salary and Benefits</strong></p>\n<p>The base salary range for this role is $185,000 to $275,000.</p>\n<p>The starting salary will be determined based on job-related knowledge, skills, experience, and market location.</p>\n<p>We strive for both market alignment and internal equity when determining compensation.</p>\n<p>In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>\n<p><strong>What We Offer</strong></p>\n<p>The range we&#39;ve posted represents the typical compensation range for this role.</p>\n<p>To determine actual compensation, we review the market rate for each candidate which can include a variety of factors.</p>\n<p>These include qualifications, experience, interview performance, and location.</p>\n<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>\n<ul>\n<li>Medical, 
dental, and vision insurance - 100% paid for by CoreWeave</li>\n<li>Company-paid Life Insurance</li>\n<li>Voluntary supplemental life insurance</li>\n<li>Short and long-term disability insurance</li>\n<li>Flexible Spending Account</li>\n<li>Health Savings Account</li>\n<li>Tuition Reimbursement</li>\n<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>\n<li>Mental Wellness Benefits through Spring Health</li>\n<li>Family-Forming support provided by Carrot</li>\n<li>Paid Parental Leave</li>\n<li>Flexible, full-service childcare support with Kinside</li>\n<li>401(k) with a generous employer match</li>\n<li>Flexible PTO</li>\n<li>Catered lunch each day in our office and data center locations</li>\n<li>A casual work environment</li>\n<li>A work culture focused on innovative disruption</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_96d05ee1-799","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4658801006","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$185,000 to $275,000","x-skills-required":["software engineering","distributed systems","Slurm","Kubernetes","cloud-native development","Go","scheduling","quota enforcement","scaling strategies","reliability practices","SLOs","alarms","post-incident reviews","AI infrastructure","workloads","ML training","inference","HPC"],"x-skills-preferred":["orchestration and workflow technologies","Ray","Kubeflow","Kueue","Istio","Knative","Argo 
Workflows"],"datePosted":"2026-04-18T15:53:28.322Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bellevue, WA / Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, distributed systems, Slurm, Kubernetes, cloud-native development, Go, scheduling, quota enforcement, scaling strategies, reliability practices, SLOs, alarms, post-incident reviews, AI infrastructure, workloads, ML training, inference, HPC, orchestration and workflow technologies, Ray, Kubeflow, Kueue, Istio, Knative, Argo Workflows","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":185000,"maxValue":275000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_5f6e6eac-370"},"title":"Sr GPU Infrastructure Software Engineer","description":"<p>CoreWeave is seeking a Senior GPU Infrastructure Software Engineer to join our team. As a senior engineer, you will be responsible for leading designs, raising engineering standards, and delivering measurable improvements to latency, throughput, and reliability across multiple services. 
You will partner with fleet, product, and hardware teams to evolve our GPU performance testing platform to ensure we deliver a reliable and performant experience to our customers.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Design and implement solutions to problems of scale for testing and validation of CoreWeave&#39;s global infrastructure.</li>\n<li>Design and develop Kubernetes-native controllers and operators to automate infrastructure workflows.</li>\n<li>Build and maintain scalable backend services and APIs (gRPC/REST) in Go or Python.</li>\n<li>Develop performance tests and automation workflows to expand hardware validation across the CoreWeave fleet.</li>\n<li>Write and maintain Kubernetes custom controllers and operators to automate infrastructure testing.</li>\n<li>Adapt and extend open source tooling to enhance visibility into system metrics, performance, and health.</li>\n</ul>\n<p>To be successful in this role, you should have:</p>\n<ul>\n<li>~5 to 8 years experience.</li>\n<li>Proficiency in Go and/or Python software development.</li>\n<li>Hands-on experience with writing Kubernetes operators/controllers.</li>\n</ul>\n<p>Nice to have:</p>\n<ul>\n<li>Experience testing hardware at scale.</li>\n<li>HPC Experience.</li>\n<li>Experience with AI/ML infrastructure and training / inference.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_5f6e6eac-370","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4627277006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Go","Python","Kubernetes","GPU performance testing","infrastructure 
validation"],"x-skills-preferred":["HPC Experience","AI/ML infrastructure","training / inference"],"datePosted":"2026-04-18T15:53:07.770Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Go, Python, Kubernetes, GPU performance testing, infrastructure validation, HPC Experience, AI/ML infrastructure, training / inference","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ac14f361-5b8"},"title":"Network Engineer, Capacity and Efficiency","description":"<p>We&#39;re looking for a network engineer who thinks in metrics first. You will use deep networking knowledge and rigorous measurement to figure out where and how bandwidth, latency, and dollars are being used, find optimization opportunities and land them.</p>\n<p>You will instrument spine-leaf fabrics, BGP, SDN overlays, and cloud interconnect products well enough to build them. You&#39;ll own the observability and efficiency surface for Anthropic&#39;s network: from per-flow telemetry on backbone routers, to QoS policy on cross-region links carrying inference traffic, to cost attribution that tells a research team exactly what their checkpoint sync is costing.</p>\n<p>This is a hands-on IC role. You&#39;ll write code (Python, Go), build dashboards, model capacity, and ship config changes to production routers. You&#39;ll also influence architecture: when the data says a traffic pattern is pathological, you&#39;ll be in the room root causing it and fixing it.</p>\n<p>You will be working across three areas: network telemetry and observability, traffic engineering, and cost modeling and attribution. 
We expect you to be strong in at least two and willing to grow into the third.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_ac14f361-5b8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5177143008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["BGP","ECMP","VXLAN/EVPN","QoS","L1/optical basics","CSP networking model","network telemetry","flow export","eBPF-based host-side instrumentation","Python","Go"],"x-skills-preferred":["SRE experience for large-scale network infrastructure","cloud provider's networking team or a cloud networking product team","AI/ML infrastructure traffic patterns","HPC fabrics","traffic engineering for large backbones"],"datePosted":"2026-04-18T15:52:49.160Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"BGP, ECMP, VXLAN/EVPN, QoS, L1/optical basics, CSP networking model, network telemetry, flow export, eBPF-based host-side instrumentation, Python, Go, SRE experience for large-scale network infrastructure, cloud provider's networking team or a cloud networking product team, AI/ML infrastructure traffic patterns, HPC fabrics, traffic engineering for large backbones"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1c7dc0cb-87c"},"title":"Solutions Architect - Storage","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital and dynamic role in helping customers succeed with our cloud 
infrastructure offerings. You will serve as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings, focusing on storage technologies within high-performance compute (HPC) environments.</p>\n<p>Collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements. Lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments.</p>\n<p>Drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise. Act as a virtual member of CoreWeave&#39;s Storage product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions.</p>\n<p>Offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture. Conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimization and suggesting suitable solutions.</p>\n<p>Stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders. 
Lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption.</p>\n<p>Represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1c7dc0cb-87c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4568531006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $220,000","x-skills-required":["cloud computing concepts","architecture","technologies","storage solutions","Kubernetes","cloud infrastructure","high-performance compute (HPC)","storage technologies","file system protocols","infrastructure systems"],"x-skills-preferred":["code contributions to open-source inference frameworks","scripting and automation related to storage technologies","building solutions across multi-cloud environments","client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures"],"datePosted":"2026-04-18T15:52:39.508Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud computing concepts, architecture, technologies, storage solutions, Kubernetes, cloud infrastructure, high-performance compute (HPC), storage technologies, file system protocols, infrastructure systems, code contributions to open-source inference frameworks, scripting and automation related to storage technologies, building 
solutions across multi-cloud environments, client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":220000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_6a7d182d-c49"},"title":"Solutions Architect - Kubernetes","description":"<p>As a Solutions Architect at CoreWeave, you will play a vital role in helping customers succeed with our cloud infrastructure offerings, focusing on Kubernetes solutions within high-performance compute (HPC) environments.</p>\n<p>Your primary responsibility will be to serve as the primary technical point of contact for customers, establishing strong technical relationships and ensuring their success with CoreWeave&#39;s cloud infrastructure offerings.</p>\n<p>You will collaborate closely with customers to understand their unique business needs and create, prototype, and deploy tailored solutions that align with their requirements.</p>\n<p>You will lead proof of concept initiatives to showcase the value and viability of CoreWeave&#39;s solutions within specific environments.</p>\n<p>You will drive technical leadership and direction during customer meetings, presentations, and workshops, addressing any technical queries or concerns that arise.</p>\n<p>You will act as a virtual member of CoreWeave&#39;s Kubernetes product and engineering teams, identifying opportunities for product enhancement and collaborating with engineers to implement your suggestions.</p>\n<p>You will offer valuable insights on product features, functionality, and performance, contributing regularly to discussions about product strategy and architecture.</p>\n<p>You will conduct periodic technical reviews and assessments of customer workloads, pinpointing opportunities for workload optimization and 
suggesting suitable solutions.</p>\n<p>You will stay informed of the latest developments and trends in Kubernetes, cloud computing and infrastructure, sharing your thought leadership with customers and internal stakeholders.</p>\n<p>You will lead the prototyping and initiation of research and development efforts for emerging products and solutions, delivering prototypes and key insights for internal consumption.</p>\n<p>You will represent CoreWeave at conferences and industry events, with occasional travel as required.</p>\n<p>To be successful in this role, you will need to have a proven track record of working as a Solutions Architect, engineer, researcher, or technical account manager in cloud infrastructure, focusing on building distributed systems or HPC/cloud services, with an expertise focused on scalable Kubernetes solutions.</p>\n<p>You will also need to have fluency in cloud computing concepts, architecture, and technologies with hands-on experience in designing and implementing cloud solutions.</p>\n<p>In addition, you will need to have a proven track record with building customer relationships, communicating clearly and the ability to break down complex technical concepts to both technical and non-technical audiences.</p>\n<p>Preferred qualifications include code contributions to open-source inference frameworks, experience with scripting and automation related to Kubernetes clusters and workloads, experience with building solutions across multi-cloud environments, and client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_6a7d182d-c49","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4649036006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $225,000 SGD","x-skills-required":["Cloud computing concepts","Kubernetes solutions","High-performance compute (HPC) environments","Distributed systems","Cloud infrastructure"],"x-skills-preferred":["Code contributions to open-source inference frameworks","Scripting and automation related to Kubernetes clusters and workloads","Building solutions across multi-cloud environments","Client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures"],"datePosted":"2026-04-18T15:52:11.835Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cloud computing concepts, Kubernetes solutions, High-performance compute (HPC) environments, Distributed systems, Cloud infrastructure, Code contributions to open-source inference frameworks, Scripting and automation related to Kubernetes clusters and workloads, Building solutions across multi-cloud environments, Client or customer-facing publications/talks on latency, optimization, or advanced model-server architectures","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":225000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_16dd7ebd-23f"},"title":"Staff Product Manager, Networking","description":"<p>This role sits within CoreWeave&#39;s 
Product organization, focused on building and scaling advanced networking capabilities that power AI, machine learning, and high-performance computing workloads.</p>\n<p>As a Staff Product Manager, Networking, you will own the strategy and roadmap for CoreWeave&#39;s advanced networking product portfolio. On a day-to-day basis, you will translate market insights, customer needs, and technical constraints into clear product requirements and execution plans. You will work closely with cross-functional partners to launch new products and evolve existing offerings, ensuring they meet CoreWeave&#39;s high standards for performance, scalability, and reliability.</p>\n<p>This is a highly visible role with significant influence over the future of CoreWeave&#39;s networking platform.</p>\n<p>CoreWeave is a rapidly growing company that prioritizes innovation and disruption. We believe in investing in our people and value candidates who can bring their own diversified experiences to our teams.</p>\n<p>If you love defining product strategy in technically complex, fast-evolving domains, are curious about emerging networking technologies, and are an expert at turning market insights and customer needs into scalable, high-impact products, then we&#39;d love to talk.</p>\n<p>At CoreWeave, we work hard, have fun, and move fast. We&#39;re in an exciting stage of hyper-growth that you will not want to miss. We&#39;re not afraid of a little chaos, and we&#39;re constantly learning. 
Our team cares deeply about how we build our product and how we work together, which is reflected in our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_16dd7ebd-23f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4642612006","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$188,000 to $275,000","x-skills-required":["product management","networking","infrastructure","distributed systems","VPCs","load balancers","HPC networking","Direct Connect–style solutions"],"x-skills-preferred":["building or scaling networking products for cloud, hyperscale, or high-performance computing environments","background working closely with infrastructure or platform engineering teams","advanced degree or specialized coursework in networking or distributed systems"],"datePosted":"2026-04-18T15:51:56.472Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bellevue, WA/  Livingston, NJ /  New York, NY /  San Francisco, CA/   Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"product management, networking, infrastructure, distributed systems, VPCs, load balancers, HPC networking, Direct Connect–style solutions, building or scaling networking products for cloud, hyperscale, or high-performance computing environments, background working closely with infrastructure or platform engineering teams, advanced degree or specialized coursework 
in networking or distributed systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":188000,"maxValue":275000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2ab9c635-07a"},"title":"Operations Engineer, Fleet Reliability","description":"<p>The Fleet Reliability Operations team is responsible for the day-to-day provisioning, management, and uptime of CoreWeave&#39;s ever-expanding fleet of server nodes. This team plays a central role in CoreWeave&#39;s growth strategy, configuring, updating, and remotely troubleshooting our highest-tier supercomputing clusters and their networking, delivery platforms, and tools dependencies.</p>\n<p>We are seeking curious, creative, and persistent problem solvers to join our Fleet Reliability Operations team to help drive batches of server nodes through our provisioning and validation processes while efficiently and effectively troubleshooting node or cluster problems as they arise.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Configuring and maintaining large-scale high-performance supercomputing clusters running state-of-the-art GPUs</li>\n<li>Troubleshooting hardware and software issues; escalating and coordinating as needed with data center, network, hardware, and platform teams to drive resolution</li>\n<li>Monitoring and analyzing system performance and taking appropriate remediation actions for cloud health</li>\n<li>Approaching work with flexibility and optimism, anticipating shifting business and technical priorities</li>\n<li>Creating and maintaining documentation of team processes, knowledge, and best practices for system management</li>\n<li>Thinking critically about day-to-day work and working collaboratively to improve team processes and efficiency</li>\n</ul>\n<p>As a member of our team, you will be part of a dynamic and fast-paced environment where you will 
have the opportunity to grow and develop your skills. We offer a competitive salary range of $83,000 to $110,000, as well as a comprehensive benefits package, including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO.</p>\n<p>If you are a motivated and detail-oriented individual who is passionate about working with cutting-edge technology, we encourage you to apply for this exciting opportunity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2ab9c635-07a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4617382006","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$83,000 to $110,000","x-skills-required":["Linux system administration","Troubleshooting hardware and software issues","System maintenance tasks","Scripting languages (bash, python, powershell, etc)","Grafana, Prometheus, PromQL queries or similar observability platforms"],"x-skills-preferred":["Kubernetes administration","HPC - administering GPU-related workloads","Data center environments including server racks, HVAC systems, fiber trays"],"datePosted":"2026-04-18T15:51:55.238Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, NY /Plano, TX /  Bellevue, WA / Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Troubleshooting hardware and software issues, System maintenance tasks, Scripting languages (bash, python, powershell, etc), Grafana, Prometheus, PromQL queries or similar observability platforms, Kubernetes administration, HPC - administering GPU-related workloads, 
Data center environments including server racks, HVAC systems, fiber trays","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":83000,"maxValue":110000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_18013f3c-904"},"title":"Cluster Deployment Engineer","description":"<p>As a Cluster Deployment Engineer at Anthropic, you will own how large-scale AI compute clusters physically come together inside our datacenter fleet.</p>\n<p>You will set the deployment-engineering strategy for cluster build-out: how racks are organized into pods, halls, and sites; how compute, network, power, and cooling systems interface at the rack boundary; and how deployment scope flows cleanly from hardware specification to facility delivery to a running cluster.</p>\n<p>This role is focused on deployment engineering, not on datacenter network or systems design; your scope is making sure clusters land cleanly and predictably, not designing the fabrics or facilities themselves.</p>\n<p>You will work across hardware, networking, facilities, supply chain, and construction to ensure that every generation of accelerator we deploy lands in a datacenter that is ready for it: on schedule, at full density, and with every piece of required infrastructure accounted for.</p>\n<p>You will be the person who sees around corners: anticipating how next-generation rack designs will stress our facilities, where our deployment model will break at scale, and what needs to change now so that the next cluster turn-up is faster and more predictable than the last.</p>\n<p>You will operate at the intersection of engineering strategy and execution discipline, partnering with internal research and systems teams, external developers, engineering firms, and OEM partners to deliver cluster capacity at the speed the frontier demands.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Own 
cluster-level deployment strategy: define how AI compute clusters are organized across the floor, how racks interconnect, and how cluster topology requirements translate into facility and deployment scope across a portfolio of sites.</li>\n</ul>\n<ul>\n<li>Set rack interface standards spanning power, network, mechanical, thermal, and spatial domains, and ensure that every deployment includes the complete set of infrastructure required to bring a cluster online.</li>\n</ul>\n<ul>\n<li>Drive multi-threaded cluster bring-up programs across hardware, networking, power, and cooling, owning plans, dependencies, and critical paths from hardware specification through energization and turn-up.</li>\n</ul>\n<ul>\n<li>Partner with internal engineering teams (research, systems, networking, and hardware) to translate cluster requirements into deployable facility scope, and to derisk onboarding of new hardware platforms well ahead of delivery.</li>\n</ul>\n<ul>\n<li>Lead external partner execution with developers, engineering firms, OEMs, and construction teams, driving technical reviews, deviation management, and handoffs that keep deployments on schedule and within specification.</li>\n</ul>\n<ul>\n<li>Improve cluster turn-up reliability and repeatability: identify systemic gaps in deployment scope, tooling, and partner interfaces, and drive durable fixes that reduce time-to-serve for new capacity.</li>\n</ul>\n<ul>\n<li>Define and track deployment KPIs (cluster readiness, schedule adherence, scope completeness, time-to-first-packet) and use historical trends to forecast risk and inform capacity planning.</li>\n</ul>\n<ul>\n<li>Coordinate cross-functional readiness across supply chain, security, operations, and construction to ship production-ready compute capacity.</li>\n</ul>\n<ul>\n<li>Provide crisp executive visibility on deployment progress, tradeoffs, and risks across a portfolio of concurrent cluster programs.</li>\n</ul>\n<ul>\n<li>Design cluster interfaces for 
durability: define rack and cluster-level interfaces that remain robust across hardware generations, so that facility scope and deployment models do not need to be reinvented every time the underlying hardware changes.</li>\n</ul>\n<ul>\n<li>Build cluster layout and BOM tooling: create and maintain the tools, templates, and data models that turn cluster topology and rack specifications into accurate floor layouts, deployment sequences, and complete bills of materials, replacing one-off spreadsheets with repeatable, auditable workflows.</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 10+ years of experience in hyperscale datacenter environments, with senior-level responsibility for cluster deployment, large-scale IT integration, or equivalent infrastructure programs.</li>\n</ul>\n<ul>\n<li>Have delivered AI, HPC, or high-density compute clusters at scale and developed a strong intuition for the constraints that govern cluster deployment: interconnect reach, adjacency, power density, and thermal limits.</li>\n</ul>\n<ul>\n<li>Can operate fluently across the boundary between IT hardware and facility infrastructure, and have set interface standards that held up across multiple hardware generations and sites.</li>\n</ul>\n<ul>\n<li>Have led cross-functional programs with both internal engineering teams and external developers, engineering firms, and OEM partners, and are effective at driving alignment across organizational levels.</li>\n</ul>\n<ul>\n<li>Combine strong systems thinking with execution discipline: comfortable zooming from cluster topology and portfolio strategy down to the specific interface detail that will otherwise become a field issue.</li>\n</ul>\n<ul>\n<li>Communicate clearly with technical and executive audiences, and can distill complex, multi-disciplinary programs into decisions and tradeoffs leadership can act on.</li>\n</ul>\n<ul>\n<li>Thrive in ambiguous, fast-moving environments where the hardware, the scale, and the 
requirements are all changing simultaneously.</li>\n</ul>\n<ul>\n<li>Hold a Bachelor&#39;s degree in Electrical Engineering, Mechanical Engineering, Computer Engineering, or equivalent practical experience.</li>\n</ul>\n<p>Strong candidates may also:</p>\n<ul>\n<li>Have direct experience deploying leading-edge AI accelerator clusters at hyperscale.</li>\n</ul>\n<ul>\n<li>Have shaped reference designs, deployment standards, or cluster-level playbooks that were adopted across a fleet.</li>\n</ul>\n<ul>\n<li>Have experience working across multiple geographies and understand how regional codes, climate, utility constraints, and supply chains shape cluster-level decisions.</li>\n</ul>\n<ul>\n<li>Have partnered closely with hardware and system providers on long-term platform onboarding and bring-up.</li>\n</ul>\n<ul>\n<li>Have experience building the program mechanisms (roadmaps, milestones, risk registers, runbooks) that make delivery predictable at massive scale.</li>\n</ul>\n<p>The annual compensation range for this role is listed below. 
For sales roles, the range provided is the role’s On Target Earnings (“OTE”) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $320,000-$405,000 USD</p>","url":"https://yubhub.co/jobs/job_18013f3c-904","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5191638008","x-work-arrangement":"remote-hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["Hyperscale datacenter environments","Cluster deployment","Large-scale IT integration","Infrastructure programs","AI","HPC","High-density compute clusters","Interconnect reach","Adjacency","Power density","Thermal limits","IT hardware","Facility infrastructure","Interface standards","Cluster topology","Portfolio strategy","Execution discipline","Systems thinking","Communication","Technical audiences","Executive audiences","Complex programs","Decisions","Tradeoffs","Leadership","Bachelor's degree","Electrical Engineering","Mechanical Engineering","Computer Engineering","Practical experience"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:51:42.505Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly, United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Hyperscale datacenter environments, Cluster deployment, Large-scale IT integration, Infrastructure programs, AI, HPC, High-density compute clusters, Interconnect reach, Adjacency, Power density, Thermal limits, IT hardware, Facility infrastructure, 
Interface standards, Cluster topology, Portfolio strategy, Execution discipline, Systems thinking, Communication, Technical audiences, Executive audiences, Complex programs, Decisions, Tradeoffs, Leadership, Bachelor's degree, Electrical Engineering, Mechanical Engineering, Computer Engineering, Practical experience","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_09c520cf-f62"},"title":"Systems Engineer, Kernel","description":"<p>CoreWeave is seeking a highly skilled and motivated Systems Kernel Engineer to join our HAVOCK Team, reporting into the Manager of Systems Engineering. In this role, you will be a key contributor to the stability, performance, and evolution of CoreWeave&#39;s Linux-based infrastructure.</p>\n<p>As a kernel generalist, you will be responsible for debugging kernel-level issues; analysing and fixing crashes, panics, and dumps; and upstreaming fixes and features that improve the performance and reliability of our stack.</p>\n<p>This position is ideal for someone who thrives in low-level systems engineering, understands how modern workloads stress kernels, and is excited to work across a diverse hardware/software ecosystem including CPUs, GPUs, DPUs, networking, and storage.</p>\n<p>Kernel Hardware - Acceleration - Virtualization - Operating Systems - Containerization - Kubelet</p>\n<p>Our Team&#39;s Stack:</p>\n<ul>\n<li>Python, Go, bash/sh, C</li>\n</ul>\n<ul>\n<li>Prometheus, Victoria Metrics, Grafana</li>\n</ul>\n<ul>\n<li>Linux Kernel (custom build), Ubuntu</li>\n</ul>\n<ul>\n<li>Intel/AMD/ARM CPUs, Nvidia GPUs, DPUs, InfiniBand and Ethernet NICs</li>\n</ul>\n<ul>\n<li>Docker, Kubernetes (k8s), KubeVirt, containerd, kubelet</li>\n</ul>\n<p>Focus Areas:</p>\n<ul>\n<li>Kernel Debugging – Analyse kernel crashes, oopses, 
panics, and dumps to identify root causes and propose fixes.</li>\n</ul>\n<ul>\n<li>Upstream Contributions – Develop patches for the Linux kernel and upstream them where applicable (networking, storage, virtualization, GPU/DPU enablement).</li>\n</ul>\n<ul>\n<li>Stack-Wide Support – Ensure kernel support and stability across:</li>\n</ul>\n<ul>\n<li>Virtualization (KubeVirt, QEMU, vFIO)</li>\n</ul>\n<ul>\n<li>Container runtimes (containerd, nydus, kubelet)</li>\n</ul>\n<ul>\n<li>HPC/AI workloads (CUDA, GPUDirect, RoCE/InfiniBand)</li>\n</ul>\n<ul>\n<li>Kernel-Hardware Enablement – Support new hardware bring-up across Intel, AMD, ARM CPUs, NVIDIA GPUs, DPUs, and NICs.</li>\n</ul>\n<ul>\n<li>Performance &amp; Stability – Tune kernel subsystems for latency, throughput, and scalability in distributed HPC/AI clusters.</li>\n</ul>\n<p>About the role:</p>\n<ul>\n<li>Triage and fix kernel crashes and performance regressions.</li>\n</ul>\n<ul>\n<li>Develop, test, and upstream kernel patches relevant to CoreWeave’s hardware/software environment.</li>\n</ul>\n<ul>\n<li>Collaborate with hardware vendors and the Linux community on feature enablement.</li>\n</ul>\n<ul>\n<li>Implement diagnostics and tooling for kernel-level observability.</li>\n</ul>\n<ul>\n<li>Work closely with HPC and Fleet teams to ensure kernel readiness for production workloads.</li>\n</ul>\n<ul>\n<li>Provide kernel-level expertise during incident response and root-cause investigations.</li>\n</ul>\n<p>Who You Are:</p>\n<ul>\n<li>5+ years of professional experience in Linux kernel engineering or systems-level development.</li>\n</ul>\n<ul>\n<li>Deep understanding of kernel internals (memory management, scheduling, networking, storage, drivers).</li>\n</ul>\n<ul>\n<li>Experience debugging kernel crashes, dumps, and panics using tools like crash, gdb, kdump.</li>\n</ul>\n<ul>\n<li>Strong C programming skills with the ability to write maintainable and upstream-quality code.</li>\n</ul>\n<ul>\n<li>Experience 
working with kernel modules, drivers, and subsystems.</li>\n</ul>\n<ul>\n<li>Strong problem-solving abilities with a “full-stack” systems perspective.</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>Contributions to the Linux kernel or related open-source projects.</li>\n</ul>\n<ul>\n<li>Familiarity with virtualization (KVM, QEMU, VFIO) and container runtimes.</li>\n</ul>\n<ul>\n<li>Networking stack expertise (InfiniBand, RoCE, TCP/IP performance tuning).</li>\n</ul>\n<ul>\n<li>GPU/DPU bring-up and driver experience.</li>\n</ul>\n<ul>\n<li>Experience in HPC or large-scale distributed systems.</li>\n</ul>\n<ul>\n<li>Familiarity with QA/QE best practices</li>\n</ul>\n<ul>\n<li>Experience working in Cloud environments</li>\n</ul>\n<ul>\n<li>Experience as a software engineer writing large-scale applications</li>\n</ul>\n<ul>\n<li>Experience with machine learning is a huge bonus</li>\n</ul>\n<p>The base salary range for this role is $165,000 to $242,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>\n<p>What We Offer</p>\n<p>The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate which can include a variety of factors. 
These include qualifications, experience, interview performance, and location.</p>\n<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>\n<ul>\n<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>\n</ul>\n<ul>\n<li>Company-paid Life Insurance</li>\n</ul>\n<ul>\n<li>Voluntary supplemental life insurance</li>\n</ul>\n<ul>\n<li>Short and long-term disability insurance</li>\n</ul>\n<ul>\n<li>Flexible Spending Account</li>\n</ul>\n<ul>\n<li>Health Savings Account</li>\n</ul>\n<ul>\n<li>Tuition Reimbursement</li>\n</ul>\n<ul>\n<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>\n</ul>\n<ul>\n<li>Mental Wellness Benefits through Spring Health</li>\n</ul>\n<ul>\n<li>Family-Forming support provided by Carrot</li>\n</ul>\n<ul>\n<li>Paid Parental Leave</li>\n</ul>\n<ul>\n<li>Flexible, full-service childcare support with Kinside</li>\n</ul>\n<ul>\n<li>401(k) with a generous employer match</li>\n</ul>\n<ul>\n<li>Flexible PTO</li>\n</ul>\n<ul>\n<li>Catered lunch each day in our office and data center locations</li>\n</ul>\n<ul>\n<li>A casual work environment</li>\n</ul>\n<ul>\n<li>A work culture focused on innovative disruption</li>\n</ul>\n<p>Our Workplace</p>\n<p>While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month. 
Teams also gather quarterly to support collaboration.</p>\n<p>California Consumer Privacy Act - California applicants only</p>","url":"https://yubhub.co/jobs/job_09c520cf-f62","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4599319006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Linux kernel engineering","Systems-level development","C programming","Kernel modules","Drivers","Subsystems","Kernel debugging","Upstream contributions","Stack-wide support","Virtualization","Container runtimes","HPC/AI workloads","Kernel-hardware enablement","Performance & stability"],"x-skills-preferred":["Contributions to the Linux kernel","Networking stack expertise","GPU/DPU bring-up and driver experience","Experience in HPC or large-scale distributed systems","QA/QE best practices","Cloud environments","Software engineer writing large-scale applications","Machine learning"],"datePosted":"2026-04-18T15:51:21.252Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux kernel engineering, Systems-level development, C programming, Kernel modules, Drivers, Subsystems, Kernel debugging, Upstream contributions, Stack-wide support, Virtualization, Container runtimes, HPC/AI workloads, Kernel-hardware enablement, Performance & stability, Contributions to the Linux kernel, Networking stack expertise, GPU/DPU bring-up and driver experience, Experience in HPC or large-scale 
distributed systems, QA/QE best practices, Cloud environments, Software engineer writing large-scale applications, Machine learning","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_ff4d3a91-b20"},"title":"Principal Engineer - Perf and Benchmarking","description":"<p>We&#39;re looking for a Principal Engineer to be the technical lead of CoreWeave&#39;s Benchmarking &amp; Performance team. You will be responsible for our planet-scale performance data warehouse: ingesting, storing, transforming, and analyzing performance events in all the data centers across our global infrastructure.</p>\n<p>You will also be an integral part of achieving industry-leading end-to-end performance benchmarking publications. If MLPerf (Training &amp; Inference), working closely with NVIDIA (Megatron-LM, TensorRT-LLM &amp; DGX Cloud), and engaging with the open-source community (llm-d, vLLM, and all popular ML frameworks) speak to you, come help us demonstrate CoreWeave&#39;s performance and reliability leadership in the field.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Strategy &amp; Leadership - Define the multi-year benchmarking strategy and roadmap; prioritize models/workloads (LLMs, diffusion, vision, speech) and hardware tiers. Build, lead, and mentor a high-performing team of performance engineers and data analysts. Establish governance for claims: documented methodologies, versioning, reproducibility, and audit trails.</li>\n</ul>\n<ul>\n<li>Perf Ownership - Lead end-to-end MLPerf Inference and Training submissions: workload selection, cluster planning, runbooks, audits, and result publication. 
Coordinate optimization tracks with NVIDIA (CUDA, cuDNN, TensorRT/TensorRT-LLM, Triton, NCCL) to hit competitive results; drive upstream fixes where needed.</li>\n</ul>\n<ul>\n<li>Internal Latency &amp; Throughput Benchmarks - Design a Kubernetes-native, repeatable benchmarking service that exercises CoreWeave stacks across SUNK (Slurm on Kubernetes), Kueue, and Kubeflow pipelines. Measure and report p50/p95/p99 latency, jitter, tokens/s, time-to-first-token, cold-start/warm-start, and cost-per-token/request across models, precisions (BF16/FP8/FP4), batch sizes, and GPU types. Maintain a corpus of representative scenarios (streaming, batch, multi-tenant) and data sets; automate comparisons across software releases and hardware generations.</li>\n</ul>\n<ul>\n<li>Tooling &amp; Automation - Build CI/CD pipelines and K8s controllers/operators to schedule benchmarks at scale; integrate with observability stacks (Prometheus, Grafana, OpenTelemetry) and results warehouses. Implement supply-chain integrity for benchmark artifacts (SBOMs, Cosign signatures).</li>\n</ul>\n<ul>\n<li>Cross-functional &amp; Community - Partner with NVIDIA, key ISVs, and OSS projects (vLLM, Triton, KServe, PyTorch/DeepSpeed, ONNX Runtime) to co-develop optimizations and upstream improvements. 
Support Sales/SEs with authoritative numbers for RFPs and competitive evaluations; brief analysts and press with rigorous, defensible data.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>10+ years building distributed systems or HPC/cloud services, with deep expertise on large-scale ML training or similar high-performance workloads.</li>\n</ul>\n<ul>\n<li>Proven track record of architecting or building planet-scale data systems (e.g., telemetry platforms, observability stacks, cloud data warehouses, large-scale OLAP engines).</li>\n</ul>\n<ul>\n<li>Deep understanding of GPU performance (CUDA, NCCL, RDMA, NVLink/PCIe, memory bandwidth), model-server stacks (Triton, vLLM, TensorRT-LLM, TorchServe), and distributed training frameworks (PyTorch FSDP/DeepSpeed/Megatron-LM).</li>\n</ul>\n<ul>\n<li>Proficient with Kubernetes and ML control planes; familiarity with SUNK, Kueue, and Kubeflow in production environments.</li>\n</ul>\n<ul>\n<li>Excellent communicator able to interface with executives, customers, auditors, and OSS communities.</li>\n</ul>\n<p><strong>Nice to have</strong></p>\n<ul>\n<li>Experience with time-series databases, log-structured merge trees (LSM), or custom storage engine development.</li>\n</ul>\n<ul>\n<li>Experience running MLPerf submissions (Inference and/or Training) or equivalent audited benchmarks at scale.</li>\n</ul>\n<ul>\n<li>Contributions to MLPerf, Triton, vLLM, PyTorch, KServe, or similar OSS projects.</li>\n</ul>\n<ul>\n<li>Experience benchmarking multi-region fleets and large clusters (thousands of GPUs).</li>\n</ul>\n<ul>\n<li>Publications/talks on ML performance, latency engineering, or large-scale benchmarking methodology.</li>\n</ul>","url":"https://yubhub.co/jobs/job_ff4d3a91-b20","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4627302006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$206,000 to $333,000","x-skills-required":["Distributed systems","HPC/cloud services","Large-scale ML training","GPU performance","Model-server stacks","Distributed training frameworks","Kubernetes","ML control planes","Time-series databases","Log-structured merge trees","Custom storage engine development"],"x-skills-preferred":["MLPerf submissions","Audited benchmarks","Contributions to OSS projects","Benchmarking multi-region fleets","Large clusters","Publications/talks on ML performance"],"datePosted":"2026-04-18T15:51:17.448Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Distributed systems, HPC/cloud services, Large-scale ML training, GPU performance, Model-server stacks, Distributed training frameworks, Kubernetes, ML control planes, Time-series databases, Log-structured merge trees, Custom storage engine development, MLPerf submissions, Audited benchmarks, Contributions to OSS projects, Benchmarking multi-region fleets, Large clusters, Publications/talks on ML performance","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":206000,"maxValue":333000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7d9bfb5a-511"},"title":"Senior Firmware Engineer, OpenBMC","description":"<p>To accelerate datacenter deployment and 
management, CoreWeave is expanding its firmware engineering team to focus on developing and maintaining OpenBMC-based firmware for our next-generation Baseboard Management Controllers (BMCs).</p>\n<p>As a Senior Firmware Engineer, you will design, implement, and maintain embedded firmware features that enable secure, scalable, and reliable control across CoreWeave&#39;s high-performance compute infrastructure. You will work independently on complex components, collaborate closely with cross-functional teams, and help set best practices for firmware quality and performance.</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li>Design &amp; Implement: Develop and enhance OpenBMC firmware in C++ for CoreWeave&#39;s custom server platforms, contributing to key subsystems such as sensor management, power and thermal control, networking, and system monitoring.</li>\n</ul>\n<ul>\n<li>Integrate &amp; Debug: Collaborate with hardware design, platform software, and reliability teams to integrate firmware with new hardware and validate performance across diverse environments.</li>\n</ul>\n<ul>\n<li>Optimize &amp; Harden: Improve BMC performance and strengthen security.</li>\n</ul>\n<ul>\n<li>Root Cause Analysis: Perform deep system-level debugging using tools such as GDB, JTAG, or logic analyzers to resolve cross-layer issues between hardware, firmware, and OS.</li>\n</ul>\n<ul>\n<li>Automate &amp; Validate: Contribute to continuous integration and automated testing frameworks for OpenBMC build and validation.</li>\n</ul>\n<ul>\n<li>Document &amp; Share: Maintain clear technical documentation and participate in design reviews to ensure consistency and maintainability across the firmware codebase.</li>\n</ul>\n<ul>\n<li>Collaborate Broadly: Partner with other ICs and technical leads across CoreWeave&#39;s infrastructure engineering, hardware design, and operations teams to align firmware capabilities with platform and datacenter goals.</li>\n</ul>\n<p>Minimum Qualifications:</p>\n<ul>\n<li>Experience: 4+ years of 
professional experience in firmware or embedded systems development, including direct work with Linux-based OpenBMC firmware.</li>\n</ul>\n<ul>\n<li>Education: Bachelor&#39;s degree in Computer Engineering, Electrical Engineering, Computer Science, or a related field.</li>\n</ul>\n<p>Technical Skills:</p>\n<ul>\n<li>Proficiency in C/C++ for embedded systems.</li>\n</ul>\n<ul>\n<li>Hands-on experience with OpenBMC, Yocto Project, and embedded Linux environments.</li>\n</ul>\n<ul>\n<li>Familiarity with hardware interfaces and protocols (I2C, SPI, UART, GPIO, IPMI, DMTF Redfish).</li>\n</ul>\n<ul>\n<li>Experience with hardware bring-up, board-level debugging, and sensor integration.</li>\n</ul>\n<ul>\n<li>Comfort with Linux kernel configuration, device trees, and BSP-level integration.</li>\n</ul>\n<ul>\n<li>Working knowledge of source code control systems such as Git.</li>\n</ul>\n<ul>\n<li>Comfort with debugging tools such as GDB and JTAG, and with debugging over serial or remote consoles.</li>\n</ul>\n<ul>\n<li>Basic scripting skills in Python or Bash for build automation and validation.</li>\n</ul>\n<ul>\n<li>Strong problem-solving and analytical thinking; able to break down complex system-level issues.</li>\n</ul>\n<ul>\n<li>Communicates effectively with peers across hardware, firmware, and operations teams.</li>\n</ul>\n<ul>\n<li>Self-driven with a focus on delivering high-quality, maintainable code.</li>\n</ul>\n<ul>\n<li>Thrives in a fast-paced environment and balances multiple priorities effectively.</li>\n</ul>\n<p>Preferred Qualifications:</p>\n<ul>\n<li>Experience developing CI/CD pipelines for firmware builds and regression testing</li>\n</ul>\n<ul>\n<li>Exposure to large-scale datacenter or HPC environments</li>\n</ul>\n<ul>\n<li>Contributions to open-source firmware projects or upstream Linux development</li>\n</ul>\n<p>The base salary range for this role is $153,000 to $242,000. 
The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>\n<p>What We Offer</p>\n<p>The range we&#39;ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate which can include a variety of factors. These include qualifications, experience, interview performance, and location.</p>\n<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>\n<ul>\n<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>\n</ul>\n<ul>\n<li>Company-paid Life Insurance</li>\n</ul>\n<ul>\n<li>Voluntary supplemental life insurance</li>\n</ul>\n<ul>\n<li>Short and long-term disability insurance</li>\n</ul>\n<ul>\n<li>Flexible Spending Account</li>\n</ul>\n<ul>\n<li>Health Savings Account</li>\n</ul>\n<ul>\n<li>Tuition Reimbursement</li>\n</ul>\n<ul>\n<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>\n</ul>\n<ul>\n<li>Mental Wellness Benefits through Spring Health</li>\n</ul>\n<ul>\n<li>Family-Forming support provided by Carrot</li>\n</ul>\n<ul>\n<li>Paid Parental Leave</li>\n</ul>\n<ul>\n<li>Flexible, full-service childcare support with Kinside</li>\n</ul>\n<ul>\n<li>401(k) with a generous employer match</li>\n</ul>\n<ul>\n<li>Flexible PTO</li>\n</ul>\n<ul>\n<li>Catered lunch each day in our office and data center locations</li>\n</ul>\n<ul>\n<li>A casual work environment</li>\n</ul>\n<ul>\n<li>A work culture focused on innovative disruption</li>\n</ul>\n<p>Our Workplace</p>\n<p>While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an 
office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month. Teams also gather quarterly to support collaboration.</p>","url":"https://yubhub.co/jobs/job_7d9bfb5a-511","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4452431006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$153,000 to $242,000","x-skills-required":["C/C++","OpenBMC","Yocto Project","embedded Linux","hardware interfaces","protocols","Linux kernel configuration","device trees","BSP-level integration","source code control system","debugging tools","scripting skills","problem-solving","analytical thinking"],"x-skills-preferred":["CI/CD pipeline","firmware builds","regression testing","large-scale datacenter","HPC environments","open-source firmware projects","upstream Linux development"],"datePosted":"2026-04-18T15:50:51.520Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, OpenBMC, Yocto Project, embedded Linux, hardware interfaces, protocols, Linux kernel configuration, device trees, BSP-level integration, source code control system, debugging tools, scripting skills, problem-solving, analytical thinking, CI/CD pipeline, firmware builds, regression testing, large-scale datacenter, HPC environments, open-source firmware projects, upstream Linux 
development","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":153000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_594b20c4-c28"},"title":"Infrastructure Engineer, Security","description":"<p>We&#39;re looking for an infrastructure engineer to own and evolve the security infrastructure that underpins our foundation models. In this role, you&#39;ll work across compute, storage, networking, and data platforms, making sure our systems are secure, reliable, and built to scale.</p>\n<p>You&#39;ll shape controls, architecture, and tooling so that security is part of how the platform works by default. You&#39;ll partner closely with research and product teams, enabling them to move quickly while keeping our models, data, and environments protected.</p>\n<p>Key responsibilities include:</p>\n<p>Architecting security patterns for platforms and services, including network segmentation, service-to-service authentication, RBAC, and policy enforcement in Kubernetes and cloud environments.</p>\n<p>Managing identity, access, and secrets for humans and services: workload and cross-cloud identity, least-privilege IAM, and secrets management.</p>\n<p>Building secure platforms for data ingestion, processing, and curation: classification, encryption, access controls, and safe sharing patterns across teams.</p>\n<p>Writing threat models and reviewing designs with researchers and engineers to help them ship features and experiments in a safe, scalable way.</p>\n<p>Automating security checks and building guardrails: policy-as-code, secure infrastructure baselines, validation in CI/CD, and tools that make the secure path the easiest one.</p>\n<p>Requirements include:</p>\n<p>Bachelor&#39;s degree or equivalent experience in engineering, or similar.</p>\n<p>Strong background with containers and orchestration (e.g., 
Kubernetes) and how to secure them (namespaces, network policies, pod security, admission controls, etc.).</p>\n<p>Practical experience with Infrastructure as Code (Terraform or similar), including secure patterns for provisioning networks, IAM, and shared services.</p>\n<p>Solid understanding of cloud networking and security: VPCs, load balancers, service discovery, mTLS, firewalls, and zero-trust-style architectures.</p>\n<p>Proficiency with a systems language such as Rust and scripting in Python for building platform components and internal tools.</p>\n<p>Evidence of owning complex, production-critical systems, including debugging issues that span infra, security, and application layers.</p>\n<p>Preferred qualifications include experience with ML infrastructure, GPU clusters, or large-scale training environments, as well as background in AI labs, HPC environments, or ML-heavy organizations.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_594b20c4-c28","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Thinking Machines Lab","sameAs":"https://thinkingmachineslab.com/","logo":"https://logos.yubhub.co/thinkingmachineslab.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/thinkingmachines/jobs/5015964008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$200,000 - $475,000 USD","x-skills-required":["Kubernetes","Infrastructure as Code","Cloud Networking and Security","Systems Language (Rust)","Scripting (Python)"],"x-skills-preferred":["ML Infrastructure","GPU Clusters","Large-Scale Training Environments","AI Labs","HPC Environments"],"datePosted":"2026-04-18T15:50:20.174Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San 
Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Kubernetes, Infrastructure as Code, Cloud Networking and Security, Systems Language (Rust), Scripting (Python), ML Infrastructure, GPU Clusters, Large-Scale Training Environments, AI Labs, HPC Environments","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":200000,"maxValue":475000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_372999e8-579"},"title":"Senior Software Engineer II, AI Workload Orchestration","description":"<p>As a Senior Software Engineer II on the AI Workload Orchestration team, you will help build and operate CoreWeave&#39;s Kubernetes-native platform for admitting, scheduling, and operating AI workloads at scale.</p>\n<p>This platform integrates multiple orchestration and scheduling frameworks such as Kueue, Volcano, and Ray to support modern AI training and inference workflows. 
It complements SUNK (Slurm on Kubernetes) by providing a Kubernetes-first, cloud-native orchestration layer with deep platform integration.</p>\n<p>You will own meaningful components of the platform, drive reliability and performance improvements, and help scale the system as customer demand and workload complexity continue to grow.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Design, build, and operate Kubernetes-native services for AI workload orchestration and scheduling</li>\n<li>Own one or more platform components end-to-end, including design, implementation, testing, and on-call support</li>\n<li>Improve scheduling latency, cluster utilization, and workload reliability through metrics-driven engineering</li>\n<li>Contribute to architectural discussions across services and influence design decisions within the platform</li>\n<li>Work closely with adjacent teams (CKS, infrastructure, managed inference) to ensure clean interfaces and integrations</li>\n<li>Mentor junior engineers and raise the quality bar for code, design, and operations</li>\n</ul>\n<p>About the role:</p>\n<ul>\n<li>5–8 years of professional software engineering experience in distributed systems, cloud infrastructure, or platform engineering</li>\n<li>Strong experience building production systems in Go (Python or C++ a plus)</li>\n<li>Solid understanding of Kubernetes fundamentals, APIs, controllers, and operating services in production</li>\n<li>Experience working with scheduling, resource management, or quota-based systems</li>\n<li>Proven ability to improve system reliability and performance using data and operational metrics</li>\n<li>Comfortable owning services in production and participating in on-call rotations</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>Experience with Kubernetes-native orchestration frameworks such as Kueue, Volcano, Ray, Kubeflow, or Argo Workflows</li>\n<li>Familiarity with GPU-based workloads, ML training, or inference pipelines</li>\n<li>Knowledge of scheduling concepts 
such as quota enforcement, pre-emption, and backfilling</li>\n<li>Experience with reliability practices including SLOs, alerting, and incident response</li>\n<li>Exposure to AI infrastructure, HPC, or large-scale distributed compute environments</li>\n</ul>\n<p>Why CoreWeave?</p>\n<p>At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>The base salary range for this role is $165,000 to $242,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>\n<p>What We Offer</p>\n<p>The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate which can include a variety of factors. 
These include qualifications, experience, interview performance, and location.</p>\n<p>In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>\n<ul>\n<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>\n<li>Company-paid Life Insurance</li>\n<li>Voluntary supplemental life insurance</li>\n<li>Short and long-term disability insurance</li>\n<li>Flexible Spending Account</li>\n<li>Health Savings Account</li>\n<li>Tuition Reimbursement</li>\n<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>\n<li>Mental Wellness Benefits through Spring Health</li>\n<li>Family-Forming support provided by Carrot</li>\n<li>Paid Parental Leave</li>\n<li>Flexible, full-service childcare support with Kinside</li>\n<li>401(k) with a generous employer match</li>\n<li>Flexible PTO</li>\n<li>Catered lunch each day in our office and data center locations</li>\n<li>A casual work environment</li>\n<li>A work culture focused on innovative disruption</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_372999e8-579","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4647595006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Kubernetes","Go","Distributed systems","Cloud infrastructure","Platform engineering","Scheduling","Resource management","Quota-based systems"],"x-skills-preferred":["Kueue","Volcano","Ray","Kubeflow","Argo Workflows","GPU-based workloads","ML training","Inference pipelines","SLOs","Alerting","Incident response","AI infrastructure","HPC","Large-scale distributed compute 
environments"],"datePosted":"2026-04-18T15:50:19.636Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Kubernetes, Go, Distributed systems, Cloud infrastructure, Platform engineering, Scheduling, Resource management, Quota-based systems, Kueue, Volcano, Ray, Kubeflow, Argo Workflows, GPU-based workloads, ML training, Inference pipelines, SLOs, Alerting, Incident response, AI infrastructure, HPC, Large-scale distributed compute environments","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_db7b0f51-7df"},"title":"Senior Cloud Support Engineer","description":"<p>As a Senior Cloud Support Engineer at CoreWeave, you&#39;ll be on the front lines of a technological revolution, empowering our customers to harness the full potential of our advanced Kubernetes-powered HPC cloud infrastructure.</p>\n<p>You&#39;ll be hands-on, collaborating with engineers and researchers to resolve issues that impact high-profile, mission-critical applications and cutting-edge AI training workloads. 
Your contributions will be pivotal in ensuring seamless performance, reliability, and success for our customers, positioning you at the very core of transformative technologies reshaping industries worldwide at a company that is truly one of a kind.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Guide and mentor team members in developing their technical skills and troubleshooting capabilities across all disciplines supported by CoreWeave.</li>\n<li>Provide real-time feedback and coaching, reviewing tickets to identify opportunities for improvement and ensure quality assurance (QA).</li>\n<li>Develop and deliver training sessions to improve the team&#39;s proficiency and efficiency in resolving customer issues.</li>\n<li>Use technical expertise to investigate, debug, and resolve customer-impacting issues with the curiosity required to uncover and understand root causes.</li>\n<li>Maintain high customer satisfaction through swift, accurate, and empathetic high-touch support communications, as well as established best practices.</li>\n<li>Help design and implement troubleshooting best practices to ensure fast, accurate client resolutions.</li>\n<li>Contribute to refining processes, workflows, and playbooks for handling complex customer challenges.</li>\n<li>Serve as a technical escalation point for high-priority or complex cases, modeling effective problem-solving approaches.</li>\n<li>Lead the creation of knowledge-sharing resources, including documentation, tutorials, and how-to guides.</li>\n<li>Enhance the support team&#39;s knowledge of CoreWeave&#39;s products and services through continuous learning initiatives.</li>\n</ul>\n<p>Who You Are:</p>\n<ul>\n<li>Have a Bachelor&#39;s degree in Information Science / Information Technology, Data Science, Computer Science, Engineering, Mathematics, Physics, or a related field, OR equivalent experience in a technical position</li>\n<li>At least 5 years of experience in cloud support, systems administration, 
or related technical support-focused roles</li>\n<li>Proven hands-on work experience with Kubernetes</li>\n<li>Experience with networking, load balancing, storage volumes, observability, node management, High-Performance Computing (HPC), and Linux system administration</li>\n<li>Proven ability to mentor team members, foster technical growth, and improve team-wide capabilities through guidance and feedback</li>\n<li>Experience with observability tools such as Grafana</li>\n<li>Strong troubleshooting skills, with experience resolving complex customer issues and driving quality assurance through ticket reviews or similar processes</li>\n<li>Demonstrated success collaborating with cross-functional teams to refine workflows, implement best practices, and advocate for necessary tools or process changes</li>\n<li>Excellent written and verbal communication skills, with a track record of simplifying complex concepts for diverse audiences</li>\n<li>Strong technical presentation skills, with experience delivering precise, engaging, and informative presentations to technical and non-technical audiences, effectively showcasing complex concepts and solutions</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>CKA Certified</li>\n<li>Demonstrated experience with training, coaching, and creating onboarding materials.</li>\n<li>Operates in a fast-paced, global, 24/7 support team environment</li>\n<li>Ability to collaborate across different time zones</li>\n<li>On-site office environment, hybrid, or remote options depending on location</li>\n<li>Flexible to travel up to 10% (~25 days/year)</li>\n</ul>\n<p>Why CoreWeave?</p>\n<p>At CoreWeave, we work hard, have fun, and move fast! We&#39;re in an exciting stage of hyper-growth that you will not want to miss out on. We&#39;re not afraid of a little chaos, and we&#39;re constantly learning. 
Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for take off, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.</p>\n<p>Come join us!</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_db7b0f51-7df","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4568136006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$122,000 to $163,000","x-skills-required":["cloud support","systems administration","Kubernetes","networking","load balancing","storage volumes","observability","node management","High-Performance Computing (HPC)","Linux system administration"],"x-skills-preferred":["CKA Certified","training","coaching","onboarding materials","fast-paced global support team environment","collaboration across different time zones"],"datePosted":"2026-04-18T15:49:50.841Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, 
WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud support, systems administration, Kubernetes, networking, load balancing, storage volumes, observability, node management, High-Performance Computing (HPC), Linux system administration, CKA Certified, training, coaching, onboarding materials, fast-paced global support team environment, collaboration across different time zones","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":122000,"maxValue":163000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_0f249232-d14"},"title":"Principal Engineer, Cluster Orchestration","description":"<p>As a Principal Engineer in AI Infrastructure, you will lead the design and evolution of the cluster orchestration systems that make this possible. This includes Slurm, Kubernetes, SUNK, and the control planes that support AI training, inference, and model onboarding at scale.</p>\n<p>You will define long-term architecture, solve hard scaling problems, and set technical direction across teams. 
Your work will directly affect how quickly customers can run models, how efficiently we use GPUs, and how reliably the platform behaves at scale.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Defining the long-term architecture for CoreWeave&#39;s orchestration platforms across Kubernetes, Slurm, SUNK, Kueue, and related systems.</li>\n<li>Acting as a technical authority on scheduling, quota enforcement, fairness, pre-emption, and multi-tenant GPU isolation.</li>\n<li>Making design decisions that balance performance, reliability, cost, and operational complexity.</li>\n</ul>\n<p>In addition to these responsibilities, you will also lead the evolution of Kubernetes-native control planes, including SUNK and custom operators, and design systems that support workload admission, validation, and rollout, including model onboarding flows.</p>\n<p>You will work closely with cross-functional teams to ensure that the systems you design and implement meet the needs of our customers and are scalable, reliable, and efficient.</p>\n<p>If you have a passion for building large-scale distributed systems and are looking for a challenging and rewarding role, we encourage you to apply.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_0f249232-d14","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4658799006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$206,000 to $303,000","x-skills-required":["Kubernetes","Slurm","SUNK","Go","Cloud-native systems development","GPU-heavy platforms for AI training, inference, or HPC workloads"],"x-skills-preferred":["Kueue","Kubeflow","Argo 
Workflows","Ray","Istio","Knative"],"datePosted":"2026-04-18T15:48:07.140Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bellevue, WA / Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Kubernetes, Slurm, SUNK, Go, Cloud-native systems development, GPU-heavy platforms for AI training, inference, or HPC workloads, Kueue, Kubeflow, Argo Workflows, Ray, Istio, Knative","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":206000,"maxValue":303000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a14533c3-732"},"title":"Senior Engineer, Cilium CNI & Cloud Networking","description":"<p>Network Services Team</p>\n<p>The Network Services team builds and operates the foundational networking that powers CoreWeave&#39;s Kubernetes platforms at cloud scale. The team is responsible for container networking, connectivity, and network services that support large-scale, GPU-driven workloads across regions and environments. They focus on scalability, reliability, security, and performance while delivering intuitive platforms for internal teams and customers.</p>\n<p>About the Role</p>\n<p>As a Senior Engineer focused on our Cilium-based CNI, you will design, build, and operate the container networking layer that underpins CoreWeave&#39;s Kubernetes platforms. Day to day, you will work on evolving our CNI stack to support large, high-density GPU clusters with demanding throughput and latency requirements. You will partner closely with Kubernetes, Infrastructure, and Network Services engineers to ensure the platform is highly available, observable, and secure. This role spans architecture, implementation, and operations, with ownership from prototype through production. 
You will also help shape how our networking platform scales for future growth.</p>\n<p>Who You Are</p>\n<ul>\n<li>5+ years of experience as a Software Engineer or Systems Engineer working on cloud infrastructure or large-scale distributed systems.</li>\n<li>Hands-on production experience with Cilium CNI (or equivalent advanced CNIs), including cluster configuration and lifecycle management.</li>\n<li>Strong understanding of Cilium&#39;s eBPF datapath, policy model, and load-balancing mechanisms.</li>\n<li>Deep knowledge of cloud networking concepts, including VPCs, subnets, routing, security groups/ACLs, NAT, and ingress/egress architectures.</li>\n<li>Experience designing multi-tenant network architectures with strong isolation and security.</li>\n<li>Solid grounding in TCP/IP, dynamic routing (e.g., BGP), ECMP, MTU/fragmentation, and overlay/underlay networking (VXLAN, Geneve, encapsulation).</li>\n<li>Experience with network observability and troubleshooting across L3–L7.</li>\n<li>Proficiency in at least one systems language such as Golang or C/C++.</li>\n<li>Experience working in modern CI/CD environments.</li>\n<li>Experience operating Kubernetes at scale, including cluster lifecycle management and debugging networking issues across pods, nodes, and external services.</li>\n<li>Demonstrated ownership of complex systems end-to-end.</li>\n</ul>\n<p>Preferred</p>\n<ul>\n<li>Experience operating cloud-scale network services across tens of thousands of nodes and multiple regions.</li>\n<li>Contributions to Cilium, Kubernetes, or related open-source networking projects.</li>\n<li>Experience with eBPF development and performance tuning.</li>\n<li>Experience building Kubernetes operators or controllers.</li>\n<li>Familiarity with service meshes, multi-cluster networking, or cluster mesh solutions.</li>\n<li>Experience in GPU-heavy, HPC, or other performance-sensitive environments.</li>\n</ul>\n<p>Wondering if you’re a good fit?</p>\n<p>We believe in investing in our 
people and value candidates who bring diverse experiences, even if you’re not a 100% match on paper. If some of this sounds like you, we’d love to talk.</p>\n<ul>\n<li>You love solving complex distributed systems and networking challenges at scale.</li>\n<li>You’re curious about cloud-native networking, eBPF, and Kubernetes internals.</li>\n<li>You’re an expert in building reliable, scalable infrastructure that runs in production.</li>\n</ul>\n<p>Why CoreWeave?</p>\n<p>At CoreWeave, we work hard, have fun, and move fast! We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>The base salary range for this role is $165,000 to $242,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. We strive for both market alignment and internal equity when determining compensation. In addition to base salary, our total rewards package includes a discretionary bonus, equity awards, and a comprehensive benefits program (all based on eligibility).</p>\n<p>What We Offer</p>\n<p>The range we’ve posted represents the typical compensation range for this role. To determine actual compensation, we review the market rate for each candidate, which can include a variety of factors. These include qualifications, experience, interview performance, and location. 
In addition to a competitive salary, we offer a variety of benefits to support your needs, including:</p>\n<ul>\n<li>Medical, dental, and vision insurance - 100% paid for by CoreWeave</li>\n<li>Company-paid Life Insurance</li>\n<li>Voluntary supplemental life insurance</li>\n<li>Short and long-term disability insurance</li>\n<li>Flexible Spending Account</li>\n<li>Health Savings Account</li>\n<li>Tuition Reimbursement</li>\n<li>Ability to Participate in Employee Stock Purchase Program (ESPP)</li>\n<li>Mental Wellness Benefits through Spring Health</li>\n<li>Family-Forming support provided by Carrot</li>\n<li>Paid Parental Leave</li>\n<li>Flexible, full-service childcare support with Kinside</li>\n<li>401(k) with a generous employer match</li>\n<li>Flexible PTO</li>\n<li>Catered lunch each day in our office and data center locations</li>\n<li>A casual work environment</li>\n<li>A work culture focused on innovative disruption</li>\n</ul>\n<p>Our Workplace</p>\n<p>While we prioritize a hybrid work environment, remote work may be considered for candidates located more than 30 miles from an office, based on role requirements for specialized skill sets. New hires will be invited to attend onboarding at one of our hubs within their first month. Teams also gather quarterly to support collaboration.</p>\n<p>California Consumer Privacy Act - California applicants only</p>\n<p>CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants and candidates will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information. 
As part of this commitment and consistent with the Americans with Disabilities Act (ADA), CoreWeave will ensure that qualified applicants and candidates with disabilities are provided reasonable accommodations for the hiring process, unless such accommodation would cause an undue hardship. If reasonable accommodation is needed, please contact: careers@coreweave.com.</p>\n<p>Export Control Compliance</p>\n<p>This position requires access to export controlled information. To conform to U.S. Government export regulations applicable to that information, applicant must either be (A) a U.S. person, defined as a (i) U.S. citizen or national, (ii) U.S. lawful permanent resident (green card holder), (iii) refugee under 8 U.S.C. § 1157, or (iv) asylee under 8 U.S.C. § 1158, (B) eligible to access the export controlled information without a required export authorization, or (C) eligible and reasonably likely to obtain the required export authorization from the applicable U.S. government agency. CoreWeave may, for legitimate business reasons, decline to pursue any export licensing process.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a14533c3-732","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4653971006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$165,000 to $242,000","x-skills-required":["Cilium CNI","cloud infrastructure","large-scale distributed systems","container networking","connectivity","network services","Kubernetes","eBPF datapath","policy model","load-balancing mechanisms","cloud networking concepts","VPCs","subnets","routing","security groups/ACLs","NAT","ingress/egress 
architectures","TCP/IP","dynamic routing","ECMP","MTU/fragmentation","overlay/underlay networking","Golang","C/C++","CI/CD environments","Kubernetes at scale","cluster lifecycle management","debugging networking issues"],"x-skills-preferred":["cloud-scale network services","Cilium","eBPF development","performance tuning","Kubernetes operators","controllers","service meshes","multi-cluster networking","cluster mesh solutions","GPU-heavy","HPC","performance-sensitive environments"],"datePosted":"2026-04-18T15:47:58.336Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cilium CNI, cloud infrastructure, large-scale distributed systems, container networking, connectivity, network services, Kubernetes, eBPF datapath, policy model, load-balancing mechanisms, cloud networking concepts, VPCs, subnets, routing, security groups/ACLs, NAT, ingress/egress architectures, TCP/IP, dynamic routing, ECMP, MTU/fragmentation, overlay/underlay networking, Golang, C/C++, CI/CD environments, Kubernetes at scale, cluster lifecycle management, debugging networking issues, cloud-scale network services, Cilium, eBPF development, performance tuning, Kubernetes operators, controllers, service meshes, multi-cluster networking, cluster mesh solutions, GPU-heavy, HPC, performance-sensitive environments","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":165000,"maxValue":242000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cbaf9906-291"},"title":"Platform Hardware Security","description":"<p>We&#39;re seeking a Platform Hardware Security Engineer to design and implement security architectures for bare-metal infrastructure. 
You&#39;ll work with teams across Anthropic to build firmware, bootloaders, operating systems, and attestation systems to ensure the integrity of our infrastructure from the ground up.</p>\n<p>This role requires expertise in low-level systems security and the ability to architect solutions that balance security requirements with the performance demands of training AI models across our massive fleet.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Design and implement secure boot chains from firmware through OS initialization for diverse hardware platforms (CPUs, BMCs, switches, peripherals, and embedded microcontrollers)</li>\n</ul>\n<ul>\n<li>Architect attestation systems that provide cryptographic proof of system state from hardware root of trust through application layer</li>\n</ul>\n<ul>\n<li>Develop measured boot implementations and runtime integrity monitoring</li>\n</ul>\n<ul>\n<li>Create reference architectures and security requirements for bare-metal deployments</li>\n</ul>\n<ul>\n<li>Integrate security controls with infrastructure teams without impacting training performance</li>\n</ul>\n<ul>\n<li>Prototype and validate security mechanisms before production deployment</li>\n</ul>\n<ul>\n<li>Conduct firmware vulnerability assessments and penetration testing</li>\n</ul>\n<ul>\n<li>Build firmware analysis pipelines for continuous security monitoring</li>\n</ul>\n<ul>\n<li>Document security architectures and maintain threat models</li>\n</ul>\n<ul>\n<li>Collaborate with software and hardware vendors to ensure security capabilities meet our requirements</li>\n</ul>\n<p>Who you are:</p>\n<ul>\n<li>8+ years of experience in systems security, with at least 5 years focused on firmware and hardware security (firmware, bootloaders, and OS-level security)</li>\n</ul>\n<ul>\n<li>Hands-on experience with secure boot, measured boot, and attestation technologies (TPM, Intel TXT, AMD SEV, ARM TrustZone)</li>\n</ul>\n<ul>\n<li>Strong understanding of cryptographic protocols and 
hardware security modules</li>\n</ul>\n<ul>\n<li>Experience with UEFI/BIOS or embedded firmware security, bootloader hardening, and chain of trust implementation</li>\n</ul>\n<ul>\n<li>Proficiency in low-level programming (C, Rust, Assembly) and systems programming</li>\n</ul>\n<ul>\n<li>Knowledge of firmware vulnerability assessment and threat modeling</li>\n</ul>\n<ul>\n<li>Track record of designing security architectures for complex, distributed systems</li>\n</ul>\n<ul>\n<li>Experience with supply chain security</li>\n</ul>\n<ul>\n<li>Ability to work effectively across hardware and software boundaries</li>\n</ul>\n<ul>\n<li>Knowledge of NIST firmware security guidelines and hardware security frameworks</li>\n</ul>\n<p>Strong candidates may also have:</p>\n<ul>\n<li>Experience with confidential computing technologies and hardware-based TEEs</li>\n</ul>\n<ul>\n<li>Knowledge of SLSA framework and software supply chain security standards</li>\n</ul>\n<ul>\n<li>Experience securing large-scale HPC or cloud infrastructure</li>\n</ul>\n<ul>\n<li>Contributions to open-source security projects (coreboot, CHIPSEC, etc.)</li>\n</ul>\n<ul>\n<li>Background in formal verification or security proof techniques</li>\n</ul>\n<ul>\n<li>Experience with silicon root of trust implementations</li>\n</ul>\n<ul>\n<li>Experience building foundational technical designs, providing operational leadership, and collaborating with vendors</li>\n</ul>\n<ul>\n<li>Previous work with AI/ML infrastructure security</li>\n</ul>\n<p>Annual Salary: $405,000-$485,000 USD</p>\n<p>Logistics:</p>\n<ul>\n<li>Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience</li>\n</ul>\n<ul>\n<li>Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience</li>\n</ul>\n<ul>\n<li>Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the 
position</li>\n</ul>\n<ul>\n<li>Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</li>\n</ul>\n<ul>\n<li>Visa sponsorship: We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>\n</ul>\n<p>Why work with us?</p>\n<ul>\n<li>Competitive compensation and benefits</li>\n</ul>\n<ul>\n<li>Optional equity donation matching</li>\n</ul>\n<ul>\n<li>Generous vacation and parental leave</li>\n</ul>\n<ul>\n<li>Flexible working hours</li>\n</ul>\n<ul>\n<li>Lovely office space in which to collaborate with colleagues</li>\n</ul>\n<p>Guidance on Candidates&#39; AI Usage: Learn about our policy for using AI in our application process</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cbaf9906-291","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4929689008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$405,000-$485,000 USD","x-skills-required":["Secure boot","Measured boot","Attestation technologies","Cryptographic protocols","Hardware security modules","UEFI/BIOS or embedded firmware security","Bootloader hardening","Chain of trust implementation","Low-level programming","Systems programming","Firmware vulnerability assessment","Threat modeling","Supply chain security","NIST firmware security guidelines","Hardware security frameworks"],"x-skills-preferred":["Confidential computing 
technologies","Hardware-based TEEs","SLSA framework","Software supply chain security standards","Large-scale HPC or cloud infrastructure","Open-source security projects","Formal verification","Security proof techniques","Silicon root of trust implementations","Vendor collaboration","AI/ML infrastructure security"],"datePosted":"2026-04-18T15:43:00.394Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City, NY | Seattle, WA; San Francisco, CA | New York City, NY | Seattle, WA; Washington, DC"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Secure boot, Measured boot, Attestation technologies, Cryptographic protocols, Hardware security modules, UEFI/BIOS or embedded firmware security, Bootloader hardening, Chain of trust implementation, Low-level programming, Systems programming, Firmware vulnerability assessment, Threat modeling, Supply chain security, NIST firmware security guidelines, Hardware security frameworks, Confidential computing technologies, Hardware-based TEEs, SLSA framework, Software supply chain security standards, Large-scale HPC or cloud infrastructure, Open-source security projects, Formal verification, Security proof techniques, Silicon root of trust implementations, Vendor collaboration, AI/ML infrastructure security","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9cd0420a-99d"},"title":"Network Engineer, Capacity and Efficiency","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re looking for a network engineer who thinks in metrics first. 
You will use deep networking knowledge and rigorous measurement to figure out where and how bandwidth, latency, and dollars are being used, find optimization opportunities, and land them.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Build the network observability stack. Design and deploy telemetry pipelines (sFlow/IPFIX, gNMI streaming, eBPF host probes) that turn packet counters into per-flow, per-tenant, per-workload cost and utilization data. Own the SLIs for backbone and DCN fabric health.</li>\n<li>Hunt for efficiency. Analyze inter-region traffic patterns, identify hot links and stranded capacity, and quantify the dollar impact. Build the models that tell us whether we should buy more capacity or move the workload.</li>\n<li>Own QoS and traffic engineering. Design and operate traffic classification, marking, and shaping across the backbone. Make sure bulk checkpoint transfers don’t starve latency-sensitive inference, and that we’re not paying premium cross-region rates for traffic that could take the cheap path.</li>\n<li>Drive cost attribution. Tie network spend (egress, interconnect ports, transit, optical leases) back to the teams and workloads that generate it. Make network cost a first-class input to capacity planning and workload placement decisions.</li>\n<li>Influence decisions you don&#39;t own. 
A large fraction of this role is convincing other teams to act on what your data shows: making the case to research that a traffic pattern needs to change, to finance that an interconnect tranche is worth buying, and to Systems Networking that a QoS policy needs rewriting.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Have 5+ years operating large-scale production networks: data center fabrics (spine-leaf, Clos), backbone/WAN, or hyperscaler-adjacent environments.</li>\n<li>Are genuinely fluent across the stack: BGP (including policy and communities), ECMP, VXLAN/EVPN or equivalent overlays, QoS (DSCP, queuing, shaping), and L1/optical basics (DWDM, coherent, LAGs).</li>\n<li>Know at least one major CSP’s networking model deeply, whether AWS (VPC, TGW, Direct Connect, Gateway Load Balancer) or GCP (Shared VPC, Interconnect, Cloud Router, Network Connectivity Center), and understand how their overlays interact with physical underlays.</li>\n<li>Have built or operated network telemetry at scale: streaming telemetry (gNMI/OpenConfig), flow export (sFlow, IPFIX, NetFlow), or eBPF-based host-side instrumentation. You can reason about sampling, cardinality, and storage tradeoffs.</li>\n<li>Comfortable writing Python or Go to build tooling (telemetry pipelines, infrastructure-as-code, config management for network devices, and automation) that you’ll ship to production.</li>\n<li>Think quantitatively by default. You reach for a notebook or a Grafana query before you reach for an opinion, and you can turn messy counter data into a defensible cost model.</li>\n<li>Communicate crisply. 
You can explain to a finance partner why a 10% egress reduction matters, and to a network engineer why a specific ECMP imbalance is costing real money.</li>\n</ul>\n<p><strong>Nice to Have</strong></p>\n<ul>\n<li>SRE experience for large-scale network infrastructure: designing for reliability, defining SLOs/SLIs for network services, capacity planning with error budgets, and incident response for network-impacting outages at scale.</li>\n<li>Background on a cloud provider&#39;s networking team or a cloud networking product team: building or operating the interconnect, backbone, or SDN control plane from the provider side, not just consuming it as a customer.</li>\n<li>Familiarity with AI/ML infrastructure traffic patterns such as collective communication (all-reduce, all-gather), checkpoint/weight transfer, and inference serving, and how these workloads stress networks differently than traditional ones in terms of burst behavior, flow synchronization, and bandwidth symmetry.</li>\n<li>Experience with HPC fabrics like InfiniBand, RoCE v2, lossless Ethernet, or custom high-radix topologies, and an understanding of how job placement, congestion management, and adaptive routing interact at scale.</li>\n<li>Background in traffic engineering for large backbones and the operational judgment to know when TE is worth the complexity.</li>\n<li>Hands-on time with multi-cloud connectivity: cross-cloud peering, private interconnect products, and the billing models that come with them.</li>\n<li>Experience building cost/chargeback systems for shared infrastructure, or FinOps exposure in a large cloud environment.</li>\n</ul>\n<p><strong>Representative Projects</strong></p>\n<ul>\n<li>Build a per-flow cost attribution pipeline that traces every byte of cross-region egress back to the team and workload that generated it</li>\n<li>Design QoS policy for the private backbone that prevents bulk checkpoint transfers from starving inference traffic</li>\n<li>Model whether it&#39;s cheaper to buy an 
additional 1.6Tb interconnect tranche or to re-route traffic through existing capacity</li>\n<li>Instrument DCN fabric utilization with streaming telemetry and build the Grafana dashboards that become the team&#39;s source of truth for network observability</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9cd0420a-99d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://anthropic.com","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5177143008","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["network engineering","network observability","telemetry pipelines","sFlow/IPFIX","gNMI streaming","eBPF host probes","BGP","ECMP","VXLAN/EVPN","QoS","DSCP","queuing","shaping","L1/optical basics","DWDM","coherent","LAGs","AWS","GCP","cloud networking","infrastructure-as-code","config management","automation","Python","Go","quantitative analysis","cost modeling","communication"],"x-skills-preferred":["SRE","cloud provider's networking team","cloud networking product team","AI/ML infrastructure traffic patterns","HPC fabrics","traffic engineering","multi-cloud connectivity","cost/chargeback systems","FinOps"],"datePosted":"2026-04-18T15:42:29.482Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, CA | New York City, NY"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"network engineering, network observability, telemetry pipelines, sFlow/IPFIX, gNMI streaming, eBPF host probes, BGP, ECMP, VXLAN/EVPN, QoS, DSCP, queuing, shaping, L1/optical basics, DWDM, coherent, LAGs, AWS, GCP, cloud networking, infrastructure-as-code, config management, automation, 
Python, Go, quantitative analysis, cost modeling, communication, SRE, cloud provider's networking team, cloud networking product team, AI/ML infrastructure traffic patterns, HPC fabrics, traffic engineering, multi-cloud connectivity, cost/chargeback systems, FinOps"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a0f9f4e9-96d"},"title":"Data Center Design Execution Lead","description":"<p>We are seeking a Data Center Design Execution Lead to join our Infrastructure team. As a key member of our team, you will be responsible for driving the execution of our technical requirements for third-party data center delivery partners. You will define the design execution framework, drive accountability for partner deliverable quality, and partner with external design teams and stakeholders to execute design document development and issuance.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Drive execution of Anthropic&#39;s technical requirements for third-party data center delivery partners, ensuring design intent is consistently translated from BOD through construction documents.</li>\n<li>Define the design execution framework (deliverable requirements, review gates, and quality standards) for each partner engagement.</li>\n<li>Drive accountability for partner deliverable quality across milestones, evaluating deliverables for cross-discipline consistency and alignment to Anthropic&#39;s requirements.</li>\n<li>Partner with external design teams and stakeholders to execute design document development and issuance across all project phases.</li>\n</ul>\n<p>Change Management &amp; Technical Continuity:</p>\n<ul>\n<li>Own the design change management process across projects, ensuring technical decisions are documented, resolved, and implemented through construction.</li>\n<li>Review and develop responses to contractor RFIs, maintaining design intent while accommodating field conditions.</li>\n<li>Own technical 
continuity across design, construction, commissioning, and turnover, driving alignment between internal leads on phase transition standards and acceptance criteria.</li>\n<li>Support project closeout and as-built documentation processes.</li>\n</ul>\n<p>Constructability &amp; Risk Management:</p>\n<ul>\n<li>Facilitate constructability review processes between design and construction teams, identifying integration risks and maintaining alignment between design intent and field execution as conditions evolve.</li>\n<li>Identify and mitigate design risks; ensure robust QA/QC practices across the design and construction lifecycle.</li>\n<li>Perform project site reviews and deliver technical reports on design compliance and construction progress.</li>\n</ul>\n<p>Cross-Functional Coordination:</p>\n<ul>\n<li>Interface with internal stakeholders (construction execution, supply chain, facilities operations, and security) to align design solutions with technical and business objectives.</li>\n<li>Partner with internal teams to define infrastructure requirements and translate them into actionable design criteria for external partners.</li>\n<li>Identify and implement opportunities for process improvements, design optimization, and schedule/performance/cost tradeoffs.</li>\n<li>Review commissioning scripts and final reports to validate performance and functionality against Anthropic&#39;s standards.</li>\n</ul>\n<p>Qualifications:</p>\n<ul>\n<li>10+ years in data center or mission-critical infrastructure delivery, spanning design and delivery phases.</li>\n<li>Direct experience in owner&#39;s engineer or technical oversight roles, not purely design production or purely project management.</li>\n<li>Working knowledge of mechanical, electrical, and cooling systems in data center environments.</li>\n<li>Experience managing design change processes, RFIs, and construction documentation workflows.</li>\n<li>Track record of coordinating across multiple disciplines and organizations on 
complex infrastructure projects.</li>\n<li>Deep knowledge of industry standards, building codes, and safety standards applicable to mission-critical facilities.</li>\n<li>Comfortable operating with authority in ambiguous environments where processes need to be built, not just followed.</li>\n<li>BS in Mechanical Engineering, Electrical Engineering, Architecture, or related field.</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>Professional Engineer (PE) or Registered Architect (RA) license.</li>\n<li>Experience stamping construction drawing packages.</li>\n<li>Proficiency with Revit/BIM, Autodesk, or similar design software applications.</li>\n<li>Experience integrating sustainability, energy efficiency, and resiliency goals into data center system designs.</li>\n<li>Direct experience with large-scale AI/HPC infrastructure, including high-density cooling systems and power distribution at 50+ MW scale.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a0f9f4e9-96d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5157023008","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["data center design","infrastructure delivery","technical oversight","mechanical engineering","electrical engineering","cooling systems","design change management","constructability review","risk management","cross-functional coordination"],"x-skills-preferred":["Revit/BIM","Autodesk","sustainability","energy efficiency","resiliency","large-scale AI/HPC 
infrastructure"],"datePosted":"2026-04-18T15:41:28.603Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly, United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data center design, infrastructure delivery, technical oversight, mechanical engineering, electrical engineering, cooling systems, design change management, constructability review, risk management, cross-functional coordination, Revit/BIM, Autodesk, sustainability, energy efficiency, resiliency, large-scale AI/HPC infrastructure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_60082588-bf0"},"title":"Cluster Deployment Engineer","description":"<p>As a Cluster Deployment Engineer at Anthropic, you will own how large-scale AI compute clusters physically come together inside our datacenter fleet.</p>\n<p>You will set the deployment-engineering strategy for cluster build-out: how racks are organized into pods, halls, and sites; how compute, network, power, and cooling systems interface at the rack boundary; and how deployment scope flows cleanly from hardware specification to facility delivery to a running cluster.</p>\n<p>This role is focused on deployment engineering, not on datacenter network or systems design: your scope is making sure clusters land cleanly and predictably, not designing the fabrics or facilities themselves.</p>\n<p>You will work across hardware, networking, facilities, supply chain, and construction to ensure that every generation of accelerator we deploy lands in a datacenter that is ready for it: on schedule, at full density, and with every piece of required infrastructure accounted for.</p>\n<p>You will be the person 
who sees around corners: anticipating how next-generation rack designs will stress our facilities, where our deployment model will break at scale, and what needs to change now so that the next cluster turn-up is faster and more predictable than the last.</p>\n<p>You will operate at the intersection of engineering strategy and execution discipline, partnering with internal research and systems teams, external developers, engineering firms, and OEM partners to deliver cluster capacity at the speed the frontier demands.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Own cluster-level deployment strategy: define how AI compute clusters are organized across the floor, how racks interconnect, and how cluster topology requirements translate into facility and deployment scope across a portfolio of sites.</li>\n</ul>\n<ul>\n<li>Set rack interface standards spanning power, network, mechanical, thermal, and spatial domains, and ensure that every deployment includes the complete set of infrastructure required to bring a cluster online.</li>\n</ul>\n<ul>\n<li>Drive multi-threaded cluster bring-up programs across hardware, networking, power, and cooling, owning plans, dependencies, and critical paths from hardware specification through energization and turn-up.</li>\n</ul>\n<ul>\n<li>Partner with internal engineering teams (research, systems, networking, and hardware) to translate cluster requirements into deployable facility scope, and to derisk onboarding of new hardware platforms well ahead of delivery.</li>\n</ul>\n<ul>\n<li>Lead external partner execution with developers, engineering firms, OEMs, and construction teams, driving technical reviews, deviation management, and handoffs that keep deployments on schedule and within specification.</li>\n</ul>\n<ul>\n<li>Improve cluster turn-up reliability and repeatability: identify systemic gaps in deployment scope, tooling, and partner interfaces, and drive durable fixes that reduce time-to-serve for new 
capacity.</li>\n</ul>\n<ul>\n<li>Define and track deployment KPIs (cluster readiness, schedule adherence, scope completeness, time-to-first-packet) and use historical trends to forecast risk and inform capacity planning.</li>\n</ul>\n<ul>\n<li>Coordinate cross-functional readiness across supply chain, security, operations, and construction to ship production-ready compute capacity.</li>\n</ul>\n<ul>\n<li>Provide crisp executive visibility on deployment progress, tradeoffs, and risks across a portfolio of concurrent cluster programs.</li>\n</ul>\n<ul>\n<li>Design cluster interfaces for durability: define rack and cluster-level interfaces that remain robust across hardware generations, so that facility scope and deployment models do not need to be reinvented every time the underlying hardware changes.</li>\n</ul>\n<ul>\n<li>Build cluster layout and BOM tooling: create and maintain the tools, templates, and data models that turn cluster topology and rack specifications into accurate floor layouts, deployment sequences, and complete bills of materials, replacing one-off spreadsheets with repeatable, auditable workflows.</li>\n</ul>\n<p>You may be a good fit if you:</p>\n<ul>\n<li>Have 10+ years of experience in hyperscale datacenter environments, with senior-level responsibility for cluster deployment, large-scale IT integration, or equivalent infrastructure programs.</li>\n</ul>\n<ul>\n<li>Have delivered AI, HPC, or high-density compute clusters at scale and developed a strong intuition for the constraints that govern cluster deployment: interconnect reach, adjacency, power density, and thermal limits.</li>\n</ul>\n<ul>\n<li>Can operate fluently across the boundary between IT hardware and facility infrastructure, and have set interface standards that held up across multiple hardware generations and sites.</li>\n</ul>\n<ul>\n<li>Have led cross-functional programs with both internal engineering teams and external developers, engineering firms, and OEM partners, and 
are effective at driving alignment across organizational levels.</li>\n</ul>\n<ul>\n<li>Combine strong systems thinking with execution discipline, comfortable zooming from cluster topology and portfolio strategy down to the specific interface detail that will otherwise become a field issue.</li>\n</ul>\n<ul>\n<li>Communicate clearly with technical and executive audiences, and can distill complex, multi-disciplinary programs into decisions and tradeoffs leadership can act on.</li>\n</ul>\n<ul>\n<li>Thrive in ambiguous, fast-moving environments where the hardware, the scale, and the requirements are all changing simultaneously.</li>\n</ul>\n<ul>\n<li>Hold a Bachelor&#39;s degree in Electrical Engineering, Mechanical Engineering, Computer Engineering, or equivalent practical experience.</li>\n</ul>\n<p>Strong candidates may also:</p>\n<ul>\n<li>Have direct experience deploying leading-edge AI accelerator clusters at hyperscale.</li>\n</ul>\n<ul>\n<li>Have shaped reference designs, deployment standards, or cluster-level playbooks that were adopted across a fleet.</li>\n</ul>\n<ul>\n<li>Have experience working across multiple geographies and understand how regional codes, climate, utility constraints, and supply chains shape cluster-level decisions.</li>\n</ul>\n<ul>\n<li>Have partnered closely with hardware and system providers on long-term platform onboarding and bring-up.</li>\n</ul>\n<ul>\n<li>Have experience building the program mechanisms (roadmaps, milestones, risk registers, runbooks) that make delivery predictable at massive scale.</li>\n</ul>\n<p>The annual compensation range for this role is listed below. 
For sales roles, the range provided is the role’s On Target Earnings (“OTE”) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the role.</p>\n<p>Annual Salary: $320,000-$405,000 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_60082588-bf0","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://www.anthropic.com/","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/5191638008","x-work-arrangement":"remote-hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$320,000-$405,000 USD","x-skills-required":["Hyperscale datacenter environments","Cluster deployment","Large-scale IT integration","Infrastructure programs","AI","HPC","High-density compute clusters","Interconnect reach","Adjacency","Power density","Thermal limits","IT hardware","Facility infrastructure","Interface standards","Cluster topology","Portfolio strategy","Execution discipline","Systems thinking","Communication","Technical audiences","Executive audiences","Decision-making","Trade-offs","Leadership","Bachelor's degree","Electrical Engineering","Mechanical Engineering","Computer Engineering","Practical experience"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:36:06.517Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote-Friendly, United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Hyperscale datacenter environments, Cluster deployment, Large-scale IT integration, Infrastructure programs, AI, HPC, High-density compute clusters, Interconnect reach, Adjacency, Power density, Thermal limits, IT hardware, Facility infrastructure, Interface 
standards, Cluster topology, Portfolio strategy, Execution discipline, Systems thinking, Communication, Technical audiences, Executive audiences, Decision-making, Trade-offs, Leadership, Bachelor's degree, Electrical Engineering, Mechanical Engineering, Computer Engineering, Practical experience","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":320000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a2e88648-d1d"},"title":"Mistral Cloud - Site Reliability Engineer","description":"<p>We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our Cloud platform and customer facing applications.</p>\n<p>You will work closely with our software engineers and product teams to ensure our systems meet and exceed our internal and external customers&#39; expectations.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Design, build, and maintain scalable, highly available and fault-tolerant infrastructures</li>\n<li>Operate systems and troubleshoot issues in production environments</li>\n<li>Implement and improve monitoring, alerting, and incident response systems</li>\n<li>Implement and maintain workflows and tools for both our customer-facing APIs and large training runs</li>\n</ul>\n<p>Development responsibilities include:</p>\n<ul>\n<li>Drive continuous improvement in infrastructure automation, deployment, and orchestration</li>\n<li>Collaborate with software engineers to develop and implement solutions that enable safe and reproducible model-training experiments</li>\n<li>Help build a cloud platform offering an abstraction layer between science, engineering and infrastructure</li>\n<li>Design and develop new workflows and tooling to improve the reliability, availability and performance of our systems</li>\n</ul>\n<p>Additional 
responsibilities include:</p>\n<ul>\n<li>Collaborate with the security team to ensure infrastructure adheres to best security practices and compliance requirements</li>\n<li>Document processes and procedures to ensure consistency and knowledge sharing across the team</li>\n<li>Contribute to open-source projects, research publications, blog articles and conferences</li>\n</ul>\n<p>About you:</p>\n<ul>\n<li>Master’s degree in Computer Science, Engineering or a related field</li>\n<li>5+ years of experience in a DevOps/SRE role</li>\n<li>Strong experience with bare metal infrastructure and highly available distributed systems</li>\n<li>Exposure to site reliability issues in critical environments</li>\n<li>Experience working against reliability KPIs</li>\n<li>Hands-on experience with CI/CD, containerization and orchestration tools</li>\n<li>Knowledge of monitoring, logging, alerting and observability tools</li>\n<li>Familiarity with infrastructure-as-code tools</li>\n<li>Proficiency in scripting languages and knowledge of software development best practices</li>\n<li>Strong understanding of networking, security, and system administration concepts</li>\n<li>Excellent problem-solving and communication skills</li>\n</ul>\n<p>Your application will be all the more interesting if you also have:</p>\n<ul>\n<li>Experience in an AI/ML environment</li>\n<li>Experience of high-performance computing (HPC) systems and workload managers</li>\n<li>Worked with modern AI-oriented solutions</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a2e88648-d1d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral 
AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/f76907fd-428a-4824-a1cf-8013974fde29","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["bare metal infrastructure","highly available distributed systems","CI/CD","containerization","orchestration tools","monitoring","logging","alerting","observability tools","infrastructure-as-code tools","scripting languages","software development best practices","networking","security","system administration"],"x-skills-preferred":["AI/ML environment","high-performance computing (HPC) systems","workload managers","modern AI-oriented solutions"],"datePosted":"2026-04-17T12:47:48.920Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"bare metal infrastructure, highly available distributed systems, CI/CD, containerization, orchestration tools, monitoring, logging, alerting, observability tools, infrastructure-as-code tools, scripting languages, software development best practices, networking, security, system administration, AI/ML environment, high-performance computing (HPC) systems, workload managers, modern AI-oriented solutions"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_058f6a10-283"},"title":"Field Hardware Engineer, HPC","description":"<p>Our compute footprint is growing fast to support our science and engineering teams. 
We&#39;re hiring a Field HW Engineer to understand end-to-end systems, execute complex/vendor-level interventions, and guide L1 engineers on site, without direct line management.</p>\n<p>You&#39;ll work hands-on across compute, storage, interconnect and cooling to keep one of France&#39;s largest GPU/CPU clusters healthy and scalable.</p>\n<p>Location: Bruyères-le-Châtel, on-site, field role (multi-site mobility: Paris area and nearby)</p>\n<p>Reporting line: Hardware Ops</p>\n<p>Impact:</p>\n<p>• Compute is a key lever for Mistral&#39;s success and our largest spend item.</p>\n<p>• Direct impact on scale: you&#39;ll restore service on complex incidents and raise the bar on reliability as we grow.</p>\n<p>• Enable breakthrough AI: your work unlocks science &amp; engineering teams to deliver state-of-the-art AI.</p>\n<p>What you will do:</p>\n<p>• Lead complex interventions: plan and execute vendor-level or multi-node operations (e.g., full rack work, intricate recabling, post-restart diagnosis), own risk assessment/rollback, and coordinate with vendors (RMA/escalations).</p>\n<p>• Advanced diagnostics: correlate symptoms across compute, storage, interconnect, cooling; read system indicators (LED/POST/beep), BMC/IPMI consoles, and logs to identify root causes.</p>\n<p>• Guide and uplift L1s: coach on safe practices (ESD/LOTO), first-line triage, rack craftsmanship, documentation quality; pair on tricky procedures.</p>\n<p>• Process &amp; automation: improve SOPs/checklists; propose/build small automation (Python/Bash) for photo/serial capture, inventory sync, dashboards/alerts; shorten MTTR.</p>\n<p>• Safety &amp; compliance: enforce lockout/tagout, ESD, PPE; ensure audit-ready tickets, evidence and change traces.</p>\n<p>• Parts &amp; logistics (advanced): plan spares strategy, track failure trends, and drive proactive vendor actions.</p>\n<p>About you:</p>\n<p>• 5+ years in data center/server hardware or L2/L3 hardware support, with proven complex hands-on work in 
production (HPC/AI/Cloud at scale).</p>\n<p>• End-to-end hardware expertise: comfortable across CPU/memory/PCIe cards (incl. accelerators), NICs, PSUs, drives, network, power and cooling (including DLC); strong judgment on when/how to escalate.</p>\n<p>• Diagnostics depth: confident in analyzing BMC/IPMI logs, Linux software logs and crashes, simple CLI checks; methodical root cause analysis.</p>\n<p>• Safety &amp; discipline: impeccable ESD/LOTO/PPE habits; zero rough handling; clean, labeled, auditable work.</p>\n<p>• Communication &amp; mentoring: crisp status/handovers; able to coach L1s during live operations.</p>\n<p>Provide technical documentation to L1s or other teams</p>\n<p>Mobility: willing to travel between sites (Paris area or nearby regions, occasionally in Europe or US)</p>\n<p>Nice to have:</p>\n<p>• Vendor tools (iDRAC/iLO/IPMI), RAID/storage basics (NVMe/SAS/SATA), high-speed interconnect (Ethernet/InfiniBand).</p>\n<p>• Coding/automation (Python/Bash) for small ops tools and reporting.</p>\n<p>• Experience with ticketing (Jira/ServiceNow), inventory/RMA flows, vendor coordination.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_058f6a10-283","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/ea94b55b-58e1-437b-bf3d-07ed150308e3","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["data center/server hardware","L2/L3 hardware support","HPC/AI/Cloud at scale","end-to-end hardware expertise","diagnostics depth","safety & discipline","communication & mentoring"],"x-skills-preferred":["vendor tools","RAID/storage basics","high-speed interconnect","coding/automation","ticketing","inventory/RMA 
flows","vendor coordination"],"datePosted":"2026-04-17T12:47:46.512Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data center/server hardware, L2/L3 hardware support, HPC/AI/Cloud at scale, end-to-end hardware expertise, diagnostics depth, safety & discipline, communication & mentoring, vendor tools, RAID/storage basics, high-speed interconnect, coding/automation, ticketing, inventory/RMA flows, vendor coordination"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b3746239-557"},"title":"HPC Network Engineer","description":"<p>As an HPC Network Engineer at Mistral AI, you will design, deploy, and optimize high-performance network infrastructures for our HPC clusters and AI workloads. You will collaborate with cross-functional teams to ensure seamless integration of networking solutions with our compute, storage, and cloud platforms.</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li><p>Design, implement, and optimize high-performance, low-latency network architectures for HPC environments, including InfiniBand, RoCE, and high-speed Ethernet.</p>\n</li>\n<li><p>Collaborate with HPC, DevOps, and AI research teams to integrate networking solutions with compute clusters, storage systems, and cloud platforms.</p>\n</li>\n<li><p>Troubleshoot and resolve complex network issues to minimize downtime and maximize performance.</p>\n</li>\n<li><p>Follow escalation procedures and ensure solutions are provided in a timely manner. 
Ensure escalation is progressing in accordance with the given severity.</p>\n</li>\n<li><p>Monitor network performance, capacity, and security, implementing improvements as needed.</p>\n</li>\n<li><p>Stay updated with emerging HPC networking technologies and best practices, and drive their adoption within Mistral.</p>\n</li>\n<li><p>Develop and maintain documentation for network architectures, configurations, and operational procedures.</p>\n</li>\n</ul>\n<p>Qualifications &amp; Experience:</p>\n<p>Technical Skills:</p>\n<ul>\n<li><p>Proficiency in HPC networking protocols (InfiniBand, RoCE, TCP/IP, MPLS).</p>\n</li>\n<li><p>Hands-on experience with network hardware (switches, routers, NICs) from vendors like Mellanox, Cisco, or Arista.</p>\n</li>\n<li><p>Knowledge of network automation tools (Ansible, Python scripting).</p>\n</li>\n<li><p>Familiarity with HPC environments, parallel computing, and distributed systems.</p>\n</li>\n<li><p>Experience with network security best practices.</p>\n</li>\n</ul>\n<p>Soft Skills:</p>\n<ul>\n<li><p>Strong problem-solving and analytical skills.</p>\n</li>\n<li><p>Ability to thrive in a fast-paced, collaborative environment.</p>\n</li>\n<li><p>Excellent communication skills (English required; French is a plus).</p>\n</li>\n<li><p>Teaching and documentation skills to ensure knowledge is archived and distributed to team members.</p>\n</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b3746239-557","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/6857fa38-ce30-4513-9930-acf7d78d42ed","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["HPC networking 
protocols","InfiniBand","RoCE","TCP/IP","MPLS","network hardware","switches","routers","NICs","Mellanox","Cisco","Arista","network automation tools","Ansible","Python scripting","HPC environments","parallel computing","distributed systems","network security best practices"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:47:44.875Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"France, USA, UK, Germany, Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"HPC networking protocols, InfiniBand, RoCE, TCP/IP, MPLS, network hardware, switches, routers, NICs, Mellanox, Cisco, Arista, network automation tools, Ansible, Python scripting, HPC environments, parallel computing, distributed systems, network security best practices"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a632e52b-c63"},"title":"Site Reliability Engineer","description":"<p>About Mistral AI</p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>We are a dynamic team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation.</p>\n<p>Role Summary</p>\n<p>We are seeking highly experienced Site Reliability Engineers (SRE) to shape the reliability, scalability and performance of our platform and customer facing applications. 
You will work closely with our software engineers and research teams to ensure our systems meet and exceed our internal and external customers&#39; expectations.</p>\n<p>Responsibilities</p>\n<p>As a Site Reliability Engineer, you balance the day-to-day operations on production systems with long-term software engineering improvements to reduce operational toil and foster the reliability, availability, and performance of these systems.</p>\n<p>Operations</p>\n<p>• Design, build, and maintain scalable, highly available and fault-tolerant infrastructures to support our web services and ML workloads</p>\n<p>• Make sure our platform, inference and model training environments are always highly available and enable seamless replication of work environments across several HPC clusters</p>\n<p>• Operate systems and troubleshoot issues in production environments (interrupts, on-call responses, user admin, data extraction, infrastructure scaling, etc.)</p>\n<p>• Implement and improve monitoring, alerting, and incident response systems to ensure optimal system performance and minimize downtime</p>\n<p>• Implement and maintain workflows and tools (CI/CD, containerization, orchestration, monitoring, logging and alerting systems) for both our client-facing APIs and large training runs</p>\n<p>• Participate occasionally in on-call rotations to respond to incidents and perform root cause analysis to prevent future occurrences</p>\n<p>Development</p>\n<p>• Drive continuous improvement in infrastructure automation, deployment, and orchestration using tools like Kubernetes, Flux, Terraform</p>\n<p>• Collaborate with AI/ML researchers to develop and implement solutions that enable safe and reproducible model-training experiments</p>\n<p>• Build a cloud-agnostic platform offering an abstraction layer between science and infrastructure</p>\n<p>• Design and develop new workflows and tooling to improve the reliability, availability and performance of our systems (automation scripts, 
refactoring, new API-based features, web apps, dashboards, etc.)</p>\n<p>• Collaborate with the security team to ensure infrastructure adheres to best security practices and compliance requirements</p>\n<p>• Document processes and procedures to ensure consistency and knowledge sharing across the team</p>\n<p>• Contribute to open-source projects, research publications, blog articles and conferences</p>\n<p>About You</p>\n<p>• Master’s degree in Computer Science, Engineering or a related field</p>\n<p>• 7+ years of experience in a DevOps/SRE role</p>\n<p>• Strong experience with cloud computing and highly available distributed systems</p>\n<p>• Exposure to site reliability issues in critical environments (issue root cause analysis, in-production troubleshooting, on-call rotations...)</p>\n<p>• Experience working against reliability KPIs (observability, alerting, SLAs)</p>\n<p>• Hands-on experience with CI/CD, containerization and orchestration tools (Docker, Kubernetes...)</p>\n<p>• Knowledge of monitoring, logging, alerting and observability tools (Prometheus, Grafana, ELK Stack, Datadog...)</p>\n<p>• Familiarity with infrastructure-as-code tools like Terraform or CloudFormation</p>\n<p>• Proficiency in scripting languages (Python, Go, Bash...) 
and knowledge of software development best practices</p>\n<p>• Strong understanding of networking, security, and system administration concepts</p>\n<p>• Excellent problem-solving and communication skills</p>\n<p>• Self-motivated and able to work well in a fast-paced startup environment</p>\n<p>Your Application Will Be All The More Interesting If You Also Have:</p>\n<p>• Experience in an AI/ML environment</p>\n<p>• Experience of high-performance computing (HPC) systems and workload managers (Slurm)</p>\n<p>• Worked with modern AI-oriented solutions (Fluidstack, Coreweave, Vast...)</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a632e52b-c63","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/6e16e4fa-a60b-4270-a815-06b0450fb597","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["cloud computing","highly available distributed systems","DevOps","SRE","Kubernetes","Flux","Terraform","CI/CD","containerization","orchestration","monitoring","logging","alerting","observability","infrastructure-as-code","scripting languages","software development best practices","networking","security","system administration"],"x-skills-preferred":["AI/ML environment","high-performance computing (HPC) systems","workload managers","modern AI-oriented solutions"],"datePosted":"2026-04-17T12:47:37.519Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud computing, highly available distributed systems, DevOps, SRE, Kubernetes, Flux, Terraform, CI/CD, 
containerization, orchestration, monitoring, logging, alerting, observability, infrastructure-as-code, scripting languages, software development best practices, networking, security, system administration, AI/ML environment, high-performance computing (HPC) systems, workload managers, modern AI-oriented solutions"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b7bde4cf-9c8"},"title":"Datacenter Hardware Engineer, HPC","description":"<p>About Mistral</p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>Our compute footprint is growing fast to support our science and engineering teams. We’re hiring a Datacenter HW Engineer to maintain, troubleshoot, and scale our GPU/CPU clusters safely and reliably.</p>\n<p>You’ll execute hands-on hardware work in our Paris-area datacenter and partner with hardware owners, DC operations, and vendors to keep one of France’s largest GPU clusters healthy.</p>\n<p>Location: Bruyères-le-Châtel, on-site, field role</p>\n<p>Reporting line: Hardware Ops</p>\n<p>Impact</p>\n<p>• Compute is a key lever for Mistral’s success and our largest spend item.\n• Direct impact on scale: your work keeps one of France’s largest AI clusters healthy as we grow to unprecedented scale.\n• Enable breakthrough AI: you unlock our science &amp; engineering teams to deliver groundbreaking AI solutions.</p>\n<p>Responsibilities</p>\n<p>• Diagnose &amp; operate core server/cluster components - Investigate and handle compute/storage hardware issues (CPU, memory, drives, NICs, GPUs, PSUs) and interconnect problems (switches, cables, transceivers; Ethernet/InfiniBand).\n• Safety &amp; procedures - Apply lockout/tagout (LOTO) and ESD discipline; follow pre/post-work checklists; maintain tidy, safe work areas.\n• First-line diagnostics - Triage 
using LEDs, POST, beep codes and basic tests; capture evidence (photos, serials, results); open/update/close tickets with clear notes.\n• Preventive maintenance - Provide feedback and ideas to improve proactive activities, monitoring, and targeted follow-ups on recurring or specific anomalies; help turn ad-hoc checks into SOPs, alerts, and dashboards.\n• Parts &amp; logistics - Receive and track parts, keep labeled inventory accurate, manage simple RMAs, and coordinate with vendors.\n• Collaboration &amp; escalation - Partner with senior hardware/firmware owners on complex or multi-node issues; communicate status and next steps crisply.\n• Documentation &amp; quality - Keep SOPs/checklists current; ensure zero undocumented changes and consistent, audit-ready records.</p>\n<p>About you</p>\n<p>• Hands-on mindset in datacenters/server hardware: you can install/re-seat/swap GPU/PCIe cards, NICs, PSUs, drives, and work cleanly in racks (rails, cabling, labeling).\n• Disciplined and meticulous: follows checklists, ESD/LOTO; no rough handling; careful with all high-value server components.\n• Practical electrical basics: power-off, PPE, short-circuit risk awareness.\n• Comfortable in racks: cooling, network, storage, PDU, cable management; can lift/mount safely (within HSE limits).\n• Clear communicator: short factual updates; reliable teammate; punctual and process-minded.\n• Hardware-passionate, professionally grounded: strong curiosity and craft mindset.</p>\n<p>Nice to have</p>\n<p>• HPC/AI/Cloud at scale experience (production environments), large-fleet/server install &amp; maintenance in datacenters.\n• Basic networking (Ethernet/InfiniBand) and basic Linux (boot/check; no coding needed).\n• Coding/automation skills (Python/Bash): small tools/scripts to improve checklists, photo/serial capture, inventory sync, or simple monitoring/reporting.\n• Experience with inventory/RMA tools and vendor coordination.\n• Exposure to HPC/research/industrial 
environments.</p>\n<p>What we offer</p>\n<p>💰 Competitive salary and equity package</p>\n<p>🧑‍⚕️ Health insurance</p>\n<p>🚴 Transportation allowance</p>\n<p>🥎 Sport allowance</p>\n<p>🥕 Meal vouchers</p>\n<p>💰 Private pension plan</p>\n<p>🍼 Generous parental leave policy</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b7bde4cf-9c8","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/ddf7bcbb-e223-4768-a553-6e95df472cf7","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["GPU/CPU clusters","server hardware","Linux fundamentals","scripting","electrical basics","networking","inventory management"],"x-skills-preferred":["HPC/AI/Cloud at scale experience","basic Linux","coding/automation skills","inventory/RMA tools","vendor coordination"],"datePosted":"2026-04-17T12:47:08.660Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"GPU/CPU clusters, server hardware, Linux fundamentals, scripting, electrical basics, networking, inventory management, HPC/AI/Cloud at scale experience, basic Linux, coding/automation skills, inventory/RMA tools, vendor coordination"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d499c732-ee6"},"title":"Senior Staff Software Engineer","description":"<p>Synopsys software engineers are key enablers in the world of Electronic Design Automation (EDA), developing and maintaining software used in chip design, verification and manufacturing. 
They work on assignments like designing, developing, and troubleshooting software, leveraging state-of-the-art technologies such as AI/ML, GenAI and Cloud. Their critical contributions enable worldwide EDA designers to extend the frontiers of semiconductors and chip development.</p>\n<p>The Common Engineering Components (CEC) team, part of Synopsys Central Engineering, develops and maintains productivity-enhancing platforms, tools, and services enabling Synopsys R&amp;D engineers, Field Application Engineers (FAEs), and customers to efficiently identify, diagnose, and resolve issues in Synopsys&#39; Electronic Design Automation (EDA) tools. Our work is critical to the performance and scalability of EDA solutions used in silicon chip design.</p>\n<p>We are looking for a Senior Staff Software Engineer to join our team. As a Senior Staff Software Engineer, you will be responsible for delivering enterprise-scale software solutions to internal and external stakeholders, ensuring reliability and performance at every stage. You will collaborate with multiple global engineering teams to achieve common goals and successful project delivery. You will define technical requirements and drive implementation of next-generation productivity platforms. You will mentor engineers, fostering a culture of technical excellence, collaboration, and innovation within the team.</p>\n<p>You will utilize C/C++, Python, Unix/Linux system-level knowledge, and advanced debugging and performance profiling tools to deliver robust, scalable solutions and resolve critical issues efficiently. You will accelerate Synopsys&#39; capability to deliver cutting-edge solutions for customers and partners worldwide. You will enable seamless integration of advanced computing platforms into chip design and verification workflows. You will drive operational efficiency and scalability for mission-critical applications and services. 
You will influence technology strategy and architecture decisions, shaping the future of Synopsys&#39; product offerings. You will empower engineering teams through mentorship, fostering innovation and technical growth. You will enhance customer satisfaction by delivering reliable, high-performance software solutions tailored to evolving industry needs.</p>\n<p>You will have 8–14 years of professional software development experience, ideally in High Performance Computing (HPC) or large-scale systems. You will have strong Unix/Linux systems programming background, including multithreading, synchronization, sockets, and inter-process communication (IPC). You will have proven experience designing and working with distributed systems, including networking, databases, and containerized environments. You will have proficiency in C/C++ and Python, with hands-on experience using debugging and performance profiling tools. You will have excellent written and verbal communication skills, with the ability to explain complex technical concepts to diverse audiences.</p>\n<p>You will be an inclusive leader and collaborator, committed to fostering diverse perspectives and teamwork. You will be analytical and detail-oriented, with a passion for solving challenging technical problems. You will be adaptable and proactive, able to thrive in dynamic, rapidly evolving environments. You will be curious and innovative, always seeking new ways to leverage technology for impactful results. 
You will be a mentor and coach, dedicated to supporting the growth and development of others.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d499c732-ee6","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/sunnyvale/senior-staff-software-engineer/44408/93286401648","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$204000-$306000","x-skills-required":["C/C++","Python","Unix/Linux","High Performance Computing (HPC)","Distributed Systems","Networking","Databases","Containerized Environments","Debugging and Performance Profiling Tools"],"x-skills-preferred":[],"datePosted":"2026-04-05T13:20:54.239Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Sunnyvale"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, Python, Unix/Linux, High Performance Computing (HPC), Distributed Systems, Networking, Databases, Containerized Environments, Debugging and Performance Profiling Tools","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":204000,"maxValue":306000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_019ba3f3-88c"},"title":"Staff Engineer – AI/ML & Digital Twin","description":"<p><strong>Job Description</strong></p>\n<p>We are seeking a highly motivated Staff Engineer to join our team, focusing on AI/ML and Digital Twin technologies. 
As a Staff Engineer, you will lead and execute technical engagements across the customer lifecycle, including discovery, solution development, demonstrations, evaluations, and deployment.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Lead and execute technical engagements across the customer lifecycle, including discovery, solution development, demonstrations, evaluations, and deployment.</li>\n<li>Engage directly with customers to understand engineering workflows, data availability, and decision-making processes, translating them into AI-enabled simulation and digital engineering solutions.</li>\n<li>Develop and implement differentiated solutions using technologies such as automation, reduced order modeling, optimization, simulation democratization, system-level modeling, and digital twins.</li>\n<li>Integrate machine learning models within simulation and digital twin pipelines to improve prediction accuracy, reduce computational cost, and enable near real-time insights.</li>\n<li>Define and deliver automated and scalable workflows that reduce reliance on expert-driven simulation and enable broader adoption across engineering teams.</li>\n<li>Lead or contribute to first-of-a-kind or ambiguous use cases, including AI-assisted design exploration, surrogate modeling, and digital twin deployment.</li>\n<li>Collaborate closely with product development teams to influence roadmap, validate new capabilities, and improve usability of AI-enabled features.</li>\n<li>Deliver professional services, training, and technical guidance to ensure successful adoption of advanced workflows.</li>\n<li>Support pre-sales and technical marketing activities through demonstrations, evaluations, and industry engagement.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Enable customers to transition from traditional simulation to AI-augmented and automated engineering workflows.</li>\n<li>Reduce time-to-insight through surrogate modeling, optimization, and intelligent 
automation.</li>\n<li>Expand access to simulation by supporting democratization across engineering and non-expert users.</li>\n<li>Drive adoption of digital twin technologies for predictive and operational decision-making.</li>\n<li>Influence product direction by connecting real-world use cases with next-generation AI-enabled capabilities.</li>\n<li>Contribute to business growth through high-impact technical engagements and solution delivery.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>MS (or PhD) in Engineering, Computer Science, Applied Mathematics, or related field.</li>\n<li>5+ years of experience in engineering systems, simulation, or data-driven modeling.</li>\n<li>Strong programming skills (Python preferred).</li>\n<li>Experience working with modeling, simulation, optimization, or data-driven engineering workflows.</li>\n<li>Strong analytical, problem-solving, and communication skills.</li>\n<li>Ability to operate effectively in a customer-facing, consultative engineering role.</li>\n<li>Proven experience in automation of engineering workflows or pipelines using tools such as optiSLang, modeFrontier, HEEDS or equivalent.</li>\n<li>Demonstrated expertise applying machine learning techniques in engineering contexts, including surrogate modeling, regression methods, or neural networks (CNNs, RNNs, autoencoders).</li>\n<li>Understanding of projection-based ROMs, dimensionality reduction, and feature engineering.</li>\n<li>Knowledge of multi-fidelity system modeling using Twin Builder, Simulink, AMESim or equivalent.</li>\n<li>Familiarity with deployment and operationalization of AI models, including integration into engineering workflows and use of frameworks such as PyTorch, TensorFlow, scikit-learn, Kubernetes, AWS/Azure equivalent.</li>\n<li>Exposure to cloud or HPC-based environments for large-scale simulation or data processing.</li>\n</ul>\n<p><strong>Who We Are Looking For</strong></p>\n<ul>\n<li>Customer-focused and able to build trusted 
relationships.</li>\n<li>Comfortable working in ambiguous, fast-evolving technical environments.</li>\n<li>A strong communicator who can translate complex concepts into actionable insights.</li>\n<li>Self-driven, organized, and capable of managing multiple priorities.</li>\n<li>A collaborative team player who contributes to a culture of learning and innovation.</li>\n</ul>\n<p><strong>The Team You’ll Be A Part Of</strong></p>\n<p>You will be part of a multidisciplinary engineering team focused on advancing industry adoption of simulation through AI, automation, digital twin, and MBSE technologies. The team collaborates closely with customers, product development, and go-to-market functions to deliver innovative, high-impact solutions.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_019ba3f3-88c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Ansys, Part of Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/canonsburg/staff-engineer-ai-ml-and-digital-twin/44408/93512568768","x-work-arrangement":"Remote Eligible","x-experience-level":"Staff","x-job-type":"Employee","x-salary-range":"$112000-$168000","x-skills-required":["Python","Automation","Reduced Order Modeling","Optimization","Simulation Democratization","System-Level Modeling","Digital Twins","Machine Learning","Surrogate Modeling","Regression Methods","Neural Networks","Projection-Based ROMs","Dimensionality Reduction","Feature Engineering","Multi-Fidelity System Modeling","Cloud or HPC-Based Environments"],"x-skills-preferred":[],"datePosted":"2026-04-05T13:16:27.451Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United 
States"}},"jobLocationType":"TELECOMMUTE","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Automation, Reduced Order Modeling, Optimization, Simulation Democratization, System-Level Modeling, Digital Twins, Machine Learning, Surrogate Modeling, Regression Methods, Neural Networks, Projection-Based ROMs, Dimensionality Reduction, Feature Engineering, Multi-Fidelity System Modeling, Cloud or HPC-Based Environments","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":112000,"maxValue":168000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_37049070-1d7"},"title":"Software Engineer, Compute Infrastructure","description":"<p>About Mistral AI\nAt Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity.</p>\n<p>Our technology is designed to integrate seamlessly into daily working life. We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise needs, whether on-premises or in cloud environments.</p>\n<p>We are a team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed between France, USA, UK, Germany and Singapore.</p>\n<p>Role Summary\nWe are building one of Europe&#39;s largest AI infrastructure offerings that will provide our customers a private and integrated stack in every form factor they may need — from bare-metal servers to fully-managed PaaS.</p>\n<p>You will join a fast-growing team to help build, scale and automate our computing management stack. 
You will be responsible for building fault-tolerant and reliable infrastructure to support both our internal processes and customer platform.</p>\n<p>Location: France and UK as primary locations. Remote in Europe can be considered under conditions.</p>\n<p>Key Responsibilities:\n• Design, build, and operate a scalable Kubernetes-based platform to host large-scale AI and HPC workloads, ensuring high performance, reliability, and security.\n• Own the full lifecycle of cluster management, from bootstrapping and provisioning to global operations, by integrating and developing the necessary software components—including automation, monitoring, and orchestration tools.\n• Drive infrastructure innovation by designing workflows, tooling (scripts, APIs, dashboards), and CI/CD pipelines to optimize system reliability, availability, and observability.\n• Champion a zero-trust security model, strengthening IAM, networking (VPC), and access controls to safeguard the platform.\n• Develop user-centric features that simplify operations for both sysadmins and end customers, reducing friction in daily workflows.\n• Lead incident resolution with rigorous root-cause analysis to prevent recurrence and improve system resilience.</p>\n<p>About you\n• Strong proficiency in software development (preferably Golang) and knowledge of software development best practices\n• Successful experience in an Infrastructure Engineering role (SWE, Platform, DevOps, Cloud...)\n• Deep understanding of Kubernetes internals and hands-on experience with containerization and orchestration tools (Docker, Kubernetes, Openstack...)\n• Familiarity with infrastructure-as-code tools like Terraform or CloudFormation\n• Knowledge of monitoring, logging, alerting and observability tools (Prometheus, Grafana, ELK, Datadog...)\n• Exposure to highly available distributed systems and site reliability issues in critical environments (issue root cause analysis, in-production troubleshooting, on-call rotations...)\n• 
Experience working against reliability KPIs (observability, alerting, SLAs)\n• Excellent problem-solving and communication skills\n• Self-motivation and ability to thrive in a fast-paced startup environment</p>\n<p>Now, it would be ideal if you also had:\n• Experience with HPC workload managers (Slurm) and distributed storage systems (Lustre, Ceph)\n• Demonstrated history of contributing to open-source projects (e.g., code, documentation, bug fixes, feature development, or community support).</p>\n<p>Additional Information\nLocation &amp; Remote\nThis role is primarily based in one of our European offices — Paris, France and London, UK. We will prioritize candidates who either reside there or are open to relocating. We strongly believe in the value of in-person collaboration to foster strong relationships and seamless communication within our team.</p>\n<p>In certain specific situations, we will also consider remote candidates based in one of the countries listed in this job posting — currently France, UK, Germany, Belgium, Netherlands, Spain and Italy.</p>\n<p>In any case, we ask all new hires to visit our Paris HQ office:\n• for the first week of their onboarding (accommodation and travelling covered)\n• then at least 2 days per month</p>\n<p>What we offer\nCompetitive salary and equity\nHealth insurance\nTransportation allowance\nSport allowance\nMeal vouchers\nPrivate pension plan\nGenerous parental leave policy\nVisa sponsorship</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_37049070-1d7","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/d60f6c60-ad5e-4753-af8a-56365b7db8b8","x-work-arrangement":"remote","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["software 
development","Golang","Kubernetes","containerization","orchestration","infrastructure-as-code","Terraform","CloudFormation","monitoring","logging","alerting","observability","Prometheus","Grafana","ELK","Datadog"],"x-skills-preferred":["HPC workload managers","distributed storage systems","open-source projects"],"datePosted":"2026-03-10T11:35:56.693Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software development, Golang, Kubernetes, containerization, orchestration, infrastructure-as-code, Terraform, CloudFormation, monitoring, logging, alerting, observability, Prometheus, Grafana, ELK, Datadog, HPC workload managers, distributed storage systems, open-source projects"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_24be48df-238"},"title":"Field Hardware Engineer, HPC","description":"<p>We&#39;re hiring a Field HW Engineer to work on-site at our data centre in Bruyères-le-Châtel. As a Field HW Engineer, you will be responsible for understanding end-to-end systems, executing complex/vendor-level interventions, and guiding L1 engineers on site.</p>\n<p>Your work will involve hands-on troubleshooting and repair of compute, storage, interconnect and cooling systems to keep our large GPU/CPU cluster healthy and scalable. You will also be responsible for leading complex interventions, advanced diagnostics, guiding and uplifting L1s, process and automation, safety and compliance, and parts and logistics.</p>\n<p>To be successful in this role, you will need 5+ years of experience in data center/server hardware or L2/L3 hardware support, with proven complex hands-on work in production (HPC/AI/Cloud at scale). 
You should have end-to-end hardware expertise, including comfort with CPU/memory/PCIe cards, NICs, PSUs, drives, network, power and cooling. You should also be confident in analyzing BMC/IPMI logs, Linux software logs and crashes, and simple CLI checks, and have methodical root cause analysis skills.</p>\n<p>The ideal candidate will be willing to travel between sites (Paris area or nearby regions, occasionally in Europe or US) and have a strong understanding of safety and discipline, including impeccable ESD/LOTO/PPE habits, zero rough handling, and clean, labeled, auditable work.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_24be48df-238","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/ea94b55b-58e1-437b-bf3d-07ed150308e3","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["data center/server hardware","L2/L3 hardware support","complex hands-on work in production (HPC/AI/Cloud at scale)","end-to-end hardware expertise","CPU/memory/PCIe cards","NICs","PSUs","drives","network","power and cooling","BMC/IPMI logs","linux software logs","crashes simple CLI checks","root cause analysis"],"x-skills-preferred":["vendor tools (iDRAC/iLO/IPMI)","RAID/storage basics (NVMe/SAS/SATA)","high-speed interconnect (Ethernet/InfiniBand)","coding/automation (Python/Bash)"],"datePosted":"2026-03-10T11:27:14.542Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bruyères-le-Châtel"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data center/server hardware, L2/L3 hardware support, complex hands-on work in production (HPC/AI/Cloud at scale), end-to-end hardware expertise, CPU/memory/PCIe cards, 
NICs, PSUs, drives, network, power and cooling, BMC/IPMI logs, linux software logs, crashes simple CLI checks, root cause analysis, vendor tools (iDRAC/iLO/IPMI), RAID/storage basics (NVMe/SAS/SATA), high-speed interconnect (Ethernet/InfiniBand), coding/automation (Python/Bash)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_34dcb379-23a"},"title":"Applied AI, Forward Deployed Machine Learning Engineer - (Internship)","description":"<p>About Mistral AI</p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>We are a global company with teams distributed between France, USA, UK, Germany, and Singapore. Our comprehensive AI platform meets enterprise needs, whether on-premises or in cloud environments. Our offerings include le Chat, the AI assistant for life and work.</p>\n<p>Role Summary</p>\n<p>As an Applied Engineering Intern, you will work closely with our Applied AI Engineering team to facilitate the adoption of Mistral AI products among customers and collaborate with them to address complex technical challenges. This role is based in Paris, with an internship duration of 3 to 6 months. 
We are open to CIFRE programs as a continuation after the internship.</p>\n<p>Responsibilities</p>\n<p>• Contribute to the deployment of state-of-the-art GenAI applications, driving technological transformation with our customers.</p>\n<p>• Collaborate with researchers, AI engineers, and product engineers on complex customer projects.</p>\n<p>• Work with the product and science team to continuously improve our product and model capabilities based on customer feedback.</p>\n<p>How We Work in Applied AI</p>\n<p>• We care about people and outputs.</p>\n<p>• What matters is what you ship, not the time you spend on it.</p>\n<p>• Bureaucracy is where urgency goes to vanish. You talk to whoever you need to talk to. The best idea wins, whether it comes from a principal engineer or someone in their first week.</p>\n<p>• Always ask why. The best solutions come from deep understanding, not from copying what worked before.</p>\n<p>• We say what we mean. Feedback is direct, timely, and given because we care.</p>\n<p>• No politics. 
Low ego, high standards.</p>\n<p>• We embrace an unstructured environment and find joy in it.</p>\n<p>About You</p>\n<p>• You are currently pursuing a degree in AI, data science, or a related field from a tier 1 engineering school or university.</p>\n<p>• You have strong programming skills in Python.</p>\n<p>• You are familiar with machine learning algorithms and natural language processing techniques.</p>\n<p>• You hold basic understanding of MLOps and deploying machine learning use cases.</p>\n<p>• You have good communication skills with the ability to explain technical concepts to both technical and non-technical audiences.</p>\n<p>Ideally You Have:</p>\n<p>• Experience with deep learning frameworks such as PyTorch.</p>\n<p>• Familiarity with version control systems (e.g., Git) and Linux shell environment.</p>\n<p>• Experience working in HPC Environments.</p>\n<p>• Publication record in AI or a related field.</p>\n<p>Benefits</p>\n<p>• Competitive salary</p>\n<p>• Food: Daily lunch vouchers</p>\n<p>• Sport: Monthly contribution to a Gympass subscription</p>\n<p>• Transportation: Monthly contribution to a mobility pass</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_34dcb379-23a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/881941e1-2741-48e2-8767-12866965fac5","x-work-arrangement":"onsite","x-experience-level":"entry","x-job-type":"internship","x-salary-range":null,"x-skills-required":["Python","Machine learning algorithms","Natural language processing techniques","MLOps","Deep learning frameworks (PyTorch)"],"x-skills-preferred":["Version control systems (Git)","Linux shell environment","HPC 
Environments"],"datePosted":"2026-03-10T11:26:07.163Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"INTERN","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Machine learning algorithms, Natural language processing techniques, MLOps, Deep learning frameworks (PyTorch), Version control systems (Git), Linux shell environment, HPC Environments"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_c8c20fa9-7f3"},"title":"Datacenter Hardware Engineer, HPC","description":"<p>About Mistral AI</p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>We are a company that democratizes AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise needs, whether on-premises or in cloud environments.</p>\n<p>Our offerings include le Chat, the AI assistant for life and work. We are a team passionate about AI and its potential to transform society.</p>\n<p>Role Summary</p>\n<p>Our compute footprint is growing fast to support our science and engineering teams. 
We’re hiring a Datacenter HW Engineer to maintain, troubleshoot, and scale our GPU/CPU clusters safely and reliably.</p>\n<p>What you will do</p>\n<ul>\n<li>Diagnose &amp; operate core server/cluster components - Investigate and handle compute/storage hardware issues (CPU, memory, drives, NICs, GPUs, PSUs) and interconnect problems (switches, cables, transceivers; Ethernet/InfiniBand).</li>\n<li>Safety &amp; procedures - Apply lockout/tagout (LOTO) and ESD discipline; follow pre/post-work checklists; maintain tidy, safe work areas.</li>\n<li>First-line diagnostics - Triage using LEDs, POST, beep codes and basic tests; capture evidence (photos, serials, results); open/update/close tickets with clear notes.</li>\n<li>Preventive maintenance - Provide feedback and ideas to improve proactive activities, monitoring, and targeted follow-ups on recurring or specific anomalies; help turn ad-hoc checks into SOPs, alerts, and dashboards.</li>\n<li>Parts &amp; logistics - Receive and track parts, keep labeled inventory accurate, manage simple RMAs, and coordinate with vendors.</li>\n<li>Collaboration &amp; escalation - Partner with senior hardware/firmware owners on complex or multi-node issues; communicate status and next steps crisply.</li>\n<li>Documentation &amp; quality - Keep SOPs/checklists current; ensure zero undocumented changes and consistent, audit-ready records.</li>\n</ul>\n<p>About you</p>\n<ul>\n<li>Hands-on mindset in datacenters/server hardware: you can install/re-seat/swap GPU/PCIe cards, NICs, PSUs, drives, and work cleanly in racks (rails, cabling, labeling).</li>\n<li>Disciplined and meticulous: follows checklists, ESD/LOTO; no rough handling; careful with all high-value server components.</li>\n<li>Practical electrical basics: power-off, PPE, short-circuit risk awareness.</li>\n<li>Comfortable in racks: cooling, network, storage, PDU, cable management; can lift/mount safely (within HSE limits).</li>\n<li>Clear communicator: short factual updates; 
reliable teammate; punctual and process-minded.</li>\n<li>Hardware-passionate, professionally grounded: strong curiosity and craft mindset.</li>\n</ul>\n<p>Nice to have</p>\n<ul>\n<li>HPC/AI/Cloud at scale experience (production environments), large-fleet/server install &amp; maintenance in datacenters.</li>\n<li>Basic networking (Ethernet/InfiniBand) and basic Linux (boot/check; no coding needed).</li>\n<li>Coding/automation skills (Python/Bash): small tools/scripts to improve checklists, photo/serial capture, inventory sync, or simple monitoring/reporting.</li>\n<li>Experience with inventory/RMA tools and vendor coordination.</li>\n<li>Exposure to HPC/research/industrial environments.</li>\n</ul>\n<p>What we offer</p>\n<ul>\n<li>Competitive salary and equity package</li>\n<li>Health insurance</li>\n<li>Transportation allowance</li>\n<li>Sport allowance</li>\n<li>Meal vouchers</li>\n<li>Private pension plan</li>\n<li>Generous parental leave policy</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_c8c20fa9-7f3","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/ddf7bcbb-e223-4768-a553-6e95df472cf7","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Datacenter hardware","Server hardware","GPU/CPU clusters","Networking","Linux","Scripting (Python/Bash)","Inventory/RMA tools","Vendor coordination"],"x-skills-preferred":["HPC/AI/Cloud at scale experience","Basic networking (Ethernet/InfiniBand)","Basic Linux (boot/check; no coding needed)","Coding/automation skills 
(Python/Bash)"],"datePosted":"2026-03-10T11:25:48.956Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Datacenter hardware, Server hardware, GPU/CPU clusters, Networking, Linux, Scripting (Python/Bash), Inventory/RMA tools, Vendor coordination, HPC/AI/Cloud at scale experience, Basic networking (Ethernet/InfiniBand), Basic Linux (boot/check; no coding needed), Coding/automation skills (Python/Bash)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_13998cbe-159"},"title":"Data Center Operations Manager","description":"<p>About the Role\nMistral AI is seeking a Data Center Operations Manager to lead the build and run operations of our new data center in Borlänge, Sweden. As the first hire for this site, you will be responsible for establishing operational excellence, managing local teams, and ensuring the reliability, security, and efficiency of our AI infrastructure.</p>\n<p>Key Responsibilities\n• Lead the operational management of Mistral’s data center in Borlänge, overseeing build-out, day-to-day operations, and scalability to support our AI infrastructure.\n• Hire and manage a local team of hardware engineers to support operations, maintenance, and troubleshooting.\n• Oversee hardware deployment, upgrades, and decommissioning, ensuring alignment with Mistral’s infrastructure goals.\n• Monitor and enforce Service Level Agreements (SLAs) with data center providers and subcontractors.\n• Manage incidents and request tickets, ensuring timely resolution and clear communication with stakeholders.\n• Ensure adherence to security protocols, contractual obligations, and regulatory requirements at the data center.\n• Provide regular updates to internal teams and external partners on operational status, risks, and improvements.\n• Establish processes and 
best practices for data center operations, ensuring high availability and performance.\n• Manage local contracts with DC providers and OEM.</p>\n<p>Qualifications &amp; Experience\n• Degree in Computer Science, Electrical/Mechanical Engineering, or related field, or equivalent experience, with a strong understanding of data center technical requirements and operations.\n• Proven track record in data center operations, hardware management, or infrastructure support, preferably in HPC or cloud environments.\n• Proven experience in recruiting, mentoring, and scaling a technical operations team from a greenfield deployment.\n• Experience managing large-scale infrastructure projects, including build-outs, migrations, or upgrades.\n• Strong ability to coordinate and lead vendors, contractors, and internal teams, including review, escalation, and contractual engagement.\n• Comfortable with contract negotiation and management.\n• Hands-on troubleshooting skills and ability to work and lead in critical situations and aggressive timelines.\n• Knowledge of data center security standards (physical and digital) and compliance requirements.\n• Language Skills: Fluency in English and Swedish is a plus.</p>\n<p>Why Join Mistral?\n• Impact: Play a pivotal role in scaling Mistral’s cutting-edge AI infrastructure.\n• Growth: Opportunity to shape data center operations from the ground up in a high-growth startup environment.\n• Collaboration: Work with a talented, cross-functional team passionate about AI and technology.\n• Flexibility: Competitive compensation, benefits, and the chance to contribute to revolutionary projects.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_13998cbe-159","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral 
AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/fa170722-b93a-49f5-a649-3fc731c57a71","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["data center operations","hardware management","infrastructure support","HPC or cloud environments","recruiting","mentoring","scaling a technical operations team","large-scale infrastructure projects","contract negotiation","hands-on troubleshooting","data center security standards"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:25:02.423Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Borlänge"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"data center operations, hardware management, infrastructure support, HPC or cloud environments, recruiting, mentoring, scaling a technical operations team, large-scale infrastructure projects, contract negotiation, hands-on troubleshooting, data center security standards"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cf4fd05b-818"},"title":"Senior Software Engineer, NCCL","description":"<p>We are looking for a highly motivated senior software engineer to join our communication libraries and network software team. The position will be part of a fast-paced crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep Learning.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Design, implement and maintain highly-optimized communication runtimes for Deep Learning frameworks (e.g. NCCL for TensorFlow/Pytorch) and HPC programming interfaces (e.g. 
UCX for MPI/OpenSHMEM) on GPU clusters.</li>\n<li>Participate in and contribute to parallel programming interface specifications like MPI/OpenSHMEM.</li>\n<li>Design, implement and maintain system software that enables interactions among GPUs and interactions between GPUs and other system components.</li>\n<li>Create proof-of-concepts to evaluate and motivate extensions in programming models, new designs in runtimes and new features in hardware.</li>\n</ul>\n<p><strong>Requirements:</strong></p>\n<ul>\n<li>M.S./Ph.D. degree in CS/CE or equivalent experience.</li>\n<li>5+ years of relevant experience.</li>\n<li>Excellent C/C++ programming and debugging skills.</li>\n<li>Strong experience with Linux.</li>\n<li>Expert understanding of computer system architecture and operating systems.</li>\n<li>Experience with parallel programming interfaces and communication runtimes.</li>\n<li>Ability and flexibility to work and communicate effectively in a multi-national, multi-time-zone corporate environment.</li>\n</ul>\n<p><strong>Nice to Have:</strong></p>\n<ul>\n<li>Deep understanding of technology and a passion for what you do.</li>\n<li>Experience with CUDA programming and NVIDIA GPUs.</li>\n<li>Knowledge of high-performance networks like InfiniBand, iWARP, etc.</li>\n<li>Experience with HPC applications.</li>\n<li>Experience with Deep Learning Frameworks such as PyTorch, TensorFlow, etc.</li>\n<li>Strong collaborative and interpersonal skills, specifically a proven ability to effectively guide and influence within a dynamic matrix environment.</li>\n</ul>\n<p><strong>Benefits:</strong></p>\n<ul>\n<li>Highly competitive salaries.</li>\n<li>Comprehensive benefits package.</li>\n<li>Eligibility for equity.</li>\n<li>Opportunity to work with a world-class engineering team.</li>\n<li>Ability to work in a dynamic matrix environment.</li>\n<li>Opportunity to contribute to cutting-edge technology.</li>\n<li>Flexible work arrangements.</li>\n<li>Professional development 
opportunities.</li>\n</ul>\n<p><strong>How to Apply:</strong></p>\n<p>Applications for this job will be accepted at least until March 13, 2026. NVIDIA uses AI tools in its recruiting processes.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cf4fd05b-818","directApply":true,"hiringOrganization":{"@type":"Organization","name":"NVIDIA","sameAs":"https://nvidia.wd5.myworkdayjobs.com","logo":"https://logos.yubhub.co/nvidia.com.png"},"x-apply-url":"https://nvidia.wd5.myworkdayjobs.com/en-US/NVIDIAExternalCareerSite/job/US-CA-Santa-Clara/Senior-Software-Engineer--GPU-Communications-and-Networking_JR1997186","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["C/C++","Linux","Computer system architecture","Operating systems","Parallel programming interfaces","Communication runtimes"],"x-skills-preferred":["CUDA programming","NVIDIA GPUs","High-performance networks","HPC applications","Deep Learning Frameworks"],"datePosted":"2026-03-09T20:44:17.925Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Santa Clara"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C/C++, Linux, Computer system architecture, Operating systems, Parallel programming interfaces, Communication runtimes, CUDA programming, NVIDIA GPUs, High-performance networks, HPC applications, Deep Learning Frameworks"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a51375e8-30e"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. 
We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost. Our work spans today&#39;s frontier AI workloads and directly shapes the next generation of accelerators, system architectures, and large-scale AI platforms. We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures. This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale. In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>About the Team</p>\n<p>We build foundational AI infrastructure that enables large-scale training and inference across diverse workloads and rapidly evolving hardware generations. Our work directly shapes how AI systems are designed, deployed, and scaled today and into the future. 
Engineers on this team operate with end-to-end ownership, deep technical rigor, and a strong bias toward real-world impact.</p>\n<p>Microsoft Superintelligence Team</p>\n<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact. If you’re a brilliant, highly ambitious, and low-ego individual, you’ll fit right in—come and join us as we work on our next generation of models!</p>\n<p>Responsibilities</p>\n<ul>\n<li>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</li>\n<li>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</li>\n<li>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</li>\n<li>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</li>\n<li>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</li>\n<li>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</li>\n<li>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</li>\n<li>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a51375e8-30e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-3/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems 
analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["Experience designing or operating large-scale AI clusters for training or inference","Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications","Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Background in performance modeling and capacity planning for future hardware generations","Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:18:41.443Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI accelerator or GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, Experience designing or operating large-scale AI clusters for training or inference, Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications, Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Background in performance modeling and capacity planning for future hardware generations, Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture 
reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b151fcc2-2fb"},"title":"Member of Technical Staff, High Performance Computing Engineer","description":"<p>We are looking for experienced Member of Technical Staff, High Performance Computing Engineers to help build and scale the infrastructure that trains our frontier models and powers the next evolution of our personal AI, Copilot.</p>\n<p>This role offers the unique opportunity to work on some of the largest scale supercomputers in the world – a rare chance to operate at such a significant scale.</p>\n<p><strong>Responsibilities</strong></p>\n<p>Design, operate, and maintain large-scale HPC environments, drawing on hands-on engineering experience in production settings.</p>\n<p>Own the deployment, configuration, and day-to-day operation of HPC schedulers (e.g., SLURM, Kubernetes), ensuring reliable and efficient job scheduling at scale.</p>\n<p>Serve as a technical owner for at least one core HPC domain (GPU compute, high-performance storage, networking, or similar), including ongoing maintenance, performance tuning, and troubleshooting of massive clusters.</p>\n<p>Develop and maintain automation and tooling using Bash and/or Python to improve cluster reliability, observability, and operational efficiency.</p>\n<p>Partner closely with researchers and engineers to support their workloads, troubleshoot cluster usage issues, and triage failed or underperforming jobs to resolution.</p>\n<p>Drive work forward independently by navigating ambiguity and technical roadblocks, delivering incremental improvements that get capabilities into users’ hands quickly.</p>\n<p><strong>Qualifications</strong></p>\n<p>Do you have a Bachelor’s degree in computer science, or related technical field AND 4+ years technical engineering experience with deploying or 
operating on-premise or cloud high-performance clusters, AND 4+ years experience working with high-scale training clusters (e.g., working with frameworks/tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, Ray, etc.), AND 4+ years experience building scalable services on top of public cloud infrastructure like Azure, AWS, or GCP, OR equivalent experience?</p>\n<p><strong>Preferred Qualifications</strong></p>\n<p>Master’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with deploying or operating on-premise or cloud high-performance clusters, AND 6+ years experience working with high-scale training clusters (e.g., working with frameworks/tools such as NVIDIA InfiniBand clusters, SLURM, Kubernetes, Ray, etc.), AND 6+ years experience building scalable services on top of public cloud infrastructure like Azure, AWS, or GCP, OR equivalent experience.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b151fcc2-2fb","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-high-performance-computing-engineer-mai-superintelligence-team-3/","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["HPC","SLURM","Kubernetes","GPU compute","high-performance storage","networking","Bash","Python","nvidia InfiniBand clusters","Ray"],"x-skills-preferred":["LLM training clusters","AI platforms","Machine Learning frameworks","large-scale HPC or GPU 
systems"],"datePosted":"2026-03-08T22:15:08.170Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Zürich"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"HPC, SLURM, Kubernetes, GPU compute, high-performance storage, networking, Bash, Python, nvidia InfiniBand clusters, Ray, LLM training clusters, AI platforms, Machine Learning frameworks, large-scale HPC or GPU systems"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_cd1a0d16-311"},"title":"Member of Technical Staff, Software Co-Design AI HPC Systems","description":"<p>Our team&#39;s mission is to architect, co-design, and productionize next-generation AI systems at datacenter scale. We operate at the intersection of models, systems software, networking, storage, and AI hardware, optimizing end-to-end performance, efficiency, reliability, and cost.</p>\n<p>We pursue this mission through deep hardware–software co-design, combining rigorous systems thinking with hands-on engineering. The team invests heavily in understanding real production workloads (large-scale training, inference, and emerging multimodal models) and translating those insights into concrete improvements across the stack: from kernels, runtimes, and distributed systems, all the way down to silicon-level trade-offs and datacenter-scale architectures.</p>\n<p>This role sits at the boundary between exploration and production. You will work closely with internal infrastructure, hardware, compiler, and product teams, as well as external partners across the hardware and systems ecosystem. 
Our operating model emphasizes rapid ideation and prototyping, followed by disciplined execution to drive high-leverage ideas into production systems that operate at massive scale.</p>\n<p>In addition to delivering real-world impact on large-scale AI platforms, the team actively contributes to the broader research and engineering community. Our work aligns closely with leading communities in ML systems, distributed systems, computer architecture, and high-performance computing, and we regularly publish, prototype, and open-source impactful technologies where appropriate.</p>\n<p>Microsoft Superintelligence Team</p>\n<p>Microsoft Superintelligence team’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.</p>\n<p>This role is part of Microsoft AI’s Superintelligence Team. The MAIST is a startup-like team inside Microsoft AI, created to push the boundaries of AI toward Humanist Superintelligence—ultra-capable systems that remain controllable, safety-aligned, and anchored to human values. Our mission is to create AI that amplifies human potential while ensuring humanity remains firmly in control. We aim to deliver breakthroughs that benefit society—advancing science, education, and global well-being. 
We’re also fortunate to partner with incredible product teams, giving our models the chance to reach billions of users and create immense positive impact.</p>\n<p>Responsibilities</p>\n<p>Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.</p>\n<p>Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.</p>\n<p>Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.</p>\n<p>Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.</p>\n<p>Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.</p>\n<p>Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.</p>\n<p>Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.</p>\n<p>Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.</p>\n<p>Qualifications</p>\n<p>Bachelor’s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR 
equivalent experience.</p>\n<p>Additional or Preferred Qualifications</p>\n<p>Master’s Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor’s Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</p>\n<p>Strong background in one or more of the following areas:</p>\n<ul>\n<li>AI accelerator or GPU architectures</li>\n<li>Distributed systems and large-scale AI training/inference</li>\n<li>High-performance computing (HPC) and collective communications</li>\n<li>ML systems, runtimes, or compilers</li>\n<li>Performance modeling, benchmarking, and systems analysis</li>\n<li>Hardware–software co-design for AI workloads</li>\n<li>Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development</li>\n</ul>\n<p>Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders. Experience designing or operating large-scale AI clusters for training or inference. Deep familiarity with LLMs, multimodal models, or recommendation systems, and their systems-level implications. Experience with accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand). Background in performance modeling and capacity planning for future hardware generations. Prior experience contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews. 
Publications, patents, or open-source contributions in systems, architecture, or ML systems are a plus.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_cd1a0d16-311","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-software-co-design-ai-hpc-systems-mai-superintelligence-team-2/","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","AI accelerator or GPU architectures","Distributed systems and large-scale AI training/inference","High-performance computing (HPC) and collective communications","ML systems, runtimes, or compilers","Performance modeling, benchmarking, and systems analysis","Hardware–software co-design for AI workloads","Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development"],"x-skills-preferred":["LLMs, multimodal models, or recommendation systems, and their systems-level implications","Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand)","Performance modeling and capacity planning for future hardware generations","Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews","Publications, patents, or open-source contributions in systems, architecture, or ML systems"],"datePosted":"2026-03-08T22:13:30.666Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Redmond"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, AI accelerator or 
GPU architectures, Distributed systems and large-scale AI training/inference, High-performance computing (HPC) and collective communications, ML systems, runtimes, or compilers, Performance modeling, benchmarking, and systems analysis, Hardware–software co-design for AI workloads, Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development, LLMs, multimodal models, or recommendation systems, and their systems-level implications, Accelerator interconnects and communication stacks (e.g., NCCL, MPI, RDMA, high-speed Ethernet or InfiniBand), Performance modeling and capacity planning for future hardware generations, Contributing to or leading hardware roadmaps, silicon bring-up, or platform architecture reviews, Publications, patents, or open-source contributions in systems, architecture, or ML systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fb4fa003-a73"},"title":"Platform Hardware Security Engineer","description":"<p><strong>About the Role</strong></p>\n<p>We&#39;re seeking a Platform Hardware Security Engineer to design and implement security architectures for bare-metal infrastructure. 
You&#39;ll work with teams across Anthropic to build firmware, bootloaders, operating systems, and attestation systems to ensure the integrity of our infrastructure from the ground up.</p>\n<p>This role requires expertise in low-level systems security and the ability to architect solutions that balance security requirements with the performance demands of training AI models across our massive fleet.</p>\n<p><strong>What you&#39;ll do:</strong></p>\n<ul>\n<li>Design and implement secure boot chains from firmware through OS initialization for diverse hardware platforms (CPUs, BMCs, switches, peripherals, and embedded microcontrollers)</li>\n<li>Architect attestation systems that provide cryptographic proof of system state from hardware root of trust through application layer</li>\n<li>Develop measured boot implementations and runtime integrity monitoring</li>\n<li>Create reference architectures and security requirements for bare-metal deployments</li>\n<li>Integrate security controls with infrastructure teams without impacting training performance</li>\n<li>Prototype and validate security mechanisms before production deployment</li>\n<li>Conduct firmware vulnerability assessments and penetration testing</li>\n<li>Build firmware analysis pipelines for continuous security monitoring</li>\n<li>Document security architectures and maintain threat models</li>\n<li>Collaborate with software and hardware vendors to ensure security capabilities meet our requirements</li>\n</ul>\n<p><strong>Who you are:</strong></p>\n<ul>\n<li>8+ years of experience in systems security, with at least 5 years focused on firmware and hardware security (firmware, bootloaders, and OS-level security)</li>\n<li>Hands-on experience with secure boot, measured boot, and attestation technologies (TPM, Intel TXT, AMD SEV, ARM TrustZone)</li>\n<li>Strong understanding of cryptographic protocols and hardware security modules</li>\n<li>Experience with UEFI/BIOS or embedded firmware security, bootloader 
hardening, and chain of trust implementation</li>\n<li>Proficiency in low-level programming (C, Rust, Assembly) and systems programming</li>\n<li>Knowledge of firmware vulnerability assessment and threat modeling</li>\n<li>Track record of designing security architectures for complex, distributed systems</li>\n<li>Experience with supply chain security</li>\n<li>Ability to work effectively across hardware and software boundaries</li>\n<li>Knowledge of NIST firmware security guidelines and hardware security frameworks</li>\n</ul>\n<p><strong>Strong candidates may also have:</strong></p>\n<ul>\n<li>Experience with confidential computing technologies and hardware-based TEEs</li>\n<li>Knowledge of SLSA framework and software supply chain security standards</li>\n<li>Experience securing large-scale HPC or cloud infrastructure</li>\n<li>Contributions to open-source security projects (coreboot, CHIPSEC, etc.)</li>\n<li>Background in formal verification or security proof techniques</li>\n<li>Experience with silicon root of trust implementations</li>\n<li>Experience building foundational technical designs, providing operational leadership, and collaborating with vendors</li>\n<li>Previous work with AI/ML infrastructure security</li>\n</ul>\n<p><strong>Logistics</strong></p>\n<ul>\n<li>Education requirements: We require at least a Bachelor&#39;s degree in a related field or equivalent experience.</li>\n<li>Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.</li>\n<li>Visa sponsorship: We do sponsor visas! However, we aren&#39;t able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.</li>\n</ul>\n<p><strong>We encourage you to apply even if you do not believe you meet every single qualification. 
Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you&#39;re interested in this work.</strong></p>\n<p><strong>Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you&#39;re ever unsure about a communication, don&#39;t click any links—visit anthropic.com/careers directly for confirmed position openings.</strong></p>\n<p><strong>How we&#39;re different</strong></p>\n<p>We believe that the highest-impact AI research will be big science. 
At Anthropic we work as a single cohesive team on just a few large-scale research efforts.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_fb4fa003-a73","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anthropic","sameAs":"https://job-boards.greenhouse.io","logo":"https://logos.yubhub.co/anthropic.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/anthropic/jobs/4929689008","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$405,000 - $485,000 USD","x-skills-required":["firmware security","hardware security","secure boot","measured boot","attestation technologies","cryptographic protocols","hardware security modules","UEFI/BIOS","embedded firmware security","bootloader hardening","chain of trust implementation","low-level programming","systems programming","firmware vulnerability assessment","threat modeling","supply chain security","NIST firmware security guidelines","hardware security frameworks"],"x-skills-preferred":["confidential computing technologies","hardware-based TEEs","SLSA framework","software supply chain security standards","large-scale HPC or cloud infrastructure","open-source security projects","formal verification","security proof techniques","silicon root of trust implementations","AI/ML infrastructure security"],"datePosted":"2026-03-08T13:47:08.377Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City, NY; Seattle, WA; San Francisco, CA; Washington, DC"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"firmware security, hardware security, secure boot, measured boot, attestation technologies, cryptographic protocols, hardware security modules, UEFI/BIOS, embedded firmware security, bootloader hardening, chain of trust implementation, low-level 
programming, systems programming, firmware vulnerability assessment, threat modeling, supply chain security, NIST firmware security guidelines, hardware security frameworks, confidential computing technologies, hardware-based TEEs, SLSA framework, software supply chain security standards, large-scale HPC or cloud infrastructure, open-source security projects, formal verification, security proof techniques, silicon root of trust implementations, AI/ML infrastructure security","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":405000,"maxValue":485000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_517e3008-238"},"title":"Physical Design Engineer","description":"<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$266K – $445K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n<li>401(k) retirement plan with employer match</li>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n<li>Mental health and wellness support</li>\n<li>Employer-paid basic life and disability coverage</li>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n<li>Relocation support for eligible employees</li>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>OpenAI’s Hardware team designs the custom silicon that powers the 
world’s most advanced AI systems. From system-level architecture to custom circuit implementations, we partner closely with model and infrastructure teams to deliver performance, power, and efficiency breakthroughs across all layers of the stack.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a highly skilled Silicon Implementation Engineer with deep expertise in physical design and methodology. This individual contributor role sits within our physical design team and is central to delivering power, performance, and area (PPA) optimized datapath and interconnect solutions for next-generation AI accelerators.</p>\n<p>You’ll work closely with RTL designers to define and execute on physical design strategies. You will develop tools, flows and methodologies to increase team productivity. Your work will directly impact silicon’s performance and cost efficiency, as well as the team’s execution velocity and quality.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Develop, build and own tools, flows and methodologies for physical implementation</li>\n<li>Own physical implementation of floorplan blocks from floorplanning to final signoff</li>\n<li>Collaborate with RTL designers to drive optimal block implementation solutions</li>\n<li>Analyze and optimize design for timing, power, and area trade-offs, working in collaboration with EDA vendors and ASIC partners</li>\n</ul>\n<p><strong>Qualifications:</strong></p>\n<ul>\n<li>BS with 4+ years, MS with 2+ years, or PhD with 0-1 year(s) of relevant industry experience in physical design and methodology development</li>\n<li>Demonstrated success in taping out complex silicon designs</li>\n<li>Hands-on experience with block physical implementation and PPA convergence</li>\n<li>Strong coding experience with python, bazel, TCL</li>\n<li>Strong experience building physical design tools, flows and methodologies</li>\n<li>Strong understanding of microarchitecture, RTL design, physical design, circuit design, 
physical verification and timing closure.</li>\n<li>Deep familiarity with industry-standard tools and flows for physical synthesis, PNR, LEC and power estimation</li>\n</ul>\n<p><strong>Bonus:</strong></p>\n<ul>\n<li>Experience with AI or HPC-focused chips</li>\n<li>Experience with optimizing PPA for high performance compute cores</li>\n<li>Hands-on experience with top-level design methodologies</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_517e3008-238","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/5a265d2b-683f-4cea-9b69-8e137e704ab3","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$266K – $445K","x-skills-required":["physical design","methodology development","python","bazel","TCL","EDA vendors","ASIC partners","microarchitecture","RTL design","physical design","circuit design","physical verification","timing closure"],"x-skills-preferred":["AI or HPC-focused chips","optimizing PPA for high performance compute cores","top-level design 
methodologies"],"datePosted":"2026-03-06T18:41:49.725Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"physical design, methodology development, python, bazel, TCL, EDA vendors, ASIC partners, microarchitecture, RTL design, physical design, circuit design, physical verification, timing closure, AI or HPC-focused chips, optimizing PPA for high performance compute cores, top-level design methodologies","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":266000,"maxValue":445000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_f2722128-3e2"},"title":"Inference Runtime, Engineering Manager","description":"<p><strong>Inference Runtime, Engineering Manager</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$455K – $555K</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology 
to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineering leader who wants to build and lead the world&#39;s leading AI systems and modeling engineers who take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Lead a team of engineers who are experts in working with distributed systems, with a deep understanding of model architecture and system co-design with research and production teams.</li>\n</ul>\n<ul>\n<li>Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work in an outcome-oriented environment where everyone contributes across layers of the stack, from infra plumbing to performance tuning.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of GPUs to utilize every FLOP and every GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up 
whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 15 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus points if you have worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_f2722128-3e2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/4f998abb-4510-4bd3-9922-161599625171","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$455K – $555K","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","InfiniBand","MPI","NVLink","HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"x-skills-preferred":["HPC technologies","Distributed systems","Model architecture","System co-design","Machine learning","Research","Production","Software engineering","GPU optimization"],"datePosted":"2026-03-06T18:39:15.426Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, InfiniBand, MPI, NVLink, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU optimization, HPC technologies, Distributed systems, Model architecture, System co-design, Machine learning, Research, Production, Software engineering, GPU 
optimization","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":455000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d5390946-539"},"title":"Software Engineer, Model Inference","description":"<p><strong>Software Engineer, Model Inference</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Scaling</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$295K – $555K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or 
local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>Our Inference team brings OpenAI’s most capable research and technology to the world through our products. We empower consumers, enterprises, and developers alike to use and access our state-of-the-art AI models, allowing them to do things that they’ve never been able to before. We focus on performant and efficient model inference, as well as accelerating research progression via model inference.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for an engineer who wants to take the world&#39;s largest and most capable AI models and optimize them for use in a high-volume, low-latency, and high-availability production and research environment.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Work alongside machine learning researchers, engineers, and product managers to bring our latest technologies into production.</li>\n</ul>\n<ul>\n<li>Work alongside researchers to enable advanced research through awesome engineering.</li>\n</ul>\n<ul>\n<li>Introduce new techniques, tools, and architecture that improve the performance, latency, throughput, and efficiency of our model inference stack.</li>\n</ul>\n<ul>\n<li>Build tools to give us visibility into our bottlenecks and sources of instability and then design and implement solutions to address the highest priority issues.</li>\n</ul>\n<ul>\n<li>Optimize our code and fleet of Azure VMs to utilize every FLOP and every 
GB of GPU RAM of our hardware.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have an understanding of modern ML architectures and an intuition for how to optimize their performance, particularly for inference.</li>\n</ul>\n<ul>\n<li>Own problems end-to-end, and are willing to pick up whatever knowledge you&#39;re missing to get the job done.</li>\n</ul>\n<ul>\n<li>Have at least 5 years of professional software engineering experience.</li>\n</ul>\n<ul>\n<li>Have or can quickly gain familiarity with PyTorch, NVidia GPUs and the software stacks that optimize them (e.g. NCCL, CUDA), as well as HPC technologies such as InfiniBand, MPI, NVLink, etc.</li>\n</ul>\n<ul>\n<li>Have experience architecting, building, observing, and debugging production distributed systems. Bonus points if you have worked on performance-critical distributed systems.</li>\n</ul>\n<ul>\n<li>Have needed to rebuild or substantially refactor production systems several times over due to rapidly increasing scale.</li>\n</ul>\n<ul>\n<li>Are self-directed and enjoy figuring out the most important problem to work on.</li>\n</ul>\n<ul>\n<li>Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d5390946-539","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/83b6755d-7785-4186-9050-5ef3ad127941","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$295K – $555K • Offers Equity","x-skills-required":["PyTorch","NVidia GPUs","NCCL","CUDA","HPC technologies","InfiniBand","MPI","NVLink","Azure VMs","GPU RAM","FLOP"],"x-skills-preferred":["modern ML architectures","intuition for optimizing performance","distributed systems","performance-critical distributed systems"],"datePosted":"2026-03-06T18:31:29.482Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PyTorch, NVidia GPUs, NCCL, CUDA, HPC technologies, InfiniBand, MPI, NVLink, Azure VMs, GPU RAM, FLOP, modern ML architectures, intuition for optimizing performance, distributed systems, performance-critical distributed systems","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":295000,"maxValue":555000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_771f8f42-36f"},"title":"CFD Methodology Engineer","description":"<p><strong>Join Our Team</strong></p>\n<p>We are the Audi Revolut F1 
Team.</p>\n<p>Audi will compete in the FIA Formula 1 World Championship starting in 2026.</p>\n<p>The team has three locations. Audi&#39;s Motorsport Competence Center in Neuburg, Germany, is considered one of the most modern of its kind in Europe.</p>\n<p>The Formula 1 Factory of Audi Motorsport AG in Hinwil (Switzerland) is known for its innovative technologies and passion for racing.</p>\n<p>The Audi Motorsport Technology Centre, UK, is in its launch phase and growing continuously. Our power unit comes from Neuburg, while Hinwil is responsible for chassis development and race operations.</p>\n<p><strong>Your mission</strong></p>\n<ul>\n<li>Contribute to the overall CFD process development in an F1 environment</li>\n<li>Have an impact on how CFD technology is used by the whole aerodynamic department</li>\n<li>Improve the CFD physical modelling to increase the accuracy of the prediction</li>\n<li>Work closely with external software providers on specific development projects</li>\n<li>Help us set future directions and trends in advanced CFD methods</li>\n<li>Work in an agile software development team</li>\n</ul>\n<p><strong>Your profile</strong></p>\n<ul>\n<li>MSc or PhD in Aerospace Engineering, Mechanical Engineering, Mathematics or a related field with a strong focus on computational fluid dynamics</li>\n<li>At least 3 years of professional experience developing CFD Methods in OpenFOAM</li>\n<li>Previous F1 experience will be considered a plus</li>\n<li>Knowledge of high-Reynolds-number flow physics and turbulence modelling</li>\n<li>Knowledge of C++ and Python programming languages</li>\n<li>Familiarity with software release, update and deployment techniques</li>\n<li>Experience in parallel computing and high-performance computing (HPC)</li>\n<li>Familiarity with developing software in Linux environments</li>\n<li>Hard-working, independent and reliable</li>\n<li>Self-motivated, responsible and meticulous</li>\n<li>Open to sharing a continuous learning culture within 
the team</li>\n<li>Promote a collaborative, respectful and trusting work environment within the team</li>\n<li>Proficiency in English. German language skills will be an advantage</li>\n</ul>\n<p><strong>Driven by passion.</strong></p>\n<p><strong>Every lap is a statement, every start a promise. This is what defines us.</strong></p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_771f8f42-36f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Audi Revolut F1 Team","sameAs":"https://www.audif1.com","logo":"https://logos.yubhub.co/audif1.com.png"},"x-apply-url":"https://www.audif1.com/career/details/1589976","x-work-arrangement":"onsite","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["MSc or PhD in Aerospace Engineering, Mechanical Engineering, Mathematics or a related field","At least 3 years of professional experience developing CFD Methods in OpenFOAM","Knowledge of high-Reynolds flows physics and turbulence modelling","Knowledge of C++ and Python programming languages","Familiarity with software release, update and deployment techniques","Experience in parallel computing and high-performance computing (HPC)","Familiarity with developing software in Linux environments"],"x-skills-preferred":["Previous F1 experience","German language skills"],"datePosted":"2026-03-06T18:18:00.137Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Hinwil, Switzerland"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Motorsport","skills":"MSc or PhD in Aerospace Engineering, Mechanical Engineering, Mathematics or a related field, At least 3 years of professional experience 
developing CFD Methods in OpenFOAM, Knowledge of high-Reynolds flows physics and turbulence modelling, Knowledge of C++ and Python programming languages, Familiarity with software release, update and deployment techniques, Experience in parallel computing and high-performance computing (HPC), Familiarity with developing software in Linux environments, Previous F1 experience, German language skills"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8a34364f-8c5"},"title":"Member of Technical Staff, Hardware Health","description":"<p><strong>Summary</strong></p>\n<p>Microsoft AI is looking for a talented Member of Technical Staff, Hardware Health, to ensure our AI systems deliver sustained reliability, performance, and availability across exascale-class deployments.</p>\n<p><strong>About the Role</strong></p>\n<p>We work closely with research, hardware, datacenter, and platform engineering teams to develop predictive health models, failure detection frameworks, and autonomous remediation systems that keep our AI clusters operating at frontier scale. 
Our team is responsible for Copilot, Bing, Edge, and generative AI research.</p>\n<p><strong>Accountabilities</strong></p>\n<ul>\n<li>Design and develop next-generation hardware health monitoring and diagnostic frameworks for large GPU clusters (NVL16/NVL72/GB200+ scale).</li>\n<li>Build predictive analytics pipelines leveraging telemetry, power, and thermal data to anticipate hardware degradation and systemic issues.</li>\n<li>Collaborate with silicon, firmware, and datacenter engineers to identify root causes and remediate large-scale hardware anomalies.</li>\n<li>Define system health KPIs (e.g., NIS/RIS, MTBF, failure domain analysis) and integrate them into real-time observability platforms.</li>\n<li>Lead incident triage for high-impact GPU, network, and cooling issues across distributed clusters.</li>\n<li>Drive automation in health management to reduce manual intervention to the top 5% of anomalies.</li>\n<li>Partner with cross-functional teams to influence hardware design for reliability, thermal efficiency, and serviceability.</li>\n</ul>\n<p><strong>The Candidate we&#39;re looking for</strong></p>\n<p><strong>Experience:</strong></p>\n<ul>\n<li>Bachelor&#39;s Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.</li>\n</ul>\n<p><strong>Technical skills:</strong></p>\n<ul>\n<li>Experience working with large-scale HPC or GPU systems (NVIDIA H100/GB200 or equivalent).</li>\n<li>Deep understanding of GPU architecture, high-speed interconnects (NVLink, InfiniBand, RoCE), and large datacenter topologies.</li>\n<li>Proficiency in hardware telemetry, diagnostics, or failure analysis tools.</li>\n</ul>\n<p><strong>Personal attributes:</strong></p>\n<ul>\n<li>Strong analytical and problem-solving skills.</li>\n<li>Excellent communication and collaboration 
skills.</li>\n</ul>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Competitive salary.</li>\n<li>Comprehensive benefits package.</li>\n<li>Opportunities for professional growth and development.</li>\n<li>Collaborative and dynamic work environment.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8a34364f-8c5","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Microsoft AI","sameAs":"https://microsoft.ai","logo":"https://logos.yubhub.co/microsoft.ai.png"},"x-apply-url":"https://microsoft.ai/job/member-of-technical-staff-hardware-health-mai-superintelligence-team-5/","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"USD $139,900 – $274,800 per year","x-skills-required":["C","C++","C#","Java","JavaScript","Python","GPU architecture","high-speed interconnects","hardware telemetry","diagnostics","failure analysis tools"],"x-skills-preferred":["experience working with large-scale HPC or GPU systems","deep understanding of GPU architecture","proficiency in hardware telemetry","diagnostics","failure analysis tools"],"datePosted":"2026-03-06T07:33:03.791Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"C, C++, C#, Java, JavaScript, Python, GPU architecture, high-speed interconnects, hardware telemetry, diagnostics, failure analysis tools, experience working with large-scale HPC or GPU systems, deep understanding of GPU architecture, proficiency in hardware telemetry, diagnostics, failure analysis 
tools","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":139900,"maxValue":274800,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_148ddf8d-fe9"},"title":"IT Director - Infrastructure Engineering","description":"<p>We are seeking an experienced IT Director to lead our Infrastructure Engineering team. As a seasoned IT leader, you will be responsible for developing and executing strategic IT infrastructure plans, policies, and procedures that align with Synopsys&#39; global and regional objectives.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<ul>\n<li>Developing and executing strategic IT infrastructure plans, policies, and procedures that align with Synopsys&#39; global and regional objectives.</li>\n<li>Overseeing the design, implementation, and maintenance of HPC Engineering infrastructure, including Compute, Citrix, Storage, Networks, and Data Centers to ensure seamless operations.</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>Extensive experience (20+ years) in managing large-scale, 24x7 IT infrastructure delivery programs for global organizations.</li>\n<li>Deep technical expertise in HPC Engineering infrastructure—Compute, Citrix, Storage, Networks, and Data Centers.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_148ddf8d-fe9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/it-director-infrastructure-engineering/44408/92296852016","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"employee","x-salary-range":null,"x-skills-required":["IT infrastructure 
management","HPC Engineering infrastructure","IT operations","governance","compliance","service management"],"x-skills-preferred":["job scheduling and queuing systems","LSF","SLURM"],"datePosted":"2026-03-06T07:20:45.459Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru"}},"occupationalCategory":"Information Technology","industry":"Technology","skills":"IT infrastructure management, HPC Engineering infrastructure, IT operations, governance, compliance, service management, job scheduling and queuing systems, LSF, SLURM"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_4ca6ccaa-38a"},"title":"Sr Staff, Data Analytics Engineer- 14154","description":"<p>This role is open to hiring in Mississauga (preferred) as well as Ottawa.</p>\n<p><strong>We Are</strong></p>\n<p>At Synopsys, we drive the innovations that shape the way we live and connect. Our technology is central to the Era of Pervasive Intelligence, from self-driving cars to learning machines. We lead in chip design, verification, and IP integration, empowering the creation of high-performance silicon chips and software content. Join us to transform the future through continuous technological innovation.</p>\n<p><strong>You Are</strong></p>\n<p>You possess strong analytical and problem-solving skills, with a proven track record of driving innovation and implementing solutions that improve efficiency and performance. You are detail-oriented and can work independently with minimal supervision. Your communication skills are excellent, enabling you to present complex technical information clearly and effectively to diverse audiences, including senior management and cross-functional teams.</p>\n<p><strong>What you&#39;ll do</strong></p>\n<ul>\n<li>Manage and optimize compute and disk resources to support large-scale simulation workloads.</li>\n<li>Monitor and troubleshoot compute infrastructure to ensure high availability and performance.</li>\n<li>Collaborate with IT and infrastructure teams to scale resources as needed for complex simulation tasks.</li>\n<li>Develop and implement strategies for efficient resource utilization, including job scheduling, load balancing, and storage optimization.</li>\n<li>Identify opportunities for automation in simulation workflows and implement solutions to reduce manual effort and improve efficiency.</li>\n<li>Develop custom scripts and tools using Python, Tcl, or other programming languages to automate repetitive tasks and enhance simulation processes.</li>\n<li>Integrate automation solutions into existing workflows, ensuring seamless operation and scalability.</li>\n<li>Stay updated on emerging technologies and methodologies to continuously improve automation capabilities.</li>\n</ul>\n<p><strong>What you need</strong></p>\n<ul>\n<li>In-depth knowledge of compute infrastructure, including high-performance computing (HPC) environments, job schedulers (e.g., LSF), and disk storage systems.</li>\n<li>Proficiency in simulation methodologies, including corner analysis, Monte Carlo simulations, and parasitic extraction.</li>\n<li>Experience with automation scripting using Python, Tcl, or similar languages.</li>\n<li>Exceptional analytical thinking skills with the ability to diagnose and resolve complex simulation and infrastructure issues.</li>\n<li>Bachelor’s or Master’s degree in Electrical Engineering, Computer Engineering, or a related field.</li>\n<li>3 or more years of experience in a relevant area.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_4ca6ccaa-38a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/mississauga/sr-staff-data-analytics-engineer-14154/44408/91386421712","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["In-depth knowledge of compute infrastructure","Proficiency in simulation methodologies","Experience with automation scripting","Exceptional analytical thinking skills"],"x-skills-preferred":["Python","Tcl","High-performance computing (HPC) environments","Job schedulers (e.g., LSF)","Disk storage systems"],"datePosted":"2026-02-11T16:13:25.610Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mississauga"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"In-depth knowledge of compute infrastructure, Proficiency in simulation methodologies, Experience with automation scripting, Exceptional analytical thinking skills, Python, Tcl, High-performance computing (HPC) environments, Job schedulers (e.g., LSF), Disk storage systems"}]}