{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/linux-system-administration"},"x-facet":{"type":"skill","slug":"linux-system-administration","display":"Linux System Administration","count":9},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a5430d30-778"},"title":"Network Engineer","description":"<p>About Us At Cloudflare, we are on a mission to help build a better Internet. Today the company runs one of the world’s largest networks that powers millions of websites and other Internet properties for customers ranging from individual bloggers to SMBs to Fortune 500 companies.</p>\n<p>Responsibilities: We are looking for a Network Engineer to join our team (5+yrs). Cloudflare is building one of the largest, most resilient networks that spans over 335 cities spread across all regions and we plan to continue our expansion at a rapid pace. You will have the opportunity to (literally) build a faster, safer Internet for our millions of users and the billions of web surfers that visit their sites each month. 
This position will be responsible for:</p>\n<ul>\n<li>Technical operation and engineering of the Cloudflare network, including the provisioning and management of the network hardware and software,</li>\n<li>Day-to-day network operations and monitoring, working closely with internal teams such as System Reliability Engineering, Infrastructure Engineering and Customer Support teams,</li>\n<li>Creating and maintaining documentation, SOPs, and the knowledge base,</li>\n<li>Interacting with our network peers to assist with their inquiries, and subsequently providing meaningful data on performance degradation.</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Capable of learning new technologies / systems / features under guidance of mentors,</li>\n<li>Proficient in multiple network vendor operating systems, with associate-level network certification(s) (JNCIA, CCNA, etc.) or higher,</li>\n<li>Understanding of BGP, knowledge of the OSI model, and experience isolating network, hardware, and software issues,</li>\n<li>Experience writing scripts in Bash, Python, or another scripting language,</li>\n<li>Experience in working as part of a team in a customer-facing role,</li>\n<li>Ability to prioritise when faced with high-pressure scenarios.</li>\n</ul>\n<p>Bonus Points but not required:</p>\n<ul>\n<li>Understanding of anycast routing,</li>\n<li>Good working knowledge of Junos, IOS-XR, NX-OS, EOS, and SONiC,</li>\n<li>Experience writing network configuration and design documentation,</li>\n<li>Experience solving problems through automation,</li>\n<li>Experience with optical transport technologies such as CWDM/DWDM,</li>\n<li>Linux system administration,</li>\n<li>Multilingual.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a5430d30-778","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cloudflare","sameAs":"https://www.cloudflare.com/","logo":"https://logos.yubhub.co/cloudflare.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/cloudflare/jobs/7628395","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["network vendor operating systems","Associate level network certification(s)","BGP","OSI-model","scripting languages (Bash, Python, etc.)","team collaboration","prioritization"],"x-skills-preferred":["anycast routing","Junos","IOS-XR","NX-OS","EOS","SONIC","network configuration and design documentation","problem-solving through automation","optical transport technologies (CWDM/DWDM)","Linux system administration","multilingual"],"datePosted":"2026-04-18T15:54:59.447Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"In-Office"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"network vendor operating systems, Associate level network certification(s), BGP, OSI-model, scripting languages (Bash, Python, etc.), team collaboration, prioritization, anycast routing, Junos, IOS-XR, NX-OS, EOS, SONIC, network configuration and design documentation, problem-solving through automation, optical transport technologies (CWDM/DWDM), Linux system administration, multilingual"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2ab9c635-07a"},"title":"Operations Engineer, Fleet Reliability","description":"<p>The Fleet Reliability Operations team is responsible for the day-to-day provisioning, management, and uptime of CoreWeave&#39;s ever-expanding fleet of server nodes. 
This team plays a central role in CoreWeave&#39;s growth strategy, configuring, updating, and remotely troubleshooting our highest-tier supercomputing clusters and their networking, delivery platforms, and tools dependencies.</p>\n<p>We are seeking curious, creative, and persistent problem solvers to join our Fleet Reliability Operations team to help drive batches of server nodes through our provisioning and validation processes while efficiently and effectively troubleshooting node or cluster problems as they arise.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Configuring and maintaining large-scale high-performance supercomputing clusters running state-of-the-art GPUs</li>\n<li>Troubleshooting hardware and software issues; escalating and coordinating as needed with data center, network, hardware, and platform teams to drive resolution</li>\n<li>Monitoring and analyzing system performance and taking appropriate remediation actions for cloud health</li>\n<li>Approaching work with flexibility and optimism, anticipating shifting business and technical priorities</li>\n<li>Creating and maintaining documentation of team processes, knowledge, and best practices for system management</li>\n<li>Thinking critically about day-to-day work and working collaboratively to improve team processes and efficiency</li>\n</ul>\n<p>As a member of our team, you will be part of a dynamic and fast-paced environment where you will have the opportunity to grow and develop your skills. 
We offer a competitive salary range of $83,000 to $110,000, as well as a comprehensive benefits package, including medical, dental, and vision insurance, company-paid life insurance, and flexible PTO.</p>\n<p>If you are a motivated and detail-oriented individual who is passionate about working with cutting-edge technology, we encourage you to apply for this exciting opportunity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_2ab9c635-07a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4617382006","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$83,000 to $110,000","x-skills-required":["Linux system administration","Troubleshooting hardware and software issues","System maintenance tasks","Scripting languages (bash, python, powershell, etc)","Grafana, Prometheus, PromQL queries or similar observability platforms"],"x-skills-preferred":["Kubernetes administration","HPC - administering GPU-related workloads","Data center environments including server racks, HVAC systems, fiber trays"],"datePosted":"2026-04-18T15:51:55.238Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, NY / Plano, TX / Bellevue, WA / Sunnyvale, CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Troubleshooting hardware and software issues, System maintenance tasks, Scripting languages (bash, python, powershell, etc), Grafana, Prometheus, PromQL queries or similar observability platforms, Kubernetes administration, HPC - administering GPU-related workloads, Data center environments including server racks, HVAC 
systems, fiber trays","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":83000,"maxValue":110000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fb9b187c-e32"},"title":"HPC Engineer","description":"<p>We are seeking a skilled and driven NVLink Engineer to support large-scale data center deployments. In this role, you&#39;ll be at the forefront of cutting-edge infrastructure technologies, ensuring the optimal performance and stability of NVLink systems.</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li>Support the deployment of NVLink systems across large data center environments.</li>\n<li>Support the full lifecycle management of NVLink hardware and software components.</li>\n<li>Build and maintain tooling to automate and streamline the deployment, monitoring and troubleshooting workflows.</li>\n<li>Diagnose and resolve performance, connectivity and stability issues in complex environments.</li>\n<li>Collaborate with internal teams and external customers worldwide.</li>\n<li>Participate in a rotating on-call schedule to ensure 24/7 support coverage.</li>\n</ul>\n<p>Required Qualifications:</p>\n<ul>\n<li>Solid understanding of networking fundamentals</li>\n<li>Proven background in troubleshooting network and server hardware at the component level.</li>\n<li>Strong Linux system administration skills.</li>\n<li>Proficiency in at least one language (e.g., Python, Go).</li>\n<li>Proven ability to troubleshoot and debug complex application issues.</li>\n<li>Excellent communication and collaboration skills.</li>\n<li>Experience with Ansible.</li>\n</ul>\n<p>Preferred Qualifications:</p>\n<ul>\n<li>Experience with InfiniBand networking.</li>\n<li>Experience managing large-scale environments (1,000+ switches or nodes).</li>\n<li>Prior experience with NVLink technologies.</li>\n<li>Knowledge of Redfish API for system 
management.</li>\n<li>Experience with NVUE (NVIDIA User Experience).</li>\n<li>Background with SONiC.</li>\n<li>Experience with Grafana/PromQL</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_fb9b187c-e32","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4645664006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$109,000 to $204,000","x-skills-required":["Networking fundamentals","Linux system administration","Python","Go","Troubleshooting and debugging"],"x-skills-preferred":["InfiniBand networking","Ansible","Redfish API","NVUE","SONiC","Grafana/PromQL"],"datePosted":"2026-04-18T15:50:52.753Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, NY/ Bellevue, WA/ Sunnyvale, CA / Livingston, NJ"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Networking fundamentals, Linux system administration, Python, Go, Troubleshooting and debugging, InfiniBand networking, Ansible, Redfish API, NVUE, SONiC, Grafana/PromQL","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":109000,"maxValue":204000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1868194d-726"},"title":"Operations Engineer, HPC Networking","description":"<p>In this role, you will support the deployment, monitoring, troubleshooting, and maintenance of large-scale InfiniBand fabrics, ensuring their stability and performance.</p>\n<p>The ideal candidate will have a strong operations mindset, effective 
collaboration skills, and the ability to solve complex issues in a dynamic environment.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Regularly monitoring the performance and health of InfiniBand fabrics, including switches, host adapters, and nodes.</li>\n<li>Investigating and resolving operational issues within InfiniBand fabrics, such as network connectivity problems and performance bottlenecks.</li>\n<li>Assisting with the installation and operational bring-up of large InfiniBand fabrics in collaboration with onsite personnel and customer teams.</li>\n<li>Performing routine maintenance and upgrades on InfiniBand switches and control plane components.</li>\n<li>Collaborating with HPC cluster operations teams to provide troubleshooting and operational expertise.</li>\n</ul>\n<p>Investing in our people is one of our top priorities, and we value candidates who can bring their diversified experiences to our teams.</p>\n<p>Minimum Qualifications:</p>\n<ul>\n<li>At least 1 year of experience with InfiniBand or similar networking technologies.</li>\n<li>Solid understanding of networking concepts, including architectures, topologies, operational best practices, and troubleshooting.</li>\n<li>Experience with Linux system administration and maintenance.</li>\n<li>Proficiency in at least one scripting language.</li>\n</ul>\n<p>Preferred Qualifications:</p>\n<ul>\n<li>Hands-on experience with Nvidia UFM or similar fabric management tools.</li>\n<li>Familiarity with SLURM job scheduler and its role in HPC environments.</li>\n<li>Experience with monitoring and visualization platforms such as Grafana or Prometheus.</li>\n<li>Experience with operational tooling and automation frameworks like Ansible.</li>\n<li>Knowledge of data center operations, including server racks, and cabling.</li>\n<li>Python or Bash scripting.</li>\n</ul>\n<p>Why CoreWeave? At CoreWeave, we work hard, have fun, and move fast! 
We’re in an exciting stage of hyper-growth that you will not want to miss out on. We’re not afraid of a little chaos, and we’re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and enables the development of innovative solutions to complex problems. As we get set for takeoff, the organization&#39;s growth opportunities are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.</p>\n<p>Come join us!</p>\n<p>The base salary range for this role is $110,000 to $179,000. The starting salary will be determined based on job-related knowledge, skills, experience, and market location. 
We strive for both market alignment and internal equity when determining compensation.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1868194d-726","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4673462006","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$110,000 to $179,000","x-skills-required":["InfiniBand","Linux system administration","Scripting language","Networking concepts","Architectures","Topologies","Operational best practices","Troubleshooting"],"x-skills-preferred":["Nvidia UFM","SLURM job scheduler","Grafana","Prometheus","Ansible","Data center operations","Server racks","Cabling","Python","Bash scripting"],"datePosted":"2026-04-18T15:50:12.336Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"InfiniBand, Linux system administration, Scripting language, Networking concepts, Architectures, Topologies, Operational best practices, Troubleshooting, Nvidia UFM, SLURM job scheduler, Grafana, Prometheus, Ansible, Data center operations, Server racks, Cabling, Python, Bash scripting","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":110000,"maxValue":179000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_db7b0f51-7df"},"title":"Senior Cloud Support Engineer","description":"<p>As a Senior Cloud Support Engineer at CoreWeave, you&#39;ll be on the front lines of 
a technological revolution, empowering our customers to harness the full potential of our advanced Kubernetes-powered HPC cloud infrastructure.</p>\n<p>You&#39;ll be hands-on, collaborating with engineers and researchers to resolve issues that impact high-profile, mission-critical applications and cutting-edge AI training workloads. Your contributions will be pivotal in ensuring seamless performance, reliability, and success for our customers, positioning you at the very core of transformative technologies reshaping industries worldwide at a company that is truly one of a kind.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Guide and mentor team members in developing their technical skills and troubleshooting capabilities across all disciplines supported by CoreWeave.</li>\n<li>Provide real-time feedback and coaching, reviewing tickets to identify opportunities for improvement and ensure quality assurance (QA).</li>\n<li>Develop and deliver training sessions to improve the team&#39;s proficiency and efficiency in resolving customer issues.</li>\n<li>Use technical expertise to investigate, debug, and resolve customer-impacting issues with the curiosity required to uncover and understand root causes.</li>\n<li>Maintain high customer satisfaction through swift, accurate, and empathetic high-touch support communications, as well as established best practices.</li>\n<li>Help design and implement troubleshooting best practices to ensure fast, accurate client resolutions.</li>\n<li>Contribute to refining processes, workflows, and playbooks for handling complex customer challenges.</li>\n<li>Serve as a technical escalation point for high-priority escalations or complex cases, modeling effective problem-solving approaches.</li>\n<li>Lead the creation of knowledge-sharing resources, including documentation, tutorials, and how-to guides.</li>\n<li>Enhance the support team&#39;s knowledge of CoreWeave&#39;s products and services through continuous learning 
initiatives.</li>\n</ul>\n<p>Who You Are:</p>\n<ul>\n<li>Have a Bachelor&#39;s degree in Information Science / Information Technology, Data Science, Computer Science, Engineering, Mathematics, Physics, or a related field, OR equivalent experience in a technical position</li>\n<li>At least 5+ years of experience in cloud support, systems administration, or related technical support-focused roles</li>\n<li>Proven hands-on work experience with Kubernetes</li>\n<li>Experience with networking, load balancing, storage volumes, observability, node management, High-Performance Computing (HPC), and Linux system administration</li>\n<li>Proven ability to mentor team members, foster technical growth, and improve team-wide capabilities through guidance and feedback</li>\n<li>Experience with observability tools such as Grafana</li>\n<li>Strong troubleshooting skills, with experience resolving complex customer issues and driving quality assurance through ticket reviews or similar processes</li>\n<li>Demonstrated success collaborating with cross-functional teams to refine workflows, implement best practices, and advocate for necessary tools or process changes</li>\n<li>Excellent written and verbal communication skills, with a track record of simplifying complex concepts for diverse audiences</li>\n<li>Strong technical presentation skills, with experience delivering precise, engaging, and informative presentations to technical and non-technical audiences, effectively showcasing complex concepts and solutions</li>\n</ul>\n<p>Preferred:</p>\n<ul>\n<li>CKA Certified</li>\n<li>Demonstrated experience with training, coaching, and creating onboarding materials.</li>\n<li>Operates in a fast-paced, global, 24/7 support team environment</li>\n<li>Ability to collaborate across different time zones</li>\n<li>On-site office environment, hybrid, or remote options depending on location</li>\n<li>Flexible to travel up to 10% (~25 days/year)</li>\n</ul>\n<p>Why CoreWeave?</p>\n<p>At CoreWeave, we 
work hard, have fun, and move fast! We&#39;re in an exciting stage of hyper-growth that you will not want to miss out on. We&#39;re not afraid of a little chaos, and we&#39;re constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:</p>\n<ul>\n<li>Be Curious at Your Core</li>\n<li>Act Like an Owner</li>\n<li>Empower Employees</li>\n<li>Deliver Best-in-Class Client Experiences</li>\n<li>Achieve More Together</li>\n</ul>\n<p>We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for take off, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too.</p>\n<p>Come join us!</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_db7b0f51-7df","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4568136006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$122,000 to $163,000","x-skills-required":["cloud support","systems administration","Kubernetes","networking","load balancing","storage volumes","observability","node management","High-Performance Computing (HPC)","Linux system administration"],"x-skills-preferred":["CKA Certified","training","coaching","onboarding materials","fast-paced global support team environment","collaboration across different time 
zones"],"datePosted":"2026-04-18T15:49:50.841Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Livingston, NJ / New York, NY / Sunnyvale, CA / Bellevue, WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"cloud support, systems administration, Kubernetes, networking, load balancing, storage volumes, observability, node management, High-Performance Computing (HPC), Linux system administration, CKA Certified, training, coaching, onboarding materials, fast-paced global support team environment, collaboration across different time zones","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":122000,"maxValue":163000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_a561c761-1f3"},"title":"Manager, Bare Metal Support Engineering","description":"<p>The Customer Experience (CX) Organisation at CoreWeave is dedicated to ensuring every client running AI workloads at scale has a seamless, reliable, and high-performance experience.</p>\n<p>As a Manager of Bare Metal Support Engineering, you&#39;ll be at the centre of ensuring our dedicated infrastructure remains stable, reliable, and performant. 
You&#39;ll lead daily support operations, triage incidents, drive escalations, and ensure that hardware is monitored, maintained, and delivered effectively for our clients.</p>\n<p>Key responsibilities include:</p>\n<ul>\n<li>Leading a skilled team responsible for maintaining and optimising physical infrastructure across multiple client environments.</li>\n<li>Building, developing, and leading a dedicated Infrastructure Support team focused on supporting key infrastructure, handling escalations, and ensuring smooth hardware operations.</li>\n<li>Overseeing the resolution of infrastructure-related incidents, escalation management, and collaborating with internal teams to deliver effective solutions.</li>\n<li>Improving support processes to enhance efficiency and reduce downtime, ensuring the infrastructure meets client expectations.</li>\n</ul>\n<p>The ideal candidate will have 5+ years of experience leading teams responsible for infrastructure support, data centre operations, or physical compute environments. 
They should be hands-on with Linux system administration and command-line tools, familiar with hardware-level diagnostics, troubleshooting, and replacement, and have experience working with high-performance rack-scale hardware.</p>\n<p>In addition to the required skills, preferred skills include experience managing infrastructure support teams in high-growth or rapidly evolving environments, proven ability to develop and implement operational processes that scale with business needs, and strong familiarity with server and GPU hardware lifecycle management.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_a561c761-1f3","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4649055006","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$170,000 to $240,000 SGD","x-skills-required":["Linux system administration","Command-line tools","Hardware-level diagnostics","Troubleshooting and replacement","High-performance rack-scale hardware"],"x-skills-preferred":["Managing infrastructure support teams","Developing and implementing operational processes","Server and GPU hardware lifecycle management"],"datePosted":"2026-04-18T15:45:59.370Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Singapore"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Command-line tools, Hardware-level diagnostics, Troubleshooting and replacement, High-performance rack-scale hardware, Managing infrastructure support teams, Developing and implementing operational processes, Server and GPU hardware lifecycle 
management","baseSalary":{"@type":"MonetaryAmount","currency":"SGD","value":{"@type":"QuantitativeValue","minValue":170000,"maxValue":240000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b29decfc-c15"},"title":"Site Reliability Staff Engineer - Administrator","description":"<p>At Synopsys, we drive the innovations that shape the way we live and connect. Our technology is central to the Era of Pervasive Intelligence, from self-driving cars to learning machines. We lead in chip design, verification, and IP integration, empowering the creation of high-performance silicon chips and software content.</p>\n<p>You are a highly motivated Site Reliability Staff Engineer with a passion for Linux platforms and a commitment to operational excellence. You thrive in dynamic, multi-faceted environments and are energized by the challenge of deploying, maintaining, and optimizing complex systems. Your curiosity drives you to continually learn and adapt, while your technical expertise enables you to solve intricate problems efficiently.</p>\n<ul>\n<li>Administering and managing Linux operating systems, including kernel components, memory management, process scheduling, and system performance optimization.</li>\n<li>Performing routine and advanced system administration tasks such as monitoring, tuning, and troubleshooting across bare-metal and virtualized nodes.</li>\n<li>Deploying, configuring, and managing Linux-based operating systems using Kickstart and Ansible for automation and environment standardization.</li>\n<li>Implementing and managing MAAS (Metal as a Service) for large-scale bare-metal provisioning and lifecycle operations.</li>\n<li>Operating and maintaining OpenStack environments for on-demand computing and cloud infrastructure.</li>\n<li>Providing support for virtualization technologies (VMware, KVM, etc.), including troubleshooting and maintenance.</li>\n<li>Delivering basic Linux networking support, resolving connectivity, 
routing, firewall, NIC bonding, VLAN, and interface configuration issues.</li>\n<li>Collaborating with cross-functional teams to enhance infrastructure reliability, scalability, and security.</li>\n<li>Creating and maintaining detailed documentation, including configurations, SOPs, troubleshooting guides, and operational runbooks.</li>\n</ul>\n<ul>\n<li>Ensuring the reliability and uptime of critical Linux environments that underpin Synopsys&#39; engineering and development operations.</li>\n<li>Enabling rapid deployment and scalability of infrastructure through automation and standardized processes.</li>\n<li>Reducing downtime and improving system performance by proactively identifying and resolving technical issues.</li>\n<li>Enhancing security and compliance across platforms through robust configuration and monitoring practices.</li>\n<li>Accelerating innovation by providing stable, high-performance environments for development and testing teams.</li>\n<li>Fostering a collaborative culture by sharing expertise, mentoring peers, and contributing to knowledge repositories.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b29decfc-c15","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/site-reliability-staff-engineer-administrator/44408/93181374944","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Linux system administration","Linux internals","Kickstart","Ansible","MAAS","OpenStack","Virtualization technologies","Linux networking","Scripting languages (Bash, 
Python)","Problem-solving","Communication"],"x-skills-preferred":[],"datePosted":"2026-04-05T13:21:03.331Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Linux internals, Kickstart, Ansible, MAAS, OpenStack, Virtualization technologies, Linux networking, Scripting languages (Bash, Python), Problem-solving, Communication"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2ed6e0d4-88f"},"title":"Site Reliability Staff - EDA Engineering Compute and IT Infrastructure Automation","description":"<p>At Synopsys, we drive the innovations that shape the way we live and connect. Our technology is central to the Era of Pervasive Intelligence, from self-driving cars to learning machines. We lead in chip design, verification, and IP integration, empowering the creation of high-performance silicon chips and software content.</p>\n<p>You are a forward-thinking and highly motivated IT professional with a passion for reliability, automation, and infrastructure excellence. You thrive in dynamic, fast-paced environments where your expertise in data center operations, engineering compute, and automation can make a tangible impact. 
With a strong foundation in Linux-based environments, virtualization, and network architecture, you bring both depth and breadth to IT operations, ensuring optimal performance and security.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Maintain and optimize the Taiwan data center at the Synopsys Hsinchu office, ensuring compliance with corporate IT data center and security standards.</li>\n<li>Manage the full server hardware lifecycle, including rack installation, provisioning, maintenance, and decommissioning.</li>\n<li>Support and manage engineering compute IT environments located at key customer data centers, such as TSMC and MediaTek.</li>\n<li>Collaborate with corporate network and InfoSec teams to maintain data center network infrastructure, including core switches, TOR, firewalls, routers, and circuits.</li>\n<li>Oversee the software architecture of EDA compute, focusing on Linux environments, and provide troubleshooting expertise for virtualization platforms, job schedulers, and remote access solutions.</li>\n<li>Design and implement automation processes to streamline regular tasks, reduce manual effort, and adopt ML/AI technologies for efficient EDA compute orchestration and secure chamber domain management.</li>\n<li>Proactively monitor, analyze, and optimize system performance to maximize uptime and reliability.</li>\n</ul>\n<p><strong>Impact</strong></p>\n<ul>\n<li>Ensure the seamless operation and scalability of Synopsys&#39; Taiwan data center, directly supporting R&amp;D and customer-facing teams.</li>\n<li>Enable faster and more reliable silicon design and verification cycles for leading semiconductor companies.</li>\n<li>Drive continuous improvement in data center operations through automation and adoption of cutting-edge technologies.</li>\n<li>Safeguard mission-critical infrastructure by upholding best practices in security and compliance.</li>\n<li>Facilitate collaboration between internal teams and major customers, strengthening 
Synopsys&#39; reputation as a trusted technology partner.</li>\n<li>Contribute to the global reliability and performance of Synopsys&#39; EDA compute environment, empowering innovation at scale.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>Proven experience in data center operations, including server hardware lifecycle management and resource provisioning.</li>\n<li>Strong knowledge of Linux system administration, virtualization platforms, and EDA compute architectures.</li>\n<li>Solid understanding of data center networking concepts, including core switches, firewalls, routers, and security protocols.</li>\n<li>Hands-on expertise in automation tools and scripting (such as Python, Bash, or Ansible), with a track record of process optimization.</li>\n<li>Experience implementing or supporting ML/AI-based solutions for IT infrastructure orchestration is a plus.</li>\n</ul>\n<p><strong>Team</strong></p>\n<p>You&#39;ll join a dedicated IT engineering compute team responsible for maintaining and optimizing Synopsys&#39; Taiwan data center and engineering compute operations. The team plays a critical role in supporting Synopsys&#39; R&amp;D software development and providing seamless support to key customers across Taiwan, including industry leaders like TSMC and MediaTek. You&#39;ll collaborate with global IT, InfoSec, and engineering teams, driving future EDA compute expansion and innovation.</p>\n<p><strong>Rewards and Benefits</strong></p>\n<p>We offer a comprehensive range of health, wellness, and financial benefits to cater to your needs. Our total rewards include both monetary and non-monetary offerings. 
Your recruiter will provide more details about the salary range and benefits during the hiring process.</p>","url":"https://yubhub.co/jobs/job_2ed6e0d4-88f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/hsinchu/site-reliability-staff-eda-engineering-compute-and-it-infrastructure-automation/44408/91681543232","x-work-arrangement":"onsite","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Linux system administration","Virtualization platforms","EDA compute architectures","Data center networking","Automation tools and scripting","ML/AI-based solutions"],"x-skills-preferred":["Python","Bash","Ansible"],"datePosted":"2026-03-09T11:03:25.511Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Hsinchu"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Virtualization platforms, EDA compute architectures, Data center networking, Automation tools and scripting, ML/AI-based solutions, Python, Bash, Ansible"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_101df34a-252"},"title":"Site Reliability Manager","description":"<p>You will lead and be part of a Linux Engineering / Site Reliability Engineering organisation responsible for frontline (L1) production support. 
The team works closely with L2/L3 engineering, platform, network, security, and R&amp;D teams to ensure reliable and scalable infrastructure operations across the business.</p>\n<p><strong>Job Description</strong></p>\n<p>We are a technology organisation operating high performance, large scale Linux production environments that support critical platforms and engineering teams. Our focus is on operational excellence, service reliability, automation, and continuous improvement. We run 24x7 operations and partner closely with platform, network, security, and engineering teams to deliver stable, secure, and scalable infrastructure.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Leading and managing a 24x7 L1 Linux Engineering / SRE team operating in rotational shifts</li>\n<li>Owning hiring, onboarding, performance management, coaching, and career development for L1 engineers</li>\n<li>Owning L1 production support operations for Linux systems in a 24x7 environment</li>\n<li>Acting as the first leadership escalation point during major production incidents</li>\n<li>Ensuring adherence to SLAs, OLAs, and operational KPIs such as availability and MTTR</li>\n<li>Providing technical oversight across Linux OS, bare metal and virtualized platforms, and monitoring/logging systems</li>\n<li>Driving automation adoption using Ansible, Bash, and Python to reduce manual toil</li>\n<li>Defining and maintaining SOPs, runbooks, escalation procedures, and documentation</li>\n<li>Partnering with platform, network, security, and engineering teams to improve system reliability and resilience</li>\n</ul>\n<p><strong>Impact</strong></p>\n<ul>\n<li>Ensuring stable, reliable, and efficient 24x7 L1 Linux/SRE operations</li>\n<li>Reducing incident recurrence and improving incident response and resolution times</li>\n<li>Building a skilled, motivated, and well-governed L1 engineering team</li>\n<li>Improving operational maturity through automation, standardization, and 
documentation</li>\n<li>Enabling engineering and R&amp;D teams through predictable and resilient platform operations</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>10–14+ years of experience in IT Infrastructure, Linux Operations, or SRE</li>\n<li>4–6+ years of people management experience, preferably managing 24x7 support teams</li>\n<li>Strong hands-on background in Linux system administration and production support</li>\n<li>Experience with incident management, on-call models, and rotational shifts</li>\n<li>Advanced knowledge of Linux OS internals</li>\n<li>Experience with virtualization platforms (VMware, KVM, OpenStack, oVirt)</li>\n<li>Knowledge of monitoring and logging tools (e.g., Nagios, ELK)</li>\n<li>Experience with automation and configuration management (Ansible)</li>\n<li>Scripting skills in Bash and/or Python</li>\n</ul>\n<p><strong>Who You Are</strong></p>\n<ul>\n<li>A strong people leader with excellent coaching and decision-making skills</li>\n<li>Calm and effective under high-pressure production scenarios</li>\n<li>Highly structured and data-driven in driving operational excellence</li>\n<li>An effective communicator and stakeholder partner</li>\n<li>Passionate about reliability engineering, automation, and continuous improvement</li>\n</ul>\n<p><strong>Rewards and Benefits</strong></p>\n<ul>\n<li>Opportunity to lead mission-critical, large-scale Linux and SRE operations</li>\n<li>High visibility role with exposure to senior leadership and engineering stakeholders</li>\n<li>Ability to shape operational strategy, automation, and reliability practices</li>\n<li>Strong focus on career growth, learning, and leadership development</li>\n</ul>","url":"https://yubhub.co/jobs/job_101df34a-252","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Synopsys","sameAs":"https://careers.synopsys.com","logo":"https://logos.yubhub.co/careers.synopsys.com.png"},"x-apply-url":"https://careers.synopsys.com/job/bengaluru/site-reliability-manager/44408/92446615696","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Linux system administration","Linux OS internals","Virtualization platforms","Monitoring and logging tools","Automation and configuration management","Scripting skills in Bash and/or Python"],"x-skills-preferred":["Ansible","Bash","Python"],"datePosted":"2026-03-08T22:18:45.406Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Linux system administration, Linux OS internals, Virtualization platforms, Monitoring and logging tools, Automation and configuration management, Scripting skills in Bash and/or Python, Ansible, Bash, Python"}]}