{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/mlops-workflows"},"x-facet":{"type":"skill","slug":"mlops-workflows","display":"Mlops Workflows","count":1},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_2e8a2997-260"},"title":"Senior Infrastructure Engineer","description":"<p>We are open to hiring at multiple levels for this role, depending on experience, impact, and demonstrated ownership. While this role is level-agnostic, it is best suited for engineers with experience owning and working in highly ambiguous problem spaces.</p>\n<p>About the company:\nThe mining industry has steadily become worse at finding new ore deposits, requiring &gt;10X more capital to make discoveries compared to 30 years ago. KoBold Metals builds AI models for mineral exploration and deploys those models, alongside our novel sensors, to guide decisions on KoBold-owned-and-operated exploration programs.</p>\n<p>About The Role:\nIn this role, you will partner with exploration and engineering teams to build reliable, scalable infrastructure that makes it easier to turn data and models into real-world exploration insights. You will improve observability, streamline MLOps workflows, and maintain shared tools like JupyterHub that enable faster experimentation and collaboration. 
Your work will help create a solid foundation for scientists and engineers to focus on discovery instead of infrastructure.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Design, build, and operate compute infrastructure that is both scalable and reliable to support critical services.</li>\n<li>Work closely with engineering teams to embed observability, reliability, and security throughout the software development process.</li>\n<li>Create and maintain automation for monitoring, deployments, and incident response to keep operations efficient and predictable.</li>\n<li>Lead or support capacity planning, performance reviews, and system tuning to ensure stable and efficient systems.</li>\n<li>Join the on-call rotation and take part in incident response, troubleshooting, and resolution.</li>\n<li>Develop and refine monitoring and alerting to catch issues early and reduce downtime.</li>\n<li>Establish and maintain disaster recovery and business continuity practices that protect the organization against failures.</li>\n<li>Regularly review and improve our tools and processes to strengthen system visibility and reliability.</li>\n<li>Investigate points of fragility in distributed systems and understand how complex systems behave under stress in order to improve resilience.</li>\n<li>Continually learn about mineral exploration through reading, discussions with exploration team members, periodic rotation on an exploration team and time in the field with geologists</li>\n</ul>\n<p>Qualifications</p>\n<ul>\n<li>5+ years of experience as an Infrastructure Engineer, Site Reliability Engineer or in a similar role</li>\n<li>Strong scripting and programming skills (Python, Go, Java or JavaScript/Node.js)</li>\n<li>Experience with IaC tools like Terraform and container orchestration tools like Kubernetes and Docker</li>\n<li>Experience with cloud platforms such as AWS</li>\n<li>Experience operating or administering JupyterHub in a multi-user environment</li>\n<li>Understanding of MLOps 
workflows, including model training, deployment, and related tooling</li>\n<li>Excellent communication &amp; collaboration skills and a continuous improvement mindset</li>\n<li>Proven ability to troubleshoot complex issues and implement effective solutions</li>\n<li>Proven ability to thrive in dynamic and evolving environments, effectively navigating uncertainty and incomplete information.</li>\n<li>Proven ability to grow expertise, influence &amp; educate others</li>\n<li>Comfortable making informed decisions with limited data, adapting quickly to new circumstances, and maintaining focus on strategic objectives while driving clarity for the team.</li>\n<li>Intellectual curiosity and eagerness to learn about all aspects of mineral exploration, particularly in the geology domain. Enjoys constantly learning such that you are driving insights through using our tools in exploration and willing to work directly with geologists in the field.</li>\n<li>Ability to explain technical problems to and collaborate on solutions with domain experts who are not infrastructure engineers. 
A strong communicator who enjoys working with colleagues across the company.</li>\n<li>Excitement about joining a fast-growing early-stage company, comfort with a dynamic work environment, and eagerness to take on an evolving range of responsibilities.</li>\n<li>Keen not just to build cool technology, but to figure out what technical product to build to best achieve the business objectives of the company.</li>\n</ul>","url":"https://yubhub.co/jobs/job_2e8a2997-260","directApply":true,"hiringOrganization":{"@type":"Organization","name":"KoBold Metals","sameAs":"https://koboldmetals.com/","logo":"https://logos.yubhub.co/koboldmetals.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/koboldmetals/jobs/4002126005","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$170,000 - $230,000","x-skills-required":["scripting","programming","IaC","container orchestration","cloud platforms","MLOps workflows","observability","reliability","security","automation","monitoring","deployments","incident response","capacity planning","performance reviews","system tuning","disaster recovery","business continuity","tools","processes","distributed systems","complex systems","resilience","mineral exploration","geology"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:40:33.164Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"scripting, programming, IaC, container orchestration, cloud platforms, MLOps workflows, observability, reliability, security, automation, monitoring, deployments, incident response, capacity planning, performance reviews, system tuning, disaster recovery, business continuity, tools, processes, 
distributed systems, complex systems, resilience, mineral exploration, geology","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":170000,"maxValue":230000,"unitText":"YEAR"}}}]}