{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/site-reliability-engineering"},"x-facet":{"type":"skill","slug":"site-reliability-engineering","display":"Site Reliability Engineering","count":25},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d6e7c226-e8c"},"title":"Technical Lead, MFT MDE Analytics Engineering","description":"<p>The SPEED Market Data team at Equity IT is seeking a hands-on Technical Lead to own and drive a critical workstream focused on architecting, implementing, monitoring, and supporting low-latency C++ systems. As a Technical Lead, you will shape the future of the industry by working alongside exceptional engineers and strategists to solve significant engineering problems.</p>\n<p>We are looking for a strong technical leader with financial markets technology experience and real-time market data expertise to design, build, and support our global real-time market data platform. 
This role emphasizes technical leadership, architectural ownership, and cross-team coordination rather than people management.</p>\n<p>Principal Responsibilities:</p>\n<ul>\n<li>Act as the technical owner for a major market data workstream, setting technical direction, defining architecture, and driving execution across the full lifecycle.</li>\n<li>Collaborate with hardware and software teams across divisions to design and build real-time market data processing and distribution systems.</li>\n<li>Lead and drive new technical initiatives for the team, including evaluating technologies, defining standards, and establishing best practices.</li>\n<li>Design and develop systems, interfaces, and tools for historical market data and trading simulations that increase research productivity.</li>\n<li>Architect and implement components of an enterprise market data platform, including components for caching, aggregation, conflation and value-added data enrichment.</li>\n<li>Optimise platform performance using network and systems programming, and advanced low-latency techniques (CPU, NIC, kernel, and application-level tuning).</li>\n<li>Lead the design and maintenance of automated test and benchmark frameworks, and tools for risk management, performance tracking, and system validation.</li>\n<li>Provide technical leadership for the support and operation of both enterprise real-time market data environments, including coordinating internal, vendor, and exchange-driven changes.</li>\n<li>Design and engineer components to automate support and management of the market data platform, including monitoring, real-time and historical metrics collection/visualisation, and self-service administrative/user tools.</li>\n<li>Serve as a primary technical liaison for users of the market data environment (Portfolio Managers, trading desks, and core technology teams), translating requirements into robust technical solutions.</li>\n<li>Lead the enhancement of processes and workflows for 
operating the market data platform (release/deployment, incident management and remediation, exchange notification handling, defining and enforcing SLAs).</li>\n<li>Mentor and influence other engineers through code reviews, design reviews, and hands-on guidance, fostering a culture of technical excellence and accountability.</li>\n</ul>\n<p>Qualifications / Skills Required:</p>\n<ul>\n<li>Degree in Computer Science or a related field with a strong background in data structures, algorithms, and object-oriented programming in modern C++.</li>\n<li>Deep understanding of Linux system internals and networking, especially in low-latency and high-throughput environments.</li>\n<li>Strong knowledge of CPU architecture and the ability to leverage CPU capabilities for performance optimisation.</li>\n<li>Demonstrated experience acting as a technical lead or senior engineer owning complex systems or workstreams end-to-end (design, delivery, and operations).</li>\n<li>Able to prioritise and make trade-offs in a fast-moving, high-pressure, constantly changing environment; strong sense of urgency, ownership, and follow-through.</li>\n<li>Strong belief in and practice of extreme ownership, with a track record of taking accountability for systems in production.</li>\n<li>Effective communication and stakeholder management skills: able to work closely with business and technology users, understand their needs, and drive appropriate technical solutions.</li>\n<li>Experience building solutions on cloud environments such as GCP and AWS.</li>\n<li>Knowledge of additional programming languages such as Java, Python, or scripting (Perl, shell).</li>\n<li>Technical background in application development on complex market data systems (e.g., Bloomberg, Thomson Reuters, etc.).</li>\n<li>Experience supporting market data environments within a global organisation, including internally developed DMA feed handlers and distribution infrastructure.</li>\n<li>Strong understanding of market data 
concepts and functionality, including data models (fields/messages), protocols (e.g., snapshot + delta), order book representations (L1/L2/L3), recovery, and reliability.</li>\n<li>Hands-on Site Reliability Engineering or DevOps experience, including system administration, automation, measurement, and release/deployment management.</li>\n<li>Experience with monitoring, metrics, and command/control tooling for distributed market data platforms, with the ability to evaluate existing solutions and drive enhancements across development and operations.</li>\n<li>Ability to operate with a high level of thoroughness and attention to detail, demonstrating strong ownership of deliverables and production systems.</li>\n</ul>\n<p>Millennium pays a total compensation package which includes a base salary, discretionary performance bonus, and a comprehensive benefits package. The estimated base salary range for this position is $175,000 to $250,000, which is specific to New York and may change in the future. 
When finalising an offer, we take into consideration an individual&#39;s experience level and the qualifications they bring to the role to formulate a competitive total compensation package.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d6e7c226-e8c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Equity IT","sameAs":"https://mlp.eightfold.ai","logo":"https://logos.yubhub.co/mlp.eightfold.ai.png"},"x-apply-url":"https://mlp.eightfold.ai/careers/job/755954905529","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$175,000 to $250,000","x-skills-required":["C++","Linux system internals","Networking","CPU architecture","Object-oriented programming","Cloud environments","Java","Python","Scripting","Market data systems","Site Reliability Engineering","DevOps","Monitoring","Metrics","Command/control tooling"],"x-skills-preferred":[],"datePosted":"2026-04-18T22:13:18.645Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, New York, United States of America"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Finance","skills":"C++, Linux system internals, Networking, CPU architecture, Object-oriented programming, Cloud environments, Java, Python, Scripting, Market data systems, Site Reliability Engineering, DevOps, Monitoring, Metrics, Command/control tooling","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":175000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_262aa1cb-01c"},"title":"Head of Corporate Engineering","description":"<p>As Head of Corporate Engineering, you will be responsible for Enterprise engineering and operations 
globally. You will be responsible for building and managing a highly technical enterprise engineering team, developing first-principles-based strategies, and enabling strong enterprise security.</p>\n<p>Key responsibilities include engineering, securing and optimizing cloud infrastructure, Identity and Access Management, Endpoints, Collaboration tools, and ensuring compliance with SOX, PCI DSS, and FedRAMP. The Head of Corporate Engineering will work closely with R&amp;D on managing engineering tools like Jira, Confluence, and GitHub, driving efficient adoption and integration.</p>\n<p>Strong technical and influencing leadership principles coupled with the ability to manage a complex, scaling, and fast-moving enterprise environment are essential. This role reports directly to the Vice President, Infrastructure and Operations.</p>\n<p>Responsibilities:</p>\n<p>In this influential role, you will be responsible for:</p>\n<p>Securing the Enterprise: Working closely with the Enterprise Security organization to harden and secure our cloud environments, secret management, collaboration tools, endpoints, SaaS environments, IAM tools, and more. Success is measured in continuous improvement of our enterprise security hardening standards.</p>\n<p>Building and Scaling our Cloud Infrastructure: Your team will be responsible for establishing and implementing enterprise cloud infrastructure including establishing Infrastructure Provisioning, SRE services, 24/7 on-call support, Infra as Code, observability, and more. In addition, you will be responsible for managing cloud budgets, vendor management, and establishing cost optimization initiatives. 
Success is measured in increased developer velocity while securing &amp; scaling the cloud infrastructure</p>\n<p>Engineering Tooling: Partner closely with R&amp;D teams to establish policies, configurations, run-books, SLAs, hardening, scalability and availability of engineering tools like Github, Jira, Atlassian, and more</p>\n<p>Endpoint Engineering: Enable extreme automation for endpoint management with zero-touch deployment, observability (synthetic and real-time), provisioning/de-provisioning, and establishing standards / SLAs. Enforce security policies, configure &amp; manage security settings and ensure compliance across all endpoints and mobile devices. Success is measured in terms of end-user satisfaction and % of manual touch</p>\n<p>Collaboration Management: Ensure we provide world class tools to our employees to be extremely productive and collaborative. This would include but not be limited to managing and scaling internal workplace products like Gmail, Slack, Atlassian, Moveworks, Glean, and more. Success is measured by user satisfaction</p>\n<p>Identity &amp; Access Management: Manage the IAM team from IAM implementation, access standards enforcement, SLA management, and compliance to various standards like FedRAMP, IL5, PCI, and more. Included are both internal and external identity providers to be managed. Success is measured by compliance, Identity governance, and availability</p>\n<p>Desired Success Outcomes</p>\n<p>A high-performing enterprise engineering team capable of handling complex technical projects with agility and high quality</p>\n<p>Well defined cloud strategy ensuring the stability, scalability, and security of cloud infrastructure. 
Overhaul of current processes and workflows to address inefficiencies and increase team velocity</p>\n<p>Robust endpoint security with Implementation of comprehensive security measures for all endpoints, including Mac, Windows, and mobile devices</p>\n<p>Deliver high-quality employee experience with productivity tools (Gmail, Slack, Atlassian tools, Moveworks, GitHub) with a robust forward-looking roadmap</p>\n<p>Efficient operational support for Tier 3 IT services with minimized production incidents. Implementation of robust incident and change management processes with mature operational practice</p>\n<p>Efficient and mature processes for system integrations related to Mergers and Acquisitions (M&amp;As), ensuring timely smooth transitions during M&amp;A integrations</p>\n<p>Development and implementation of automation tools and frameworks, Identification of automation opportunities to reduce manual toil and improve accuracy</p>\n<p>Qualifications:</p>\n<p>10 years of experience managing Cloud infrastructure at large enterprises. Extensive experience managing public cloud implementations in AWS. Experience with GCP and Azure will be a plus</p>\n<p>In-depth understanding of Cloud native technologies to lead and guide the team. Must have hands-on experience in troubleshooting and debugging issues in production environments</p>\n<p>Working experience in managing DevOps/SRE practices OKRs (Objective and Key Results), Agile development, Infra-as-code, SRE (Site Reliability Engineering), DevOps measurement such as DORA KPIs,</p>\n<p>In-depth understanding of each collaboration tool&#39;s features, functionalities, and configurations (e.g., Gmail for email, Slack for messaging). Ability to identify and integrate and optimize the use of various tools for seamless collaboration (e.g., connecting Jira with GitHub for Dev metrics)</p>\n<p>Experience leading a team of senior professionals working asynchronously in a remote, distributed team. 
Strong communication skills, with clear verbal communication and written communication skills</p>\n<p>Collaborative style: partners well with cross-functional teams to solve hard problems and to complete complex deliverables with quality and business outcomes</p>\n<p>Provide mentorship and guidance to team members to ensure that their skills and knowledge are kept up-to-date</p>\n<p>Pay Range Transparency Databricks is committed to fair and equitable compensation practices. The pay range(s) for this role is listed below and represents the expected salary range for non-commissionable roles or on-target earnings for commissionable roles. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to job-related skills, depth of experience, relevant certifications and training, and specific work location. Based on the factors above, Databricks anticipates utilizing the full width of the range. The total compensation package for this position may also include eligibility for annual performance bonus, equity, and the benefits listed above. 
For more information regarding which range your location is in visit our page here.</p>\n<p>Zone 1 Pay Range $265,000-$364,300 USD</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_262aa1cb-01c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/7293607002","x-work-arrangement":"remote","x-experience-level":"executive","x-job-type":"full-time","x-salary-range":"$265,000-$364,300 USD","x-skills-required":["Cloud infrastructure","Identity and Access Management","Endpoint security","Collaboration tools","DevOps","Site Reliability Engineering","Agile development","Infrastructure as Code","Observability","Automation","Scripting languages","Cloud native technologies","Public cloud implementations","AWS","GCP","Azure"],"x-skills-preferred":["Jira","Confluence","GitHub","Atlassian","Moveworks","Glean","Slack","Gmail","Microsoft Office"],"datePosted":"2026-04-18T15:58:26.589Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco, California"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cloud infrastructure, Identity and Access Management, Endpoint security, Collaboration tools, DevOps, Site Reliability Engineering, Agile development, Infrastructure as Code, Observability, Automation, Scripting languages, Cloud native technologies, Public cloud implementations, AWS, GCP, Azure, Jira, Confluence, GitHub, Atlassian, Moveworks, Glean, Slack, Gmail, Microsoft 
Office","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":265000,"maxValue":364300,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_38a5c86c-54e"},"title":"Senior Compliance Engineer","description":"<p>JOB TITLE: Senior Compliance Engineer LOCATION: Costa Mesa, California, United States DEPARTMENT: Corporate Technology : Information Security : Corporate Assurance</p>\n<p>As a Senior Compliance Engineer at Anduril Industries, you will be responsible for driving automation, compliance, and security engineering principles into the design, integration, and operation of Anduril&#39;s internal systems. This is a technically hands-on role that requires a strong DevSecOps background with deep expertise in cloud infrastructure security, embedded systems security, and federal compliance frameworks.</p>\n<p><strong>Key Responsibilities</strong></p>\n<ul>\n<li>Design, develop, and maintain Infrastructure as Code (IaC) and Policy as Code (PaC) that enforce compliance with NIST SP 800-171 and 800-53, CMMC, and other applicable frameworks, enabling developers to deploy CMMC-certified applications using pre-packaged, compliant infrastructure templates.</li>\n<li>Architect, build, and deploy robust, scalable security controls across Anduril&#39;s corporate, development, and production cloud environments (AWS, Azure, GCP) and on-premise environments.</li>\n<li>Develop and automate IaC pipelines for managing and scaling cloud deployments securely and efficiently, including automated pipelines for deploying infrastructure, applications, and updates.</li>\n<li>Build automation for procedural compliance controls, generating compliance and audit artifacts at scale without manual intervention.</li>\n<li>Develop security models that integrate Continuous Monitoring (ConMon), DISA STIG scanning, and compliance reporting into a unified, automated 
workflow.</li>\n</ul>\n<p><strong>Compliance Engineering &amp; Framework Implementation</strong></p>\n<ul>\n<li>Analyze, interpret, and operationalize federal and industry cybersecurity regulations, including NIST SP 800-171 and 800-53, CMMC, FedRAMP, and SOC 2, translating regulatory language into actionable engineering guidance and enforceable technical controls.</li>\n<li>Evaluate system architectures and configurations to ensure alignment with required security controls for moderate-impact information systems.</li>\n<li>Interface directly with infrastructure teams to verify and enforce compliance across existing on-premise and cloud stacks, identifying gaps and driving remediation.</li>\n</ul>\n<p><strong>Cross-Functional Collaboration &amp; Enablement</strong></p>\n<ul>\n<li>Partner with engineers, the DevSecOps Team, and the Automation Team to implement and verify security controls in both corporate and product software environments.</li>\n<li>Act as a force multiplier by embedding security best practices into the workflows of infrastructure, application, and product teams, particularly for environments holding mission-critical data.</li>\n</ul>\n<p><strong>Strategic &amp; Advisory</strong></p>\n<ul>\n<li>Develop strategies and implementation plans for compliance-related matters, advising management on risk posture, regulatory changes, and investment priorities.</li>\n<li>Institute best-practice procedures for compliance and risk mitigation across the organization.</li>\n</ul>\n<p><strong>Required Qualifications</strong></p>\n<ul>\n<li>3+ years of professional experience in Cloud Security, DevSecOps, Site Reliability Engineering (SRE), or a related security engineering role.</li>\n<li>Background in one or more of the following disciplines: Systems Security Engineering, Cybersecurity, Systems Engineering, Software Engineering, Computer Engineering, or Computer Science.</li>\n<li>Proven experience building and securing complex cloud environments at 
scale.</li>\n<li>3+ years of hands-on experience working with compliance frameworks such as CMMC, NIST SP 800-171 and/or 800-53, and FedRAMP.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_38a5c86c-54e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Anduril Industries","sameAs":"https://www.anduril.com/","logo":"https://logos.yubhub.co/anduril.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/andurilindustries/jobs/5087188007","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Cloud Security","DevSecOps","Site Reliability Engineering","Systems Security Engineering","Cybersecurity","Systems Engineering","Software Engineering","Computer Engineering","Computer Science","Compliance Frameworks","NIST SP 800-171","NIST SP 800-53","CMMC","FedRAMP"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:54:24.911Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Costa Mesa, California, United States"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cloud Security, DevSecOps, Site Reliability Engineering, Systems Security Engineering, Cybersecurity, Systems Engineering, Software Engineering, Computer Engineering, Computer Science, Compliance Frameworks, NIST SP 800-171, NIST SP 800-53, CMMC, FedRAMP"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_40d32156-365"},"title":"Reliability Lead, Common Services","description":"<p>As Reliability Lead, Common Services, you will establish and lead the Reliability Engineering and production operations practice for the Common Services organization. 
You&#39;ll partner closely with engineering leaders and teams across Common Services to define how we build, release, monitor, and operate critical services, raising the bar on reliability, availability, and operational excellence across the board.</p>\n<p>In this role, you will:</p>\n<ul>\n<li>Establish and lead the SRE / production engineering practice for the Common Services organization, including standards for reliability, incident management, and on-call, in partnership with the central Product Engineering organization.</li>\n<li>Develop an Operational Excellence strategy that focuses not only on improving system performance but also on monitoring and reducing operational toil.</li>\n<li>Partner with engineering and product teams to define SLOs, SLIs, and error budgets for critical Common Services, and ensure these become part of how teams plan and make tradeoffs.</li>\n<li>Own and improve the incident management lifecycle for Common Services, including on-call rotations, escalation paths, incident tooling, post-incident reviews, and follow-through on corrective actions.</li>\n<li>Drive the observability strategy (metrics, logs, traces, dashboards, alerts) for Common Services, ensuring we have actionable visibility into the health, performance, and capacity of key systems.</li>\n<li>Collaborate with engineering leads to design and review architectures for reliability, scalability, resilience, and operability, including failure modes, redundancy, and graceful degradation.</li>\n<li>Lead efforts to automate and harden operational workflows, including deployments, rollbacks, configuration management, change management, and routine maintenance tasks.</li>\n<li>Build strong, trust-based relationships with partner teams and stakeholders, becoming a go-to leader for production readiness and operational risk within Common Services.</li>\n<li>Hire, mentor, and develop SRE and production engineering talent, fostering a culture of continuous improvement, learning from 
incidents, and humane on-call.</li>\n<li>Partner with other SRE and production engineering leaders across CoreWeave to align on global practices, tools, and reliability goals, representing the needs and constraints of Common Services.</li>\n</ul>\n<p>You will be responsible for defining the reliability strategy, processes, and standards for the Common Services portfolio and driving consistent, high-quality operational practices across multiple teams.</p>\n<p>The base salary range for this role is $206,000 to $303,000.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_40d32156-365","directApply":true,"hiringOrganization":{"@type":"Organization","name":"CoreWeave","sameAs":"https://www.coreweave.com","logo":"https://logos.yubhub.co/coreweave.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/coreweave/jobs/4650165006","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$206,000 to $303,000","x-skills-required":["Site Reliability Engineering","Production Engineering","Linux-based production environments","Containers","Orchestration technologies","Observability stacks","Alerting systems","SLIs/SLOs","Error budgets","Incident management","On-call rotations","Escalation paths","Post-incident reviews","Corrective actions","Automation tooling","Infrastructure-as-code","CI/CD pipelines"],"x-skills-preferred":["GPU workloads","High-performance computing","Latency/throughput-sensitive systems","Multi-tenant environments","Multi-region environments","Regulated environments","Service ownership models","Mentoring","Managing senior engineers"],"datePosted":"2026-04-18T15:47:45.370Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, NY / Sunnyvale, CA / Bellevue, 
WA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, Production Engineering, Linux-based production environments, Containers, Orchestration technologies, Observability stacks, Alerting systems, SLIs/SLOs, Error budgets, Incident management, On-call rotations, Escalation paths, Post-incident reviews, Corrective actions, Automation tooling, Infrastructure-as-code, CI/CD pipelines, GPU workloads, High-performance computing, Latency/throughput-sensitive systems, Multi-tenant environments, Multi-region environments, Regulated environments, Service ownership models, Mentoring, Managing senior engineers","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":206000,"maxValue":303000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_da7679a6-e4f"},"title":"Senior Technical Operations Lead","description":"<p>Job Title: Senior Technical Operations Lead</p>\n<p>We are seeking an experienced Senior Technical Operations Lead to drive operational excellence across our Infrastructure Engineering organization.</p>\n<p>As a Senior Technical Operations Lead, you will design and implement world-class operational processes, establish SRE best practices, and mentor technical teams to achieve exceptional reliability and efficiency.</p>\n<p>Key Responsibilities:</p>\n<p>SRE Leadership &amp; Transformation</p>\n<ul>\n<li>Lead the design and implementation of SRE practices and tooling across Infrastructure Engineering</li>\n</ul>\n<ul>\n<li>Establish and cultivate an SRE-focused culture at Zoominfo</li>\n</ul>\n<p>Operational Process Design &amp; Governance</p>\n<ul>\n<li>Establish clear governance frameworks and procedural consistency</li>\n</ul>\n<ul>\n<li>Make decisions about process exceptions and/or changes to accommodate different team 
contexts</li>\n</ul>\n<ul>\n<li>Design and/or implement process automations using scripts and integrations</li>\n</ul>\n<ul>\n<li>Define functional requirements and goals for process automations</li>\n</ul>\n<ul>\n<li>Conduct hands-on and/or automated audits to ensure process adherence and identify improvement opportunities</li>\n</ul>\n<p>Incident Management &amp; Root Cause Analysis</p>\n<ul>\n<li>Design, implement, and continuously improve Incident Management and Change Management procedures that scale across the organization, using tools such as PagerDuty, Slack, Jira, ServiceNow, and custom integrations</li>\n</ul>\n<ul>\n<li>Lead and participate in root cause analysis sessions, driving teams toward systemic improvements rather than blame</li>\n</ul>\n<ul>\n<li>Design and execute incident dry runs and tabletop exercises to build organizational resilience</li>\n</ul>\n<ul>\n<li>Establish metrics and KPIs that measure incident response effectiveness and drive continuous improvement</li>\n</ul>\n<p>Enable Data-Driven Decision Making</p>\n<ul>\n<li>Identify, define, and automate the tracking of operational KPIs and departmental metrics that matter, enabling senior managers to make informed decisions on the basis of data</li>\n</ul>\n<ul>\n<li>Build and maintain metric dashboards and automated reporting systems that provide real-time visibility into operational health</li>\n</ul>\n<ul>\n<li>Analyze trends and surface opportunities for optimization</li>\n</ul>\n<p>Stakeholder Engagement, Training &amp; Mentorship</p>\n<ul>\n<li>Build and maintain strong relationships with Engineering managers, Product Managers, and cross-functional stakeholders across geographies</li>\n</ul>\n<ul>\n<li>Maintain a feedback loop. 
Meet with stakeholders to understand process pain points.</li>\n</ul>\n<ul>\n<li>Influence others by fostering trust, leading by example, and inspiring them with your expertise and passion for reliability practices.</li>\n</ul>\n<ul>\n<li>Enhance internal knowledge of third-party tools such as PagerDuty, Datadog, and more, by educating ZoomInfo employees on these tools.</li>\n</ul>\n<p>Deliver training sessions that make Operational Excellence engaging and motivating for diverse audiences.</p>\n<p>Required Experience &amp; Qualifications:</p>\n<ul>\n<li>Bachelor’s degree in Software Engineering, Operations Management, or related field</li>\n</ul>\n<ul>\n<li>7+ years of hands-on experience in technical operations, Site Reliability Engineering (SRE), Incident Management, or IT Service Management roles within SaaS or technical organizations</li>\n</ul>\n<ul>\n<li>Fluent English proficiency (written and verbal)</li>\n</ul>\n<ul>\n<li>Proven track record designing and implementing operational processes at scale</li>\n</ul>\n<ul>\n<li>Demonstrated expertise in SRE principles, practices, and tooling</li>\n</ul>\n<ul>\n<li>Strong data analysis skills with ability to define metrics, build or design dashboards, and use data to drive strategic decisions</li>\n</ul>\n<ul>\n<li>Proven ability to work effectively in a matrix organizational structure</li>\n</ul>\n<ul>\n<li>Ability and experience working with senior management at global organizations</li>\n</ul>\n<ul>\n<li>Hands-on experience with monitoring and observability tools such as PagerDuty and/or Datadog</li>\n</ul>\n<ul>\n<li>Familiarity with Jira, Confluence, Google Data Studio, or Tableau</li>\n</ul>\n<ul>\n<li>Experience with scripting and integrations (Python, JavaScript, Google AppScript, or similar)</li>\n</ul>\n<ul>\n<li>Background in SRE transformation or organizational process improvement initiatives</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping 
automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_da7679a6-e4f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"ZoomInfo","sameAs":"https://www.zoominfo.com/","logo":"https://logos.yubhub.co/zoominfo.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/zoominfo/jobs/8451386002","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Site Reliability Engineering (SRE)","Technical Operations","Incident Management","IT Service Management","Monitoring and Observability Tools","Jira","Confluence","Google Data Studio","Tableau","Scripting and Integrations","Python","JavaScript","Google AppScript"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:45:47.393Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Ra'anana, Israel"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering (SRE), Technical Operations, Incident Management, IT Service Management, Monitoring and Observability Tools, Jira, Confluence, Google Data Studio, Tableau, Scripting and Integrations, Python, JavaScript, Google AppScript"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_1a4d732c-42c"},"title":"Principal Site Reliability Engineer - Observability","description":"<p>We&#39;re looking for a Principal Site Reliability Engineer to join the Observability Solution team. As a key member of the team, you will collaborate with product management, product design, customers, and multiple teams across Elastic to define and evolve end-to-end InfraObs experiences. 
You will deliver and continually evolve these experiences leveraging the Elastic Platform capabilities and coding agents.</p>\n<p>Key responsibilities include being a contact point for other teams within Elastic, fostering a culture of mutual respect, collaboration, and consensus-based decision-making, and being an awesome person to work with.</p>\n<p>To be successful in this role, you will need to have an SRE background and experience operating large-scale production services with the help of Observability tools. You should be proficient in operating production infrastructure in Kubernetes (K8s) and at least one of the three major cloud service providers (CSPs), as well as using Observability tools. You will also need to be able to use AI coding agents in the delivery workflow and have excellent verbal and written communication skills.</p>\n<p>Bonus points will be given to those with experience as a user of the Elastic Stack.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_1a4d732c-42c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Elastic, the Search AI Company","sameAs":"https://www.elastic.co/","logo":"https://logos.yubhub.co/elastic.co.png"},"x-apply-url":"https://job-boards.greenhouse.io/elastic/jobs/7721575","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Site Reliability Engineering","Observability tools","Kubernetes","Cloud Service Providers","AI coding agents"],"x-skills-preferred":["Elastic Stack","Product management","Product design","Collaboration","Communication"],"datePosted":"2026-04-18T15:44:28.865Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Spain"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, 
Observability tools, Kubernetes, Cloud Service Providers, AI coding agents, Elastic Stack, Product management, Product design, Collaboration, Communication"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_982dd81e-416"},"title":"Principal Database Engineer, Data Engineering","description":"<p>As a Principal Database Engineer, you&#39;ll design and lead the evolution of the PostgreSQL backbone that powers GitLab.com and thousands of self-managed enterprise deployments. You&#39;ll solve critical challenges around uncontrolled data growth, complex upgrades and migrations, and always-on reliability at global scale, creating the database patterns and platforms that keep GitLab fast, resilient, and cost efficient as usage grows.</p>\n<p>You&#39;ll architect scalable, distributed database solutions, build proactive health and reliability frameworks, and drive adoption of modern database technologies and data stores that improve both product capabilities and production stability. 
Working hands-on in the codebase and partnering closely with product and infrastructure teams, you&#39;ll turn long-term database strategy into incremental, customer-visible improvements, shift incident response from reactive to proactive, and help define GitLab&#39;s next-generation data architecture, including sharding and multi-database support.</p>\n<p>Key Responsibilities:</p>\n<ul>\n<li>Lead the architecture and strategy for GitLab.com&#39;s PostgreSQL infrastructure, designing scalable, resilient solutions for both SaaS and self-managed deployments.</li>\n</ul>\n<ul>\n<li>Build proactive database health and reliability frameworks using continuous monitoring, automated remediation, and predictive analytics to prevent customer-impacting incidents.</li>\n</ul>\n<ul>\n<li>Drive database best practices across engineering by guiding schema design, migrations, and query optimization, and by creating self-service tools and guardrails for product teams.</li>\n</ul>\n<ul>\n<li>Own end-to-end observability for database systems, designing symptom-based monitoring, leading incident response, and turning learnings into automated, repeatable workflows.</li>\n</ul>\n<ul>\n<li>Shape the evolution of GitLab’s database platform by evaluating and implementing modern database technologies and data stores that improve reliability, performance, and product capabilities.</li>\n</ul>\n<ul>\n<li>Design solutions and patterns that address uncontrolled data growth, cost efficiency, sharding, multi-database support, and other next-generation data architecture needs.</li>\n</ul>\n<ul>\n<li>Collaborate closely with product and infrastructure teams to align product decisions with platform constraints and priorities, breaking down long-term goals into incremental, customer-visible outcomes.</li>\n</ul>\n<ul>\n<li>Contribute directly to the codebase to prototype and ship working solutions, maintain technical credibility, and deep-dive into complex production issues when 
needed.</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>Experience architecting, operating, and optimizing PostgreSQL in large-scale, distributed production environments with high availability and disaster recovery requirements.</li>\n</ul>\n<ul>\n<li>Deep knowledge of PostgreSQL internals, including the query planner, write-ahead logging, vacuum processes, and storage engine behavior.</li>\n</ul>\n<ul>\n<li>Background designing and maintaining highly distributed database platforms with automated failover, robust monitoring, and self-healing capabilities.</li>\n</ul>\n<ul>\n<li>Hands-on coding skills and comfort working across the stack, from low-level database and search systems to backend and frontend services.</li>\n</ul>\n<ul>\n<li>Familiarity with infrastructure-as-code, GitOps practices, security hardening, and site reliability engineering principles applied to database operations.</li>\n</ul>\n<ul>\n<li>Ability to debug complex, cross-system issues, translate findings into durable technical solutions, and turn incident learnings into repeatable automation.</li>\n</ul>\n<ul>\n<li>Experience influencing technical direction across multiple teams, providing practical guidance on migrations, query optimization, and database best practices.</li>\n</ul>\n<ul>\n<li>Openness to collaborating with people from diverse technical backgrounds, with a focus on clear communication, shared ownership, and learning transferable skills.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_982dd81e-416","directApply":true,"hiringOrganization":{"@type":"Organization","name":"GitLab","sameAs":"https://about.gitlab.com/","logo":"https://logos.yubhub.co/about.gitlab.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/gitlab/jobs/8231379002","x-work-arrangement":"remote","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$157,900-$338,400 USD","x-skills-required":["PostgreSQL","database architecture","data engineering","infrastructure-as-code","GitOps","security hardening","site reliability engineering","database operations","query optimization","schema design","migrations","query planning","write-ahead logging","vacuum processes","storage engine behavior"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:44:15.402Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote, EMEA; Remote, North America"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"PostgreSQL, database architecture, data engineering, infrastructure-as-code, GitOps, security hardening, site reliability engineering, database operations, query optimization, schema design, migrations, query planning, write-ahead logging, vacuum processes, storage engine behavior","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":157900,"maxValue":338400,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_777a6e79-5d9"},"title":"Senior Software Engineer, Security Engineering","description":"<p>Secure Every Identity ----------------------- Okta secures AI by building the trusted, neutral infrastructure that enables organisations to safely embrace this new era.</p>\n<p>We are looking for builders and owners who operate with speed and urgency and 
execute with excellence. This is an opportunity to do career-defining work.</p>\n<p><strong>The Role</strong></p>\n<p>We seek a knowledgeable, development-focused Security Engineer who will build microservices to secure Customer Identity Products and Infrastructure.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Work across a globally distributed product-aligned team of security engineers</li>\n<li>Establish a deep understanding of Okta Customer Identity products and infrastructure</li>\n<li>Collaborate when necessary with the Okta Security team on security operations</li>\n<li>Build, deploy &amp; maintain scalable and reliable infrastructure services as well as security solutions for customer identity products</li>\n<li>Build, deploy &amp; maintain automation to improve platform security capabilities at scale including logging, threat detection and compliance benchmarks to increase our security posture</li>\n<li>Help meet our operational security commitments by thinking like an attacker, assessing the risk, and advising on mitigation strategies</li>\n<li>Support security investigations in coordination with the Okta Security team, participate in root cause analysis and perform necessary remediations.</li>\n
<li>Support stakeholders by proposing mitigation strategies for end-of-life software and security vulnerability and patch management</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>You have 3+ years of hands-on development experience writing microservices with Golang</li>\n<li>You have 3+ years of experience in cloud infrastructure security, product security</li>\n<li>You have working knowledge and hands-on development experience with one or more of the following: AWS and/or Azure security, Kubernetes</li>\n<li>You have strong knowledge of the OWASP Top 10 and secure coding best practices</li>\n<li>You have a strong foundation in secure software development lifecycle best practices</li>\n<li>You have strong written and verbal communication skills</li>\n<li>You have experience working with a globally distributed and remote team.</li>\n</ul>\n<p>Bonus points if you have working knowledge and experience with one or more of the following:</p>\n<ul>\n<li>Full-stack engineering</li>\n<li>Site reliability engineering</li>\n<li>Identity and access management</li>\n<li>Vulnerability and threat management</li>\n<li>Security detection and response</li>\n<li>Governance, risk and compliance</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_777a6e79-5d9","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Okta","sameAs":"https://www.okta.com","logo":"https://logos.yubhub.co/okta.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/okta/jobs/7744352","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Golang","Cloud infrastructure security","Product security","AWS security","Azure security","Kubernetes","OWASP Top 10","Secure coding best practices","Secure software development lifecycle best practices"],"x-skills-preferred":["Full-stack engineering","Site reliability engineering","Identity and access management","Vulnerability and threat management","Security detection and response","Governance, risk and 
compliance"],"datePosted":"2026-04-18T15:44:00.927Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Bengaluru, India"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Golang, Cloud infrastructure security, Product security, AWS security, Azure security, Kubernetes, OWASP Top 10, Secure coding best practices, Secure software development lifecycle best practices, Full-stack engineering, Site reliability engineering, Identity and access management, Vulnerability and threat management, Security detection and response, Governance, risk and compliance"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7f43bb14-3c4"},"title":"Senior Cloud Engineer","description":"<p>Shield AI is seeking a Senior Cloud Engineer to support its leadership in applied artificial intelligence development. In this role, you will be responsible for engineering, deploying, provisioning, and managing critical cloud systems that drive innovation across Shield AI&#39;s public and private cloud environments, both domestically and internationally.</p>\n<p>As part of the Cloud and Infrastructure team within Enterprise Operations, you will play a key role in ensuring the performance, scalability, and reliability of these systems to support various business units. 
This position may involve occasional travel to Shield AI locations.</p>\n<p><strong>Responsibilities:</strong></p>\n<p><strong>Engineering:</strong></p>\n<ul>\n<li>Manage and optimize multi-cloud infrastructure (Azure, AWS) for performance, reliability, and scalability.</li>\n<li>Support and optimize cloud and virtual machine environments, assisting with capacity planning, performance monitoring, security compliance, and vulnerability remediation.</li>\n<li>Assist in implementing and maintaining infrastructure systems, including servers, storage, backup solutions, and disaster recovery processes, for both public and private clouds.</li>\n<li>Continuously learn and adapt to emerging technologies and platforms, leveraging automation wherever possible.</li>\n<li>Author and produce the necessary documentation for engineered and maintained systems along with associated processes that supporting teams can leverage.</li>\n<li>Assist in researching, recommending, and developing innovative solutions for complex requirements and issue resolution.</li>\n<li>Collaborate cross-functionally with AI, DevOps, and Security teams to ensure compliance, observability, and resilience in mission-critical environments.</li>\n<li>Participate in Agile methodologies and sound engineering principles.</li>\n</ul>\n<p><strong>Operations and Support:</strong></p>\n<ul>\n<li>Perform daily system monitoring, verifying the integrity and availability of all server resources, systems and key processes, reviewing system and application logs.</li>\n<li>Support system maintenance and upgrades, including OS patching, software configuration, hardware updates, and performance tuning to ensure optimal cloud infrastructure performance.</li>\n<li>Provide escalated support for operational issues possibly during and after normal business hours for systems, workloads, and Kubernetes AI infrastructure.</li>\n<li>Analyze, troubleshoot and resolve system infrastructure and software issues.</li>\n<li>Ability to 
participate in on-call, emergency, or maintenance roles</li>\n</ul>\n<p><strong>Requirements:</strong></p>\n<ul>\n<li>Bachelor’s degree in Computer Science or related field, or equivalent experience (4+ years) plus an engineer level certification, Azure/AWS Associate, or another similar level certification.</li>\n<li>4 years’ experience supporting applications and systems in a production environment in high-availability, mission-critical, or defense-grade environments preferred.</li>\n<li>Comfortable with operational efficiencies utilizing Infrastructure as Code (IaC) solutions (e.g., Terraform, Ansible).</li>\n<li>Strong understanding of networking concepts (VPCs, VPNs, subnets, routing, firewalls).</li>\n<li>Experience in automating repetitive tasks using scripting languages such as PowerShell, Python, or Bash.</li>\n<li>Experience with deployment and systems administration of at least one type of Linux distribution (i.e. RHEL, Ubuntu)</li>\n<li>Experience with concepts of Microsoft Windows Server administration, Azure and Active Directory environments</li>\n<li>Possesses organizational skills, with a process-oriented mindset, attention to detail, and effective verbal and written communication abilities.</li>\n<li>Ability to work independently to accomplish assigned tasks.</li>\n<li>Solution-oriented, constructive approach to problem-solving.</li>\n</ul>\n<p><strong>Preferred Qualifications:</strong></p>\n<ul>\n<li>Experience deploying and maintaining workloads in Azure public cloud environments.</li>\n<li>Hands-on experience with containerization and Kubernetes-based workloads.</li>\n<li>Strong understanding of virtualization and private cloud platforms (e.g., VMware, Hyper-V, KVM).</li>\n<li>Background in DevOps, Site Reliability Engineering (SRE), or cloud infrastructure roles.</li>\n<li>Proficiency with configuration management and automation tools (e.g., Ansible, Chef, Puppet, Terraform).</li>\n<li>Experience building and optimizing CI/CD 
pipelines.</li>\n</ul>\n<p><strong>Salary and Benefits:</strong></p>\n<ul>\n<li>$110,000 - $170,000 a year</li>\n<li>Full-time regular employee offer package: Pay within range listed + Bonus + Benefits + Equity</li>\n<li>Temporary employee offer package: Pay within range listed above + temporary benefits package (applicable after 60 days of employment)</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7f43bb14-3c4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Shield AI","sameAs":"https://www.shield.ai","logo":"https://logos.yubhub.co/shield.ai.png"},"x-apply-url":"https://jobs.lever.co/shieldai/702e2609-db48-49ab-8bec-d405c956a6ce","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$110,000 - $170,000 a year","x-skills-required":["Cloud Engineering","Multi-cloud infrastructure","Azure","AWS","Networking concepts","Infrastructure as Code","Scripting languages","Linux distribution","Microsoft Windows Server administration","Active Directory environments"],"x-skills-preferred":["Containerization","Kubernetes-based workloads","Virtualization","Private cloud platforms","DevOps","Site Reliability Engineering","Configuration management","Automation tools","CI/CD pipelines"],"datePosted":"2026-04-17T13:01:14.253Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Diego, California / Dallas, Texas / San Francisco, California"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Cloud Engineering, Multi-cloud infrastructure, Azure, AWS, Networking concepts, Infrastructure as Code, Scripting languages, Linux distribution, Microsoft Windows Server administration, Active Directory environments, Containerization, Kubernetes-based workloads, Virtualization, Private cloud platforms, 
DevOps, Site Reliability Engineering, Configuration management, Automation tools, CI/CD pipelines","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":110000,"maxValue":170000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_e308ff1b-d8b"},"title":"Software Engineer, DevOps, Research Platform","description":"<p>About Mistral AI\\n\\nAt Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.\\n\\nWe are a team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation.\\n\\nRole Summary\\n\\nWe are seeking a talented and experienced software engineer to join our Research Platform team. You&#39;ll work closely with our R&amp;D team to build a cloud agnostic platform that improves the stability, scalability and velocity across the research department.\\n\\nResponsibilities\\n\\nAs a DevOps/Platform Engineer, your responsibilities will include:\\n\\n* Designing and implementing complex systems (e.g. 
scale our research CI with a strong focus toward reliability, reproducibility and speed)\\n\\n* Building flexible yet solid and accessible development environment for researchers, so they can focus on core mission.\\n\\n* Designing, implementing and advocating for solutions addressing large amounts of data and maintainable data pipelines.\\n\\n* Optimizing a variety of builds: container images, large libraries compilation times, python environments...\\n\\n* Building strong relationships with researchers, understanding their workflow and enabling them to achieve more by leveraging your expertise.\\n\\n* Communicating and producing documentation or any content that will help them to make the most out of the tools and systems you&#39;ll build.\\n\\n* Being part of the team that &quot;platformizes&quot; research and constantly improve the daily experience for researchers while avoiding future roadblocks.\\n\\nAbout You\\n\\n* 5+ years of successful experience in a similar DX / DevOps / SRE role.\\n\\n* Proficiency in software development (Python, Go...) and programming best practices.\\n\\n* Exposure to site reliability engineering: root cause analysis, in-production troubleshooting, on-call rotations...\\n\\n* Exposure to infrastructure management: CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability...\\n\\n* Technical product mindset (e.g. 
understanding how to debug poor adoption).\\n\\n* Excellent problem-solving and communication skills (ability to contextualize, gauge risks and get buy-in for high-stakes and impactful solutions).\\n\\n* Ownership, high agency and constantly seeking to learn and improve things for others.\\n\\n* Autonomous, self-driven and able to work well in a fast-paced startup environment.\\n\\n* Low ego and team spirit mindset.\\n\\nYour Application Will Be All The More Interesting If You Also Have:\\n\\n* First-hand Bazel (or equivalent) experience.\\n\\n* Strong knowledge of Python&#39;s ecosystem.\\n\\n* Familiarity with GPU-based workloads and ecosystems.\\n\\n* Experience with fully remote environments (you&#39;re comfortable with having some of your users on the other side of the globe).\\n\\nHiring Process\\n\\n* Intro Call - 30 min\\n\\n* Tech Culture Interview - 30 min\\n\\n* Technical Rounds - 2 x 45 min\\n\\n* Culture-fit Discussion - 30 min\\n\\n* Reference Calls\\n\\nBy Applying, You Agree To Our Applicant Privacy Policy.\\n\\nAdditional Information\\n\\nLocation &amp; Remote\\n\\nThis role is primarily based at one of our European offices (Paris, France and London, UK). We will prioritize candidates who either reside there or are open to relocating. We strongly believe in the value of in-person collaboration to foster strong relationships and seamless communication within our team. In certain specific situations, we will also consider remote candidates based in one of the countries listed in this job posting, currently France &amp; UK. 
In that case, we ask all new hires to visit our local office:\\n\\n* for the first week of their onboarding (accommodation and travelling covered)\\n\\n* then at least 3 days per month\\n\\nWhat We Offer\\n\\n* Competitive salary and equity\\n\\n* Health insurance\\n\\n* Transportation allowance\\n\\n* Sport allowance\\n\\n* Meal vouchers\\n\\n* Private pension plan\\n\\n* Parental: Generous parental leave policy\\n\\n* Visa sponsorship\\n\\nBy Applying, You Agree To Our Applicant Privacy Policy.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_e308ff1b-d8b","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/18be2b70-c05d-48e4-82ac-e5cb462c96c0","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["software development","python","go","site reliability engineering","infrastructure management","CI/CD","containerization","orchestration","infra-as-code","monitoring","logging","alerting","observability"],"x-skills-preferred":["bazel","python's ecosystem","gpu based workloads","full remote environments"],"datePosted":"2026-04-17T12:48:20.869Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software development, python, go, site reliability engineering, infrastructure management, CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability, bazel, python's ecosystem, gpu based workloads, full remote 
environments"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_e4891ce8-465"},"title":"Software Engineer, DevEx","description":"<p>We are seeking an experienced Software Engineer, Developer Experience to own and foster a collaborative, automated, and efficient software development lifecycle. In this role, you will collaborate closely with product engineering teams to ensure consistent code health, accelerate development velocity through well-maintained CI pipelines, faster builds, and secure release processes.</p>\n<p>Your mission is to empower our software engineering team with seamless workflows while securing our production environments.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build, monitor, and enhance CI/CD pipelines to streamline development workflows and accelerate deployments.</li>\n<li>Design, operate, and maintain scalable, reliable, and secure multi-cloud infrastructures.</li>\n<li>Identify areas for improvement and create innovative solutions that enable high developer velocity.</li>\n</ul>\n<p>Team Collaboration &amp; Advocacy:</p>\n<ul>\n<li>Standardize DevOps practices to ensure consistency across all engineering teams.</li>\n<li>Establish measurable KPIs for security performance, reliability, and compliance adherence.</li>\n<li>Partner with development and operations teams to embed security into daily workflows.</li>\n<li>Lead training initiatives to upskill teams on secure coding, threat modeling, and incident response.</li>\n<li>Champion a security-first mindset, driving cultural adoption of DevSecOps principles across the organization.</li>\n</ul>\n<p>About you:</p>\n<ul>\n<li>5+ years of successful experience in a similar role (DevOps, Developer Experience, Platform Engineer, Internal tooling engineer, SRE...).</li>\n<li>Strong proficiency in scripting languages (Go, Python...) 
and software development best practices.</li>\n<li>Developer experience engineering: developer workflow optimization, tooling, and automation for productivity, real-time developer support, and escalation paths.</li>\n<li>Site Reliability Engineering: CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability...</li>\n<li>Exposure to multi-cloud infrastructures (AWS / GCP / Azure or On-Prem).</li>\n<li>Security Tools &amp; Approaches: OWASP, SAST, DAST, SCA, vulnerability scanners.</li>\n<li>Proven problem-solving and communication skills: the ability to contextualize, gauge risks, and get buy-in for high-stakes and impactful solutions.</li>\n<li>Ownership, high agency, and desire to improve things for others.</li>\n<li>Autonomy, self-drive, and ability to work well in a fast-paced startup environment.</li>\n<li>Low ego and team spirit mindset.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_e4891ce8-465","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/","logo":"https://logos.yubhub.co/mistral.ai.png"},"x-apply-url":"https://jobs.lever.co/mistral/c9e16eb0-0cb9-423d-8495-a96d10782622","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["scripting languages (Go, Python...)","software development best practices","developer experience engineering","site reliability engineering","multi-cloud infrastructures","security tools & approaches"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:48:10.748Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"scripting languages (Go, Python...), software development best practices, 
developer experience engineering, site reliability engineering, multi-cloud infrastructures, security tools & approaches"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_26bff84c-def"},"title":"Senior/Staff Platform Engineer/SRE","description":"<p>About the Role\nWe are seeking a Senior Platform Engineer who will design, develop, and deploy robust platform solutions to ensure the reliability, scalability, and security of our system.</p>\n<p>Responsibilities</p>\n<ul>\n<li>Identify and build AI-powered capabilities into Flow&#39;s platform, from intelligent automation in building operations to personalized resident experiences.</li>\n<li>Use AI-assisted development tools (e.g., Cursor, Claude Code) as part of your daily workflow to accelerate development, improve code quality, and push the boundaries of what a small team can ship.</li>\n<li>Collaborate with product and engineering teams to define clear requirements and translate them into software solutions.</li>\n<li>Core contributor to implementing foundational infrastructure, tooling and automation that is scalable, reliable, and secure.</li>\n<li>Elevate site reliability engineering best practices while collaborating with back-end developers.</li>\n<li>Develop service-level tooling to enhance productionization, data migrations, system hardening, and related initiatives.</li>\n<li>Manage and optimize a multi-region environment.</li>\n<li>Be available for on-call activities for infrastructure and services.</li>\n</ul>\n<p>Ideal Background</p>\n<ul>\n<li>A minimum 10 years in software engineering, site reliability engineering, or platform engineering.</li>\n<li>Fluency with AI-assisted development tools and a strong point of view on how AI changes the way software gets built.</li>\n<li>Ability to design, implement and maintain the tools and systems that support service reliability, monitoring, and alerting.</li>\n<li>Deep understanding of the 
principles of ensuring high availability, fault tolerance, and efficiency in distributed systems.</li>\n<li>Experience with Infrastructure as Code (IaC): Proficiency with Terraform.</li>\n<li>Experience with Kubernetes.</li>\n<li>Experience administering cloud-based infrastructure (GCP preferred).</li>\n<li>Experience troubleshooting production issues related to cloud infrastructure, configuration, monitoring, deployments, continuous integration and delivery.</li>\n<li>A keen ability to balance elegant design with pragmatic tradeoffs, prioritizing continuous delivery of business value.</li>\n<li>Ability to quickly learn and adapt to new skillsets.</li>\n<li>Experience building software in fast-moving startup environments.</li>\n<li>Participate in incident response and post-mortems to identify and address systemic issues.</li>\n</ul>\n<p>Additional Information\nBenefits</p>\n<ul>\n<li>Comprehensive Benefits Package (Medical / Dental / Vision / Disability / Life)</li>\n<li>Paid time off and 13 paid holidays</li>\n<li>401(k) retirement plan</li>\n<li>Healthcare and Dependent Care Flexible Spending Accounts (FSAs)</li>\n<li>Access to HSA-compatible plans</li>\n<li>Pre-tax commuter benefits</li>\n<li>Employee Assistance Program (EAP), free therapy through SpringHealth, acupuncture, and other wellness offerings</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_26bff84c-def","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Flow","sameAs":"https://flow.com","logo":"https://logos.yubhub.co/flow.com.png"},"x-apply-url":"https://jobs.lever.co/flowlife/3ae47b09-e4b4-41be-9312-fafb1d85cf4d","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$180,000-275,000 per year","x-skills-required":["AI-assisted development tools","Terraform","Kubernetes","Cloud-based infrastructure 
administration","Site reliability engineering","Monitoring and alerting","Service-level tooling","Multi-region environment management"],"x-skills-preferred":[],"datePosted":"2026-04-17T12:34:32.862Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Palo Alto"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"AI-assisted development tools, Terraform, Kubernetes, Cloud-based infrastructure administration, Site reliability engineering, Monitoring and alerting, Service-level tooling, Multi-region environment management","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":180000,"maxValue":275000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_871d4845-25a"},"title":"Software Engineer, DevOps, Research Platform","description":"<p>We are seeking a talented and experienced software engineer to join our Research Platform team. You&#39;ll work closely with our R&amp;D team to build a cloud agnostic platform that improves the stability, scalability and velocity across the research department.</p>\n<p>As a DevOps/Platform Engineer, your responsibilities will include designing and implementing complex systems, building flexible yet solid and accessible development environment for researchers, designing, implementing and advocating for solutions addressing large amounts of data and maintainable data pipelines, optimizing a variety of builds, building strong relationships with researchers, communicating and producing documentation or any content that will help them to make the most out of the tools and systems you&#39;ll build.</p>\n<p>About you:</p>\n<ul>\n<li>5+ years of successful experience in a similar DX / DevOps / SRE role.</li>\n<li>Proficiency in software development (Python, Go...) 
and programming best practices.</li>\n<li>Exposure to site reliability engineering: root cause analysis, in-production troubleshooting, on-call rotations...</li>\n<li>Exposure to infrastructure management: CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability...</li>\n<li>Technical product mindset (e.g. understanding how to debug poor adoption).</li>\n<li>Excellent problem-solving and communication skills (ability to contextualize, gauge risks and get buy-in for high-stakes, impactful solutions).</li>\n<li>Ownership, high agency and a constant drive to learn and improve things for others.</li>\n<li>Autonomous, self-driven and able to work well in a fast-paced startup environment.</li>\n<li>Low ego and team spirit mindset.</li>\n</ul>\n<p>Your application will be all the more interesting if you also have:</p>\n<ul>\n<li>First-hand Bazel (or equivalent) experience.</li>\n<li>Strong knowledge of Python&#39;s ecosystem.</li>\n<li>Familiarity with GPU based workloads and ecosystems.</li>\n<li>Experience of full remote environments (you&#39;re comfortable with having some of your users on the other side of the globe).</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_871d4845-25a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai/careers"},"x-apply-url":"https://jobs.lever.co/mistral/18be2b70-c05d-48e4-82ac-e5cb462c96c0","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["software development","Python","Go","site reliability engineering","infrastructure management","CI/CD","containerization","orchestration","infra-as-code","monitoring","logging","alerting","observability"],"x-skills-preferred":["Bazel","Python's ecosystem","GPU based workloads 
and ecosystems","full remote environments"],"datePosted":"2026-03-10T11:31:49.456Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software development, Python, Go, site reliability engineering, infrastructure management, CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability, Bazel, Python's ecosystem, GPU based workloads and ecosystems, full remote environments"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_92a78695-a57"},"title":"Software Engineer, DevEx","description":"<p>We are seeking an experienced Software Engineer, Developer Experience to own and foster a collaborative, automated, and efficient software development lifecycle. In this role, you will collaborate closely with product engineering teams to ensure consistent code health, accelerate development velocity through well-maintained CI pipelines, faster builds, and secure release processes.</p>\n<p>Your mission is to empower our software engineering team with seamless workflows while securing our production environments.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Build, monitor, and enhance CI/CD pipelines to streamline development workflows and accelerate deployments.</li>\n<li>Design, operate and maintain scalable, reliable and secure multi-cloud infrastructures</li>\n<li>Identify areas for improvement and create innovative solutions that enable high developer velocity</li>\n</ul>\n<p>Team Collaboration &amp; Advocacy:</p>\n<ul>\n<li>Standardize DevOps practices to ensure consistency across all engineering teams.</li>\n<li>Establish measurable KPIs for security performance, reliability, and compliance adherence.</li>\n<li>Partner with development and operations teams to embed security into daily 
workflows.</li>\n<li>Lead training initiatives to upskill teams on secure coding, threat modeling, and incident response.</li>\n<li>Champion a security-first mindset, driving cultural adoption of DevSecOps principles across the organization.</li>\n</ul>\n<p>About you:</p>\n<ul>\n<li>5+ years of successful experience in a similar role (DevOps, Developer Experience, Platform Engineer, Internal tooling engineer, SRE...)</li>\n<li>Strong proficiency in scripting languages (Go, Python...) and software development best practices.</li>\n<li>Developer experience engineering: developer workflow optimization, tooling and automation for productivity, real-time developer support and escalation paths</li>\n<li>Site Reliability Engineering: CI/CD, containerization, orchestration, infra-as-code, monitoring, logging, alerting, observability...</li>\n<li>Exposure to multi-cloud infrastructures (AWS / GCP / Azure or On-Prem)</li>\n<li>Security Tools &amp; Approaches: OWASP, SAST, DAST, SCA, vulnerability scanners</li>\n</ul>\n<p>Proven problem-solving and communication skills: ability to contextualize, gauge risks and get buy-in for high-stakes, impactful solutions.</p>\n<p>Ownership, high agency and desire to improve things for others.</p>\n<p>Autonomy, self-drive and ability to work well in a fast-paced startup environment.</p>\n<p>Low ego and team spirit mindset.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_92a78695-a57","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/c9e16eb0-0cb9-423d-8495-a96d10782622","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["scripting languages (Go, Python...)","software development best practices","developer experience 
engineering","site reliability engineering","multi-cloud infrastructures (AWS / GCP / Azure or On-Prem)","security tools & approaches (OWASP, SAST, DAST, SCA, vulnerability scanners)"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:31:30.226Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"scripting languages (Go, Python...), software development best practices, developer experience engineering, site reliability engineering, multi-cloud infrastructures (AWS / GCP / Azure or On-Prem), security tools & approaches (OWASP, SAST, DAST, SCA, vulnerability scanners)"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_eafe9949-c5e"},"title":"Cybersecurity Engineer, SIEM","description":"<p><strong>About Mistral AI</strong></p>\n<p>At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.</p>\n<p>We are a global company with teams distributed between France, USA, UK, Germany and Singapore. Our comprehensive AI platform meets enterprise needs, whether on-premises or in cloud environments.</p>\n<p><strong>Role Summary</strong></p>\n<p>Mistral is looking for a Security Platform Engineer to architect and maintain the infrastructure ensuring the observability of our production systems. 
You will treat the SIEM and logging infrastructure as a high-performance data product.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Own the set-up, lifecycle, availability, and performance of the SIEM solution, ensuring 99.9% uptime for log ingestion and query availability.</li>\n<li>Design and maintain high-throughput data pipelines to collect, buffer, and transport logs from distributed systems to the SIEM.</li>\n<li>Implement parsing logic and schema standardization to ensure unstructured logs are searchable and actionable for analysts.</li>\n<li>Manage alert rules, connectors, and dashboard configurations, avoiding manual console configuration (&quot;ClickOps&quot;).</li>\n<li>Analyze ingestion patterns to identify noisy, low-value data. Implement filtering and aggregation at the source to maximize signal-to-noise ratio.</li>\n<li>Architect data tiers to balance query performance with compliance retention requirements and cloud costs.</li>\n</ul>\n<p><strong>About You</strong></p>\n<ul>\n<li>5+ years of experience in Site Reliability Engineering (SRE), Data Engineering, or Security Engineering with a focus on logging infrastructure.</li>\n<li>Deep understanding of log management challenges at scale (indexing strategies, sharding, partitioning, throughput tuning).</li>\n<li>Strong experience deploying and monitoring stateful workloads on Kubernetes and Cloud providers (Azure/GCP) and On-Prem.</li>\n<li>Ability to write production-grade Python or Go for automation and custom log exporters.</li>\n<li>Experience managing monitoring, alerting, and on-call rotations for critical infrastructure.</li>\n</ul>\n<p><strong>Hiring Process</strong></p>\n<ul>\n<li>Introduction call - 30 min</li>\n<li>Hiring Manager interview - 30 min</li>\n<li>Technical Rounds I - 45 min</li>\n<li>Technical Rounds II - 60 min</li>\n<li>Culture-fit discussion - 30 min</li>\n<li>References</li>\n</ul>\n<p><strong>Additional Information</strong></p>\n<p><strong>Location &amp; Remote</strong></p>\n<p>The position is based in our Paris HQ offices and we encourage going to the office as much as we can (at least 3 days per week) to 
create bonds and smooth communication. Our remote policy aims to provide flexibility, improve work-life balance and increase productivity. Each manager can decide the amount of days worked remotely based on autonomy and a specific context (e.g. more flexibility can occur during summer). In any case, employees are expected to maintain regular communication with their teams and be available during core working hours.</p>\n<p><strong>What We Offer</strong></p>\n<ul>\n<li>Competitive salary and equity package</li>\n<li>Health insurance</li>\n<li>Transportation allowance</li>\n<li>Sport allowance</li>\n<li>Meal vouchers</li>\n<li>Private pension plan</li>\n<li>Generous parental leave policy</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_eafe9949-c5e","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Mistral AI","sameAs":"https://mistral.ai"},"x-apply-url":"https://jobs.lever.co/mistral/6f7f6e7a-3dc4-430b-8957-a64450a10066","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Site Reliability Engineering","Data Engineering","Security Engineering","Logging infrastructure","Kubernetes","Cloud providers","Python","Go","Monitoring","Alerting","On-call rotations"],"x-skills-preferred":[],"datePosted":"2026-03-10T11:24:38.630Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Paris"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, Data Engineering, Security Engineering, Logging infrastructure, Kubernetes, Cloud providers, Python, Go, Monitoring, Alerting, On-call rotations"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_eebf21c4-d1f"},"title":"Staff Site Reliability Engineer","description":"<p>Join our Site 
Reliability Engineering (SRE) team and help ensure the reliability, scalability, and performance of Replit&#39;s infrastructure that serves millions of developers worldwide.</p>\n<p>As a Staff Site Reliability Engineer, you will bridge the gap between development and operations, implementing automation and establishing best practices that enable our platform to scale efficiently while maintaining high availability.</p>\n<p>We are seeking Staff SREs who are passionate about building and maintaining resilient systems at scale. Your mission will be to proactively find and analyze reliability problems across our stack, then design and implement software and systems to create step-function improvements.</p>\n<p>You will design robust observability solutions, lead incident response, automate operational tasks, and continuously improve our infrastructure&#39;s reliability, all while mentoring and educating the broader engineering team to make reliability a core value at Replit.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Architect and Implement Observability: Design, build, and lead the implementation of comprehensive monitoring, logging, and tracing solutions. Create dashboards and metrics that provide real-time visibility into system health and performance, enabling proactive issue detection.</li>\n</ul>\n<ul>\n<li>Define and Drive Reliability Standards: Work with product and engineering teams to define, implement, and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Build systems to monitor and report on these metrics, holding teams accountable and ensuring we maintain high reliability standards while balancing innovation speed.</li>\n</ul>\n<ul>\n<li>Lead Incident Management and Response: Act as a senior leader during high-impact incidents, guiding the team to rapid resolution. Conduct thorough, blameless post-mortems and drive the implementation of preventative measures. 
Develop and refine runbooks and build automation to reduce Mean Time To Recovery (MTTR).</li>\n</ul>\n<ul>\n<li>Drive Automation and Infrastructure as Code: Architect, build, and improve automation to eliminate toil and operational work. Design and maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. Create self-healing systems that can automatically respond to common failure scenarios.</li>\n</ul>\n<ul>\n<li>Optimize Performance on Kubernetes: Collaborate with core infrastructure and product teams to performance-tune and optimize our large-scale cloud deployments, with a deep focus on Kubernetes, Docker, and GCP. Identify and resolve performance bottlenecks, implement capacity planning strategies, and reduce latency across global regions.</li>\n</ul>\n<ul>\n<li>Debug and Harden Distributed Systems: Dive deep into debugging extremely difficult technical problems across the stack. Use your findings to design and implement long-term fixes that make our systems and products more robust, operable, and easier to diagnose.</li>\n</ul>\n<ul>\n<li>Provide Staff-Level Guidance: Review feature and system designs from across the company, acting as a key owner for the reliability, scalability, security, and operational integrity of those designs.</li>\n</ul>\n<ul>\n<li>Educate and Mentor: Educate, mentor, and hold accountable the broader engineering team to improve the reliability of our systems, making reliability a core value of the Replit engineering culture.</li>\n</ul>\n<ul>\n<li>Build and Integrate: Write high-quality, well-tested code in Python or Go to meet the needs of your customers, whether it&#39;s building new internal tools or integrating with third-party vendors.</li>\n</ul>\n<p><strong>Required Skills and Experience</strong></p>\n<ul>\n<li>8-10 years of experience in Site Reliability Engineering or similar roles (e.g., DevOps, Systems Engineering, Infrastructure Engineering).</li>\n</ul>\n<ul>\n<li>Strong programming 
skills in languages like Python or Go. You write high-quality, well-tested code.</li>\n</ul>\n<ul>\n<li>Deep understanding of distributed systems. You’ve designed, built, scaled, and maintained production services and know how to compose a service-oriented architecture.</li>\n</ul>\n<ul>\n<li>Deep experience with container orchestration platforms, specifically Kubernetes, and cloud-native technologies.</li>\n</ul>\n<ul>\n<li>Proven track record of designing, implementing, and maintaining sophisticated monitoring and observability solutions (e.g., metrics, logging, tracing).</li>\n</ul>\n<ul>\n<li>Strong incident management skills with extensive experience leading incident response for complex systems and demonstrated critical thinking under pressure.</li>\n</ul>\n<ul>\n<li>Experience with infrastructure as code (e.g., Terraform, Pulumi) and configuration management tools.</li>\n</ul>\n<ul>\n<li>Excellent written and verbal communication skills, with an ability to explain complex technical concepts clearly and simply and a bias toward open, transparent cultural practices.</li>\n</ul>\n<ul>\n<li>Strong interpersonal skills, with experience working with and mentoring engineers from junior to principal levels.</li>\n</ul>\n<ul>\n<li>A willingness to dive into understanding, debugging, and improving any layer of the stack.</li>\n</ul>\n<ul>\n<li>You&#39;re passionate about making software creation accessible and empowering the next generation of builders.</li>\n</ul>\n<p><strong>Bonus Points</strong></p>\n<ul>\n<li>Deep experience with Google Cloud Platform (GCP) services and tools.</li>\n</ul>\n<ul>\n<li>Expert-level knowledge of modern observability platforms (e.g., Prometheus, Grafana, Datadog, OpenTelemetry).</li>\n</ul>\n<ul>\n<li>Experience designing and building reliable systems capable of handling high throughput and low latency.</li>\n</ul>\n<ul>\n<li>Significant experience with Go and Terraform.</li>\n</ul>\n<ul>\n<li>Familiarity with working in rapid-growth, 
startup environments.</li>\n</ul>\n<ul>\n<li>Experience writing company-facing blog posts and training materials.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_eebf21c4-d1f","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Replit","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/replit.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/replit/d50ad15b-82d4-452f-b4ea-2a7f5e796170","x-work-arrangement":"remote","x-experience-level":"staff","x-job-type":"Full time","x-salary-range":"$220K - $325K","x-skills-required":["Site Reliability Engineering","DevOps","Systems Engineering","Infrastructure Engineering","Python","Go","Distributed Systems","Container Orchestration","Kubernetes","Cloud-Native Technologies","Monitoring and Observability","Incident Management","Infrastructure as Code","Terraform","Pulumi","Configuration Management"],"x-skills-preferred":["Google Cloud Platform","Prometheus","Grafana","Datadog","OpenTelemetry","Go","Terraform"],"datePosted":"2026-03-08T22:20:23.639Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote (United States)"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, DevOps, Systems Engineering, Infrastructure Engineering, Python, Go, Distributed Systems, Container Orchestration, Kubernetes, Cloud-Native Technologies, Monitoring and Observability, Incident Management, Infrastructure as Code, Terraform, Pulumi, Configuration Management, Google Cloud Platform, Prometheus, Grafana, Datadog, OpenTelemetry, Go, 
Terraform","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":220000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_8c164f95-f8d"},"title":"Senior Infrastructure Engineer","description":"<p>Join our Infrastructure Engineering team and help ensure the reliability, scalability, and performance of Replit&#39;s infrastructure that serves millions of developers worldwide. As a Senior Infrastructure Engineer, you will bridge the gap between development and operations, implementing automation and establishing best practices that enable our platform to scale efficiently while maintaining high availability.</p>\n<p>We are seeking Senior Infrastructure Engineers who are passionate about building and maintaining resilient systems at scale. Your mission will be to proactively find and analyse reliability problems across our stack, then design and implement software and systems to address them. You will build robust monitoring solutions, automate operational tasks, and continuously improve our infrastructure&#39;s reliability.</p>\n<p><strong>You Will:</strong></p>\n<ul>\n<li>Drive Automation and Infrastructure as Code: Build and improve automation to eliminate toil and operational work. Maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. Create self-healing systems that can automatically respond to common failure scenarios.</li>\n<li>Optimise Performance and Infrastructure: Collaborate with core infrastructure and product teams to performance tune and optimise our cloud deployments (Kubernetes, Docker, GCP). 
Identify and resolve performance bottlenecks and implement capacity planning strategies.</li>\n<li>Elevate Developer Experience: Design and implement improvements to our build, test, and deployment systems to make software delivery faster, safer, and more reliable for all engineers.</li>\n<li>Drive Cross-Team Improvements: Partner with service owners across Replit to understand their pain points, and collaborate on implementing build/test/deploy enhancements within their specific services.</li>\n<li>Build Shared Tooling: Create and maintain centralized tooling and automation that improves the engineering lifecycle, from local development to production monitoring.</li>\n<li>Debug and Harden Systems: Dive deep into debugging difficult technical problems, making our systems and products more robust, operable, and easier to diagnose.</li>\n<li>Collaborate on Design Reviews: Participate in feature and system design reviews, contributing expertise on security, scale, and operational considerations.</li>\n<li>Build and Integrate: Write high-quality, well-tested code to meet the needs of your customers, including building pipelines to integrate with 3rd party vendors.</li>\n</ul>\n<p><strong>Required Skills and Experience:</strong></p>\n<ul>\n<li>4+ years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering).</li>\n<li>Strong programming skills in languages like Python or Go.</li>\n<li>You write high-quality, well-tested code.</li>\n<li>Solid understanding of distributed systems. 
You&#39;ve built, scaled, and maintained production services and understand service-oriented architecture.</li>\n<li>Experience with container orchestration platforms (Kubernetes) and cloud-native technologies.</li>\n<li>Experience implementing and maintaining monitoring/observability solutions, with strong skills in debugging and performance tuning.</li>\n<li>Strong incident management skills with experience participating in incident response and demonstrated critical thinking under pressure.</li>\n<li>Experience with infrastructure as code (e.g., Terraform) and configuration management tools.</li>\n<li>Excellent written and verbal communication skills, with an ability to explain technical concepts clearly.</li>\n<li>A willingness to dive into understanding, debugging, and improving any layer of the stack.</li>\n<li>You&#39;re passionate about making software creation accessible and empowering the next generation of builders.</li>\n</ul>\n<p><strong>Bonus Points:</strong></p>\n<ul>\n<li>Experience with Google Cloud Platform (GCP) services and tools.</li>\n<li>Knowledge of modern observability platforms (Prometheus, Grafana, Datadog, etc.).</li>\n<li>Experience building reliable systems capable of handling high throughput and low latency.</li>\n<li>Experience with Go and Terraform.</li>\n<li>Familiarity with working in rapid-growth environments.</li>\n</ul>\n<p>_This is a full-time role that can be held from our Foster City, CA office. 
The role has an in-office requirement of Monday, Wednesday, and Friday._</p>\n<p><strong>Full-Time Employee Benefits Include:</strong></p>\n<ul>\n<li>Competitive Salary &amp; Equity</li>\n<li>401(k) Program with a 4% match</li>\n<li>Health, Dental, Vision and Life Insurance</li>\n<li>Short Term and Long Term Disability</li>\n<li>Paid Parental, Medical, Caregiver Leave</li>\n<li>Commuter Benefits</li>\n<li>Monthly Wellness Stipend</li>\n<li>Autonomous Work Environment</li>\n<li>In Office Set-Up Reimbursement</li>\n<li>Flexible Time Off (FTO) + Holidays</li>\n<li>Quarterly Team Gatherings</li>\n<li>In Office Amenities</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_8c164f95-f8d","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Replit","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/replit.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/replit/16c85abc-763c-4f36-ab67-64f416343384","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$190K - $240K","x-skills-required":["Site Reliability Engineering","DevOps","Systems Engineering","Infrastructure Engineering","Python","Go","Terraform","Kubernetes","Docker","GCP","Monitoring/observability solutions","Debugging and performance tuning","Incident management","Infrastructure as code","Configuration management tools"],"x-skills-preferred":["Google Cloud Platform (GCP) services and tools","Modern observability platforms (Prometheus, Grafana, Datadog, etc.)","Building reliable systems capable of handling high throughput and low latency","Go and Terraform","Familiarity with working in rapid-growth environments"],"datePosted":"2026-03-07T15:20:28.138Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Foster City, 
CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, DevOps, Systems Engineering, Infrastructure Engineering, Python, Go, Terraform, Kubernetes, Docker, GCP, Monitoring/observability solutions, Debugging and performance tuning, Incident management, Infrastructure as code, Configuration management tools, Google Cloud Platform (GCP) services and tools, Modern observability platforms (Prometheus, Grafana, Datadog, etc.), Building reliable systems capable of handling high throughput and low latency, Go and Terraform, Familiarity with working in rapid-growth environments","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":190000,"maxValue":240000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_b7de618e-5e1"},"title":"Site Reliability Engineer","description":"<p>Join our Site Reliability Engineering team and help ensure the reliability, scalability, and performance of Replit&#39;s infrastructure that serves millions of developers worldwide. As a Site Reliability Engineer, you will bridge the gap between development and operations, implementing automation and establishing best practices that enable our platform to scale efficiently while maintaining high availability.</p>\n<p>We are seeking SREs who are passionate about building and maintaining resilient systems at scale. Your mission will be to design and implement robust monitoring solutions, automate operational tasks, and continuously improve our infrastructure&#39;s reliability and performance.</p>\n<p><strong>Responsibilities</strong></p>\n<ul>\n<li>Design and Implement Observability Solutions: Develop comprehensive monitoring and alerting systems using modern observability tools. Create dashboards and metrics that provide real-time visibility into system health and performance. 
Implement logging strategies that enable quick problem identification and resolution.</li>\n</ul>\n<ul>\n<li>Drive Automation and Infrastructure as Code: Architect and implement infrastructure automation solutions using tools like Terraform, Ansible, or Pulumi. Design and maintain CI/CD pipelines that enable reliable and consistent deployments. Create self-healing systems that can automatically respond to common failure scenarios.</li>\n</ul>\n<ul>\n<li>Establish SLOs and SLIs: Work with product and engineering teams to define and implement Service Level Objectives (SLOs) and Service Level Indicators (SLIs). Build systems to track and report on these metrics, ensuring we maintain high reliability standards while balancing innovation speed.</li>\n</ul>\n<ul>\n<li>Incident Management and Response: Lead incident response efforts, conducting thorough post-mortems, and implementing improvements to prevent future occurrences. Develop and maintain runbooks for critical services. Build tools and processes that reduce Mean Time To Recovery (MTTR).</li>\n</ul>\n<ul>\n<li>Performance Optimization: Identify and resolve performance bottlenecks across our infrastructure. Implement capacity planning strategies and optimize resource utilization. 
Work on reducing latency and improving system efficiency across global regions.</li>\n</ul>\n<p><strong>Requirements</strong></p>\n<ul>\n<li>4-8 years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering)</li>\n</ul>\n<ul>\n<li>Strong programming skills in languages commonly used for automation (Python, Go, or similar)</li>\n</ul>\n<ul>\n<li>Deep understanding of distributed systems</li>\n</ul>\n<ul>\n<li>Experience with container orchestration platforms (Kubernetes) and cloud-native technologies</li>\n</ul>\n<ul>\n<li>Proven track record of implementing and maintaining monitoring/observability solutions</li>\n</ul>\n<ul>\n<li>Strong incident management skills with experience leading incident response</li>\n</ul>\n<ul>\n<li>Experience with infrastructure as code and configuration management tools</li>\n</ul>\n<p><strong>Bonus Points</strong></p>\n<ul>\n<li>Experience with Google Cloud Platform (GCP) services and tools</li>\n</ul>\n<ul>\n<li>Knowledge of modern observability platforms (Prometheus, Grafana, Datadog, etc.)</li>\n</ul>\n<p><strong>What We Value</strong></p>\n<ul>\n<li>Problem-solving mindset: Ability to approach complex operational challenges systematically and devise effective solutions</li>\n</ul>\n<ul>\n<li>Self-directed and autonomous: Capable of working independently while collaborating effectively with cross-functional teams</li>\n</ul>\n<ul>\n<li>Strong communication skills: Ability to explain complex technical concepts to both technical and non-technical audiences</li>\n</ul>\n<ul>\n<li>Continuous learning: Passion for staying current with industry best practices and new technologies</li>\n</ul>\n<ul>\n<li>Focus on automation: Strong belief in automating repetitive tasks and building self-healing systems</li>\n</ul>\n<p><strong>Full-Time Employee Benefits Include</strong></p>\n<ul>\n<li>Competitive Salary &amp; Equity</li>\n</ul>\n<ul>\n<li>401(k) Program with a 4% 
match</li>\n</ul>\n<ul>\n<li>Health, Dental, Vision and Life Insurance</li>\n</ul>\n<ul>\n<li>Short Term and Long Term Disability</li>\n</ul>\n<ul>\n<li>Paid Parental, Medical, Caregiver Leave</li>\n</ul>\n<ul>\n<li>Commuter Benefits</li>\n</ul>\n<ul>\n<li>Monthly Wellness Stipend</li>\n</ul>\n<ul>\n<li>Autonomous Work Environment</li>\n</ul>\n<ul>\n<li>In Office Set-Up Reimbursement</li>\n</ul>\n<ul>\n<li>Flexible Time Off (FTO) + Holidays</li>\n</ul>\n<ul>\n<li>Quarterly Team Gatherings</li>\n</ul>\n<ul>\n<li>In Office Amenities</li>\n</ul>\n<p><strong>Want to Learn More About What We Are Up To?</strong></p>\n<ul>\n<li>Meet the Replit Agent</li>\n</ul>\n<ul>\n<li>Replit: Make an app for that</li>\n</ul>\n<ul>\n<li>Replit Blog</li>\n</ul>\n<ul>\n<li>Amjad TED Talk</li>\n</ul>\n<p><strong>Interviewing + Culture at Replit</strong></p>\n<ul>\n<li>Operating Principles</li>\n</ul>\n<ul>\n<li>Reasons not to work at Replit</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_b7de618e-5e1","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Replit","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/replit.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/replit/f6e6158e-eb89-4008-81ea-1b7512bc509d","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$160K - $250K","x-skills-required":["Site Reliability Engineering","DevOps","Systems Engineering","Infrastructure Engineering","Python","Go","Distributed systems","Container orchestration platforms","Cloud-native technologies","Monitoring/observability solutions","Incident management","Infrastructure as code","Configuration management tools"],"x-skills-preferred":["Google Cloud 
Platform","Prometheus","Grafana","Datadog"],"datePosted":"2026-03-07T15:20:24.140Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"United States"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Site Reliability Engineering, DevOps, Systems Engineering, Infrastructure Engineering, Python, Go, Distributed systems, Container orchestration platforms, Cloud-native technologies, Monitoring/observability solutions, Incident management, Infrastructure as code, Configuration management tools, Google Cloud Platform, Prometheus, Grafana, Datadog","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":160000,"maxValue":250000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_323bc85d-b69"},"title":"Staff Infrastructure Engineer","description":"<p><strong>About the Role:</strong></p>\n<p>Join our Infrastructure Engineering team and help ensure the reliability, scalability, and performance of Replit&#39;s infrastructure that serves millions of developers worldwide. As a Staff Infrastructure Engineer, you will bridge the gap between development and operations, implementing automation and establishing best practices that enable our platform to scale efficiently while maintaining high availability.</p>\n<p><strong>Responsibilities:</strong></p>\n<ul>\n<li>Drive Automation and Infrastructure as Code: Architect, build, and improve automation to eliminate toil and operational work. Design and maintain CI/CD pipelines and infrastructure automation using tools like Terraform or Pulumi. 
Create self-healing systems that can automatically respond to common failure scenarios.</li>\n</ul>\n<ul>\n<li>Optimise Performance and Infrastructure: Collaborate with core infrastructure and product teams to performance tune and optimise our cloud deployments (Kubernetes, Docker, GCP). Identify and resolve performance bottlenecks, implement capacity planning strategies, and reduce latency across global regions.</li>\n</ul>\n<ul>\n<li>Elevate Developer Experience: Design and implement improvements to our build, test, and deployment systems to make software delivery faster, safer, and more reliable for all engineers.</li>\n</ul>\n<ul>\n<li>Drive Cross-Company Improvements: Partner directly with service owners across Replit to understand their pain points, and collaborate on implementing build/test/deploy enhancements within their specific services.</li>\n</ul>\n<ul>\n<li>Build Shared Tooling: Create and maintain centralized tooling and automation that improves the entire engineering lifecycle, from local development to production monitoring.</li>\n</ul>\n<ul>\n<li>Debug and Harden Systems: Dive deep into debugging extremely difficult technical problems, making our systems and products more robust, operable, and easier to diagnose.</li>\n</ul>\n<ul>\n<li>Provide Staff-Level Guidance: Review feature and system designs, acting as an owner for the security, scale, and operational integrity of those designs.</li>\n</ul>\n<ul>\n<li>Educate and Mentor: Educate, mentor, and hold accountable the engineering team to improve the reliability of our systems, making reliability a core value of the Replit engineering culture.</li>\n</ul>\n<ul>\n<li>Build and Integrate: Write high-quality, well-tested code to meet the needs of your customers, including building pipelines to integrate with 3rd party vendors.</li>\n</ul>\n<p><strong>Required Skills and Experience:</strong></p>\n<ul>\n<li>8-10 years of experience in Infrastructure Engineering or similar roles (DevOps, Systems 
Engineering, Site Reliability Engineering).</li>\n</ul>\n<ul>\n<li>Strong programming skills in languages like Python or Go.</li>\n</ul>\n<ul>\n<li>You write high-quality, well-tested code.</li>\n</ul>\n<ul>\n<li>Deep understanding of distributed systems. You&#39;ve designed, built, scaled, and maintained production services and know how to compose a service-oriented architecture.</li>\n</ul>\n<ul>\n<li>Experience with container orchestration platforms (Kubernetes) and cloud-native technologies.</li>\n</ul>\n<ul>\n<li>Proven track record of implementing and maintaining monitoring/observability solutions, with strong skills in debugging and performance tuning.</li>\n</ul>\n<ul>\n<li>Strong incident management skills with experience leading incident response and demonstrated critical thinking under pressure.</li>\n</ul>\n<ul>\n<li>Experience with infrastructure as code (e.g., Terraform) and configuration management tools.</li>\n</ul>\n<ul>\n<li>Excellent written and verbal communication skills, with an ability to explain technical concepts clearly and simply and a bias toward open, transparent cultural practices.</li>\n</ul>\n<ul>\n<li>Strong interpersonal skills, with experience working with engineers from junior to principal levels.</li>\n</ul>\n<ul>\n<li>A willingness to dive into understanding, debugging, and improving any layer of the stack.</li>\n</ul>\n<ul>\n<li>You&#39;re passionate about making software creation accessible and empowering the next generation of builders.</li>\n</ul>\n<p><strong>Bonus Points:</strong></p>\n<ul>\n<li>Deep experience with Google Cloud Platform (GCP) services and tools.</li>\n</ul>\n<ul>\n<li>Knowledge of modern observability platforms (Prometheus, Grafana, Datadog, etc.).</li>\n</ul>\n<ul>\n<li>Experience designing and building reliable systems capable of handling high throughput and low latency.</li>\n</ul>\n<ul>\n<li>Experience with Go and Terraform.</li>\n</ul>\n<ul>\n<li>Familiarity with working in rapid-growth 
environments.</li>\n</ul>\n<ul>\n<li>Experience writing company-facing blog posts and training materials.</li>\n</ul>\n<p><strong>Full-Time Employee Benefits Include:</strong></p>\n<ul>\n<li>Competitive Salary &amp; Equity</li>\n</ul>\n<ul>\n<li>401(k) Program with a 4% match</li>\n</ul>\n<ul>\n<li>Health, Dental, Vision and Life Insurance</li>\n</ul>\n<ul>\n<li>Short Term and Long Term Disability</li>\n</ul>\n<ul>\n<li>Paid Parental, Medical, Caregiver Leave</li>\n</ul>\n<ul>\n<li>Commuter Benefits</li>\n</ul>\n<ul>\n<li>Monthly Wellness Stipend</li>\n</ul>\n<ul>\n<li>Autonomous Work Environment</li>\n</ul>\n<ul>\n<li>In Office Set-Up Reimbursement</li>\n</ul>\n<ul>\n<li>Flexible Time Off (FTO) + Holidays</li>\n</ul>\n<ul>\n<li>Quarterly Team Gatherings</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_323bc85d-b69","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Replit","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/replit.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/replit/6481ec1e-527c-4c1f-a041-2fb5021e7bd5","x-work-arrangement":"hybrid","x-experience-level":"staff","x-job-type":"full-time","x-salary-range":"$220K – $325K","x-skills-required":["Infrastructure Engineering","DevOps","Systems Engineering","Site Reliability Engineering","Python","Go","Distributed systems","Container orchestration platforms","Cloud-native technologies","Monitoring/observability solutions","Infrastructure as code","Configuration management tools"],"x-skills-preferred":["Google Cloud Platform","Prometheus","Grafana","Datadog","Go","Terraform","Rapid-growth environments","Company-facing blog posts","Training materials"],"datePosted":"2026-03-07T15:18:43.191Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Foster City, 
CA"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Infrastructure Engineering, DevOps, Systems Engineering, Site Reliability Engineering, Python, Go, Distributed systems, Container orchestration platforms, Cloud-native technologies, Monitoring/observability solutions, Infrastructure as code, Configuration management tools, Google Cloud Platform, Prometheus, Grafana, Datadog, Go, Terraform, Rapid-growth environments, Company-facing blog posts, Training materials","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":220000,"maxValue":325000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_237ffb32-054"},"title":"Software Engineer, Security Observability","description":"<p><strong>Software Engineer, Security Observability</strong></p>\n<p><strong>Location</strong></p>\n<p>Remote - US</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Remote</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$234.4K – $385K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Security is at the foundation of OpenAI’s mission to ensure that 
artificial general intelligence benefits all of humanity.</p>\n<p>The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a Software Engineer, Security Observability to join our Security team. In this role, you will be responsible for building secure, scalable systems that enhance our security observability infrastructure. Leveraging your strong engineering skills, you will collaborate with cross-functional teams to develop, deploy, and maintain robust software solutions that support our security and detection capabilities.</p>\n<p>This role is open to remote employees, or relocation assistance is available to one of our OpenAI offices in San Francisco, Seattle, or New York City.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design and develop scalable software systems that facilitate security observability across our infrastructure.</li>\n</ul>\n<ul>\n<li>Build and maintain data pipelines that centralize and store security-relevant data from diverse sources.</li>\n</ul>\n<ul>\n<li>Proactively improve the resilience and reliability of data systems to ensure high platform availability</li>\n</ul>\n<ul>\n<li>Collaborate closely with Detection &amp; Response (D&amp;R) and other security teams to reduce the company’s security risk.</li>\n</ul>\n<ul>\n<li>Contribute to data engineering in support of forensic investigations and compliance efforts.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>Strong software engineering experience, with proficiency in programming languages such as Python, Golang, or 
similar.</li>\n</ul>\n<ul>\n<li>A background in infrastructure as code, with experience using tools like Terraform and working with cloud platforms such as Azure.</li>\n</ul>\n<ul>\n<li>Experience with building and maintaining data pipelines, particularly for security-related use cases.</li>\n</ul>\n<ul>\n<li>A generalist engineering mindset, with the flexibility to pivot between various technical domains such as databases, site reliability engineering (SRE), or security.</li>\n</ul>\n<ul>\n<li>The ability to collaborate effectively with security and engineering teams to understand evolving data needs and implement scalable solutions.</li>\n</ul>\n<ul>\n<li>A proactive and detail-oriented approach to problem-solving, with a focus on improving security data visibility and forensic capabilities.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_237ffb32-054","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/92bf4ff3-7acf-4e49-8e09-47e4e8bd1f83","x-work-arrangement":"remote","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$234.4K – $385K","x-skills-required":["Python","Golang","Terraform","Azure","data pipelines","security-related use cases","databases","site reliability engineering (SRE)","security"],"x-skills-preferred":["infrastructure as code","cloud platforms","forensic investigations","compliance efforts"],"datePosted":"2026-03-06T18:41:31.640Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Remote - US"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Golang, Terraform, Azure, data pipelines, security-related use cases, databases, site reliability engineering (SRE), security, infrastructure as code, cloud platforms, forensic investigations, compliance efforts","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":234400,"maxValue":385000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_edcdad0c-360"},"title":"Software Engineer, Security Observability","description":"<p><strong>Software Engineer, Security 
Observability</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$234.4K – $385K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as 
eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>Security is at the foundation of OpenAI’s mission to ensure that artificial general intelligence benefits all of humanity.</p>\n<p>The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a Software Engineer, Security Observability to join our Security team. In this role, you will be responsible for building secure, scalable systems that enhance our security observability infrastructure. 
Leveraging your strong engineering skills, you will collaborate with cross-functional teams to develop, deploy, and maintain robust software solutions that support our security and detection capabilities.</p>\n<p>This role is open to remote employees, or relocation assistance is available to one of our OpenAI offices in San Francisco, Seattle, or New York City.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design and develop scalable software systems that facilitate security observability across our infrastructure.</li>\n</ul>\n<ul>\n<li>Build and maintain data pipelines that centralize and store security-relevant data from diverse sources.</li>\n</ul>\n<ul>\n<li>Proactively improve the resilience and reliability of data systems to ensure high platform availability</li>\n</ul>\n<ul>\n<li>Collaborate closely with Detection &amp; Response (D&amp;R) and other security teams to reduce the company’s security risk.</li>\n</ul>\n<ul>\n<li>Contribute to data engineering in support of forensic investigations and compliance efforts.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>Strong software engineering experience, with proficiency in programming languages such as Python, Golang, or similar.</li>\n</ul>\n<ul>\n<li>A background in infrastructure as code, with experience using tools like Terraform and working with cloud platforms such as Azure.</li>\n</ul>\n<ul>\n<li>Experience with building and maintaining data pipelines, particularly for security-related use cases.</li>\n</ul>\n<ul>\n<li>A generalist engineering mindset, with the flexibility to pivot between various technical domains such as databases, site reliability engineering (SRE), or security.</li>\n</ul>\n<ul>\n<li>The ability to collaborate effectively with security and engineering teams to understand evolving data needs and implement scalable solutions.</li>\n</ul>\n<ul>\n<li>A proactive and detail-oriented approach to problem-solving, with a focus on 
improving security data visibility and forensic capabilities.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_edcdad0c-360","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/3e254907-5101-438d-8708-f6f34e5c75ea","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$234.4K – $385K • Offers Equity","x-skills-required":["Python","Golang","Terraform","Azure","data pipelines","security-related use cases","databases","site reliability engineering (SRE)","security"],"x-skills-preferred":["infrastructure as code","cloud platforms","data engineering","forensic investigations","compliance efforts"],"datePosted":"2026-03-06T18:31:34.060Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Golang, Terraform, Azure, data pipelines, security-related use cases, databases, site reliability engineering (SRE), security, infrastructure as code, cloud platforms, data engineering, forensic investigations, compliance 
efforts","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":234400,"maxValue":385000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_88643d65-f58"},"title":"Software Engineer, Security Observability","description":"<p><strong>Software Engineer, Security Observability</strong></p>\n<p><strong>Location</strong></p>\n<p>Seattle</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$234.4K – $385K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state 
or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>Security is at the foundation of OpenAI’s mission to ensure that artificial general intelligence benefits all of humanity.</p>\n<p>The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a Software Engineer, Security Observability to join our Security team. In this role, you will be responsible for building secure, scalable systems that enhance our security observability infrastructure. 
Leveraging your strong engineering skills, you will collaborate with cross-functional teams to develop, deploy, and maintain robust software solutions that support our security and detection capabilities.</p>\n<p>This role is open to remote employees, or relocation assistance is available to one of our OpenAI offices in San Francisco, Seattle, or New York City.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design and develop scalable software systems that facilitate security observability across our infrastructure.</li>\n</ul>\n<ul>\n<li>Build and maintain data pipelines that centralize and store security-relevant data from diverse sources.</li>\n</ul>\n<ul>\n<li>Proactively improve the resilience and reliability of data systems to ensure high platform availability</li>\n</ul>\n<ul>\n<li>Collaborate closely with Detection &amp; Response (D&amp;R) and other security teams to reduce the company’s security risk.</li>\n</ul>\n<ul>\n<li>Contribute to data engineering in support of forensic investigations and compliance efforts.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>Strong software engineering experience, with proficiency in programming languages such as Python, Golang, or similar.</li>\n</ul>\n<ul>\n<li>A background in infrastructure as code, with experience using tools like Terraform and working with cloud platforms such as Azure.</li>\n</ul>\n<ul>\n<li>Experience with building and maintaining data pipelines, particularly for security-related use cases.</li>\n</ul>\n<ul>\n<li>A generalist engineering mindset, with the flexibility to pivot between various technical domains such as databases, site reliability engineering (SRE), or security.</li>\n</ul>\n<ul>\n<li>The ability to collaborate effectively with security and engineering teams to understand evolving data needs and implement scalable solutions.</li>\n</ul>\n<ul>\n<li>A proactive and detail-oriented approach to problem-solving, with a focus on 
improving security data visibility and forensic capabilities.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_88643d65-f58","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/747bb870-4ef1-4bfd-b2c0-d48042a85080","x-work-arrangement":"remote","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$234.4K – $385K","x-skills-required":["Python","Golang","Terraform","Azure","data pipelines","security-related use cases","databases","site reliability engineering (SRE)","security"],"x-skills-preferred":["infrastructure as code","cloud platforms","data engineering","forensic investigations","compliance efforts"],"datePosted":"2026-03-06T18:31:29.641Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Seattle"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Golang, Terraform, Azure, data pipelines, security-related use cases, databases, site reliability engineering (SRE), security, infrastructure as code, cloud platforms, data engineering, forensic investigations, compliance 
efforts","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":234400,"maxValue":385000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_7f4e2dd8-338"},"title":"Software Engineer, Security Observability","description":"<p><strong>Software Engineer, Security Observability</strong></p>\n<p><strong>Location</strong></p>\n<p>New York City</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Location Type</strong></p>\n<p>Hybrid</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$325K – $405K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as 
required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Security is at the foundation of OpenAI’s mission to ensure that artificial general intelligence benefits all of humanity.</p>\n<p>The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.</p>\n<p><strong>About the Role</strong></p>\n<p>We are seeking a Software Engineer, Security Observability to join our Security team. In this role, you will be responsible for building secure, scalable systems that enhance our security observability infrastructure. 
Leveraging your strong engineering skills, you will collaborate with cross-functional teams to develop, deploy, and maintain robust software solutions that support our security and detection capabilities.</p>\n<p>This role is open to remote employees, or relocation assistance is available to one of our OpenAI offices in San Francisco, Seattle, or New York City.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Design and develop scalable software systems that facilitate security observability across our infrastructure.</li>\n</ul>\n<ul>\n<li>Build and maintain data pipelines that centralize and store security-relevant data from diverse sources.</li>\n</ul>\n<ul>\n<li>Proactively improve the resilience and reliability of data systems to ensure high platform availability</li>\n</ul>\n<ul>\n<li>Collaborate closely with Detection &amp; Response (D&amp;R) and other security teams to reduce the company’s security risk.</li>\n</ul>\n<ul>\n<li>Contribute to data engineering in support of forensic investigations and compliance efforts.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>Strong software engineering experience, with proficiency in programming languages such as Python, Golang, or similar.</li>\n</ul>\n<ul>\n<li>A background in infrastructure as code, with experience using tools like Terraform and working with cloud platforms such as Azure.</li>\n</ul>\n<ul>\n<li>Experience with building and maintaining data pipelines, particularly for security-related use cases.</li>\n</ul>\n<ul>\n<li>A generalist engineering mindset, with the flexibility to pivot between various technical domains such as databases, site reliability engineering (SRE), or security.</li>\n</ul>\n<ul>\n<li>The ability to collaborate effectively with security and engineering teams to understand evolving data needs and implement scalable solutions.</li>\n</ul>\n<ul>\n<li>A proactive and detail-oriented approach to problem-solving, with a focus on 
improving security data visibility and forensic capabilities.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives, voices, and experiences that form the full spectrum of humanity.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_7f4e2dd8-338","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/1e4e9985-babf-4bd9-8fe8-a2016250780d","x-work-arrangement":"hybrid","x-experience-level":"mid","x-job-type":"full-time","x-salary-range":"$325K – $405K • Offers Equity","x-skills-required":["Python","Golang","Terraform","Azure","data pipelines","security-related use cases","databases","site reliability engineering (SRE)","security"],"x-skills-preferred":["infrastructure as code","cloud platforms","forensic investigations","compliance efforts"],"datePosted":"2026-03-06T18:30:54.020Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York City"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Python, Golang, Terraform, Azure, data pipelines, security-related use cases, databases, site reliability engineering (SRE), security, infrastructure as code, cloud platforms, forensic investigations, compliance 
efforts","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":325000,"maxValue":405000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_fb4acb2b-bab"},"title":"Security Reliability Engineering, Lead","description":"<p><strong>Security Reliability Engineering, Lead</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$293K – $385K</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health 
and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>The Infrastructure Engineering function sits within IT and is responsible for reliably building, deploying, and operating critical on prem and hybrid environments that power internal services and critical R&amp;D environments.</p>\n<p>This is a new, bootstrap team focused on applying strong Site Reliability Engineering discipline to environments where uptime, safety, recoverability, and security are non-negotiable. The team replaces bespoke, one off infrastructure with standardized infrastructure-as-code building blocks that compound reliability and operational leverage as OpenAI scales.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for a Security Reliability Engineering Lead to design, build, and operate reliable, secure, and scalable infrastructure that underpins identity, access, endpoint, and shared platform services across the company.</p>\n<p>In this role, you will own infrastructure and identity systems end to end, from foundational design and provisioning through policy enforcement, upgrades, recovery, and day two operations. 
You will establish durable, production grade platforms that remove operational friction, enforce security by default, and enable teams to move faster with confidence.</p>\n<p>This role is well suited for a senior engineer who thrives in ambiguity, enjoys owning complex systems end to end, and raises the reliability and security bar by replacing fragile implementations with standardized, repeatable infrastructure.</p>\n<p>This role is based in our San Francisco HQ and requires in-office presence.</p>\n<p><strong>In this role, you will:</strong></p>\n<p><strong>Set direction and establish strong foundations</strong></p>\n<ul>\n<li>Define and evolve infrastructure patterns for on prem and hybrid environments, including self hosted platforms, vendor supported systems, and lab environments.</li>\n</ul>\n<ul>\n<li>Establish standardized, production grade deployment and operational models that replace bespoke implementations.</li>\n</ul>\n<ul>\n<li>Partner with IT, Security, Identity, and Network teams to ensure infrastructure meets reliability, security, and access requirements by design.</li>\n</ul>\n<ul>\n<li>Design and mature the production architecture for IAM adjacent platforms such as Microsoft Entra using SRE principles.</li>\n</ul>\n<ul>\n<li>Establish common management rules and shared resources within Azure subscriptions to ensure consistent, policy aligned operations.</li>\n</ul>\n<p><strong>Build, operate, and scale reliably</strong></p>\n<ul>\n<li>Own the full lifecycle of infrastructure systems, including deployment, upgrades, patching, recovery, and ongoing operations.</li>\n</ul>\n<ul>\n<li>Operate and harden shared infrastructure provisioned through Infra Terraform, ensuring repeatability, auditability, and safe change management.</li>\n</ul>\n<ul>\n<li>Design and implement infrastructure as code and configuration management to support shared services, identity adjacent systems, and endpoint platforms using tools like Chef, Ansible and 
Terraform.</li>\n</ul>\n<ul>\n<li>Build and operate monitoring, alerting, and incident response mechanisms to meet high availability and recoverability targets.</li>\n</ul>\n<ul>\n<li>Lead incident response and postmortems across infrastructure, identity adjacent platforms, and fleet systems, driving durable fixes and shared learning.</li>\n</ul>\n<ul>\n<li>Build and operate containerized and platform services, including Kubernetes and Docker-based workloads, using DevOps practices that emphasize reliability, repeatability, and safe change management.</li>\n</ul>\n<ul>\n<li>Use Git-based workflows as the source of truth for infrastructure and policy changes, enabling review, auditability, and safe, reversible automation.</li>\n</ul>\n<p><strong>Automate for leverage and safety</strong></p>\n<ul>\n<li>Identify high leverage automation opportunities that eliminate manual toil and reduce operational risk across infrastructure and access related systems.</li>\n</ul>\n<ul>\n<li>Implement guardrails, safety mechanisms, and progressive rollout patterns for infrastructure and policy enforcement changes.</li>\n</ul>\n<ul>\n<li>Ensure automation is safe, observable, and resilient under failure conditions, particularly for shared services and high blast radius systems.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_fb4acb2b-bab","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/645ccd65-eb60-4eb7-b094-b01c2269638c","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$293K – $385K","x-skills-required":["Security Reliability Engineering","Infrastructure as Code","Cloud 
Computing","Containerization","DevOps","Git","Terraform","Ansible","Chef","Kubernetes","Docker","Microsoft Entra","Azure","Identity and Access Management","Endpoint Security","Platform Services"],"x-skills-preferred":["Site Reliability Engineering","Cloud Security","Container Orchestration","Infrastructure Automation","Monitoring and Alerting","Incident Response","Postmortem Analysis","DevOps Practices","Cloud-Native Applications","Microservices Architecture"],"datePosted":"2026-03-06T18:29:47.579Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Security Reliability Engineering, Infrastructure as Code, Cloud Computing, Containerization, DevOps, Git, Terraform, Ansible, Chef, Kubernetes, Docker, Microsoft Entra, Azure, Identity and Access Management, Endpoint Security, Platform Services, Site Reliability Engineering, Cloud Security, Container Orchestration, Infrastructure Automation, Monitoring and Alerting, Incident Response, Postmortem Analysis, DevOps Practices, Cloud-Native Applications, Microservices Architecture","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":293000,"maxValue":385000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d4efa5c8-cef"},"title":"Offensive Security Engineer, Hardware","description":"<p><strong>Job Posting</strong></p>\n<p><strong>Offensive Security Engineer, Hardware</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p>Security</p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>San Francisco: $293K – $490K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including 
market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p>More details about our benefits are available to candidates during the hiring process.</p>\n<p>This role is at-will and OpenAI reserves the right to modify base pay and other compensation components at any time based on individual performance, team or 
company results, or market conditions.</p>\n<p><strong>About the Team</strong></p>\n<p>Security is at the foundation of OpenAI’s mission to ensure that artificial general intelligence benefits all of humanity. The Security team protects OpenAI’s technology, people, and products. We are technical in what we build but are operational in how we do our work, and are committed to supporting all products and research at OpenAI. Our Security team tenets include: prioritizing for impact, enabling researchers, preparing for future transformative technologies, and engaging a robust security culture.</p>\n<p><strong>About the Role</strong></p>\n<p>We&#39;re seeking an exceptional Principal-level Offensive Security Engineer to challenge and strengthen OpenAI&#39;s security posture. This role isn&#39;t your typical red team job - it&#39;s an opportunity to engage broadly and deeply, craft innovative attack simulations, collaborate closely with defensive teams, and influence strategic security improvements across the organization.</p>\n<p>You&#39;ll have the chance to not only find vulnerabilities but actively drive their resolution, automate offensive techniques with cutting-edge technologies, and use your unique attacker perspective to shape our security strategy. 
This role will be primarily focused on continuously testing our hardware products and related services.</p>\n<p><strong>In this role you will:</strong></p>\n<ul>\n<li>Collaborate proactively with engineering teams to enhance security and mitigate risks in hardware, firmware, and software.</li>\n</ul>\n<ul>\n<li>Perform comprehensive penetration testing on our diverse suite of products.</li>\n</ul>\n<ul>\n<li>Leverage advanced automation and OpenAI technologies to optimize your offensive security work.</li>\n</ul>\n<ul>\n<li>Present insightful, actionable findings clearly and compellingly to inspire impactful change.</li>\n</ul>\n<ul>\n<li>Influence security strategy by providing attacker-driven insights into risk and threat modeling.</li>\n</ul>\n<p><strong>You might thrive in this role if you have:</strong></p>\n<ul>\n<li>7+ years of hands-on experience or exceptional accomplishments demonstrating equivalent expertise.</li>\n</ul>\n<ul>\n<li>Exceptional skill in code review, identifying novel and subtle vulnerabilities.</li>\n</ul>\n<ul>\n<li>Demonstrated mastery assessing complex technology stacks, including:</li>\n</ul>\n<ul>\n<li>Proven ability to reverse engineer bootrom images, firmware, or silicon-level components.</li>\n</ul>\n<ul>\n<li>Deep familiarity with low-level kernel operations, secure boot processes, and hardware-software interactions.</li>\n</ul>\n<ul>\n<li>Hands-on experience building and validating secure boot chains and threat models.</li>\n</ul>\n<ul>\n<li>Proficiency with hardware debugging tools (UART, JTAG, SWD, oscilloscopes, logic analyzers).</li>\n</ul>\n<ul>\n<li>Solid programming skills in C/C++, Python, or assembly for embedded systems.</li>\n</ul>\n<ul>\n<li>Industry experience securing consumer hardware (e.g., mobile devices, IoT, chipsets).</li>\n</ul>\n<ul>\n<li>Excellent written and verbal communication skills for technical and non-technical audiences.</li>\n</ul>\n<ul>\n<li>Strong intuitive understanding of trust boundaries and 
risk assessment in dynamic contexts.</li>\n</ul>\n<ul>\n<li>Excellent coding skills, capable of writing robust tools and automation for offensive operations.</li>\n</ul>\n<ul>\n<li>Ability to communicate complex technical concepts effectively through compelling storytelling.</li>\n</ul>\n<ul>\n<li>Proven track record of not just finding vulnerabilities but actively contributing to solutions in complex codebases.</li>\n</ul>\n<p><strong>Bonus points:</strong></p>\n<ul>\n<li>Prior experience working in tech startups or fast-paced technology environments.</li>\n</ul>\n<ul>\n<li>Experience in related disciplines such as Software Engineering (SWE), Detection Engineering, Site Reliability Engineering (SRE), Security Engineering, or IT Infrastructure.</li>\n</ul>\n<p><strong>About OpenAI</strong></p>\n<p>OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. We push the boundaries of the capabilities of AI systems and seek to safely deploy them to the world through our products. 
AI is an extremely powerful tool that must be created with safety and human needs at its core, and to achieve our mission, we must encompass and value the many different perspectives and experiences of our team members.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d4efa5c8-cef","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/f123bbe4-7f19-46c8-a6ab-4a5d7b714988","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$293K – $490K","x-skills-required":["code review","penetration testing","advanced automation","secure boot processes","hardware debugging tools","C/C++","Python","assembly","embedded systems","consumer hardware","firmware","silicon-level components","low-level kernel operations","secure boot chains","threat models","UART","JTAG","SWD","oscilloscopes","logic analyzers","solid programming skills","industry experience","excellent written and verbal communication skills","trust boundaries","risk assessment","dynamic contexts","compelling storytelling","complex technical concepts","offensive operations","robust tools and automation"],"x-skills-preferred":["tech startups","fast-paced technology environments","Software Engineering","Detection Engineering","Site Reliability Engineering","Security Engineering","IT Infrastructure"],"datePosted":"2026-03-06T18:29:30.545Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"code review, penetration testing, advanced automation, secure boot processes, hardware debugging tools, C/C++, Python, assembly, embedded systems, consumer 
hardware, firmware, silicon-level components, low-level kernel operations, secure boot chains, threat models, UART, JTAG, SWD, oscilloscopes, logic analyzers, solid programming skills, industry experience, excellent written and verbal communication skills, trust boundaries, risk assessment, dynamic contexts, compelling storytelling, complex technical concepts, offensive operations, robust tools and automation, tech startups, fast-paced technology environments, Software Engineering, Detection Engineering, Site Reliability Engineering, Security Engineering, IT Infrastructure","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":293000,"maxValue":490000,"unitText":"YEAR"}}}]}