<?xml version="1.0" encoding="UTF-8"?>
<source>
  <jobs>
    <job>
      <externalid>bffa66a3-0b4</externalid>
      <Title>Senior Software Engineer - Robinhood Command Center</Title>
      <Description><![CDATA[<p>Join us in building the future of finance.</p>
<p>Our mission is to democratize finance for all. An estimated $124 trillion of assets will be inherited by younger generations in the next two decades. The largest transfer of wealth in human history. If you’re ready to be at the epicenter of this historic cultural and financial shift, keep reading.</p>
<p>We are building an elite team, applying frontier technologies to the world’s biggest financial problems. We’re looking for bold thinkers. Sharp problem-solvers. Builders who are wired to make an impact. Robinhood isn’t a place for complacency; it’s where ambitious people do the best work of their careers. We’re a high-performing, fast-moving team with ethics at the center of everything we do. Expectations are high, and so are the rewards.</p>
<p>The Robinhood Command Center (RCC) is a newly formed reliability team that serves as the front line for detecting, coordinating, and mitigating production incidents across Robinhood.</p>
<p>As part of Robinhood’s broader reliability initiative, RCC works closely with product engineering, reliability, observability, infrastructure, and business teams to reduce customer impact and shorten incident duration.</p>
<p>As a Senior Engineer, you will be part of the founding RCC team, helping define how Robinhood responds to and learns from incidents at scale. This is a highly visible role focused on incident leadership, operational excellence, and reliability tooling. You will not own product services or core infrastructure, but you will own the processes and tools that enable fast, high-quality incident response.</p>
<p>This role is based in our Menlo Park, California office, with in-person attendance expected at least 3 days per week.</p>
<p><strong>Responsibilities:</strong></p>
<ul>
<li>Serve as a senior technical leader driving the long-term reliability and observability strategy across Robinhood’s infrastructure</li>
<li>Partner closely with many different types of engineers to raise the bar for operational excellence and incident response</li>
<li>Lead incident mitigation efforts by coordinating service owners, facilitating time-sensitive decisions such as rollbacks and traffic shifts, and maintaining a clear source of truth during active incidents</li>
<li>Develop and maintain incident management processes and procedures to ensure timely resolution and minimize customer impact</li>
<li>Own incident discovery at the company level by defining and maintaining global dashboards and alerts tied to critical user journeys (CUJs), availability, and business-impact metrics</li>
<li>Own and evolve incident response tooling and processes, including education, adoption, and measurement of MTTD/MTTR improvements</li>
<li>Drive post-incident governance and learning, defining standards for postmortems, SEV reviews, and follow-up tracking to ensure durable reliability improvements</li>
<li>Design and implement next-generation failure mitigation strategies that avoid full-region or full-datacenter failovers</li>
<li>Define and build frameworks to improve monitoring, alerting, and observability across hundreds of services and systems</li>
<li>Define and own the roadmap for bringing observability to critical user journeys for Robinhood’s products</li>
<li>Deliver key insights and executive-level reporting to enable better business decisions around service quality and reliability</li>
<li>Act as a force multiplier through mentoring, technical influence, and contributions to hiring and engineering culture</li>
</ul>
<p><strong>Requirements:</strong></p>
<ul>
<li>5+ years of software engineering experience, including significant experience operating production systems</li>
<li>2+ years focused on reliability engineering, infrastructure, distributed systems, or production operations</li>
<li>Hands-on experience serving in incident leadership roles (e.g., IMOC, incident commander, primary oncall)</li>
<li>Strong communication and cross-functional collaboration skills, especially during high-severity incidents</li>
<li>Deep knowledge of systems reliability, observability frameworks, and fault-tolerant architecture design</li>
<li>Experience with multi-region or multi-cluster architectures, capacity planning, and failover strategies</li>
<li>Familiarity with modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana)</li>
<li>Demonstrated ability to drive measurable improvements in MTTD, MTTR, availability, or customer impact</li>
</ul>
<p><strong>What we offer:</strong></p>
<ul>
<li>Challenging, high-impact work to grow your career</li>
<li>Performance-driven compensation with multipliers for outsized impact, bonus programs, equity ownership, and 401(k) matching</li>
<li>Best-in-class benefits to fuel your work, including 100% paid health insurance for employees with 90% coverage for dependents</li>
<li>Lifestyle wallet: a highly flexible benefits spending account for wellness, learning, and more</li>
<li>Employer-paid life &amp; disability insurance, fertility benefits, and mental health benefits</li>
<li>Time off to recharge, including company holidays, paid time off, sick time, parental leave, and more!</li>
<li>Exceptional office experience with catered meals, events, and comfortable workspaces</li>
</ul>
<p>In addition to the base pay range listed below, this role is also eligible for bonus opportunities + equity + benefits.</p>
<p>Base pay for the successful applicant will depend on a variety of job-related factors, which may include education, training, experience, location, business needs, or market demands. The expected base pay range for this role is based on the location where the work will be performed and is aligned to one of 3 compensation zones. For other locations not listed, compensation can be discussed with your recruiter during the interview process.</p>
<p>Base Pay Range:</p>
<p>Zone 1 (Menlo Park, CA; New York, NY; Bellevue, WA; Washington, DC): $196,000-$230,000 USD</p>
<p>Zone 2 (Denver, CO; Westlake, TX; Chicago, IL): $172,000-$202,000 USD</p>
<p>Zone 3 (Lake Mary, FL; Clearwater, FL; Gainesville, FL): $153,000-$179,000 USD</p>
<p style="margin-top:24px;font-size:13px;color:#666;">XML job scraping automation by <a href="https://yubhub.co">YubHub</a></p>]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>onsite</Workarrangement>
      <Salaryrange>$196,000-$230,000 USD</Salaryrange>
      <Skills>software engineering, reliability engineering, infrastructure, distributed systems, production operations, incident leadership, operational excellence, reliability tooling, processes and tools, fast and high-quality incident response, long-term reliability and observability strategy, multi-region or multi-cluster architectures, capacity planning, failover strategies, modern observability stacks, OpenTelemetry, Prometheus, Grafana</Skills>
      <Category>Engineering</Category>
      <Industry>Finance</Industry>
      <Employername>Robinhood</Employername>
      <Employerlogo>https://logos.yubhub.co/robinhood.com.png</Employerlogo>
      <Employerdescription>Robinhood is a financial services company that provides a mobile app for buying and selling stocks, options, ETFs, and cryptocurrencies.</Employerdescription>
      <Employerwebsite>https://www.robinhood.com/</Employerwebsite>
      <Compensationcurrency>USD</Compensationcurrency>
      <Compensationmin>196000</Compensationmin>
      <Compensationmax>230000</Compensationmax>
      <Applyto>https://job-boards.greenhouse.io/robinhood/jobs/7838644?utm_source=yubhub.co&amp;utm_medium=jobs_feed&amp;utm_campaign=apply</Applyto>
      <Location>Menlo Park, CA</Location>
      <Country></Country>
      <Postedate>2026-04-25</Postedate>
    </job>
    <job>
      <externalid>49ef318f-90a</externalid>
      <Title>Director, Site Reliability Engineer | Senior Engineering Team Director</Title>
      <Description><![CDATA[<p>We’re seeking a Site Reliability Engineering (SRE) Lead to design, build, and maintain resilient, high-scale systems supporting BlackRock’s Private Markets platform. In this hands-on leadership role, you’ll apply deep engineering expertise to solve complex challenges, guide a global team, shape technical direction, and communicate effectively with senior stakeholders, ensuring the reliability of mission-critical systems that power private market investment workflows and decision-making. You will drive the adoption of AI-driven solutions to accelerate incident detection and triage, reduce toil, improve forecasting and capacity planning, and strengthen end-to-end observability and resilience.</p>
<p>Key Responsibilities:</p>
<ul>
<li>Take ownership of project priorities, deadlines and deliverables using Agile methodologies, with clear outcomes around reliability automation and AI-enabled operations</li>
<li>Understand and refine business and functional requirements, translating them into SLOs/SLIs and AI-assisted observability and support capabilities</li>
<li>Take a hands-on approach to getting work done; this role requires a “roll up your sleeves” mentality, including building and operationalizing reliability tooling and automation that measurably reduces toil and improves stability</li>
<li>Be a leader with vision and a partner in brainstorming solutions for team productivity and efficiency to improve engineering effectiveness</li>
<li>Drive priority setting of the engineering teams, balancing foundational reliability work with delivery of new product features</li>
<li>Improve Engineering culture by encouraging continuous focus on reliability across the entire application lifecycle, and by adopting AI-enabled SRE practices (e.g., intelligent alerting, automated diagnosis, and self-healing where appropriate)</li>
<li>Proactive participant in architectural and design decisions, including AI-ready telemetry, data quality, and model integration patterns for operational analytics</li>
<li>Design and implement end-to-end monitoring solutions for application and infrastructure components, leveraging modern observability platforms plus AI/ML techniques for anomaly detection, correlation, and alert noise reduction</li>
<li>Drive the engineering of capacity management and demand forecasting solutions, including predictive analytics/ML approaches where they add measurable value</li>
<li>Act as a culture carrier and leader, passing on SRE knowledge and best practices to the engineering team</li>
<li>Drive detailed root cause investigations for production incidents with rigorous focus on issue avoidance, using AI-assisted correlation/analysis to accelerate time-to-insight</li>
<li>Create/coordinate retros for significant incidents, ensuring learnings are captured in automated/AI-assisted runbooks and embedded into prevention mechanisms</li>
<li>Additional core engineering functions, such as adding custom telemetry metrics/logs/traces to the code base of in-scope applications to enable AI/ML-driven operational insights</li>
<li>Anticipate new opportunities to continuously evolve the resiliency profile of scoped applications and infrastructure</li>
</ul>
<p>Requirements:</p>
<ul>
<li>B.S. / M.S. degree in Computer Science, Engineering or a related discipline with 10+ years of experience</li>
<li>Experience leading high performing engineering/SRE teams, with a track record of driving continuous improvement through automation and AI-enabled operations</li>
<li>Demonstrated ability to represent engineering/SRE priorities, status, and risk to senior leadership stakeholders with clear, executive-ready communication</li>
<li>Hands-on experience building or operating AI-assisted capabilities (AIOps, ML-based anomaly detection, or GenAI workflows) in an engineering/production environment</li>
<li>A passion for providing engineering support for highly available, performant full stack applications with a “Student of Technology” attitude</li>
<li>Experience with relational and NoSQL databases (e.g., Redis, Apache Cassandra)</li>
</ul>
<p>Benefits:</p>
<ul>
<li>Retirement investment and tools designed to help you in building a sound financial future</li>
<li>Access to education reimbursement</li>
<li>Comprehensive resources to support your physical health and emotional well-being</li>
<li>Family support programs</li>
<li>Flexible Time Off (FTO) so you can relax, recharge and be there for the people you care about</li>
</ul>
<p>Hybrid Work Model:</p>
<ul>
<li>BlackRock’s hybrid work model is designed to enable a culture of collaboration and apprenticeship that enriches the experience of our employees, while supporting flexibility for all</li>
<li>Employees are currently required to work at least 4 days in the office per week, with the flexibility to work from home 1 day a week</li>
<li>Some business groups may require more time in the office due to their roles and responsibilities</li>
<li>We remain focused on increasing the impactful moments that arise when we work together in person – aligned with our commitment to performance and innovation</li>
</ul>
<p>About BlackRock:</p>
<ul>
<li>At BlackRock, we are all connected by one mission: to help more and more people experience financial well-being</li>
<li>Our clients, and the people they serve, are saving for retirement, paying for their children’s education, buying homes, and starting businesses</li>
<li>Their investments also help to strengthen the global economy: support businesses small and large; finance infrastructure projects that connect and power cities; and facilitate innovations that drive progress</li>
</ul>
]]></Description>
      <Jobtype>full-time</Jobtype>
      <Experiencelevel>senior</Experiencelevel>
      <Workarrangement>hybrid</Workarrangement>
      <Salaryrange></Salaryrange>
      <Skills>Site Reliability Engineering, Agile Methodologies, Reliability Automation, AI-Enabled Operations, Business Requirements, Functional Requirements, SLOs/SLIs, Observability, Support Capabilities, Reliability Tooling, Automation, Stability, Leadership, Vision, Team Productivity, Efficiency, Engineering Effectiveness, Priority Setting, Foundational Reliability, New Product Features, Engineering Culture, Reliability Across Application Lifecycle, AI-Enabled SRE Practices, Intelligent Alerting, Automated Diagnosis, Self-Healing, Architectural Decisions, AI-Ready Telemetry, Data Quality, Model Integration Patterns, Operational Analytics, Monitoring Solutions, Application Components, Infrastructure Components, Anomaly Detection, Correlation, Alert Noise Reduction, Capacity Management, Demand Forecasting, Predictive Analytics, ML Approaches, Root Cause Investigations, Production Incidents, Issue Avoidance, AI-Assisted Correlation, Time-To-Insight, Retros, Significant Incidents, Learnings, Runbooks, Prevention Mechanisms, Custom Telemetry Metrics, Logs, Traces, AI/ML-Driven Operational Insights, Resiliency Profile, Scoped Applications, Infrastructure, Relational Database, NoSQL Database, Redis, Apache Cassandra</Skills>
      <Category>Engineering</Category>
      <Industry>Finance</Industry>
      <Employername>BlackRock</Employername>
      <Employerlogo>https://logos.yubhub.co/blackrock.com.png</Employerlogo>
      <Employerdescription>BlackRock is a global investment management corporation that provides investments in equity, fixed income, alternatives, and money market instruments. It has over $9 trillion in assets under management.</Employerdescription>
      <Employerwebsite>https://www.blackrock.com/</Employerwebsite>
      <Compensationcurrency></Compensationcurrency>
      <Compensationmin></Compensationmin>
      <Compensationmax></Compensationmax>
      <Applyto>https://jobs.workable.com/view/cLBuSgz7avHiG3cKzS91ZB/director%2C-site-reliability-engineer-%7C-senior-engineering-team-director-in-england-at-blackrock?utm_source=yubhub.co&amp;utm_medium=jobs_feed&amp;utm_campaign=apply</Applyto>
      <Location>England</Location>
      <Country></Country>
      <Postedate>2026-04-24</Postedate>
    </job>
  </jobs>
</source>