{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/production-issues"},"x-facet":{"type":"skill","slug":"production-issues","display":"Production Issues","count":6},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_bbbb80e4-2fc"},"title":"Senior Staff Software Engineer - Delta","description":"<p>We are seeking a highly skilled and experienced Senior Staff Software Engineer to join our backend team. 
In this role, you will be instrumental in designing, developing, and maintaining robust backend systems that power Databricks workspaces.</p>\n<p>You will build the next-generation platform for serving workspace assets, ensuring reliable, performant systems with high QPS and low latency, and proactively addressing future growth challenges.</p>\n<p>Additionally, as a senior member of the team, you will provide technical leadership, mentorship, and guidance to junior engineers, contributing to the overall improvement of team coding practices and system designs.</p>\n<p>The impact you will have:</p>\n<ul>\n<li>Solve real business needs at large scale by applying your software engineering skills.</li>\n<li>Low-level systems debugging, performance measurement, and optimization on large production clusters.</li>\n<li>Build architecture designs, influence the product roadmap, and take ownership of and responsibility for new projects.</li>\n<li>Introduce tools to allow greater automation and operability of services.</li>\n<li>Use your deep experience to help prevent and investigate production issues.</li>\n<li>Plan and lead complicated technical projects that span several teams within the company.</li>\n<li>Contribute as a technical team lead by mentoring others, leading sprint planning, delegating work and assignments to team members, and participating in project planning.</li>\n</ul>\n<p>What we look for:</p>\n<ul>\n<li>15+ years of industry experience building and supporting large-scale distributed systems.</li>\n<li>Comfortable working towards a multi-year vision with incremental deliverables.</li>\n<li>Motivated by delivering customer value and impact.</li>\n<li>Strong foundation in algorithms and data structures and their real-world use cases.</li>\n<li>Experience driving company initiatives towards customer satisfaction.</li>\n<li>BS/MS/PhD in Computer Science or related majors, or equivalent experience.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job 
scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_bbbb80e4-2fc","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Databricks","sameAs":"https://databricks.com","logo":"https://logos.yubhub.co/databricks.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/databricks/jobs/8303020002","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["large-scale distributed systems","software engineering","low-level systems debugging","performance measurement","optimization","architecture design","product roadmap","ownership and responsibility","tools automation","operability","deep experience","production issues","complicated technical projects","team lead","mentoring","sprint planning","delegating work","assignments","project planning"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:57:37.464Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"London, United Kingdom"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"large-scale distributed systems, software engineering, low-level systems debugging, performance measurement, optimization, architecture design, product roadmap, ownership and responsibility, tools automation, operability, deep experience, production issues, complicated technical projects, team lead, mentoring, sprint planning, delegating work, assignments, project planning"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9ab44406-4c0"},"title":"Principal Engineer, Gemini App Infrastructure","description":"<p>As the Principal Engineer, you will focus on architecting and building the flagship Gemini App infrastructure. 
You will serve as the technical anchor for the application and orchestration layer, owning the code quality, architectural decisions, and system design of new design systems and functionality.</p>\n<p>Key responsibilities include: Architecting the Gemini app serving and orchestration layers, defining interfaces for a scalable, modular codebase. Designing and implementing robust CI/CD pipelines and experimentation platforms. Driving application performance initiatives, debugging complex production issues, and advocating for code quality standards. Acting as the strategic technical counterpart to product and design leadership, assessing feasibility of ambitious concepts and proposing technical solutions. Mentoring staff and senior engineers, leading code reviews, and fostering a culture of technical accuracy, psychological safety, and user-centricity.</p>\n<p>To succeed in this role, you will need a Bachelor&#39;s degree in Computer Science or Engineering, or equivalent practical experience, and 15 years of experience in software engineering, building and working with systems in the technology organization.</p>\n<p>The US base salary range for this full-time position is between $307,000 - $427,000 + bonus + equity + benefits.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9ab44406-4c0","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Google DeepMind","sameAs":"https://deepmind.com/","logo":"https://logos.yubhub.co/deepmind.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/deepmind/jobs/7793048","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$307,000 - $427,000 + bonus + equity + benefits","x-skills-required":["software engineering","building and working with systems","architecting","designing and implementing CI/CD pipelines","experimentation 
platforms","application performance initiatives","debugging complex production issues","code quality standards","strategic technical counterpart","product and design leadership","mentoring staff and senior engineers","leading code reviews","fostering a culture of technical accuracy","psychological safety","user-centricity"],"x-skills-preferred":[],"datePosted":"2026-04-18T15:42:24.141Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Mountain View, California, US"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"software engineering, building and working with systems, architecting, designing and implementing CI/CD pipelines, experimentation platforms, application performance initiatives, debugging complex production issues, code quality standards, strategic technical counterpart, product and design leadership, mentoring staff and senior engineers, leading code reviews, fostering a culture of technical accuracy, psychological safety, user-centricity","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":307000,"maxValue":427000,"unitText":"YEAR"}}},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_53aedaa9-48a"},"title":"Engineering Manager, Payments API","description":"<p>We&#39;re seeking an experienced Engineering Manager to lead the team at the forefront of delivering best-in-class Checkout services that power Elements with CheckoutSession and define and drive our payment integrations into the future.</p>\n<p>You will work closely with Optimized Checkout teams, playing a critical role in owning the strategy and execution of our core Checkout primitives. 
Your leadership will ensure we balance rapid innovation with the five-nines reliability required to support the world&#39;s largest enterprises and the growing global economy.</p>\n<p>This is a role for a strategic leader who excels at navigating complex technical and product challenges. You will drive the evolution of our Checkout services, helping to abstract immense complexity into simple, powerful integrations for our users. Beyond the immediate technical goals, you will be a cultural pillar for the team, mentoring the next generation of technical leaders and shaping an engineering and product vision that defines the future of commerce.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Scope, design, build, and maintain APIs, services, and large-scale systems that reliably and efficiently handle billions of money movement requests</li>\n<li>Debug and solve critical production issues across services and multiple levels of the stack</li>\n<li>Partner with Engineering Managers to create roadmaps that deliver milestones toward a cohesive engineering vision</li>\n<li>Serve as a role model for our high engineering standards and bring consistency to the many codebases and processes you will encounter</li>\n<li>Arbitrate critical decisions correctly, fully considering software best practices, Stripe system realities, and numerous stakeholders&#39; preferences and concerns</li>\n<li>Collaborate with stakeholders across the organization, such as experts in product, design, infrastructure, and operations</li>\n<li>Teach and mentor the next generation of technical leaders at Stripe</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>3+ years of experience in an engineering management role</li>\n<li>8+ years of full-time software development experience</li>\n<li>Demonstrated track record of successfully hiring, developing, and mentoring engineers</li>\n<li>Ability to lead by example in fast-paced, high-impact, and ambiguous environments</li>\n<li>Commitment to high operational rigor when 
working with production systems</li>\n<li>Experience building extensible software solutions that scale with business growth</li>\n<li>Adept at thriving in a collaborative, cross-functional, and cross-timezone environment, while fostering a positive team culture</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_53aedaa9-48a","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Stripe","sameAs":"https://stripe.com/","logo":"https://logos.yubhub.co/stripe.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/stripe/jobs/7663636","x-work-arrangement":"remote","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["APIs","services","large-scale systems","money movement requests","production issues","software development","engineering management","team leadership","collaboration","communication"],"x-skills-preferred":[],"datePosted":"2026-03-31T18:14:40.307Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"New York, NY"}},"jobLocationType":"TELECOMMUTE","employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"APIs, services, large-scale systems, money movement requests, production issues, software development, engineering management, team leadership, collaboration, communication"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_9cda5783-7c4"},"title":"Backend Engineer, Payments and Risk","description":"<p>We&#39;re looking for a Backend Engineer to join our Payments and Risk team. 
As a Backend Engineer, you will play a key role in extending our balance management platform and building out a new funds accessibility platform leveraged by enterprises and SMBs alike.</p>\n<p>Our team collaborates with many cross-functional teams – from Infrastructure to Product – at Stripe to deliver innovative solutions that address evolving user needs.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Scope, design, build, and maintain APIs, services, and large-scale systems that reliably and efficiently handle billions of money movement requests</li>\n<li>Debug and solve critical production issues across services and multiple levels of the stack</li>\n<li>Mentor engineers to help them grow</li>\n<li>Collaborate with stakeholders across the company to build new features at large-scale, while improving internal engineering standards, tooling, and processes</li>\n<li>Collaborate effectively in a distributed and hybrid team, maintaining open communication and strong connections with colleagues</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>2-12+ years of industry software engineering experience</li>\n<li>Strong coding skills in any programming language</li>\n<li>Strong collaboration skills, can work across workstreams within your team and contribute to your peers&#39; success</li>\n<li>Have the ability to thrive on a high level of autonomy, responsibility, and think of yourself as entrepreneurial</li>\n<li>Interest in working as a generalist across varying technologies and stacks to solve problems and delight both internal and external users</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_9cda5783-7c4","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Stripe","sameAs":"https://stripe.com/","logo":"https://logos.yubhub.co/stripe.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/stripe/jobs/6163230","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["APIs","services","large-scale systems","money movement requests","production issues","critical issues","collaboration","engineering standards","tooling","processes"],"x-skills-preferred":["large-scale financial tracking systems","cloud-based services","gRPC","GraphQL","Docker","Kubernetes","AWS"],"datePosted":"2026-03-31T18:02:58.158Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"US"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"APIs, services, large-scale systems, money movement requests, production issues, critical issues, collaboration, engineering standards, tooling, processes, large-scale financial tracking systems, cloud-based services, gRPC, GraphQL, Docker, Kubernetes, AWS"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_57cab463-aa2"},"title":"Backend Engineer, Payments and Risk","description":"<p>We&#39;re looking for a Backend Engineer to join our Payments and Risk team. 
As a Backend Engineer, you will play a key role in extending our balance management platform and building out a new funds accessibility platform leveraged by enterprises and SMBs alike.</p>\n<p>Our team collaborates with many cross-functional teams – from Infrastructure to Product – at Stripe to deliver innovative solutions that address evolving user needs.</p>\n<p>Responsibilities:</p>\n<ul>\n<li>Scope, design, build, and maintain APIs, services, and large-scale systems that reliably and efficiently handle billions of money movement requests</li>\n<li>Debug and solve critical production issues across services and multiple levels of the stack</li>\n<li>Mentor engineers to help them grow</li>\n<li>Collaborate with stakeholders across the company to build new features at large-scale, while improving internal engineering standards, tooling, and processes</li>\n<li>Collaborate effectively in a distributed and hybrid team, maintaining open communication and strong connections with colleagues</li>\n</ul>\n<p>Requirements:</p>\n<ul>\n<li>2-12+ years of industry software engineering experience</li>\n<li>Strong coding skills in any programming language</li>\n<li>Strong collaboration skills, can work across workstreams within your team and contribute to your peers&#39; success</li>\n<li>Have the ability to thrive on a high level of autonomy, responsibility, and think of yourself as entrepreneurial</li>\n<li>Interest in working as a generalist across varying technologies and stacks to solve problems and delight both internal and external users</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a 
href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_57cab463-aa2","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Stripe","sameAs":"https://stripe.com/","logo":"https://logos.yubhub.co/stripe.com.png"},"x-apply-url":"https://job-boards.greenhouse.io/stripe/jobs/7232592","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["APIs","services","large-scale systems","money movement requests","critical production issues","collaboration","engineering standards","tooling","processes"],"x-skills-preferred":["large-scale financial tracking systems","cloud-based services","gRPC","GraphQL","Docker/Kubernetes","AWS"],"datePosted":"2026-03-31T18:02:50.072Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"US"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"APIs, services, large-scale systems, money movement requests, critical production issues, collaboration, engineering standards, tooling, processes, large-scale financial tracking systems, cloud-based services, gRPC, GraphQL, Docker/Kubernetes, AWS"},{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_3514d749-08c"},"title":"Senior Support Engineer","description":"<p><strong>Senior Support Engineer - San Francisco</strong></p>\n<p><strong>Location</strong></p>\n<p>San Francisco</p>\n<p><strong>Employment Type</strong></p>\n<p>Full time</p>\n<p><strong>Department</strong></p>\n<p><strong>Compensation</strong></p>\n<ul>\n<li>$234K – $260K • Offers Equity</li>\n</ul>\n<p>The base pay offered may vary depending on multiple individualized factors, including market location, job-related knowledge, skills, and experience. If the role is non-exempt, overtime pay will be provided consistent with applicable laws. 
In addition to the salary range listed above, total compensation also includes generous equity, performance-related bonus(es) for eligible employees, and the following benefits.</p>\n<p><strong>Benefits</strong></p>\n<ul>\n<li>Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts</li>\n</ul>\n<ul>\n<li>Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)</li>\n</ul>\n<ul>\n<li>401(k) retirement plan with employer match</li>\n</ul>\n<ul>\n<li>Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)</li>\n</ul>\n<ul>\n<li>Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees</li>\n</ul>\n<ul>\n<li>13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)</li>\n</ul>\n<ul>\n<li>Mental health and wellness support</li>\n</ul>\n<ul>\n<li>Employer-paid basic life and disability coverage</li>\n</ul>\n<ul>\n<li>Annual learning and development stipend to fuel your professional growth</li>\n</ul>\n<ul>\n<li>Daily meals in our offices, and meal delivery credits as eligible</li>\n</ul>\n<ul>\n<li>Relocation support for eligible employees</li>\n</ul>\n<ul>\n<li>Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided.</li>\n</ul>\n<p><strong>About the Team</strong></p>\n<p>The Technical Support team is responsible for ensuring that developers and enterprises can reliably build mission critical solutions using OpenAI models. We provide technical guidance, resolve complex issues and support customers in maximizing value and adoption from deploying our highly-capable models. 
We work closely with Technical Success, Product, Engineering and others to deliver the best possible experience to our customers at scale. We work with an automation-first mindset and leverage the latest in AI to scale our support operations. Join the Senior Support Engineering (SSE) team at OpenAI and help shape the future of Technical Support in the age of AI.</p>\n<p><strong>About the Role</strong></p>\n<p>We are looking for a Senior Support Engineer to collaborate directly with our strategic enterprise accounts and product teams, helping solve some of the most difficult problems faced by our Customers. You will be part of the best technical troubleshooting team at OpenAI, and our Customers and Engineering teams will look to you for technical guidance in addressing the most technically difficult issues in our environment.</p>\n<p>As a Senior Support Engineer, you will design and run operational processes to monitor our top strategic customers and a 24x7 response team. You’ll work closely with our Infrastructure and Engineering teams to deliver the best possible experience to customers at scale. Working directly with our most strategic Customers, you will be crucial to the success of the most innovative, disruptive, and high-scale AI solutions being built with the OpenAI API platform.</p>\n<p>The nature of this role will be low volume, high difficulty.</p>\n<p>This role is based in San Francisco, CA. We use a hybrid work model of 3 days in the office per week and offer relocation assistance to new employees.</p>\n<p><strong>In this role, you will:</strong></p>\n<ul>\n<li>Be among the foremost technical and troubleshooting experts for our API platform at OpenAI. You are the last line of defense before the core Engineering team.</li>\n</ul>\n<ul>\n<li>Proactively identify and implement opportunities to scale support operations by leveraging automation and advancements in AI technologies. 
Contribute to shaping the future of technical support in an AI-driven era.</li>\n</ul>\n<ul>\n<li>Configure and use advanced monitoring and alerting workflows to proactively detect customer impacting issues in real time.</li>\n</ul>\n<ul>\n<li>In partnership with engineering, contribute to reliability reviews and preparedness for new features, launches, or strategic customer requirement updates. Ensure that operational readiness (monitoring, alerting, and fallback plans) is in place for any such changes.</li>\n</ul>\n<ul>\n<li>Design and refine incident response processes and documentation across strategic customers, engineering and support teams.</li>\n</ul>\n<ul>\n<li>Analyze operational metrics and incident RCAs to identify areas for improvement. Proactively recommend and implement enhancements to monitoring dashboards, alert configurations, and support workflows.</li>\n</ul>\n<ul>\n<li>Provide support coverage during holidays and weekends based on business needs.</li>\n</ul>\n<p><strong>You might thrive in this role if you:</strong></p>\n<ul>\n<li>Have a Bachelor’s degree in Computer Science or a related field. A strong software engineering foundation is important for this role’s success.</li>\n</ul>\n<ul>\n<li>Have 8+ years of experience in technical operations roles such as SRE/NOC, designing monitoring systems and resolving production issues in fast-paced and mission-critical environments. A strong track record of troubleshooting complex technical problems at the systems level.</li>\n</ul>\n<ul>\n<li>Have deep familiarity with modern monitoring, alerting, and observability practices. Hands‑on experience setting up or managing metrics, logging, and tracing for distributed systems (e.g., understanding of SLIs/SLOs, alert tuning, dashboard creation).</li>\n</ul>\n<ul>\n<li>Have proven experience leading incident response for high‑severity outages or service disruptions. 
Able to perform real‑time incident coordination, root cause analysis, and communication with stakeholders.</li>\n</ul>\n<ul>\n<li>Are able to work effectively in a fast-paced environment, prioritize tasks, and manage multiple projects simultaneously.</li>\n</ul>\n<ul>\n<li>Are a strong communicator and team player, with excellent written and verbal communication skills.</li>\n</ul>\n<ul>\n<li>Are able to adapt to changing priorities and requirements, and are flexible in your approach to problem-solving.</li>\n</ul>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_3514d749-08c","directApply":true,"hiringOrganization":{"@type":"Organization","name":"OpenAI","sameAs":"https://jobs.ashbyhq.com","logo":"https://logos.yubhub.co/openai.com.png"},"x-apply-url":"https://jobs.ashbyhq.com/openai/5431666c-530b-49c0-b67e-32477f9eaf5e","x-work-arrangement":"hybrid","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":"$234K – $260K","x-skills-required":["Bachelor’s degree in Computer Science or a related field","8+ years of experience in technical operations roles such as SRE/NOC","Designing monitoring systems and resolving production issues in fast-paced and mission-critical environments","Troubleshooting complex technical problems at the systems level","Modern monitoring, alerting, and observability practices","Metrics, logging, and tracing for distributed systems","SLIs/SLOs, alert tuning, dashboard creation","Incident response for high‑severity outages or service disruptions","Real-time incident coordination, root cause analysis, and communication with stakeholders"],"x-skills-preferred":["Automation and advancements in AI technologies","Automation-first mindset and leveraging the latest in AI to scale support operations","Technical and troubleshooting expertise for API platform at OpenAI","Proactive identification and implementation of 
opportunities to scale support operations","Advanced monitoring and alerting workflows to proactively detect customer impacting issues in real time","Reliability reviews and preparedness for new features, launches, or strategic customer requirement updates","Operational readiness (monitoring, alerting, and fallback plans)","Incident response processes and documentation across strategic customers, engineering and support teams","Operational metrics and incident RCAs to identify areas for improvement","Enhancements to monitoring dashboards, alert configurations, and support workflows"],"datePosted":"2026-03-06T18:43:55.714Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Bachelor’s degree in Computer Science or a related field, 8+ years of experience in technical operations roles such as SRE/NOC, Designing monitoring systems and resolving production issues in fast-paced and mission-critical environments, Troubleshooting complex technical problems at the systems level, Modern monitoring, alerting, and observability practices, Metrics, logging, and tracing for distributed systems, SLIs/SLOs, alert tuning, dashboard creation, Incident response for high‑severity outages or service disruptions, Real-time incident coordination, root cause analysis, and communication with stakeholders, Automation and advancements in AI technologies, Automation-first mindset and leveraging the latest in AI to scale support operations, Technical and troubleshooting expertise for API platform at OpenAI, Proactive identification and implementation of opportunities to scale support operations, Advanced monitoring and alerting workflows to proactively detect customer impacting issues in real time, Reliability reviews and preparedness for new features, launches, or strategic customer requirement updates, Operational readiness (monitoring, 
alerting, and fallback plans), Incident response processes and documentation across strategic customers, engineering and support teams, Operational metrics and incident RCAs to identify areas for improvement, Enhancements to monitoring dashboards, alert configurations, and support workflows","baseSalary":{"@type":"MonetaryAmount","currency":"USD","value":{"@type":"QuantitativeValue","minValue":234000,"maxValue":260000,"unitText":"YEAR"}}}]}