{"version":"0.1","company":{"name":"YubHub","url":"https://yubhub.co","jobsUrl":"https://yubhub.co/jobs/skill/debugging-performance-issues"},"x-facet":{"type":"skill","slug":"debugging-performance-issues","display":"Debugging Performance Issues","count":1},"x-feed-size-limit":100,"x-feed-sort":"enriched_at desc","x-feed-notice":"This feed contains at most 100 jobs (the most recently enriched). For the full corpus, use the paginated /stats/by-facet endpoint or /search.","x-generator":"yubhub-xml-generator","x-rights":"Free to redistribute with attribution: \"Data by YubHub (https://yubhub.co)\"","x-schema":"Each entry in `jobs` follows https://schema.org/JobPosting. YubHub-native raw fields carry `x-` prefix.","jobs":[{"@context":"https://schema.org","@type":"JobPosting","identifier":{"@type":"PropertyValue","name":"YubHub","value":"job_d6450ee6-847"},"title":"Data Infrastructure Engineer","description":"<p><strong>About the Role</strong></p>\n<p>Cursor ships daily. Every release leaves signals behind: telemetry, prompts, completions, agent runs, sessions. Those signals power model improvement, evals, and experimentation. Data infrastructure is what turns them into something teams can trust.</p>\n<p>A lot of systems here started simple so we could move fast. Over time, the constraints change and the “good enough” version becomes the bottleneck. This role owns the full ladder: patch what should be patched, redesign what should be redesigned, ship the replacement, and operate it.</p>\n<p>Privacy guarantees are part of correctness. What we can retain and use depends on Privacy Mode and org configuration, and getting that wrong breaks a product promise. We choose work by business impact: what blocks product and model teams today, and what will block them next month.</p>\n<p><strong>Sample projects include...</strong></p>\n<ul>\n<li>A core pipeline started as a pragmatic reuse of infrastructure built for something else. 
It works, but it cannot guarantee properties downstream consumers now need (for example, point-in-time consistency). You design and ship the replacement while keeping the existing system running.</li>\n</ul>\n<ul>\n<li>A new product surface ships without instrumentation. You talk to the team, define what needs to be captured, and wire it through before the absence becomes anyone else’s problem.</li>\n</ul>\n<ul>\n<li>Eval coverage drops. You trace it to an instrumentation gap introduced weeks ago by a product change nobody flagged. You fix the gap, add a contract so it cannot recur, and ship the dashboard that would have caught it earlier.</li>\n</ul>\n<ul>\n<li>Multiple consumers depend on overlapping data. You design schema evolution and validation so changes in one place do not silently degrade the others.</li>\n</ul>\n<ul>\n<li>Storage costs rise faster than usage. You decide what is worth keeping, implement retention and compression, and delete what is not.</li>\n</ul>\n<p><strong>What we&#39;re looking for</strong></p>\n<p>We’re looking for someone who has built real systems at scale and cares about correctness, cost, and ergonomics.</p>\n<p>Strong signals include:</p>\n<ul>\n<li>Deep experience with Spark (Databricks or open-source Spark both count)</li>\n</ul>\n<ul>\n<li>Production experience with Ray Data</li>\n</ul>\n<ul>\n<li>Hands-on ownership of large data pipelines and storage systems</li>\n</ul>\n<ul>\n<li>Comfort debugging performance issues across client instrumentation, streaming, storage, and model-facing workflows, as well as compute, storage, and networking layers</li>\n</ul>\n<ul>\n<li>Clear thinking about data modeling and long-term maintainability</li>\n</ul>\n<ul>\n<li>Good judgment about when to patch and when to rebuild</li>\n</ul>\n<p>Nice to have</p>\n<ul>\n<li>Experience running or scaling ClickHouse</li>\n</ul>\n<ul>\n<li>Familiarity with dbt, Dagster, or similar orchestration and modeling tools</li>\n</ul>\n<p>We&#39;re 
in-person with cozy offices in North Beach, San Francisco and Manhattan, New York, replete with well-stocked libraries.</p>\n<p style=\"margin-top:24px;font-size:13px;color:#666;\">XML job scraping automation by <a href=\"https://yubhub.co\">YubHub</a></p>","url":"https://yubhub.co/jobs/job_d6450ee6-847","directApply":true,"hiringOrganization":{"@type":"Organization","name":"Cursor","sameAs":"https://cursor.com","logo":"https://logos.yubhub.co/cursor.com.png"},"x-apply-url":"https://cursor.com/careers/software-engineer-data-infrastructure","x-work-arrangement":"onsite","x-experience-level":"senior","x-job-type":"full-time","x-salary-range":null,"x-skills-required":["Spark","Ray Data","data pipelines","storage systems","debugging performance issues","data modeling","long-term maintainability"],"x-skills-preferred":["ClickHouse","dbt","Dagster"],"datePosted":"2026-03-08T00:17:58.290Z","jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Francisco"}},"employmentType":"FULL_TIME","occupationalCategory":"Engineering","industry":"Technology","skills":"Spark, Ray Data, data pipelines, storage systems, debugging performance issues, data modeling, long-term maintainability, ClickHouse, dbt, Dagster"}]}